[Lava-users] Better bootloader failure detection

Milosz Wasilewski milosz.wasilewski at linaro.org
Thu Nov 23 13:32:51 UTC 2017


On 23 November 2017 at 13:23, Matt Hart <matthew.hart at linaro.org> wrote:
> Currently it is difficult to tell the difference between an
> infrastructure problem in a device bootloader and a kernel failure.
>
> If a kernel silently fails to boot, LAVA throws a bootloader-commands
> timeout because it hasn’t matched ‘Linux version’ to know the kernel
> has started. However, this timeout could also be caused by a real
> problem in the bootloader, such as a DHCP failure or a TFTP timeout.
> KernelCI would like to catch actual infrastructure problems in the
> bootloader, but can’t tell if the kernel just didn’t boot, or the
> commands actually timed out in the bootloader.
>
> To fix this, we're going to:
> - change the bootloader-commands action to finish when it has sent the
> last command
> - have auto-login-action take over monitoring the kernel boot process

Would it be possible to reinstate the v1 feature that measured
kernel boot time up to "Freeing unused kernel memory"? Currently I see
no way of measuring this interval. The only available measurement is
auto-login-action, which completes only when a prompt is available.
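
To make the interval concrete, here is a rough sketch of the measurement
I have in mind. It is not LAVA code. It assumes the kernel log carries
printk timestamps (CONFIG_PRINTK_TIME) and that the captured serial log
is available as a list of lines; the function name and the sample log
are made up for illustration.

```python
import re

# Matches a printk timestamp such as "[    1.234567] " at the start of a line.
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def kernel_boot_interval(log_lines,
                         start="Linux version",
                         end="Freeing unused kernel memory"):
    """Return the seconds between the first `start` message and the first
    later `end` message, or None if either one is missing."""
    started = None
    for line in log_lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue  # bootloader output or a line without a timestamp
        stamp = float(match.group(1))
        message = match.group(2)
        if started is None and start in message:
            started = stamp
        elif started is not None and end in message:
            return stamp - started
    return None


# Hypothetical log excerpt, for illustration only.
sample = [
    "(bootloader output without timestamps)",
    "[    0.000000] Linux version 4.14.0 (hypothetical build string)",
    "[    1.500000] Freeing unused kernel memory: 1024K",
]
print(kernel_boot_interval(sample))
```

Using the kernel's own timestamps keeps serial console latency out of
the number, but it only works when those timestamps are enabled.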

milosz

> - extend bootloader-commands to match more infrastructure problems
> - update uboot commands to execute the commands in order (like the
> other bootloader implementations), rather than building a script and
> then calling that as the last command
>
> This work is scoped for the January 2018.1 LAVA release.
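
As I read the plan above, a timeout could then be classified roughly as
in the sketch below. This is only my understanding, not the actual LAVA
implementation: the error strings are placeholders rather than real
bootloader messages, and the function and result names are hypothetical.

```python
# Placeholder error strings; real bootloader messages will differ.
INFRA_PATTERNS = ("DHCP failure", "TFTP timeout")
KERNEL_STARTED = "Linux version"


def classify_timeout(log_text, last_command_sent):
    """Guess why a boot timed out, given the captured serial log and
    whether the bootloader action managed to send its last command."""
    # A known bootloader error message points at the infrastructure.
    for pattern in INFRA_PATTERNS:
        if pattern in log_text:
            return "infrastructure"
    # The bootloader never accepted all of the commands.
    if not last_command_sent:
        return "bootloader-timeout"
    # All commands were sent; the kernel banner tells us whether it started.
    if KERNEL_STARTED in log_text:
        return "kernel-started"
    return "kernel-silent-failure"


print(classify_timeout("=> dhcp\nDHCP failure\n", last_command_sent=False))
print(classify_timeout("=> bootz\nStarting kernel ...\n", last_command_sent=True))
```

The point is that once bootloader-commands ends at the last command, a
missing 'Linux version' can be attributed to the kernel rather than to
the bootloader.
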

