[Lava-users] Re: How to fail a test run on kernel warnings that happen after the boot action?

5 May 2023

Hi Stefan!
On Fri, 2023-05-05 at 14:21 +0200, Stefan wrote:
Hey Florian!
Yeah, communication isn't that easy, that's what I figured out too.
As far as I understand, your observations are correct. With the standard setup (LAVA attached to the console) it isn't easy to distinguish script output from kernel log output after the login prompt; maybe that's why it isn't done anymore after login.
But there's a manual workaround you can put into your jobs:

1. Run 'dmesg -c' to clear the ring buffer
2. Run the test that might trigger the RCU WARNING
3. Run 'dmesg' and check whether that warning string appears in the output

So that triggering test might not fail, but parsing dmesg output will fail if that string appears, thus your job will report a failed test.
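The three steps above can be sketched as a small shell helper. This is only a sketch under assumptions: the function names log_is_clean and run_and_check_dmesg and the PATTERN variable are made up for illustration, and the caller must be allowed to read and clear the kernel ring buffer (usually root).

```shell
#!/bin/sh
# Sketch of the clear / run / check workflow.
# Hypothetical helper names; adjust PATTERN to the string you care about.

PATTERN="${PATTERN:-RCU WARNING}"

# Reads kernel log text on stdin.
# Returns 0 if PATTERN does not occur, non-zero if it does.
log_is_clean() {
    ! grep -q -- "$PATTERN"
}

# Usage: run_and_check_dmesg <test command> [args...]
run_and_check_dmesg() {
    dmesg -c > /dev/null    # step 1: clear the ring buffer
    "$@"                    # step 2: run the test; its own exit status is ignored here
    dmesg | log_is_clean    # step 3: fail if PATTERN showed up in the kernel log
}
```

Calling something like 'run_and_check_dmesg ./my-test.sh' (my-test.sh being a placeholder) then returns non-zero only when the pattern appeared in the kernel log during the test, regardless of what the test itself returned.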
For parsing the dmesg output you can use an inline test job definition like this, or put something similar somewhere into your scripts:

test:
    timeout:
      minutes: 1
    definitions:
    - repository:
        metadata:
          format: Lava-Test Test Definition 1.0
          name: parse-dmesg-output
          description: "Test for RCU WARNING in kernel log"
        run:
          steps:
          - lava-test-case test-RCU-WARNING --shell test $(dmesg | grep "RCU WARNING" | wc -l) -eq 0
      from: inline
      name: env-dut-inline
      path: inline/env-dut.yaml

Well, yes, I could do it manually but I expected it to be one of the
main use cases of LAVA to run tests on a Linux system and check if any
WARNING, BUG, ..., occurred. Not all Kernel bugs trigger a complete
system hang or crash.
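
A check that covers more than one message type could look roughly like the following. The pattern list is an illustrative assumption and not the list LAVA applies during the boot action, and count_kernel_problems is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: count kernel log lines that look like a warning, bug or oops.
# KERNEL_PATTERNS is an illustrative assumption, not LAVA's own pattern list.

KERNEL_PATTERNS='WARNING:|BUG:|Oops|Call Trace:'

# Reads kernel log text on stdin and prints the number of matching lines.
count_kernel_problems() {
    grep -E -c -- "$KERNEL_PATTERNS"
}
```

Inside a test step this could be used as 'test "$(dmesg | count_kernel_problems)" -eq 0', in the same way as the grep / wc pipeline in the inline definition.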
If there is really no built-in test suite providing that functionality
I wonder if all main users of LAVA (especially kernel-ci) had to
implement that on their own. I would expect a lot of such
lava-test-cases to exist, so a lot of code duplication...
Following the upstream-first principle I should not implement and
especially maintain that on my own...
Florian
...
Hope this is an idea for you.
Best regards
Stefan
On 5/4/2023 5:44 PM, Florian Bezdeka wrote:
Hi all,
I'm basically repeating [1] here as there was no reaction for some
months now. Maybe I used the wrong communication channel, let's see...
We have a test suite that is able to trigger an RCU WARNING inside the
Linux kernel. My expectation was that whenever a kernel warning / oops
/ call stack dump / ... occurs the LAVA job is marked as "failed".
This assumption seems to be wrong. It took some time to realize that we
have a real problem as manual inspection of test logs only happens from
time to time.
After scanning the code my understanding is that the output of the
connection (serial connection in my case) is only parsed during kernel
boot (until the login action takes over). That is not sufficient for
detecting problems that happen during test execution.
Is there a way to scan the full log for the same patterns that are used
by the boot action? If so, how to configure that? Whenever a kernel
problem occurs my test run should be marked as "failed".
Any ideas? Did I overlook something?
Best regards,
Florian
[1] https://git.lavasoftware.org/lava/lava/-/issues/576

Lava-users mailing list -- lava-users@lists.lavasoftware.org
To unsubscribe send an email to lava-users-leave@lists.lavasoftware.org