Hi Stefan!
On Fri, 2023-05-05 at 14:21 +0200, Stefan wrote:
Hey Florian!
Yeah, communication isn't that easy, that's what I figured out too.
As far as I understood your observations are correct. With standard setup (LAVA attached to console) it isn't easy to distinguish between script and kernel log outputs after login prompt, maybe that's why it isn't done anymore after login.
But there's a manual workaround you can put into your jobs:
- Run 'dmesg -c' to clear the ringbuffer
- Run that test that might trigger the RCU WARNING
- Run 'dmesg' and parse the output if that warning string appears
So that triggering test might not fail, but parsing dmesg output will fail if that string appears, thus your job will report a failed test.
For parsing the dmesg output you can use a inline test job definition like this, or put something similar somewhere into your scripts:
- test:
timeout: minutes: 1 definitions: - repository: metadata: format: Lava-Test Test Definition 1.0 name: parse-dmesg-output description: "Test for RCU WARNING in kernel log" run: steps: - lava-test-case test-RCU-WARNING --shell test $(dmesg | grep "RCU WARNING" | wc -l) -eq 0 from: inline name: env-dut-inline path: inline/env-dut.yaml
Well, yes, I could do it manually but I expected it to be one of the main use cases of LAVA to run tests on a Linux system and check if any WARNING, BUG, ..., occurred. Not all Kernel bugs trigger a complete system hang or crash.
If there is really no build-in test suite providing that functionality I wonder if all main users of LAVA (especially kernel-ci) had to implement that on their own. I would expect a lot of such lava-test- cases to exist, so a lot of code duplication...
Following the upstream first principle I should not implement and especially maintain that my own...
Florian
Hope this is an idea for you.
Best regards
Stefan
On 5/4/2023 5:44 PM, Florian Bezdeka wrote:
Hi all,
I'm basically repeating [1] here as there was no reaction for some months now. Maybe I used the wrong communication channel, let's see...
We have a testsuite that is able to trigger a RCU WARNING inside the Linux kernel. My expectation was that whenever a kernel warning / oops / call stack dump / ... occurs the LAVA job is marked as "failed".
This assumption seems to be wrong. It took some time to realize that we have a real problem as manual inspection of test logs only happens from time to time.
After scanning the code my understanding is that the output of the connection (serial connection in my case) is only parsed during kernel boot (until the login action takes over). That is not sufficient for detecting problems that happen during test execution.
Is there a way to scan the full log for the same patterns that are used by the boot action? If so, how to configure that? Whenever a kernel problem occurs my test run should be marked as "failed".
Any ideas? Did I overlook something?
Best regards, Florian
[1] https://git.lavasoftware.org/lava/lava/-/issues/576
Lava-users mailing list -- lava-users@lists.lavasoftware.org To unsubscribe send an email to lava-users-leave@lists.lavasoftware.org %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s
Lava-users mailing list -- lava-users@lists.lavasoftware.org To unsubscribe send an email to lava-users-leave@lists.lavasoftware.org %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s