Hi all,
I'm basically repeating [1] here, as there has been no reaction for some
months now. Maybe I used the wrong communication channel; let's see...
We have a test suite that can trigger an RCU WARNING inside the
Linux kernel. My expectation was that whenever a kernel warning / oops
/ call stack dump / ... occurs, the LAVA job is marked as "failed".
This assumption seems to be wrong. It took some time to realize that we
have a real problem, as manual inspection of test logs only happens from
time to time.
After scanning the code, my understanding is that the output of the
connection (a serial connection in my case) is only parsed for such
patterns during kernel boot, until the login action takes over. That is
not sufficient for detecting problems that happen during test execution.
Is there a way to scan the full log for the same patterns that the boot
action uses? If so, how do I configure that? Whenever a kernel problem
occurs, my test run should be marked as "failed".
Any ideas? Did I overlook something?
Best regards,
Florian
[1] https://git.lavasoftware.org/lava/lava/-/issues/576
Hello everyone,
There seems to be a bug in LAVA. I was on version 2022.04 and have also tried 2023.03; both versions show the same bug.
The same configuration works in a 2018 build of LAVA on an old machine.
I am trying to connect to an always-on board via SSH.
The health check is failing with this error:
lava-dispatcher, installed at version: 2023.03
start: 0 validate
Start time: 2023-04-12 14:07:00.373707+00:00 (UTC)
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/lava_dispatcher/job.py", line 198, in validate
    self._validate()
  File "/usr/lib/python3/dist-packages/lava_dispatcher/job.py", line 183, in _validate
    self.pipeline.validate_actions()
  File "/usr/lib/python3/dist-packages/lava_dispatcher/action.py", line 190, in validate_actions
    action.validate()
  File "/usr/lib/python3/dist-packages/lava_dispatcher/actions/deploy/ssh.py", line 81, in validate
    if "serial" not in self.job.device["actions"]["deploy"]["connections"]:
KeyError: 'connections'
validate duration: 0.00
case: validate
case_id: 244238
definition: lava
result: fail
Cleaning after the job
Root tmp directory removed at /var/lib/lava/dispatcher/tmp/8857
LAVABug: This is probably a bug in LAVA, please report it.
case: job
case_id: 244239
definition: lava
error_msg: 'connections'
error_type: Bug
result: fail
The health check looks like this:
job_name: SSH check
timeouts:
  job:
    minutes: 10
  action:
    minutes: 2
priority: medium
visibility: public

actions:
- deploy:
    timeout:  # timeout for the connection attempt
      seconds: 30
    to: ssh
    os: oe

- boot:
    timeout:
      minutes: 2
    prompts: ['root@(.*):~#']
    method: ssh
    connection: ssh

- test:
    timeout:
      minutes: 5
    definitions:
    - repository:
        metadata:
          format: Lava-Test Test Definition 1.0
          name: smoke-tests-basic
          description: "Basic smoke test"
        run:
          steps:
          - lava-test-case linux-linaro-ubuntu-pwd --shell pwd
          - lava-test-case linux-linaro-ubuntu-uname --shell uname -a
          - lava-test-case linux-linaro-ubuntu-vmstat --shell vmstat
          - lava-test-case linux-linaro-ubuntu-ip --shell ip a
      from: inline
      name: smoke-tests-basic
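Reading the failing line in ssh.py, validate() expects a connections block
under actions.deploy in the device dictionary. So my guess (unverified; the
key names are derived only from the traceback) is that the device dictionary
on the new instance is missing something along these lines:

```yaml
actions:
  deploy:
    methods:
      ssh:
    connections:
      ssh:
      serial:
```

That might explain why the same job still works on the 2018 instance, if its
device-type template provides such a block.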
Any ideas?
Best regards,
Sebastian
Hi,
My board has an MCU and an SoC, with one serial port for each (ttyUSB0: MCU, ttyUSB1: SoC). The host PC (which acts as the LAVA worker) is connected to both serial ports. The command to reboot the SoC is echo "soc boot" > /dev/ttyUSB0; the boot log then appears on ttyUSB1.
We want to use LAVA to test the SoC system, which runs arm64 Linux without U-Boot, and we would like to include a build action for the device.
So the deploy and boot steps should be:
1. Run build.sh on the host PC (the LAVA worker) and check the result (pass/fail).
2. Run echo "soc burn" > /dev/ttyUSB0 on the host PC to switch the SoC to burn mode, and check the result (pass/fail).
3. Run burn.sh on the host PC to burn the image to the SoC, and check the result (pass/fail).
4. Run echo "soc boot" > /dev/ttyUSB0 to reboot the SoC, and check the result (pass/fail).
5. Connect to /dev/ttyUSB1 to get the SoC boot log, and check the result (pass/fail).
6. SSH into the Linux system on the SoC.
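My rough idea of how this could map onto a device dictionary, using the
flasher deploy method and the device connection commands (a sketch only;
the paths and the ser2net port are made up, and I have not verified the
exact keys):

```yaml
commands:
  # step 4: reboot the SoC through the MCU console
  hard_reset: sh -c 'echo "soc boot" > /dev/ttyUSB0'
  connections:
    uart0:
      # step 5: SoC boot log, exported e.g. via ser2net (assumed setup)
      connect: telnet localhost 7101
      tags:
      - primary
actions:
  deploy:
    methods:
      flasher:
        commands:
        - /usr/local/bin/build.sh                  # step 1 (hypothetical path)
        - sh -c 'echo "soc burn" > /dev/ttyUSB0'   # step 2
        - /usr/local/bin/burn.sh                   # step 3
```

Step 6 could then presumably be a normal test action over the serial
connection or a secondary SSH connection.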
What I want to know is:
1. Is the above design feasible in LAVA?
2. What do I need to do for this? Are there any device-type templates that I can refer to?
The following is my LAVA installation; I can run test jobs with a QEMU device now.
~/work/src/lava $ dpkg -l |grep lava
ii lava 2022.11.1+10+buster all Linaro Automated Validation Architecture metapackage
ii lava-common 2022.11.1+10+buster all Linaro Automated Validation Architecture common
ii lava-coordinator 2022.11.1+10+buster all LAVA coordinator daemon
ii lava-dev 2022.11.1+10+buster all Linaro Automated Validation Architecture developer support
ii lava-dispatcher 2022.11.1+10+buster all Linaro Automated Validation Architecture dispatcher
ii lava-dispatcher-host 2022.11.1+10+buster all LAVA dispatcher host tools
ii lava-server 2022.11.1+10+buster all Linaro Automated Validation Architecture server
ii lava-server-doc 2022.11.1+10+buster all Linaro Automated Validation Architecture documentation
ii lava-tool 0.25-2 all deprecated command line utility for LAVA
ii lavacli 0.9.7+buster all LAVA XML-RPC command line interface
~/work/src/lava $
Hi, guys,
I currently have 2 jobs:
Job1:
actions:
- boot:
    failure_retry: 4
    method: bootloader
    bootloader: u-boot
    commands: []
    prompts: ['=>']
    timeout:
      minutes: 2
- test:
    timeout:
      minutes: 4
    interactive:
    - name: check-uboot
      prompts: ["=> ", "/ # "]
      script:
      - command: "printenv"
        name: printenv
        successes:
        - message: "soc_type=imx93"
Job2:
actions:
- deploy:
    to: uuu
    images:
      boot:
        url: /path/to/bootloader
- boot:
    method: uuu
    commands:
    - bcu: reset usb
    - uuu: -b sd {boot}
    - bcu: deinit
    timeout:
      minutes: 2
- boot:
    method: bootloader
    bootloader: u-boot
    commands: []
    prompts: ['=>']
    timeout:
      minutes: 2
- test:
    interactive:
    - name: check-uboot
      prompts: ["=> ", "/ # "]
      script:
      - command: "printenv"
        name: printenv
        successes:
        - message: "soc_type=imx93"
    timeout:
      minutes: 2
The first job just boots the board and checks that the bootloader is OK; the second job flashes a new bootloader to the board every time before checking it.
I wonder if it is possible to combine the two and express the following logic in a single job:
1. Boot the board to check U-Boot.
2. If U-Boot is OK, the job finishes.
3. If U-Boot is not OK, or the action times out, go to a flash action to flash a new bootloader to the device; then the job finishes.
The idea is that step 3 is optional: we only flash a new bootloader when the previous boot action fails.
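In other words, what I would like to write is something like this imaginary
definition (the on-failure nesting is invented purely to show the intent; as
far as I can see, LAVA has no such conditional syntax):

```yaml
actions:
- boot:                    # step 1: try the existing bootloader
    method: bootloader
    bootloader: u-boot
    commands: []
    prompts: ['=>']
    timeout:
      minutes: 2
    on-failure:            # invented key: only runs when the boot above fails
    - deploy:
        to: uuu
        images:
          boot:
            url: /path/to/bootloader
    - boot:
        method: uuu
        commands:
        - bcu: reset usb
        - uuu: -b sd {boot}
        - bcu: deinit
```

Is there an existing mechanism, like failure_retry but with a deploy in
between, that can express this?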
Regards,
Larry
Hi all,
I am trying to set up my own embedded device using the multinode API.
The scenario is simple: the target needs to wait until the host role is done.
The test action that waits for the host role looks like this:
```
- test:
    interactive:
    - name: send_target_ready
      prompts:
      - 'Generate Erased Block'
      script:
      - command: null
        name: result
      - lava-send: booted
      - lava-wait: done
    role:
    - target
```
The multinode job works fine, but the test action I mentioned does not show live logs from the device's UART connection.
Is it possible to show live logs from the device while it is waiting on the host with 'lava-wait'?
Thanks.
Hello Team,
I'm using the notify action in the job definition to notify users about the
status of the job. I'm already a registered user in LAVA, and after
completion of my job, the administration site shows the status as "not
sent".
Please let me know what the reason could be and how I can achieve this.
[image: lava-notify.PNG]
Below is the job definition I'm using:
device_type: ADT-UNIT1
job_name: sample test to notify user
timeouts:
  job:
    minutes: 15
  action:
    minutes: 10
  connection:
    minutes: 5
visibility: public

actions:
- command:
    name: relay_pwr_on
    timeout:
      minutes: 1

- deploy:
    to: flasher
    images:
      package:
        url: https://artifactory.softwaretools.com/artifactory/gop-generic-stable-local/…

- boot:
    method: u-boot
    commands:
    - setenv factorymode 1
    - boot
    auto_login:
      login_prompt: 'login:'
      username: root
      password_prompt: 'Password:'
      password: root
      login_commands:
      - touch /home/root/test_file
      - ifconfig
    prompts:
    - 'root@hon-grip'
    - 'root@GRIP'

notify:
  recipients:
  - to:
      method: email
      user: pavan
  criteria:
    status: finished
  verbosity: verbose
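One thing I am not sure about: does the server need a working outgoing-mail
configuration before notify can send anything? Since the LAVA server is a
Django application, I assume the standard Django email settings have to be
set somewhere, e.g. in a YAML file under /etc/lava-server/settings.d/ (the
location is my assumption for recent versions; the setting names below are
standard Django ones, and the host and credentials are placeholders):

```yaml
# /etc/lava-server/settings.d/smtp.yaml (assumed location)
EMAIL_HOST: "smtp.example.com"
EMAIL_PORT: 587
EMAIL_HOST_USER: "lava"
EMAIL_HOST_PASSWORD: "********"
EMAIL_USE_TLS: true
DEFAULT_FROM_EMAIL: "lava@example.com"
```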
Thanks & Regards,
Pavan
Hi all,
I am struggling with a situation where one of the multinode jobs is stuck
in 'scheduling' status forever.
[image: image.png]
It occurs about 1 time in 10.
Sometimes one of the multinode jobs gets stuck in 'scheduling' status and
the other job times out waiting for the first one.
- lava-scheduler log
[image: image.png]
When this issue occurs, the lava-scheduler log indicates that only one of
the multinode jobs was scheduled, and the other was not.
- lava-dispatcher log
[image: image.png]
And when the issue occurs, the lava-dispatcher log shows that only one job
was triggered.
I am using lava-server and lava-dispatcher in Docker instances (version
2023.08). It occurs with 2023.06 too.
The issue seems to be related to the lava-scheduler. What should I check to
resolve it?
Please advise.
Thank you