Hi all,
I'm basically repeating [1] here, as there has been no reaction for some
months now. Maybe I used the wrong communication channel; let's see...
We have a test suite that can trigger an RCU WARNING inside the
Linux kernel. My expectation was that whenever a kernel warning / oops
/ call stack dump / ... occurs, the LAVA job is marked as "failed".
This assumption seems to be wrong. It took some time to realize that we
have a real problem, as manual inspection of test logs only happens from
time to time.
After scanning the code, my understanding is that the output of the
connection (a serial connection in my case) is only parsed for such
patterns during kernel boot, until the login action takes over. That is
not sufficient for detecting problems that happen during test execution.
Is there a way to scan the full log for the same patterns that the boot
action uses? If so, how do I configure that? Whenever a kernel problem
occurs, my test run should be marked as "failed".
Any ideas? Did I overlook something?
Best regards,
Florian
[1] https://git.lavasoftware.org/lava/lava/-/issues/576
Hello everyone,
There seems to be a bug in LAVA. I was on version 2022.04 and have also tried 2023.03; both versions show the same bug.
The same configuration works in a 2018 build of LAVA on an old machine.
I am trying to connect to an always-on board via SSH.
The health check is failing with this error:
lava-dispatcher, installed at version: 2023.03
start: 0 validate
Start time: 2023-04-12 14:07:00.373707+00:00 (UTC)
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/lava_dispatcher/job.py", line 198, in validate
    self._validate()
  File "/usr/lib/python3/dist-packages/lava_dispatcher/job.py", line 183, in _validate
    self.pipeline.validate_actions()
  File "/usr/lib/python3/dist-packages/lava_dispatcher/action.py", line 190, in validate_actions
    action.validate()
  File "/usr/lib/python3/dist-packages/lava_dispatcher/actions/deploy/ssh.py", line 81, in validate
    if "serial" not in self.job.device["actions"]["deploy"]["connections"]:
KeyError: 'connections'
validate duration: 0.00
case: validate
case_id: 244238
definition: lava
result: fail
Cleaning after the job
Root tmp directory removed at /var/lib/lava/dispatcher/tmp/8857
LAVABug: This is probably a bug in LAVA, please report it.
case: job
case_id: 244239
definition: lava
error_msg: 'connections'
error_type: Bug
result: fail
The health check looks like this:
job_name: SSH check
timeouts:
  job:
    minutes: 10
  action:
    minutes: 2
priority: medium
visibility: public

actions:
- deploy:
    timeout:  # timeout for the connection attempt
      seconds: 30
    to: ssh
    os: oe

- boot:
    timeout:
      minutes: 2
    prompts: ['root@(.*):~#']
    method: ssh
    connection: ssh

- test:
    timeout:
      minutes: 5
    definitions:
    - repository:
        metadata:
          format: Lava-Test Test Definition 1.0
          name: smoke-tests-basic
          description: "Basic smoke test"
        run:
          steps:
          - lava-test-case linux-linaro-ubuntu-pwd --shell pwd
          - lava-test-case linux-linaro-ubuntu-uname --shell uname -a
          - lava-test-case linux-linaro-ubuntu-vmstat --shell vmstat
          - lava-test-case linux-linaro-ubuntu-ip --shell ip a
      from: inline
      name: smoke-tests-basic
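Reading the failing line in ssh.py, validate() expects a connections block
under actions.deploy in the device dictionary. So my guess (unverified; the
key names are derived only from the traceback) is that the device dictionary
on the new instance is missing something along these lines:

```yaml
actions:
  deploy:
    methods:
      ssh:
    connections:
      ssh:
      serial:
```

That might explain why the same job still works on the 2018 instance, if its
device-type template provides such a block.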
Any ideas?
Best regards,
Sebastian
Hi,
My board has an MCU and an SoC, with one serial port for each (ttyUSB0: MCU, ttyUSB1: SoC). The host PC (which acts as the LAVA worker) is connected to both serial ports. The command to reboot the SoC is echo "soc boot" > /dev/ttyUSB0; the boot log then appears on ttyUSB1.
We want to use LAVA to test the SoC system, which runs arm64 Linux without U-Boot, and we would like to include a build action for the device.
So the deploy and boot steps should be:
1. Run build.sh on the host PC (the LAVA worker) and check the result (pass/fail).
2. Run echo "soc burn" > /dev/ttyUSB0 on the host PC to switch the SoC to burn mode, and check the result (pass/fail).
3. Run burn.sh on the host PC to burn the image to the SoC, and check the result (pass/fail).
4. Run echo "soc boot" > /dev/ttyUSB0 to reboot the SoC, and check the result (pass/fail).
5. Connect to /dev/ttyUSB1 to get the SoC boot log, and check the result (pass/fail).
6. SSH into the Linux system on the SoC.
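My rough idea of how this could map onto a device dictionary, using the
flasher deploy method and the device connection commands (a sketch only;
the paths and the ser2net port are made up, and I have not verified the
exact keys):

```yaml
commands:
  # step 4: reboot the SoC through the MCU console
  hard_reset: sh -c 'echo "soc boot" > /dev/ttyUSB0'
  connections:
    uart0:
      # step 5: SoC boot log, exported e.g. via ser2net (assumed setup)
      connect: telnet localhost 7101
      tags:
      - primary
actions:
  deploy:
    methods:
      flasher:
        commands:
        - /usr/local/bin/build.sh                  # step 1 (hypothetical path)
        - sh -c 'echo "soc burn" > /dev/ttyUSB0'   # step 2
        - /usr/local/bin/burn.sh                   # step 3
```

Step 6 could then presumably be a normal test action over the serial
connection or a secondary SSH connection.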
What I want to know is:
1. Is the above design feasible in LAVA?
2. What do I need to do for this? Are there any device-type templates that I can refer to?
The following is my LAVA installation; I can run test jobs with a QEMU device now.
~/work/src/lava $ dpkg -l |grep lava
ii lava 2022.11.1+10+buster all Linaro Automated Validation Architecture metapackage
ii lava-common 2022.11.1+10+buster all Linaro Automated Validation Architecture common
ii lava-coordinator 2022.11.1+10+buster all LAVA coordinator daemon
ii lava-dev 2022.11.1+10+buster all Linaro Automated Validation Architecture developer support
ii lava-dispatcher 2022.11.1+10+buster all Linaro Automated Validation Architecture dispatcher
ii lava-dispatcher-host 2022.11.1+10+buster all LAVA dispatcher host tools
ii lava-server 2022.11.1+10+buster all Linaro Automated Validation Architecture server
ii lava-server-doc 2022.11.1+10+buster all Linaro Automated Validation Architecture documentation
ii lava-tool 0.25-2 all deprecated command line utility for LAVA
ii lavacli 0.9.7+buster all LAVA XML-RPC command line interface
~/work/src/lava $
Hi, guys,
I currently have 2 jobs:
Job1:
actions:
- boot:
    failure_retry: 4
    method: bootloader
    bootloader: u-boot
    commands: []
    prompts: ['=>']
    timeout:
      minutes: 2
- test:
    timeout:
      minutes: 4
    interactive:
    - name: check-uboot
      prompts: ["=> ", "/ # "]
      script:
      - command: "printenv"
        name: printenv
        successes:
        - message: "soc_type=imx93"
Job2:
actions:
- deploy:
    to: uuu
    images:
      boot:
        url: /path/to/bootloader
- boot:
    method: uuu
    commands:
    - bcu: reset usb
    - uuu: -b sd {boot}
    - bcu: deinit
    timeout:
      minutes: 2
- boot:
    method: bootloader
    bootloader: u-boot
    commands: []
    prompts: ['=>']
    timeout:
      minutes: 2
- test:
    interactive:
    - name: check-uboot
      prompts: ["=> ", "/ # "]
      script:
      - command: "printenv"
        name: printenv
        successes:
        - message: "soc_type=imx93"
    timeout:
      minutes: 2
The first job just boots the board and checks that the bootloader is OK; the second job flashes a new bootloader to the board every time before checking it.
I wonder if it is possible to combine the two and express the following logic in a single job:
1. Boot the board to check U-Boot.
2. If U-Boot is OK, the job finishes.
3. If U-Boot is not OK, or the action times out, go to a flash action to flash a new bootloader to the device; then the job finishes.
The idea is that step 3 is optional: we only flash a new bootloader when the previous boot action fails.
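In other words, what I would like to write is something like this imaginary
definition (the on-failure nesting is invented purely to show the intent; as
far as I can see, LAVA has no such conditional syntax):

```yaml
actions:
- boot:                    # step 1: try the existing bootloader
    method: bootloader
    bootloader: u-boot
    commands: []
    prompts: ['=>']
    timeout:
      minutes: 2
    on-failure:            # invented key: only runs when the boot above fails
    - deploy:
        to: uuu
        images:
          boot:
            url: /path/to/bootloader
    - boot:
        method: uuu
        commands:
        - bcu: reset usb
        - uuu: -b sd {boot}
        - bcu: deinit
```

Is there an existing mechanism, like failure_retry but with a deploy in
between, that can express this?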
Regards,
Larry
Hi all,
I am trying to set up my own embedded device using the multinode API.
The scenario is simple: the target needs to wait until the host role is done.
The test action that waits for the host role looks like this:
```
- test:
    interactive:
    - name: send_target_ready
      prompts:
      - 'Generate Erased Block'
      script:
      - command: null
        name: result
      - lava-send: booted
      - lava-wait: done
    role:
    - target
```
The multinode job works fine, but the test action I mentioned does not show live logs from the device's UART connection.
Is it possible to show live logs from the device while it is waiting on the host with 'lava-wait'?
Thanks.
Hello Team,
I'm using the notify action in the job definition to notify users about the
status of the job. I'm already a registered user in LAVA, and after
completion of my job, the administration site shows the status as "not
sent".
Please let me know what the reason could be and how I can achieve this.
[image: lava-notify.PNG]
Below is the job definition I'm using:
device_type: ADT-UNIT1
job_name: sample test to notify user
timeouts:
  job:
    minutes: 15
  action:
    minutes: 10
  connection:
    minutes: 5
visibility: public

actions:
- command:
    name: relay_pwr_on
    timeout:
      minutes: 1

- deploy:
    to: flasher
    images:
      package:
        url: https://artifactory.softwaretools.com/artifactory/gop-generic-stable-local/…

- boot:
    method: u-boot
    commands:
    - setenv factorymode 1
    - boot
    auto_login:
      login_prompt: 'login:'
      username: root
      password_prompt: 'Password:'
      password: root
      login_commands:
      - touch /home/root/test_file
      - ifconfig
    prompts:
    - 'root@hon-grip'
    - 'root@GRIP'

notify:
  recipients:
  - to:
      method: email
      user: pavan
  criteria:
    status: finished
  verbosity: verbose
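One thing I am not sure about: does the server need a working outgoing-mail
configuration before notify can send anything? Since the LAVA server is a
Django application, I assume the standard Django email settings have to be
set somewhere, e.g. in a YAML file under /etc/lava-server/settings.d/ (the
location is my assumption for recent versions; the setting names below are
standard Django ones, and the host and credentials are placeholders):

```yaml
# /etc/lava-server/settings.d/smtp.yaml (assumed location)
EMAIL_HOST: "smtp.example.com"
EMAIL_PORT: 587
EMAIL_HOST_USER: "lava"
EMAIL_HOST_PASSWORD: "********"
EMAIL_USE_TLS: true
DEFAULT_FROM_EMAIL: "lava@example.com"
```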
Thanks & Regards,
Pavan
Hi all,
I am struggling with a situation where one of the multinode jobs is stuck
in 'scheduling' status forever.
[image: image.png]
It occurs about 1 time in 10.
Sometimes one of the multinode jobs gets stuck in 'scheduling' status and
the other job times out waiting for the first one.
- lava-scheduler log
[image: image.png]
When this issue occurs, the lava-scheduler log indicates that only one of
the multinode jobs was scheduled, and the other was not.
- lava-dispatcher log
[image: image.png]
And when the issue occurs, the lava-dispatcher log shows that only one job
was triggered.
I am using lava-server and lava-dispatcher in Docker instances (version
2023.08). It occurs with 2023.06 too.
The issue seems to be related to the lava-scheduler. What should I check to
resolve it?
Please advise.
Thank you