Hi all,
I'm basically repeating [1] here as there was no reaction for some
months now. Maybe I used the wrong communication channel, let's see...
We have a test suite that is able to trigger an RCU WARNING inside the
Linux kernel. My expectation was that whenever a kernel warning / oops
/ call stack dump / ... occurs the LAVA job is marked as "failed".
This assumption seems to be wrong. It took some time to realize that we
have a real problem as manual inspection of test logs only happens from
time to time.
After scanning the code my understanding is that the output of the
connection (serial connection in my case) is only parsed during kernel
boot (until the login action takes over). That is not sufficient for
detecting problems that happen during test execution.
Is there a way to scan the full log for the same patterns that are used
by the boot action? If so, how to configure that? Whenever a kernel
problem occurs my test run should be marked as "failed".
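One workaround that does not depend on LAVA itself would be to post-process the complete job log after the run. The following is only a minimal sketch, assuming the full log is available as plain text. The pattern list is illustrative and is not copied from LAVA's boot action, and the function names are hypothetical.

```python
import re
from typing import List, Tuple

# Illustrative patterns only: they are NOT taken from LAVA's source and
# would need to be aligned with whatever the boot action really matches.
KERNEL_PROBLEM_PATTERNS = [
    r"WARNING: CPU: \d+ PID: \d+",
    r"Kernel panic - not syncing",
    r"Oops(?: - |: )",
    r"BUG: ",
    r"rcu: INFO: rcu_\w+ (?:self-)?detected stall",
]

_COMPILED = [re.compile(pattern) for pattern in KERNEL_PROBLEM_PATTERNS]


def find_kernel_problems(log_text: str) -> List[Tuple[int, str]]:
    """Return (line_number, line) for every log line matching a pattern."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(rx.search(line) for rx in _COMPILED):
            hits.append((number, line))
    return hits


def job_should_fail(log_text: str) -> bool:
    """Treat any kernel problem found anywhere in the log as a failure."""
    return bool(find_kernel_problems(log_text))
```

A wrapper script could download the log once the job has finished, call job_should_fail() on it, and report the result separately from the LAVA job status.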
Any ideas? Did I overlook something?
Best regards,
Florian
[1] https://git.lavasoftware.org/lava/lava/-/issues/576
Hi All,
I am trying to set up LAVA 2023.08 lava-server and lava-dispatcher on a
single machine. The installation succeeded, and I was able to add workers
and a QEMU device.
From the logs, it seems the workers are communicating with the lava-server.
I am trying to run a simple QEMU sample job, but it always stays in the
Submitted state. I have verified that the device dictionary is correct.
As I understand it, if lava-server and the worker are communicating, a
device configured on the worker should be assigned to the job. In my case,
no device is assigned to the test job and it stays in the Submitted state.
Can someone let me know whether any special settings are required to run a
test job on a QEMU device, or share any doc links that would help resolve
this? Any input will be appreciated.
My Testjob YAML:
https://docs.lavasoftware.org/lava/examples/test-jobs/qemu-amd64-standard-s…
Thanks,
Ankit
Hello,
Is there a standard way to reboot the ECU between test cases by calling the reboot command defined in the device type (hard_reset_command),
so we can be sure the tests do not affect each other?
Thanks.
Hi All,
I am setting up an HTTPS instance of LAVA. I can access the LAVA UI and
log in. I am using a self-signed SSL certificate.
I have added URL="https://172.16.60.178/" to the
/etc/lava-dispatcher/lava-worker file and restarted the service. It
gives me the error below.
2023-10-30 07:10:47,383 ERROR -> server error: code 503
2023-10-30 07:10:47,383 DEBUG --> HTTPSConnectionPool(host='172.16.60.178', port=443): Max retries exceeded with url: /scheduler/internal/v1/workers/debian/?version=2023.10 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate (_ssl.c:1123)')))
Do we need to configure some extra settings when using a
self-signed certificate? Any help will be appreciated.
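For background, the error comes from the default certificate verification done by Python clients: a self-signed certificate is rejected unless it is in the set of trusted CAs. The sketch below uses only the Python standard library to show that mechanism. It is not lava-worker configuration, the function name and the ca_file argument are hypothetical, and whether lava-worker exposes an equivalent setting is not established here.

```python
import ssl
from typing import Optional


def build_client_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Create a verifying TLS client context.

    With ca_file=None only the system trust store is used, so a
    self-signed server certificate fails verification. Passing the
    path of the self-signed certificate (PEM) adds it as trusted.
    """
    context = ssl.create_default_context()
    if ca_file is not None:
        # Trust the given certificate in addition to the system CAs.
        context.load_verify_locations(cafile=ca_file)
    return context
```

Verification stays enabled in both cases; only the set of trusted certificates changes.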
Thanks,
Ankit
Hello everyone,
There seems to be a bug in LAVA. I was on version 2022.04 and have also tried 2023.03. Both versions show the same behaviour.
The same configuration works in a 2018 build of LAVA on an old machine.
I am trying to connect to an always-on board via ssh.
The health check is failing with this error:
lava-dispatcher, installed at version: 2023.03
start: 0 validate
Start time: 2023-04-12 14:07:00.373707+00:00 (UTC)
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/lava_dispatcher/job.py", line 198, in validate
    self._validate()
  File "/usr/lib/python3/dist-packages/lava_dispatcher/job.py", line 183, in _validate
    self.pipeline.validate_actions()
  File "/usr/lib/python3/dist-packages/lava_dispatcher/action.py", line 190, in validate_actions
    action.validate()
  File "/usr/lib/python3/dist-packages/lava_dispatcher/actions/deploy/ssh.py", line 81, in validate
    if "serial" not in self.job.device["actions"]["deploy"]["connections"]:
KeyError: 'connections'
validate duration: 0.00
case: validate
case_id: 244238
definition: lava
result: fail
Cleaning after the job
Root tmp directory removed at /var/lib/lava/dispatcher/tmp/8857
LAVABug: This is probably a bug in LAVA, please report it.
case: job
case_id: 244239
definition: lava
error_msg: 'connections'
error_type: Bug
result: fail
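The traceback suggests that the rendered device dictionary has no "connections" key under actions -> deploy. The sketch below reproduces that lookup on a hypothetical, heavily reduced device dictionary (its contents are an assumption, not the real rendered template) and contrasts it with a tolerant lookup. It illustrates the failure mode only and is not a patch for LAVA.

```python
from typing import Any, Dict


def has_serial_connection(device: Dict[str, Any]) -> bool:
    """Strict lookup, mirroring the expression shown in the traceback."""
    return "serial" in device["actions"]["deploy"]["connections"]


def has_serial_connection_safe(device: Dict[str, Any]) -> bool:
    """Tolerant lookup: a missing key means 'no serial connection'."""
    deploy = device.get("actions", {}).get("deploy", {})
    return "serial" in deploy.get("connections", {})


# Hypothetical rendered device dictionary without a 'connections' entry.
device_without_connections = {"actions": {"deploy": {"methods": {"ssh": {}}}}}
```

Calling has_serial_connection(device_without_connections) raises KeyError: 'connections', which matches the error_msg in the job output, so comparing the rendered device dictionary with the one from the working 2018 instance may show what is missing.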
The health check looks like this:
job_name: SSH check
timeouts:
  job:
    minutes: 10
  action:
    minutes: 2
priority: medium
visibility: public
actions:
- deploy:
    timeout: # timeout for the connection attempt
      seconds: 30
    to: ssh
    os: oe
- boot:
    timeout:
      minutes: 2
    prompts: ['root@(.*):~#']
    method: ssh
    connection: ssh
- test:
    timeout:
      minutes: 5
    definitions:
    - repository:
        metadata:
          format: Lava-Test Test Definition 1.0
          name: smoke-tests-basic
          description: "Basic smoke test"
        run:
          steps:
          - lava-test-case linux-linaro-ubuntu-pwd --shell pwd
          - lava-test-case linux-linaro-ubuntu-uname --shell uname -a
          - lava-test-case linux-linaro-ubuntu-vmstat --shell vmstat
          - lava-test-case linux-linaro-ubuntu-ip --shell ip a
      from: inline
      name: smoke-tests-basic
Any ideas?
Best regards,
Sebastian
Hello everyone,
I have a worker running in a Docker container, and the container runs on an Ubuntu x86 PC.
Now I want to use that Ubuntu x86 PC as a device and connect it to the worker running on it, so that I can execute the job's test action commands on the device.
Which kind of connection should I choose?
I am new to LAVA and am trying to create a new device.
There is no U-Boot, kernel, ... on my system.
To boot, we just need to run the command "echo "reset 1" > /dev/ttyUSB0" on the host; that is the MCU serial port.
Then I can get the boot log from /dev/ttyUSB1, which is the SoC serial port.
The following is the job:
device_type: orinshort
job_name: for orinshort device
timeouts:
  job:
    minutes: 20
  action:
    minutes: 15
priority: medium
visibility: public
actions:
- deploy:
    timeout:
      minutes: 10
    to: flasher
    images:
      kernel:
        url: http://10.19.207.190/static/docs/v2/contents.html#contents-first-steps-using
The following is my device; it is very simple:
{# orin test short #}
{% extends 'orinshort.jinja2' %}
{% set connection_list = ['usb1'] %}
{% set connection_tags = {'usb1': ['primary', 'telnet']} %}
{% set connection_commands = {'usb1': 'telnet localhost 10009'} %}
{% set flasher_reset_commands = ['/tmp/test.sh'] %}
{% block body %}
actions:
  deploy:
    methods:
      flasher:
        commands: {{ flasher_reset_commands }}
{% endblock body %}
I don't know how to set up boot over the serial port.
1. About the command echo "reset 1" > /dev/ttyUSB0
1.1 When I enter the command echo "reset 1" > /dev/ttyUSB0 in the host shell, the SoC is reset.
1.2 When I put it in test.sh and run it with the following code in the device file, the SoC is reset too:
actions:
  deploy:
    methods:
      flasher:
        commands: {{ flasher_reset_commands }}
1.3 When I use {% set flasher_reset_commands = ['echo "reset 1" > /dev/ttyUSB0) '] %}, it doesn't work.
{% set flasher_reset_commands = ['echo \"reset 1\" > /dev/ttyUSB0) '] %} doesn't work either.
So how can I echo a command to /dev/ttyUSB0 through flasher: -> commands?
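One possible explanation for the failing inline command, offered as an assumption rather than a confirmed description of LAVA: if the command string is split into arguments and executed without a shell, the ">" redirection is never interpreted and is simply passed to echo as text. The sketch below uses Python's standard shlex module to show the difference; the helper names are hypothetical.

```python
import shlex
import subprocess
from typing import List


def tokenize(command: str) -> List[str]:
    """Split a command the way a shell-less runner typically would."""
    return shlex.split(command)


def run_via_shell(command: str) -> int:
    """Run the command through 'sh -c' so that redirection is honoured."""
    return subprocess.run(["sh", "-c", command], check=False).returncode


# Without a shell, '>' and the device path are plain arguments to echo.
plain = tokenize('echo "reset 1" > /dev/ttyUSB0')

# Wrapped in 'sh -c', the whole redirection stays inside one argument
# and is interpreted by the shell that gets started.
wrapped = tokenize("sh -c 'echo \"reset 1\" > /dev/ttyUSB0'")
```

This would also explain why the test.sh variant works: the script is run by a shell, which handles the redirection itself.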
2. How do I set up boot over the serial port in the device jinja2 file?
If I add the following code, the test can't run:
boot:
  connections:
    serial: usb1
3. I think the normal process is to configure deploy, boot and other operations in the job, but I don't know how to do that in my case.
We just run the command "echo "reset 1" > /dev/ttyUSB0" on the host to start the system, so which method should I select?
How do I run flasher_reset_commands from the job YAML file?
4. If the job can't start:
The job state is Submitted and the test doesn't start. How do I debug this? Where is the log?
I am new to LAVA and my system is unusual: it is just like Docker, there is no U-Boot or kernel, and we need special commands to deploy and boot.
This is hard for me, so please give me a hand.
Thanks Paweł Wieczorek,
I wasn't aware of this new feature. Great work by you guys, thanks.
-----Original Message-----
From: lava-users-request(a)lists.lavasoftware.org <lava-users-request(a)lists.lavasoftware.org>
Sent: Thursday, October 12, 2023 8:00 AM
To: lava-users(a)lists.lavasoftware.org
Subject: [EXT] lava-users Digest, Vol 61, Issue 10
Today's Topics:
1. Re: About timeouts. (Paweł Wieczorek)
----------------------------------------------------------------------
Message: 1
Date: Wed, 11 Oct 2023 12:43:51 +0200
From: Paweł Wieczorek <pawiecz(a)collabora.com>
Subject: [lava-users] Re: About timeouts.
To: lava-users(a)lists.lavasoftware.org
Message-ID: <32bdd741-289a-ae09-d994-f73b069a892f(a)collabora.com>
Hi Larry,
On 11.10.2023 08:04, Larry Shen wrote:
> Hi, guys, I have a question about timeout:
>
> 1) For next job, the boot action block's timeout will be 5 minutes,
> while pdu-reboot timeout will be 10 seconds.
>
> timeouts:
>   job:
>     minutes: 10
>   action:
>     minutes: 5
>   actions:
>     pdu-reboot:
>       seconds: 10
>   connection:
>     minutes: 2
> actions:
> - boot:
>     failure_retry: 4
>     method: bootloader
>     bootloader: u-boot
>     commands: []
>     prompts: ['=>']
>
> 2. But after we add timeouts to boot actions, the boot action timeout
> is 2 minutes now, that's OK. But the individual pdu-reboot timeout
> will be the left time of 2 minutes.
>
> actions:
> - boot:
>     failure_retry: 4
>     method: bootloader
>     bootloader: u-boot
>     commands: []
>     prompts: ['=>']
>     timeout:
>       minutes: 2
>
> My question is: for the second item, if possible we could let
> pdu-reboot remain the value "10 seconds"?
>
> Above is just an example to explain my question. What I really want to
> achieve is: sometimes, I want to specify the timeout for "uboot wait
> for interrupt", I want to fail that individual sub-action quickly,
> then we are possible retry this action quickly without wait for the
> whole boot action timeout.
>
You can set action block timeouts also for individual actions [0][1] - in your case it could look like:
actions:
- boot:
    failure_retry: 4
    method: bootloader
    bootloader: u-boot
    commands: []
    prompts: ['=>']
    timeout:
      minutes: 2
    timeouts:
      pdu-reboot:
        seconds: 10
This feature should be available if you're running LAVA 2023.01 or newer [2].
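The precedence discussed in this thread can be summarised in a small sketch. It only models the behaviour as described in these emails (block-level timeouts entry, then the block timeout, then the job-level named action, then the job-level default); it is not LAVA's implementation, and the function names are hypothetical.

```python
from typing import Any, Dict, Optional

_UNIT_SECONDS = {"days": 86400, "hours": 3600, "minutes": 60, "seconds": 1}


def to_seconds(spec: Dict[str, int]) -> int:
    """Convert a mapping such as {'minutes': 2} into seconds."""
    return sum(_UNIT_SECONDS[unit] * value for unit, value in spec.items())


def named_action_timeout(
    job_timeouts: Dict[str, Any], block: Dict[str, Any], name: str
) -> Optional[int]:
    """Upper bound in seconds for the named action, per this thread."""
    block_overrides = block.get("timeouts", {})
    if name in block_overrides:
        # Per-block override for one named action (Pawel's suggestion).
        return to_seconds(block_overrides[name])
    if "timeout" in block:
        # Case 2: the action may use whatever is left of the block timeout.
        return to_seconds(block["timeout"])
    job_named = job_timeouts.get("actions", {})
    if name in job_named:
        # Case 1: job-level timeout for the named action.
        return to_seconds(job_named[name])
    if "action" in job_timeouts:
        return to_seconds(job_timeouts["action"])
    return None
```

With the job-level timeouts from the question, this gives 10 seconds for pdu-reboot in case 1, up to 120 seconds in case 2, and 10 seconds again once the block-level timeouts entry is added.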
[0] https://lava.collabora.dev/static/docs/v2/actions-timeout.html#individual-a…
[1] https://lava.collabora.dev/static/docs/v2/timeouts.html#action-block-overri…
[2] https://gitlab.com/lava/lava/-/commit/15650f11aa10931c9b2a148ae16561b748a38…
Kind regards,
Paweł
> Any idea? Thanks.
>
> Regards,
>
> Larry
Hi, guys, I have a question about timeouts:
1. For the following job, the boot action block's timeout will be 5 minutes, while the pdu-reboot timeout will be 10 seconds.
timeouts:
  job:
    minutes: 10
  action:
    minutes: 5
  actions:
    pdu-reboot:
      seconds: 10
  connection:
    minutes: 2
actions:
- boot:
    failure_retry: 4
    method: bootloader
    bootloader: u-boot
    commands: []
    prompts: ['=>']
2. But after we add a timeout to the boot action, the boot action timeout is 2 minutes, which is OK. However, the individual pdu-reboot timeout then becomes whatever time is left of those 2 minutes.
actions:
- boot:
    failure_retry: 4
    method: bootloader
    bootloader: u-boot
    commands: []
    prompts: ['=>']
    timeout:
      minutes: 2
My question is: in the second case, is it possible to keep the pdu-reboot timeout at 10 seconds?
The above is just an example to explain my question. What I really want to achieve is this: sometimes I want to specify the timeout for "uboot wait for interrupt", so that this individual sub-action fails quickly and we can retry it quickly without waiting for the whole boot action timeout.
Any idea? Thanks.
Regards,
Larry