Hi,
I am trying to set up LAVA with the master and the worker on a single node.
When I submit a job it shows as running, but no device is selected and I get no output in the UI. When I checked lava-scheduler, I found it failing with the error below:
root@debian:/etc/apache2/sites-available# systemctl status lava-scheduler
× lava-scheduler.service - LAVA scheduler
Loaded: loaded (/lib/systemd/system/lava-scheduler.service; enabled; preset: enabled)
Active: failed (Result: start-limit-hit) since Tue 2024-07-02 18:52:36 IST; 7min ago
Duration: 1.472s
Process: 10345 ExecStart=/usr/bin/lava-server manage lava-scheduler --level $LOGLEVEL --log-file $LOGFILE $EVENT_URL $IPV6 (code=exited, status=0/SUCCES>
Main PID: 10345 (code=exited, status=0/SUCCESS)
CPU: 1.431s
Jul 02 18:52:35 debian systemd[1]: lava-scheduler.service: Deactivated successfully.
Jul 02 18:52:35 debian systemd[1]: lava-scheduler.service: Consumed 1.431s CPU time.
Jul 02 18:52:36 debian systemd[1]: lava-scheduler.service: Scheduled restart job, restart counter is at 5.
Jul 02 18:52:36 debian systemd[1]: Stopped lava-scheduler.service - LAVA scheduler.
Jul 02 18:52:36 debian systemd[1]: lava-scheduler.service: Consumed 1.431s CPU time.
Jul 02 18:52:36 debian systemd[1]: lava-scheduler.service: Start request repeated too quickly.
Jul 02 18:52:36 debian systemd[1]: lava-scheduler.service: Failed with result 'start-limit-hit'.
Jul 02 18:52:36 debian systemd[1]: Failed to start lava-scheduler.service - LAVA scheduler.
root@debian:/etc/apache2/sites-available# tail -f /var/log/lava-server/lava-scheduler.log
{% extends 'qemu.jinja2' %}
^^^^^^^^^^^^^^^^^^^^^^^^^
[Previous line repeated 977 more times]
File "/usr/lib/python3/dist-packages/jinja2/environment.py", line 1494, in is_up_to_date
return self._uptodate()
^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/jinja2/loaders.py", line 212, in uptodate
return os.path.getmtime(filename) == mtime
^^^^^^^^^^^^^^^^^^^^^^^^^^
RecursionError: maximum recursion depth exceeded
Also, when I access /admin/lava_scheduler_app/device/ I get "500 Internal Server Error
maximum recursion depth exceeded".
I have registered my worker node as debian, the hostname is debian and the device type is qemu. Could you please tell me what I am missing?
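One thing that might be worth checking, offered as an assumption rather than a diagnosis: the repeated `{% extends 'qemu.jinja2' %}` frames in the traceback would be consistent with a device dictionary file that is itself named qemu.jinja2 and therefore ends up extending itself. A minimal Python sketch that looks for such files follows; the helper names are hypothetical and not part of LAVA, and the directory in the usage comment is an assumed location that may differ on a given install.

```python
import re
from pathlib import Path

# Matches {% extends 'name' %} with either quote style and optional "-" trim markers.
EXTENDS_RE = re.compile(r"""\{%-?\s*extends\s+['"]([^'"]+)['"]\s*-?%\}""")


def parent_template(text):
    """Return the template name from the first extends tag, or None if there is none."""
    match = EXTENDS_RE.search(text)
    return match.group(1) if match else None


def self_extending(directory):
    """List .jinja2 files in `directory` whose extends tag names the file itself."""
    hits = []
    for path in sorted(Path(directory).glob("*.jinja2")):
        if parent_template(path.read_text()) == path.name:
            hits.append(path.name)
    return hits


# Usage (assumed path, adjust as needed):
# print(self_extending("/etc/lava-server/dispatcher-config/devices"))
```

If a file shows up in that list, giving the device a name that differs from the device type (for example a hypothetical qemu01) would be one way to test the theory.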
Just as the subject says. I am using lava-test-case to confirm whether a particular command has finished running and whether it succeeded, because the test has previously always ended prematurely without doing what I needed. When I run the command manually outside of a LAVA test job it works fine, with no errors and no early exit. It does take a while to complete, but I don't think that is the issue: I use wget to download some large files, and those downloads have taken several minutes longer than this one command is supposed to.
How do I get LAVA to wait for this command to finish? Or how do I change the way it checks for command failure?
I get "Received signal: <ENDTC>" almost immediately after running the command, followed by the result=fail signal. I really don't know why it won't work, so if there is a change I can make to the files or to a config somewhere, I would like to know.
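To make the behaviour I am after concrete, here is a minimal Python sketch of "block until the command exits, then report from its exit code". It assumes Python is available on the DUT; the function name and the script name in the usage comment are hypothetical. The only LAVA piece it relies on is the `lava-test-case <id> --result <pass|fail>` form of the helper.

```python
import subprocess
import sys


def run_and_report(case_id, command, reporter="lava-test-case"):
    """Run `command` to completion, then build the reporting call from its exit code."""
    completed = subprocess.run(command)  # blocks until the command exits
    result = "pass" if completed.returncode == 0 else "fail"
    report = [reporter, case_id, "--result", result]
    return completed.returncode, report


if __name__ == "__main__" and len(sys.argv) > 2:
    # Usage (hypothetical): python3 run_and_report.py my-case ./long_command arg1
    code, report = run_and_report(sys.argv[1], sys.argv[2:])
    subprocess.run(report)  # only works inside a LAVA test shell
    sys.exit(code)
```

This separates "did the command finish" from "how the result is reported", which may help to show whether the early ENDTC comes from the command itself returning quickly.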
Best regards,
Michael
Hi,
I was wondering whether it is possible to write a multinode test that boots a device, runs a script, has the device reboot one way or another, and then runs tests.
Pipeline integration is intended down the line, so if this isn't possible with multinode, I have considered a fallback: one regular test job runs the script and turns the DUT off, then a second job boots the device and actually tests it. This obviously has its drawbacks, but it would do as a backup.
If I understand correctly, the order in which actions are placed in a multinode test matters, so it should (in my mind) be possible to have a boot->test pair that boots the DUT and uses test commands to run the script, followed by another boot->test pair that boots the device (or just logs in again) and then actually runs the tests. The final sequence of actions would be boot->test->boot->test, where the first two run the script and reboot, and the last two do the actual testing.
If there are other ideas that are better or equally viable, I'm open to those too, but multinode was the first that came to mind because of secondary connections and the like.
Best regards,
Michael
Hello
I am experiencing trouble connecting to apt.lavasoftware.org from a VPS
located in the "Paris France SD6" datacenter hosted by gandi.net.
The problematic VPS IP address is 46.226.105.174.
Testing from another VPS, still hosted at gandi.net but located in a
different datacenter (Luxembourg), presents no issues. Both VPSes run
Debian 12 Bookworm.
The symptoms I'm experiencing seem to suggest that the TLS handshake gets
truncated, as shown by the 'gnutls-cli' output reported below:
-------------------------------------------------------------------------------
# gnutls-cli -d9999 -p 443 apt.lavasoftware.org
...
|<5>| REC[0x56498f5d7740]: Sent Packet[1] Handshake(22) in epoch 0 and length: 402
|<11>| HWRITE: wrote 1 bytes, 0 bytes left.
|<11>| WRITE FLUSH: 402 bytes in buffer.
|<11>| WRITE: wrote 402 bytes, 0 bytes left.
|<3>| ASSERT: ../../lib/buffers.c[get_last_packet]:1185
|<10>| READ: Got 0 bytes from 0x3
|<10>| READ: read 0 bytes from 0x3
|<3>| ASSERT: ../../lib/buffers.c[_gnutls_io_read_buffered]:593
|<3>| ASSERT: ../../lib/record.c[recv_headers]:1195
|<3>| ASSERT: ../../lib/record.c[_gnutls_recv_in_buffers]:1321
|<3>| ASSERT: ../../lib/buffers.c[_gnutls_handshake_io_recv_int]:1467
|<3>| ASSERT: ../../lib/handshake.c[_gnutls_recv_handshake]:1600
|<3>| ASSERT: ../../lib/handshake.c[handshake_client]:3075
|<13>| BUF[HSK]: Emptied buffer
*** Fatal error: The TLS connection was non-properly terminated.
-------------------------------------------------------------------------------
Could you kindly check whether my VPS is in a range of IP addresses
blacklisted by the service provider that hosts lavasoftware.org?
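For anyone who wants to compare the two VPSes without gnutls-cli, here is a minimal Python sketch that separates a failure at TCP connect from a failure during the TLS handshake. It uses only the standard `socket` and `ssl` modules; the function name and the stage labels are hypothetical, and it is a rough probe, not a replacement for the debug trace above.

```python
import socket
import ssl


def probe_tls(host, port=443, timeout=10.0):
    """Return (stage, detail), where stage is 'tcp', 'tls' or 'ok'."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError as exc:
        # Could not even open the TCP connection.
        return ("tcp", repr(exc))
    try:
        context = ssl.create_default_context()
        # wrap_socket performs the handshake before returning.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return ("ok", tls.version())
    except (ssl.SSLError, OSError) as exc:
        # TCP connected, but the handshake failed or was cut short.
        return ("tls", repr(exc))
    finally:
        sock.close()


# Usage:
# print(probe_tls("apt.lavasoftware.org"))
```

A "tls" result from the Paris VPS and an "ok" result from the Luxembourg one would match the symptom described above, where the connection is closed right after the client hello is sent.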
Thanks
j
Hi all,
I have been having issues doing boot and test solely over serial (the specific issues are not diagnosed, but presumably they have something to do with telnet), but that's not the discussion here.
I want to try booting the device with the serial method and then deploying/running tests over ssh, since it seems that ssh can't power the DUT on and off. If that is wrong, please let me know so I can try that instead.
I have two proposed ideas:
1) boot with serial -> deploy test (however works) -> run test with serial -> power off
2) boot with serial -> deploy test (however works) -> run test with ssh -> power off
I started with a multinode job (I still don't quite understand multinode) and this is my current job definition. It doesn't work, so could you please help point out the issues with it:
# multinode job for controller deployment
job_name: controller deploy test
protocols:
  lava-multinode:
    roles:
      host:
        context:
          lava_test_results_dir: /tmp/lava-%s
        device_type: controller
        timeout:
          minutes: 10
        count: 1
      guest:
        context:
          lava_test_results_dir: /tmp/lava-%s
        request: lava-start
        count: 3
        expect_role: host
        timeout:
          minutes: 10
        connection: ssh
        host_role: host
timeouts:
  job:
    minutes: 15
  action:
    minutes: 5
  connection:
    minutes: 2
priority: medium
visibility: public
actions:
- deploy:
    role:
    - host
    timeout:
      minutes: 10
    to: tftp
    authorize: ssh
    kernel:
      url: file:///kernel.img
      type: uimage
    ramdisk:
      url: file:///ramdisk.gz
      compression: gz
    dtb:
      url: file:///u-boot.dtb
- deploy:
    role:
    - guest
    timeout:
      minutes: 10
    to: ssh
    protocols:
      lava-multinode:
      - action: prepare-scp-overlay
        request: lava-wait
        messageID: ipv4
        message:
          ipaddr: $ipaddr
- boot:
    role:
    - host
    timeout:
      minutes: 5
    method: minimal
    prompts: ["# "]
    auto_login:
      login_prompt: "login: "
      username: root
- boot:
    role:
    - guest
    timeout:
      minutes: 5
    prompts: ["# $"]
    parameters:
      hostID: ipv4 # messageID
      host_key: ipaddr # message key
    method: ssh
    connection: ssh
- test:
    role:
    - host
    timeout:
      minutes: 15
    definitions:
    - repository:
        metadata:
          format: Lava-Test Test Definition 1.0
          name: install-ssh
          description: "install step"
          scope:
          - functional
        run:
          steps:
          # messageID matches, message_key as the key.
          - lava-send ipv4 ipaddr=$(lava-echo-ipv4 eth0)
          - lava-send lava_start
          - lava-sync clients
      from: inline
      name: ssh-inline
      path: inline/ssh-install.yaml
- test:
    role:
    - guest
    timeout:
      minutes: 5
    definitions:
    - repository: https://<redacted-token>@github.com/MichaelPed/lava…
      from: git
      path: smoke-tests/smoke.yaml
      name: smoke-tests
    # run the inline last as the host is waiting for this final sync.
    - repository:
        metadata:
          format: Lava-Test Test Definition 1.0
          name: client-ssh
          description: "client complete"
          scope:
          - functional
        run:
          steps:
          - df -h
          - free
          - lava-sync clients
      from: inline
      name: ssh-client
      path: inline/ssh-client.yaml
And this is my device dictionary; please let me know what needs changing:
{% extends 'controller.jinja2' %}
{% set power_off_command = 'python /power.py 12 off' %}
{% set soft_reboot_command = 'reboot' %}
{% set hard_reset_command = 'python /power.py 12 reboot' %}
{% set power_on_command = 'python /power.py 12 on' %}
{% set connection_list = ['uart0'] %}
{% set connection_commands = {'uart0': 'telnet 10.60.2.209 7001'} %}
{% set connection_tags = {'uart0': ['primary', 'telnet']} %}
{% set bootm_kernel_addr = '/kernel.img' %}
If you guys have any questions please ask away! Also, if something other than a multinode job would do what I want, please let me know!
Best regards,
Michael
Hello! I want to run QEMU on the DUT and attach gdb to that QEMU from another console to simulate something like a bit flip in memory. What should I do? I just need the new console to run some gdb scripts, like `rust-gdb vmlinux -ex 'target remote /tmp/gdb_socket' -ex 'set *0x40001000 ^= (1<<5)' -ex 'c'`, so the new console does not need to be interactive; it only has to stay alive long enough to run some commands.
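To spell out what the gdb expression does and what a non-interactive invocation could look like, here is a minimal Python sketch. The function names are hypothetical. Two things differ from the command above and are assumptions on my part: it adds gdb's `-batch` option so that gdb exits after running the commands (assuming rust-gdb passes it through), and it ends with `detach` instead of `c`, on the assumption that continuing would keep gdb attached and blocking.

```python
def flip_bit(value, bit):
    """Toggle one bit of an integer, as `x ^= (1 << bit)` does."""
    return value ^ (1 << bit)


def gdb_batch_command(binary, socket_path, address, bit, gdb="rust-gdb"):
    """Build an argument list that flips one bit over a remote gdb connection."""
    return [
        gdb, binary,
        "-batch",  # exit after the -ex commands instead of prompting
        "-ex", f"target remote {socket_path}",
        "-ex", f"set *{address:#x} ^= (1<<{bit})",
        "-ex", "detach",  # assumption: release the target before gdb exits
    ]


# Usage:
# import subprocess
# subprocess.run(gdb_batch_command("vmlinux", "/tmp/gdb_socket", 0x40001000, 5))
```

XOR with `1 << 5` toggles only bit 5, so applying it twice restores the original value.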
Thanks!
Hello everyone,
I would like to integrate and run a custom bare-metal test suite (pytest) on
LAVA. Currently I run it manually from the host machine (where pytest is
configured and the pytest suite executes); the commands are sent to the
DUT and the results are collected on the host machine.
I am familiar with the lava-interactive method, but it looks like it will not
suit this requirement (run pytest on the host and collect logs from the DUT).
Is LAVA suitable for this requirement? If someone could give me advice on
this, that would be a great help.
Regards
Nagendra S
Hi,
I would like to test whether the system can still boot normally after many restarts (10,000 times).
Because every boot requires the auto_login of the boot action, it seems that a loop needs to be implemented in the YAML, and this loop needs to include the boot action. Are there any relevant examples I can refer to?
There are roughly two types of test logic:
1:
while (a < 10000)
{
    - boot
        auto_login
    - test
        basic io test
        software reboot in target board
    a = a + 1
}
2:
while (a < 10000)
{
    - boot
        auto_login
    - test
        basic io test
    a = a + 1
    hardware reboot in worker
}
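In case it helps the discussion, the loop could also be unrolled before submission instead of being expressed in the YAML itself. The Python sketch below generates a flat boot, test, boot, test, ... action list. The function name and the BOOT and TEST bodies are hypothetical placeholders, the output is printed as JSON only to stay within the standard library (it would need converting to a real job definition), and I have not checked whether a single job with thousands of actions is practical.

```python
import json


def reboot_cycle_actions(cycles, boot, test):
    """Return a flat action list: boot, test, boot, test, ... (one pair per cycle)."""
    actions = []
    for _ in range(cycles):
        # Shallow-copy the templates so top-level keys can be edited per cycle.
        actions.append({"boot": dict(boot)})
        actions.append({"test": dict(test)})
    return actions


# Placeholder action bodies; a real job needs device-specific fields.
BOOT = {
    "method": "minimal",
    "prompts": ["# "],
    "auto_login": {"login_prompt": "login: ", "username": "root"},
}
TEST = {"definitions": "<basic io test, then reboot>"}  # hypothetical placeholder

if __name__ == "__main__":
    # Three cycles for illustration; the real count would be much higher.
    print(json.dumps({"actions": reboot_cycle_actions(3, BOOT, TEST)}, indent=2))
```

Splitting the 10,000 cycles across several smaller generated jobs would be one way to keep each job definition manageable.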
Thanks
Hi,
I am trying to use transfer_overlay because the filesystem on my DUT is read-only, although the /data/ and /tmp/ directories are writable to some extent.
I have a working method using wget: it works when I run it manually to download the overlay tarball over HTTP, both on my DUT and on the worker.
However, the LAVA test itself always returns the error 'Network Unreachable'.
What are the possible reasons for this?
Best regards,
Michael