Hello everyone,
I would like to open a discussion thread to understand LAVA test framework support for some use cases where I am facing issues.
While testing a reboot scenario in CIP (https://gitlab.com/cip-project/cip-core/isar-cip-core) in which the reboot is triggered by a watchdog, LAVA is unable to complete the reboot successfully. The job definition is as follows:
device_type: qemu
job_name: qemu x86_64 software update testing
timeouts:
  job:
    minutes: 20
  action:
    minutes: 10
  actions:
    power-off:
      seconds: 60
priority: high
visibility: public
notify:
  criteria:
    status: finished
  recipients:
  - to:
      method: email
      email: sai.sathujoda@toshiba-tsip.com
context:
  arch: x86_64
  lava_test_dir: '/home/lava-%s'

# ACTION BLOCK
actions:
- deploy:
    timeout:
      minutes: 15
    to: tmpfs
    images:
      system:
        image_arg: '-drive file={system},discard=unmap,if=none,id=disk,format=raw -m 1G -serial mon:stdio -cpu qemu64 -smp 4 -machine q35,accel=tcg -global ICH9-LPC.noreboot=off -device ide-hd,drive=disk -nographic'
        url: ######.wic.xz
        compression: xz
      firmware:
        image_arg: '-drive if=pflash,format=raw,unit=0,readonly=on,file={firmware}'
        url: ######

# BOOT BLOCK
- boot:
    timeout:
      minutes: 5
    method: qemu
    media: tmpfs
    prompts: ["root@demo:~#"]
    auto_login:
      login_prompt: "demo login:"
      username: "root"
      password_prompt: "Password:"
      password: "root"

# TEST_BLOCK
- test:
    timeout:
      minutes: 5
    definitions:
    - repository:
        metadata:
          format: Lava-Test Test Definition 1.0
          name: sample-test
          description: "check reboot version"
        run:
          steps:
          - lava-test-case uname --shell uname -a
          - cd /home
          - wget --no-check-certificate ####
          - lsblk
          - swupdate -i cip-core-*
          - reboot
      from: inline
      name: sample-test-1
      path: inline/sample-test.yaml

- boot:
    timeout:
      minutes: 5
    method: qemu
    media: tmpfs
    prompts: ["root@demo:"]
    auto_login:
      login_prompt: "demo login:"
      username: "root"
      password_prompt: "Password:"
      password: "root"

- test:
    timeout:
      minutes: 5
    definitions:
    - repository:
        metadata:
          format: Lava-Test Test Definition 1.0
          name: sample-test
          description: "check partition switch"
        run:
          steps:
          - lsblk
      from: inline
      name: sample-test-2
      path: inline/sample-test.yaml

context:
  arch: x86_64
  lava_test_results_dir: '/home/lava-%s'
After the reboot issued in the test action, the watchdog triggers a second reboot because of a failed case. That watchdog-triggered reboot fails with a timeout error at the login stage, which suggests that the last boot action in the job definition above never matched the configured login prompts.
I have already heard from the LAVA users community that LAVA does not support the board being rebooted outside its control (whether by a watchdog or by a package). However, CIP uses LAVA extensively as its test framework to run many kinds of regression tests on CIP-supported hardware, and testing the watchdog is an important use case in CIP. Since LAVA is meant to be a test framework that helps to test many types of hardware, we as CIP project members would like to understand the LAVA community's future plans for supporting this use case.
Thanks and Regards, Sai Ashrith
Dear all,
A possible way would be to use the minimal boot action together with a soft_reboot command that triggers a kernel panic. I tested it with the following configuration.
device_type: qemu
job_name: qemu x86_64 software update testing
timeouts:
  job:
    minutes: 20
  action:
    minutes: 10
  actions:
    power-off:
      seconds: 60
priority: high
visibility: public
context:
  arch: x86_64
  lava_test_results_dir: /home/lava-%s

# ACTION BLOCK
actions:
- deploy:
    timeout:
      minutes: 15
    to: tmpfs
    images:
      system:
        image_arg: '-drive file={system},discard=unmap,if=none,id=disk,format=raw -m 1G -serial mon:stdio -cpu qemu64 -smp 4 -machine q35,accel=tcg -global ICH9-LPC.noreboot=off -device ide-hd,drive=disk -nographic'
        url: ######
        compression: xz
      firmware:
        image_arg: '-drive if=pflash,format=raw,unit=0,readonly=on,file={firmware}'
        url: #####

# BOOT BLOCK
- boot:
    timeout:
      minutes: 5
    method: qemu
    media: tmpfs
    prompts: ["root@demo:~#"]
    auto_login:
      login_prompt: "demo login:"
      username: "root"
      password_prompt: "Password:"
      password: "root"

# TEST_BLOCK
- test:
    timeout:
      minutes: 5
    definitions:
    - repository:
        metadata:
          format: Lava-Test Test Definition 1.0
          name: sample-test
          description: "check reboot version"
        run:
          steps:
          - lava-test-case uname --shell uname -a
          - cd /home
          - lsblk
      from: inline
      name: sample-test-1
      path: inline/sample-test.yaml

- boot:
    timeout:
      minutes: 10
    method: minimal
    soft_reboot: echo c > /proc/sysrq-trigger
    prompts: ["root@demo:~#"]
    auto_login:
      login_prompt: "demo login:"
      username: "root"
      password_prompt: "Password:"
      password: "root"
    parameters:
      shutdown-message: "end Kernel panic - not syncing: sysrq triggered crash "

- test:
    timeout:
      minutes: 5
    definitions:
    - repository:
        metadata:
          format: Lava-Test Test Definition 1.0
          name: sample-test
          description: "check partition switch"
        run:
          steps:
          - lsblk
      from: inline
      name: sample-test-2
      path: inline/sample-test.yaml
On my test system it takes about 1 second from the last command of the test until the panic occurs.
This triggers a reboot with `sysrq` and waits until the kernel panic has finished. Afterwards it is a normal boot.
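
To make the expected order of console output easier to see, here is a small Python sketch. It is only an illustration and not how LAVA itself is implemented: the pattern strings are taken from the job definition above, while the function name `match_in_order` and the sample console log are made up for this example.

```python
import re

# Patterns copied from the job definition above. The matching logic is an
# illustration only, not LAVA's implementation.
EXPECTED_SEQUENCE = [
    ("shutdown-message", r"end Kernel panic - not syncing: sysrq triggered crash"),
    ("login_prompt", r"demo login:"),
    ("password_prompt", r"Password:"),
    ("prompt", r"root@demo:~#"),
]


def match_in_order(console_lines, expected=EXPECTED_SEQUENCE):
    """Return the names of the expected patterns that were seen, in order.

    A pattern is only searched for once the previous one has matched, which
    mirrors "wait for the panic, then treat it as a normal boot".
    """
    seen = []
    remaining = list(expected)
    for line in console_lines:
        if not remaining:
            break
        name, pattern = remaining[0]
        if re.search(pattern, line):
            seen.append(name)
            remaining.pop(0)
    return seen


# Hypothetical console output, invented for this example.
sample_log = [
    "root@demo:~# echo c > /proc/sysrq-trigger",
    "sysrq: Trigger a crash",
    "---[ end Kernel panic - not syncing: sysrq triggered crash ]---",
    "demo login: root",
    "Password:",
    "root@demo:~#",
]

print(match_in_order(sample_log))
```

Note that the shell prompt on the first line of the sample log is not counted as the final prompt, because the sketch waits for the panic message first. If the panic message never appears, nothing after it is matched, which would show up as a timeout in a real job.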
Best regards, Quirin