Hi.
I am using Lava-server and Lava-dispatcher from the docker image, version 2023.08. I have an issue when submitting a multinode job: sometimes Lava-server runs the jobs with abnormal job IDs such as 230 and 230.1, and sometimes it runs them with normal job IDs such as 230.0 and 230.1.
When I submit the definition, Lava-server sometimes shows the message 'Invalid job definition: expected a dictionary' on the web page. Is this caused by a syntax error in my definition? When I submit through the Lava server web UI or lavacli, there is no warning or error that indicates a syntax error in the definition.
The job definition YAML looks like this:
```
job_name: multinode test job

timeouts:
  job:
    minutes: 60
  action:
    minutes: 60
  connection:
    minutes: 60
priority: medium
visibility: public

protocols:
  lava-multinode:
    roles:
      target:
        count: 1
        device_type: customdevice
      host:
        count: 1
        device_type: docker

actions:
- deploy:
    role:
    - target
    to: flasher
    images:
      fw:
        url: http://example.com/repository/customdevice/test/test.bin

- boot:
    role:
    - target
    method: minimal
    prompts:
    - 'root:'

- test:
    interactive:
    - name: send_target_ready
      prompts:
      - 'root:'
      script:
      - command: ls
      - lava-send: booted
      - lava-wait: done
    role:
    - target

- deploy:
    role:
    - host
    to: docker
    os: debian
    image: testimage:2023.08

- boot:
    role:
    - host
    method: docker
    command: /bin/bash -c 'service ssh start; bash'
    prompts:
    - 'root@lava:'

- test:
    interactive:
    - name: wait_target_ready
      prompts:
      - 'root@lava:'
      script:
      - command: ls
      - lava-wait: booted
    role:
    - host

- test:
    role:
    - host
    definitions:
    - repository:
        metadata:
          format: Lava-Test Test Definition 1.0
          name: run-command
          description: "Run command"
          os:
          - ubuntu
          scope:
          - functional
        run:
          steps:
          - lava-test-case pwd-command --shell 'pwd'
          - lava-send done
      from: inline
      name: run-command
      path: inline/run-command.yaml
```
On Thu, Aug 17, 2023 at 6:14 AM 서지 Lily 김 seoji.kim@fadutec.com wrote:
> Hi.
> I am using Lava-server and Lava-dispatcher from the docker image, version 2023.08. I have an issue when submitting a multinode job: sometimes Lava-server runs the jobs with abnormal job IDs such as 230 and 230.1, and sometimes it runs them with normal job IDs such as 230.0 and 230.1.
This is not abnormal. 230.0 and 230.1 are aliases for the actual job IDs. For example, https://validation.linaro.org/scheduler/job/3950113 is the same job as https://validation.linaro.org/scheduler/job/3950112.1
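To illustrate the aliasing with the two URLs above, here is a minimal Python sketch. It assumes that the sub-jobs of a multinode group receive consecutive job IDs starting at the group's first job, which matches the example (3950112.1 and 3950113) but is not guaranteed in general. The function name `resolve_sub_id` is hypothetical and is not part of LAVA.

```python
def resolve_sub_id(sub_id: str) -> int:
    """Map a sub ID such as '3950112.1' to a plain job ID.

    Assumption (not taken from LAVA itself): sub-jobs are numbered
    consecutively from the first job ID of the multinode group.
    """
    base, _, index = sub_id.partition(".")
    # A plain ID without a '.N' suffix is returned unchanged.
    return int(base) + (int(index) if index else 0)


print(resolve_sub_id("3950112.1"))
print(resolve_sub_id("230.0"))
```

Under that assumption, 230.0 and 230 name the same job, and 230.1 names the job that follows it.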
> When I submit the definition, Lava-server sometimes shows the message 'Invalid job definition: expected a dictionary' on the web page. Is this caused by a syntax error in my definition? When I submit through the Lava server web UI or lavacli, there is no warning or error that indicates a syntax error in the definition.
> The job definition YAML looks like this.
This definition looks valid.
Best Regards, Milosz
Lava-users mailing list -- lava-users@lists.lavasoftware.org
To unsubscribe send an email to lava-users-leave@lists.lavasoftware.org
Sorry, I meant that the sub IDs of the jobs in the multinode group look abnormal.
The first problem is that I can't access a job's web page using its sub ID, e.g. 'https://my.lava.server/scheduler/job/287.0'. When I open the URL with the job's sub ID, it shows '404 Not found'. I can only access the job with the plain job ID, 'https://validation.linaro.org/scheduler/job/287'.
The second problem is that when this message occurs, I can't resubmit the job because the resubmit text area is empty.
I found the same situation described on this issue page: https://gitlab.com/lava/lava/-/issues/583 According to the issue, the root cause is: "When calling the .save() function in django, every field of the current object instance are saved into the database. So if the same object is used by two processes (gunicorn and lava-scheduler here), the change in one process can override the changes from another process." The issue page also says the fix is a patch that makes save() write only the updated fields for worker, device, and job.
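To make sure I understand the quoted root cause, here is a small toy simulation in plain Python. It is not Django or LAVA code: `FakeTable`, `Instance`, and the field names `state` and `definition` are all hypothetical, and the `update_fields` argument only imitates the idea behind Django's `save(update_fields=...)`. Two "processes" each hold their own copy of the same row, and the last full save writes stale values back.

```python
class FakeTable:
    """A single-row 'table' standing in for the database."""

    def __init__(self, **fields):
        self.row = dict(fields)


class Instance:
    """An in-memory copy of the row, as each process would hold one."""

    def __init__(self, table):
        self.table = table
        self.fields = dict(table.row)  # snapshot taken at load time

    def save(self, update_fields=None):
        # A full save writes every field of the (possibly stale) snapshot;
        # a partial save writes only the named fields.
        names = list(self.fields) if update_fields is None else update_fields
        for name in names:
            self.table.row[name] = self.fields[name]


def run(partial):
    table = FakeTable(state="Submitted", definition="old")
    proc_a = Instance(table)  # e.g. the web process
    proc_b = Instance(table)  # e.g. the scheduler process

    # Process A changes the definition and saves.
    proc_a.fields["definition"] = "new"
    proc_a.save(update_fields=["definition"] if partial else None)

    # Process B, still holding the old definition, changes the state and saves.
    proc_b.fields["state"] = "Running"
    proc_b.save(update_fields=["state"] if partial else None)
    return table.row


print(run(partial=False))  # B's full save restores the stale definition
print(run(partial=True))   # both changes survive
```

With full saves the change made by process A is lost, while with partial saves both changes are kept. That is how I read the patch described in the issue.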
Is it okay to apply this patch for my testing? And will the patch be included in the next release?
Thank you.