On 8 February 2018 at 08:26, Liao, Guoqi <guoqi.liao@hxt-semitech.com> wrote:

Hi Guys:

 

When I am doing multi-node testing, I create a job definition like the one below. For example: sub-job 1 has finished booting and testing, but sub-job 2 is still booting. Sub-job 1 then removes the file <lava_dispatcher>/tmp/overlay****, so sub-job 2 can NOT download the overlay**** file and fails in the end. My question is how to synchronise between the nodes in the job


Synchronisation is done using the MultiNode API - the test shell simply calls lava-sync but for that to work, there needs to be a functional test shell in the first place.
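
As a rough sketch of what that looks like, assuming a Lava-Test test definition that runs on every role, the run steps could call lava-sync before the real work starts. The definition name and the message ID "clients-ready" are made up for illustration and do not come from the job above:

```yaml
metadata:
  format: Lava-Test Test Definition 1.0
  name: multinode-sync-example      # hypothetical name
  description: "Wait for every role before starting the test proper"
run:
  steps:
    # lava-sync blocks until every device in the MultiNode group
    # has sent the same message ID.
    - lava-sync clients-ready       # hypothetical message ID
    - lava-test-case sync-done --result pass
```

This only helps once each role has reached a working test shell, so it cannot fix a failure that happens while the overlay is still being transferred.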

However, this is a different problem, to do with multiple usage of transfer_overlay.

What version of LAVA are you running? We have fixes for this in the upcoming release.

https://projects.linaro.org/browse/LAVA-1202

However, I don't think we've explicitly tested with MultiNode using transfer_overlay.
 

Definition?

 

My job definition:

 

protocols:
  lava-multinode:
    roles:
      foo:


This looks like a typical client:server MultiNode test job - it really does help if you describe the roles that way rather than using slang.
 

        tags:
          - board1
        device_type: **********


I'm assuming an internal device-type but it is worth exploring whether the device integration for this type can support adding the overlay to the rootfs in advance.

Transfer_overlay is not a solution to using the same rootfs for multiple test jobs - there are still issues of persistence which will affect the utilities executed by the test shell. It would be much better to deploy a fresh rootfs each time and then let LAVA add the overlay to that rootfs, avoiding the need for transfer_overlay support. The rootfs can have whatever dependencies are required by the base system pre-installed but a fresh rootfs each time means that the configuration is always the same at the start of each test job.

Keep things simple and only change one element at a time. Not deploying the rootfs each time means that the rootfs *can* change arbitrarily between test jobs, so you are not only changing the kernel in each test job, you are also inheriting unknown changes in the rootfs from the previous test job. If the rootfs is exactly the same tarball in every test job, your results are reproducible because only the kernel changes between test jobs. The small amount of time required to deploy a clean rootfs for each test job is tiny in comparison to the engineering time lost trying to debug issues caused by a persistent rootfs.
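
A minimal sketch of such a deploy action follows, assuming the device type can boot from an NFS root filesystem, which has not been confirmed for this internal device type. The rootfs URL is hypothetical and the kernel URL stays elided as in the original job. The boot actions would need a matching change and would no longer use transfer_overlay, because LAVA adds the overlay to the rootfs it deploys:

```yaml
- deploy:
    role:
    - foo
    - bar
    to: tftp
    kernel:
      url: http://********        # elided in the original job
      type: zimage
    # Assumption: the device integration supports an NFS rootfs.
    nfsrootfs:
      url: http://example.com/centos-rootfs.tar.xz   # hypothetical URL
      compression: xz
    os: centos
    timeout:
      minutes: 80
```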
 

        context:
          grub_method: centos
          grub_installed_device: (hd1,gpt1)
        count: 1
      bar:
        tags:
          - board2
        device_type: **********
        context:
          grub_method: centos
          grub_installed_device: (hd2,gpt1)
        count: 1
    timeout:
      minutes: 6

 

job_name: centos openjdk test
timeouts:
  job:
    minutes: 1500
  action:
    minutes: 50
  connection:
    minutes: 30
priority: medium
visibility: public

 

actions:
- deploy:
    role:
    - foo
    - bar
    kernel:
      url: http://********
      type: zimage
    os: centos
    timeout:
      minutes: 80
    to: tftp

- boot:
    timeout:
      minutes: 40
    role:
    - bar
    method: grub
    commands: centos_installed
    auto_login:
      login_prompt: 'login:'
      username: root
      password_prompt: 'Password:'
      password: root
    prompts:
    - 'root@localhost ~'
    transfer_overlay:
      download_command: rm -f /root/overlay* ; ifconfig ; wget -S --progress=dot:giga
      unpack_command: tar -C / -xaf
    parameters:
      shutdown-message: "reboot: Restarting system"

- boot:
    timeout:
      minutes: 40
    role:
    - foo
    method: grub
    commands: centos_installed
    auto_login:
      login_prompt: 'login:'
      username: root
      password_prompt: 'Password:'
      password: root
    prompts:
    - 'root@localhost ~'
    transfer_overlay:
      download_command: rm -f /root/overlay* ; ifconfig ; wget -S --progress=dot:giga
      unpack_command: tar -C / -xaf
    parameters:
      shutdown-message: "reboot: Restarting system"

- test:
    role:
    - foo
    - bar
    timeout:
      minutes: 50
    definitions:
    - repository: ssh://**********/test-definitions
      from: git
      branch: **********
      path: automated/linux/openjdk/openjdk-smoke.yaml


Any configuration, package installation or setup done by that test definition will persist into the next test job, and that is known to cause reliability issues, difficulty triaging failed results and other complications.

 

      name: openjdk-smoke

 

 

Thanks

B.R.

Guoqi

 








_______________________________________________
Lava-users mailing list
Lava-users@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/lava-users



