On 8 February 2018 at 09:10, Liao, Guoqi <guoqi.liao@hxt-semitech.com> wrote:

Hi Williams:

 

Thanks for your support; my feedback is inline below.

 

B.R.

Guoqi

 

From: Neil Williams [mailto:neil.williams@linaro.org]
Sent: 8 February 2018 16:52
To: Liao, Guoqi <guoqi.liao@hxt-semitech.com>
Cc: lava-users@lists.linaro.org
Subject: Re: [Lava-users] how to achieve sync among multi-node in job definition

 

On 8 February 2018 at 08:26, Liao, Guoqi <guoqi.liao@hxt-semitech.com> wrote:

Hi Guys:

 

When I run a multi-node test, I create a single job definition like the one below. For example: sub-job 1 has finished booting and testing, but sub-job 2 is still booting. Sub-job 1 then removes the temporary file, e.g. <lava_dispatcher>/tmp/overlay****, so sub-job 2 can NOT download the overlay**** file and fails in the end. My question is how to synchronise the nodes in the job

 

Synchronisation is done using the MultiNode API - the test shell simply calls lava-sync but for that to work, there needs to be a functional test shell in the first place.
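
As a minimal sketch of the lava-sync call mentioned above (not taken from this thread): a Lava-Test-Shell definition run by every role could block until all nodes reach the same point. The definition name, the description and the message ID "boot-done" are hypothetical, chosen only for illustration.

```yaml
# Hypothetical Lava-Test-Shell definition; it would be run by a test action
# assigned to every role in the MultiNode group.
metadata:
  format: Lava-Test Test Definition 1.0
  name: sync-example            # hypothetical name
  description: "Wait until every node in the MultiNode group reaches this point"
run:
  steps:
    # lava-sync sends the message ID and waits until all other nodes
    # in the group have sent the same message ID.
    - lava-sync boot-done       # "boot-done" is an arbitrary, illustrative message ID
```

This only helps once each node already has a working test shell, so on its own it would not stop the overlay being removed while another node is still booting.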

 

However, this is a different problem, to do with multiple usage of transfer_overlay.

 

What version of LAVA are you running? We have fixes for this in the upcoming release.

 

[Guoqi] our version of LAVA is 2017.12-1+stretch

  

 

However, I don't think we've explicitly tested with MultiNode using transfer_overlay.

 

definition?

 

My job definition:

 

protocols:

  lava-multinode:

    roles:

      foo:

 

This looks like a typical client:server MultiNode test job - it really does help if you describe the roles that way rather than using slang.

 

[Guoqi] In fact, in this case I just want to start up 2 boards to do the same testing; the 2 boards have the same role. Each board takes a different amount of time to boot, which causes the faster board to delete the shared file on the server.


The work for LAVA-1202 is to ensure that the overlay tarball isn't deleted, instead it is copied to where it is needed and only removed when the test job finishes.

It is a very small change - you may be able to apply it directly: https://review.linaro.org/#/c/23674/
 

 

        tags:

          - board1

        device_type: **********

 

I'm assuming an internal device-type but it is worth exploring whether the device integration for this type can support adding the overlay to the rootfs in advance.

 

Transfer_overlay is not a solution to using the same rootfs for multiple test jobs - there are still issues of persistence which will affect the utilities executed by the test shell. It would be much better to deploy a fresh rootfs each time and then let LAVA add the overlay to that rootfs, avoiding the need for transfer_overlay support. The rootfs can have whatever dependencies are required by the base system pre-installed but a fresh rootfs each time means that the configuration is always the same at the start of each test job.

[Guoqi] In my environment, the system is deployed on the hard disk, so we need to transfer the overlay to the devices every time.


It would be possible for LAVA to deploy the system to the hard disk fresh on each test job.
 

 

Keep things simple and only change one element at a time. Not deploying the rootfs each time means that the rootfs *can* change arbitrarily between test jobs. So not deploying the rootfs each time means that you are not only changing the kernel each test job, you are also inheriting unknown changes in the rootfs from the previous test job. The rootfs can be exactly the same tarball every time in every test job but that then means that all your results are reproducible - only the kernel is being changed in each test job. The small amount of time required to deploy a clean rootfs for each test job is tiny in comparison to the engineering time lost by trying to debug issues caused by a persistent rootfs.

 

        context:

          grub_method: centos

          grub_installed_device: (hd1,gpt1)

        count: 1

      bar:

        tags:

          - board2

        device_type: **********

        context:

          grub_method: centos

          grub_installed_device: (hd2,gpt1)

        count: 1

    timeout:

      minutes: 6

 

job_name: centos openjdk test

timeouts:

  job:

    minutes: 1500

  action:

    minutes: 50

  connection:

    minutes: 30

priority: medium

visibility: public

 

actions:

- deploy:

    role:

    - foo

    - bar

    kernel:

      url: http://********

      type: zimage

    os: centos

    timeout:

      minutes: 80

    to: tftp

 

- boot:

    timeout:

      minutes: 40

    role:

    - bar

    method: grub

    commands: centos_installed

    auto_login:

      login_prompt: 'login:'

      username: root

      password_prompt: 'Password:'

      password: root

    prompts:

    - 'root@localhost ~'

    transfer_overlay:

      download_command: rm -f /root/overlay* ; ifconfig ; wget -S --progress=dot:giga

      unpack_command: tar -C / -xaf

    parameters:

      shutdown-message: "reboot: Restarting system"

 

- boot:

    timeout:

      minutes: 40

    role:

    - foo

    method: grub

    commands: centos_installed

    auto_login:

      login_prompt: 'login:'

      username: root

      password_prompt: 'Password:'

      password: root

    prompts:

    - 'root@localhost ~'

    transfer_overlay:

      download_command: rm -f /root/overlay* ; ifconfig ; wget -S --progress=dot:giga

      unpack_command: tar -C / -xaf

    parameters:

      shutdown-message: "reboot: Restarting system"

 

 

- test:

    role:

    - foo

    - bar

    timeout:

      minutes: 50

    definitions:

    - repository: ssh://**********/test-definitions

      from: git

      branch: **********

      path: automated/linux/openjdk/openjdk-smoke.yaml

 

Any configuration, package installation or setup done by that test definition will be persistent into the next test job and that is known to cause reliability issues, difficulty in triaging of failed results and other complications.

 

 

      name: openjdk-smoke

 

 

Thanks

B.R.

Guoqi

 



This email is intended only for the named addressee. It may contain information that is confidential/private, legally privileged, or copyright-protected, and you should handle it accordingly. If you are not the intended recipient, you do not have legal rights to retain, copy, or distribute this email or its contents, and should promptly delete the email and all electronic copies in your system; do not retain copies in any media. If you have received this email in error, please notify the sender promptly. Thank you.

 





