Re: [Lava-users] Android iperf test hung.

5 Jul 2021


Hi Remi, I have only one dispatcher, and both devices are linked to it.
The issue persists after changing the configuration to the following:
{
    "port": 3079,
    "blocksize": 4096,
    "poll_delay": 3,
    "coordinator_hostname": "10.191.253.109"
}
On Thu, Jul 1, 2021 at 4:55 PM Remi Duraffort remi.duraffort@linaro.org
wrote:
...
Hello,
By default, lava-coordinator is expected to run on localhost, so the
second device will connect to localhost instead of the machine that
actually hosts the coordinator. Change the host
in /etc/lava-coordinator/lava-coordinator.conf.
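As a quick way to check this, here is a minimal sketch (not part of LAVA) that reads the configuration text and reports whether coordinator_hostname still points at the loopback interface. The function name coordinator_host_is_local is hypothetical, and treating a missing key as localhost is an assumption based on the default described above.

```python
import ipaddress
import json


def coordinator_host_is_local(conf_text):
    """Return True if coordinator_hostname points at the loopback interface."""
    conf = json.loads(conf_text)
    # Assumption: a missing key is treated like the localhost default.
    host = conf.get("coordinator_hostname", "localhost")
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal: some other hostname, assumed non-local here.
        return False


if __name__ == "__main__":
    path = "/etc/lava-coordinator/lava-coordinator.conf"
    with open(path) as handle:
        print(coordinator_host_is_local(handle.read()))
```

If this reports True on a worker, clients on that worker would look for the coordinator on their own loopback interface.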
On Tue, Jun 29, 2021 at 07:19, Hedy Lamarr lamarrhedy97@gmail.com
wrote:
...
The "ssh" device will ssh to another machine, which is not on the same
dispatcher, to start an iperf server.
The "dragonboard-410c" device will start a docker container, and inside this
container it will call the iperf client to connect to the iperf server.

1. In the LAVA admin page, I linked both the "ssh" device and the
"dragonboard-410c" device to the same worker.
2. However, the command in ssh (iperf server) will run on another machine,
while the command in the docker container (iperf client) runs on the same
machine as the worker, I think.
I'm not sure whether you mean 1 or 2?
On Mon, Jun 28, 2021 at 10:34 PM Remi Duraffort <
remi.duraffort@linaro.org> wrote:
...
Hello,
On Mon, Jun 28, 2021 at 08:26, Hedy Lamarr lamarrhedy97@gmail.com
wrote:
...
Hello,
What additional information do I need to provide to debug this issue?
Thanks,
Hedy Lamarr
On Thu, Jun 17, 2021 at 4:34 PM Hedy Lamarr lamarrhedy97@gmail.com
wrote:
...
Yes. To make it clear, I just restarted the LAVA server; here is the
full log from when that multinode job ran:
2021-06-17 09:16:07,428    INFO [INIT] LAVA coordinator has started.
2021-06-17 09:16:07,757    INFO [INIT] Version 2021.03
2021-06-17 09:16:07,757    INFO [INIT] Loading configuration from
/etc/lava-coordinator/lava-coordinator.conf
2021-06-17 09:16:08,076    INFO [BTSP] binding to 0.0.0.0:3079
2021-06-17 09:16:08,076    INFO Ready to accept new connections
2021-06-17 09:17:23,603    INFO The
decbbfe5-f3be-4e6c-a2b8-5744eabfe8a7 group will contain 2 nodes.
2021-06-17 09:17:23,603    INFO Waiting for 1 more clients to connect
to decbbfe5-f3be-4e6c-a2b8-5744eabfe8a7 group
2021-06-17 09:17:23,603    INFO Ready to accept new connections
2021-06-17 09:17:23,790    INFO Group complete, starting tests
2021-06-17 09:17:23,790    INFO Ready to accept new connections
2021-06-17 09:17:26,613    INFO Group complete, starting tests
2021-06-17 09:17:26,613    INFO Ready to accept new connections
2021-06-17 09:18:03,522   DEBUG clear Group Data: 1 of 2
2021-06-17 09:18:03,522    INFO Ready to accept new connections
2021-06-17 09:18:06,001   DEBUG clear Group Data: 2 of 2
2021-06-17 09:18:06,001   DEBUG Clearing group data for
decbbfe5-f3be-4e6c-a2b8-5744eabfe8a7
2021-06-17 09:18:06,001    INFO Ready to accept new connections
2021-06-17 09:24:43,620    INFO The
8956d8e7-1097-43e0-95dd-7afc61b2908b group will contain 2 nodes.
2021-06-17 09:24:43,620    INFO Waiting for 1 more clients to connect
to 8956d8e7-1097-43e0-95dd-7afc61b2908b group
2021-06-17 09:24:43,620    INFO Ready to accept new connections
2021-06-17 09:24:43,871    INFO Group complete, starting tests
2021-06-17 09:24:43,871    INFO Ready to accept new connections
2021-06-17 09:24:46,634    INFO Group complete, starting tests
2021-06-17 09:24:46,634    INFO Ready to accept new connections
2021-06-17 09:25:45,746    INFO lava_send: {'port': 3079, 'blocksize':
4096, 'poll_delay': 3, 'host': '10.191.253.109', 'hostname': 'lavaslave1',
'client_name': '3077', 'group_name':
'8956d8e7-1097-43e0-95dd-7afc61b2908b', 'role': 'host', 'request':
'lava_send', 'messageID': 'server_ready', 'message': {}}
2021-06-17 09:25:45,747    INFO lavaSend handler in Coordinator
received a messageID 'server_ready' for group
'8956d8e7-1097-43e0-95dd-7afc61b2908b' from 3077
2021-06-17 09:25:45,747   DEBUG message ID server_ready {"3077": {}}
for 3077
2021-06-17 09:25:45,747   DEBUG broadcast ID server_ready {"3077": {}}
for 3076
2021-06-17 09:25:45,747   DEBUG broadcast ID server_ready {"3077": {}}
for 3077
2021-06-17 09:25:45,747    INFO Ready to accept new connections
This log is similar to the log I saw on the web: the "SSH device" with
lava_send looks OK, but the "dragonboard device for android test" with
lava_wait does not; it just hangs. From the above log, it looks like the
coordinator did not receive anything?
From what I see in the logs, lava-coordinator is not receiving any signal
from the second test.
Are both devices on the same dispatcher/worker?
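One way to see which clients the coordinator heard from is to scan the log for lava_send entries. The sketch below is a rough illustration that assumes the entry layout shown in the log above (a Python-style dict after "lava_send:"); the function name senders_by_message is hypothetical, and other request types may be logged differently.

```python
import re

# Matches one lava_send entry and captures its client_name and messageID.
SEND_ENTRY = re.compile(
    r"lava_send: \{.*?'client_name': '([^']+)'.*?'messageID': '([^']+)'"
)


def senders_by_message(log_text):
    """Map each messageID to the set of client names that sent it."""
    # Collapse whitespace so entries wrapped across lines still match.
    flat = " ".join(log_text.split())
    result = {}
    for client, message_id in SEND_ENTRY.findall(flat):
        result.setdefault(message_id, set()).add(client)
    return result


if __name__ == "__main__":
    with open("/var/log/lava-coordinator.log") as handle:
        print(senders_by_message(handle.read()))
```

A client that never appears in the result did not get a lava_send through to this coordinator instance.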
...
On Thu, Jun 17, 2021 at 4:02 PM Remi Duraffort <
remi.duraffort@linaro.org> wrote:
...
On Thu, Jun 17, 2021 at 09:11, Hedy Lamarr lamarrhedy97@gmail.com
wrote:
> The output is:
>
> service lava-coordinator status
> ● lava-coordinator.service - LAVA coordinator
>    Loaded: loaded (/lib/systemd/system/lava-coordinator.service;
> enabled; vendor preset: enabled)
>    Active: active (running) since Fri 2021-06-04 18:09:19 CET; 1
> weeks 5 days ago
>  Main PID: 629 (lava-coordinato)
>     Tasks: 1 (limit: 4915)
>    Memory: 7.4M
>    CGroup: /system.slice/lava-coordinator.service
>            └─629 /usr/bin/python3 /usr/bin/lava-coordinator
> --loglevel DEBUG
>
So it's working.
Is it listening on 10.191.253.109:3079?
Do you have anything in the lava-coordinator logs
(/var/log/lava-coordinator.log)?
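To check reachability from the machine (or container) where the waiting test actually runs, a minimal sketch along these lines could help. It only uses the Python standard library; the helper name can_connect and the 3 second timeout are assumptions for illustration.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Address and port taken from the job log in this thread.
    print(can_connect("10.191.253.109", 3079))
```

Running it both on the worker and inside the docker container used by the test would show whether only one of the two environments can reach the coordinator.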
> On Thu, Jun 17, 2021 at 3:05 PM Remi Duraffort <
> remi.duraffort@linaro.org> wrote:
>
>>
>>
>> On Thu, Jun 17, 2021 at 09:02, Hedy Lamarr lamarrhedy97@gmail.com
>> wrote:
>>
>>> Hello Remi,
>>>
>>> I think lava-coordinator is running.
>>>
>>> I say this because there are 2 devices here:
>>> Device1: dragonboard-410c. When it runs lava-wait server_ready, it hangs
>>> with the above log.
>>> Device2: ssh. When it runs lava-send server_ready, it shows: Connecting to
>>> LAVA Coordinator on 10.191.253.109:3079 timeout=300 seconds.
>>>
>>> Would it be possible that lava-coordinator just works for ssh, but
>>> not for dragonboard-410c?
>>> Also, I checked netstat, and it shows:
>>> tcp        0      0 0.0.0.0:3079            0.0.0.0:*
>>>   LISTEN      629/python3          off (0.00/0/0)
>>> Does this mean the coordinator is running? If not, how can I make sure
>>> the coordinator is running?
>>>
>>
>> service lava-coordinator status
>>
>>
>>>
>>> Thanks,
>>> Hedy Lamarr
>>>
>>> On Thu, Jun 17, 2021 at 2:33 PM Remi Duraffort <
>>> remi.duraffort@linaro.org> wrote:
>>>
>>>> Hello,
>>>>
>>>> do you have lava-coordinator running?
>>>>
>>>> On Mon, Jun 14, 2021 at 14:29, Hedy Lamarr lamarrhedy97@gmail.com
>>>> wrote:
>>>>
>>>>> By the way, we use 2021.03.post1.
>>>>>
>>>>> On Wed, Jun 9, 2021 at 10:40 AM Hedy Lamarr <
>>>>> lamarrhedy97@gmail.com> wrote:
>>>>>
>>>>>> Dear community,
>>>>>>
>>>>>> We are new to LAVA and are trying to use it for our Android tests. We
>>>>>> have an issue when testing iperf.
>>>>>>
>>>>>> Job:
>>>>>>
>>>>>> job_name: android iperf test
>>>>>> timeouts:
>>>>>>   job:
>>>>>>     minutes: 10080
>>>>>>   action:
>>>>>>     minutes: 120
>>>>>>   connection:
>>>>>>     minutes: 5
>>>>>> priority: medium
>>>>>> visibility: public
>>>>>> protocols:
>>>>>>   lava-multinode:
>>>>>>     roles:
>>>>>>       device:
>>>>>>         count: 1
>>>>>>         device_type: dragonboard-410c
>>>>>>         timeout:
>>>>>>           minutes: 5
>>>>>>       host:
>>>>>>         count: 1
>>>>>>         device_type: ssh
>>>>>>         timeout:
>>>>>>           minutes: 5
>>>>>>         context:
>>>>>>           ssh_host: localhost
>>>>>>           ssh_user: root
>>>>>>           ssh_port: 22
>>>>>>           ssh_identity_file: /root/.ssh/id_rsa
>>>>>> actions:
>>>>>> - deploy:
>>>>>>     role:
>>>>>>     - host
>>>>>>     timeout:
>>>>>>       minutes: 2
>>>>>>     to: ssh
>>>>>>     os: debian
>>>>>> - boot:
>>>>>>     role:
>>>>>>     - host
>>>>>>     method: ssh
>>>>>>     connection: ssh
>>>>>>     prompts:
>>>>>>       - '@labpc1'
>>>>>> - test:
>>>>>>     role:
>>>>>>     - host
>>>>>>     timeout:
>>>>>>       minutes: 120
>>>>>>     definitions:
>>>>>>     - from: inline
>>>>>>       name: smoke-case
>>>>>>       path: inline/test.yaml
>>>>>>       repository:
>>>>>>         metadata:
>>>>>>           format: Lava-Test Test Definition
>>>>>>           name: smoke
>>>>>>           description: Run smoke case
>>>>>>         run:
>>>>>>           steps:
>>>>>>           - sleep 60
>>>>>>           - lava-send "server_ready"
>>>>>>           - iperf -s -V -P 1
>>>>>> - test:
>>>>>>     role:
>>>>>>     - device
>>>>>>     definitions:
>>>>>>     - from: inline
>>>>>>       name: cts_cts-media_test
>>>>>>       path: inline/cts_cts-media_test.yaml
>>>>>>       repository:
>>>>>>         metadata:
>>>>>>           description: cts cts-media test run
>>>>>>           format: Lava-Test Test Definition 1.0
>>>>>>           name: cts-cts-media-test-run
>>>>>>         run:
>>>>>>           steps:
>>>>>>           - adb wait-for-device
>>>>>>           - adb devices
>>>>>>           - adb root
>>>>>>           - adb wait-for-device
>>>>>>           - adb devices
>>>>>>           - lava-wait "server_ready"
>>>>>>           - sleep 3
>>>>>>           - lava-test-case "Case1" --shell adb shell /data/local/iperf -c 10.191.253.21 -t 10
>>>>>>     docker:
>>>>>>       image: terceiro/android-platform-tools
>>>>>>     timeout:
>>>>>>       minutes: 4200
>>>>>>
>>>>>> The job log for dragonboard-410c is:
>>>>>> + lava-wait server_ready
>>>>>> <LAVA_WAIT_DEBUG  preparing Wed Jun  8 10:07:22 CST 2021>
>>>>>> <LAVA_WAIT_DEBUG  started Wed Jun  8 10:07:22 CST 2021>
>>>>>> <LAVA_MULTI_NODE> <LAVA_WAIT server_ready>
>>>>>> <LAVA_WAIT_DEBUG  finished Wed Jun  8 10:07:22 CST 2021>
>>>>>> <LAVA_WAIT_DEBUG  finished Wed Jun  8 10:07:22 CST 2021>
>>>>>> <LAVA_WAIT_DEBUG  starting to wait Wed Jun  8 10:07:22 CST 2021>
>>>>>> NOTE: it appears to hang at this step; the job can't continue.
>>>>>>
>>>>>> The job log for ssh is:
>>>>>> + lava-send server_ready
>>>>>> <LAVA_SEND_DEBUG lava_multi_node_send preparing Wed Jun  8
>>>>>> 10:07:53 CST 2021>
>>>>>> <LAVA_SEND_DEBUG _lava_multi_node_send started Wed Jun  8
>>>>>> 10:07:53 CST 2021>
>>>>>> <LAVA_MULTI_NODE> <LAVA_SEND server_ready>
>>>>>> Received Multi_Node API <LAVA_SEND>
>>>>>> messageID: SEND-server_ready
>>>>>> lava-multinode lava-send
>>>>>> Handling signal <LAVA_SEND {"request": "lava_send",
>>>>>> "messageID": "server_ready", "message": {}, "timeout": 300}>
>>>>>> Setting poll timeout of 300 seconds
>>>>>> requesting lava_send server_ready
>>>>>> message: {}
>>>>>> requesting lava_send server_ready with args {}
>>>>>> request_send server_ready {}
>>>>>> Sending {'request': 'lava_send', 'messageID': 'server_ready',
>>>>>> 'message': {}}
>>>>>> final message: {"port": 3079, "blocksize": 4096, "poll_delay":
>>>>>> 3, "host": "10.191.253.109", "hostname": "lavaslave1", "client_name":
>>>>>> "3035", "group_name": "8a362e2a-6ee9-4f48-bddb-378ac2425f06", "role":
>>>>>> "host", "request": "lava_send", "messageID": "server_ready", "message": {}}
>>>>>> Connecting to LAVA Coordinator on 10.191.253.109:3079
>>>>>> timeout=300 seconds.
>>>>>> case: multinode-send-server_ready
>>>>>> case_id: 39177
>>>>>> definition: 0_smoke-case
>>>>>> result: pass
>>>>>> <LAVA_SEND_DEBUG _lava_multi_node_send finished Wed Jun  8
>>>>>> 10:07:53 CST 2021>
>>>>>> <LAVA_SEND_DEBUG lava_multi_node_send finished Wed Jun  8
>>>>>> 10:07:53 CST 2021>
>>>>>> + iperf -s -V -P 1
>>>>>> ------------------------------------------------------------
>>>>>> Server listening on TCP port 5001
>>>>>> TCP window size: 85.3 KByte (default)
>>>>>> ------------------------------------------------------------
>>>>>>
>>>>>> It seems that one node can't receive server_ready from the other
>>>>>> node. What's wrong with my job? Please help!
>>>>>>
>>>>>> Thanks,
>>>>>> Hedy Lamarr
>>>>>>
>>>>>> _______________________________________________
>>>>> Lava-users mailing list
>>>>> Lava-users@lists.lavasoftware.org
>>>>> https://lists.lavasoftware.org/mailman/listinfo/lava-users
>>>>>
>>>>
>>>>
>>>> --
>>>> Rémi Duraffort
>>>> TuxArchitect
>>>> Linaro
>>>>
>>>
>>
>> --
>> Rémi Duraffort
>> TuxArchitect
>> Linaro
>>
>
--
Rémi Duraffort
TuxArchitect
Linaro
