Re: [Lava-users] Android iperf test hung.

1 Jul 2021


Hello,

By default, lava-coordinator is expected to run on localhost, so the second
device will connect to localhost instead of the real host running the
coordinator. Change the host in /etc/lava-coordinator/lava-coordinator.conf.
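As an illustration only, here is a minimal Python sketch of how one might check that setting before submitting a multinode job. It assumes the file is JSON and that the host is stored under a key named `coordinator_hostname`; the key name, the fallback to `localhost`, and both helper names are assumptions, so adjust them to match the installed file.

```python
import ipaddress
import json

# Names that always resolve to the local machine (assumed list).
LOOPBACK_NAMES = {"localhost", "localhost.localdomain"}


def is_loopback(host):
    """Return True if host refers only to the local machine."""
    host = host.strip().lower()
    if host in LOOPBACK_NAMES:
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal, so treat it as an ordinary hostname.
        return False


def coordinator_host_problem(conf_text, key="coordinator_hostname"):
    """Return a warning string if the configured host is loopback, else None.

    conf_text is the content of the coordinator config file; the key
    name is an assumption about that file's layout.
    """
    conf = json.loads(conf_text)
    host = conf.get(key, "localhost")
    if is_loopback(host):
        return f"{key}={host!r} is only reachable from the coordinator host itself"
    return None
```

To use it, read /etc/lava-coordinator/lava-coordinator.conf on the machine (or container) where the test shell runs and pass its content to `coordinator_host_problem`; a non-None result suggests the host entry needs changing.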
On Tue, 29 Jun 2021 at 07:19, Hedy Lamarr lamarrhedy97@gmail.com wrote:
...
The "ssh" device will ssh to another machine, which is not on the same
dispatcher, to start an iperf server.
The "dragonboard-410c" device will start a docker container and then, in this
container, call the iperf client to connect to the iperf server.

1. In the LAVA admin page, I linked both the "ssh" device and the
"dragonboard-410c" device to the same worker.
2. However, the command in ssh (iperf server) will run on another machine,
while the command in the docker container (iperf client) runs on the same
machine as the worker, I think.
I'm not sure whether you mean 1 or 2?
On Mon, Jun 28, 2021 at 10:34 PM Remi Duraffort remi.duraffort@linaro.org
wrote:
...
Hello,
On Mon, 28 Jun 2021 at 08:26, Hedy Lamarr lamarrhedy97@gmail.com wrote:
...
Hello,
What additional information do I need to provide to debug this issue?
Thanks,
Hedy Lamarr
On Thu, Jun 17, 2021 at 4:34 PM Hedy Lamarr lamarrhedy97@gmail.com
wrote:
...
Yes. To make it clear, I restarted the LAVA server just now, and here is the
full log from when that multinode job ran:
2021-06-17 09:16:07,428    INFO [INIT] LAVA coordinator has started.
2021-06-17 09:16:07,757    INFO [INIT] Version 2021.03
2021-06-17 09:16:07,757    INFO [INIT] Loading configuration from /etc/lava-coordinator/lava-coordinator.conf
2021-06-17 09:16:08,076    INFO [BTSP] binding to 0.0.0.0:3079
2021-06-17 09:16:08,076    INFO Ready to accept new connections
2021-06-17 09:17:23,603    INFO The decbbfe5-f3be-4e6c-a2b8-5744eabfe8a7 group will contain 2 nodes.
2021-06-17 09:17:23,603    INFO Waiting for 1 more clients to connect to decbbfe5-f3be-4e6c-a2b8-5744eabfe8a7 group
2021-06-17 09:17:23,603    INFO Ready to accept new connections
2021-06-17 09:17:23,790    INFO Group complete, starting tests
2021-06-17 09:17:23,790    INFO Ready to accept new connections
2021-06-17 09:17:26,613    INFO Group complete, starting tests
2021-06-17 09:17:26,613    INFO Ready to accept new connections
2021-06-17 09:18:03,522   DEBUG clear Group Data: 1 of 2
2021-06-17 09:18:03,522    INFO Ready to accept new connections
2021-06-17 09:18:06,001   DEBUG clear Group Data: 2 of 2
2021-06-17 09:18:06,001   DEBUG Clearing group data for decbbfe5-f3be-4e6c-a2b8-5744eabfe8a7
2021-06-17 09:18:06,001    INFO Ready to accept new connections
2021-06-17 09:24:43,620    INFO The 8956d8e7-1097-43e0-95dd-7afc61b2908b group will contain 2 nodes.
2021-06-17 09:24:43,620    INFO Waiting for 1 more clients to connect to 8956d8e7-1097-43e0-95dd-7afc61b2908b group
2021-06-17 09:24:43,620    INFO Ready to accept new connections
2021-06-17 09:24:43,871    INFO Group complete, starting tests
2021-06-17 09:24:43,871    INFO Ready to accept new connections
2021-06-17 09:24:46,634    INFO Group complete, starting tests
2021-06-17 09:24:46,634    INFO Ready to accept new connections
2021-06-17 09:25:45,746    INFO lava_send: {'port': 3079, 'blocksize': 4096, 'poll_delay': 3, 'host': '10.191.253.109', 'hostname': 'lavaslave1', 'client_name': '3077', 'group_name': '8956d8e7-1097-43e0-95dd-7afc61b2908b', 'role': 'host', 'request': 'lava_send', 'messageID': 'server_ready', 'message': {}}
2021-06-17 09:25:45,747    INFO lavaSend handler in Coordinator received a messageID 'server_ready' for group '8956d8e7-1097-43e0-95dd-7afc61b2908b' from 3077
2021-06-17 09:25:45,747   DEBUG message ID server_ready {"3077": {}} for 3077
2021-06-17 09:25:45,747   DEBUG broadcast ID server_ready {"3077": {}} for 3076
2021-06-17 09:25:45,747   DEBUG broadcast ID server_ready {"3077": {}} for 3077
2021-06-17 09:25:45,747    INFO Ready to accept new connections
This log is similar to the log I saw on the web: the "SSH device" with
lava_send looks OK, but the "dragonboard device for android test" with
lava_wait does not; it just hangs. From the above log, it looks like the
coordinator did not receive anything from it?
From what I see in the logs, lava-coordinator is not receiving any signal
from the second test.
Are both devices on the same dispatcher/worker?
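To check this more systematically, here is a minimal Python sketch that scans coordinator log text and reports which clients the coordinator logged a request from. It assumes each request is logged as a dict containing `'client_name'` and `'request'` fields, as in the lava_send line of the log quoted in this thread; whether lava_wait requests are logged in the same form is an assumption, and the function names are hypothetical.

```python
import re
from collections import defaultdict

# Matches the client and request type inside a logged request dict.
# DOTALL lets the match span lines, since mail clients wrap long log lines.
REQUEST_RE = re.compile(
    r"'client_name':\s*'(?P<client>[^']+)'.*?'request':\s*'(?P<request>[^']+)'",
    re.DOTALL,
)


def requests_by_client(log_text):
    """Map each client_name seen in the log to the set of request types it sent."""
    seen = defaultdict(set)
    for match in REQUEST_RE.finditer(log_text):
        seen[match.group("client")].add(match.group("request"))
    return dict(seen)


def silent_clients(log_text, expected_clients):
    """Return the expected clients that never appear as a request sender."""
    heard = requests_by_client(log_text)
    return sorted(c for c in expected_clients if c not in heard)
```

For the log in this thread, the broadcast lines name clients 3076 and 3077, but only 3077 appears as a sender, so `silent_clients(log_text, ["3076", "3077"])` would report 3076 as the node the coordinator never heard from.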
...
On Thu, Jun 17, 2021 at 4:02 PM Remi Duraffort <
remi.duraffort@linaro.org> wrote:
...
On Thu, 17 Jun 2021 at 09:11, Hedy Lamarr lamarrhedy97@gmail.com wrote:
...
The output is:
service lava-coordinator status
● lava-coordinator.service - LAVA coordinator
   Loaded: loaded (/lib/systemd/system/lava-coordinator.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2021-06-04 18:09:19 CET; 1 weeks 5 days ago
 Main PID: 629 (lava-coordinato)
    Tasks: 1 (limit: 4915)
   Memory: 7.4M
   CGroup: /system.slice/lava-coordinator.service
           └─629 /usr/bin/python3 /usr/bin/lava-coordinator --loglevel DEBUG
So it's working.
Is it listening on 10.191.253.109:3079?
Do you have anything in the lava-coordinator logs
(/var/log/lava-coordinator.log)?
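One way to answer the "is it listening" question from the client side is a plain TCP connection attempt. The following is a minimal Python sketch using only the standard library; the function name and the default timeout are hypothetical choices, and a successful connection only shows that the port is reachable, not that the coordinator protocol works end to end.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection resolves the host and applies the timeout
        # to the connection attempt; the socket is closed on exit.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False
```

Running `can_connect("10.191.253.109", 3079)` from the same place where the test shell runs (for example inside the docker container) and comparing it with `can_connect("localhost", 3079)` would show which address that environment can actually reach.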
...
On Thu, Jun 17, 2021 at 3:05 PM Remi Duraffort <
remi.duraffort@linaro.org> wrote:
>
>
> On Thu, 17 Jun 2021 at 09:02, Hedy Lamarr lamarrhedy97@gmail.com
> wrote:
>
>> Hello Remi,
>>
>> I think lava-coordinator is running.
>>
>> Because there are 2 devices here:
>> Device1: dragonboard-410c; when it runs lava-wait server_ready, it hangs
>> with the above log.
>> Device2: ssh; when it runs lava-send server_ready, it shows: Connecting to
>> LAVA Coordinator on 10.191.253.109:3079 timeout=300 seconds.
>>
>> Would it be possible that lava-coordinator just works for ssh, but
>> not for dragonboard-410c?
>> Also, I checked with netstat, which shows:
>> tcp        0      0 0.0.0.0:3079            0.0.0.0:*   LISTEN      629/python3          off (0.00/0/0)
>> Does this mean the coordinator is running? If not, how can I make sure
>> the coordinator is running?
>>
>
> service lava-coordinator status
>
>
>>
>> Thanks,
>> Hedy Lamarr
>>
>> On Thu, Jun 17, 2021 at 2:33 PM Remi Duraffort <
>> remi.duraffort@linaro.org> wrote:
>>
>>> Hello,
>>>
>>> do you have lava-coordinator running?
>>>
>>> On Mon, 14 Jun 2021 at 14:29, Hedy Lamarr lamarrhedy97@gmail.com
>>> wrote:
>>>
>>>> By the way, we use 2021.03.post1.
>>>>
>>>> On Wed, Jun 9, 2021 at 10:40 AM Hedy Lamarr <
>>>> lamarrhedy97@gmail.com> wrote:
>>>>
>>>>> Dear community,
>>>>>
>>>>> We are new to LAVA and are trying to use it for our Android tests. We
>>>>> have an issue when testing iperf.
>>>>>
>>>>> Job:
>>>>>
>>>>> job_name: android iperf test
>>>>> timeouts:
>>>>>   job:
>>>>>     minutes: 10080
>>>>>   action:
>>>>>     minutes: 120
>>>>>   connection:
>>>>>     minutes: 5
>>>>> priority: medium
>>>>> visibility: public
>>>>> protocols:
>>>>>   lava-multinode:
>>>>>     roles:
>>>>>       device:
>>>>>         count: 1
>>>>>         device_type: dragonboard-410c
>>>>>         timeout:
>>>>>           minutes: 5
>>>>>       host:
>>>>>         count: 1
>>>>>         device_type: ssh
>>>>>         timeout:
>>>>>           minutes: 5
>>>>>         context:
>>>>>           ssh_host: localhost
>>>>>           ssh_user: root
>>>>>           ssh_port: 22
>>>>>           ssh_identity_file: /root/.ssh/id_rsa
>>>>> actions:
>>>>> - deploy:
>>>>>     role:
>>>>>     - host
>>>>>     timeout:
>>>>>       minutes: 2
>>>>>     to: ssh
>>>>>     os: debian
>>>>> - boot:
>>>>>     role:
>>>>>     - host
>>>>>     method: ssh
>>>>>     connection: ssh
>>>>>     prompts:
>>>>>       - '@labpc1'
>>>>> - test:
>>>>>     role:
>>>>>     - host
>>>>>     timeout:
>>>>>       minutes: 120
>>>>>     definitions:
>>>>>     - from: inline
>>>>>       name: smoke-case
>>>>>       path: inline/test.yaml
>>>>>       repository:
>>>>>         metadata:
>>>>>           format: Lava-Test Test Definition
>>>>>           name: smoke
>>>>>           description: Run smoke case
>>>>>         run:
>>>>>           steps:
>>>>>           - sleep 60
>>>>>           - lava-send "server_ready"
>>>>>           - iperf -s -V -P 1
>>>>> - test:
>>>>>     role:
>>>>>     - device
>>>>>     definitions:
>>>>>     - from: inline
>>>>>       name: cts_cts-media_test
>>>>>       path: inline/cts_cts-media_test.yaml
>>>>>       repository:
>>>>>         metadata:
>>>>>           description: cts cts-media test run
>>>>>           format: Lava-Test Test Definition 1.0
>>>>>           name: cts-cts-media-test-run
>>>>>         run:
>>>>>           steps:
>>>>>           - adb wait-for-device
>>>>>           - adb devices
>>>>>           - adb root
>>>>>           - adb wait-for-device
>>>>>           - adb devices
>>>>>           - lava-wait "server_ready"
>>>>>           - sleep 3
>>>>>           - lava-test-case "Case1" --shell adb shell /data/local/iperf -c 10.191.253.21 -t 10
>>>>>     docker:
>>>>>       image: terceiro/android-platform-tools
>>>>>     timeout:
>>>>>       minutes: 4200
>>>>>
>>>>> The job log for dragonboard-410c is:
>>>>> + lava-wait server_ready
>>>>> <LAVA_WAIT_DEBUG  preparing Wed Jun  8 10:07:22 CST 2021>
>>>>> <LAVA_WAIT_DEBUG  started Wed Jun  8 10:07:22 CST 2021>
>>>>> <LAVA_MULTI_NODE> <LAVA_WAIT server_ready>
>>>>> <LAVA_WAIT_DEBUG  finished Wed Jun  8 10:07:22 CST 2021>
>>>>> <LAVA_WAIT_DEBUG  finished Wed Jun  8 10:07:22 CST 2021>
>>>>> <LAVA_WAIT_DEBUG  starting to wait Wed Jun  8 10:07:22 CST 2021>
>>>>> NOTE: it appears to hang at this step; the job can't continue.
>>>>>
>>>>> The job log for ssh is:
>>>>> + lava-send server_ready
>>>>> <LAVA_SEND_DEBUG lava_multi_node_send preparing Wed Jun  8 10:07:53 CST 2021>
>>>>> <LAVA_SEND_DEBUG _lava_multi_node_send started Wed Jun  8 10:07:53 CST 2021>
>>>>> <LAVA_MULTI_NODE> <LAVA_SEND server_ready>
>>>>> Received Multi_Node API <LAVA_SEND>
>>>>> messageID: SEND-server_ready
>>>>> lava-multinode lava-send
>>>>> Handling signal <LAVA_SEND {"request": "lava_send", "messageID": "server_ready", "message": {}, "timeout": 300}>
>>>>> Setting poll timeout of 300 seconds
>>>>> requesting lava_send server_ready
>>>>> message: {}
>>>>> requesting lava_send server_ready with args {}
>>>>> request_send server_ready {}
>>>>> Sending {'request': 'lava_send', 'messageID': 'server_ready', 'message': {}}
>>>>> final message: {"port": 3079, "blocksize": 4096, "poll_delay": 3, "host": "10.191.253.109", "hostname": "lavaslave1", "client_name": "3035", "group_name": "8a362e2a-6ee9-4f48-bddb-378ac2425f06", "role": "host", "request": "lava_send", "messageID": "server_ready", "message": {}}
>>>>> Connecting to LAVA Coordinator on 10.191.253.109:3079 timeout=300 seconds.
>>>>> case: multinode-send-server_ready
>>>>> case_id: 39177
>>>>> definition: 0_smoke-case
>>>>> result: pass
>>>>> <LAVA_SEND_DEBUG _lava_multi_node_send finished Wed Jun  8 10:07:53 CST 2021>
>>>>> <LAVA_SEND_DEBUG lava_multi_node_send finished Wed Jun  8 10:07:53 CST 2021>
>>>>> + iperf -s -V -P 1
>>>>> ------------------------------------------------------------
>>>>> Server listening on TCP port 5001
>>>>> TCP window size: 85.3 KByte (default)
>>>>> ------------------------------------------------------------
>>>>>
>>>>> It seems that one node can't receive server_ready from the other
>>>>> node. What's wrong with my job? Please help!
>>>>>
>>>>> Thanks,
>>>>> Hedy Lamarr
>>>>>
>>>>> _______________________________________________
>>>> Lava-users mailing list
>>>> Lava-users@lists.lavasoftware.org
>>>> https://lists.lavasoftware.org/mailman/listinfo/lava-users
>>>>
>>>
>>>
>>> --
>>> Rémi Duraffort
>>> TuxArchitect
>>> Linaro
>>>
>>
>
> --
> Rémi Duraffort
> TuxArchitect
> Linaro
>
--
Rémi Duraffort
TuxArchitect
Linaro
