Dear community,
We are new to LAVA and are trying to use it for our Android testing. We ran into an issue when testing iperf.
Job:
job_name: android iperf test
timeouts:
  job:
    minutes: 10080
  action:
    minutes: 120
  connection:
    minutes: 5
priority: medium
visibility: public
protocols:
  lava-multinode:
    roles:
      device:
        count: 1
        device_type: dragonboard-410c
        timeout:
          minutes: 5
      host:
        count: 1
        device_type: ssh
        timeout:
          minutes: 5
context:
  ssh_host: localhost
  ssh_user: root
  ssh_port: 22
  ssh_identity_file: /root/.ssh/id_rsa
actions:
- deploy:
    role:
    - host
    timeout:
      minutes: 2
    to: ssh
    os: debian
- boot:
    role:
    - host
    method: ssh
    connection: ssh
    prompts:
    - '@labpc1'
- test:
    role:
    - host
    timeout:
      minutes: 120
    definitions:
    - from: inline
      name: smoke-case
      path: inline/test.yaml
      repository:
        metadata:
          format: Lava-Test Test Definition
          name: smoke
          description: Run smoke case
        run:
          steps:
          - sleep 60
          - lava-send "server_ready"
          - iperf -s -V -P 1
- test:
    role:
    - device
    definitions:
    - from: inline
      name: cts_cts-media_test
      path: inline/cts_cts-media_test.yaml
      repository:
        metadata:
          description: cts cts-media test run
          format: Lava-Test Test Definition 1.0
          name: cts-cts-media-test-run
        run:
          steps:
          - adb wait-for-device
          - adb devices
          - adb root
          - adb wait-for-device
          - adb devices
          - lava-wait "server_ready"
          - sleep 3
          - lava-test-case "Case1" --shell adb shell /data/local/iperf -c 10.191.253.21 -t 10
    docker:
      image: terceiro/android-platform-tools
    timeout:
      minutes: 4200
The job log for dragonboard-410c is:

+ lava-wait server_ready
<LAVA_WAIT_DEBUG preparing Wed Jun 8 10:07:22 CST 2021>
<LAVA_WAIT_DEBUG started Wed Jun 8 10:07:22 CST 2021>
<LAVA_MULTI_NODE> <LAVA_WAIT server_ready>
<LAVA_WAIT_DEBUG finished Wed Jun 8 10:07:22 CST 2021>
<LAVA_WAIT_DEBUG finished Wed Jun 8 10:07:22 CST 2021>
<LAVA_WAIT_DEBUG starting to wait Wed Jun 8 10:07:22 CST 2021>

NOTE: it looks hung at this step; the job can't continue.
The job log for ssh is:

+ lava-send server_ready
<LAVA_SEND_DEBUG lava_multi_node_send preparing Wed Jun 8 10:07:53 CST 2021>
<LAVA_SEND_DEBUG _lava_multi_node_send started Wed Jun 8 10:07:53 CST 2021>
<LAVA_MULTI_NODE> <LAVA_SEND server_ready>
Received Multi_Node API <LAVA_SEND>
messageID: SEND-server_ready lava-multinode lava-send
Handling signal <LAVA_SEND {"request": "lava_send", "messageID": "server_ready", "message": {}, "timeout": 300}>
Setting poll timeout of 300 seconds
requesting lava_send server_ready message: {}
requesting lava_send server_ready with args {}
request_send server_ready {}
Sending {'request': 'lava_send', 'messageID': 'server_ready', 'message': {}}
final message: {"port": 3079, "blocksize": 4096, "poll_delay": 3, "host": "10.191.253.109", "hostname": "lavaslave1", "client_name": "3035", "group_name": "8a362e2a-6ee9-4f48-bddb-378ac2425f06", "role": "host", "request": "lava_send", "messageID": "server_ready", "message": {}}
Connecting to LAVA Coordinator on 10.191.253.109:3079 timeout=300 seconds.
case: multinode-send-server_ready
case_id: 39177
definition: 0_smoke-case
result: pass
<LAVA_SEND_DEBUG _lava_multi_node_send finished Wed Jun 8 10:07:53 CST 2021>
<LAVA_SEND_DEBUG lava_multi_node_send finished Wed Jun 8 10:07:53 CST 2021>
+ iperf -s -V -P 1
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
It seems that one node never receives server_ready from the other. What is wrong with my job? Please help!
Thanks, Hedy Lamarr
By the way, we are using 2021.03.post1.
Hello,
Do you have lava-coordinator running?
Lava-users mailing list Lava-users@lists.lavasoftware.org https://lists.lavasoftware.org/mailman/listinfo/lava-users
Hello Remi,
I think lava-coordinator is running. Here is what the two devices show:

Device 1 (dragonboard-410c): on lava-wait server_ready it hangs with the log above.
Device 2 (ssh): on lava-send server_ready it shows: Connecting to LAVA Coordinator on 10.191.253.109:3079 timeout=300 seconds.

Could it be that lava-coordinator works for the ssh node but not for the dragonboard-410c? I also checked with netstat, which shows:

tcp 0 0 0.0.0.0:3079 0.0.0.0:* LISTEN 629/python3 off (0.00/0/0)

Does this mean the coordinator is running? How can I make sure it is?
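[Editor's note] The LISTEN socket only proves that something is bound on the server itself. A quick extra check is to attempt a TCP connection to the coordinator port from the worker that runs the dragonboard job. This is a minimal sketch, not a LAVA tool; the function name is made up, and the host/port values are the ones from the logs in this thread:

```python
import socket

def coordinator_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Try a TCP connection to the coordinator; True means it accepted."""
    try:
        # create_connection resolves, connects, and raises OSError on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Run on the worker that drives the dragonboard-410c, e.g.:
# coordinator_reachable("10.191.253.109", 3079)
```

If this returns False on that worker while netstat on the server shows LISTEN, the coordinator is running but a firewall or routing problem is blocking that worker.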
Thanks, Hedy Lamarr
service lava-coordinator status
-- Rémi Duraffort TuxArchitect Linaro
The output is:
service lava-coordinator status
● lava-coordinator.service - LAVA coordinator
   Loaded: loaded (/lib/systemd/system/lava-coordinator.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2021-06-04 18:09:19 CET; 1 weeks 5 days ago
 Main PID: 629 (lava-coordinato)
    Tasks: 1 (limit: 4915)
   Memory: 7.4M
   CGroup: /system.slice/lava-coordinator.service
           └─629 /usr/bin/python3 /usr/bin/lava-coordinator --loglevel DEBUG
So it's working.
Is it listening on 10.191.253.109:3079? Do you have anything in the lava-coordinator logs (/var/log/lava-coordinator.log)?
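[Editor's note] The coordinator log is where both sides of the rendezvous should appear: a working exchange logs a lava_send request and a matching lava_wait request for the same messageID. As an illustrative sketch (not part of LAVA; it assumes the Python-dict style request lines the coordinator prints, and both names here are made up), the pairs can be pulled out like this:

```python
import re

# Matches lines like:
#   ... INFO lava_send: {... 'request': 'lava_send', 'messageID': 'server_ready', ...}
REQUEST_RE = re.compile(r"'request': '(\w+)'.*?'messageID': '([\w-]+)'")

def multinode_requests(log_lines):
    """Extract (request, messageID) pairs from lava-coordinator log lines."""
    pairs = []
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match:
            pairs.append(match.groups())
    return pairs
```

If a messageID only ever appears with lava_send and never with lava_wait, the waiting node's request is not reaching the coordinator at all.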
Yes. To make it clear, I restarted the LAVA server just now; here is the full coordinator log from running that multinode job:
2021-06-17 09:16:07,428 INFO [INIT] LAVA coordinator has started.
2021-06-17 09:16:07,757 INFO [INIT] Version 2021.03
2021-06-17 09:16:07,757 INFO [INIT] Loading configuration from /etc/lava-coordinator/lava-coordinator.conf
2021-06-17 09:16:08,076 INFO [BTSP] binding to 0.0.0.0:3079
2021-06-17 09:16:08,076 INFO Ready to accept new connections
2021-06-17 09:17:23,603 INFO The decbbfe5-f3be-4e6c-a2b8-5744eabfe8a7 group will contain 2 nodes.
2021-06-17 09:17:23,603 INFO Waiting for 1 more clients to connect to decbbfe5-f3be-4e6c-a2b8-5744eabfe8a7 group
2021-06-17 09:17:23,603 INFO Ready to accept new connections
2021-06-17 09:17:23,790 INFO Group complete, starting tests
2021-06-17 09:17:23,790 INFO Ready to accept new connections
2021-06-17 09:17:26,613 INFO Group complete, starting tests
2021-06-17 09:17:26,613 INFO Ready to accept new connections
2021-06-17 09:18:03,522 DEBUG clear Group Data: 1 of 2
2021-06-17 09:18:03,522 INFO Ready to accept new connections
2021-06-17 09:18:06,001 DEBUG clear Group Data: 2 of 2
2021-06-17 09:18:06,001 DEBUG Clearing group data for decbbfe5-f3be-4e6c-a2b8-5744eabfe8a7
2021-06-17 09:18:06,001 INFO Ready to accept new connections
2021-06-17 09:24:43,620 INFO The 8956d8e7-1097-43e0-95dd-7afc61b2908b group will contain 2 nodes.
2021-06-17 09:24:43,620 INFO Waiting for 1 more clients to connect to 8956d8e7-1097-43e0-95dd-7afc61b2908b group
2021-06-17 09:24:43,620 INFO Ready to accept new connections
2021-06-17 09:24:43,871 INFO Group complete, starting tests
2021-06-17 09:24:43,871 INFO Ready to accept new connections
2021-06-17 09:24:46,634 INFO Group complete, starting tests
2021-06-17 09:24:46,634 INFO Ready to accept new connections
2021-06-17 09:25:45,746 INFO lava_send: {'port': 3079, 'blocksize': 4096, 'poll_delay': 3, 'host': '10.191.253.109', 'hostname': 'lavaslave1', 'client_name': '3077', 'group_name': '8956d8e7-1097-43e0-95dd-7afc61b2908b', 'role': 'host', 'request': 'lava_send', 'messageID': 'server_ready', 'message': {}}
2021-06-17 09:25:45,747 INFO lavaSend handler in Coordinator received a messageID 'server_ready' for group '8956d8e7-1097-43e0-95dd-7afc61b2908b' from 3077
2021-06-17 09:25:45,747 DEBUG message ID server_ready {"3077": {}} for 3077
2021-06-17 09:25:45,747 DEBUG broadcast ID server_ready {"3077": {}} for 3076
2021-06-17 09:25:45,747 DEBUG broadcast ID server_ready {"3077": {}} for 3077
2021-06-17 09:25:45,747 INFO Ready to accept new connections
This log is similar to what I saw in the web UI: the "SSH device" doing lava_send looks fine, but the "dragonboard device for the android test" doing lava_wait is not OK, it just hangs. From the log above, it looks like the coordinator never received anything from that node?
On Thu, Jun 17, 2021 at 4:02 PM Remi Duraffort remi.duraffort@linaro.org wrote:
On Thu, Jun 17, 2021 at 09:11, Hedy Lamarr lamarrhedy97@gmail.com wrote:
The output is:
service lava-coordinator status
● lava-coordinator.service - LAVA coordinator
   Loaded: loaded (/lib/systemd/system/lava-coordinator.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2021-06-04 18:09:19 CET; 1 weeks 5 days ago
 Main PID: 629 (lava-coordinato)
    Tasks: 1 (limit: 4915)
   Memory: 7.4M
   CGroup: /system.slice/lava-coordinator.service
           └─629 /usr/bin/python3 /usr/bin/lava-coordinator --loglevel DEBUG
So it's working.
Is it listening on 10.191.253.109:3079 ? Do you have anything in the lava-coordinator logs? (/var/log/lava-coordinator.log)
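One way to answer the "is it listening" question from the second worker itself is a plain TCP connect probe. This is a generic sketch, not a LAVA tool; the host and port values are simply the ones that appear in the job log above:

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection handles DNS resolution and the timeout for us.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeout.
        return False


# Usage on the worker, with the values from the job log:
#   can_connect("10.191.253.109", 3079)
```

If this returns False from the worker that runs the dragonboard-410c job, the hang is a plain networking/configuration problem rather than a LAVA protocol one.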
On Thu, Jun 17, 2021 at 3:05 PM Remi Duraffort remi.duraffort@linaro.org wrote:
On Thu, Jun 17, 2021 at 09:02, Hedy Lamarr lamarrhedy97@gmail.com wrote:
Hello Remi,
I think lava-coordinator is running.
Because there are 2 devices here:
Device1: dragonboard-410c, when it runs lava-wait server_ready, it hangs with the log above.
Device2: ssh, when it runs lava-send server_ready, it shows: Connecting to LAVA Coordinator on 10.191.253.109:3079 timeout=300 seconds.
Could it be that lava-coordinator works for ssh but not for dragonboard-410c? Also, netstat shows:
tcp 0 0 0.0.0.0:3079 0.0.0.0:* LISTEN 629/python3 off (0.00/0/0)
Does this mean the coordinator is running? How else can I make sure the coordinator is running?
service lava-coordinator status
Thanks, Hedy Lamarr
On Thu, Jun 17, 2021 at 2:33 PM Remi Duraffort remi.duraffort@linaro.org wrote:
Hello,
do you have lava-coordinator running?
On Mon, Jun 14, 2021 at 14:29, Hedy Lamarr lamarrhedy97@gmail.com wrote:
By the way, we use 2021.03.post1.
On Wed, Jun 9, 2021 at 10:40 AM Hedy Lamarr lamarrhedy97@gmail.com wrote:
-- Rémi Duraffort TuxArchitect Linaro
Hello,
What additional information should I provide to debug this issue?
Thanks, Hedy Lamarr
On Thu, Jun 17, 2021 at 4:34 PM Hedy Lamarr lamarrhedy97@gmail.com wrote:
Yes. To make it clear, I restarted the LAVA server just now and captured a full log of that multinode job run; it is the coordinator log shown above.
Hello,
On Mon, Jun 28, 2021 at 08:26, Hedy Lamarr lamarrhedy97@gmail.com wrote:
From what I see in the logs, lava-coordinator is not receiving any signal from the second test.
Are both devices on the same dispatcher/worker?
The "ssh" device will ssh to another machine, which is not on the same dispatcher, to start an iperf server. The "dragonboard-410c" device will start a docker container; inside that container it will run the iperf client to connect to the iperf server.
1. In the LAVA admin page, I linked both the "ssh" device and the "dragonboard-410c" device to the same worker.
2. But the ssh commands (the iperf server) run on another machine, while the commands in the docker container (the iperf client) run on the same machine as the worker, I think.
I'm not sure whether you mean 1 or 2?
On Mon, Jun 28, 2021 at 10:34 PM Remi Duraffort remi.duraffort@linaro.org wrote:
Hello,
by default, lava-coordinator is expected to run on localhost, so the second device will connect to localhost instead of the real host running the coordinator. Change the host in /etc/lava-coordinator/lava-coordinator.conf.
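That file is plain JSON. A sketch of what it could look like, assuming the coordinator really runs on 10.191.253.109 (the address from your job logs; adjust to your setup):

```json
{
    "port": 3079,
    "blocksize": 4096,
    "poll_delay": 3,
    "coordinator_hostname": "10.191.253.109"
}
```

Only the hostname should need changing here; the other values are the usual defaults.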
On Tue, Jun 29, 2021 at 07:19, Hedy Lamarr lamarrhedy97@gmail.com wrote:
The "ssh" device will ssh to another machine which is not on the same dispatcher to start a iperf server. The "dragonboard-410c" device will start a docker container, then in this container, it will call iperf client to connect to the iperf server.
- In lava admin page, I link both "ssh device" & " dragonboard-410c" to
the same worker. 2. But, the command in ssh(iperf server) will run on another machine, while command in docker container(iperf client) run on the same machine of worker I think. I'm not sure you mean 1 or 2?
On Mon, Jun 28, 2021 at 10:34 PM Remi Duraffort remi.duraffort@linaro.org wrote:
Hello,
On Mon, Jun 28, 2021 at 08:26, Hedy Lamarr lamarrhedy97@gmail.com wrote:
Hello,
What additional information do I need to provide to debug this issue?
Thanks, Hedy Lamarr
On Thu, Jun 17, 2021 at 4:34 PM Hedy Lamarr lamarrhedy97@gmail.com wrote:
Yes. To make it clear, I restarted the LAVA server just now; here is a full log from when that multinode job ran:
2021-06-17 09:16:07,428 INFO [INIT] LAVA coordinator has started.
2021-06-17 09:16:07,757 INFO [INIT] Version 2021.03
2021-06-17 09:16:07,757 INFO [INIT] Loading configuration from /etc/lava-coordinator/lava-coordinator.conf
2021-06-17 09:16:08,076 INFO [BTSP] binding to 0.0.0.0:3079
2021-06-17 09:16:08,076 INFO Ready to accept new connections
2021-06-17 09:17:23,603 INFO The decbbfe5-f3be-4e6c-a2b8-5744eabfe8a7 group will contain 2 nodes.
2021-06-17 09:17:23,603 INFO Waiting for 1 more clients to connect to decbbfe5-f3be-4e6c-a2b8-5744eabfe8a7 group
2021-06-17 09:17:23,603 INFO Ready to accept new connections
2021-06-17 09:17:23,790 INFO Group complete, starting tests
2021-06-17 09:17:23,790 INFO Ready to accept new connections
2021-06-17 09:17:26,613 INFO Group complete, starting tests
2021-06-17 09:17:26,613 INFO Ready to accept new connections
2021-06-17 09:18:03,522 DEBUG clear Group Data: 1 of 2
2021-06-17 09:18:03,522 INFO Ready to accept new connections
2021-06-17 09:18:06,001 DEBUG clear Group Data: 2 of 2
2021-06-17 09:18:06,001 DEBUG Clearing group data for decbbfe5-f3be-4e6c-a2b8-5744eabfe8a7
2021-06-17 09:18:06,001 INFO Ready to accept new connections
2021-06-17 09:24:43,620 INFO The 8956d8e7-1097-43e0-95dd-7afc61b2908b group will contain 2 nodes.
2021-06-17 09:24:43,620 INFO Waiting for 1 more clients to connect to 8956d8e7-1097-43e0-95dd-7afc61b2908b group
2021-06-17 09:24:43,620 INFO Ready to accept new connections
2021-06-17 09:24:43,871 INFO Group complete, starting tests
2021-06-17 09:24:43,871 INFO Ready to accept new connections
2021-06-17 09:24:46,634 INFO Group complete, starting tests
2021-06-17 09:24:46,634 INFO Ready to accept new connections
2021-06-17 09:25:45,746 INFO lava_send: {'port': 3079, 'blocksize': 4096, 'poll_delay': 3, 'host': '10.191.253.109', 'hostname': 'lavaslave1', 'client_name': '3077', 'group_name': '8956d8e7-1097-43e0-95dd-7afc61b2908b', 'role': 'host', 'request': 'lava_send', 'messageID': 'server_ready', 'message': {}}
2021-06-17 09:25:45,747 INFO lavaSend handler in Coordinator received a messageID 'server_ready' for group '8956d8e7-1097-43e0-95dd-7afc61b2908b' from 3077
2021-06-17 09:25:45,747 DEBUG message ID server_ready {"3077": {}} for 3077
2021-06-17 09:25:45,747 DEBUG broadcast ID server_ready {"3077": {}} for 3076
2021-06-17 09:25:45,747 DEBUG broadcast ID server_ready {"3077": {}} for 3077
2021-06-17 09:25:45,747 INFO Ready to accept new connections
This log is similar to what I saw in the web UI: the "ssh" device doing lava_send looks OK, but the dragonboard device for the Android test doing lava_wait does not; it just hangs. From the log above, it looks like the coordinator never received anything from it?
From what I see in the logs, lava-coordinator is not receiving any signal from the second test.
Are both devices on the same dispatcher/worker?
On Thu, Jun 17, 2021 at 4:02 PM Remi Duraffort <remi.duraffort@linaro.org> wrote:
On Thu, Jun 17, 2021 at 09:11, Hedy Lamarr lamarrhedy97@gmail.com wrote:
The output is:
service lava-coordinator status
● lava-coordinator.service - LAVA coordinator
   Loaded: loaded (/lib/systemd/system/lava-coordinator.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2021-06-04 18:09:19 CET; 1 weeks 5 days ago
 Main PID: 629 (lava-coordinato)
    Tasks: 1 (limit: 4915)
   Memory: 7.4M
   CGroup: /system.slice/lava-coordinator.service
           └─629 /usr/bin/python3 /usr/bin/lava-coordinator --loglevel DEBUG
So it's working.
Is it listening on 10.191.253.109:3079? Do you have anything in the lava-coordinator logs (/var/log/lava-coordinator.log)?
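To check from the worker side (rather than from the server itself) that the coordinator port is actually reachable, a small sketch; the host and port below are taken from the job logs, adjust as needed:

```python
import socket

def coordinator_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and attempts a full TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False

# Run this on the worker, e.g.:
# coordinator_reachable("10.191.253.109", 3079)
```

If this returns False on the worker while netstat on the server shows LISTEN, the problem is between the two machines (routing, firewall, or a wrong host in the client config), not the coordinator itself.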
On Thu, Jun 17, 2021 at 3:05 PM Remi Duraffort <remi.duraffort@linaro.org> wrote:

On Thu, Jun 17, 2021 at 09:02, Hedy Lamarr lamarrhedy97@gmail.com wrote:

Hello Remi,

I think lava-coordinator is running.

Because there are two devices here:
Device1: dragonboard-410c; when it runs lava-wait server_ready, it hangs with the log above.
Device2: ssh; when it runs lava-send server_ready, it shows: Connecting to LAVA Coordinator on 10.191.253.109:3079 timeout=300 seconds.

Would it be possible that lava-coordinator works for ssh, but not for dragonboard-410c?
Also, netstat shows:

tcp 0 0 0.0.0.0:3079 0.0.0.0:* LISTEN 629/python3 off (0.00/0/0)

Does this mean the coordinator is running? Or how can I make sure it is running?

service lava-coordinator status

-- Rémi Duraffort TuxArchitect Linaro
Hi Remi, I have just one dispatcher, and both devices are linked to it. The issue persists after changing the config to:

{
    "port": 3079,
    "blocksize": 4096,
    "poll_delay": 3,
    "coordinator_hostname": "10.191.253.109"
}
On Thu, Jul 1, 2021 at 4:55 PM Remi Duraffort remi.duraffort@linaro.org wrote:
Hello,
by default, lava-coordinator is expected to run on localhost, so the second device will connect to localhost instead of the real host running the coordinator. Change the host in /etc/lava-coordinator/lava-coordinator.conf.
Just to let you know, we still have not fixed this issue. I'm still waiting for your suggestions.
Thanks, Hedy Lamarr
On Mon, Jul 5, 2021 at 4:02 PM Hedy Lamarr lamarrhedy97@gmail.com wrote:
Hi Remi, I have just one dispatcher, and both devices are linked to it. The issue persists after changing the config to:

{
    "port": 3079,
    "blocksize": 4096,
    "poll_delay": 3,
    "coordinator_hostname": "10.191.253.109"
}
On Thu, Jul 1, 2021 at 4:55 PM Remi Duraffort remi.duraffort@linaro.org wrote:
Hello,
by default, lava-coordinator is expected to run on localhost. So the second device will connect to localhost instead of the real host hosting coordinator. Change the host into /etc/lava-coordinator/lava-coordinator.conf
Le mar. 29 juin 2021 à 07:19, Hedy Lamarr lamarrhedy97@gmail.com a écrit :
The "ssh" device will ssh to another machine which is not on the same dispatcher to start a iperf server. The "dragonboard-410c" device will start a docker container, then in this container, it will call iperf client to connect to the iperf server.
- In lava admin page, I link both "ssh device" & " dragonboard-410c" to
the same worker. 2. But, the command in ssh(iperf server) will run on another machine, while command in docker container(iperf client) run on the same machine of worker I think. I'm not sure you mean 1 or 2?
On Mon, Jun 28, 2021 at 10:34 PM Remi Duraffort < remi.duraffort@linaro.org> wrote:
Hello,
Le lun. 28 juin 2021 à 08:26, Hedy Lamarr lamarrhedy97@gmail.com a écrit :
Hello,
What additional I need to afford to debug this issue?
Thanks, Hedy Lamarr
On Thu, Jun 17, 2021 at 4:34 PM Hedy Lamarr lamarrhedy97@gmail.com wrote:
YES, to make it clear, I restart the lava server just now and give you a full log when that multinode job run:
2021-06-17 09:16:07,428 INFO [INIT] LAVA coordinator has started. 2021-06-17 09:16:07,757 INFO [INIT] Version 2021.03 2021-06-17 09:16:07,757 INFO [INIT] Loading configuration from /etc/lava-coordinator/lava-coordinator.conf 2021-06-17 09:16:08,076 INFO [BTSP] binding to 0.0.0.0:3079 2021-06-17 09:16:08,076 INFO Ready to accept new connections 2021-06-17 09:17:23,603 INFO The decbbfe5-f3be-4e6c-a2b8-5744eabfe8a7 group will contain 2 nodes. 2021-06-17 09:17:23,603 INFO Waiting for 1 more clients to connect to decbbfe5-f3be-4e6c-a2b8-5744eabfe8a7 group 2021-06-17 09:17:23,603 INFO Ready to accept new connections 2021-06-17 09:17:23,790 INFO Group complete, starting tests 2021-06-17 09:17:23,790 INFO Ready to accept new connections 2021-06-17 09:17:26,613 INFO Group complete, starting tests 2021-06-17 09:17:26,613 INFO Ready to accept new connections 2021-06-17 09:18:03,522 DEBUG clear Group Data: 1 of 2 2021-06-17 09:18:03,522 INFO Ready to accept new connections 2021-06-17 09:18:06,001 DEBUG clear Group Data: 2 of 2 2021-06-17 09:18:06,001 DEBUG Clearing group data for decbbfe5-f3be-4e6c-a2b8-5744eabfe8a7 2021-06-17 09:18:06,001 INFO Ready to accept new connections 2021-06-17 09:24:43,620 INFO The 8956d8e7-1097-43e0-95dd-7afc61b2908b group will contain 2 nodes. 
2021-06-17 09:24:43,620 INFO Waiting for 1 more clients to connect to 8956d8e7-1097-43e0-95dd-7afc61b2908b group 2021-06-17 09:24:43,620 INFO Ready to accept new connections 2021-06-17 09:24:43,871 INFO Group complete, starting tests 2021-06-17 09:24:43,871 INFO Ready to accept new connections 2021-06-17 09:24:46,634 INFO Group complete, starting tests 2021-06-17 09:24:46,634 INFO Ready to accept new connections 2021-06-17 09:25:45,746 INFO lava_send: {'port': 3079, 'blocksize': 4096, 'poll_delay': 3, 'host': '10.191.253.109', 'hostname': 'lavaslave1', 'client_name': '3077', 'group_name': '8956d8e7-1097-43e0-95dd-7afc61b2908b', 'role': 'host', 'request': 'lava_send', 'messageID': 'server_ready', 'message': {}} 2021-06-17 09:25:45,747 INFO lavaSend handler in Coordinator received a messageID 'server_ready' for group '8956d8e7-1097-43e0-95dd-7afc61b2908b' from 3077 2021-06-17 09:25:45,747 DEBUG message ID server_ready {"3077": {}} for 3077 2021-06-17 09:25:45,747 DEBUG broadcast ID server_ready {"3077": {}} for 3076 2021-06-17 09:25:45,747 DEBUG broadcast ID server_ready {"3077": {}} for 3077 2021-06-17 09:25:45,747 INFO Ready to accept new connections
This log similar to the log I saw on web, the "SSH device" with lava_send looks ok, but "dragonboard device for android test" with lava_wait looks not ok, it's just hung. From above log, looks the coordinator did not receive anything?
For what I see in the logs, lava-coordinator is not receiving any signal from the second test.
Are both devices on the same dispatcher/worker?
On Thu, Jun 17, 2021 at 4:02 PM Remi Duraffort <
remi.duraffort@linaro.org> wrote:
> > > Le jeu. 17 juin 2021 à 09:11, Hedy Lamarr lamarrhedy97@gmail.com > a écrit : > >> The output is: >> >> service lava-coordinator status >> ● lava-coordinator.service - LAVA coordinator >> Loaded: loaded (/lib/systemd/system/lava-coordinator.service; >> enabled; vendor preset: enabled) >> Active: active (running) since Fri 2021-06-04 18:09:19 CET; 1 >> weeks 5 days ago >> Main PID: 629 (lava-coordinato) >> Tasks: 1 (limit: 4915) >> Memory: 7.4M >> CGroup: /system.slice/lava-coordinator.service >> └─629 /usr/bin/python3 /usr/bin/lava-coordinator >> --loglevel DEBUG >> > > So it's working. > > Is it listening on 10.191.253.109:3079 ? > Do you have anything in the lava-coordinator logs? > (/var/log/lava-coordinator.log) > > > >> On Thu, Jun 17, 2021 at 3:05 PM Remi Duraffort < >> remi.duraffort@linaro.org> wrote: >> >>> >>> >>> Le jeu. 17 juin 2021 à 09:02, Hedy Lamarr lamarrhedy97@gmail.com >>> a écrit : >>> >>>> Hello Remi, >>>> >>>> I think lava-coordinator is running. >>>> >>>> Because there are 2 devices here: >>>> Device1: dragonboard-410c, when lava-wait server_ready, it hangs >>>> with above log. >>>> Device2: ssh, when lava-send server_ready, it shows: Connecting >>>> to LAVA Coordinator on 10.191.253.109:3079 timeout=300 seconds. >>>> >>>> Would it be possible that lava-coordinator just works for ssh, >>>> but not for dragonboard-410c? >>>> Also I think the netstat, it shows: >>>> tcp 0 0 0.0.0.0:3079 0.0.0.0:* >>>> LISTEN 629/python3 off (0.00/0/0) >>>> Does this mean coordinator running? Or how can I make sure >>>> coordinator running? >>>> >>> >>> service lava-coordinator status >>> >>> >>>> >>>> Thanks, >>>> Hedy Lamarr >>>> >>>> On Thu, Jun 17, 2021 at 2:33 PM Remi Duraffort < >>>> remi.duraffort@linaro.org> wrote: >>>> >>>>> Hello, >>>>> >>>>> do you have lava-coordinator running? >>>>> >>>>> Le lun. 14 juin 2021 à 14:29, Hedy Lamarr < >>>>> lamarrhedy97@gmail.com> a écrit : >>>>> >>>>>> By the way, we use 2021.03.post1. 
>>>>>>
>>>>>> On Wed, Jun 9, 2021 at 10:40 AM Hedy Lamarr <lamarrhedy97@gmail.com> wrote:
>>>>>>
>>>>>>> Dear community,
>>>>>>>
>>>>>>> [job definition and job logs snipped; they are quoted in full in the original message above]
>>>>>>>
>>>>>>> It seems that one node can't receive server_ready from the other node. What's wrong with my job? Please help!
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Hedy Lamarr
>>>>>>
>>>>>> _______________________________________________
>>>>>> Lava-users mailing list
>>>>>> Lava-users@lists.lavasoftware.org
>>>>>> https://lists.lavasoftware.org/mailman/listinfo/lava-users

--
Rémi Duraffort
TuxArchitect
Linaro
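A quick way to answer the reachability question above is a plain TCP connect to the coordinator from the node whose lava-wait hangs. For the dragonboard-410c job that likely means from inside the terceiro/android-platform-tools container, since the test shell runs there (an assumption on my part, not confirmed by the logs). A minimal sketch in Python; the address comes from the job log, and coordinator_reachable is a hypothetical helper name:

```python
import socket

def coordinator_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake;
        # any failure (refused, unreachable, timed out) raises OSError.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # Address taken from the job log; run this where lava-wait hangs.
    print(coordinator_reachable("10.191.253.109", 3079, timeout=2.0))
```

If this returns False inside the container but True on the worker host, the container network or a firewall is blocking the coordinator port, which would explain lava-wait hanging on one node while lava-send on the ssh node succeeds.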