Hi Antonio,
These are the logs for the server connection:
Worker side log ( /var/log/lava-dispatcher/lava-worker.log ) ------------------------------------------------------------
2022-02-24 05:56:58,718 INFO [3834] FINISHED => server 2022-02-24 05:57:01,233 ERROR [3834] -> server error: code 404 2022-02-24 05:57:01,233 DEBUG [3834] --> {"error": "Unknown job '3834'"} 2022-02-24 05:57:18,246 INFO PING => server 2022-02-24 05:57:18,729 INFO [3834] FINISHED => server 2022-02-24 05:57:18,965 ERROR [3834] -> server error: code 503 2022-02-24 05:57:18,965 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')) 2022-02-24 05:57:38,248 INFO PING => server 2022-02-24 05:57:38,737 INFO [3834] FINISHED => server 2022-02-24 05:57:38,977 ERROR [3834] -> server error: code 503 2022-02-24 05:57:38,977 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')) 2022-02-24 05:57:58,250 INFO PING => server 2022-02-24 05:57:58,731 INFO [3834] FINISHED => server 2022-02-24 05:57:58,968 ERROR [3834] -> server error: code 503 2022-02-24 05:57:58,969 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')) 2022-02-24 05:58:18,252 INFO PING => server 2022-02-24 05:58:18,745 INFO [3834] FINISHED => server 2022-02-24 05:58:21,739 ERROR [3834] -> server error: code 502 2022-02-24 05:58:21,740 DEBUG [3834] --> <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"> <html><head> <title>502 Bad Gateway</title> </head><body> <h1>Bad Gateway</h1> <p>The proxy server received an invalid response from an upstream server.<br /> </p> <hr> <address>Apache/2.4.38 (Debian) Server at 132.186.71.148 Port 80</address> </body></html>
2022-02-24 05:58:38,253 INFO PING => server 2022-02-24 05:58:38,735 INFO [3834] FINISHED => server 2022-02-24 05:58:38,971 ERROR [3834] -> server error: code 503 2022-02-24 05:58:38,971 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')) 2022-02-24 05:58:58,254 INFO PING => server 2022-02-24 05:58:58,738 INFO [3834] FINISHED => server 2022-02-24 05:58:58,973 ERROR [3834] -> server error: code 503 2022-02-24 05:58:58,973 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')) 2022-02-24 05:59:18,256 INFO PING => server
Server side log ( /var/log/apache2/lava-server.log ) ------------------------------------------------------
134.86.62.69 - - [24/Feb/2022:19:39:46 +0530] "GET /ws/ HTTP/1.1" 500 804 "-" "lava-worker 2021.10" ::1 - - [24/Feb/2022:19:39:46 +0530] "POST /scheduler/internal/v1/workers/ HTTP/1.1" 400 68338 "-" "lava-worker 2021.10" [Thu Feb 24 19:39:46.711251 2022] [proxy:warn] [pid 9108:tid 140199652738816] [client 134.86.62.139:42968] AH01144: No protocol handler was valid for the URL /ws/ (scheme 'ws'). If you are using a DSO version of mod_proxy, make sure the proxy submodules are included in the configuration using LoadModule. 134.86.62.139 - - [24/Feb/2022:19:39:46 +0530] "GET /ws/ HTTP/1.1" 500 804 "-" "lava-worker 2021.10" [Thu Feb 24 19:39:47.054716 2022] [proxy:warn] [pid 9151:tid 140199132653312] [client 134.86.61.20:43200] AH01144: No protocol handler was valid for the URL /ws/ (scheme 'ws'). If you are using a DSO version of mod_proxy, make sure the proxy submodules are included in the configuration using LoadModule. 134.86.61.20 - - [24/Feb/2022:19:39:47 +0530] "GET /ws/ HTTP/1.1" 500 804 "-" "lava-worker 2021.10" [Thu Feb 24 19:39:47.919417 2022] [proxy:warn] [pid 9108:tid 140200256718592] [client 134.86.62.69:45566] AH01144: No protocol handler was valid for the URL /ws/ (scheme 'ws'). If you are using a DSO version of mod_proxy, make sure the proxy submodules are included in the configuration using LoadModule. 134.86.62.69 - - [24/Feb/2022:19:39:47 +0530] "GET /ws/ HTTP/1.1" 500 804 "-" "lava-worker 2021.10" [Thu Feb 24 19:39:48.202295 2022] [proxy:warn] [pid 9151:tid 140199661131520] [client 134.86.62.139:42970] AH01144: No protocol handler was valid for the URL /ws/ (scheme 'ws'). If you are using a DSO version of mod_proxy, make sure the proxy submodules are included in the configuration using LoadModule. 134.86.62.139 - - [24/Feb/2022:19:39:48 +0530] "GET /ws/ HTTP/1.1" 500 804 "-" "lava-worker 2021.10" [Thu Feb 24 19:39:48.515377 2022] [proxy:warn] [pid 9108:tid 140200655480576] [client 134.86.61.20:43202] AH01144: No protocol handler was valid for the URL /ws/ (scheme 'ws'). If you are using a DSO version of mod_proxy, make sure the proxy submodules are included in the configuration using LoadModule. 134.86.61.20 - - [24/Feb/2022:19:39:48 +0530] "GET /ws/ HTTP/1.1" 500 804 "-" "lava-worker 2021.10"
Server side log ( /var/log/lava-server/gunicorn.log ) --------------------------------------------------------
[2022-02-24 14:02:17 +0000] [704] [DEBUG] GET /scheduler/internal/v1/workers/slll-worker-testing/ [2022-02-24 14:02:18 +0000] [704] [DEBUG] POST /scheduler/internal/v1/workers/ [2022-02-24 14:02:19 +0000] [704] [DEBUG] GET /scheduler/internal/v1/workers/bng-test-worker/ [2022-02-24 14:02:20 +0000] [722] [DEBUG] GET /scheduler/internal/v1/workers/Test-worker/ [2022-02-24 14:02:20 +0000] [704] [DEBUG] POST /scheduler/internal/v1/jobs/3879/ [2022-02-24 14:02:23 +0000] [722] [DEBUG] POST /scheduler/internal/v1/workers/ [2022-02-24 14:02:28 +0000] [721] [DEBUG] POST /scheduler/internal/v1/workers/ [2022-02-24 14:02:29 +0000] [704] [DEBUG] GET /scheduler/job/3966/job_status [2022-02-24 14:02:29 +0000] [721] [DEBUG] GET /scheduler/job/3966/log_pipeline_incremental [2022-02-24 14:02:33 +0000] [704] [DEBUG] POST /scheduler/internal/v1/workers/ [2022-02-24 14:02:37 +0000] [704] [DEBUG] GET /scheduler/internal/v1/workers/slll-worker-testing/ [2022-02-24 14:02:38 +0000] [720] [DEBUG] POST /scheduler/internal/v1/workers/ [2022-02-24 14:02:38 +0000] [704] [DEBUG] POST /scheduler/internal/v1/jobs/3834/ [2022-02-24 14:02:39 +0000] [704] [DEBUG] GET /scheduler/internal/v1/workers/bng-test-worker/ [2022-02-24 14:02:40 +0000] [704] [DEBUG] GET /scheduler/internal/v1/workers/Test-worker/ [2022-02-24 14:02:43 +0000] [722] [DEBUG] POST /scheduler/internal/v1/workers/ [2022-02-24 14:02:48 +0000] [722] [DEBUG] POST /scheduler/internal/v1/workers/
Regards Sarath P T
-----Original Message----- From: Antonio Terceiro [mailto:antonio.terceiro@linaro.org] Sent: 24 February 2022 18:37 To: P T, Sarath Sarath_PT@mentor.com Cc: lava-users@lists.lavasoftware.org Subject: Re: [Lava-users] Re: Job is not exiting after the timeout
On Thu, Feb 24, 2022 at 09:40:22AM +0000, P T, Sarath wrote:
Hi Team,
I could able to find the root cause of the issue just giving my observation :
- I deleted a `cancelling` job with the ID 3834 from the GUI.
- And for the next test run its giving an error log under worker like this .
2022-02-24 01:18:57,502 ERROR [3834] -> server error: code 503 2022-02-24 01:18:57,502 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')) 2022-02-24 01:19:16,795 INFO PING => server 2022-02-24 01:19:17,268 INFO [3834] FINISHED => server 2022-02-24 01:19:18,666 ERROR [3834] -> server error: code 404 2022-02-24 01:19:18,666 DEBUG [3834] --> {"error": "Unknown job '3834'"} 2022-02-24 01:19:36,797 INFO PING => server 2022-02-24 01:19:37,274 INFO [3834] FINISHED => server 2022-02-24 01:19:37,509 ERROR [3834] -> server error: code 503 2022-02-24 01:19:37,509 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
Is the server receiving the connections normally? If you look at the server logs (apache and/or gunicorn) there should be corresponding error messages in there telling you what went wrong.