From Sarath_PT@mentor.com Fri Feb 25 05:11:03 2022
From: "P T, Sarath"
To: lava-users@lists.lavasoftware.org
Subject: [Lava-users] Re: Job is not exiting after the timeout
Date: Fri, 25 Feb 2022 05:10:58 +0000
Message-ID: <7b18ad8ebf54460e935b147659d2da99@svr-orw-mbx-01.mgc.mentorg.com>

Hi Antonio,

These are the logs for the server connection:

Worker side log ( /var/log/lava-dispatcher/lava-worker.log )
------------------------------------------------------------
2022-02-24 05:56:58,718 INFO [3834] FINISHED => server
2022-02-24 05:57:01,233 ERROR [3834] -> server error: code 404
2022-02-24 05:57:01,233 DEBUG [3834] --> {"error": "Unknown job '3834'"}
2022-02-24 05:57:18,246 INFO PING => server
2022-02-24 05:57:18,729 INFO [3834] FINISHED => server
2022-02-24 05:57:18,965 ERROR [3834] -> server error: code 503
2022-02-24 05:57:18,965 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
2022-02-24 05:57:38,248 INFO PING => server
2022-02-24 05:57:38,737 INFO [3834] FINISHED => server
2022-02-24 05:57:38,977 ERROR [3834] -> server error: code 503
2022-02-24 05:57:38,977 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
2022-02-24 05:57:58,250 INFO PING => server
2022-02-24 05:57:58,731 INFO [3834] FINISHED => server
2022-02-24 05:57:58,968 ERROR [3834] -> server error: code 503
2022-02-24 05:57:58,969 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
2022-02-24 05:58:18,252 INFO PING => server
2022-02-24 05:58:18,745 INFO [3834] FINISHED => server
2022-02-24 05:58:21,739 ERROR [3834] -> server error: code 502
2022-02-24 05:58:21,740 DEBUG [3834] --> 502 Bad Gateway

Bad Gateway

The proxy server received an invalid response from an upstream server.


Apache/2.4.38 (Debian) Server at 132.186.71.148 Port 80
2022-02-24 05:58:38,253 INFO PING => server
2022-02-24 05:58:38,735 INFO [3834] FINISHED => server
2022-02-24 05:58:38,971 ERROR [3834] -> server error: code 503
2022-02-24 05:58:38,971 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
2022-02-24 05:58:58,254 INFO PING => server
2022-02-24 05:58:58,738 INFO [3834] FINISHED => server
2022-02-24 05:58:58,973 ERROR [3834] -> server error: code 503
2022-02-24 05:58:58,973 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
2022-02-24 05:59:18,256 INFO PING => server

Server side log ( /var/log/apache2/lava-server.log )
------------------------------------------------------
134.86.62.69 - - [24/Feb/2022:19:39:46 +0530] "GET /ws/ HTTP/1.1" 500 804 "-" "lava-worker 2021.10"
::1 - - [24/Feb/2022:19:39:46 +0530] "POST /scheduler/internal/v1/workers/ HTTP/1.1" 400 68338 "-" "lava-worker 2021.10"
[Thu Feb 24 19:39:46.711251 2022] [proxy:warn] [pid 9108:tid 140199652738816] [client 134.86.62.139:42968] AH01144: No protocol handler was valid for the URL /ws/ (scheme 'ws'). If you are using a DSO version of mod_proxy, make sure the proxy submodules are included in the configuration using LoadModule.
134.86.62.139 - - [24/Feb/2022:19:39:46 +0530] "GET /ws/ HTTP/1.1" 500 804 "-" "lava-worker 2021.10"
[Thu Feb 24 19:39:47.054716 2022] [proxy:warn] [pid 9151:tid 140199132653312] [client 134.86.61.20:43200] AH01144: No protocol handler was valid for the URL /ws/ (scheme 'ws'). If you are using a DSO version of mod_proxy, make sure the proxy submodules are included in the configuration using LoadModule.
134.86.61.20 - - [24/Feb/2022:19:39:47 +0530] "GET /ws/ HTTP/1.1" 500 804 "-" "lava-worker 2021.10"
[Thu Feb 24 19:39:47.919417 2022] [proxy:warn] [pid 9108:tid 140200256718592] [client 134.86.62.69:45566] AH01144: No protocol handler was valid for the URL /ws/ (scheme 'ws'). If you are using a DSO version of mod_proxy, make sure the proxy submodules are included in the configuration using LoadModule.
134.86.62.69 - - [24/Feb/2022:19:39:47 +0530] "GET /ws/ HTTP/1.1" 500 804 "-" "lava-worker 2021.10"
[Thu Feb 24 19:39:48.202295 2022] [proxy:warn] [pid 9151:tid 140199661131520] [client 134.86.62.139:42970] AH01144: No protocol handler was valid for the URL /ws/ (scheme 'ws'). If you are using a DSO version of mod_proxy, make sure the proxy submodules are included in the configuration using LoadModule.
134.86.62.139 - - [24/Feb/2022:19:39:48 +0530] "GET /ws/ HTTP/1.1" 500 804 "-" "lava-worker 2021.10"
[Thu Feb 24 19:39:48.515377 2022] [proxy:warn] [pid 9108:tid 140200655480576] [client 134.86.61.20:43202] AH01144: No protocol handler was valid for the URL /ws/ (scheme 'ws'). If you are using a DSO version of mod_proxy, make sure the proxy submodules are included in the configuration using LoadModule.
134.86.61.20 - - [24/Feb/2022:19:39:48 +0530] "GET /ws/ HTTP/1.1" 500 804 "-" "lava-worker 2021.10"

Server side log ( /var/log/lava-server/gunicorn.log )
--------------------------------------------------------
[2022-02-24 14:02:17 +0000] [704] [DEBUG] GET /scheduler/internal/v1/workers/slll-worker-testing/
[2022-02-24 14:02:18 +0000] [704] [DEBUG] POST /scheduler/internal/v1/workers/
[2022-02-24 14:02:19 +0000] [704] [DEBUG] GET /scheduler/internal/v1/workers/bng-test-worker/
[2022-02-24 14:02:20 +0000] [722] [DEBUG] GET /scheduler/internal/v1/workers/Test-worker/
[2022-02-24 14:02:20 +0000] [704] [DEBUG] POST /scheduler/internal/v1/jobs/3879/
[2022-02-24 14:02:23 +0000] [722] [DEBUG] POST /scheduler/internal/v1/workers/
[2022-02-24 14:02:28 +0000] [721] [DEBUG] POST /scheduler/internal/v1/workers/
[2022-02-24 14:02:29 +0000] [704] [DEBUG] GET /scheduler/job/3966/job_status
[2022-02-24 14:02:29 +0000] [721] [DEBUG] GET /scheduler/job/3966/log_pipeline_incremental
[2022-02-24 14:02:33 +0000] [704] [DEBUG] POST /scheduler/internal/v1/workers/
[2022-02-24 14:02:37 +0000] [704] [DEBUG] GET /scheduler/internal/v1/workers/slll-worker-testing/
[2022-02-24 14:02:38 +0000] [720] [DEBUG] POST /scheduler/internal/v1/workers/
[2022-02-24 14:02:38 +0000] [704] [DEBUG] POST /scheduler/internal/v1/jobs/3834/
[2022-02-24 14:02:39 +0000] [704] [DEBUG] GET /scheduler/internal/v1/workers/bng-test-worker/
[2022-02-24 14:02:40 +0000] [704] [DEBUG] GET /scheduler/internal/v1/workers/Test-worker/
[2022-02-24 14:02:43 +0000] [722] [DEBUG] POST /scheduler/internal/v1/workers/
[2022-02-24 14:02:48 +0000] [722] [DEBUG] POST /scheduler/internal/v1/workers/

Regards
Sarath P T

-----Original Message-----
From: Antonio Terceiro [mailto:antonio.terceiro(a)linaro.org]
Sent: 24 February 2022 18:37
To: P T, Sarath
Cc: lava-users(a)lists.lavasoftware.org
Subject: Re: [Lava-users] Re: Job is not exiting after the timeout

On Thu, Feb 24, 2022 at 09:40:22AM +0000, P T, Sarath wrote:
> Hi Team,
>
> I was able to find the root cause of the issue. Here is my observation:
>
> 1. I deleted a `cancelling` job with the ID 3834 from the GUI.
> 2. On the next test run, the worker log shows errors like this:
>
> 2022-02-24 01:18:57,502 ERROR [3834] -> server error: code 503
> 2022-02-24 01:18:57,502 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
> 2022-02-24 01:19:16,795 INFO PING => server
> 2022-02-24 01:19:17,268 INFO [3834] FINISHED => server
> 2022-02-24 01:19:18,666 ERROR [3834] -> server error: code 404
> 2022-02-24 01:19:18,666 DEBUG [3834] --> {"error": "Unknown job '3834'"}
> 2022-02-24 01:19:36,797 INFO PING => server
> 2022-02-24 01:19:37,274 INFO [3834] FINISHED => server
> 2022-02-24 01:19:37,509 ERROR [3834] -> server error: code 503
> 2022-02-24 01:19:37,509 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

Is the server receiving the connections normally? If you look at the server
logs (apache and/or gunicorn) there should be corresponding error messages in
there telling you what went wrong.
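
When comparing the worker log against the server logs, it can help to first
tally which HTTP status codes the worker is seeing for each job. The following
is only a minimal sketch: the line format is inferred from the worker log
excerpts quoted in this thread, and the names ERROR_LINE, tally_server_errors
and SAMPLE are hypothetical helpers, not part of LAVA.

```python
import re
from collections import Counter

# Assumed line format, taken from the excerpts in this thread, e.g.:
# 2022-02-24 05:57:01,233 ERROR [3834] -> server error: code 404
ERROR_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) ERROR "
    r"\[(?P<job>\d+)\] -> server error: code (?P<code>\d{3})"
)


def tally_server_errors(lines):
    """Count (job id, HTTP status) pairs found in worker-log lines."""
    counts = Counter()
    for line in lines:
        match = ERROR_LINE.match(line.strip())
        if match:
            counts[(match["job"], int(match["code"]))] += 1
    return counts


# A few lines copied from the worker log above, used as sample input.
SAMPLE = """\
2022-02-24 05:56:58,718 INFO [3834] FINISHED => server
2022-02-24 05:57:01,233 ERROR [3834] -> server error: code 404
2022-02-24 05:57:01,233 DEBUG [3834] --> {"error": "Unknown job '3834'"}
2022-02-24 05:57:18,246 INFO PING => server
2022-02-24 05:57:18,965 ERROR [3834] -> server error: code 503
2022-02-24 05:57:38,977 ERROR [3834] -> server error: code 503
2022-02-24 05:58:21,739 ERROR [3834] -> server error: code 502
""".splitlines()

if __name__ == "__main__":
    for (job, code), n in sorted(tally_server_errors(SAMPLE).items()):
        print(f"job {job}: HTTP {code} x{n}")
```

To run it against a real log, pass an open file instead of SAMPLE, for example
tally_server_errors(open("/var/log/lava-dispatcher/lava-worker.log")). Note
that the three logs in this thread appear to use different time zones, so
timestamps would need normalising before lining entries up.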