Hi Sarath,

Thanks for your email.

The steps you suggested resolved the issue. I appreciate your support.

Regards,
Koti

On Thu, 22 Sept 2022 at 11:37, P T, Sarath <Sarath_PT@mentor.com> wrote:

Hi Koti,

The issue may be coming from the database. You need to clear the existing database, which still contains the jobs that do not exit even after the timeout. Try the steps below.

1) Navigate to the dispatcher temp directory:
$ cd /var/lib/lava/dispatcher/worker/tmp

2) Clear the jobs present under it:
$ sudo rm -rf *

3) Remove the “db.sqlite3” file under “/var/lib/lava/dispatcher/worker”:
$ sudo rm -f /var/lib/lava/dispatcher/worker/db.sqlite3

4) Restart the lava-worker service with the command below:
$ sudo systemctl restart lava-worker

5) Try to submit the job again.
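
The cleanup in steps 1 to 3 can be sketched as one shell function. This is only a sketch under assumptions: the name clear_worker_state and its optional directory argument are hypothetical, added so the function can be tried on a scratch directory first, and stopping the service before deleting is an added precaution that the steps above do not mention.

```shell
#!/bin/sh
# Sketch: clear the worker's cached job state (steps 1-3 above).
# clear_worker_state is a hypothetical helper name; the directory argument
# defaults to the path given in the steps.
clear_worker_state() {
    worker_dir="${1:-/var/lib/lava/dispatcher/worker}"

    # Refuse to run if the directory is missing, so a typo cannot delete elsewhere.
    [ -d "$worker_dir" ] || { echo "no such directory: $worker_dir" >&2; return 1; }

    # Steps 1-2: remove everything under tmp/ (including dotfiles), keep tmp/ itself.
    if [ -d "$worker_dir/tmp" ]; then
        find "$worker_dir/tmp" -mindepth 1 -delete
    fi

    # Step 3: remove the worker's local job database.
    rm -f "$worker_dir/db.sqlite3"
}

# Assumed usage on the worker host, run as root:
#   systemctl stop lava-worker
#   clear_worker_state
#   systemctl start lava-worker
```

Using an absolute path for db.sqlite3 matters here, because after step 1 the current directory is tmp/, not the directory that holds the database file.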

Let me know if these steps are helpful.

Regards
Sarath P T

From: koti [mailto:kotisoftwaretest@gmail.com]
Sent: 22 September 2022 02:14
To: lava-users <lava-users@lists.lavasoftware.org>
Cc: P T, Sarath <Sarath_PT@mentor.com>; Antonio Terceiro <antonio.terceiro@linaro.org>
Subject: Job does not exiting even after the timeout

 

Hi,

It looks like I am facing the same problem: the job does not exit even after the timeout.

I guess there might be a communication gap between the dispatcher and the server.

Dispatcher log screenshot (/var/log/lava-dispatcher/lava-worker.log):

[screenshot attachment: image.png]

Is there any solution to resolve this?

Regards,
Koti

On Sat, 26 Feb 2022 at 05:30, <lava-users-request@lists.lavasoftware.org> wrote:

Today's Topics:

   1. Re: Job is not exiting after the timeout (P T, Sarath)
   2. Re: Job is not exiting after the timeout (Antonio Terceiro)


----------------------------------------------------------------------

Message: 1
Date: Fri, 25 Feb 2022 05:10:58 +0000
From: "P T, Sarath" <Sarath_PT@mentor.com>
Subject: [Lava-users] Re: Job is not exiting after the timeout
To: Antonio Terceiro <antonio.terceiro@linaro.org>
Cc: "lava-users@lists.lavasoftware.org"
        <lava-users@lists.lavasoftware.org>
Message-ID:
        <7b18ad8ebf54460e935b147659d2da99@svr-orw-mbx-01.mgc.mentorg.com>
Content-Type: text/plain; charset="us-ascii"

Hi Antonio,

These are the logs for the server connection:

Worker side log ( /var/log/lava-dispatcher/lava-worker.log )
------------------------------------------------------------

2022-02-24 05:56:58,718 INFO [3834] FINISHED => server
2022-02-24 05:57:01,233 ERROR [3834] -> server error: code 404
2022-02-24 05:57:01,233 DEBUG [3834] --> {"error": "Unknown job '3834'"}
2022-02-24 05:57:18,246 INFO PING => server
2022-02-24 05:57:18,729 INFO [3834] FINISHED => server
2022-02-24 05:57:18,965 ERROR [3834] -> server error: code 503
2022-02-24 05:57:18,965 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
2022-02-24 05:57:38,248 INFO PING => server
2022-02-24 05:57:38,737 INFO [3834] FINISHED => server
2022-02-24 05:57:38,977 ERROR [3834] -> server error: code 503
2022-02-24 05:57:38,977 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
2022-02-24 05:57:58,250 INFO PING => server
2022-02-24 05:57:58,731 INFO [3834] FINISHED => server
2022-02-24 05:57:58,968 ERROR [3834] -> server error: code 503
2022-02-24 05:57:58,969 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
2022-02-24 05:58:18,252 INFO PING => server
2022-02-24 05:58:18,745 INFO [3834] FINISHED => server
2022-02-24 05:58:21,739 ERROR [3834] -> server error: code 502
2022-02-24 05:58:21,740 DEBUG [3834] --> <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>502 Bad Gateway</title>
</head><body>
<h1>Bad Gateway</h1>
<p>The proxy server received an invalid
response from an upstream server.<br />
</p>
<hr>
<address>Apache/2.4.38 (Debian) Server at 132.186.71.148 Port 80</address>
</body></html>


2022-02-24 05:58:38,253 INFO PING => server
2022-02-24 05:58:38,735 INFO [3834] FINISHED => server
2022-02-24 05:58:38,971 ERROR [3834] -> server error: code 503
2022-02-24 05:58:38,971 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
2022-02-24 05:58:58,254 INFO PING => server
2022-02-24 05:58:58,738 INFO [3834] FINISHED => server
2022-02-24 05:58:58,973 ERROR [3834] -> server error: code 503
2022-02-24 05:58:58,973 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
2022-02-24 05:59:18,256 INFO PING => server


Server side log ( /var/log/apache2/lava-server.log )
------------------------------------------------------

134.86.62.69 - - [24/Feb/2022:19:39:46 +0530] "GET /ws/ HTTP/1.1" 500 804 "-" "lava-worker 2021.10"
::1 - - [24/Feb/2022:19:39:46 +0530] "POST /scheduler/internal/v1/workers/ HTTP/1.1" 400 68338 "-" "lava-worker 2021.10"
[Thu Feb 24 19:39:46.711251 2022] [proxy:warn] [pid 9108:tid 140199652738816] [client 134.86.62.139:42968] AH01144: No protocol handler was valid for the URL /ws/ (scheme 'ws'). If you are using a DSO version of mod_proxy, make sure the proxy submodules are included in the configuration using LoadModule.
134.86.62.139 - - [24/Feb/2022:19:39:46 +0530] "GET /ws/ HTTP/1.1" 500 804 "-" "lava-worker 2021.10"
[Thu Feb 24 19:39:47.054716 2022] [proxy:warn] [pid 9151:tid 140199132653312] [client 134.86.61.20:43200] AH01144: No protocol handler was valid for the URL /ws/ (scheme 'ws'). If you are using a DSO version of mod_proxy, make sure the proxy submodules are included in the configuration using LoadModule.
134.86.61.20 - - [24/Feb/2022:19:39:47 +0530] "GET /ws/ HTTP/1.1" 500 804 "-" "lava-worker 2021.10"
[Thu Feb 24 19:39:47.919417 2022] [proxy:warn] [pid 9108:tid 140200256718592] [client 134.86.62.69:45566] AH01144: No protocol handler was valid for the URL /ws/ (scheme 'ws'). If you are using a DSO version of mod_proxy, make sure the proxy submodules are included in the configuration using LoadModule.
134.86.62.69 - - [24/Feb/2022:19:39:47 +0530] "GET /ws/ HTTP/1.1" 500 804 "-" "lava-worker 2021.10"
[Thu Feb 24 19:39:48.202295 2022] [proxy:warn] [pid 9151:tid 140199661131520] [client 134.86.62.139:42970] AH01144: No protocol handler was valid for the URL /ws/ (scheme 'ws'). If you are using a DSO version of mod_proxy, make sure the proxy submodules are included in the configuration using LoadModule.
134.86.62.139 - - [24/Feb/2022:19:39:48 +0530] "GET /ws/ HTTP/1.1" 500 804 "-" "lava-worker 2021.10"
[Thu Feb 24 19:39:48.515377 2022] [proxy:warn] [pid 9108:tid 140200655480576] [client 134.86.61.20:43202] AH01144: No protocol handler was valid for the URL /ws/ (scheme 'ws'). If you are using a DSO version of mod_proxy, make sure the proxy submodules are included in the configuration using LoadModule.
134.86.61.20 - - [24/Feb/2022:19:39:48 +0530] "GET /ws/ HTTP/1.1" 500 804 "-" "lava-worker 2021.10"


Server side log ( /var/log/lava-server/gunicorn.log )
--------------------------------------------------------

[2022-02-24 14:02:17 +0000] [704] [DEBUG] GET /scheduler/internal/v1/workers/slll-worker-testing/
[2022-02-24 14:02:18 +0000] [704] [DEBUG] POST /scheduler/internal/v1/workers/
[2022-02-24 14:02:19 +0000] [704] [DEBUG] GET /scheduler/internal/v1/workers/bng-test-worker/
[2022-02-24 14:02:20 +0000] [722] [DEBUG] GET /scheduler/internal/v1/workers/Test-worker/
[2022-02-24 14:02:20 +0000] [704] [DEBUG] POST /scheduler/internal/v1/jobs/3879/
[2022-02-24 14:02:23 +0000] [722] [DEBUG] POST /scheduler/internal/v1/workers/
[2022-02-24 14:02:28 +0000] [721] [DEBUG] POST /scheduler/internal/v1/workers/
[2022-02-24 14:02:29 +0000] [704] [DEBUG] GET /scheduler/job/3966/job_status
[2022-02-24 14:02:29 +0000] [721] [DEBUG] GET /scheduler/job/3966/log_pipeline_incremental
[2022-02-24 14:02:33 +0000] [704] [DEBUG] POST /scheduler/internal/v1/workers/
[2022-02-24 14:02:37 +0000] [704] [DEBUG] GET /scheduler/internal/v1/workers/slll-worker-testing/
[2022-02-24 14:02:38 +0000] [720] [DEBUG] POST /scheduler/internal/v1/workers/
[2022-02-24 14:02:38 +0000] [704] [DEBUG] POST /scheduler/internal/v1/jobs/3834/
[2022-02-24 14:02:39 +0000] [704] [DEBUG] GET /scheduler/internal/v1/workers/bng-test-worker/
[2022-02-24 14:02:40 +0000] [704] [DEBUG] GET /scheduler/internal/v1/workers/Test-worker/
[2022-02-24 14:02:43 +0000] [722] [DEBUG] POST /scheduler/internal/v1/workers/
[2022-02-24 14:02:48 +0000] [722] [DEBUG] POST /scheduler/internal/v1/workers/


Regards
Sarath  P T

-----Original Message-----
From: Antonio Terceiro [mailto:antonio.terceiro@linaro.org]
Sent: 24 February 2022 18:37
To: P T, Sarath <Sarath_PT@mentor.com>
Cc: lava-users@lists.lavasoftware.org
Subject: Re: [Lava-users] Re: Job is not exiting after the timeout

On Thu, Feb 24, 2022 at 09:40:22AM +0000, P T, Sarath wrote:
> Hi Team,
>
> I was able to find the root cause of the issue. Here are my observations:
>
> 1. I deleted a `cancelling` job with the ID 3834 from the GUI.
> 2. On the next test run, the worker log shows errors like this:
>
> 2022-02-24 01:18:57,502   ERROR [3834] -> server error: code 503
> 2022-02-24 01:18:57,502   DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
> 2022-02-24 01:19:16,795    INFO PING => server
> 2022-02-24 01:19:17,268    INFO [3834] FINISHED => server
> 2022-02-24 01:19:18,666   ERROR [3834] -> server error: code 404
> 2022-02-24 01:19:18,666   DEBUG [3834] --> {"error": "Unknown job '3834'"}
> 2022-02-24 01:19:36,797    INFO PING => server
> 2022-02-24 01:19:37,274    INFO [3834] FINISHED => server
> 2022-02-24 01:19:37,509   ERROR [3834] -> server error: code 503
> 2022-02-24 01:19:37,509   DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

Is the server receiving the connections normally? If you look at the server logs (apache and/or gunicorn) there should be corresponding error messages in there telling you what went wrong.
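
One way to look for those corresponding errors is to tally the failed requests that lava-worker clients made, as recorded in the Apache access log. The function below is a minimal sketch, not an official LAVA tool: summarize_worker_errors is a hypothetical name, and it assumes the access-log layout seen in this thread, where the request path is the 7th whitespace-separated field and the HTTP status is the 9th.

```shell
#!/bin/sh
# Sketch: count 4xx/5xx responses sent to lava-worker clients, grouped by
# status code and request path. Reads access-log lines on stdin.
# Lines that are not access-log entries (e.g. [proxy:warn] messages) are
# skipped because they do not mention the lava-worker user agent.
summarize_worker_errors() {
    awk '/lava-worker/ && $9 ~ /^[45][0-9][0-9]$/ { count[$9 " " $7]++ }
         END { for (k in count) print count[k], k }' | sort -rn
}

# Assumed usage on the server:
#   summarize_worker_errors < /var/log/apache2/lava-server.log
```

Each output line has the form "count status path", with the most frequent failure first, which makes it easy to see whether the errors cluster on one endpoint.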

------------------------------

Message: 2
Date: Fri, 25 Feb 2022 10:37:00 -0300
From: Antonio Terceiro <antonio.terceiro@linaro.org>
Subject: [Lava-users] Re: Job is not exiting after the timeout
To: "P T, Sarath" <Sarath_PT@mentor.com>
Cc: "lava-users@lists.lavasoftware.org"
        <lava-users@lists.lavasoftware.org>
Message-ID: <YhjbfBGnnyO67EIY@linaro.org>
Content-Type: multipart/signed; micalg=pgp-sha256;
        protocol="application/pgp-signature"; boundary="U431ChLU/1f+Fa7u"

On Fri, Feb 25, 2022 at 05:10:58AM +0000, P T, Sarath wrote:
> Server side log ( /var/log/apache2/lava-server.log )
> ------------------------------------------------------
>
> 134.86.62.69 - - [24/Feb/2022:19:39:46 +0530] "GET /ws/ HTTP/1.1" 500 804 "-" "lava-worker 2021.10"
> ::1 - - [24/Feb/2022:19:39:46 +0530] "POST /scheduler/internal/v1/workers/ HTTP/1.1" 400 68338 "-" "lava-worker 2021.10"
> [Thu Feb 24 19:39:46.711251 2022] [proxy:warn] [pid 9108:tid 140199652738816] [client 134.86.62.139:42968] AH01144: No protocol handler was valid for the URL /ws/ (scheme 'ws'). If you are using a DSO version of mod_proxy, make sure the proxy submodules are included in the configuration using LoadModule.
> 134.86.62.139 - - [24/Feb/2022:19:39:46 +0530] "GET /ws/ HTTP/1.1" 500 804 "-" "lava-worker 2021.10"
> [Thu Feb 24 19:39:47.054716 2022] [proxy:warn] [pid 9151:tid 140199132653312] [client 134.86.61.20:43200] AH01144: No protocol handler was valid for the URL /ws/ (scheme 'ws'). If you are using a DSO version of mod_proxy, make sure the proxy submodules are included in the configuration using LoadModule.
> 134.86.61.20 - - [24/Feb/2022:19:39:47 +0530] "GET /ws/ HTTP/1.1" 500 804 "-" "lava-worker 2021.10"
> [Thu Feb 24 19:39:47.919417 2022] [proxy:warn] [pid 9108:tid 140200256718592] [client 134.86.62.69:45566] AH01144: No protocol handler was valid for the URL /ws/ (scheme 'ws'). If you are using a DSO version of mod_proxy, make sure the proxy submodules are included in the configuration using LoadModule.
> 134.86.62.69 - - [24/Feb/2022:19:39:47 +0530] "GET /ws/ HTTP/1.1" 500 804 "-" "lava-worker 2021.10"
> [Thu Feb 24 19:39:48.202295 2022] [proxy:warn] [pid 9151:tid 140199661131520] [client 134.86.62.139:42970] AH01144: No protocol handler was valid for the URL /ws/ (scheme 'ws'). If you are using a DSO version of mod_proxy, make sure the proxy submodules are included in the configuration using LoadModule.
> 134.86.62.139 - - [24/Feb/2022:19:39:48 +0530] "GET /ws/ HTTP/1.1" 500 804 "-" "lava-worker 2021.10"
> [Thu Feb 24 19:39:48.515377 2022] [proxy:warn] [pid 9108:tid 140200655480576] [client 134.86.61.20:43202] AH01144: No protocol handler was valid for the URL /ws/ (scheme 'ws'). If you are using a DSO version of mod_proxy, make sure the proxy submodules are included in the configuration using LoadModule.
> 134.86.61.20 - - [24/Feb/2022:19:39:48 +0530] "GET /ws/ HTTP/1.1" 500 804 "-" "lava-worker 2021.10"

Your Apache is not configured correctly; you have probably not enabled
mod_proxy and/or mod_proxy_http. See
https://master.lavasoftware.org/static/docs/v2/installing_on_debian.html#production-releases
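
To check which proxy modules Apache has actually loaded, something like the following could be used. It is a sketch under assumptions: missing_proxy_modules is a hypothetical helper name, and listing proxy_wstunnel_module alongside the two modules named above is a guess based on the "scheme 'ws'" warning in the log, not something the linked documentation is quoted as requiring here.

```shell
#!/bin/sh
# Sketch: read the output of "apache2ctl -M" on stdin and print the name of
# each proxy module that does not appear in it.
# The module list is an assumption; adjust it to match your setup.
missing_proxy_modules() {
    loaded="$(cat)"
    for mod in proxy_module proxy_http_module proxy_wstunnel_module; do
        # -w matches whole words only, so proxy_module does not match
        # inside a longer module name.
        printf '%s\n' "$loaded" | grep -qw "$mod" || echo "$mod"
    done
}

# Assumed usage on a Debian server, run as root:
#   apache2ctl -M | missing_proxy_modules
#   a2enmod proxy proxy_http
#   systemctl restart apache2
```

If the function prints nothing, all three modules are loaded and the cause of the 500 responses on /ws/ lies elsewhere in the configuration.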

End of Lava-users Digest, Vol 42, Issue 6
*****************************************