Hi Team,
I am facing an issue related to the LAVA job timeout. The job does not exit even after the timeout; it keeps loading after the job has completed. My observations are below:
1. A LAVA job was executed with job id 3959.
2. On the worker I get the log shown below:
2022-02-22 23:19:09,534 DEBUG [3959] dispatch: None
2022-02-22 23:19:09,535 DEBUG [3959] env : {'purge': True, 'overrides': {'LC_ALL': 'C.UTF-8', 'LANG': 'C', 'PATH': '/usr/local/bin:/usr/local/sbin:/bin:/usr/bin:/usr/sbin:/sbin'}}
2022-02-22 23:19:09,535 DEBUG [3959] env-dut : None
2022-02-22 23:19:09,584 INFO [3959] RUNNING => server
2022-02-22 23:19:09,850 INFO [3834] FINISHED => server
2022-02-22 23:19:10,087 ERROR [3834] -> server error: code 503
2022-02-22 23:19:10,088 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
2022-02-22 23:19:28,651 INFO PING => server
2022-02-22 23:19:29,142 INFO [3834] FINISHED => server
2022-02-22 23:19:29,775 ERROR [3834] -> server error: code 503
2022-02-22 23:19:29,776 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
2022-02-22 23:19:48,652 INFO PING => server
2022-02-22 23:19:49,146 INFO [3834] FINISHED => server
2022-02-22 23:19:49,391 ERROR [3834] -> server error: code 503
2022-02-22 23:19:49,391 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
2022-02-22 23:20:08,653 INFO PING => server
2022-02-22 23:20:09,171 INFO [3959] running -> finished
2022-02-22 23:20:09,224 INFO [3834] FINISHED => server
2022-02-22 23:20:09,525 ERROR [3834] -> server error: code 503
2022-02-22 23:20:09,525 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
2022-02-22 23:20:28,654 INFO PING => server
2022-02-22 23:20:29,147 INFO [3834] FINISHED => server
2022-02-22 23:20:29,756 ERROR [3834] -> server error: code 503
2022-02-22 23:20:29,756 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
It seems that somewhere after the job execution the communication between the server and the worker is being interrupted. It would be a great help if you could look into the issue; I hope you can find a solution.
Regards,
Sarath P T
On Wed, Feb 23, 2022 at 08:55:32AM +0000, P T, Sarath wrote:
2022-02-22 23:20:08,653 INFO PING => server
2022-02-22 23:20:09,171 INFO [3959] running -> finished
2022-02-22 23:20:09,224 INFO [3834] FINISHED => server
2022-02-22 23:20:09,525 ERROR [3834] -> server error: code 503
2022-02-22 23:20:09,525 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
2022-02-22 23:20:28,654 INFO PING => server
2022-02-22 23:20:29,147 INFO [3834] FINISHED => server
2022-02-22 23:20:29,756 ERROR [3834] -> server error: code 503
2022-02-22 23:20:29,756 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
It seems that somewhere after the job execution the communication between the server and the worker is being interrupted. It would be a great help if you could look into the issue; I hope you can find a solution.
Is the server receiving the connections normally? If you look at the server logs (apache and/or gunicorn) there should be corresponding error messages in there telling you what went wrong.
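When matching the worker log against the server logs, it may help to know which job the failing requests belong to. In the excerpt above, every 503 is tagged [3834] rather than [3959]. The following is only a rough sketch for tallying that from a worker log: the function name `count_server_errors` and the `SAMPLE` text are made up for illustration, and the regular expressions simply assume the line layout seen in the excerpt. It is not part of LAVA.

```python
import re
from collections import Counter

# Assumed line layout, taken from the excerpt in this thread, e.g.:
#   2022-02-22 23:19:10,087 ERROR [3834] -> server error: code 503
# The "[job id]" part is optional (PING lines do not carry one).
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) "
    r"(?:\[(?P<job>\d+)\] )?"
    r"(?P<msg>.*)$"
)
CODE_RE = re.compile(r"server error: code (\d+)")


def count_server_errors(lines):
    """Return a Counter keyed by (job_id, http_code) for 'server error' lines."""
    errors = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None or match.group("job") is None:
            continue  # not a log line in the assumed layout, or no job id
        code = CODE_RE.search(match.group("msg"))
        if code:
            errors[(match.group("job"), int(code.group(1)))] += 1
    return errors


# Hypothetical sample, shortened from the log quoted in this thread.
SAMPLE = """\
2022-02-22 23:19:09,584 INFO [3959] RUNNING => server
2022-02-22 23:19:09,850 INFO [3834] FINISHED => server
2022-02-22 23:19:10,087 ERROR [3834] -> server error: code 503
2022-02-22 23:19:28,651 INFO PING => server
2022-02-22 23:19:29,775 ERROR [3834] -> server error: code 503
2022-02-22 23:20:09,171 INFO [3959] running -> finished
"""

if __name__ == "__main__":
    for (job, code), count in sorted(count_server_errors(SAMPLE.splitlines()).items()):
        print(f"job {job}: HTTP {code} x{count}")
```

To run it against a real worker log, pass the open file object to `count_server_errors` instead of the sample lines. The timestamps of the reported errors can then be looked up in the server-side logs.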
lava-users@lists.lavasoftware.org