Hi Team,
I am facing an issue related to the LAVA job timeout. The job does not exit even after the timeout; it keeps loading continuously after the job has completed. My observations are below:
1. A LAVA job was executed with job ID 3959.
2. On the worker, I get the log shown below:
2022-02-22 23:19:09,534 DEBUG [3959] dispatch: None
2022-02-22 23:19:09,535 DEBUG [3959] env : {'purge': True, 'overrides': {'LC_ALL': 'C.UTF-8', 'LANG': 'C', 'PATH': '/usr/local/bin:/usr/local/sbin:/bin:/usr/bin:/usr/sbin:/sbin'}}
2022-02-22 23:19:09,535 DEBUG [3959] env-dut : None
2022-02-22 23:19:09,584 INFO [3959] RUNNING => server
2022-02-22 23:19:09,850 INFO [3834] FINISHED => server
2022-02-22 23:19:10,087 ERROR [3834] -> server error: code 503
2022-02-22 23:19:10,088 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
2022-02-22 23:19:28,651 INFO PING => server
2022-02-22 23:19:29,142 INFO [3834] FINISHED => server
2022-02-22 23:19:29,775 ERROR [3834] -> server error: code 503
2022-02-22 23:19:29,776 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
2022-02-22 23:19:48,652 INFO PING => server
2022-02-22 23:19:49,146 INFO [3834] FINISHED => server
2022-02-22 23:19:49,391 ERROR [3834] -> server error: code 503
2022-02-22 23:19:49,391 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
2022-02-22 23:20:08,653 INFO PING => server
2022-02-22 23:20:09,171 INFO [3959] running -> finished
2022-02-22 23:20:09,224 INFO [3834] FINISHED => server
2022-02-22 23:20:09,525 ERROR [3834] -> server error: code 503
2022-02-22 23:20:09,525 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
2022-02-22 23:20:28,654 INFO PING => server
2022-02-22 23:20:29,147 INFO [3834] FINISHED => server
2022-02-22 23:20:29,756 ERROR [3834] -> server error: code 503
2022-02-22 23:20:29,756 DEBUG [3834] --> ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
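To make it easier to see which job the 503 errors belong to, here is a minimal sketch that counts ERROR lines per job ID in a worker log. It assumes the line layout seen in the excerpt above (timestamp, level, optional `[job id]`, message). The names `LINE_RE` and `errors_per_job` are only illustrative and are not part of LAVA.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpt above:
#   "<date> <time>,<ms> <LEVEL> [<job id>] <message>"
# The "[<job id>] " part is optional (PING lines do not carry one).
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) "
    r"(?:\[(?P<job>\d+)\] )?"
    r"(?P<msg>.*)$"
)


def errors_per_job(lines):
    """Count ERROR-level lines per job ID; lines without a job ID are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("level") == "ERROR" and match.group("job"):
            counts[match.group("job")] += 1
    return counts


# A few lines copied from the log above.
sample = [
    "2022-02-22 23:19:09,584 INFO [3959] RUNNING => server",
    "2022-02-22 23:19:09,850 INFO [3834] FINISHED => server",
    "2022-02-22 23:19:10,087 ERROR [3834] -> server error: code 503",
    "2022-02-22 23:19:28,651 INFO PING => server",
    "2022-02-22 23:20:09,171 INFO [3959] running -> finished",
]

print(errors_per_job(sample))
```

In the excerpt above, every ERROR line is tagged `[3834]` rather than `[3959]`, so the repeated 503 responses appear to come from the worker retrying the FINISHED report for job 3834.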
It seems that, somewhere after the job execution, the communication between the server and the worker is being interrupted. It would be a great help if you could look into this issue and suggest a solution.
Regards,
Sarath P T