Hi Neil,
On Fri, Jan 05, 2018 at 11:12:22AM +0000, Neil Williams wrote:
On 5 January 2018 at 10:52, Thomas Petazzoni <thomas.petazzoni@free-electrons.com> wrote:
And this happens for lots of jobs. Pretty much every day or two, we have ten boards stuck in this situation.
This should not be happening that often. We have seen it where lots of test jobs get cancelled when there is a long queue. We have also seen it where the actual setup is buggy.
We do cancel jobs on a regular basis, since we run both custom tests and KCI jobs, which occasionally leads to a big queue.
Check that NONE of the workers have any V1 hangovers; that includes lava-server still being installed on the worker and lava-master still running. Follow the docs on how to clean up a worker which used to support V1: https://staging.validation.linaro.org/static/docs/v2/pipeline-server.html#di... Similarly, check https://staging.validation.linaro.org/static/docs/v2/pipeline-server.html#di...
We followed these two chapters when upgrading to LAVA 2017.12, and we've completely disabled and cleaned up all V1-related files and services.
(I just double-checked).
I'll remove the auto-cancellation of old jobs that we have in place to see if that helps, and we'll wait for the next release to land.
Thanks, Antoine