I've run into this twice over the past few years. I used the Django shell to reset the state, FYI:
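# On the LAVA server, open the Django shell:
sudo lava-server manage shell
# Then, inside that shell, force the device state back to Idle:
from lava_scheduler_app.models import Device
Device.objects.filter(hostname="imx8mm-evk-sh99").update(state=Device.STATE_IDLE)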
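A slightly more cautious variant of the reset above, as a minimal sketch: it only touches the device while it is still marked 'Running', and it defaults to a dry run. The helper name `reset_stuck_device` and its `dry_run` parameter are hypothetical, and `Device.STATE_RUNNING` is assumed to exist alongside the `Device.STATE_IDLE` used above. The model is passed in as an argument so the function can be pasted into the Django shell as-is.

```python
def reset_stuck_device(device_model, hostname, dry_run=True):
    """Reset a device stuck in 'Running' back to 'Idle'.

    Returns the number of matching devices (0 or 1). Nothing is written
    to the database while dry_run is True.
    """
    # Match the device only while it is still marked as Running, so an
    # already-idle device is left alone. STATE_RUNNING is an assumption.
    stuck = device_model.objects.filter(
        hostname=hostname, state=device_model.STATE_RUNNING
    )
    if dry_run:
        return stuck.count()
    # QuerySet.update() returns the number of rows it changed.
    return stuck.update(state=device_model.STATE_IDLE)


# Inside `sudo lava-server manage shell`:
#   from lava_scheduler_app.models import Device
#   reset_stuck_device(Device, "imx8mm-evk-sh99")                 # dry run
#   reset_stuck_device(Device, "imx8mm-evk-sh99", dry_run=False)  # apply
```

Like the one-liner above, this goes through `QuerySet.update()`, so it writes straight to the database without calling the model's `save()` method or sending signals.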
-----Original Message-----
From: lava-users-request(a)lists.lavasoftware.org <lava-users-request(a)lists.lavasoftware.org>
Sent: Tuesday, June 7, 2022 8:00 AM
To: lava-users(a)lists.lavasoftware.org
Subject: [EXT] Lava-users Digest, Vol 46, Issue 3
Today's Topics:
1. Re: [EXTERNAL] Re: Device stuck in 'Running' (Milosz Wasilewski)
----------------------------------------------------------------------
Message: 1
Date: Mon, 6 Jun 2022 13:39:08 +0100
From: Milosz Wasilewski <milosz.wasilewski(a)foundries.io>
Subject: [Lava-users] Re: [EXTERNAL] Re: Device stuck in 'Running'
To: "Westermann, Oliver" <Oliver.Westermann(a)cognex.com>
Cc: "lava-users(a)lists.lavasoftware.org"
<lava-users(a)lists.lavasoftware.org>
Message-ID:
<CAH1=h_QrLjTp5g+gCgVnZ1fyi+M7tnamAqEngzxjCj--H+2wAw(a)mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"
On Thu, Jun 2, 2022 at 10:36 AM Westermann, Oliver <Oliver.Westermann(a)cognex.com> wrote:
>
> Hey,
>
> > From: Milosz Wasilewski <milosz.wasilewski(a)foundries.io>
> >
> > On Tue, May 31, 2022 at 9:04 PM Westermann, Oliver <Oliver.Westermann(a)cognex.com> wrote:
> > >
> > > Hey,
> > >
> > > I've a device stuck in 'Running' state, but missing the usual appending of '#<123> <Job Name> [submitter]'.
> > >
> > > Does somebody have some tips how to set/force the device state to 'Idle' manually?
> >
> > I put my bet on lava-dispatcher crashing somewhere along the way.
> > Please try to restart lava-dispatcher and all services around it (webserver, ser2net, etc.).
>
> Did so before my mail request. We're running the docker-compose setup and I've already updated (to 2022.05) and restarted (and rebuilt) the containers, with no success.
That's weird. It should fix the issue.
>
> > Once you've done this, you can mark the device "idle" in the admin interface. This should help.
>
> Can you give me some guidance here? The admin interface (as well as the lava-server CLI) allows me to set the health (good, bad, maintenance, retired), but not the state.
My bad, you're absolutely right. The state field doesn't seem to be editable. You might need to change it in the DB. I've no idea how it got to this point. In my case, when the dispatcher is restarted, the device status usually fixes itself. One thing to consider is deleting the offending TestJob from the DB. This might help (I didn't look at the code to confirm it).
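Before changing or deleting anything in the DB, it may help to see which jobs still reference the device. The sketch below lists the unfinished jobs for a hostname. It assumes that `TestJob` has an `actual_device` foreign key to `Device` and a `STATE_FINISHED` constant; neither has been checked against the 2022.05 sources here, and the helper name `unfinished_job_ids` is hypothetical.

```python
def unfinished_job_ids(testjob_model, hostname):
    """Return the ids of jobs tied to `hostname` that are not finished."""
    jobs = (
        testjob_model.objects
        # Assumed field name: TestJob.actual_device -> Device.hostname.
        .filter(actual_device__hostname=hostname)
        # Assumed constant: TestJob.STATE_FINISHED.
        .exclude(state=testjob_model.STATE_FINISHED)
    )
    return sorted(jobs.values_list("id", flat=True))


# Inside `sudo lava-server manage shell`:
#   from lava_scheduler_app.models import TestJob
#   unfinished_job_ids(TestJob, "imx8mm-evk-sh99")
```

An empty list would suggest that no job is holding the device any more, in which case only the device state itself is stale.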
>
> Unrelated sidenote:
> Rebuilding the docker containers failed for me.
> The dispatcher requires a file from Schneider Electric [1], which is currently not available to me.
> I found the file on Remi's GitHub [2] and fixed my Dockerfile locally, but I will try to create an issue in the GitLab.
It's the MIB for APC PDUs (or some other Schneider equipment). It's only required if you're using SNMP to talk to the PDU by name instead of by its ASN.1 number. Linaro's lab-scripts [1] make use of this MIB.
[1] https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.linar…
Best Regards,
Milosz
>
> Best regards, Olli
>
> [1]
> https://git.lavasoftware.org/lava/pkg/docker-compose/-/blob/master/dispatcher/Dockerfile#L14
> [2]
> https://github.com/ci-box/ci-box-lava-worker/blob/master/powernet428.mib
>
------------------------------
End of Lava-users Digest, Vol 46, Issue 3
*****************************************
Hey,
I've a device stuck in 'Running' state, but missing the usual appending of '#<123> <Job Name> [submitter]'. It does not accept new jobs, and cycling it through its health states does not change the state.
We had to reboot our systems unplanned (and uncoordinated) today, so that might be the cause of the issue, but I would like to get my device back :D
Google surfaced an old, outdated FAQ in the LAVA sources [1] and a bug marked as solved in our version, 2021.11 [2], but neither worked as expected.
Does somebody have some tips how to set/force the device state to 'Idle' manually?
Best regards, Olli
[1] https://git.lavasoftware.org/balikm/lava/-/blob/2015.01/doc/faq.rst
[2] https://git.lavasoftware.org/lava/lava/-/issues/471