Additionally, lavacli can do the work of querying the scheduler jobs list.
$ lavacli -i staging jobs list --health INCOMPLETE --state FINISHED --limit 5 --verbose --yaml
- {actual_device: staging-x15-01, description: x-15 tftp nfs OE, device_type: x15, end_time: '2019-01-29 20:42:16.329289+00:00', error_msg: auto-login-action timed out after 1768 seconds, error_type: Job, health: Incomplete, id: 248207, start_time: '2019-01-29 20:11:30.376908+00:00', state: Finished, submitter: neil.williams@linaro.org}
- {actual_device: staging-x15-01, description: x-15 tftp nfs OE, device_type: x15, end_time: '2019-01-29 20:08:38.065813+00:00', error_msg: auto-login-action timed out after 522 seconds, error_type: Job, health: Incomplete, id: 248206, start_time: '2019-01-29 19:58:28.706303+00:00', state: Finished, submitter: neil.williams@linaro.org}
- {actual_device: staging-hi960-hikey-02, description: HiKey 960 Android boot test, device_type: hi960-hikey, end_time: '2019-01-28 12:50:33.255416+00:00', error_msg: /usr/local/lab-scripts/cbrxd_hub_control -i DJ008WLX -m sync -u 15 failed, error_type: Infrastructure, health: Incomplete, id: 248142, start_time: '2019-01-28 12:47:44.897788+00:00', state: Finished, submitter: lava-health}
- {actual_device: staging-black04, description: beaglebone-black standard NFS health check, device_type: beaglebone-black, end_time: '2019-01-26 19:03:12.393021+00:00', error_msg: 'matched a bootloader error message: ''Retry count exceeded'' (4)', error_type: Infrastructure, health: Incomplete, id: 248104, start_time: '2019-01-26 19:01:51.542197+00:00', state: Finished, submitter: lava-health}
- {actual_device: staging-hi960-hikey-02, description: HiKey 960 Android boot test, device_type: hi960-hikey, end_time: '2019-01-25 16:31:58.165753+00:00', error_msg: /usr/local/lab-scripts/cbrxd_hub_control -i DJ008WLX -m sync -u 15 failed, error_type: Infrastructure, health: Incomplete, id: 248065, start_time: '2019-01-25 16:27:59.082922+00:00', state: Finished, submitter: lava-health}
On Wed, 30 Jan 2019 at 12:48, Neil Williams neil.williams@linaro.org wrote:
On Wed, 30 Jan 2019 at 10:11, Denis HUMEAU denis.humeau@st.com wrote:
After getting stats on the robustness of my setup, the next step is to get a complete view of the LAVA errors we meet in incomplete jobs.
Query isn't best suited for this - all you'll get are the testcases / Job IDs which match, not the reason for the failure. There is already a table for Recent Job Errors: http://localhost/scheduler/joberrors
The Query|Charts support in LAVA is deliberately simple and limited. Admins can use lava-server manage operations to get specific information and there are XMLRPC and REST API calls which can be made too. Quite quickly, other reporting support will be needed in lots of labs but each set of queries is different and custom reporting becomes necessary.
From what I see in incomplete jobs, my intention is to query on test suite lava and the name “job”.
I got limited use from: http://localhost/results/query/+custom?entity=testsuite&conditions=tests...
More instructive, I find, is to use notifications in health checks and/or functional tests:
https://git.lavasoftware.org/lava/functional-tests/blob/master/release/cubie...
The problem with looking at all incomplete test jobs is that you need to filter out genuine test failures (e.g. kernel didn't boot) from the Infrastructure errors or bugs. That's what the Recent Job Errors table does.
For XMLRPC you could start with http://localhost/api/help/#scheduler.jobs.list
`scheduler.jobs.list` ( `state=None`, `health=None`, `start=0`, `limit=25`, `since=None`, `verbose=False` )
Crucially, this call includes the "error_msg" & "error_type" in the first data set, if verbose is set.
Note: the reason why these are not typically included is that retrieving this data involves a lot more SQL queries under the hood. That's also why it is not possible for a Query to dig into that data.
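As a rough sketch of how that call could be used from a script, the following fetches finished, incomplete jobs with verbose set and groups them by error_type, so that Infrastructure errors can be separated from Job errors. Some assumptions here are not confirmed above: that the XML-RPC endpoint lives at /RPC2 on the same host, that state and health accept the same upper-case names lavacli takes, and that the server accepts None for since. The helper names fetch_incomplete_jobs and group_by_error_type are made up for illustration.

```python
import xmlrpc.client
from collections import defaultdict


def fetch_incomplete_jobs(uri, limit=25):
    """Return verbose records for finished, incomplete jobs (needs a live server)."""
    # allow_none lets the client send since=None, matching the documented default.
    # Assumption: the server accepts a nil value for that argument.
    server = xmlrpc.client.ServerProxy(uri, allow_none=True)
    # XML-RPC has no keyword arguments, so the values are passed positionally
    # in the order of the signature: state, health, start, limit, since, verbose.
    return server.scheduler.jobs.list("FINISHED", "INCOMPLETE", 0, limit, None, True)


def group_by_error_type(jobs):
    """Map each error_type to a list of (job id, error_msg) tuples."""
    grouped = defaultdict(list)
    for job in jobs:
        # Records without an error_type are collected under "Unknown".
        key = job.get("error_type") or "Unknown"
        grouped[key].append((job["id"], job.get("error_msg")))
    return dict(grouped)


if __name__ == "__main__":
    # Assumption: the endpoint path; add credentials to the URL if the
    # instance requires authentication.
    jobs = fetch_incomplete_jobs("http://localhost/RPC2")
    for error_type, entries in group_by_error_type(jobs).items():
        print(error_type, len(entries))
        for job_id, msg in entries:
            print("  ", job_id, msg)
```

Because verbose triggers the extra SQL queries mentioned above, it is probably worth keeping limit small and paging with start rather than asking for everything at once.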
--
Neil Williams
neil.williams@linaro.org http://www.linux.codehelp.co.uk/