Hello,
On Wed, 8 Apr 2020 08:09:46 +0000 Milosz Wasilewski milosz.wasilewski@linaro.org wrote:
[]
So, what's that usage? Well, I'm not much interested in "interactive" use (submitting jobs manually from my machine). Our interest is in unattended automated CI, of which the testing system is the second half, after the build system. So let me recap how our build system, Jenkins, works. Normally, it just builds binaries and uploads them to a publishing server. It's invisible to me in this phase, and my engineering work goes uninterrupted. But when a build fails, I get an email with details about the failure. And I'll continue to get them while it continues to fail. So, the only option I have is to go see the failure, investigate, and fix it. When I arrive at Jenkins, I can easily see which jobs failed and which didn't, then within each job, see which builds failed and which succeeded. That's very easy, because failed things are red, and successful things are green.
This is just one 'test case'. In a way, Jenkins executes one test for you - the build test. You can clearly see this test result and associate it with a software version. LAVA executes multiple tests. There may be multiple users running their jobs on a single LAVA instance and even on a single LAVA 'device'.
But that's not what the talk is about. It's about:
- Jenkins clearly allows me to distinguish a "failed" build. It allows me to receive a notification when a build fails. Neither of these seems to be possible with LAVA.
Can you distinguish a build that failed because the disk ran out of space from one that failed due to a compilation error? No, you can't. This means that Jenkins' FAILED doesn't always indicate a fault in the code.
My point was that Jenkins offers the conceptual capability to do so. But indeed, every system likes to pose quizzes to its users, and some 10 years ago I had to write a Jenkins plugin to be able to use different statuses easily: https://git.linaro.org/infrastructure/jenkins-plugin-shell-status.git/tree/R... , dunno if the situation has improved since then.
And the reason I brought Jenkins into the discussion is to show that what I expect from LAVA didn't come out of thin air. We can discuss specific differences between Jenkins and LAVA for a long time, as there are many indeed. But the question I posed is "how to achieve a high-level, conceptual interaction model in LAVA similar to what's offered by Jenkins".
Yes, LAVA will send you a notification about failed (incomplete) jobs: https://master.lavasoftware.org/static/docs/v2/user-notifications.html?highl...
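For completeness, a minimal sketch of such a notification block in a job definition, based on the docs above (the recipient address is of course a placeholder):

    # Notify by email when the job ends up Incomplete.
    notify:
      criteria:
        status: incomplete
      recipients:
      - to:
          method: email
          email: ci-bot@example.com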
- You say "it's just one 'test case'", but I can make a job with one test case in LAVA, that test case can fail, and LAVA will still keep me oblivious to this fact.
The first test LAVA runs is the 'boot' test. So by the time you're running your 'first' explicit test, it's at best the second test LAVA executes. When boot fails, the job is marked as Incomplete (in most cases). You can also mark a job Incomplete from other tests with lava-test-raise. So I claim you're incorrect here.
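For reference, in a test-shell definition that might look something like this (a minimal sketch; the script name is a placeholder):

    run:
      steps:
        # If the suite can't even run, end the whole job as Incomplete.
        - ./run-tests.sh || lava-test-raise "test suite failed to run"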
That's exactly the interpretation of the "Incomplete" job status I came to: it means "infrastructure error", i.e. that the job didn't produce any useful "test" results (or, put another way, any test results it produced cannot be trusted).
Indeed, I had an idea that to achieve my aim - clear visibility of "failed" jobs - I could make them finish with "Incomplete" status. But I don't have "lava-test-raise" at hand, surprise ;-). For the "interactive" tests I use, I can raise an exception leading to the job being recorded with "Incomplete" status, but it's very verbose. Multiplied by a hundred testcases (that's not many, right?), it would be outright "ugly". But we also use "monitor" tests, and AFAIK, they don't support raising an exception, so again I'd need to patch that in.
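To show the verbosity I mean, each interactive command ends up with something like this (a sketch per my understanding of the interactive test action syntax; names and patterns are placeholders):

    - test:
        interactive:
        - name: example-suite
          prompts: ["=> "]
          script:
          - command: run testcase1
            name: testcase1
            failures:
            # Any output matching this pattern raises JobError,
            # ending the whole job as Incomplete.
            - message: "ERROR"
              exception: JobError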
But most importantly, that's conceptually wrong. I appreciate the fact that LAVA can run and record multiple testcases, and I don't want to work around that by making my jobs fail on the very first failure. I want to run as many tests as possible (like, all of them) and record their status; that's certainly useful, and "abort on first failure" throws that usage away for nothing.
Right, so LAVA not only absolves itself from helping the user interpret the results it produces, it simply disallows the user from doing that within its bounds, given the statuses listed above.
I completely don't understand this, sorry. You can check test job results in the LAVA UI.
Sure, let's try. I open LAVA, select Scheduler -> Jobs (perhaps I've even bookmarked it). I see a list of jobs, oftentimes health-check spam (and no, I can't use "Username -> Jobs", because "my" jobs aren't mine, but submitted by a bot user), then click the "Results" icon, and only then do I see job results.
Multiply that by the hundreds of jobs we have and you'll see my concern. (Why we have those hundreds of daily jobs is another matter; the whole issue of "LITE vs LAVA" is a multi-level problem, here I just concentrate on the UI.)
As I wrote in my previous email, there is no easy way to capture multiple test results in a single status. Therefore LAVA doesn't do it.
And here I disagree. For me, there's a big difference between "a job with 0 failed testcases" (doesn't require my attention) and "a job with >0 failed testcases" (requires my attention), and I'm seeking how to capture and visualize that difference. I can certainly agree that it's not the only possible criterion, so ways to "capture" it may differ (I don't insist on an explicit "Failed" status, for example), but not trying to capture it (or making it more "obfuscated" than it could be) isn't the right way IMHO.
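To make the difference concrete, this is roughly the post-processing we're forced to do outside LAVA today (a sketch against the XML-RPC API; the server URL and job id are placeholders):

    import xmlrpc.client

    import yaml

    # Fetch per-testcase results for a finished job and decide
    # whether it needs a human's attention.
    server = xmlrpc.client.ServerProxy("https://lava.example.com/RPC2")
    raw = server.results.get_testjob_results_yaml("123456")
    results = yaml.safe_load(raw)
    failed = [r for r in results if r["result"] == "fail"]
    if failed:
        print("needs attention: %d failed testcase(s)" % len(failed))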
I now have to say that this discussion didn't start with this email; we came to it on Gitlab, and I find this reply from Stevan insightful: https://git.lavasoftware.org/lava/lava/-/issues/394#note_15175. Let me quote a part of it:
This is not something we invented over night. [...] LAVA users have ALWAYS been asking for something more, something else. What ever kind of result representation you implement, however generic it is, some percentage of users (sometimes it's even 100%) will find something missing and/or not satisfactory to their needs.
IIRC this is in the context of visual reports in LAVA, not statuses or results collection.
For me it's a general question of making relatively small-scale improvements to the LAVA UI (vs continuing with the reductionist approach of removing useful features).
I'm sure this didn't come overnight; that's why I was very keen to do my homework before coming up with emails like this. I could actually imagine those Complete/Incomplete statuses being an "achievement" of LAVA v2 (compared to LAVA v1).
No, complete/incomplete were there from the very beginning.
Great, so it's not a case of some slowpoke like me trying to wind history up in circles; we can actually make progress here ;-).
I can also very well relate to the fact that users always want more and more and are never satisfied. But it seems to me that you guys concluded that "if we can't satisfy all needs, let's satisfy NONE". And as you can imagine, I cannot agree with that, because, based on my *personal* analysis, this over-simplification on the LAVA side, and over-complication on the user side, goes against the needs of the team I represent (an internal Linaro customer).
I disagree. As stated before, LAVA is just a test executor helping to run unified tests on different hardware.
I understand that you guys see the biggest value of LAVA in its executor backend, but, for example, to me the backend is just a bit of complication on a local scale (I understand its value for "lab-level" setups). For me, the biggest value proposition lies in the frontend.
[]
I remember offering you my help in CI setup for LITE. I also remember you refused, so I really don't understand the complaint above. My offer is still on the table :)
I doubt "refused" is the right word here. Milosz, I appreciate you being on the forefront of LAVA support, and find it very helpful. So I can't imagine ever "refusing" it. And I'm not sure which exact word I used, but the meaning was along the lines of "we haven't grown up to that yet, so let us do more homework and approach it (and you) later". And I started my mail by admitting that finally that "later" has come, at least for me within my team.
Or, as an alternative, our team now needs to develop a frontend for LAVA on its own, since LAVA stops one step short of providing a baseline-useful solution. It's not me who does the resource allocation, but I'm almost sure our team doesn't have the resources for that.
It's already there and it was already mentioned in this email: SQUAD. There is already a rudimentary LITE setup in there: https://qa-reports.linaro.org/lite/. I'll repeat what I wrote above: please accept the help offered and we won't have to deal with somewhat false assumptions.
If you think that SQUAD is the answer to our needs in LITE, I would only be glad to listen. Actually, that was my plan for BUD20 - to come to your room and ask "That SQUAD thing, what is it, why did you write it (*), and how do you think it may be useful for LITE, given that the needs we currently seem to have are a) ..., b) ..., c) ...".
(*) That's oversimplification, given the project contribution stats.
So, I'm raising these issues trying to find a suitable solution. The good news is that none of the issues are deep or complex; they're literally "last-mile" style issues. Rather than working on yet another ad hoc frontend, I'd rather work on making LAVA a smoother solution for all its users, offering baseline reporting capabilities out of the box. The risk here is of course that there's no agreement on what "baseline capabilities" are.
I'm OK with that as long as I don't have to do that. Patches to LAVA are always welcome.
Sure, for all issues I submit I propose a solution, and judge it by whether I myself could take a bite at it (so I'm looking for simple(r) solutions giving enough improvement). I definitely need triaging of my ideas and guidance, as of course "good intentions" don't mean "good changes".
milosz
Thanks!