Hello,
I hope the mail subject sets up an easy, cheerful background for this discussion ;-). This definitely goes into the "stupid questions" department. My plan was to collect "evidence"/feedback from my colleagues, then crash into the LAVA/QA rooms at Connect BUD20 and ask what I/we are doing wrong. As circumstances changed, but the "testing debt" only builds up, there's little choice but to try to figure out these matters over a much narrower pipe.
So, before proceeding to the question per se, let me theorize that the reason such questions come up at all is a sufficient difference in intended LAVA usage patterns. In other words, how I'd like to use LAVA on the LITE team side may differ from how the LAVA team intends it to be used, or how the QA team (probably the biggest user) uses it. The issue? The way I intend to use it is, IMHO, one of the most basic ways to use a test system.
So, what's that usage? Well, I'm not much interested in "interactive" use (submitting jobs manually from my machine). Our interest is in unattended automated CI, of which the testing system is the second half, after the build system. So let me recap how our build system, Jenkins, works. Normally, it just builds binaries and uploads them to a publishing server. It's invisible to me in this phase, and my engineering work goes uninterrupted. But when a build fails, I get an email with details about the failure, and I'll continue to get them for as long as it keeps failing. So, the only option I have is to go see the failure, investigate, and fix it. When I arrive at Jenkins, I can easily see which jobs failed and which didn't, then within each job, see which builds failed and which succeeded. That's very easy, because failed things are red, and successful things are green.
So, we've now arrived at the main question of this email: why don't I seem to be able to use LAVA in the same way? Why does LAVA offer only "Incomplete" and "Complete" job statuses? "Incomplete" is understood - it's an infrastructure failure, so such a job is definitely "failed". But "Complete" doesn't give me any useful information about whether the job succeeded or failed, because a "Complete" job may still have any number of failed tests. And that's exactly the "last mile" LAVA misses: for any test job, I want to see the cumulative number of test cases which failed, straight on a page like https://lite.validation.linaro.org/scheduler/alljobs . Then I'd like to filter out jobs which have this number > 0. Then I'd like to receive a notification only for "failed" jobs, "failed" being defined as "status != Complete OR failed_testcases > 0".
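To make that concrete, below is roughly the post-processing I'd rather not have to write and maintain per team. It's only a sketch against the XML-RPC API: I'm assuming the scheduler.job_details and results.get_testjob_results_yaml calls are available on the instance, and I'm guessing the "health"/"result" field names, which may differ between LAVA versions.

# Hedged sketch of the post-processing described above: treat a job as
# "failed" if health != Complete OR it has any failed test cases.
# Assumptions (not from this thread): scheduler.job_details() returns a
# struct with a "health" field, and results.get_testjob_results_yaml()
# returns YAML where each test case dict carries a "result" key.
# An authenticated URL (https://user:token@host/RPC2) may be required.
import xmlrpc.client
import yaml

server = xmlrpc.client.ServerProxy("https://lite.validation.linaro.org/RPC2")

def job_failed(job_id):
    details = server.scheduler.job_details(job_id)
    if details.get("health") != "Complete":
        return True
    results = yaml.safe_load(server.results.get_testjob_results_yaml(job_id))
    failed_testcases = sum(1 for tc in results if tc.get("result") == "fail")
    return failed_testcases > 0

if __name__ == "__main__":
    print(job_failed(954523))  # any job id on the instance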
So, what am I missing, and how can I make LAVA work like the above?
Thanks, Paul
Linaro.org | Open source software for ARM SoCs Follow Linaro: http://www.facebook.com/pages/Linaro http://twitter.com/#%21/linaroorg - http://www.linaro.org/linaro-blog
On Tue, 7 Apr 2020 at 15:22, Paul Sokolovsky paul.sokolovsky@linaro.org wrote:
[]
This is just one 'test case'. In a way Jenkins executes one test for you - a build test. You can clearly see this test result and associate it with a software version. LAVA executes multiple tests. There may be multiple users running their jobs on a single LAVA instance, and even on a single LAVA 'device'. Each of them needs to collect the results of these jobs and interpret them for their needs.
I know Jenkins can execute tests (other than just the build). But in that case it allows you to configure how many tests can fail while the build is still considered 'green'.
[]
My take on this is that LAVA is _not_ 'Jenkins for testing'. It's simply a test executor, and you need to post-process your results yourself.
milosz
-----Original Message----- From: Lava-users lava-users-bounces@lists.lavasoftware.org On behalf of Milosz Wasilewski Sent: Tuesday, 7 April 2020 16:37 To: Paul Sokolovsky paul.sokolovsky@linaro.org Cc: lava-users@lists.lavasoftware.org Subject: Re: [Lava-users] "Where're my failed tests, dudes?"
[]
My take on this is that LAVA is _not_ 'Jenkins for testing'. It's simply a test executor, and you need to post-process your results yourself.
As far as I understood, SQUAD is a project which brings build results and test results together. Not sure if it supports Jenkins, though:
https://github.com/Linaro/squad
Mit freundlichen Grüßen / Best regards Tim Jaacks DEVELOPMENT ENGINEER Garz & Fricke GmbH Schlachthofstrasse 20 21079 Hamburg Direct: +49 40 791 899 - 183 Fax: +49 40 791899 - 39 tim.jaacks@garz-fricke.com www.garz-fricke.com WE MAKE IT YOURS!
Sitz der Gesellschaft: D-21079 Hamburg Registergericht: Amtsgericht Hamburg, HRB 60514 Geschäftsführer: Matthias Fricke, Manfred Garz, Marc-Michael Braun
On Tue, 7 Apr 2020 at 16:31, Tim Jaacks tim.jaacks@garz-fricke.com wrote:
[]
As far as I understood, SQUAD is a project which brings build results and test results together. Not sure if it supports Jenkins, though:
Most likely not, but it depends on what you mean by 'supports jenkins'.
To give a longer explanation: if the results from LAVA are supposed to be reported in Jenkins, "something" has to send them there. We had several attempts at tackling that problem. The most naive approach was to create additional Jenkins jobs that would pull data from LAVA and import it into Jenkins builds. This had a few downsides, mainly in terms of maintainability. Then we moved to SQUAD and displayed links to SQUAD builds in Jenkins builds. SQUAD can also produce a 'badge' that displays results for a given project. Unfortunately it only displays the 'latest' results, not the ones corresponding to the build. IMHO this can be changed easily if it's useful. Example for LKFT Linux mainline: https://qa-reports.linaro.org/lkft/linux-mainline-oe-sanity/badge This kind of badge works in GitHub; I'm not sure about Jenkins.
milosz
On Tue, 7 Apr 2020 at 16:46, Milosz Wasilewski milosz.wasilewski@linaro.org wrote:
[]
SQUAD can also produce a 'badge' that displays results for a given project. Unfortunately it only displays the 'latest' results, not the ones corresponding to the build. IMHO this can be changed easily if it's useful.
It turns out it was pretty easy, so I went ahead and sent it for review: https://github.com/Linaro/squad/pull/679 Sorry for hijacking lava-users for a SQUAD issue.
milosz
Hello,
On Tue, 7 Apr 2020 15:31:36 +0000 Tim Jaacks tim.jaacks@garz-fricke.com wrote:
[]
My take on this is that LAVA is _not_ 'Jenkins for testing'. It's simply a test executor, and you need to post-process your results yourself.
As far as I understood, SQUAD is a project which brings build results and test results together. Not sure if it supports Jenkins, though:
Thanks, I'm definitely aware of SQUAD. And that's a nice, cheerful explanation of it as "a project which brings build results and test results together". A less cheerful (conspiracy-theoretic? ;-) ) one might be: "Some stakeholders might have found LAVA's own reporting, etc. capabilities to be lacking. As LAVA is a big and inert project, they decided to prototype a new solution outside of it. Fast forward, and they even advertise writing a standalone frontend using the LAVA API as the best solution for everyone's needs, forgetting to mention that it took some 4 years, 3 dedicated developers, and a dozen contributors to get there."
And with all that, there's absolutely no guarantee that what worked for some stakeholders would work for you. Actually, someone who was burned by LAVA's 'missing and "not needed"' features would be cautious not to get into a catch-22 by adopting another frontend, finding features missing, proposing their addition, and getting replies like "Well, it was never intended to be like that".
Identifying the minimal changes required for LAVA itself, and arguing for those changes, seems like a better strategy. Maybe I'm wrong.
[]
Hello Milosz,
On Tue, 7 Apr 2020 14:37:06 +0000 Milosz Wasilewski milosz.wasilewski@linaro.org wrote:
[]
This is just one 'test case'. In a way Jenkins executes one test for you - a build test. You can clearly see this test result and associate it with a software version. LAVA executes multiple tests. There may be multiple users running their jobs on a single LAVA instance, and even on a single LAVA 'device'.
But the talk is not about this. It's about:
1. Jenkins clearly allows me to distinguish a "failed" build. It allows me to receive a notification when a build fails. Neither of these seems to be possible with LAVA.
2. You say "it's just one 'test case'", but I can make a job with one test case in LAVA, that test case can fail, and LAVA will still keep me oblivious to this fact.
So, I'm afraid the difference lies not in the number of "test cases". It lies in the fact that Jenkins provides the following job statuses: SUCCESS, UNSTABLE, FAILURE, NOT_BUILT or ABORTED (note the clear presence of SUCCESS and FAILURE), whereas LAVA provides statuses of Unknown, Complete, Incomplete, Canceled.
Each of them needs to collect the results of these jobs and interpret them for their needs.
Right, so LAVA not only absolves itself from helping the user interpret the results it produces, it simply doesn't allow the user to do that within its bounds, given the statuses listed above.
I now have to say that this discussion didn't start with this email; we came to it on GitLab, and I find this reply from Stevan insightful: https://git.lavasoftware.org/lava/lava/-/issues/394#note_15175. Let me quote a part of it:
This is not something we invented overnight. [...] LAVA users have ALWAYS been asking for something more, something else. Whatever kind of result representation you implement, however generic it is, some percentage of users (sometimes it's even 100%) will find something missing and/or not satisfactory to their needs.
I'm sure this didn't come overnight; that's why I was very keen to do my homework before coming up with emails like this. I can actually imagine those Complete/Incomplete statuses are an "achievement" of LAVA2 (compared to LAVA1).
I can also very well relate to the fact that users always want more and more and are never satisfied. But it seems to me that you guys concluded that "if we can't satisfy all needs, let's satisfy NONE". And as you can imagine, I cannot agree with that, because, based on my *personal* analysis, this over-simplification on the LAVA side, and over-complication on the user side, goes against the needs of the team I represent (an internal Linaro customer).
[]
So, what am I missing and how to make LAVA work like the above?
My take on this is that LAVA is _not_ 'Jenkins for testing'. It's simply a test executor, and you need to post-process your results yourself.
"Yourself" - with which hat, in which role? My story is that I'm an embedded engineer (microcontroller level). My team doesn't have a dedicated test engineer; each engineer is tasked with working on testing as part of their workload, and that always goes into the backlog. I personally finally hit a dead end, where the lack of proper testing truly affects development. So for these last few months I've actually been working *as* a test engineer for my team. I'm resolving various issues in the LAVA backend precluding proper work with our devices (MCU-level, again), all upheld by the hope that afterwards we (the entire team) will be able to control and monitor all our testing needs. Just to find that the way LAVA offers me to do that is by receiving hundreds of spammy notification emails (that's our job volume, yeah) and grepping each of them manually for the word "failed", which is of course not acceptable.
Or, as an alternative, our team now needs to develop a frontend for LAVA, because LAVA on its own stops one step short of providing a baseline-useful solution. It's not me who does the resource allocation, but I'm almost sure our team doesn't have the resources for that.
So, I'm raising these issues trying to find a suitable solution. The good news is that none of the issues are deep or complex; they're literally "last-mile" style issues. Rather than working on yet another ad-hoc frontend, I'd rather work on making LAVA a smoother solution for all its users, offering baseline reporting capabilities out of the box. The risk here is of course that there's no agreement on what "baseline capabilities" are.
Hey Paul,
I don't want to divert the conversation from the big picture here, but here's a query that can help you, if I understand your problem correctly:
https://lite.validation.linaro.org/results/query/+custom?entity=testjob&...
It will show you all your jobs which are completed and have at least one test case failure.
You can use this API call https://lite.validation.linaro.org/api/help/#results.run_query to run queries from your language of choice and use the results in any way you wish.
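For example, something along these lines - an untested sketch where the results.run_query argument list and the shape of the return value are my assumptions; the api/help page above shows the real signature:

# Hypothetical sketch of running a saved LAVA query over XML-RPC.
# The method name comes from the api/help link above; the argument list
# (query name, result limit, query owner) is an assumption to be verified
# against that page, as is what the call returns.
import xmlrpc.client

# LAVA documents token auth for XML-RPC as https://<user>:<token>@<host>/RPC2
# USERNAME/TOKEN and the query name/owner below are placeholders.
server = xmlrpc.client.ServerProxy(
    "https://USERNAME:TOKEN@lite.validation.linaro.org/RPC2")

rows = server.results.run_query("failed-jobs", 25, "some.user")
for row in rows:
    print(row)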
Cheers,
On 4/7/20 9:31 PM, Paul Sokolovsky wrote:
[]
Hello Stevan,
On Wed, 8 Apr 2020 08:31:27 +0200 Stevan Radaković stevan.radakovic@linaro.org wrote:
Hey Paul,
I don't want to divert a conversation from the big picture here, but here's a query that can help you if I understand your problem correctly:
https://lite.validation.linaro.org/results/query/+custom?entity=testjob&...
I appreciate being concise and to the point. And I would need to do more homework with that query, but yes, it seems to be just what I asked for.
And thinking it out myself from the link you provided, the answer to the question "What am I doing wrong?" would be:
1. I seem to be fixated on "test jobs" and the page https://lite.validation.linaro.org/scheduler/alljobs, and to be missing "test results" and the page https://lite.validation.linaro.org/results/ . And to me it's clear why: what I submit is a "job", what fails is a "job", and what I fix is a "job", whereas the results page by default shows only "test suites", with a link to the "job" not immediately visible/clear (requiring a one-by-one click-through to extra pages to find the "job").
2. I had no idea that the LAVA query builder actually hosts quite advanced capabilities. For example, it appears to allow encoding database joins concisely (if not exactly intuitively). My understanding is that the query above effectively says "query individual test cases, find the ones with status 'failed', but show not these test cases, but the jobs containing them".
So, while this is already a wealth of information to think about and play with, the actual query I'm interested in is:
https://lite.validation.linaro.org/results/query/+custom?entity=testjob&...
(Our jobs are submit-proxied via SQUAD from automated Jenkins builds.)
And that leads me to "502 Proxy Error" (not a rare situation with LAVA queries on production systems).
So, while I'll need to play with these things more, some tentative feedback might be:
1. Some queries are more useful than others, and are worth being optimized and brought closer to users' fingertips.
2. The LAVA query builder is a very useful and powerful capability; we should look for cheap ways to make it easier for users to leverage (inline hints, tips, more help links, etc.), not for ways to remove it (re: the discussion in https://git.lavasoftware.org/lava/lava/-/issues/394).
It will show you all your jobs which are completed and have at least one test case failure.
You can use this API call https://lite.validation.linaro.org/api/help/#results.run_query to run queries from your choice of language and utilize it in any way you wish.
My preferred language here is the language of visual information in the GUI, with mouse clicks (clicks used sparingly). I simply haven't grown into a need to query LAVA via an API and process results in advanced ways. I'm looking just for basic ways to stop tests slipping through my hands; throwing more API into the mix to juggle with is unlikely to help me.
Thanks!
[]
On Wed, 8 Apr 2020 at 12:27, Paul Sokolovsky paul.sokolovsky@linaro.org wrote:
[]
https://lite.validation.linaro.org/results/query/+custom?entity=testjob&...
(Our jobs are submit-proxied via SQUAD from automated Jenkins builds.)
Hmm, could you explain (maybe again) why you are doing that when the information you require is already aggregated? Example: https://qa-reports.linaro.org/lite/zephyr-upstream/build/295572a5
milosz
On Wed, 8 Apr 2020 11:43:38 +0000 Milosz Wasilewski milosz.wasilewski@linaro.org wrote:
[]
Hmm, could you explain (maybe again) why you are doing that when the information you require is already aggregated? Example: https://qa-reports.linaro.org/lite/zephyr-upstream/build/295572a5
Milosz, thanks for the insightful discussion we had at the LAVA Design online meeting, and the offer to guide me through SQUAD on how to address my use cases. As you suggest, let's move that to private mail so as not to go off-topic on this list.
But just so this thread doesn't end abruptly in the archive, let me describe the problem I see immediately at the link you gave: the page starts with:
Test jobs 560 61
Hovering the cursor over "560", I see "Complete"; over "61", "Incomplete". And as I explained, the "Complete" number is rather useless to me, because some "Complete" jobs are actually "failed" jobs.
It occurred to me that maybe it's unclear what I'm talking about, because you may picture "LITE jobs" as instances of the zephyr-upstream Jenkins build, while I'm trying to think of how to set up new jobs so they aren't as deeply problematic as zephyr-upstream is. So, an example of a "new" job is https://lite.validation.linaro.org/scheduler/job/954523 - "Complete" and having 1 failed test case, so a "failed" job overall, per my criteria.
That job didn't go through SQUAD, so I guess my next step is doing homework on passing that job through SQUAD, to see if I have any issues with how its results will be presented there.
[]
On Wed, 8 Apr 2020 at 16:44, Paul Sokolovsky paul.sokolovsky@linaro.org wrote:
[]
But just so this thread doesn't end abruptly in the archive, let me describe the problem I see immediately at the link you gave: the page starts with:
Test jobs 560 61
Hovering the cursor over "560", I see "Complete"; over "61", "Incomplete". And as I explained, the "Complete" number is rather useless to me, because some "Complete" jobs are actually "failed" jobs.
I would say this is only informational, as when you're using SQUAD, test jobs become irrelevant.
It occurred to me that maybe it's unclear what I'm talking about, because you may picture "LITE jobs" as instances of the zephyr-upstream Jenkins build, while I'm trying to think of how to set up new jobs so they aren't as deeply problematic as zephyr-upstream is. So, an example of a "new" job is https://lite.validation.linaro.org/scheduler/job/954523 - "Complete" and having 1 failed test case, so a "failed" job overall, per my criteria.
My assumption might be wrong, but I think you're looking for failed tests rather than 'failed jobs'. This is the message I'm trying to convey all the time: LAVA is an executor, and LAVA jobs are a means to execute your tests. Job status isn't really that useful when you try to check which tests passed and which failed. For example, there can be an incomplete job with useful test results and a complete job without any test results. How would you classify those?
That job didn't go through SQUAD, so I guess my next step is doing homework on passing that job through SQUAD, to see if I have any issues with how its results will be presented there.
Sure, if you need any help, just ask.
milosz
On 4/8/20 1:27 PM, Paul Sokolovsky wrote:
[]
https://lite.validation.linaro.org/results/query/+custom?entity=testjob&...
(Our jobs are submit-proxied via SQUAD from automated Jenkins builds.)
And that leads me to "502 Proxy Error" (not a rare situation with LAVA queries on production systems).
Querying by URL is considered a "live" query, i.e. it does not use the caching mechanism that "usual" queries do. Hence it will time out depending on the data volume in the DB and the complexity of the query. That's why we recommend using cached queries, like so:
If you go to the query page https://lite.validation.linaro.org/results/query , create a Test Job query, then add the query conditions described in the URL (test job submitter equals 'qa-reports-bot', test job health equals Complete, test case result equals 'Test failed') and then click 'Run query', you'll be able to see the results of that query every time in a very timely manner, since the query is now cached. I've created an example as well: https://lite.validation.linaro.org/results/query/~stevan.radakovic/failed-tests/+detail
You can check out the docs on the topic: https://docs.lavasoftware.org/lava/lava-queries-charts.html#conditions - and if you need any more clarification, you can also reach me on IRC.
Cheers,
Hello Stevan,
On Wed, 8 Apr 2020 17:05:13 +0200 Stevan Radaković stevan.radakovic@linaro.org wrote:
[]
Thanks Stevan. I was able to reproduce this query on my side too. I was aware of these non-live ("batched"?) queries, but as they seem cumbersome to use, I was effectively avoiding and "forgetting" about them. It seems even they have their uses. So, I hope the matter of removing any querying functionality (as mentioned in the discussion of https://git.lavasoftware.org/lava/lava/-/issues/394) will be approached carefully.
In the meantime, I effectively got an answer to the original question of finding jobs which fail, so I will be able to look into analyzing those failures, while, per Milosz's suggestion, I learn to leverage more of SQUAD, on the promise that it offers even more flexible and easier-to-use reporting and monitoring capabilities.
Thanks for your help!
On Tue, 7 Apr 2020 at 20:31, Paul Sokolovsky paul.sokolovsky@linaro.org wrote:
Hello Milosz,
On Tue, 7 Apr 2020 14:37:06 +0000 Milosz Wasilewski milosz.wasilewski@linaro.org wrote:
[]
But the talk is not about this. It's about:
1. Jenkins clearly allows me to distinguish a "failed" build. It allows me to receive a notification when a build fails. Neither of these seems to be possible with LAVA.
Can you distinguish a build that failed because the disk ran out of space from one that failed due to a compilation error? No, you can't. This means a Jenkins FAILED doesn't always indicate a fault in the code. Yes, LAVA will send you notifications about failed (incomplete) jobs: https://master.lavasoftware.org/static/docs/v2/user-notifications.html?highl...
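For the unattended case that would look roughly like the sketch below - the notify fields are written from memory of the user-notifications doc above (double-check them there), and job.yaml plus the USERNAME:TOKEN credentials are placeholders:

# Rough sketch: append a notify block to a job definition and submit it
# over XML-RPC. scheduler.submit_job() is the standard submission call;
# the notify YAML below is reconstructed from memory of the
# user-notifications docs and may need adjusting against that page.
import xmlrpc.client

NOTIFY_BLOCK = """
notify:
  criteria:
    status: incomplete      # mail only when the job does not complete
  recipients:
  - to:
      method: email
      email: paul.sokolovsky@linaro.org
"""

server = xmlrpc.client.ServerProxy(
    "https://USERNAME:TOKEN@lite.validation.linaro.org/RPC2")

with open("job.yaml") as f:        # placeholder path to the job definition
    definition = f.read() + NOTIFY_BLOCK

job_id = server.scheduler.submit_job(definition)
print("submitted job", job_id)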
2. You say "it's just one 'test case'", but I can make a job with one test case in LAVA, that test case can fail, and LAVA will still keep me oblivious to this fact.
The first test LAVA runs is the 'boot' test. So by the time you're running your 'first' explicit test, it's at best the second test LAVA executes. When boot fails, the job is marked as incomplete (in most cases). You can also mark a job incomplete from other tests with lava-test-raise. So I claim you're incorrect here.
So, I'm afraid the difference lies not in the number of "test cases". It lies in the fact that Jenkins provides the following job statuses: SUCCESS, UNSTABLE, FAILURE, NOT_BUILT or ABORTED (note the clear presence of SUCCESS and FAILURE), whereas LAVA provides statuses of Unknown, Complete, Incomplete, Canceled.
As pointed out above, there is a pretty good match between these states. LAVA doesn't have 'unstable' and 'not_built', but those would have no meaning here.
Each of them needs to collect the results of these jobs and interpret them for their needs.
Right, so LAVA not only absolves itself from helping the user interpret the results it produces, it simply doesn't allow the user to do that within its bounds, given the statuses listed above.
I completely don't understand this, sorry. You can check test job results in the LAVA UI. As I wrote in my previous email, there is no easy way to capture multiple test results in a single status; therefore LAVA doesn't do it.
I now have to say that this discussion didn't start with this email; we came to it on GitLab, and I find this reply from Stevan insightful: https://git.lavasoftware.org/lava/lava/-/issues/394#note_15175. Let me quote a part of it:
This is not something we invented overnight. [...] LAVA users have ALWAYS been asking for something more, something else. Whatever kind of result representation you implement, however generic it is, some percentage of users (sometimes it's even 100%) will find something missing and/or not satisfactory to their needs.
IIRC this is in the context of visual reports in LAVA, not statuses or results collection.
I'm sure this didn't come overnight; that's why I was very keen to do my homework before coming up with emails like this. I can actually imagine those Complete/Incomplete statuses are an "achievement" of LAVA2 (compared to LAVA1).
No, complete/incomplete were there from the very beginning.
I can also very well relate to the fact that users always want more and more and are never satisfied. But it seems to me that you guys concluded that "if we can't satisfy all needs, let's satisfy NONE". And as you can imagine, I cannot agree with that, because, based on my *personal* analysis, this over-simplification on the LAVA side, and over-complication on the user side, goes against the needs of the team I represent (an internal Linaro customer).
I disagree. As stated before, LAVA is just a test executor helping to run unified tests on different hardware.
[]
So, what am I missing and how to make LAVA work like the above?
My take on this is that LAVA is _not_ 'Jenkins for testing'. It's simply a test executor, and you need to post-process your results yourself.
"Yourself" - with which hat, in which role? My story is that I'm an embedded engineer (microcontroller level). My team doesn't have a dedicated test engineer; each engineer is tasked with working on testing as part of their workload, and that always goes into the backlog. I personally finally hit a dead end, where the lack of proper testing truly affects development. So for these last few months I've actually been working *as* a test engineer for my team. I'm resolving various issues in the LAVA backend precluding proper work with our devices (MCU-level, again), all upheld by the hope that afterwards we (the entire team) will be able to control and monitor all our testing needs. Just to find that the way LAVA offers me to do that is by receiving hundreds of spammy notification emails (that's our job volume, yeah) and grepping each of them manually for the word "failed", which is of course not acceptable.
I remember offering you my help in CI setup for LITE. I also remember you refused, so I really don't understand the complaint above. My offer is still on the table :)
Or, as an alternative, our team now needs to develop a frontend for LAVA on its own, given that LAVA stops one step short of providing a baseline-useful solution. It's not me who does the resource allocation, but I'm almost sure our team doesn't have the resources for that.
It's already there, and it was already mentioned in this email: SQUAD. There is already a rudimentary LITE setup in there: https://qa-reports.linaro.org/lite/. I'll repeat what I wrote above: please accept the help offered and we won't have to deal with somewhat false assumptions.
So, I'm raising these issues trying to find a suitable solution. The good news is that none of them are deep or complex; they're literally "last-mile" style issues. Rather than working on yet another ad hoc frontend, I'd rather work on making LAVA a smoother solution for all its users, offering baseline reporting capabilities out of the box. The risk here is of course that there's no agreement on what "baseline capabilities" are.
I'm OK with that as long as I don't have to do that. Patches to LAVA are always welcome.
milosz
-- Best Regards, Paul
Hello,
On Wed, 8 Apr 2020 08:09:46 +0000 Milosz Wasilewski milosz.wasilewski@linaro.org wrote:
[]
So, what's that usage? Well, I'm not much interested in "interactive" use (submit jobs manually from my machine). Our interest is in unattended automated CI, of which the testing system is the second half after the build system. So let me remind how our build system, Jenkins, works. Normally, it just builds binaries and uploads them to a publishing server. It's invisible to me in this phase, and my engineering work goes uninterrupted. But when a build fails, I get an email with details about a failure. And I'll continue to get them while it continues to fail. So, the only option I have is to go see the failure, investigate, and fix it. When I arrive at Jenkins, I can easily see which jobs failed and which not, then within each job, see which builds failed and which succeeded. That's very easy, because failed things are red, and successful things are green.
This is just one 'test case'. In a way, Jenkins executes one test for you - the build test. You can clearly see this test result and associate it with a software version. LAVA executes multiple tests. There may be multiple users running their jobs on a single LAVA instance, and even on a single LAVA 'device'.
But the talk is not about this. It's about:
- Jenkins clearly allows me to distinguish a "failed" build, and it allows me to receive a notification when a build fails. Neither of these seems to be possible with LAVA.
Can you distinguish a build that failed because the disk ran out of space from one that failed with a compilation error? No, you can't. This means a Jenkins FAILURE doesn't always indicate a fault in the code.
My point was that Jenkins offers the conceptual capability to do so. But indeed, every system likes to pose quizzes to its users, and some 10 years ago I had to write a Jenkins plugin to be able to use different statuses easily: https://git.linaro.org/infrastructure/jenkins-plugin-shell-status.git/tree/R... - I don't know if the situation has improved since then.
And the reason I brought Jenkins into the discussion is to show that what I expect from LAVA didn't come out of thin air. We can discuss specific differences between Jenkins and LAVA for a long time, as there are many indeed. But the question I posed is "how to achieve a high-level, conceptual interaction model in LAVA, similar to what is offered by Jenkins".
Yes, LAVA will send you a notification about failed (incomplete) jobs: https://master.lavasoftware.org/static/docs/v2/user-notifications.html?highl...
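For reference, a minimal sketch of what such a notification set-up could look like when submitting a job over XML-RPC. The notify keys follow the user-notifications page linked above, but the instance URL, token, recipient address and the elided parts of the job definition are placeholders - treat this as an illustration, not a verified recipe:

  import xmlrpc.client

  # Fragment of a job definition; everything except the notify block is
  # elided, so submitting exactly this would of course be rejected.
  JOB_DEFINITION = """\
  # device_type, job name, actions, ... elided ...
  notify:
    criteria:
      status: incomplete            # mail only when the job ends up Incomplete
    recipients:
    - to:
        method: email
        email: lite-team@example.org   # placeholder address
    verbosity: quiet
  """

  # Placeholder credentials; substitute a real user, API token and instance URL.
  server = xmlrpc.client.ServerProxy("https://user:token@lava.example.org/RPC2")
  job_id = server.scheduler.submit_job(JOB_DEFINITION)
  print("submitted job", job_id)

Note that the criteria here can only key off the job status (complete/incomplete/finished), which is exactly the limitation being discussed in the rest of this thread.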
- You say "it's just one 'test case'", but I can make a job with
with one test case in LAVA, that test case can fail, and LAVA will still keep me oblivious of this fact.
The first test LAVA runs is the 'boot' test, so by the time you run your 'first' explicit test, it is at best the second test LAVA executes. When boot fails, the job is marked as incomplete (in most cases). You can also mark a job incomplete from other tests with lava-test-raise. So I claim you're incorrect here.
That's exactly the interpretation of the "Incomplete" job status I came to: it means "infrastructure error", i.e. the job didn't produce any useful "test" results (or, put another way, any test results it did produce cannot be trusted).
Indeed, I had an idea that to achieve my aim - clear visibility of "failed" jobs - I could make them finish with "Incomplete" status. But I don't have "lava-test-raise" at hand, surprise ;-). For the "interactive" tests I use, I can raise an exception leading to the job being recorded as "Incomplete", but that's very verbose. Multiplied by a hundred test cases (that's not many, right?), it would be outright "ugly". We also use "monitor" tests, and AFAIK those don't support raising an exception at all, so again I'd need to patch it in.
But most importantly, that's conceptually wrong. I appreciate the fact that LAVA can run and record multiple test cases, and I don't want to work around it by making my jobs abort on the very first failure. I want to run as many tests as possible (like, all of them) and record their status - that's certainly useful, and "abort on first failure" throws that usefulness away for nothing.
Right, so LAVA not only absolves itself from helping the user interpret the results it produces, it simply disallows the user to do that within its bounds, given the statuses listed above.
I really don't understand this, sorry. You can check test job results in the LAVA UI.
Sure, let's try. I open LAVA and select Scheduler -> Jobs (perhaps I even bookmarked it). I see a list of jobs, oftentimes health-check spam (and no, I can't use "Username -> Jobs", because "my" jobs aren't mine - they're submitted by a bot user), then click the "Results" icon, and only then do I see the job results.
Multiply that by the hundreds of jobs we have and you see my concern. (Why we have those hundreds of daily jobs is another matter; the whole "LITE vs LAVA" issue is a multi-level problem, and here I just concentrate on the UI.)
As I wrote in my previous email, there is no easy way to capture multiple test results in a single status; therefore LAVA doesn't do it.
And here I disagree. For me, there's a big difference between "a job with 0 failed test cases" (doesn't require my attention) and "a job with >0 failed test cases" (requires my attention), and I'm seeking a way to capture and visualize that difference. I can certainly agree that this isn't the only possible criterion, so the ways to "capture" it may differ (I don't insist on an explicit "Failed" status, for example), but not trying to capture it at all (or making it more "obfuscated" than it could be) isn't the right way IMHO.
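As a stop-gap outside LAVA, that exact criterion ("not Complete, or >0 failed test cases") can be computed with a couple of XML-RPC calls. A minimal sketch, assuming PyYAML and the scheduler.job_health / results.get_testjob_results_yaml methods (check the API documentation on your instance if the names differ); the instance URL, token and job IDs are placeholders:

  import xmlrpc.client
  import yaml   # PyYAML

  # Placeholder credentials; substitute a real user, API token and instance URL.
  server = xmlrpc.client.ServerProxy("https://user:token@lava.example.org/RPC2")

  def needs_attention(job_id):
      # "Failed" in the sense discussed here: health is not Complete,
      # or at least one recorded test case has result == 'fail'.
      health = server.scheduler.job_health(job_id)
      if isinstance(health, dict):              # some versions wrap the value
          health = health.get("job_health")
      if health != "Complete":
          return True
      results = yaml.safe_load(server.results.get_testjob_results_yaml(job_id))
      failed = sum(1 for tc in results if tc.get("result") == "fail")
      return failed > 0

  for job_id in (12345, 12346):                 # placeholder job IDs
      print(job_id, "needs attention" if needs_attention(job_id) else "looks fine")

A cron job feeding the output of something like this into email is, of course, exactly the kind of per-team glue the rest of this thread argues about.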
I have to say that this discussion didn't start with this email; we came to it on GitLab, and I find this reply from Stevan insightful: https://git.lavasoftware.org/lava/lava/-/issues/394#note_15175. Let me quote a part of it:
This is not something we invented over night. [...] LAVA users have ALWAYS been asking for something more, something else. What ever kind of result representation you implement, however generic it is, some percentage of users (sometimes it's even 100%) will find something missing and/or not satisfactory to their needs.
IIRC this is in the context of visual reports in LAVA, not statuses or results collection.
For me it's a general question of making relatively small-scale improvements to the LAVA UI (vs. continuing with the reductionist approach of removing useful features).
I'm sure this didn't come about overnight; that's why I was very keen to do my homework before coming up with emails like this. I can actually imagine that those Complete/Incomplete statuses are an "achievement" of LAVA2 (compared to LAVA1).
No, complete/incomplete were there from the very beginning.
Great, so it's not a case of some slowpoke like me trying to wind history up in circles; we can actually make progress here ;-).
I can also very well relate to the fact that users always want more and more and are never satisfied. But it seems to me that you guys concluded that "if we can't satisfy all needs, let's satisfy NONE". And as you can imagine, I cannot agree with that, because, based on my *personal* analysis, this over-simplification on the LAVA side, and over-complication on the user side, goes against the needs of the team I represent (an internal Linaro customer).
I disagree. As stated before, LAVA is just a test executor helping to run unified tests on different hardware.
I understand that you guys see the biggest value of LAVA in its executor backend, but to me, for example, the backend is just a bit of a complication on a local scale (I understand its value for "lab-level" setups). For me, the biggest value proposition lies in the frontend.
[]
I remember offering you my help in CI setup for LITE. I also remember you refused, so I really don't understand the complaint above. My offer is still on the table :)
I doubt "refused" is the right word here. Milosz, I appreciate you being on the forefront of LAVA support, and find it very helpful. So I can't imagine ever "refusing" it. And I'm not sure which exact word I used, but the meaning was along the lines "we didn't yet grew up to that, so let us do more homework and approach it (and you) later". And I started my mail with admitting that finally that "later" has come at least for me of my team.
Or, as an alternative, our team now needs to develop a frontend for LAVA on its own, given that LAVA stops one step short of providing a baseline-useful solution. It's not me who does the resource allocation, but I'm almost sure our team doesn't have the resources for that.
It's already there, and it was already mentioned in this email: SQUAD. There is already a rudimentary LITE setup in there: https://qa-reports.linaro.org/lite/. I'll repeat what I wrote above: please accept the help offered and we won't have to deal with somewhat false assumptions.
If you think that SQUAD is the answer to our needs in LITE, I would only be glad to listen. Actually, that was my plan for BUD20 - to come to your room and ask "That SQUAD thing, what is it, why did you write it (*), and how do you think it may be useful for LITE, given that the needs we currently seem to have are a) ..., b) ..., c) ...?".
(*) That's an oversimplification, given the project's contribution stats.
So, I'm raising these issues trying to find a suitable solution. The good news is that none of them are deep or complex; they're literally "last-mile" style issues. Rather than working on yet another ad hoc frontend, I'd rather work on making LAVA a smoother solution for all its users, offering baseline reporting capabilities out of the box. The risk here is of course that there's no agreement on what "baseline capabilities" are.
I'm OK with that as long as I don't have to do that. Patches to LAVA are always welcome.
Sure, for all the issues I submit I propose a solution, and I judge it by whether I myself could take a bite at it (so I'm looking for simple(r) solutions that give enough of an improvement). I definitely need triaging of my ideas and guidance, as of course "good intentions" doesn't mean "good changes".
milosz
Thanks!