Hi folks,
We held our regular design meeting today via Hangout. Summary of brief
discussion:
# 24th July 2019
# install.git-repos [milosz]
This feature works nicely:
https://lkft.validation.linaro.org/scheduler/job/834097
Proposal: keep the `install` option but restrict it so that it does not try
to install system packages.
[Rémi] Will submit a patch to remove the “deprecation” warning in the
documentation.
# Authentication refactoring [milosz]
Under review by Rémi. Looks good.
# Connect sessions were accepted [Rémi]
* LAVA users forum
* Hacking and contributing to LAVA
* Advanced testing in python
# Playing with Sentry error reporting [Rémi]
* Will create a ticket to have it installed in the linaro lab.
* Will create sentry.lavasoftware.org
* No debian package available for python3-sentry-sdk
* Should be installed from pip (sentry-sdk)
* Will send a patch to install sentry-sdk from pip in the lava-server docker
container.
* Activate it for lavafed instances.
============================================================================
The LAVA design meeting is held weekly, every Wednesday at 13:00 to
14:00 UTC using Google Hangouts Meet: https://meet.google.com/qre-rgen-zwc
Feel free to comment here or join us directly in the meeting.
Minutes from this and previous meetings are also stored in the LAVA wiki:
https://git.lavasoftware.org/lava/lava/wikis/design-meetings/index
Cheers,
--
Rémi Duraffort
LAVA Team, Linaro
Hi folks,
We held our regular design meeting today via Hangout. Summary of
brief discussion:
# 17th July 2019
# Large job definitions causing outages [deanb]
* Issue: https://git.lavasoftware.org/lava/lava/issues/299
* Question: for large jobs (above a configurable limit), is simply not
creating ActionData objects a sensible approach?
* Tried this:
https://git.lavasoftware.org/dean-birch/lava/commit/dd220c0bd82bf092e35e643…
* In my test instance this reduced outage to 30 seconds (from hours).
* If not, what else can we do?
* Does anything extra need to be added?
* Documentation?
* [deanb] will send a patch with the first improvements (CLoader)
* [deanb] will look at using bulk save to save all objects in one call
* [stevan] will investigate ActionData: is it possible to create them later
on, or even not create them at all?
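As a rough illustration of two of the ideas above (skipping per-action records above a configurable limit, and saving everything in one call), here is a minimal Python sketch. `ActionRecord`, `walk`, `collect_records` and `save_all` are hypothetical names, not LAVA code; the real ActionData model and the code in lava_results_app are more involved.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional


@dataclass
class ActionRecord:
    """Hypothetical stand-in for one ActionData row."""
    level: str
    name: str


def walk(actions: list, prefix: str = "") -> Iterator[ActionRecord]:
    """Yield one record per action, depth first, with levels 1, 1.1, 1.2, 2, ..."""
    for index, action in enumerate(actions, start=1):
        level = f"{prefix}{index}"
        yield ActionRecord(level=level, name=action["name"])
        # Nested actions are assumed to live under a "pipeline" key.
        yield from walk(action.get("pipeline", []), prefix=f"{level}.")


def collect_records(actions: list, max_records: Optional[int]) -> List[ActionRecord]:
    """Return every record, or an empty list when the job exceeds the limit."""
    records: List[ActionRecord] = []
    for record in walk(actions):
        records.append(record)
        if max_records is not None and len(records) > max_records:
            return []  # job too large: skip the per-action records entirely
    return records


def save_all(records: List[ActionRecord], bulk_save: Callable) -> None:
    """Hand all records to the storage layer in a single call."""
    if records:
        bulk_save(records)
```

In a Django setting, the `bulk_save` callable could be something like `Model.objects.bulk_create`, which inserts many rows in one query; whether that fits the real ActionData model is not confirmed here.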
# Test from inline with git [milosz]
The idea is to source the test-definition YAML from inline but use a git
repository to prepare the overlay. Example: https://github.com/andersson/bootrr
Bjorn doesn’t want to have a YAML file in this repository.
[Rémi] Using install.git-repos might work
* https://git.lavasoftware.org/lava/lava/blob/master/lava_dispatcher/tests/te…
* https://docs.lavasoftware.org/lava/lava_test_shell.html#adding-git-bzr-repo…
[milosz] will try install.git-repos
If that works, Rémi will add some tests to lavafed or meta-lava.
# Switching between serial connections on a device with multiple UARTs [Malcolm Brooks]
* Issue: We have devices which use separate serial outputs for MCC, AP and
SCP UARTs.
* Workaround: Use the `new_connection` boot method to switch between UART1
and UART2 in order to catch the kernel booting once MCC flash stage is
complete.
* Idea: Allow all connections (or possibly a subset, defined for example in
“connection_tags”) to be established and followed from the beginning of the
job, and allow each action/stage to select which one it actually listens to
or interacts with, using the `connection` option (example below).
```yaml
- boot:
    namespace: target
    connection: uart1
    method: minimal
```
[Rémi] Sounds like a good idea.
* Using feedback, LAVA can already use one connection and listen to/print the
other ones
* Malcolm will create an issue on gitlab.
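To make the proposal a little more concrete, here is a minimal Python sketch of the selection logic: every named connection is followed from the start, and each action picks one by name through a `connection` key. `ConnectionPool` and its methods are hypothetical and only model the idea; they are not part of LAVA.

```python
from typing import Dict, List, Tuple


class ConnectionPool:
    """Hypothetical registry of named serial connections for one job."""

    def __init__(self, names: List[str], default: str) -> None:
        if default not in names:
            raise ValueError(f"unknown default connection: {default}")
        # One buffer per connection, so output is kept even when no action
        # is currently interacting with that UART.
        self.buffers: Dict[str, List[str]] = {name: [] for name in names}
        self.default = default

    def feed(self, name: str, line: str) -> None:
        """Record a line received on the given connection."""
        self.buffers[name].append(line)

    def select(self, action: dict) -> Tuple[str, List[str]]:
        """Return the connection an action asked for and its output so far."""
        name = action.get("connection", self.default)
        if name not in self.buffers:
            raise KeyError(f"connection not defined for this device: {name}")
        return name, self.buffers[name]
```

An action without a `connection` key falls back to the default connection, which would keep existing job definitions working unchanged under this assumption.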
Cheers,
--
Rémi Duraffort
LAVA Team, Linaro
Hi All,
We've previously had an issue on our LAVA instance where it stopped
responding to workers and stopped dispatching jobs when it finished running
a large job definition (around 25000 lines in the definition, around 1000
deploy/boot/test actions). I've been looking into reproducing this safely
in a development environment, and I've got a few observations and questions
about how the situation could be improved.
The lava-master process appears to be stuck processing the job results, and
takes a painstakingly long time to finish this and send an ACK for END_OK.
During this processing, the master doesn't respond to worker pings, and
doesn't schedule other jobs. Tracking a bit deeper, it seems that the vast
majority of the time (I've never seen it finish as I have always restarted the
lava services before it finishes) is spent in the walk_actions and build_action
functions of the lava_results_app/dbutils.py file:
https://git.lavasoftware.org/lava/lava/blob/2019.05.post1/lava_results_app/…
https://git.lavasoftware.org/lava/lava/blob/2019.05.post1/lava_results_app/…
What options are there to mitigate this issue? Some ideas below:
- Could we optimize the build_action function? There are a few Django
model/db queries in build_action, could some results be queried once and
cached? With an obscenely large job, would this even give us enough savings
to make the time invested in safely optimizing this worth it?
- What are the implications of not having created ActionData objects for a
job? Does this mean that no options will be available in the "Pipeline ↓"
drop-down on the job page for quick navigation? Could we optionally abort
after a certain amount of these (and make it configurable per LAVA
instance)?
- Should/could the handling of the results be forked off, so lava-master
can continue to schedule more jobs and respond to worker pings, but slowly
the ActionData objects can be populated? I'm unsure if you have to be on a
special thread to write to Django models. Even if this could be done, would
any weird behaviours occur on the slave side as it will still be waiting
for the ACK for END_OK from the master?
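For the first idea (query once and cache), a minimal Python sketch of the pattern follows. `LookupCache` and the `fetch` callable are hypothetical; `fetch` stands in for whichever database lookup build_action repeats, which is not shown here.

```python
from typing import Any, Callable, Dict, Hashable


class LookupCache:
    """Memoise an expensive lookup for the duration of one job's processing."""

    def __init__(self, fetch: Callable[[Hashable], Any]) -> None:
        self._fetch = fetch
        self._cache: Dict[Hashable, Any] = {}
        self.misses = 0  # number of times the real lookup was performed

    def get(self, key: Hashable) -> Any:
        """Return the cached value, calling the real lookup only on first use."""
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._fetch(key)
        return self._cache[key]
```

With around 1000 actions that share a handful of distinct keys, the number of real lookups drops from one per action to one per distinct key. Whether the savings are enough depends on where the time actually goes, which would need profiling.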
Any guidance on how to proceed with this would be appreciated! I'm happy to
place this and some more details in as a LAVA issue on git.lavasoftware.org
if this is easier to track and discuss.
Thanks,
Dean
Hi folks,
As Rémi and Stevan are both out and we don't have any items listed for
discussion in advance, I'm cancelling today's design meeting.
Cheers,
--
Steve McIntyre steve.mcintyre(a)linaro.org
<http://www.linaro.org/> Linaro.org | Open source software for ARM SoCs
Hi folks,
We held our regular design meeting today via Hangout. Summary of brief
discussion:
3rd July 2019
1. [Rémi] 2019.07 release?
1. Should we do one?
2. Most people will be out for most of the month
3. Maybe worth it for the LITE group (depends on the number of
patches).
4. Steve back from DebConf on the 28th, but…
5. Yes: releasing around the 18th
2. [Rémi] Debian buster is due soon
1. Basing the docker image on Buster?
2. No, wait for a little bit. Maybe 2019.08?
3. Staging is already running Buster, main v.l.o is still on Stretch
but the lab team will want to upgrade soon
4. How long do we support stretch-backports?
5. Add buster-backports soon, as new uploads will hit Debian
unstable (---> Bullseye).
6. Target 2019.08 at all three releases (stretch-backports,
buster-backports, bullseye)
3. [Rémi] Recommendations about VACUUM ANALYZE
1. This should be run regularly (every day) on busy instances to
clean up
2. Add a thing in the docs, test in the lab
3. See https://www.postgresql.org/docs/11/sql-vacuum.html for more
info: VACUUM ANALYZE runs a VACUUM and then an ANALYZE, so statistics are
collected without the old data.
4. Lets the DB self-optimise for performance
4. [Rémi] Using git submodule to include docker sources into lava
sources
1. Still a separate repository
2. The exact commit hash used for the lava docker image is now known
and reproducible.
1. This is the main reason
2. Using version.py on the last commit of the docker directory
can also work.
3. See https://git.lavasoftware.org/lava/lava/merge_requests/637
4. Let's go with this instead of git submodule, it works fine
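For the VACUUM ANALYZE recommendation in item 3, a daily job could be as small as the Python sketch below, which wraps PostgreSQL's `vacuumdb` tool. The function names are hypothetical, and the database name `lavaserver` in the usage note is an assumption; check the instance settings for the real one.

```python
import subprocess
from typing import List, Optional


def vacuum_command(dbname: Optional[str] = None) -> List[str]:
    """Build a vacuumdb invocation that runs VACUUM and then ANALYZE."""
    command = ["vacuumdb", "--analyze"]
    if dbname is None:
        command.append("--all")  # no database given: process all of them
    else:
        command.append(f"--dbname={dbname}")
    return command


def run_vacuum(dbname: Optional[str] = None) -> int:
    """Run the command and return its exit status (0 on success)."""
    return subprocess.run(vacuum_command(dbname), check=False).returncode
```

Calling `run_vacuum("lavaserver")` from a daily cron job or systemd timer, as a user allowed to connect to the database, would match the "every day on busy instances" suggestion.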
Cheers,
--
Steve McIntyre steve.mcintyre(a)linaro.org
<http://www.linaro.org/> Linaro.org | Open source software for ARM SoCs
Hi folks,
We held our regular design meeting today via Hangout. Summary of brief
discussion:
26th June 2019
1. 2019.06 release
1. https://federation.lavasoftware.org/versions/2019.05.0050.gf287c3449/
1. Looks ok
2. https://git.lavasoftware.org/groups/lava/-/merge_requests?milestone_title=2019.06
2. [Dean] Fast model support
1. Was deprecated 2 years ago with V1
2. Now needed again
3. Run as a user-configured container (so test writer does
stuff), or re-integrate like we had with v1?
4. Very similar to how we do qemu
5. Look at how we run openocd/gdb as inspiration
3. [Rémi] Allowing test job definitions to override U-Boot
config (e.g. load addresses)
1. Should be easy, waiting on a patch from Matt
4. [Kumar] Direct serial connection (#296)
1. pyserial probably the best bet?
1. Look at connection tags like in:
1. https://staging.validation.linaro.org/scheduler/device/staging-black01/devicedict#defline77
2. ser2net works, but this might add more flexibility
3. possible timing problems with ser2net in a container?
4. we can help working out package dependencies etc. if
needed
5. Linaro Connect SAN19 - what sessions should we have?
1. Talk next week, suggestions welcome
Cheers,
--
Steve McIntyre steve.mcintyre(a)linaro.org
<http://www.linaro.org/> Linaro.org | Open source software for ARM SoCs
Hi folks,
We held our regular weekly design meeting today via Hangout. Summary
of brief discussion:
5th June 2019
1. [Rémi] 2019.06 planning:
1. Features and issues that should be in for 2019.06
2. Try to assign some, depending on the available time to work
on LAVA
3. [stevanr]
1. VAC, plus auth refactoring fixes
4. [Steve]
1. lots of doc updates
2. debugging some Arm issues
3. vland - docs, new switch support, etc.
5. Finish reviewing/reworking/merging Tim's device-dict in test
job patch
1. Anibal looking into this too
2. With expansion, helps to support the expanded fastboot
image work
6. https://git.lavasoftware.org/lava/lava/issues/277 - make
table lengths configurable?
2. [Anibal] Some NFS code is duplicated - will open an issue
Cheers,
--
Steve McIntyre steve.mcintyre(a)linaro.org
<http://www.linaro.org/> Linaro.org | Open source software for ARM SoCs
Hi folks,
We held our regular weekly design meeting today via Hangout. Summary
of brief discussion:
29th May 2019
1. [Rémi] 2019.05 release: changelog
1. Last part of the fix for the security issue regarding job
context
2. Lava-slave and socks proxy for remote labs behind proxies
3. Compressing job logs
2. [Rémi] Next releases
1. 2019.06
1. Rest api filter
2. device dictionary access from the test shell
2. 2019.08
1. auth refactoring
3. [Steve] Name for the extra udev tools package?
1. Forwarding udev events to docker containers
2. udev pass-through script
3. "docker-udev-tools" agreed and created as a new project
1. https://git.lavasoftware.org/lava/docker-udev-tools
Cheers,
--
Steve McIntyre steve.mcintyre(a)linaro.org
<http://www.linaro.org/> Linaro.org | Open source software for ARM SoCs
Hi folks,
We held our regular weekly design meeting on 22nd May via Hangout. Summary
of brief discussion:
22nd May 2019
1. [milosz] How to fix packaging on this branch:
https://git.lavasoftware.org/mwasilew/lava/pipelines/3220 ?
1. Fixed by Steve and Rémi
2. Problem with some debian python packages
2. [Steve] CharField to TextField changes needing work - !527
1. As Stevan points out, this is breaking other things.
2. Going to back out the future-proofing changes that extended
this, and go back to just fixing the specific things that we've
found to be broken
3. [stevanr] Auth refactoring submit/resubmit/cancel permissions
1. Currently: submit is a separate permission, while resubmit and cancel
share the same permission level
2. Submit permission is not tied to a specific test job, while
resubmit and cancel are
3. [ivoire] Keep things as is
4. [Anibal] questions about the fastboot-nfs setup - how to do things?
1. how to pass information into the lxc when creating the image?
2. Ordering of actions is important - the test action in the lxc
will need information that's available from the fastboot deploy
step. To pass via overlay, would need this to be available
before the lxc deploy
3. Can we simply pass the device dict for the DUT into the lxc,
similarly to what Tim has in
https://git.lavasoftware.org/lava/lava/merge_requests/536 ?
4. How to list the variables/information we want to have
available?
1. Device dictionary
2. Some dynamic data (nfsrootfs address)
5. What about listing in the test block, the “dependencies” (find
a better name) that we are expecting?
Cheers,
--
Steve McIntyre steve.mcintyre(a)linaro.org
<http://www.linaro.org/> Linaro.org | Open source software for ARM SoCs
Hi folks,
We held our regular weekly design meeting today via Hangout. Summary
of brief discussion:
15th May 2019
1. [Steve] https://git.lavasoftware.org/lava/lava/issues/273 - Using long URLs
in notify block causes lava-logs to crash.
1. Clearly using the wrong type of field here (fixed-length CharField
instead of an open-length TextField). That's easily fixed.
2. Where else might we be using the wrong data types in our DB models and
potentially storing up future bugs? Quick scan of CharField uses in
models.py:
1. ExtendedUser.irc_handle 40
2. ExtendedUser.irc_server 40
3. Architecture 100 (primary key)
4. ProcessorFamily 100 (primary key)
5. Alias 200 (primary key) (also, ignoring due to other work)
6. Core 100 (primary key)
7. DeviceType.cpu_model 100 (primary key)
8. Worker.hostname 200 (primary key)
9. Device.hostname 200 (primary key)
10. Device.device_version 200
11. JobFailureTag.name 256
12. TestJob.sub_id 200
13. TestJob.target_group 64
14. TestJob.description 200
15. Notification.template 50
16. Notification.blacklist 100 (array)
17. Notification.query_name 1024
18. Notification.conditions 400
19. NotificationRecipient.email 100
20. NotificationRecipient.irc_handle 40
21. NotificationRecipient.irc_server 40
22. NotificationCallback.url 200
23. NotificationCallback.token 200
2. [Rémi] udev event forwarding
1. How to get udev events (kernel and udev types) inside a docker
container?
2. The NETLINK socket is affected by the network namespaces
1. Run systemd-udev inside the docker container
2. Remove the network namespace (--net host)
1. Ugly and hacky
3. Run a service on the host that forwards events
1. Another project on lavasoftware.org - udevforward.py
2. [Rémi] sending to all docker containers? Filtering the
container names?
1. Currently broadcasting to the selected containers only.
3. [Rémi] Make udevforward a proper project under the lava group
1. [All] find a good name, let’s chat on IRC
2. Will move the passthrough script into the same repo.
3. [Kumar] Race between Cortex-M USB devices and Connectdevice()
1. Some boards: 1 usb for serial + debug/flashing
2. udev event for the tty vs symlink created
3. [Kumar] create an issue in git.l.o/lava/lava
4. [Matt] lava-test-raise: allow different exceptions
1. Parsing args on device vs parsing args on server
1. Parsing on the DUT is cleaner
2. [Matt] will finish the patch and send an MR
5. [Dan] Support fastboot boot with ramdisk and NFS (issue 271)
1. Mimic U-Boot: command ramdisk or command nfs
2. Maybe something like
https://staging.validation.linaro.org/scheduler/job/252683/definition#defline39
6. [Dean] Job error spotted with message “Unable to create metadata store:
[Errno 36] File name too long:
'/var/lib/lava-server/default/media/job-output/2019/05/15/61270/metadata/…”
1. How to safely truncate these filenames and still save them?
2. Check this isn’t multiple lines and test cases etc.
3. [Dean] to raise a bug with some more info
7. [Rémi] Cycle planning draft
1. -ENOTIME, coming back to this later
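On the "File name too long" question in item 6, one common way to truncate a name safely while keeping it unique is to cut it to the filesystem limit and append a short hash of the full name. The Python sketch below assumes a 255-byte limit per path component, which is typical on Linux filesystems; `safe_filename` is a hypothetical helper, not existing LAVA code.

```python
import hashlib

MAX_NAME_BYTES = 255  # assumed per-component limit; typical on Linux


def safe_filename(name: str, limit: int = MAX_NAME_BYTES) -> str:
    """Return name unchanged if it fits, else a truncated name plus a hash."""
    encoded = name.encode("utf-8")
    if len(encoded) <= limit:
        return name
    # A 16-character digest of the full name keeps distinct long names distinct.
    digest = hashlib.sha256(encoded).hexdigest()[:16]
    keep = limit - len(digest) - 1  # leave room for "-" and the digest
    # errors="ignore" drops a multi-byte character cut in half by the slice.
    prefix = encoded[:keep].decode("utf-8", errors="ignore")
    return f"{prefix}-{digest}"
```

The limit is counted in bytes rather than characters, because that is what the filesystem enforces; two long names that differ only near the end still map to different files because the digest covers the whole original name.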
Cheers,
--
Steve McIntyre steve.mcintyre(a)linaro.org
<http://www.linaro.org/> Linaro.org | Open source software for ARM SoCs