Hi folks,
We held our regular weekly design meeting yesterday via
Hangout. Summary of discussion:
1. [Neil] Job action timeouts
a. Downgrade current change to not reject XMLRPC API job submission [Rémi]
1. lots of jobs on LKFT need changes, likely to be indicative
of a wider problem
b. Implement the XMLRPC check and get lava-schema.py out to
people in 2019.02 [Rémi]
c. Announce the new schema
d. Leave fatal exceptions until a future release
e. Confirmed: the schema validation itself is not yet part of lava_scheduler_app submit.
2. [Neil] Aarch64 gitlab-runners
a. Initial config available, needs optimisation, especially
concurrency per machine vs cores per runner
1. optimisation to be done during 2019.03 cycle.
b. [Steve] Mustang machine not booting - investigating
1. could be an issue with upgrade to buster.
3. [Neil] wisdom of running unit tests inside docker builds in the
ci-images project?
a. guarantees that the new image won't break lava.git master.
b. another item to be added to the docs of how our CI
operates. [Neil] needs an issue.
4. [Neil] Rémi to authenticate the GnuPG fingerprint for
4E9995EC67B6560E0A9B97A9597DCC10C0D1B33D to enable lavasoftware.org
ansible password_store
a. Now fixed via keys.gnupg.net and pgp.earth.li
5. [Dean] Feasibility of upgrading django-auth-ldap to version 1.7?
a. https://tracker.debian.org/pkg/django-auth-ldap
b. To sync LDAP groups into Django auth.
c. We want to be able to mirror LDAP groups as groups in LAVA
(details of this:
https://django-auth-ldap.readthedocs.io/en/latest/permissions.html#group-mi…)
but examples of this given in 1.5 docs (oldest I've found) don't
appear to work in 1.3
d. I believe 1.7 is in Buster, so is it a case of moving to buster?
1. not urgent to migrate to buster now.
6. [Steve] location for documentation
a. Needs an issue to track this.
1. avoid the rabbit hole of optimising what is left.
b. certain elements need to go into the website from lava-server-doc
1. Release docs
2. Development process
3. Design overview
4. Keep development-intro and update links.
c. update all the links in lava-server-doc.
d. website will move to Sphinx instead of Pelican.
e. a lot of docs are using Sphinx RST.
1. there will be some conversion needed for items which are currently in Google Docs.
f. move design meeting doc to the wiki?
1. no - interactive shared-editing feature is very useful
2. we can simply copy text to the wiki after the fact
3. Page per meeting in the wiki with index links
4. Consider the new document as public by the end of the meeting.
7. [Rémi] MuxPi RPi Zero: build raw images.
a. Guestfish based on GuestFS
b. inside docker? need /boot and /lib/modules volumes. Easy scriptable
way to do things.
c. Should we publish our images and a way to rebuild them?
1. Let's not do this as an automated public build
a. Neil to close https://git.lavasoftware.org/lava/functional-tests/issues/7 (Use
GitLab CI to auto-build functional test image files for
files.lavasoftware.org) on the basis that we are not an image
building service.
2. Must document how our images are built
a. files.lavasoftware.org already includes copies of the
scripts which were used to create the Debian standard image files.
The LAVA design meeting is held weekly, every Wednesday at 13:00 to
14:00 UTC using Google Hangouts Meet: https://meet.google.com/qre-rgen-zwc
Feel free to comment here or join us directly in the meeting.
Cheers,
--
Steve McIntyre steve.mcintyre(a)linaro.org
<http://www.linaro.org/> Linaro.org | Open source software for ARM SoCs
Hi folks,
We held our regular weekly design meeting today via Hangout. Summary
of discussion:
1. [Steve] Layoffs in Linaro affecting the team
2. [Dean] A user running the same process multiple times in a test shell has noticed that later iterations take much longer than earlier ones, even though they should take the same amount of time.
1. They noticed the "Listened to connection for namespace '<NAMESPACE>' done" message appeared a lot.
2. shell.py has the following noted around this debug log:
1. # With an higher timeout, this can have a big impact on
2. # the performances of the overall loop.
1. Are there any known issues with the read feedback checks?
2. Is there any reason why this step would take longer over the course of a job? If so, is there anything we can do to mitigate this?
3. Are there any settings we can tweak to adjust performance?
4. Any further information that might help us investigate this further?
5. Maybe a pexpect problem? Changes in this area happened in 2018.7, Dean is using 2018.5
3. [Rémi] lavafed labs
1. Neil’s lab is off
2. ARM? In process, but may take a while - needs IT involvement to open up ports etc.
3. [Rémi] Contact lava users:
1. Collabora? [done]
2. Baylibre? [done]
3. ST?
1. [Rémi] Add Matt’s lab
2. [Steve] Can set up some stuff if needed (Mustang? BBB? Panda? Maybe grab old boards from Neil?)
4. [Rémi] LAVA 2019.2 release
1. When?
2. Start the process Thu 28th, but we're not going to get it all done then
3. Expect to finish Monday 4th?
4. Need to document what functional tests we're doing manually for now (list was in Neil's head!)
5. We want to get to the point where lavafed etc. make this obsolete
1. Will need to actually work out useful tests for all the devices!
Cheers,
--
Steve McIntyre steve.mcintyre(a)linaro.org
<http://www.linaro.org/> Linaro.org | Open source software for ARM SoCs
Starting lava-run in a dedicated container
==========================================
https://git.lavasoftware.org/lava/lava/issues/114
Work has already been done for the device support in this area. The
intention is that the admins can create static udev rules which add
the device to the correct container. To achieve this, the name of the
container needs to be made available to the udev rule. The plan will
be for lava-slave to create a file in /var/cache/lava-slave/. The file
will be named according to the device hostname and will contain the
container name plus some other useful data, e.g. the job ID. The udev
rule can then parse this file to know which container to use. The udev
rule would be triggered on each ADD. If lava-slave is run in a docker
container, /var/cache/lava-slave/ would need to be made available to that
container as a volume. (LAVA specifies the name of the container in advance.)
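As a rough illustration of the cache file described above, here is a
minimal Python sketch. The file format (JSON), the field names
("container", "job_id") and both function names are assumptions made
for this example only; the notes above do not fix any of them. The
writer side stands in for lava-slave and the reader side stands in for
a helper that a udev rule could invoke.

```python
import json
from pathlib import Path


def write_container_info(cache_dir, device_hostname, container_name, job_id):
    """Write the per-device file, named after the device hostname.

    The JSON layout is an assumption for illustration only.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / device_hostname
    # Write to a temporary file and rename it into place, so that a
    # reader triggered by a udev ADD event never sees a partial file.
    tmp = target.with_name(target.name + ".tmp")
    tmp.write_text(json.dumps({"container": container_name, "job_id": job_id}))
    tmp.replace(target)
    return target


def read_container_name(cache_dir, device_hostname):
    """Return the container name for a device, or None if unavailable."""
    try:
        data = json.loads((Path(cache_dir) / device_hostname).read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return None
    return data.get("container")
```

Writing via a rename keeps the reader simple: it either finds a
complete file or no file at all.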
Test job to control the image to be used
----------------------------------------
The LAVA documentation will need to recommend using official LAVA
Software Community Project docker images, but some teams will want to
build & use images based on those to help include tools which take a
long time to install / build. LAVA will not be able to check the
provenance of the images being used; this is a test writer problem.
LAVA will need to clearly output the docker image being executed and
retain that in the permanent test job log output or result metadata.
lava-slave will need to handle "latest" URLs and turn them into a
reproducible ID using docker inspect to get the image ID. This is to
be done by passing an argument to a new lava-run option.
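One possible shape for the "latest" handling, as a hedged Python
sketch: it calls "docker inspect" with a format string to read the
image ID. The function name and the injectable "run" parameter are
assumptions for this example, not part of LAVA, and the sketch assumes
the image has already been pulled locally.

```python
import subprocess


def resolve_image_id(image, run=subprocess.run):
    """Return the image ID for a reference such as "name:latest".

    "run" defaults to subprocess.run and can be replaced when testing
    without a docker daemon (hypothetical convenience, not LAVA API).
    """
    result = run(
        ["docker", "inspect", "--format", "{{.Id}}", image],
        check=True,           # raise if docker reports an error
        capture_output=True,  # collect stdout instead of printing it
        text=True,            # decode output as str
    )
    return result.stdout.strip()
```

The returned ID could then be written to the test job log or result
metadata, so that a later run can identify the image which was used.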
LAVA already outputs the version of lava-dispatcher (lava-run) in use
(and other tools) and this will continue with docker.
Admins will continue to control certificates, e.g. ZMQ
Capabilities may need to be added too.
LAVA runs the container with --rm (possibly with --force too).
Releases and milestones
=======================
We have created a 2019.03 milestone which is expected to contain the
work on lava-run in a separate container above. We have moved a number
of issues and merge requests from 2019.02 into 2019.03 (or sometimes
into .05) to get to a feasible number of changes to release 2019.02.
Adding env variables to the test shell
======================================
In combination with https://git.lavasoftware.org/lava/lava/issues/228
we will be looking at making device dictionary elements and some
environment variables available inside the Lava-Test Test Definition
1.0 overlays using a test shell helper. Test writers are advised not
to rely on device-specific information unless essential.
--
Neil Williams
=============
neil.williams(a)linaro.org
http://www.linux.codehelp.co.uk/
We've come across a problem with LXC test jobs on a network which only
supports IPv6 but it's hard to replicate (this network is at a
conference, just for a few days).
Has anyone already looked at IPv6 and LXC? Are there other IPv6 issues
with LAVA?
I've found this guide but to be able to document this, the content
needs to come from someone who can replicate and test the actual
problems.
https://techoverflow.net/2018/06/06/routing-public-ipv6-addresses-to-your-l…
--
Neil Williams
=============
neil.williams(a)linaro.org
http://www.linux.codehelp.co.uk/