On 30 May 2018 at 19:28, Kevin Hilman <khilman@baylibre.com> wrote:
Hello,
After upgrading to 2018.4 (also tried 2018.5), many of our device-types using base-uboot.jinja2 are broken. While I really like the major improvement of running commands individually, there seem to be some problems, and the LAVA output logs are very confusing, showing concatenated strings, etc.
Here is an example for an upstream device-type (meson-gxbb-p200), and here is where it starts interacting with u-boot: http://khilman.ddns.net/scheduler/job/15#L336
This is a classic symptom of the receiving end not coping with the speed at which the characters are sent. The device-type template can define:
{% set boot_character_delay = 10 %}
This adds the specified number of milliseconds between each character sent over the serial connection, for all strings sent by the boot action. It shouldn't make any noticeable difference to the speed of the test job, but it should dramatically improve the ability of the receiving buffer to cope with automated text entry instead of keyboard entry.
I suspect the older methods were just below some threshold in the DUT.
(There is also support for test_character_delay for similar reasons, should that be necessary.)
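To illustrate what a per-character delay does, here is a minimal Python sketch. It is not LAVA's implementation: send_with_delay and added_seconds are hypothetical helpers written for this reply, and the write callable stands in for whatever actually writes to the serial connection. The second helper estimates how much time the delay adds, using the first few commands from your log.

```python
import time
from typing import Callable, Sequence


def send_with_delay(
    write: Callable[[str], None],
    line: str,
    delay_ms: int,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Write `line` plus a newline one character at a time.

    Hypothetical helper: pauses `delay_ms` milliseconds after each
    character so a slow receiver can drain its input buffer.
    """
    for char in line + "\n":
        write(char)
        if delay_ms > 0:
            sleep(delay_ms / 1000.0)


def added_seconds(commands: Sequence[str], delay_ms: int) -> float:
    """Estimate the extra time the delay adds, counting one newline per command."""
    total_chars = sum(len(command) + 1 for command in commands)
    return total_chars * delay_ms / 1000.0


if __name__ == "__main__":
    sent = []
    # A list stands in for the serial connection in this sketch.
    send_with_delay(sent.append, "setenv autoload no", delay_ms=10)
    print("".join(sent), end="")

    commands = [
        "setenv autoload no",
        "setenv initrd_high 0xffffffff",
        "setenv fdt_high 0xffffffff",
        "dhcp",
    ]
    print(added_seconds(commands, delay_ms=10), "seconds added")
```

With a 10 ms delay, those four commands come to 81 characters including newlines, so the added time is under a second, which is why the setting should not noticeably slow the job.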
The "Parsed boot commands" look perfect, and the commands in black all look good, but notice the commands at the u-boot prompt: they appear to be concatenated, starting right away at the "setenv initrd_high ..."
However, observing the commands on the actual serial port (I use conmux, so can observe the serial console interactions directly), I'm not seeing concatenated strings, but the "setenv serverip ..." never shows up, so the TFTP downloads fail, and the job fails.
Here's what I see directly on the serial console:
Hit Enter or space or Ctrl+C key to stop autoboot -- : 0
gxb_p200_v1#
gxb_p200_v1#setenv autoload no
gxb_p200_v1#setenv initrd_high 0xffffffff
gxb_p200_v1#setenv fdt_high 0xffffffff
gxb_p200_v1#dhcp
dwmac.c9410000 Waiting for PHY auto negotiation to complete.. done
Speed: 100, full duplex
BOOTP broadcast 1
BOOTP broadcast 2
DHCP client bound to address 192.168.0.216 (267 ms)
gxb_p200_v1#tftp 0x1080000 14/tftp-deploy-5v1wo7fv/kernel/uImage
Speed: 100, full duplex
Using dwmac.c9410000 device
TFTP from server 192.168.0.1; our IP address is 192.168.0.216
Filename '14/tftp-deploy-5v1wo7fv/kernel/uImage'.
Load address: 0x1080000
Loading: *
TFTP error: 'File not found' (1)
Even more interesting is that on the same setup, a beaglebone-black device, using the same base-uboot.jinja2 is working just fine: http://khilman.ddns.net/scheduler/job/1
That is also typical of character delay issues - very device-type specific.
Any help would be appreciated, I'm thoroughly confused by what's going on here.
It took quite some time to identify this originally too.
There is a mention in the docs, but it's not indexed well; improving that will be an action arising from your question. There are a number of examples of it in the other device-type templates.
https://lava.codehelp.co.uk/static/docs/v2/device-integration.html#extend-th...
https://lava.codehelp.co.uk/static/docs/v2/ipmi-pxe-deploy.html#serial-over-... (which is where this was first found)