On 30 May 2018 at 19:28, Kevin Hilman <khilman@baylibre.com> wrote:
Hello,

After upgrading to 2018.4 (also tried .5) many of our device-types
using base-uboot.jinja2 are broken.  While I really like the major
improvement to run commands individually, there seem to be some
problems and the LAVA output logs are very confusing, showing
concatenated strings, etc.

Here is an example for an upstream device-type (meson-gxbb-p200), and
here is where it starts interacting with u-boot:
http://khilman.ddns.net/scheduler/job/15#L336

This is a classic symptom of the receiving end not coping with the rate at which characters are sent. The device-type template can define:

{% set boot_character_delay = 10 %}

This adds the specified number of milliseconds between each character sent over the serial connection, for all strings sent by the boot action. It shouldn't make any noticeable difference to the speed of the test job, but it should dramatically improve the ability of the receiving buffer to cope with automated text entry instead of keyboard entry.
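
To illustrate the idea, here is a minimal Python sketch of per-character pacing. It is not LAVA's implementation: the names `send_with_delay` and `added_seconds`, and the `write` callable standing in for the serial connection, are hypothetical and exist only for this example. The commands are the ones visible at the u-boot prompt in the serial log from this thread.

```python
import time
from typing import Callable, List


def send_with_delay(write: Callable[[str], None], line: str, delay_ms: int = 10) -> None:
    """Send `line` plus a newline one character at a time.

    `write` is a hypothetical stand-in for whatever writes to the serial
    connection; `delay_ms` plays the role of boot_character_delay.
    """
    for char in line + "\n":
        write(char)
        # Pause after every character so a slow receive buffer can keep up.
        time.sleep(delay_ms / 1000.0)


def added_seconds(commands: List[str], delay_ms: int = 10) -> float:
    """Rough extra time added by the delay: one pause per character sent,
    counting the newline that terminates each command."""
    total_chars = sum(len(cmd) + 1 for cmd in commands)
    return total_chars * delay_ms / 1000.0


if __name__ == "__main__":
    # Commands taken from the serial console log in this thread.
    commands = [
        "setenv autoload no",
        "setenv initrd_high 0xffffffff",
        "setenv fdt_high 0xffffffff",
        "dhcp",
    ]
    sent: List[str] = []
    for cmd in commands:
        send_with_delay(sent.append, cmd, delay_ms=10)
    print("".join(sent), end="")
    print("extra seconds:", added_seconds(commands, delay_ms=10))
```

Those four commands come to 81 characters including newlines, so a 10 ms delay adds under a second in total, which is why the setting should not noticeably slow the job.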

I suspect the older methods were just below some threshold in the DUT.

(There is also support for test_character_delay for similar reasons, should that be necessary.)
 


The "Parsed boot commands" look perfect, and the commands in black
all look good, but notice that the commands at the u-boot prompt
appear to be concatenated, starting right away at the "setenv
initrd_high ..." command.

However, observing the commands on the actual serial port (I use
conmux, so can observe the serial console interactions directly), I'm
not seeing concatenated strings, but the "setenv serverip ..." never
shows up, so the TFTP downloads fail, and the job fails.

Here's what I see directly on the serial console:

Hit Enter or space or Ctrl+C key to stop autoboot -- :  0
gxb_p200_v1#
gxb_p200_v1#setenv autoload no
gxb_p200_v1#setenv initrd_high 0xffffffff
gxb_p200_v1#setenv fdt_high 0xffffffff
gxb_p200_v1#dhcp
dwmac.c9410000 Waiting for PHY auto negotiation to complete.. done
Speed: 100, full duplex
BOOTP broadcast 1
BOOTP broadcast 2
DHCP client bound to address 192.168.0.216 (267 ms)
gxb_p200_v1#tftp 0x1080000 14/tftp-deploy-5v1wo7fv/kernel/uImage
Speed: 100, full duplex
Using dwmac.c9410000 device
TFTP from server 192.168.0.1; our IP address is 192.168.0.216
Filename '14/tftp-deploy-5v1wo7fv/kernel/uImage'.
Load address: 0x1080000
Loading: *
TFTP error: 'File not found' (1)

Even more interesting is that on the same setup, a beaglebone-black
device, using the same base-uboot.jinja2 is working just fine:
http://khilman.ddns.net/scheduler/job/1

That is also typical of character delay issues: they are very device-type specific.
 


Any help would be appreciated; I'm thoroughly confused by what's going on here.

It took quite some time to identify this originally too.

There is a mention in the docs, but it is not indexed well; that will be improved as a result of your question. Several of the other device-type templates include examples of it.

https://lava.codehelp.co.uk/static/docs/v2/device-integration.html#extend-the-template-unit-tests
https://lava.codehelp.co.uk/static/docs/v2/ipmi-pxe-deploy.html#serial-over-lan-input-issues (which is where this was first found)
