Hi Neil,
On Thu, May 31, 2018 at 12:30 AM, Neil Williams <neil.williams@linaro.org> wrote:
On 30 May 2018 at 19:28, Kevin Hilman <khilman@baylibre.com> wrote:
Hello,
After upgrading to 2018.4 (also tried 2018.5), many of our device-types using base-uboot.jinja2 are broken. While I really like the major improvement of running commands individually, there seem to be some problems, and the LAVA output logs are very confusing, showing concatenated strings, etc.
Here is an example for an upstream device-type (meson-gxbb-p200), and here is where it starts interacting with u-boot: http://khilman.ddns.net/scheduler/job/15#L336
This is a classic presentation of the receiving end not coping with the speed at which the characters are transferred. The device-type template can define:
{% set boot_character_delay = 10 %}
Yes, I'm aware of this feature and actually tried it, but it doesn't help. I use that feature on several platforms that have buggy UART drivers in u-boot, or that don't have proper FIFO control, so commands need to be slowed down. I'm therefore familiar with the classic symptoms and that solution. Sorry, I should've mentioned in my original post that I had already tried that.
However, for me, there are several reasons why this is not the classic symptom:
- it worked without any delays before the 2018.4 upgrade
- it works fine in 2018.4 if we replace base-uboot.jinja2 with the one from 2017.11
- the same board works with other pexpect tooling (pyboot) driving u-boot without any delays
- it fails in the same way on several boards: it's always the command right after dhcp that's missing
Also, the classic symptoms that require that fix (at least in my experience) are randomly lost characters, not entire commands that just go missing.
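For comparison, the kind of pacing that a per-character delay provides can be sketched roughly as below. This is only an illustration, not LAVA's code: send_paced and its write and sleep parameters are hypothetical names, and the delay is assumed to be in milliseconds.

```python
import time


def send_paced(write, command, delay_ms=10, sleep=time.sleep):
    """Write `command` one character at a time, pausing after each one.

    `write` is any callable that sends a single character to the serial
    console (hypothetical here). `delay_ms` is assumed to be milliseconds.
    """
    for ch in command + "\n":
        write(ch)
        # Give a slow UART time to drain before the next character.
        sleep(delay_ms / 1000.0)
```

Pacing like this helps when individual characters get dropped. It does nothing for a command that is sent in full while the receiver is not yet ready to accept input.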
To me, this actually looks like LAVA is not fully waiting for a prompt before sending the next command. IOW, it sends a command before the prompt shows up, and so that command is totally lost. I haven't dug far enough into the code to know how/why it's happening, but that's what it "feels" like so far.
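The behaviour I would expect can be sketched as below. This is a minimal sketch, not taken from the LAVA source: run_commands is a hypothetical helper, conn is assumed to be any object with pexpect-style sendline() and expect_exact() methods, and the "=> " prompt string is an assumption that will differ between boards.

```python
def run_commands(conn, commands, prompt="=> ", timeout=30):
    """Send each command only after the bootloader prompt has been seen.

    `conn` is assumed to behave like a pexpect spawn object; the prompt
    string is an assumption and must match the actual u-boot prompt.
    """
    # Wait for the first prompt so the first command is not sent early.
    conn.expect_exact(prompt, timeout=timeout)
    for command in commands:
        conn.sendline(command)
        # Block until the prompt returns, i.e. the command has finished,
        # before sending the next command.
        conn.expect_exact(prompt, timeout=timeout)
```

If the wait after a long-running command such as dhcp matched too early (for example on a stale prompt left in the buffer), the next command would be sent while u-boot is still busy, which would match what the logs show.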
Kevin