Hi everyone,
I’m facing an issue when U-Boot commands are sent one after another without waiting for a prompt. Obviously, device is not able to boot.
Excerpt from logs:
dhcp => dhcp bootloader-commands: Wait for prompt ['=>', 'Resetting CPU', 'Must RESET board to recover', 'TIMEOUT', 'Retry count exceeded', 'ERROR: The remote end did not respond in time.'] (timeout 00:04:56) setenv serverip 172.17.1.189 setenv serverip 172.17.1.189 bootloader-commands: Wait for prompt ['=>', 'Resetting CPU', 'Must RESET board to recover', 'TIMEOUT', 'Retry count exceeded', 'ERROR: The remote end did not respond in time.'] (timeout 00:04:56) tftp 0x80000000 60/tftp-deploy-46_3_i9j/kernel/zImage-am335x-vcu.bin tftp 0x80000000 60/tftp-deploy-46_3_i9j/kernel/zImage-am335x-vcu.bin bootloader-commands: Wait for prompt ['=>', 'Resetting CPU', 'Must RESET board to recover', 'TIMEOUT', 'Retry count exceeded', 'ERROR: The remote end did not respond in time.'] (timeout 00:04:56) tftp 0x83000000 60/tftp-deploy-46_3_i9j/ramdisk/ramdisk.cpio.gz.uboot tftp 0x83000000 60/tftp-deploy-46_3_i9j/ramdisk/ramdisk.cpio.gz.uboot bootloader-commands: Wait for prompt ['=>', 'Resetting CPU', 'Must RESET board to recover', 'TIMEOUT', 'Retry count exceeded', 'ERROR: The remote end did not respond in time.'] (timeout 00:04:55) setenv initrd_size ${filesize} setenv initrd_size ${filesize} bootloader-commands: Wait for prompt ['=>', 'Resetting CPU', 'Must RESET board to recover', 'TIMEOUT', 'Retry count exceeded', 'ERROR: The remote end did not respond in time.'] (timeout 00:04:55) tftp 0x82000000 60/tftp-deploy-46_3_i9j/dtb/am335x-vcu-prod1.dtb tftp 0x82000000 60/tftp-deploy-46_3_i9j/dtb/am335x-vcu-prod1.dtb bootloader-commands: Wait for prompt ['=>', 'Resetting CPU', 'Must RESET board to recover', 'TIMEOUT', 'Retry count exceeded', 'ERROR: The remote end did not respond in time.'] (timeout 00:04:55) dhcp dhcp link up on port 0, speed 10, half duplex BOOTP broadcast 1 setenv serverip 172.17.1.189 tftp 0x80000000 60/tftp-deploy-46_3_i9j/kernel/zImage-am335x-vcu.bin BOOTP broadcast 2 tftp 0x83000000 60/tftp-deploy-46_3_i9j/ramdisk/ramdisk.cpio.gz.uboot setenv initrd_size ${filesize} tftp 0x82000000 60/tftp-deploy-46_3_i9j/dtb/am335x-vcu-prod1.dtb BOOTP broadcast 3 BOOTP broadcast 4 BOOTP broadcast 5 Retry time exceeded setenv bootargs 'console=ttyS2,115200n8 root=/dev/ram0 ti_cpsw.rx_packet_max=1526 ip=dhcp' => setenv bootargs 'console=ttyS2,115200n8 root=/dev/ram0 ti_cpsw.rx_packet_max=1526 ip=dhcp' bootloader-commands: Wait for prompt ['=>', 'Resetting CPU', 'Must RESET board to recover', 'TIMEOUT', 'Retry count exceeded', 'ERROR: The remote end did not respond in time.'] (timeout 00:04:50) setenv bootargs 'console=ttyS2,115200n8 root=/dev/ram0 ti_cpsw.rx_packet_max=1526 ip=dhcp' setenv bootargs 'console=ttyS2,115200n8 root=/dev/ram0 ti_cpsw.rx_packet_max=1526 ip=dhcp' => bootz 0x80000000 0x83000000 0x82000000 => bootz 0x80000000 0x83000000 0x82000000 bootloader-commands: Wait for prompt Starting kernel (timeout 00:04:50) bootz 0x80000000 0x83000000 0x82000000 bootz 0x80000000 0x83000000 0x82000000 Bad Linux ARM zImage magic! => Bad Linux ARM zImage magic!
As you see, all commands that were sent between `dhcp` command and until it was able to complete or fail are simply dropped.
Device type config:
{% extends 'base-uboot.jinja2' %}
{% set device_type = "vcu" %}
{% set console_device = 'ttyS2' %} {% set baud_rate = 115200 %} {% set interrupt_prompt = 'Press s to abort autoboot' %} {% set interrupt_char = 's' %} {% set bootloader_prompt = '=>' %}
{% set uboot_mkimage_arch = 'arm' %}
{% set bootz_kernel_addr = '0x80000000' %} {% set bootz_ramdisk_addr = '0x83000000' %} {% set bootz_dtb_addr = '0x82000000' %}
{% set extra_kernel_args = 'ti_cpsw.rx_packet_max=1526' %}
{% set kernel_start_message = 'Welcome to' %}
Device config:
{% extends 'vcu.jinja2' %}
{% set connection_command = 'telnet lava-disp-1.local 7000' %} {% set power_on_command = 'relay-ctrl --relay 1 --state on' %} {% set power_off_command = 'relay-ctrl --relay 1 --state off' %} {% set hard_reset_command = 'relay-ctrl --relay 1 --toggle' %}
Boot action block from job definition:
- boot: timeout: minutes: 5 method: u-boot commands: ramdisk auto_login: login_prompt: 'am335x-nmhw21 login: ' username: root prompts: - 'fct@am335x-nmhw21:~# '
Have I misconfigured something? What I’m missing? Thanks!
Best regards, Andrejs Cainikovs.
On Thu, 29 Nov 2018 at 14:42, Andrejs Cainikovs Andrejs.Cainikovs@netmodule.com wrote:
Hi everyone,
I’m facing an issue when U-Boot commands are sent one after another without waiting for a prompt. Obviously, device is not able to boot.
Excerpt from logs:
(Much better to attach the complete log as downloaded from the menu on the job detail page, it includes timestamps and other useful content.) I'm afraid the content you've pasted into your original doesn't help. Please attach the complete test job log file, don't paste it into the body of the email where it will be mangled.
What version of lava-server and lava-dispatcher are you running?
One thing would be to obtain a U-Boot device which is already available on staging.validation.linaro.org or another instance and known to be working, then borrow the config from that. If you have something like that available already, it'll make things easier.
As you see, all commands that were sent between `dhcp` command and until it was able to complete or fail are simply dropped.
Device type config:
{% extends 'base-uboot.jinja2' %}
{% set device_type = "vcu" %}
This line isn't required and a couple of other lines are the defaults.
{% set console_device = 'ttyS2' %}
{% set baud_rate = 115200 %}
{% set interrupt_prompt = 'Press s to abort autoboot' %}
{% set interrupt_char = 's' %}
{% set bootloader_prompt = '=>' %}
{% set uboot_mkimage_arch = 'arm' %}
{% set bootz_kernel_addr = '0x80000000' %}
{% set bootz_ramdisk_addr = '0x83000000' %}
{% set bootz_dtb_addr = '0x82000000' %}
{% set extra_kernel_args = 'ti_cpsw.rx_packet_max=1526' %}
{% set kernel_start_message = 'Welcome to' %}
Device config:
{% extends 'vcu.jinja2' %}
{% set connection_command = 'telnet lava-disp-1.local 7000' %}
{% set power_on_command = 'relay-ctrl --relay 1 --state on' %}
{% set power_off_command = 'relay-ctrl --relay 1 --state off' %}
{% set hard_reset_command = 'relay-ctrl --relay 1 --toggle' %}
Boot action block from job definition:
boot:
timeout:
minutes: 5
method: u-boot
commands: ramdisk
auto_login:
login_prompt: 'am335x-nmhw21 login: ' username: root
prompts:
- 'fct@am335x-nmhw21:~# '
Have I misconfigured something? What I’m missing? Thanks!
Hi Neil,
On 29/11/2018 15:58, Neil Williams wrote:
On Thu, 29 Nov 2018 at 14:42, Andrejs Cainikovs Andrejs.Cainikovs@netmodule.com wrote:
Hi everyone,
I’m facing an issue when U-Boot commands are sent one after another without waiting for a prompt. Obviously, device is not able to boot.
Excerpt from logs:
(Much better to attach the complete log as downloaded from the menu on the job detail page, it includes timestamps and other useful content.) I'm afraid the content you've pasted into your original doesn't help. Please attach the complete test job log file, don't paste it into the body of the email where it will be mangled.
Complete log attached.
What version of lava-server and lava-dispatcher are you running?
Latest from stretch-backports: 2018.11-1~bpo9+1
One thing would be to obtain a U-Boot device which is already available on staging.validation.linaro.org or another instance and known to be working, then borrow the config from that. If you have something like that available already, it'll make things easier.
Thanks for a hint, I'll have a look into this.
As you see, all commands that were sent between `dhcp` command and until it was able to complete or fail are simply dropped.
Device type config:
{% extends 'base-uboot.jinja2' %}
{% set device_type = "vcu" %}
This line isn't required and a couple of other lines are the defaults.
{% set console_device = 'ttyS2' %}
{% set baud_rate = 115200 %}
{% set interrupt_prompt = 'Press s to abort autoboot' %}
{% set interrupt_char = 's' %}
{% set bootloader_prompt = '=>' %}
{% set uboot_mkimage_arch = 'arm' %}
{% set bootz_kernel_addr = '0x80000000' %}
{% set bootz_ramdisk_addr = '0x83000000' %}
{% set bootz_dtb_addr = '0x82000000' %}
{% set extra_kernel_args = 'ti_cpsw.rx_packet_max=1526' %}
{% set kernel_start_message = 'Welcome to' %}
Device config:
{% extends 'vcu.jinja2' %}
{% set connection_command = 'telnet lava-disp-1.local 7000' %}
{% set power_on_command = 'relay-ctrl --relay 1 --state on' %}
{% set power_off_command = 'relay-ctrl --relay 1 --state off' %}
{% set hard_reset_command = 'relay-ctrl --relay 1 --toggle' %}
Boot action block from job definition:
boot:
timeout:
minutes: 5
method: u-boot
commands: ramdisk
auto_login:
login_prompt: 'am335x-nmhw21 login: ' username: root
prompts:
- 'fct@am335x-nmhw21:~# '
Have I misconfigured something? What I’m missing? Thanks!
Hi,
TWIMC: I got it working. However, I have noticed that last command in command sequence (at least for U-Boot) never waits for command prompt, so I had to update base-uboot.jinja2 to get past this issue.
Now, when I'm looking at this patch which I prepared yesterday - I'm not sure whether it was indeed issue with last command in a sequence, or with `dhcp` command processing (there must be a reason why I moved `setenv serverip` to top, but I'm lazy to check. Just giving it back to community as it is in case someone will face something similar.
Also, attaching updated device type config.
$ diff base-uboot.jinja2 base-uboot.jinja2.backup
19,21c19,20 < " - setenv serverip {SERVER_IP} < - " + uboot_ipaddr_cmd|default('dhcp') + " < - sleep 1" ) -%} ---
" - " + uboot_ipaddr_cmd|default('dhcp') + " - setenv serverip {SERVER_IP}" ) -%}
Best regards, Andrejs Cainikovs.
On 29/11/2018 16:38, Andrejs Cainikovs wrote:
Hi Neil,
On 29/11/2018 15:58, Neil Williams wrote:
On Thu, 29 Nov 2018 at 14:42, Andrejs Cainikovs Andrejs.Cainikovs@netmodule.com wrote:
Hi everyone,
I’m facing an issue when U-Boot commands are sent one after another
without waiting for a prompt. Obviously, device is not able to boot.
Excerpt from logs:
(Much better to attach the complete log as downloaded from the menu on the job detail page, it includes timestamps and other useful content.) I'm afraid the content you've pasted into your original doesn't help. Please attach the complete test job log file, don't paste it into the body of the email where it will be mangled.
Complete log attached.
What version of lava-server and lava-dispatcher are you running?
Latest from stretch-backports: 2018.11-1~bpo9+1
One thing would be to obtain a U-Boot device which is already available on staging.validation.linaro.org or another instance and known to be working, then borrow the config from that. If you have something like that available already, it'll make things easier.
Thanks for a hint, I'll have a look into this.
As you see, all commands that were sent between `dhcp` command and
until it was able to complete or fail are simply dropped.
Device type config:
{% extends 'base-uboot.jinja2' %}
{% set device_type = "vcu" %}
This line isn't required and a couple of other lines are the defaults.
{% set console_device = 'ttyS2' %}
{% set baud_rate = 115200 %}
{% set interrupt_prompt = 'Press s to abort autoboot' %}
{% set interrupt_char = 's' %}
{% set bootloader_prompt = '=>' %}
{% set uboot_mkimage_arch = 'arm' %}
{% set bootz_kernel_addr = '0x80000000' %}
{% set bootz_ramdisk_addr = '0x83000000' %}
{% set bootz_dtb_addr = '0x82000000' %}
{% set extra_kernel_args = 'ti_cpsw.rx_packet_max=1526' %}
{% set kernel_start_message = 'Welcome to' %}
Device config:
{% extends 'vcu.jinja2' %}
{% set connection_command = 'telnet lava-disp-1.local 7000' %}
{% set power_on_command = 'relay-ctrl --relay 1 --state on' %}
{% set power_off_command = 'relay-ctrl --relay 1 --state off' %}
{% set hard_reset_command = 'relay-ctrl --relay 1 --toggle' %}
Boot action block from job definition:
boot:
timeout:
minutes: 5
method: u-boot
commands: ramdisk
auto_login:
login_prompt: 'am335x-nmhw21 login: ' username: root
prompts:
- 'fct@am335x-nmhw21:~# '
Have I misconfigured something? What I’m missing? Thanks!
On Tue, 4 Dec 2018 at 09:37, Andrejs Cainikovs Andrejs.Cainikovs@netmodule.com wrote:
Hi,
TWIMC: I got it working. However, I have noticed that last command in command sequence (at least for U-Boot) never waits for command prompt, so I had to update base-uboot.jinja2 to get past this issue.
Compare with : https://staging.validation.linaro.org/scheduler/job/245806
The last command issued to U-Boot should be boot and that is not expected to ever get to a U-Boot prompt. Yet LAVA does check for a U-Boot prompt occurring after that command, along with other failure indicators like Must RESET board to recover
Please do more investigation on your end to work out if the problem is correctly fixed.
Now, when I'm looking at this patch which I prepared yesterday - I'm not sure whether it was indeed issue with last command in a sequence, or with `dhcp` command processing (there must be a reason why I moved `setenv serverip` to top, but I'm lazy to check. Just giving it back to community as it is in case someone will face something similar.
Please take the time to check. When vcu.jinja2 is properly debugged, a contribution to upstream (with supporting unit test) would be very welcome.
https://master.lavasoftware.org/static/docs/v2/contribution-intro.html
Also, attaching updated device type config.
$ diff base-uboot.jinja2 base-uboot.jinja2.backup
19,21c19,20 < " - setenv serverip {SERVER_IP} < - " + uboot_ipaddr_cmd|default('dhcp') + "
< - sleep 1" ) -%}
" - " + uboot_ipaddr_cmd|default('dhcp') + " - setenv serverip {SERVER_IP}" ) -%}
This doesn't need a change to base-uboot.jinja2 - this can be done in the device-type which extends base-uboot.jinja2, supported by a unit test.
Best regards, Andrejs Cainikovs.
On 29/11/2018 16:38, Andrejs Cainikovs wrote:
Hi Neil,
On 29/11/2018 15:58, Neil Williams wrote:
On Thu, 29 Nov 2018 at 14:42, Andrejs Cainikovs Andrejs.Cainikovs@netmodule.com wrote:
Hi everyone,
I’m facing an issue when U-Boot commands are sent one after another
without waiting for a prompt. Obviously, device is not able to boot.
Excerpt from logs:
(Much better to attach the complete log as downloaded from the menu on the job detail page, it includes timestamps and other useful content.) I'm afraid the content you've pasted into your original doesn't help. Please attach the complete test job log file, don't paste it into the body of the email where it will be mangled.
Complete log attached.
What version of lava-server and lava-dispatcher are you running?
Latest from stretch-backports: 2018.11-1~bpo9+1
One thing would be to obtain a U-Boot device which is already available on staging.validation.linaro.org or another instance and known to be working, then borrow the config from that. If you have something like that available already, it'll make things easier.
Thanks for a hint, I'll have a look into this.
As you see, all commands that were sent between `dhcp` command and
until it was able to complete or fail are simply dropped.
Device type config:
{% extends 'base-uboot.jinja2' %}
{% set device_type = "vcu" %}
This line isn't required and a couple of other lines are the defaults.
{% set console_device = 'ttyS2' %}
{% set baud_rate = 115200 %}
{% set interrupt_prompt = 'Press s to abort autoboot' %}
{% set interrupt_char = 's' %}
{% set bootloader_prompt = '=>' %}
{% set uboot_mkimage_arch = 'arm' %}
{% set bootz_kernel_addr = '0x80000000' %}
{% set bootz_ramdisk_addr = '0x83000000' %}
{% set bootz_dtb_addr = '0x82000000' %}
{% set extra_kernel_args = 'ti_cpsw.rx_packet_max=1526' %}
{% set kernel_start_message = 'Welcome to' %}
Device config:
{% extends 'vcu.jinja2' %}
{% set connection_command = 'telnet lava-disp-1.local 7000' %}
{% set power_on_command = 'relay-ctrl --relay 1 --state on' %}
{% set power_off_command = 'relay-ctrl --relay 1 --state off' %}
{% set hard_reset_command = 'relay-ctrl --relay 1 --toggle' %}
Boot action block from job definition:
boot:
timeout:
minutes: 5
method: u-boot
commands: ramdisk
auto_login:
login_prompt: 'am335x-nmhw21 login: ' username: root
prompts:
- 'fct@am335x-nmhw21:~# '
Have I misconfigured something? What I’m missing? Thanks!
Hi Neil,
Apart from inlined answers below, job configuration and full log is attached.
On 04/12/2018 11:02, Neil Williams wrote:
On Tue, 4 Dec 2018 at 09:37, Andrejs Cainikovs Andrejs.Cainikovs@netmodule.com wrote:
Hi,
TWIMC: I got it working. However, I have noticed that last command in command sequence (at least for U-Boot) never waits for command prompt, so I had to update base-uboot.jinja2 to get past this issue.
Compare with : https://staging.validation.linaro.org/scheduler/job/245806
The last command issued to U-Boot should be boot and that is not expected to ever get to a U-Boot prompt. Yet LAVA does check for a U-Boot prompt occurring after that command, along with other failure indicators like Must RESET board to recover
You are, of course, right. I was not clear enough, and by last command I meant something else, which we can skip now. Obviously, my aim is to boot the DUT past the U-Boot into Linux, and then run some tests.
Please do more investigation on your end to work out if the problem is correctly fixed.
Now, when I'm looking at this patch which I prepared yesterday - I'm not sure whether it was indeed issue with last command in a sequence, or with `dhcp` command processing (there must be a reason why I moved `setenv serverip` to top, but I'm lazy to check. Just giving it back to community as it is in case someone will face something similar.
Please take the time to check. When vcu.jinja2 is properly debugged, a contribution to upstream (with supporting unit test) would be very welcome.
I have put some extra effort into debugging, and got working scenario without changing base-uboot.jinja2. I did this by setting `uboot_ipaddr_cmd` in device type template:
{% set uboot_ipaddr_cmd = 'sleep 5; setenv serverip {SERVER_IP}; dhcp' %}
However, I still realize this is a hack, and looking into the log I might be catching something:
- {"dt": "2018-12-04T12:47:23.731495", "lvl": "debug", "msg": "bootloader-commands: Wait for prompt ['=>', 'Resetting CPU', 'Must RESET board to recover', 'TIMEOUT', 'Retry count exceeded', 'ERROR: The remote end did not respond in time.', 'Retry time exceeded'] (timeout 00:04:56)"} - {"dt": "2018-12-04T12:47:23.746562", "lvl": "input", "msg": "sleep 5; setenv serverip 172.17.1.189; dhcp\n"} - {"dt": "2018-12-04T12:47:23.852418", "lvl": "target", "msg": "sleep 5; setenv serverip 172.17.1.189; dhcp"} - {"dt": "2018-12-04T12:47:23.857183", "lvl": "debug", "msg": "bootloader-commands: Wait for prompt ['=>', 'Resetting CPU', 'Must RESET board to recover', 'TIMEOUT', 'Retry count exceeded', 'ERROR: The remote end did not respond in time.', 'Retry time exceeded'] (timeout 00:04:56)"} - {"dt": "2018-12-04T12:47:23.862239", "lvl": "target", "msg": "setenv autoload no"} - {"dt": "2018-12-04T12:47:23.866570", "lvl": "input", "msg": "setenv serverip 172.17.1.189\n"}
It looks like prompt waiting is skipped in this particular block! Why?
https://master.lavasoftware.org/static/docs/v2/contribution-intro.html
Also, attaching updated device type config.
$ diff base-uboot.jinja2 base-uboot.jinja2.backup
19,21c19,20 < " - setenv serverip {SERVER_IP} < - " + uboot_ipaddr_cmd|default('dhcp') + "
< - sleep 1" ) -%}
" - " + uboot_ipaddr_cmd|default('dhcp') + " - setenv serverip {SERVER_IP}" ) -%}
This doesn't need a change to base-uboot.jinja2 - this can be done in the device-type which extends base-uboot.jinja2, supported by a unit test.
No, that's not true. `base_uboot_dhcp_command` variable from base-uboot.jinja2 is not overrideable by device type templates. Or did I missed something?
Best regards, Andrejs Cainikovs.
On 29/11/2018 16:38, Andrejs Cainikovs wrote:
Hi Neil,
On 29/11/2018 15:58, Neil Williams wrote:
On Thu, 29 Nov 2018 at 14:42, Andrejs Cainikovs Andrejs.Cainikovs@netmodule.com wrote:
Hi everyone,
I’m facing an issue when U-Boot commands are sent one after another
without waiting for a prompt. Obviously, device is not able to boot.
Excerpt from logs:
(Much better to attach the complete log as downloaded from the menu on the job detail page, it includes timestamps and other useful content.) I'm afraid the content you've pasted into your original doesn't help. Please attach the complete test job log file, don't paste it into the body of the email where it will be mangled.
Complete log attached.
What version of lava-server and lava-dispatcher are you running?
Latest from stretch-backports: 2018.11-1~bpo9+1
One thing would be to obtain a U-Boot device which is already available on staging.validation.linaro.org or another instance and known to be working, then borrow the config from that. If you have something like that available already, it'll make things easier.
Thanks for a hint, I'll have a look into this.
As you see, all commands that were sent between `dhcp` command and
until it was able to complete or fail are simply dropped.
Device type config:
{% extends 'base-uboot.jinja2' %}
{% set device_type = "vcu" %}
This line isn't required and a couple of other lines are the defaults.
{% set console_device = 'ttyS2' %}
{% set baud_rate = 115200 %}
{% set interrupt_prompt = 'Press s to abort autoboot' %}
{% set interrupt_char = 's' %}
{% set bootloader_prompt = '=>' %}
{% set uboot_mkimage_arch = 'arm' %}
{% set bootz_kernel_addr = '0x80000000' %}
{% set bootz_ramdisk_addr = '0x83000000' %}
{% set bootz_dtb_addr = '0x82000000' %}
{% set extra_kernel_args = 'ti_cpsw.rx_packet_max=1526' %}
{% set kernel_start_message = 'Welcome to' %}
Device config:
{% extends 'vcu.jinja2' %}
{% set connection_command = 'telnet lava-disp-1.local 7000' %}
{% set power_on_command = 'relay-ctrl --relay 1 --state on' %}
{% set power_off_command = 'relay-ctrl --relay 1 --state off' %}
{% set hard_reset_command = 'relay-ctrl --relay 1 --toggle' %}
Boot action block from job definition:
boot:
timeout:
minutes: 5
method: u-boot
commands: ramdisk
auto_login:
login_prompt: 'am335x-nmhw21 login: ' username: root
prompts:
- 'fct@am335x-nmhw21:~# '
Have I misconfigured something? What I’m missing? Thanks!
On Tue, 4 Dec 2018 at 13:31, Andrejs Cainikovs Andrejs.Cainikovs@netmodule.com wrote:
Hi Neil,
Apart from inlined answers below, job configuration and full log is attached.
On 04/12/2018 11:02, Neil Williams wrote:
On Tue, 4 Dec 2018 at 09:37, Andrejs Cainikovs Andrejs.Cainikovs@netmodule.com wrote:
Hi,
TWIMC: I got it working. However, I have noticed that last command in command sequence (at least for U-Boot) never waits for command prompt, so I had to update base-uboot.jinja2 to get past this issue.
Compare with : https://staging.validation.linaro.org/scheduler/job/245806
The last command issued to U-Boot should be boot and that is not expected to ever get to a U-Boot prompt. Yet LAVA does check for a U-Boot prompt occurring after that command, along with other failure indicators like Must RESET board to recover
You are, of course, right. I was not clear enough, and by last command I meant something else, which we can skip now. Obviously, my aim is to boot the DUT past the U-Boot into Linux, and then run some tests.
Please do more investigation on your end to work out if the problem is correctly fixed.
Now, when I'm looking at this patch which I prepared yesterday - I'm not sure whether it was indeed issue with last command in a sequence, or with `dhcp` command processing (there must be a reason why I moved `setenv serverip` to top, but I'm lazy to check. Just giving it back to community as it is in case someone will face something similar.
Please take the time to check. When vcu.jinja2 is properly debugged, a contribution to upstream (with supporting unit test) would be very welcome.
I have put some extra effort into debugging, and got working scenario without changing base-uboot.jinja2. I did this by setting `uboot_ipaddr_cmd` in device type template:
{% set uboot_ipaddr_cmd = 'sleep 5; setenv serverip {SERVER_IP}; dhcp' %}
I think this might be part of the problem. The U-Boot shell will not emit a prompt between commands if those commands are sent as single line separated by a semi-colon. These should be separate lines in the U-boot commands. You shouldn't need the sleep.
However, I still realize this is a hack, and looking into the log I might be catching something:
- {"dt": "2018-12-04T12:47:23.731495", "lvl": "debug", "msg":
"bootloader-commands: Wait for prompt ['=>', 'Resetting CPU', 'Must RESET board to recover', 'TIMEOUT', 'Retry count exceeded', 'ERROR: The remote end did not respond in time.', 'Retry time exceeded'] (timeout 00:04:56)"}
- {"dt": "2018-12-04T12:47:23.746562", "lvl": "input", "msg": "sleep 5;
setenv serverip 172.17.1.189; dhcp\n"}
- {"dt": "2018-12-04T12:47:23.852418", "lvl": "target", "msg": "sleep 5;
setenv serverip 172.17.1.189; dhcp"}
- {"dt": "2018-12-04T12:47:23.857183", "lvl": "debug", "msg":
"bootloader-commands: Wait for prompt ['=>', 'Resetting CPU', 'Must RESET board to recover', 'TIMEOUT', 'Retry count exceeded', 'ERROR: The remote end did not respond in time.', 'Retry time exceeded'] (timeout 00:04:56)"}
- {"dt": "2018-12-04T12:47:23.862239", "lvl": "target", "msg": "setenv
autoload no"}
- {"dt": "2018-12-04T12:47:23.866570", "lvl": "input", "msg": "setenv
serverip 172.17.1.189\n"}
It looks like prompt waiting is skipped in this particular block! Why?
https://master.lavasoftware.org/static/docs/v2/contribution-intro.html
Also, attaching updated device type config.
$ diff base-uboot.jinja2 base-uboot.jinja2.backup
19,21c19,20 < " - setenv serverip {SERVER_IP} < - " + uboot_ipaddr_cmd|default('dhcp') + "
< - sleep 1" ) -%}
" - " + uboot_ipaddr_cmd|default('dhcp') + " - setenv serverip {SERVER_IP}" ) -%}
This doesn't need a change to base-uboot.jinja2 - this can be done in the device-type which extends base-uboot.jinja2, supported by a unit test.
No, that's not true. `base_uboot_dhcp_command` variable from base-uboot.jinja2 is not overrideable by device type templates. Or did I missed something?
OK, the base-boot.jinja2 could do with override support to allow for that.
I'll have a look at your test job log file and see how to both remove the sleep and provide the override.
Best regards, Andrejs Cainikovs.
On 29/11/2018 16:38, Andrejs Cainikovs wrote:
Hi Neil,
On 29/11/2018 15:58, Neil Williams wrote:
On Thu, 29 Nov 2018 at 14:42, Andrejs Cainikovs Andrejs.Cainikovs@netmodule.com wrote:
Hi everyone,
I’m facing an issue when U-Boot commands are sent one after another
without waiting for a prompt. Obviously, device is not able to boot.
Excerpt from logs:
(Much better to attach the complete log as downloaded from the menu on the job detail page, it includes timestamps and other useful content.) I'm afraid the content you've pasted into your original doesn't help. Please attach the complete test job log file, don't paste it into the body of the email where it will be mangled.
Complete log attached.
What version of lava-server and lava-dispatcher are you running?
Latest from stretch-backports: 2018.11-1~bpo9+1
One thing would be to obtain a U-Boot device which is already available on staging.validation.linaro.org or another instance and known to be working, then borrow the config from that. If you have something like that available already, it'll make things easier.
Thanks for a hint, I'll have a look into this.
As you see, all commands that were sent between `dhcp` command and
until it was able to complete or fail are simply dropped.
Device type config:
{% extends 'base-uboot.jinja2' %}
{% set device_type = "vcu" %}
This line isn't required and a couple of other lines are the defaults.
{% set console_device = 'ttyS2' %}
{% set baud_rate = 115200 %}
{% set interrupt_prompt = 'Press s to abort autoboot' %}
{% set interrupt_char = 's' %}
{% set bootloader_prompt = '=>' %}
{% set uboot_mkimage_arch = 'arm' %}
{% set bootz_kernel_addr = '0x80000000' %}
{% set bootz_ramdisk_addr = '0x83000000' %}
{% set bootz_dtb_addr = '0x82000000' %}
{% set extra_kernel_args = 'ti_cpsw.rx_packet_max=1526' %}
{% set kernel_start_message = 'Welcome to' %}
Device config:
{% extends 'vcu.jinja2' %}
{% set connection_command = 'telnet lava-disp-1.local 7000' %}
{% set power_on_command = 'relay-ctrl --relay 1 --state on' %}
{% set power_off_command = 'relay-ctrl --relay 1 --state off' %}
{% set hard_reset_command = 'relay-ctrl --relay 1 --toggle' %}
Boot action block from job definition:
boot:
timeout:
minutes: 5
method: u-boot
commands: ramdisk
auto_login:
login_prompt: 'am335x-nmhw21 login: ' username: root
prompts:
- 'fct@am335x-nmhw21:~# '
Have I misconfigured something? What I’m missing? Thanks!
On 04/12/2018 14:41, Neil Williams wrote:
On Tue, 4 Dec 2018 at 13:31, Andrejs Cainikovs Andrejs.Cainikovs@netmodule.com wrote:
Hi Neil,
Apart from inlined answers below, job configuration and full log is attached.
On 04/12/2018 11:02, Neil Williams wrote:
On Tue, 4 Dec 2018 at 09:37, Andrejs Cainikovs Andrejs.Cainikovs@netmodule.com wrote:
Hi,
TWIMC: I got it working. However, I have noticed that last command in command sequence (at least for U-Boot) never waits for command prompt, so I had to update base-uboot.jinja2 to get past this issue.
Compare with : https://staging.validation.linaro.org/scheduler/job/245806
The last command issued to U-Boot should be boot and that is not expected to ever get to a U-Boot prompt. Yet LAVA does check for a U-Boot prompt occurring after that command, along with other failure indicators like Must RESET board to recover
You are, of course, right. I was not clear enough, and by last command I meant something else, which we can skip now. Obviously, my aim is to boot the DUT past the U-Boot into Linux, and then run some tests.
Please do more investigation on your end to work out if the problem is correctly fixed.
Now, when I'm looking at this patch which I prepared yesterday - I'm not sure whether it was indeed issue with last command in a sequence, or with `dhcp` command processing (there must be a reason why I moved `setenv serverip` to top, but I'm lazy to check. Just giving it back to community as it is in case someone will face something similar.
Please take the time to check. When vcu.jinja2 is properly debugged, a contribution to upstream (with supporting unit test) would be very welcome.
I have put some extra effort into debugging, and got working scenario without changing base-uboot.jinja2. I did this by setting `uboot_ipaddr_cmd` in device type template:
{% set uboot_ipaddr_cmd = 'sleep 5; setenv serverip {SERVER_IP}; dhcp' %}
I think this might be part of the problem. The U-Boot shell will not emit a prompt between commands if those commands are sent as single line separated by a semi-colon. These should be separate lines in the U-boot commands. You shouldn't need the sleep.
I need that sleep due to environment specifics, it's not a part of a workaround here. I have tried just now to move the sleep into relay control script, removing `uboot_ipaddr_cmd` from device type config at all, and I hit the original problem again: `setenv serverip` is ignored by U-Boot because it was sent while getting DHCP address without waiting for command prompt. Log attached.
However, I still realize this is a hack, and looking into the log I might be catching something:
- {"dt": "2018-12-04T12:47:23.731495", "lvl": "debug", "msg":
"bootloader-commands: Wait for prompt ['=>', 'Resetting CPU', 'Must RESET board to recover', 'TIMEOUT', 'Retry count exceeded', 'ERROR: The remote end did not respond in time.', 'Retry time exceeded'] (timeout 00:04:56)"}
- {"dt": "2018-12-04T12:47:23.746562", "lvl": "input", "msg": "sleep 5;
setenv serverip 172.17.1.189; dhcp\n"}
- {"dt": "2018-12-04T12:47:23.852418", "lvl": "target", "msg": "sleep 5;
setenv serverip 172.17.1.189; dhcp"}
- {"dt": "2018-12-04T12:47:23.857183", "lvl": "debug", "msg":
"bootloader-commands: Wait for prompt ['=>', 'Resetting CPU', 'Must RESET board to recover', 'TIMEOUT', 'Retry count exceeded', 'ERROR: The remote end did not respond in time.', 'Retry time exceeded'] (timeout 00:04:56)"}
- {"dt": "2018-12-04T12:47:23.862239", "lvl": "target", "msg": "setenv
autoload no"}
- {"dt": "2018-12-04T12:47:23.866570", "lvl": "input", "msg": "setenv
serverip 172.17.1.189\n"}
It looks like prompt waiting is skipped in this particular block! Why?
https://master.lavasoftware.org/static/docs/v2/contribution-intro.html
Also, attaching updated device type config.
$ diff base-uboot.jinja2 base-uboot.jinja2.backup
19,21c19,20 < " - setenv serverip {SERVER_IP} < - " + uboot_ipaddr_cmd|default('dhcp') + "
< - sleep 1" ) -%}
" - " + uboot_ipaddr_cmd|default('dhcp') + " - setenv serverip {SERVER_IP}" ) -%}
This doesn't need a change to base-uboot.jinja2 - this can be done in the device-type which extends base-uboot.jinja2, supported by a unit test.
No, that's not true. `base_uboot_dhcp_command` variable from base-uboot.jinja2 is not overrideable by device type templates. Or did I missed something?
OK, the base-boot.jinja2 could do with override support to allow for that.
I'll have a look at your test job log file and see how to both remove the sleep and provide the override.
Best regards, Andrejs Cainikovs.
On 29/11/2018 16:38, Andrejs Cainikovs wrote:
Hi Neil,
On 29/11/2018 15:58, Neil Williams wrote:
On Thu, 29 Nov 2018 at 14:42, Andrejs Cainikovs Andrejs.Cainikovs@netmodule.com wrote: > Hi everyone, > > > > I’m facing an issue when U-Boot commands are sent one after another
without waiting for a prompt. Obviously, device is not able to boot.
> > > > Excerpt from logs: (Much better to attach the complete log as downloaded from the menu on the job detail page, it includes timestamps and other useful content.) I'm afraid the content you've pasted into your original doesn't help. Please attach the complete test job log file, don't paste it into the body of the email where it will be mangled.
Complete log attached.
What version of lava-server and lava-dispatcher are you running?
Latest from stretch-backports: 2018.11-1~bpo9+1
One thing would be to obtain a U-Boot device which is already available on staging.validation.linaro.org or another instance and known to be working, then borrow the config from that. If you have something like that available already, it'll make things easier.
Thanks for a hint, I'll have a look into this.
> > > As you see, all commands that were sent between `dhcp` command and
until it was able to complete or fail are simply dropped.
> > > > Device type config: > > > > {% extends 'base-uboot.jinja2' %} > > > > {% set device_type = "vcu" %} This line isn't required and a couple of other lines are the defaults.
> > {% set console_device = 'ttyS2' %} > > {% set baud_rate = 115200 %} > > {% set interrupt_prompt = 'Press s to abort autoboot' %} > > {% set interrupt_char = 's' %} > > {% set bootloader_prompt = '=>' %} > > > > {% set uboot_mkimage_arch = 'arm' %} > > > > {% set bootz_kernel_addr = '0x80000000' %} > > {% set bootz_ramdisk_addr = '0x83000000' %} > > {% set bootz_dtb_addr = '0x82000000' %} > > > > {% set extra_kernel_args = 'ti_cpsw.rx_packet_max=1526' %} > > > > {% set kernel_start_message = 'Welcome to' %} > > > > Device config: > > > > {% extends 'vcu.jinja2' %} > > > > {% set connection_command = 'telnet lava-disp-1.local 7000' %} > > {% set power_on_command = 'relay-ctrl --relay 1 --state on' %} > > {% set power_off_command = 'relay-ctrl --relay 1 --state off' %} > > {% set hard_reset_command = 'relay-ctrl --relay 1 --toggle' %} > > > > Boot action block from job definition: > > > > - boot: > > timeout: > > minutes: 5 > > method: u-boot > > commands: ramdisk > > auto_login: > > login_prompt: 'am335x-nmhw21 login: ' > > username: root > > prompts: > > - 'fct@am335x-nmhw21:~# ' > > > > Have I misconfigured something? What I’m missing? Thanks!
On Tue, 4 Dec 2018 at 14:31, Andrejs Cainikovs Andrejs.Cainikovs@netmodule.com wrote:
On 04/12/2018 14:41, Neil Williams wrote:
On Tue, 4 Dec 2018 at 13:31, Andrejs Cainikovs Andrejs.Cainikovs@netmodule.com wrote:
Hi Neil,
Apart from inlined answers below, job configuration and full log is attached.
On 04/12/2018 11:02, Neil Williams wrote:
On Tue, 4 Dec 2018 at 09:37, Andrejs Cainikovs Andrejs.Cainikovs@netmodule.com wrote:
Hi,
TWIMC: I got it working. However, I have noticed that last command in command sequence (at least for U-Boot) never waits for command prompt, so I had to update base-uboot.jinja2 to get past this issue.
Compare with : https://staging.validation.linaro.org/scheduler/job/245806
See https://staging.validation.linaro.org/scheduler/job/245785#L314
Sending with 10 millisecond of delay
https://staging.validation.linaro.org/scheduler/device/staging-black01/devic...
https://git.lavasoftware.org/lava/lava/blob/master/lava_scheduler_app/tests/...
Rather than using sleep, it might just be that automation is firing the characters at the serial connection faster than a human can type and therefore revealing a bug in the hardware which cannot cope. Setting a character delay allows LAVA to wait that many milliseconds between each character, replicating what happens with a human.
In your device-type jinja2, you can try: {% set boot_character_delay = 10 %}
The last command issued to U-Boot should be boot and that is not expected to ever get to a U-Boot prompt. Yet LAVA does check for a U-Boot prompt occurring after that command, along with other failure indicators like Must RESET board to recover
You are, of course, right. I was not clear enough, and by last command I meant something else, which we can skip now. Obviously, my aim is to boot the DUT past the U-Boot into Linux, and then run some tests.
Please do more investigation on your end to work out if the problem is correctly fixed.
Now, when I'm looking at this patch which I prepared yesterday - I'm not sure whether it was indeed issue with last command in a sequence, or with `dhcp` command processing (there must be a reason why I moved `setenv serverip` to top, but I'm lazy to check. Just giving it back to community as it is in case someone will face something similar.
Please take the time to check. When vcu.jinja2 is properly debugged, a contribution to upstream (with supporting unit test) would be very welcome.
I have put some extra effort into debugging, and got working scenario without changing base-uboot.jinja2. I did this by setting `uboot_ipaddr_cmd` in device type template:
{% set uboot_ipaddr_cmd = 'sleep 5; setenv serverip {SERVER_IP}; dhcp' %}
I think this might be part of the problem. The U-Boot shell will not emit a prompt between commands if those commands are sent as single line separated by a semi-colon. These should be separate lines in the U-boot commands. You shouldn't need the sleep.
I need that sleep due to environment specifics, it's not a part of a workaround here.
In which case, use Jinja2 comments to clearly explain what is going on here or someone will later "optimise" it.
{# the sleep is needed here because .... #}
Be verbose in that comment, possibly even link to a URL where the full reasoning is explained.
I have tried just now to move the sleep into relay control script, removing `uboot_ipaddr_cmd` from device type config at all, and I hit the original problem again: `setenv serverip` is ignored by U-Boot because it was sent while getting DHCP address without waiting for command prompt.
I fear that the sleep is causing this to work "by mistake" - i.e. accidentally - and that the core problem of this DUT has not been resolved. Sometime, somewhere, someone else is going to come across this and find intermittent failures because the sleep didn't turn out to be long enough.
From job_94.log
- {"dt": "2018-12-04T14:21:35.918610", "lvl": "target", "msg": "=> dhcpsetenv serverip 172.17.1.189"}
That is the core problem. The DUT hasn't processed the single command as a list of commands, it's concatenated.
Log attached.
However, I still realize this is a hack, and looking into the log I might be catching something:
- {"dt": "2018-12-04T12:47:23.731495", "lvl": "debug", "msg":
"bootloader-commands: Wait for prompt ['=>', 'Resetting CPU', 'Must RESET board to recover', 'TIMEOUT', 'Retry count exceeded', 'ERROR: The remote end did not respond in time.', 'Retry time exceeded'] (timeout 00:04:56)"}
- {"dt": "2018-12-04T12:47:23.746562", "lvl": "input", "msg": "sleep 5;
setenv serverip 172.17.1.189; dhcp\n"}
- {"dt": "2018-12-04T12:47:23.852418", "lvl": "target", "msg": "sleep 5;
setenv serverip 172.17.1.189; dhcp"}
- {"dt": "2018-12-04T12:47:23.857183", "lvl": "debug", "msg":
"bootloader-commands: Wait for prompt ['=>', 'Resetting CPU', 'Must RESET board to recover', 'TIMEOUT', 'Retry count exceeded', 'ERROR: The remote end did not respond in time.', 'Retry time exceeded'] (timeout 00:04:56)"}
- {"dt": "2018-12-04T12:47:23.862239", "lvl": "target", "msg": "setenv
autoload no"}
- {"dt": "2018-12-04T12:47:23.866570", "lvl": "input", "msg": "setenv
serverip 172.17.1.189\n"}
It looks like prompt waiting is skipped in this particular block! Why?
https://master.lavasoftware.org/static/docs/v2/contribution-intro.html
Also, attaching updated device type config.
$ diff base-uboot.jinja2 base-uboot.jinja2.backup
19,21c19,20 < " - setenv serverip {SERVER_IP} < - " + uboot_ipaddr_cmd|default('dhcp') + "
< - sleep 1" ) -%}
" - " + uboot_ipaddr_cmd|default('dhcp') + " - setenv serverip {SERVER_IP}" ) -%}
This doesn't need a change to base-uboot.jinja2 - this can be done in the device-type which extends base-uboot.jinja2, supported by a unit test.
No, that's not true. `base_uboot_dhcp_command` variable from base-uboot.jinja2 is not overrideable by device type templates. Or did I missed something?
OK, the base-boot.jinja2 could do with override support to allow for that.
I'll have a look at your test job log file and see how to both remove the sleep and provide the override.
For the override, it's simply a case of defining a variable and using the existing string as a default:
{% set base_uboot_dhcp_command = uboot_dhcp_command|default( " - " + uboot_ipaddr_cmd|default('dhcp') + " - setenv serverip {SERVER_IP}" ) -%}
Then your device-type jinja2 could set uboot_dhcp_command
When it does, it should set a list of commands not a single command using semicolons. Compare with x15.jinja2
Best regards, Andrejs Cainikovs.
On 29/11/2018 16:38, Andrejs Cainikovs wrote:
Hi Neil,
On 29/11/2018 15:58, Neil Williams wrote: > On Thu, 29 Nov 2018 at 14:42, Andrejs Cainikovs > Andrejs.Cainikovs@netmodule.com wrote: >> Hi everyone, >> >> >> >> I’m facing an issue when U-Boot commands are sent one after another without waiting for a prompt. Obviously, device is not able to boot. >> >> >> >> Excerpt from logs: > (Much better to attach the complete log as downloaded from the menu on > the job detail page, it includes timestamps and other useful content.) > I'm afraid the content you've pasted into your original doesn't help. > Please attach the complete test job log file, don't paste it into the > body of the email where it will be mangled.
Complete log attached.
> What version of lava-server and lava-dispatcher are you running?
Latest from stretch-backports: 2018.11-1~bpo9+1
> > One thing would be to obtain a U-Boot device which is already > available on staging.validation.linaro.org or another instance and > known to be working, then borrow the config from that. If you have > something like that available already, it'll make things easier.
Thanks for a hint, I'll have a look into this.
>> >> >> As you see, all commands that were sent between `dhcp` command and until it was able to complete or fail are simply dropped. >> >> >> >> Device type config: >> >> >> >> {% extends 'base-uboot.jinja2' %} >> >> >> >> {% set device_type = "vcu" %} > This line isn't required and a couple of other lines are the defaults. > >> >> {% set console_device = 'ttyS2' %} >> >> {% set baud_rate = 115200 %} >> >> {% set interrupt_prompt = 'Press s to abort autoboot' %} >> >> {% set interrupt_char = 's' %} >> >> {% set bootloader_prompt = '=>' %} >> >> >> >> {% set uboot_mkimage_arch = 'arm' %} >> >> >> >> {% set bootz_kernel_addr = '0x80000000' %} >> >> {% set bootz_ramdisk_addr = '0x83000000' %} >> >> {% set bootz_dtb_addr = '0x82000000' %} >> >> >> >> {% set extra_kernel_args = 'ti_cpsw.rx_packet_max=1526' %} >> >> >> >> {% set kernel_start_message = 'Welcome to' %} >> >> >> >> Device config: >> >> >> >> {% extends 'vcu.jinja2' %} >> >> >> >> {% set connection_command = 'telnet lava-disp-1.local 7000' %} >> >> {% set power_on_command = 'relay-ctrl --relay 1 --state on' %} >> >> {% set power_off_command = 'relay-ctrl --relay 1 --state off' %} >> >> {% set hard_reset_command = 'relay-ctrl --relay 1 --toggle' %} >> >> >> >> Boot action block from job definition: >> >> >> >> - boot: >> >> timeout: >> >> minutes: 5 >> >> method: u-boot >> >> commands: ramdisk >> >> auto_login: >> >> login_prompt: 'am335x-nmhw21 login: ' >> >> username: root >> >> prompts: >> >> - 'fct@am335x-nmhw21:~# ' >> >> >> >> Have I misconfigured something? What I’m missing? Thanks! > > >
On 04/12/2018 15:51, Neil Williams wrote:
On Tue, 4 Dec 2018 at 14:31, Andrejs Cainikovs Andrejs.Cainikovs@netmodule.com wrote:
On 04/12/2018 14:41, Neil Williams wrote:
On Tue, 4 Dec 2018 at 13:31, Andrejs Cainikovs Andrejs.Cainikovs@netmodule.com wrote:
Hi Neil,
Apart from inlined answers below, job configuration and full log is attached.
On 04/12/2018 11:02, Neil Williams wrote:
On Tue, 4 Dec 2018 at 09:37, Andrejs Cainikovs Andrejs.Cainikovs@netmodule.com wrote:
Hi,
TWIMC: I got it working. However, I have noticed that last command in command sequence (at least for U-Boot) never waits for command prompt, so I had to update base-uboot.jinja2 to get past this issue.
Compare with : https://staging.validation.linaro.org/scheduler/job/245806
See https://staging.validation.linaro.org/scheduler/job/245785#L314
Sending with 10 millisecond of delay
https://staging.validation.linaro.org/scheduler/device/staging-black01/devic...
https://git.lavasoftware.org/lava/lava/blob/master/lava_scheduler_app/tests/...
Rather than using sleep, it might just be that automation is firing the characters at the serial connection faster than a human can type and therefore revealing a bug in the hardware which cannot cope. Setting a character delay allows LAVA to wait that many milliseconds between each character, replicating what happens with a human.
In your device-type jinja2, you can try: {% set boot_character_delay = 10 %}
I have played with char delay quite a lot, and it gave me no result.
Also, I wrote that sleep required by environment, and it has nothing to do with serial connection or anything in this discussion. I will expand it: negotiation between or Ethernet switch and our hardware takes few seconds. I need that sleep or dhcp will fail because of timeout.
The last command issued to U-Boot should be boot and that is not expected to ever get to a U-Boot prompt. Yet LAVA does check for a U-Boot prompt occurring after that command, along with other failure indicators like Must RESET board to recover
You are, of course, right. I was not clear enough, and by last command I meant something else, which we can skip now. Obviously, my aim is to boot the DUT past the U-Boot into Linux, and then run some tests.
Please do more investigation on your end to work out if the problem is correctly fixed.
Now, when I'm looking at this patch which I prepared yesterday - I'm not sure whether it was indeed issue with last command in a sequence, or with `dhcp` command processing (there must be a reason why I moved `setenv serverip` to top, but I'm lazy to check. Just giving it back to community as it is in case someone will face something similar.
Please take the time to check. When vcu.jinja2 is properly debugged, a contribution to upstream (with supporting unit test) would be very welcome.
I have put some extra effort into debugging, and got working scenario without changing base-uboot.jinja2. I did this by setting `uboot_ipaddr_cmd` in device type template:
{% set uboot_ipaddr_cmd = 'sleep 5; setenv serverip {SERVER_IP}; dhcp' %}
I think this might be part of the problem. The U-Boot shell will not emit a prompt between commands if those commands are sent as single line separated by a semi-colon. These should be separate lines in the U-boot commands. You shouldn't need the sleep.
I need that sleep due to environment specifics, it's not a part of a workaround here.
In which case, use Jinja2 comments to clearly explain what is going on here or someone will later "optimise" it.
{# the sleep is needed here because .... #}
Be verbose in that comment, possibly even link to a URL where the full reasoning is explained.
Ditto.
I have tried just now to move the sleep into relay control script, removing `uboot_ipaddr_cmd` from device type config at all, and I hit the original problem again: `setenv serverip` is ignored by U-Boot because it was sent while getting DHCP address without waiting for command prompt.
I fear that the sleep is causing this to work "by mistake" - i.e. accidentally - and that the core problem of this DUT has not been resolved. Sometime, somewhere, someone else is going to come across this and find intermittent failures because the sleep didn't turn out to be long enough.
From job_94.log
- {"dt": "2018-12-04T14:21:35.918610", "lvl": "target", "msg": "=>
dhcpsetenv serverip 172.17.1.189"}
That is the core problem. The DUT hasn't processed the single command as a list of commands, it's concatenated.
What it was concatenated in the first place? If there would be a correct prompt processing I would not see this issue at all, right?
Log attached.
However, I still realize this is a hack, and looking into the log I might be catching something:
- {"dt": "2018-12-04T12:47:23.731495", "lvl": "debug", "msg":
"bootloader-commands: Wait for prompt ['=>', 'Resetting CPU', 'Must RESET board to recover', 'TIMEOUT', 'Retry count exceeded', 'ERROR: The remote end did not respond in time.', 'Retry time exceeded'] (timeout 00:04:56)"}
- {"dt": "2018-12-04T12:47:23.746562", "lvl": "input", "msg": "sleep 5;
setenv serverip 172.17.1.189; dhcp\n"}
- {"dt": "2018-12-04T12:47:23.852418", "lvl": "target", "msg": "sleep 5;
setenv serverip 172.17.1.189; dhcp"}
- {"dt": "2018-12-04T12:47:23.857183", "lvl": "debug", "msg":
"bootloader-commands: Wait for prompt ['=>', 'Resetting CPU', 'Must RESET board to recover', 'TIMEOUT', 'Retry count exceeded', 'ERROR: The remote end did not respond in time.', 'Retry time exceeded'] (timeout 00:04:56)"}
- {"dt": "2018-12-04T12:47:23.862239", "lvl": "target", "msg": "setenv
autoload no"}
- {"dt": "2018-12-04T12:47:23.866570", "lvl": "input", "msg": "setenv
serverip 172.17.1.189\n"}
It looks like prompt waiting is skipped in this particular block! Why?
https://master.lavasoftware.org/static/docs/v2/contribution-intro.html
Also, attaching updated device type config.
$ diff base-uboot.jinja2 base-uboot.jinja2.backup
19,21c19,20 < " - setenv serverip {SERVER_IP} < - " + uboot_ipaddr_cmd|default('dhcp') + "
< - sleep 1" ) -%}
> " - " + uboot_ipaddr_cmd|default('dhcp') + " > - setenv serverip {SERVER_IP}" ) -%}
This doesn't need a change to base-uboot.jinja2 - this can be done in the device-type which extends base-uboot.jinja2, supported by a unit test.
No, that's not true. `base_uboot_dhcp_command` variable from base-uboot.jinja2 is not overrideable by device type templates. Or did I missed something?
OK, the base-boot.jinja2 could do with override support to allow for that.
I'll have a look at your test job log file and see how to both remove the sleep and provide the override.
For the override, it's simply a case of defining a variable and using the existing string as a default:
{% set base_uboot_dhcp_command = uboot_dhcp_command|default( " - " + uboot_ipaddr_cmd|default('dhcp') +" - setenv serverip {SERVER_IP}" ) -%}
Then your device-type jinja2 could set uboot_dhcp_command
In latest official release (2018.11-1~bpo9+1) it's:
{% set base_uboot_dhcp_command = ( " - " + uboot_ipaddr_cmd|default('dhcp') + " - setenv serverip {SERVER_IP}" ) -%}
Which does not allow me to override `base_uboot_dhcp_command` from device type config.
When it does, it should set a list of commands not a single command using semicolons. Compare with x15.jinja2
Thanks, but we're not using fastboot, we're booting via tftp from Jenkins. If I strip all fastboot related configuration from x15.jinja2, it will be basically same configuration as I have now.
Best regards, Andrejs Cainikovs.
On 29/11/2018 16:38, Andrejs Cainikovs wrote: > Hi Neil, > > On 29/11/2018 15:58, Neil Williams wrote: > > On Thu, 29 Nov 2018 at 14:42, Andrejs Cainikovs > > Andrejs.Cainikovs@netmodule.com wrote: > >> Hi everyone, > >> > >> > >> > >> I’m facing an issue when U-Boot commands are sent one after another > without waiting for a prompt. Obviously, device is not able to boot. > >> > >> > >> > >> Excerpt from logs: > > (Much better to attach the complete log as downloaded from the menu on > > the job detail page, it includes timestamps and other useful content.) > > I'm afraid the content you've pasted into your original doesn't help. > > Please attach the complete test job log file, don't paste it into the > > body of the email where it will be mangled. > > Complete log attached. > > > What version of lava-server and lava-dispatcher are you running? > > Latest from stretch-backports: 2018.11-1~bpo9+1 > > > > > One thing would be to obtain a U-Boot device which is already > > available on staging.validation.linaro.org or another instance and > > known to be working, then borrow the config from that. If you have > > something like that available already, it'll make things easier. > > Thanks for a hint, I'll have a look into this. > > >> > >> > >> As you see, all commands that were sent between `dhcp` command and > until it was able to complete or fail are simply dropped. > >> > >> > >> > >> Device type config: > >> > >> > >> > >> {% extends 'base-uboot.jinja2' %} > >> > >> > >> > >> {% set device_type = "vcu" %} > > This line isn't required and a couple of other lines are the defaults. > > > >> > >> {% set console_device = 'ttyS2' %} > >> > >> {% set baud_rate = 115200 %} > >> > >> {% set interrupt_prompt = 'Press s to abort autoboot' %} > >> > >> {% set interrupt_char = 's' %} > >> > >> {% set bootloader_prompt = '=>' %} > >> > >> > >> > >> {% set uboot_mkimage_arch = 'arm' %} > >> > >> > >> > >> {% set bootz_kernel_addr = '0x80000000' %} > >> > >> {% set bootz_ramdisk_addr = '0x83000000' %} > >> > >> {% set bootz_dtb_addr = '0x82000000' %} > >> > >> > >> > >> {% set extra_kernel_args = 'ti_cpsw.rx_packet_max=1526' %} > >> > >> > >> > >> {% set kernel_start_message = 'Welcome to' %} > >> > >> > >> > >> Device config: > >> > >> > >> > >> {% extends 'vcu.jinja2' %} > >> > >> > >> > >> {% set connection_command = 'telnet lava-disp-1.local 7000' %} > >> > >> {% set power_on_command = 'relay-ctrl --relay 1 --state on' %} > >> > >> {% set power_off_command = 'relay-ctrl --relay 1 --state off' %} > >> > >> {% set hard_reset_command = 'relay-ctrl --relay 1 --toggle' %} > >> > >> > >> > >> Boot action block from job definition: > >> > >> > >> > >> - boot: > >> > >> timeout: > >> > >> minutes: 5 > >> > >> method: u-boot > >> > >> commands: ramdisk > >> > >> auto_login: > >> > >> login_prompt: 'am335x-nmhw21 login: ' > >> > >> username: root > >> > >> prompts: > >> > >> - 'fct@am335x-nmhw21:~# ' > >> > >> > >> > >> Have I misconfigured something? What I’m missing? Thanks! > > > > > > >
lava-users@lists.lavasoftware.org