Hi there,
When I run a LAVA hacking session on Juno, I find that sometimes the Juno is not allocated an IP address properly:
- I created a multinode definition for Juno:
  https://validation.linaro.org/scheduler/job/845471.0: this definition launches a kvm so I can run test suites on it;
  https://validation.linaro.org/scheduler/job/845471.1: this definition runs "deploy_linaro_image" on the Juno board;
- After launching these two images, the kvm usually works well and I can log in to it with ssh without problems;
- But for the Juno board, ssh produces the log below:
    395.1 ./setup_session_oe: line 38: /etc/init.d/ssh: No such file or directory
    395.2 <LAVA_SIGNAL_TESTCASE TEST_CASE_ID=sshd-restart RESULT=fail>
    395.3 sshd re-start failed
    395.4 Target's Gateway: 10.0.0.1
    395.5 ip: RTNETLINK answers: Network is unreachable
    395.6
    395.7
    395.8 *********************************************************************************************
    395.9 Please connect to: ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no root@ (juno-07)
    395.10 *********************************************************************************************
    395.11
So in the end I cannot get the Juno's IP address and cannot log in to it with ssh. I cannot reproduce this failure every time; sometimes I am lucky and get a correct IP.
- As a workaround, I found that if I create two separate definitions for Juno and kvm independently, the Juno's IP issue is resolved:
https://validation.linaro.org/scheduler/job/845552 https://validation.linaro.org/scheduler/job/845561
So could you help give suggestions for this?
Thanks, Leo Yan
On 24 April 2016 at 18:03, Leo Yan leo.yan@linaro.org wrote:
Hi there,
When I run a LAVA hacking session on Juno, I find that sometimes the Juno is not allocated an IP address properly:
- I created a multinode definition for Juno:
Is there a reason you are trying to log into both sides of a multinode job? You won't be able to pass information between the two nodes using the multinode API, as that requires signal handlers in the dispatcher codebase. More likely you just need one side of the job, in which case you already have that working.
https://validation.linaro.org/scheduler/job/845471.0: this definition launches a kvm so I can run test suites on it;
https://validation.linaro.org/scheduler/job/845471.1: this definition runs "deploy_linaro_image" on the Juno board;
After launching these two images, the kvm usually works well and I can log in to it with ssh without problems;
But for the Juno board, ssh produces the log below:
Not all OpenEmbedded images can support a hacking session. Some require extra scripting - maybe done in a test shell which is defined before the hacking session-oe is invoked. This is a test shell problem, not a device problem. The device itself has raised an IPv4 address during the master image stage.
For whatever reason, the OE image provided failed to request an IP address after bringing up the interface https://validation.linaro.org/scheduler/job/845472/log_file#L_375_3
So in the end I cannot get the Juno's IP address and cannot log in to it with ssh. I cannot reproduce this failure every time; sometimes I am lucky and get a correct IP.
You need to investigate this within the image you are using and identify what steps are needed to fix the image. It may be that you need ifdown and then an explicit ifup or a different image.
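Editorial sketch: the "ifdown and then an explicit ifup" suggestion above could be scripted along the following lines. This is only an illustration under stated assumptions: the interface name eth0, the helper names, and the sample text are hypothetical, and whether ifdown/ifup exist in a given OE image has to be checked in that image.

```python
import re

# Matches an IPv4 "inet a.b.c.d/prefix" entry as printed by `ip -4 addr show`.
# "inet6" entries are not matched because the pattern requires "inet ".
INET_RE = re.compile(r"\binet (\d{1,3}(?:\.\d{1,3}){3})/\d+")


def ipv4_addresses(ip_output):
    """Return every IPv4 address found in the text printed by `ip addr show`."""
    return INET_RE.findall(ip_output)


def needs_recovery(ip_output):
    """True when the interface reports no IPv4 address at all."""
    return not ipv4_addresses(ip_output)


def recovery_commands(interface="eth0"):
    """Commands to try in order: take the interface down, bring it up, re-check.

    The interface name is an assumption; adjust it to the image in use.
    """
    return [
        ["ifdown", interface],
        ["ifup", interface],
        ["ip", "-4", "addr", "show", "dev", interface],
    ]


if __name__ == "__main__":
    # Illustrative text only, not taken from a real job log.
    sample = "2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 state UP\n"
    if needs_recovery(sample):
        for argv in recovery_commands():
            print(" ".join(argv))
```

On a real image the commands would be run from a test shell before the hacking session is invoked, for example with subprocess.run, and the output of the last one fed back into needs_recovery.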
As a workaround, I found that if I create two separate definitions for Juno and kvm independently, the Juno's IP issue is resolved:
https://validation.linaro.org/scheduler/job/845552 https://validation.linaro.org/scheduler/job/845561
So could you help give suggestions for this?
What are you trying to achieve with a multinode hacking session? You have a working single-node hacking session on a Juno; adding a KVM, which you won't be able to talk to in the same way as in a multinode job, probably doesn't help with anything useful.
On 25/04/16 07:56, Neil Williams wrote:
Is there a reason you are trying to log into both sides of a multinode job?
I started discussing this with Luca last Thursday and I have a draft mail describing things a bit more:
As part of Leo's work for one of the members we would like to interactively run some tests on Juno.
We think the ideal structure looks pretty much like the one below:
                  +------------------------------------------+
         +        |                                          |
         |        | +--------------+         +-------------+ |
         |        | | Orchestrator |   ssh   |             | |
Leo +---------------> (lisa or     +--------->    Juno     | |
         |        | | workflow     |         |             | |
         |        | | automation)  |         |             | |
         +        | +--------------+         +-------------+ |
       China      |       KVM                    Target      |
     Firewall     |                 lava-lab                 |
                  |                                          |
                  +------------------------------------------+
Some observations and rationale:
* Leo needs to be able to interact with the software running on the Orchestrator via a http:// (or maybe https://) socket. An SSH tunnel with port forwarding is a good way to achieve this since then the web socket can be bound to the lo interface. This is what drives the interest in running this setup as a hacking session.
* That software will, in turn, perform lots of scripted interaction with the Juno. The volume of interaction between orchestrator and target means they need a relatively low-latency link. This is why we avoid the simpler approach of running the orchestrator from home and having a single-node hacking session.
* The orchestrator is a moderately complex Python code base which Leo has installed in a VM.
* The orchestrator needs to run in a sufficiently secure environment for it to locally hold the means to authenticate itself to the Juno.
For anyone deeply interested, the software stack on the orchestrator is lisa from ARM. lisa consists of a bunch of useful Python code glued together with Jupyter (née IPython). It is a good tool for building complex but ad-hoc investigatory tests.
Other tools such as ARM's workload-automation and (so I hear) some of the OE testing environments can be deployed to the lab in a similar fashion using multi-node jobs. However AFAIK neither of these tools needs an interactive link between the job owner and the VM.
Daniel.
On 25/04/16 08:50, Daniel Thompson wrote:
I started discussing this with Luca last Thursday and I have a draft mail describing things a bit more:
... and promptly forgot to add him to Cc:. Sorry Luca!
On Mon, Apr 25, 2016 at 08:52:39AM +0100, Daniel Thompson wrote:
[...]
- Leo needs to be able to interact with the software running on the Orchestrator via a http:// (or maybe https://) socket. An SSH tunnel with port forwarding is a good way to achieve this since then the web socket can be bound to the lo interface. This is what drives the interest in running this setup as a hacking session.
One more question: the kvm will only boot with a text console. If the web socket is bound to the lo interface, the web browser would have to be launched within the kvm itself. So should we bind the web socket to the kvm's eth0 instead, and then use my local PC to access the Orchestrator remotely over http://?
On 25/04/16 09:28, Leo Yan wrote:
One more question: the kvm will only boot with a text console. If the web socket is bound to the lo interface, the web browser would have to be launched within the kvm itself. So should we bind the web socket to the kvm's eth0 instead, and then use my local PC to access the Orchestrator remotely over http://?
Sorry Leo. I was mixing requirements and potential solutions. Right now it is more important for us to clearly express the requirements. There's plenty we don't know about LAVA so better to describe what we need rather than how we try to achieve it!
The *requirement* is that you need to be able to use a web browser running on your laptop to access a service running on the orchestrator.
With the requirement stated clearly, we can go back to my original point: ssh can meet the above requirement. When it forwards a socket, it can connect within the KVM using the loopback interface; that is what localhost means when it appears in "-L 8888:localhost:8888".
Daniel.
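Editorial sketch: the port-forwarding idea above can be written down as a small helper that assembles the ssh command line. The user name and host below are placeholders (the real address is whatever the hacking session reports), and the helper name is hypothetical; only the "-L 8888:localhost:8888" form comes from the message itself. The -N flag is an addition that asks ssh not to run a remote command.

```python
import shlex


def build_tunnel_command(user, host, local_port=8888, remote_port=8888):
    """Return the ssh argv that forwards local_port on the laptop to
    remote_port on the remote machine's loopback interface.

    "localhost" in the -L argument is resolved on the remote side, so the
    service on the KVM only needs to listen on lo.
    """
    forward = f"{local_port}:localhost:{remote_port}"
    return ["ssh", "-N", "-L", forward, f"{user}@{host}"]


if __name__ == "__main__":
    # Placeholder host name; substitute the address printed by the hacking session.
    argv = build_tunnel_command("root", "kvm.example.invalid")
    print(shlex.join(argv))
```

With the tunnel running, a browser on the laptop pointed at http://localhost:8888 would reach the service bound to lo inside the KVM, so nothing has to be bound to eth0.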
On 25 April 2016 at 08:50, Daniel Thompson daniel.thompson@linaro.org wrote:
- Leo needs to be able to interact with the software running on the Orchestrator via a http:// (or maybe https://) socket. An SSH tunnel with port forwarding is a good way to achieve this since then the web socket can be bound to the lo interface. This is what drives the interest in running this setup as a hacking session.
Is this an automated or a manual setup? If it's manual, why use LAVA?
- That software will, in turn, perform lots of scripted interaction with the Juno. The volume of interaction between orchestrator and target means they need a relatively low-latency link. This is why we avoid the simpler approach of running the orchestrator from home and having a single-node hacking session.
We already have this covered with Workload Automation and Android benchmarking. WA supports ssh-based connections for 'linux' targets. I guess lisa works in a similar way. So what you need (assuming this is an automated, not a manual, setup) is to start a session on the KVM (host), start a session on the Juno (target), pass the IP address from target to host, set up lisa to connect using this IP and run your testing. During this time the target will wait for the host to signal 'testing finished'. Please check the wa2-lava repo for examples:
https://git.linaro.org/qa/wa2-lava.git/blob/HEAD:/README
milosz
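Editorial sketch: the host/target handshake described above (target publishes its IP, host runs the tests, target waits for a 'testing finished' signal) might be laid out as below. It assumes the lava-send and lava-wait helpers of the LAVA MultiNode API; the message IDs "target-ip" and "testing-finished", the key name "ipaddr" and the function names are hypothetical, and the exact semantics should be checked against the LAVA MultiNode documentation and the wa2-lava examples. As noted earlier in the thread, these calls rely on the dispatcher's signal handling and are not expected to work from inside a hacking session.

```python
def target_commands(ip_address):
    """Commands for the Juno (target) role: publish the IP, then block
    until the host reports that testing has finished."""
    return [
        ["lava-send", "target-ip", f"ipaddr={ip_address}"],
        ["lava-wait", "testing-finished"],
    ]


def host_commands():
    """Commands for the KVM (host) role: wait for the target's IP, run the
    tests against it (not shown), then release the target."""
    return [
        ["lava-wait", "target-ip"],
        # The orchestrator (lisa or WA) would run here, using the received address.
        ["lava-send", "testing-finished"],
    ]


if __name__ == "__main__":
    for role, commands in (("target", target_commands("10.0.0.5")),
                           ("host", host_commands())):
        for argv in commands:
            print(role + ":", " ".join(argv))
```

The address 10.0.0.5 in the usage lines is a made-up value for illustration; in a real job the target would discover its own address at run time.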
Lava-users mailing list Lava-users@lists.linaro.org https://lists.linaro.org/mailman/listinfo/lava-users
On Mon, Apr 25, 2016 at 01:51:00PM +0100, Milosz Wasilewski wrote:
Is this an automated or a manual setup? If it's manual, why use LAVA?
It's a manual setup. I want to use a web browser on my local laptop to connect to the Orchestrator, so that I can frequently change the test cases in the ipython script for EAS profiling.
I have no Juno board locally, so I must use a Juno in the LAVA lab.
- That software will, in turn, perform lots of scripted interaction with the Juno. The volume of interaction between orchestrator and target means they need a relatively low-latency link. This is why we avoid the simpler approach of running the orchestrator from home and having a single-node hacking session.
We already have this covered with Workload Automation and Android benchmarking. WA supports ssh-based connections for 'linux' targets. I guess lisa works in a similar way. So what you need (assuming this is an automated, not a manual, setup) is to start a session on the KVM (host), start a session on the Juno (target), pass the IP address from target to host, set up lisa to connect using this IP and run your testing. During this time the target will wait for the host to signal 'testing finished'. Please check the wa2-lava repo for examples:
Thanks for the pointer. This will be helpful for automated testing later on. As a first step I can accept doing some steps manually just to get things working, but I should definitely move to the automated method as soon as possible :)
Hi Neil,
Thanks for reviewing this question.
On Mon, Apr 25, 2016 at 07:56:17AM +0100, Neil Williams wrote:
Is there a reason you are trying to log into both sides of a multinode job? You won't be able to pass information between the two nodes using the multinode API, as that requires signal handlers in the dispatcher codebase. More likely you just need one side of the job, in which case you already have that working.
In my case I need to boot both the KVM and the Juno board. In the KVM I have a testkit which accesses the Juno board over ssh so that it can launch some automated testing. It's okay for me to do this step manually.
BTW, at the beginning I tried to use lab.validation.linaro.org to run the testkit, but because it needs root permission to install dependency libraries (mainly Python-related), it was suggested that I use a KVM where I have root permission.
Not all OpenEmbedded images can support a hacking session. Some require extra scripting - maybe done in a test shell which is defined before the hacking session-oe is invoked. This is a test shell problem, not a device problem. The device itself has raised an IPv4 address during the master image stage.
I also got the same suggestion and reminder from Chase, so I tried older OE images [1] to check whether the OE images can support a hacking session or not. They show the same behaviour: with a standalone definition for the LAVA job, the IP is allocated successfully almost every time, but with a multinode definition the IP failure occurs easily.
[1] http://releases.linaro.org/15.06/openembedded/juno-lsk/lt-vexpress64-openemb...
For whatever reason, the OE image provided failed to request an IP address after bringing up the interface https://validation.linaro.org/scheduler/job/845472/log_file#L_375_3
So in the end I cannot get the Juno's IP and cannot log in to it with ssh. I cannot reproduce this failure every time; sometimes I am lucky and get a correct IP.
You need to investigate this within the image you are using and identify what steps are needed to fix the image. It may be that you need ifdown and then an explicit ifup or a different image.
As a workaround, I found that if I create two separate definitions for Juno and kvm independently, then Juno's IP issue is resolved:
https://validation.linaro.org/scheduler/job/845552 https://validation.linaro.org/scheduler/job/845561
So could you help give suggestions for this?
What are you trying to achieve with a multinode hacking session? You have a working singlenode hacking session on a juno; adding a KVM which you won't be able to talk to in the same way as a multinode job probably doesn't help with anything useful.
Boot the Juno with the kernel/dtb under test, and launch a KVM with my customized kvm image; after both boot successfully, I can manually log into the KVM to run the toolkit.
Thanks, Leo Yan
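The suggestion above of an ifdown followed by an explicit ifup could be scripted in a test shell that runs before the hacking session. This is only a minimal sketch, assuming the image ships ifupdown-style ifup/ifdown and that the interface is eth0; the helper names run and cycle_iface and the DRY_RUN variable are made up for illustration and are not part of LAVA.

```shell
#!/bin/sh
# Sketch: force a clean down/up cycle of one interface.
# run, cycle_iface and DRY_RUN are illustrative names, not LAVA options.

run() {
    # With DRY_RUN=1 only print the command; otherwise execute it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

cycle_iface() {
    iface="${1:-eth0}"
    # ifdown may fail if the interface was never configured; carry on anyway.
    run ifdown "$iface" || true
    run ifup "$iface"
}
```

Calling "DRY_RUN=1 cycle_iface eth0" prints the two commands without touching the network, which is a cheap way to check the step before putting it into a job.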
I’ve compared the “working” and “non-working” kernel logs and there are only two differences I can find.
1) In the non working one it has to run an fsck:
https://validation.linaro.org/scheduler/job/845472/log_file#L_349_223
- I can’t imagine that this is causing the problem
2) A little later on, you see it tries to bring the network interfaces up:
- Working one: https://validation.linaro.org/scheduler/job/845552/log_file#L_285_9
- Non working one: https://validation.linaro.org/scheduler/job/845472/log_file#L_351_10
- You can see that the non working one is saying that “lo” and “eth0” are both already up.
They are the same images, so I don’t quite understand how we get into this state.
Dave
On 25/04/16 14:14, Dave Pigott wrote:
I’ve compared the “working” and “non-working” kernel logs and there are only two differences I can find.
- In the non working one it has to run an fsck:
https://validation.linaro.org/scheduler/job/845472/log_file#L_349_223
- I can’t imagine that this is causing the problem
- A little later on, you see it tries to bring the network interfaces up:
I was curious about this.
I looked at the kernel traces and was interested to see that the working version has an smsc911x message coming out as soon as the network adapter is used. On close inspection it looks to me like in the working case the smsc911x driver is compiled into the kernel, whilst for the non-working version smsc911x is loaded as a module. Worse, the module is not loaded until well after we have started trying to bring up the network.
- Working one:
https://validation.linaro.org/scheduler/job/845552/log_file#L_285_9
[ 0.000000] Linux version 3.10.63.0-1-linaro-lt-vexpress64 (buildslave@x86-64-07) (gcc version 4.9.2 20140904 (prerelease) (crosstool-NG linaro-1.13.1-4.9-2014.09 - Linaro GCC 4.9-2014.09) ) #1ubuntu1~ci+150113192220 SMP Tue Jan 13 19:25:14 UTC 2015
- Non working one:
https://validation.linaro.org/scheduler/job/845472/log_file#L_351_10
[ 0.000000] Linux version 4.4.0-rc2+ (leoy@leoy-linaro) (gcc version 4.9.2 20140904 (prerelease) (crosstool-NG linaro-1.13.1-4.9-2014.09 - Linaro GCC 4.9-2014.09) ) #73 SMP PREEMPT Sat Apr 23 23:11:46 CST 2016
I think the non-working version either needs the module building into the kernel or a line in modprobe.d to ensure the smsc911x driver is loaded when eth0 is used.
Daniel.
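Whether a driver is built in or modular can be read from the kernel configuration. The following is a minimal sketch, assuming the .config used for the build is at hand; kconfig_state is an illustrative helper name, not an existing tool.

```shell
# Report how a kernel config symbol is set: "builtin", "module" or "unset".
# Usage: kconfig_state <config-file> <SYMBOL>
# (kconfig_state is a hypothetical helper written for this example.)
kconfig_state() {
    cfg="$1"
    sym="$2"
    if grep -q "^${sym}=y\$" "$cfg"; then
        echo builtin
    elif grep -q "^${sym}=m\$" "$cfg"; then
        echo module
    else
        echo unset
    fi
}
```

For example, "kconfig_state .config CONFIG_SMSC911X" prints one of the three words. On a running target the same file can be obtained with "zcat /proc/config.gz", but only if the kernel was built with CONFIG_IKCONFIG_PROC.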
On 25 Apr 2016, at 14:56, Daniel Thompson daniel.thompson@linaro.org wrote:
[...]
I think the non-working version either needs the module building into the kernel or a line in modprobe.d to ensure the smsc911x driver is loaded when eth0 is used.
That’s what’s odd. If you look at the two job definitions, they are deploying the same image.
Dave
On Mon, Apr 25, 2016 at 02:56:29PM +0100, Daniel Thompson wrote:
[...]
I think the non-working version either needs the module building into the kernel or a line in modprobe.d to ensure the smsc911x driver is loaded when eth0 is used.
I checked the config file; the smsc911x driver has been built into my kernel image:
CONFIG_SMC91X=y
CONFIG_SMSC911X=y
Following Dave's hint, I checked the rootfs: networking is enabled by the command "ifup -a", but sometimes this does not really bring up all Ethernet devices: "ifup tries to keep track of the whether the interface is up or down using the file /etc/network/run/ifstate Sometimes this can get out of sync with the true state of the device. In such a case you can use "ifdown eth0" to get the ifstate file back in sync, or you can use the --force option to ignore the ifstate file." [1]
I will try changing the script /etc/init.d/networking to use "ifup -a --force". But I still cannot understand why there is a mismatch between /etc/network/run/ifstate and the true device state.
[1] http://www.linuxquestions.org/questions/linux-newbie-8/ifup-a-not-working-79...
Thanks, Leo Yan
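The mismatch described above (ifstate says an interface is configured while the device has no address) could be detected from a test shell. This is a rough sketch under several assumptions: the state file is at the path quoted above and holds one "iface=name" line per configured interface, the ip tool is available on the image, and "no IPv4 address" is taken as the sign of a stale entry. The function names are invented for this example.

```shell
# Return success if the state file lists the interface as configured.
# Assumes one "iface=logical-name" line per configured interface.
ifstate_lists() {
    grep -q "^$1=" "$2" 2>/dev/null
}

# Return success if the interface currently has an IPv4 address.
has_ipv4() {
    ip -4 -o addr show dev "$1" 2>/dev/null | grep -q 'inet '
}

# Print "stale" when ifstate claims the interface is configured but it has
# no IPv4 address, and "ok" otherwise.
check_ifstate() {
    iface="$1"
    state_file="${2:-/etc/network/run/ifstate}"
    if ifstate_lists "$iface" "$state_file" && ! has_ipv4 "$iface"; then
        echo stale
    else
        echo ok
    fi
}
```

A job could log the output of "check_ifstate eth0" right after the networking init script has run, which would show whether the failing boots really start with a stale state file.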
On 25 April 2016 at 14:56, Daniel Thompson daniel.thompson@linaro.org wrote:
[...]
- Working one:
https://validation.linaro.org/scheduler/job/845552/log_file#L_285_9
[ 0.000000] Linux version 3.10.63.0-1-linaro-lt-vexpress64 (buildslave@x86-64-07) (gcc version 4.9.2 20140904 (prerelease) (crosstool-NG linaro-1.13.1-4.9-2014.09 - Linaro GCC 4.9-2014.09) ) #1ubuntu1~ci+150113192220 SMP Tue Jan 13 19:25:14 UTC 2015
Wrong link, should be:
https://validation.linaro.org/scheduler/job/845552/log_file#L_281_8
Linux version 4.4.0-rc2+ (leoy@leoy-linaro) (gcc version 4.9.2 20140904 (prerelease) (crosstool-NG linaro-1.13.1-4.9-2014.09 - Linaro GCC 4.9-2014.09) ) #73 SMP PREEMPT Sat Apr 23 23:11:46 CST 2016
Always check the kernel boot prior to the message: INFO: System is in test image now, performing basic user space tests.
- Non working one:
https://validation.linaro.org/scheduler/job/845472/log_file#L_351_10
[ 0.000000] Linux version 4.4.0-rc2+ (leoy@leoy-linaro) (gcc version 4.9.2 20140904 (prerelease) (crosstool-NG linaro-1.13.1-4.9-2014.09 - Linaro GCC 4.9-2014.09) ) #73 SMP PREEMPT Sat Apr 23 23:11:46 CST 2016
Wrong link, should be: https://validation.linaro.org/scheduler/job/845472/log_file#L_279_8
Linux version 4.4.0-rc2+ (leoy@leoy-linaro) (gcc version 4.9.2 20140904 (prerelease) (crosstool-NG linaro-1.13.1-4.9-2014.09 - Linaro GCC 4.9-2014.09) ) #73 SMP PREEMPT Sat Apr 23 23:11:46 CST 2016
So it is the same kernel booting each time. The logs also include the checksums of the downloaded files, and those match in each job.
It's one of the problems of V1 - there are a lot of kernel boots involved in a juno testjob. You also need to check the reported firmware versions and other messages. It can be best to download both log files, strip out the other kernel boot logs and compare the results that way.
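The comparison described above can be partly automated. The sketch below is one possible approach, assuming each job log has been downloaded as plain text, that every kernel boot starts with a "Linux version" line, and that the marker text matches the message quoted above. The function name last_test_boot is made up for this example.

```shell
# Print the last kernel boot that precedes the "test image" marker, with
# "[ 8.449499]" style timestamps removed so that two boots can be diffed.
# (last_test_boot is a hypothetical helper, not a LAVA tool.)
last_test_boot() {
    awk '
        /Linux version/ { n = 0 }               # new boot: drop the previous one
        /System is in test image now/ { exit }  # stop at the marker
        { buf[n++] = $0 }
        END { for (i = 0; i < n; i++) print buf[i] }
    ' "$1" | sed 's/\[ *[0-9][0-9]*\.[0-9][0-9]*\] *//'
}
```

With two downloaded logs (the file names here are placeholders), "last_test_boot good.log > good.txt" and "last_test_boot bad.log > bad.txt" followed by "diff -u good.txt bad.txt" leaves only the differences in the boot that matters.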
On 25 April 2016 at 15:22, Neil Williams neil.williams@linaro.org wrote:
[...]
https://validation.linaro.org/scheduler/job/845472/log_file#L_296_1
<LAVA_DISPATCHER>2016-04-24 09:11:19 AM ERROR: Userspace Error: image prompt not found.
So it tries again:
https://validation.linaro.org/scheduler/job/845472/log_file#L_347_8
Linux version 4.4.0-rc2+ (leoy@leoy-linaro) (gcc version 4.9.2 20140904 (prerelease) (crosstool-NG linaro-1.13.1-4.9-2014.09 - Linaro GCC 4.9-2014.09) ) #73 SMP PREEMPT Sat Apr 23 23:11:46 CST 2016
347.9 [ 0.000000] Boot CPU: AArch64 Processor [410fd030]
In each of those logs, I see the smsc911x message:
https://validation.linaro.org/scheduler/job/845552/log_file#L_283_152
https://validation.linaro.org/scheduler/job/845472/log_file#L_349_152
As spotted already, the problem is not that the driver isn't loading, it's that the state is inconsistent:
https://validation.linaro.org/scheduler/job/845552/log_file#L_285_9
Configuring network interfaces... [ 8.449499] smsc911x 18000000.ethernet eth0: SMSC911x/921x identified at 0xffffff8000120000, IRQ: 29
285.10 udhcpc (v1.24.1) started
285.11 Sending discover...
285.12 Sending discover...
285.13 Sending select for 10.7.0.20...
285.14 Lease of 10.7.0.20 obtained, lease time 3600
285.15 /etc/udhcpc.d/50default: Adding DNS 10.0.0.2
285.16 done.
https://validation.linaro.org/scheduler/job/845472/log_file#L_351_10
Configuring network interfaces... ifup: interface lo already configured
351.11 ifup: interface eth0 already configured
351.12 done.
351.13 Starting OpenBSD Secure Shell server: sshd
On Mon, Apr 25, 2016 at 03:35:56PM +0100, Neil Williams wrote:
[...]
After I changed the command to "ifup -a -f" in the rootfs file /etc/init.d/networking, I booted 4 times with the multinode definition, and every time the Juno board got a dynamic IP successfully.
So this issue is caused by mismatched info in /etc/network/run/ifstate. Is this because LAVA boots the Juno board several times during this job, and a previous boot may write stale info into ifstate?
Thanks, Leo Yan
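The change to the init script described above could be applied with a small helper when preparing the rootfs. This is a sketch only: the exact wording of the ifup call inside the image's /etc/init.d/networking is an assumption (a plain "ifup -a"), and force_ifup is an invented name.

```shell
# Rewrite "ifup -a" to "ifup -a -f" in an init script, leaving lines that
# already carry -f untouched. The script path is passed as the argument.
# (force_ifup is a hypothetical helper; the "ifup -a" pattern is assumed.)
force_ifup() {
    script="$1"
    tmp="$script.tmp.$$"
    sed '/ifup -a -f/!s/ifup -a/ifup -a -f/' "$script" > "$tmp" || return 1
    # Copy the content back instead of renaming, to keep the file mode.
    cat "$tmp" > "$script"
    rm -f "$tmp"
}
```

Running "force_ifup /etc/init.d/networking" twice leaves the file unchanged the second time, so the step is safe to repeat across the several boots of one job.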
[ May be off topic ]
Hi Leo,
As we discussed offline, I would like to be a part of EAS testing. Since I have a Juno board on my desk, it would be easy for me to run tests from tools like Workload Automation and LISA.
Milosz suggested investigating ways to automate this on LAVA, as our team did for Workload Automation [1].
[1] https://git.linaro.org/qa/wa2-lava.git
Best regards Naresh Kamboju
_______________________________________________
Lava-users mailing list
Lava-users@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/lava-users
On Wed, Apr 27, 2016 at 09:27:37PM +0530, Naresh Kamboju wrote:
[ May be off topic ]
Hi Leo,
As we discussed offline, I would like to be a part of EAS testing. Since I have a Juno board on my desk, it would be easy for me to run tests from tools like Workload Automation and LISA.
Welcome :)
Milosz suggested investigating ways to automate this on LAVA, as our team did for Workload Automation [1].
If we can use an automated method to launch EAS regression testing, this will definitely be very useful and save time.
The steps in Lisa's README.rd:
$ source init_env
$ nosetests -v tests/eas/rfc.py
$ ./tools/report.py --base noeas --tests eas
Just curious, do you plan to enable this in wa2-lava.git?
On 28 April 2016 at 08:40, Leo Yan leo.yan@linaro.org wrote:
[...]
Hi Leo,
Just curious, do you plan to enable this in wa2-lava.git?
Yes, because the adb and ssh connection framework can be reused from wa2-lava.git.
- Naresh
On 28/04/16 05:32, Naresh Kamboju wrote:
[...]
Yes, because the adb and ssh connection framework can be reused from wa2-lava.git.
Allowing us to run tests included with lisa (and other lisa derivations) would certainly be useful.
However, in addition, will it be possible for wa2-lava to generate lisa hacking session jobs as well? The automation used to construct KVM images from lisa repositories is useful for interactive work as well.
To explain a bit more...
One of the key features of lisa is that it can be used to interactively explore the effects of different synthetic workloads on a system (for example whilst seeking to reproduce "bad" behavior observed in a real system). Effectively it can act as a tool to help develop complex test cases, especially those that include energy metering.
Daniel.
+ Milosz
On 28 April 2016 at 13:36, Daniel Thompson daniel.thompson@linaro.org wrote:
[...]
At this point in time wa2-lava.git is for full end-to-end automation. The design was focused on automation only.
- Download -> Configure -> Install tests
- Run an adb or ssh session to connect to the target device from the KVM
- Run the tests on the target
- Collect the results on the host
- Parse the results on the host
- Attach to the LAVA job
An interactive way of running tests is not in scope for now. I am adding Milosz to get his thoughts on this subject.
- Naresh
Daniel.
- Naresh
[1] https://git.linaro.org/qa/wa2-lava.git
Best regards Naresh Kamboju
On 27 April 2016 at 19:21, Leo Yan leo.yan@linaro.org wrote:
On Mon, Apr 25, 2016 at 03:35:56PM +0100, Neil Williams wrote:
[...]
In each of those logs, I see the smsc911x message:
https://validation.linaro.org/scheduler/job/845552/log_file#L_283_152
https://validation.linaro.org/scheduler/job/845472/log_file#L_349_152
As spotted already, the problem is not that the driver isn't loading, it's that the state is inconsistent:
https://validation.linaro.org/scheduler/job/845552/log_file#L_285_9 Configuring network interfaces... [ 8.449499] smsc911x 18000000.ethernet eth0: SMSC911x/921x identified at 0xffffff8000120000, IRQ: 29285.10 udhcpc (v1.24.1) started 285.11 Sending discover... 285.12 Sending discover... 285.13 Sending select for 10.7.0.20... 285.14 Lease of 10.7.0.20 obtained, lease time 3600 285.15 /etc/udhcpc.d/50default: Adding DNS 10.0.0.2285.16 done.
https://validation.linaro.org/scheduler/job/845472/log_file#L_351_10 Configuring network interfaces... ifup: interface lo already configured 351.11 ifup: interface eth0 already configured 351.12 done.351.13 Starting OpenBSD Secure Shell server: sshd
After I changed the command to "ifup -a -f" in the rootfs file /etc/init.d/networking, I booted 4 times with the multinode definition, and every time the Juno board got a dynamic IP successfully.
So this issue is caused by mismatched info in /etc/network/run/ifstate. Is this because LAVA boots the Juno board several times during this job, and a previous boot may have written stale info into ifstate?
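The mismatch described above can be checked by hand. Below is a minimal sketch in Python, assuming the state file holds one `iface=logical` entry per line (the usual ifupdown layout) and that the set of interfaces which really hold an address is gathered separately, for example from `ip -o -4 addr`. The function name `stale_ifstate_entries` is hypothetical and is not part of LAVA or ifupdown.

```python
def stale_ifstate_entries(ifstate_text, addressed_ifaces):
    """Return interfaces that ifstate lists as configured but that hold no address.

    ifstate_text: contents of the state file, assumed to be "iface=logical" lines.
    addressed_ifaces: set of interface names that currently have an address.
    """
    stale = []
    for line in ifstate_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The interface name is the part before the first "=".
        iface = line.split("=", 1)[0]
        if iface not in addressed_ifaces:
            stale.append(iface)
    return stale


if __name__ == "__main__":
    # Illustrative data only: eth0 is recorded as configured but has no address.
    sample = "lo=lo\neth0=eth0\n"
    print(stale_ifstate_entries(sample, {"lo"}))
```

An entry reported here would be consistent with the "ifup: interface eth0 already configured" message in the failing log: ifup trusts the state file and skips the interface, whereas the -f (force) option configures it regardless.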
Thanks, Leo Yan
On Thu, Apr 28, 2016 at 03:09:43PM +0530, Naresh Kamboju wrote:
[...]
Allowing us to run tests included with lisa (and other lisa derivations) would certainly be useful.
However, in addition, will it be possible for wa2-lava to generate lisa hacking session jobs as well? The automation used to construct KVM images from lisa repositories is useful for interactive work as well.
To explain a bit more...
One of the key features of lisa is that it can be used to interactively explore the effects of different synthetic workloads on a system (for example whilst seeking to reproduce "bad" behavior observed in a real system). Effectively it can act as a tool to help develop complex test cases, especially those that include energy metering.
At this point in time, wa2-lava.git is for full end-to-end automation. The design was focused on automation only.
1. Download -> Configure -> Install tests.
2. Run an adb or ssh session to connect to the target device from KVM.
3. Run the tests on the target.
4. Collect the results on the host.
5. Parse the results on the host.
6. Attach to the LAVA job.
An interactive way of running tests is not in scope for now. I am adding Milosz to get his thoughts on this subject.
Thanks to Daniel for pointing this out. The interactive method is important for profiling efficiency because we can run many profiling passes after booting only once.
I tried to connect to the KVM from my local laptop but could not connect; however, I can access the KVM from lab.validation.linaro.org. So the KVM cannot be accessed directly from outside the LAVA lab.
Thanks, Leo Yan
On 28 April 2016 at 11:43, Leo Yan leo.yan@linaro.org wrote:
I tried to connect to the KVM from my local laptop but could not connect; however, I can access the KVM from lab.validation.linaro.org. So the KVM cannot be accessed directly from outside the LAVA lab.
To connect to a VM running on a local machine - whether that is inside LAVA or not - relies upon the machine having bridged networking configured and having QEMU configured to use -net tap instead of -net user. LAVA device configuration can support -net tap but configuring the bridge in the first place is specific to how you use that laptop. It can be awkward to get it right if the laptop also has to work with one or more VPNs etc. Configuring the bridge on your machine is outside the scope of LAVA itself.
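To make the difference between the two networking modes concrete, here is a minimal sketch in Python that builds the QEMU arguments for each. It assumes the legacy `-net` syntax mentioned above. The tap interface name `tap0` and the helper name `qemu_net_args` are hypothetical, and how a LAVA device configuration expresses these options is not shown here.

```python
def qemu_net_args(mode, tap_ifname="tap0"):
    """Return QEMU command-line arguments for the chosen network mode.

    mode "user": user-mode networking; the guest can make outbound
                 connections but is not reachable from the network.
    mode "tap":  the guest NIC is attached to a host tap interface, which
                 is assumed to be enslaved to a working bridge already.
    """
    if mode == "user":
        return ["-net", "nic", "-net", "user"]
    if mode == "tap":
        # script=no/downscript=no: do not run ifup/ifdown helper scripts;
        # the tap device is assumed to be set up by the administrator.
        tap_opts = "tap,ifname={},script=no,downscript=no".format(tap_ifname)
        return ["-net", "nic", "-net", tap_opts]
    raise ValueError("unknown network mode: {!r}".format(mode))


if __name__ == "__main__":
    print(" ".join(qemu_net_args("user")))
    print(" ".join(qemu_net_args("tap")))
```

The tap variant only helps once the bridge exists on the host; without it the arguments are valid but QEMU has nothing usable to attach to.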
On Thu, Apr 28, 2016 at 12:22:17PM +0100, Neil Williams wrote:
On 28 April 2016 at 11:43, Leo Yan leo.yan@linaro.org wrote:
I tried to connect to the KVM from my local laptop but could not connect; however, I can access the KVM from lab.validation.linaro.org. So the KVM cannot be accessed directly from outside the LAVA lab.
To connect to a VM running on a local machine - whether that is inside LAVA or not - relies upon the machine having bridged networking configured and having QEMU configured to use -net tap instead of -net user. LAVA device configuration can support -net tap but configuring the bridge in the first place is specific to how you use that laptop. It can be awkward to get it right if the laptop also has to work with one or more VPNs etc. Configuring the bridge on your machine is outside the scope of LAVA itself.
Thanks for this info; let's simplify this issue :)
On my laptop, I can try to connect directly to the KVM without a VPN. So how can I set "-net tap"? I searched LAVA's hacking session documentation, and it says this can be specified using the "context" field. But I still don't know where to add this entry so that I can give it a quick try.
https://validation.linaro.org/static/docs/v2/hacking-session.html?highlight=...
Thanks, Leo Yan
On 28 April 2016 at 14:50, Leo Yan leo.yan@linaro.org wrote:
On Thu, Apr 28, 2016 at 12:22:17PM +0100, Neil Williams wrote:
On 28 April 2016 at 11:43, Leo Yan leo.yan@linaro.org wrote:
I tried to connect to the KVM from my local laptop but could not connect; however, I can access the KVM from lab.validation.linaro.org. So the KVM cannot be accessed directly from outside the LAVA lab.
To connect to a VM running on a local machine - whether that is inside LAVA or not - relies upon the machine having bridged networking configured and having QEMU configured to use -net tap instead of -net user. LAVA device configuration can support -net tap but configuring the bridge in the first place is specific to how you use that laptop. It can be awkward to get it right if the laptop also has to work with one or more VPNs etc. Configuring the bridge on your machine is outside the scope of LAVA itself.
Thanks for this info; let's simplify this issue :)
On my laptop, I can try to connect directly to the KVM without a VPN. So how can I set "-net tap"? I searched LAVA's hacking session documentation, and it says this can be specified using the "context" field. But I still don't know where to add this entry so that I can give it a quick try.
https://validation.linaro.org/static/docs/v2/hacking-session.html?highlight=...
That's the V2 documentation - take care because the docs for V1 and V2 are different and will continue to diverge.
Setting up bridging is beyond the scope of the LAVA documentation. It is entirely down to how you need to do it on your laptop and you'll need to work out the best config for your own machine yourself. It's an admin task and you need to know what you're doing or you can break your network configuration on that machine. This isn't a LAVA topic and setting up a bridge on your own machine is not necessarily a quick change to make. Bridging is not the same as using a VPN - if a VPN is configured, the VPN config needs to be taken into account when creating the bridge.
You won't be able to connect to the VM over the network unless QEMU can use a working bridge configuration - QEMU will simply fail if -net tap is used without a working bridge being available. An unbridged VM will be able to make outbound connections but not inbound connections.
Once you have a bridge device showing up in /sbin/ifconfig -a on your machine, then -net tap itself is set in the device configuration, e.g. https://git.linaro.org/lava/lava-lab.git/blob/HEAD:/validation.linaro.org/la...
I don't use a network bridge for testing my own local LAVA install.
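As a quick way to see whether a bridge device is present before trying -net tap, here is a minimal sketch in Python. It relies on the Linux convention that a bridge device exposes a `bridge` directory under its entry in /sys/class/net. The helper name `list_bridges` is hypothetical, and the check says nothing about whether the bridge is configured correctly.

```python
import os


def list_bridges(sysfs_net="/sys/class/net"):
    """Return the sorted names of network devices that look like bridges.

    A device is treated as a bridge if <sysfs_net>/<name>/bridge is a
    directory. An absent sysfs_net directory yields an empty list.
    """
    if not os.path.isdir(sysfs_net):
        return []
    return sorted(
        name
        for name in os.listdir(sysfs_net)
        if os.path.isdir(os.path.join(sysfs_net, name, "bridge"))
    )


if __name__ == "__main__":
    bridges = list_bridges()
    if bridges:
        print("bridge devices:", ", ".join(bridges))
    else:
        print("no bridge device found")
```

An empty result means there is nothing for a tap interface to join yet, so the bridge still has to be created by whatever means suits the machine.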
On 28/04/16 10:39, Naresh Kamboju wrote:
- Milosz
On 28 April 2016 at 13:36, Daniel Thompson daniel.thompson@linaro.org wrote:
On 28/04/16 05:32, Naresh Kamboju wrote:
On 28 April 2016 at 08:40, Leo Yan leo.yan@linaro.org wrote:
On Wed, Apr 27, 2016 at 09:27:37PM +0530, Naresh Kamboju wrote:
[ May be an off topic ]
Hi Leo,
As we discussed offline, I would like to be a part of EAS testing. Since I have a Juno board on my desk, it would be easy for me to run tests from tools like Workload Automation and LISA.
Welcome :)
Milosz suggested investigating ways to automate this on LAVA, as our team did for Workload Automation [1].
If we can use an automatic method to launch EAS regression testing, it will definitely be very useful and save time.
The steps in Lisa's README.rd:
$ source init_env
$ nosetests -v tests/eas/rfc.py
$ ./tools/report.py --base noeas --tests eas
Hi Leo,
Just curious, do you plan to enable this in wa2-lava.git?
Yes, because the adb and ssh connection framework can be reused from wa2-lava.git.
Allowing us to run tests included with lisa (and other lisa derivations) would certainly be useful.
However, in addition, will it be possible for wa2-lava to generate lisa hacking session jobs as well? The automation used to construct KVM images from lisa repositories is useful for interactive work as well.
To explain a bit more...
One of the key features of lisa is that it can be used to interactively explore the effects of different synthetic workloads on a system (for example whilst seeking to reproduce "bad" behavior observed in a real system). Effectively it can act as a tool to help develop complex test cases, especially those that include energy metering.
At this point in time, wa2-lava.git is for full end-to-end automation. The design was focused on automation only.
1. Download -> Configure -> Install tests.
2. Run an adb or ssh session to connect to the target device from KVM.
3. Run the tests on the target.
4. Collect the results on the host.
5. Parse the results on the host.
6. Attach to the LAVA job.
I only brought it up because there seemed to be more in common than different. Main differences are:
1. Instead of launching the test suite runner on KVM we need to connect to it interactively to conduct the hacking session. lisa uses the same devlib library that WA uses so the way it connects to the target device from KVM remains unaltered.
2. The results to be collected for the results bundle are different. The primary outputs are .ipynb files saved on the KVM filesystem. These record the interactive session and provide the code that can eventually flow into fully automatic tests.
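To illustrate the second point, here is a minimal sketch in Python of gathering the notebook files from a session directory on the KVM filesystem. The helper name `collect_notebooks` and the directory passed to it are hypothetical. Skipping Jupyter's `.ipynb_checkpoints` autosave directories is a choice made for this sketch, and how the files would then be attached to a LAVA results bundle is not covered.

```python
from pathlib import Path


def collect_notebooks(root):
    """Return a sorted list of .ipynb files found under root.

    Files inside ".ipynb_checkpoints" directories are skipped, since they
    are autosave copies rather than the notebooks themselves.
    """
    return sorted(
        path
        for path in Path(root).rglob("*.ipynb")
        if path.is_file() and ".ipynb_checkpoints" not in path.parts
    )


if __name__ == "__main__":
    # "." stands in for wherever the interactive session saved its notebooks.
    for notebook in collect_notebooks("."):
        print(notebook)
```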
An interactive way of running tests is not in scope for now. I am adding Milosz to get his thoughts on this subject.
To be clear, whilst it would be great to be able to reuse all the automatic VM construction and multi-job setup for interactive uses, fully automatic support is useful on its own.
However beware of release announcements declaring "we have added lisa support" without adding any caveats; that *would* imply support for interactive sessions.
Daniel.
On 28 April 2016 at 12:04, Daniel Thompson daniel.thompson@linaro.org wrote:
On 28/04/16 10:39, Naresh Kamboju wrote:
- Milosz
On 28 April 2016 at 13:36, Daniel Thompson daniel.thompson@linaro.org wrote:
On 28/04/16 05:32, Naresh Kamboju wrote:
On 28 April 2016 at 08:40, Leo Yan leo.yan@linaro.org wrote:
On Wed, Apr 27, 2016 at 09:27:37PM +0530, Naresh Kamboju wrote:
[ May be an off topic ]
Hi Leo,
As we discussed offline, I would like to be a part of EAS testing. Since I have a Juno board on my desk, it would be easy for me to run tests from tools like Workload Automation and LISA.
Welcome :)
Milosz suggested investigating ways to automate this on LAVA, as our team did for Workload Automation [1].
If we can use an automatic method to launch EAS regression testing, it will definitely be very useful and save time.
The steps in Lisa's README.rd:
$ source init_env
$ nosetests -v tests/eas/rfc.py
$ ./tools/report.py --base noeas --tests eas
Hi Leo,
Just curious, do you plan to enable this in wa2-lava.git?
Yes, because the adb and ssh connection framework can be reused from wa2-lava.git.
Allowing us to run tests included with lisa (and other lisa derivations) would certainly be useful.
However, in addition, will it be possible for wa2-lava to generate lisa hacking session jobs as well? The automation used to construct KVM images from lisa repositories is useful for interactive work as well.
To explain a bit more...
One of the key features of lisa is that it can be used to interactively explore the effects of different synthetic workloads on a system (for example whilst seeking to reproduce "bad" behavior observed in a real system). Effectively it can act as a tool to help develop complex test cases, especially those that include energy metering.
At this point in time, wa2-lava.git is for full end-to-end automation. The design was focused on automation only.
1. Download -> Configure -> Install tests.
2. Run an adb or ssh session to connect to the target device from KVM.
3. Run the tests on the target.
4. Collect the results on the host.
5. Parse the results on the host.
6. Attach to the LAVA job.
I only brought it up because there seemed to be more in common than different. Main differences are:
Instead of launching the test suite runner on KVM we need to connect to it interactively to conduct the hacking session. lisa uses the same devlib library that WA uses so the way it connects to the target device from KVM remains unaltered.
The results to be collected for the results bundle are different. The primary outputs are .ipynb files saved on the KVM filesystem. These record the interactive session and provide the code that can eventually flow into fully automatic tests.
'Interactive' doesn't go well with LAVA unless you're OK waiting days for your session to start. As an example, there are currently 136 jobs in the Juno queue. Some jobs take 24h to complete, but even assuming only 1h per job, with 8 active devices you would still wait 17 hours (136 / 8) before your session starts (and there is no guarantee).
What you describe above is a multinode hacking session. I have code for such a session for Android. I guess it can easily be extended to Linux targets as well. https://git.linaro.org/lava-team/hacking-session.git/blob/HEAD:/hacking-sess... https://git.linaro.org/lava-team/hacking-session.git/blob/HEAD:/hacking-sess...
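The 17-hour figure follows from a simple batch model, sketched below in Python. It assumes every queued job takes the same time, jobs are served in rounds of one job per active device, and nothing is prioritised or cancelled. The function name `estimated_wait_hours` is hypothetical and this is not how the LAVA scheduler computes anything.

```python
import math


def estimated_wait_hours(queued_jobs, active_devices, hours_per_job):
    """Estimate the wait until a newly submitted job starts.

    The queue is drained in rounds: each round runs one job on every
    active device and lasts hours_per_job.
    """
    if active_devices <= 0:
        raise ValueError("at least one active device is required")
    rounds = math.ceil(queued_jobs / active_devices)
    return rounds * hours_per_job


if __name__ == "__main__":
    # Figures from the message above: 136 queued jobs, 8 devices, 1h per job.
    print(estimated_wait_hours(136, 8, 1))
```

With the 24h jobs mentioned above the same model gives a far longer wait, which is why the estimate is best read as a lower bound.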
An interactive way of running tests is not in scope for now. I am adding Milosz to get his thoughts on this subject.
To be clear, whilst it would be great to be able to reuse all the automatic VM construction and multi-job setup for interactive uses, fully automatic support is useful on its own.
However beware of release announcements declaring "we have added lisa support" without adding any caveats; that *would* imply support for interactive sessions.
I'm not sure I understand why support for interactive sessions is implied (I haven't used the tool). For WA we only support certain use cases and provide means for users to extend the pool of use cases. If we start supporting lisa, it will be the same scenario. Only some defined use cases will be supported. Interactive sessions are out of scope here.
milosz
Daniel.