Hi Neil et al,
I'm trying to debug a simple qemu job that goes straight from running to incomplete without creating any log (it used to work OK, but I have since reinstalled everything on a different machine).
Looking at /var/log/lava-server/lava-scheduler.log I see the following:
2015-12-09 15:22:27,838 [INFO] [lava_scheduler_daemon.job.JobRunner.14] starting job {u'timeout': 18000, 'health_check': False, u'job_name': u'qemu-arm-test', u'actions': [{u'command': u'deploy_linaro_kernel', u'parameters': {u'login_prompt': u'login:', u'kernel': u' http://images.validation.linaro.org/functional-test-images/qemu-arm/zImage-q...', u'username': u'root', u'rootfs': u' http://images.validation.linaro.org/functional-test-images/qemu-arm/core-ima..., {u'command': u'boot_linaro_image', u'parameters': {u'test_image_prompt': u'root@qemu-system-arm:~#'}}], u'target': u'qemu0'}
2015-12-09 15:22:27,838 [INFO] [lava_scheduler_daemon.job.MonitorJob] monitoring "setsid lava-server manage schedulermonitor 14 lava-dispatch qemu0 /tmp/tmpPd4nGs -l info -f /var/log/lava-server/lava-scheduler.log"
2015-12-09 15:22:29,171 [INFO] [lava_scheduler_daemon.job.Job.qemu0] executing "lava-dispatch /tmp/tmpFltuQQ --output-dir /var/lib/lava-server/default/media/job-output/job-14"
2015-12-09 15:22:30,388 [INFO] [lava_scheduler_daemon.job.DispatcherProcessProtocol] childConnectionLost for qemu0: 0
2015-12-09 15:22:30,389 [INFO] [lava_scheduler_daemon.job.DispatcherProcessProtocol] childConnectionLost for qemu0: 1
2015-12-09 15:22:30,389 [INFO] [lava_scheduler_daemon.job.DispatcherProcessProtocol] childConnectionLost for qemu0: 2
2015-12-09 15:22:30,389 [INFO] [lava_scheduler_daemon.job.DispatcherProcessProtocol] processExited for qemu0: A process has ended with a probable error condition: process ended with exit code 1.
2015-12-09 15:22:30,389 [INFO] [lava_scheduler_daemon.job.DispatcherProcessProtocol] processEnded for qemu0: A process has ended with a probable error condition: process ended with exit code 1.
2015-12-09 15:22:30,389 [INFO] [lava_scheduler_daemon.job.Job.qemu0] job finished on qemu0
2015-12-09 15:22:30,389 [INFO] [lava_scheduler_daemon.job.Job.qemu0] job incomplete: reported 1 exit code
2015-12-09 15:22:30,422 [INFO] [lava_scheduler_daemon.dbjobsource.DatabaseJobSource] job 14 completed on qemu0
I tried to run the monitor command manually:
setsid lava-server manage schedulermonitor 14 lava-dispatch qemu0 qemu-arm.json
powerci@lab-baylibre:~/POWERCI/scripts/user$ 2015-12-09 15:23:23,285 [ERROR] [lava_scheduler_daemon.job.Job.qemu0] AttributeError: 'Job' object has no attribute '_protocol'
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 1203, in mainLoop
    self.runUntilCurrent()
  File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 798, in runUntilCurrent
    f(*a, **kw)
  File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 393, in callback
    self._startRunCallbacks(result)
  File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 501, in _startRunCallbacks
    self._runCallbacks()
--- <exception caught here> ---
  File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 588, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/usr/lib/python2.7/dist-packages/lava_scheduler_daemon/job.py", line 226, in _run
    self.cancel(exc)
  File "/usr/lib/python2.7/dist-packages/lava_scheduler_daemon/job.py", line 157, in cancel
    self._protocol.transport.signalProcess(getattr(signal, signame))
exceptions.AttributeError: 'Job' object has no attribute '_protocol'
Note that I get the same issue with other jobs (boards, kvm): submission is OK, but the job ends up incomplete, with no log.
Any help would be much appreciated!
Many thanks, Marc.