On 30 January 2018 at 11:18, Chase Qi <chase.qi@linaro.org> wrote:
Hi,

I have lava-master and lava-slave v2018.1 installed, and a qemu device
added. Test jobs can be scheduled. Then I followed
https://validation.linaro.org/static/docs/v2/pipeline-server.html#using-zmq-authentication-and-encryption
to enable ZMQ authentication.


The docs may need an update.

/etc/lava-server/lava-logs also needs to be configured to support encryption.

For example, from the https://lava.codehelp.co.uk/ config:

 ENCRYPT="--encrypt"
 MASTER_CERT="--master-cert /etc/lava-dispatcher/certificates.d/master.key_secret"
 SLAVES_CERTS="--slaves-certs /etc/lava-dispatcher/certificates.d/"
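
As a quick sanity check, something like the sketch below could flag a service whose config has no ENCRYPT setting. This is only an illustration, not part of LAVA: the helper names `parse_settings` and `missing_encryption` are made up here, the parser assumes plain KEY="value" lines like the snippets in this thread, and the lava-logs contents shown are a hypothetical unconfigured file rather than anything taken from a real install.

```python
import re

# Matches simple KEY="value" lines, as used in the config snippets in this thread.
_ASSIGNMENT = re.compile(r'^\s*([A-Z_]+)="(.*)"\s*$')


def parse_settings(text):
    """Return a dict of KEY -> value for every KEY="value" line in text."""
    settings = {}
    for line in text.splitlines():
        match = _ASSIGNMENT.match(line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings


def missing_encryption(configs):
    """Return the sorted names of services whose ENCRYPT value lacks --encrypt."""
    return sorted(
        name
        for name, text in configs.items()
        if "--encrypt" not in parse_settings(text).get("ENCRYPT", "")
    )


# Hypothetical contents: lava-master roughly as posted, lava-logs left unconfigured.
configs = {
    "lava-master": (
        'ENCRYPT="--encrypt"\n'
        'SLAVES_CERTS="--slaves-certs /etc/lava-dispatcher/certificates.d/"'
    ),
    "lava-logs": 'LOGLEVEL="DEBUG"',
}
print(missing_encryption(configs))
```

In practice the strings would be read from the real files (for example /etc/lava-server/lava-logs) instead of being typed inline.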

The certificates were generated correctly, and the public certificates were
copied to the master and slave respectively, with the following configs:
lava-master
```
MASTER_SOCKET="--master-socket tcp://*:5556"

LOGLEVEL="DEBUG"

ENCRYPT="--encrypt"
MASTER_CERT="--master-cert /etc/lava-dispatcher/certificates.d/master.key_secret"
SLAVES_CERTS="--slaves-certs /etc/lava-dispatcher/certificates.d/"
```

lava-slave
```
MASTER_URL="tcp://192.168.11.214:5556"
LOGGER_URL="tcp://192.168.11.214:5555"

HOSTNAME="--hostname lava-slave1"

LOGLEVEL="DEBUG"

ENCRYPT="--encrypt"
MASTER_CERT="--master-cert /etc/lava-dispatcher/certificates.d/master.key"
SLAVE_CERT="--slave-cert /etc/lava-dispatcher/certificates.d/slave1.key_secret"
```

After restarting lava-master and lava-slave, I see the following logs.
It seems the connection was established, but lava-logs went offline.
lava-master
```
2018-01-30 11:05:50,260   DEBUG lava-slave1 => PING(20)
2018-01-30 11:05:52,086   DEBUG lava-master => PING(20)
2018-01-30 11:06:08,728   DEBUG lava-logs => PING(20)
2018-01-30 11:06:10,261    INFO scheduling health checks:
2018-01-30 11:06:10,270   DEBUG -> disabled on: lxc, qemu
2018-01-30 11:06:10,271    INFO scheduling jobs:
2018-01-30 11:06:10,272   DEBUG - lxc
2018-01-30 11:06:10,292   DEBUG - qemu
2018-01-30 11:06:10,332   DEBUG lava-slave1 => PING(20)
2018-01-30 11:06:12,115   DEBUG lava-master => PING(20)
2018-01-30 11:06:20,252    INFO [POLL] Received a signal, leaving
2018-01-30 11:06:20,254    INFO [CLOSE] Closing the controler socket and dropping messages
2018-01-30 11:06:21,203    INFO [INIT] Dropping privileges
2018-01-30 11:06:21,204   DEBUG Switching to (lavaserver(114), lavaserver(119))
2018-01-30 11:06:21,204    INFO [INIT] Marking all workers as offline
2018-01-30 11:06:21,209    INFO [INIT] Starting encryption
2018-01-30 11:06:21,211   DEBUG [INIT] Opening master certificate: /etc/lava-dispatcher/certificates.d/master.key_secret
2018-01-30 11:06:21,238   DEBUG [INIT] Using slaves certificates from: /etc/lava-dispatcher/certificates.d/
2018-01-30 11:06:21,245    INFO [INIT] LAVA master has started.
2018-01-30 11:06:21,246    INFO [INIT] Using protocol version 2
2018-01-30 11:06:41,247 WARNING lava-logs is offline: can't schedule jobs
2018-01-30 11:07:01,255 WARNING lava-logs is offline: can't schedule jobs
2018-01-30 11:07:04,433    INFO lava-slave1 => HELLO
2018-01-30 11:07:04,433 WARNING New dispatcher <lava-slave1>
2018-01-30 11:07:09,450   DEBUG lava-slave1 => PING(20)
2018-01-30 11:07:21,260 WARNING lava-logs is offline: can't schedule jobs
2018-01-30 11:07:29,477   DEBUG lava-slave1 => PING(20)
2018-01-30 11:07:41,265 WARNING lava-logs is offline: can't schedule jobs
```

lava-slave
```
2018-01-30 11:06:10,283   DEBUG PING => master (last message 20s ago)
2018-01-30 11:06:10,335   DEBUG master => PONG(20)
2018-01-30 11:06:30,356   DEBUG PING => master (last message 20s ago)
2018-01-30 11:07:04,379    INFO [INIT] LAVA slave has started.
2018-01-30 11:07:04,380    INFO [INIT] Using protocol version 2
2018-01-30 11:07:04,390    INFO [INIT] Starting encryption
2018-01-30 11:07:04,390   DEBUG Opening slave certificate: /etc/lava-dispatcher/certificates.d/slave1.key_secret
2018-01-30 11:07:04,413   DEBUG Opening master certificate: /etc/lava-dispatcher/certificates.d/master.key
2018-01-30 11:07:04,414    INFO [INIT] Connecting to master as <lava-slave1>
2018-01-30 11:07:04,415    INFO [INIT] Greeting the master => 'HELLO'
2018-01-30 11:07:04,440    INFO [INIT] Connection with master established
2018-01-30 11:07:04,442    INFO Master is ONLINE
2018-01-30 11:07:04,443    INFO Waiting for instructions
2018-01-30 11:07:09,450   DEBUG PING => master (last message 5s ago)
2018-01-30 11:07:09,455   DEBUG master => PONG(20)
```

From the Django admin console, I see that lava-slave1 is still online, but
both the lava-master and lava-logs workers went offline, and test jobs are
no longer being scheduled. Has anyone seen or hit this issue before? Any
advice or suggestions would be appreciated.


Thanks,
Chase