On Tue, 15 Sep 2020 at 10:32, Larry Shen <larry.shen@nxp.com> wrote:
Hi, Milosz,
See this: https://git.lavasoftware.org/lava/lava/-/merge_requests/1286/diffs#65156a950...
In fact, I don't really care whether I can download big logs. I only care about the "job submit failure" issue; I mentioned the big log download only because it looks like it may be related to the same issue. I'm not sure what happened here: is it our environment, or is it related to a LAVA code change?
Looks really weird. We're also running eventlet gunicorn and it actually improved things a lot (no more weird timeouts). Maybe Remi has some better idea.
milosz
-----Original Message-----
From: Milosz Wasilewski <milosz.wasilewski@linaro.org>
Sent: Tuesday, September 15, 2020 5:28 PM
To: Larry Shen <larry.shen@nxp.com>
Cc: lava-users@lists.lavasoftware.org
Subject: [EXT] Re: [Lava-users] Issues about XMLRPC & lavacli.
On Tue, 15 Sep 2020 at 04:04, Larry Shen <larry.shen@nxp.com> wrote:
Meanwhile, here is a strange issue, in case it helps:
In the past, when downloading big logs from the web UI, a log that was too big would time out and the download would fail.
But now, on 2020.08, we still hit the timeout. Shouldn't this work with the async worker?
What do you expect the async worker to do for such a big file download? Could this be an issue with our local settings?
I'm not sure if this was enabled by default. In the /lib/systemd/system/lava-server-gunicorn.service you should have WORKER_CLASS set to 'eventlet'. If this is not the case it's most likely the source of your trouble.
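A quick way to check this on the server is to scan the unit file for the setting. This is only a minimal sketch: it assumes the value appears on a line containing `WORKER_CLASS=<value>` (for example inside an `Environment=` entry), which is an assumption about the file layout, and `find_worker_class` is a hypothetical helper name, not part of LAVA.

```python
import re
from pathlib import Path
from typing import Optional

# Path taken from the message above.
UNIT_FILE = Path("/lib/systemd/system/lava-server-gunicorn.service")


def find_worker_class(unit_text: str) -> Optional[str]:
    """Return the last WORKER_CLASS value found in the unit text, or None.

    Assumption: the setting is written as WORKER_CLASS=<value>, optionally
    quoted, somewhere on a non-comment line.
    """
    value = None
    for line in unit_text.splitlines():
        stripped = line.strip()
        # systemd unit files treat lines starting with '#' or ';' as comments.
        if stripped.startswith(("#", ";")):
            continue
        match = re.search(r"\bWORKER_CLASS=['\"]?([\w.]+)", stripped)
        if match:
            value = match.group(1)
    return value


if __name__ == "__main__":
    if UNIT_FILE.exists():
        print("WORKER_CLASS:", find_worker_class(UNIT_FILE.read_text()))
    else:
        print("unit file not found:", UNIT_FILE)
```

If this prints anything other than `eventlet`, the async worker is probably not in use.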
milosz
From: Larry Shen
Sent: Tuesday, September 15, 2020 10:52 AM
To: lava-users@lists.lavasoftware.org
Subject: Issues about XMLRPC & lavacli.
Hi, guys,
We found an issue related to job submission:
- One team uses “lavacli” to submit requests, and it sometimes reports the following:
07-Sep-2020 16:37:35 Unable to connect: HTTPConnectionPool(host='lava-master.sw.nxp.com', port=80): Read timed out. (read timeout=20.0)
It looks like this error is raised in the following lavacli code. What do you think about this issue?
    try:
        # Create the Transport object
        parsed_uri = urlparse(uri)
        transport = RequestsTransport(
            parsed_uri.scheme,
            config.get("proxy"),
            config.get("timeout", 20.0),
            config.get("verify_ssl_cert", True),
        )
        # allow_none is True because the server does support it
        proxy = xmlrpc.client.ServerProxy(uri, allow_none=True, transport=transport)
        version = proxy.system.version()
    except (OSError, xmlrpc.client.Error) as exc:
        print("Unable to connect: %s" % exc2str(exc))
        return 1
- Another team wrote their own Python code that submits jobs over XMLRPC, roughly as in the snippet after the error below, and it reports the following:
ERROR in XMLRPC.py:submitJob:63 msg: Failed to submit job, reason: <ProtocolError for chuan.su:chuan.su@lava-master.sw.nxp.com/RPC2: 502 Bad Gateway>!
    try:
        job_id = self.connection.scheduler.submit_job(job)
        self.logger.debug("Successed to submit job , job_id: %d, platform; %s!", job_id, platform)
        return job_id
    except Exception as e:
        self.logger.error("Failed to submit job, reason: %s!", str(e))
        return None
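As a possible client-side workaround while the root cause is unclear, the submit call could be retried on transient errors. The following is only a sketch under stated assumptions: `call_with_retry` and `is_transient` are hypothetical helpers, and treating HTTP 502/503/504 and `OSError` (which covers socket timeouts) as transient is my own choice, not something LAVA defines.

```python
import time
import xmlrpc.client

# HTTP statuses treated as transient here; this set is an assumption.
TRANSIENT_HTTP_CODES = {502, 503, 504}


def is_transient(exc: Exception) -> bool:
    """Guess whether an XML-RPC failure is worth retrying."""
    if isinstance(exc, xmlrpc.client.ProtocolError):
        return exc.errcode in TRANSIENT_HTTP_CODES
    # Socket timeouts and connection errors are OSError subclasses.
    return isinstance(exc, OSError)


def call_with_retry(func, *args, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(*args), retrying transient failures with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return func(*args)
        except Exception as exc:
            # Give up on the last attempt or on non-transient errors.
            if attempt == attempts or not is_transient(exc):
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

It could be used as `job_id = call_with_retry(self.connection.scheduler.submit_job, job)`. One caveat: if the server accepted the job but the response was lost (for example on a read timeout), a retry may submit the same job twice, so this only hides the symptom and does not fix the cause.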
We are currently using LAVA server version 2020.08. Colleagues told me that we hit similar failures in the past, but only very rarely. Recently they have become very frequent.
I’d like to know whether this could be related to your change to the gunicorn eventlet worker, or whether there are other possible causes.
Thanks,
Larry