Hi, guys,
We've found an issue related to job submission:
1) One team uses "lavacli" to submit requests, and it sometimes reports:
07-Sep-2020 16:37:35 Unable to connect: HTTPConnectionPool(host='lava-master.sw.nxp.com', port=80): Read timed out. (read timeout=20.0)
This error appears to be raised by the following lavacli code. What do you think about this issue?
    try:
        # Create the Transport object
        parsed_uri = urlparse(uri)
        transport = RequestsTransport(
            parsed_uri.scheme,
            config.get("proxy"),
            config.get("timeout", 20.0),
            config.get("verify_ssl_cert", True),
        )
        # allow_none is True because the server does support it
        proxy = xmlrpc.client.ServerProxy(uri, allow_none=True, transport=transport)
        version = proxy.system.version()
    except (OSError, xmlrpc.client.Error) as exc:
        print("Unable to connect: %s" % exc2str(exc))
        return 1
2) Another team wrote their own Python code that uses XML-RPC to submit jobs, roughly like the snippet below, and it reports:
ERROR in XMLRPC.py:submitJob:63 msg: Failed to submit job, reason: <ProtocolError for chuan.su:chuan.su@lava-master.sw.nxp.com/RPC2: 502 Bad Gateway>!
    try:
        job_id = self.connection.scheduler.submit_job(job)
        self.logger.debug("Succeeded to submit job, job_id: %d, platform: %s!", job_id, platform)
        return job_id
    except Exception as e:
        self.logger.error("Failed to submit job, reason: %s!", str(e))
        return None
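Since the 502s are intermittent, one pragmatic mitigation on the client side is a retry with backoff around submit_job. A minimal sketch (this is not LAVA API, just an illustrative wrapper; it assumes `connection` is an xmlrpc.client.ServerProxy for the /RPC2 endpoint, as in the snippet above):

```python
import time
import xmlrpc.client

def submit_with_retry(connection, job, retries=3, delay=5.0):
    """Retry scheduler.submit_job on transient failures.

    Sketch only: the retry count and backoff are illustrative.
    A 502 Bad Gateway surfaces as xmlrpc.client.ProtocolError;
    low-level connection problems surface as OSError subclasses.
    """
    last_exc = None
    for attempt in range(1, retries + 1):
        try:
            return connection.scheduler.submit_job(job)
        except (OSError, xmlrpc.client.ProtocolError) as exc:
            last_exc = exc
            time.sleep(delay * attempt)  # simple linear backoff
    raise last_exc
```

Note that with a 502 the request may have reached the server before the gateway failed, so retrying can in principle submit the same job twice; whether that matters depends on your workflow.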
We are currently using LAVA server 2020.08. Colleagues told me that in the past days we also encountered similar failures, but with very low probability; recently the probability has become very high. I'd like to know whether this could be related to your change to the gunicorn eventlet worker, or whether there are other possible causes?
Thanks, Larry
Meanwhile, a strange issue, in case it helps:
In the past, when downloading big logs from the web UI, a log that was too big would hit the timeout and the download would fail. But we still hit that timeout in 2020.08; shouldn't it work now with the async worker? What behavior do you expect from the async worker for big file downloads? Or could this be an issue with our local settings?
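As a workaround for big logs, streaming the plain-text log over HTTP to disk avoids buffering the whole file and keeps the timeout per socket read rather than per transfer. A hedged sketch; the '/scheduler/job/<id>/log_file/plain' path is an assumption about the web UI routes, so adjust it for your instance:

```python
import shutil
import urllib.request

def download_log(base_url, job_id, dest, timeout=60):
    """Stream a job log to disk instead of buffering it in memory.

    Sketch only: the URL path below is an assumption about the
    LAVA web UI; check your instance's routes before relying on it.
    """
    url = "%s/scheduler/job/%s/log_file/plain" % (base_url.rstrip("/"), job_id)
    # The timeout applies per socket operation, not to the whole
    # transfer, so a big file may legitimately take longer overall.
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        with open(dest, "wb") as fh:
            shutil.copyfileobj(resp, fh, length=1 << 20)  # 1 MiB chunks
```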
From: Larry Shen Sent: Tuesday, September 15, 2020 10:52 AM To: lava-users@lists.lavasoftware.org Subject: Issues about XMLRPC & lavacli.
On Tue, 15 Sep 2020 at 04:04, Larry Shen larry.shen@nxp.com wrote:
I'm not sure if this was enabled by default. In the /lib/systemd/system/lava-server-gunicorn.service you should have WORKER_CLASS set to 'eventlet'. If this is not the case it's most likely the source of your trouble.
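If it helps to script that check, here is a small sketch that scans unit-file text for WORKER_CLASS. The parsing is deliberately naive (good enough for a quick look, not a systemd parser), and the file path is the one mentioned above:

```python
import re

def gunicorn_worker_class(unit_text):
    """Return the WORKER_CLASS value found in systemd unit text, or None.

    Naive sketch: matches WORKER_CLASS=... anywhere in the text, which
    covers Environment= lines as well as sourced environment files.
    """
    m = re.search(r"""WORKER_CLASS\s*=\s*['"]?([\w.]+)""", unit_text)
    return m.group(1) if m else None

# On the server, something like:
# with open("/lib/systemd/system/lava-server-gunicorn.service") as fh:
#     print(gunicorn_worker_class(fh.read()))
```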
milosz
Lava-users mailing list Lava-users@lists.lavasoftware.org https://lists.lavasoftware.org/mailman/listinfo/lava-users
Hi, Milosz,
See this: https://git.lavasoftware.org/lava/lava/-/merge_requests/1286/diffs#65156a950...
In fact I don't care much about downloading big logs; what I care about is the "job submit failure" issue. I mentioned the big log download only because it looks like it may be related to the same problem. I'm not sure what is happening here: our environment, or a LAVA code change?
-----Original Message----- From: Milosz Wasilewski milosz.wasilewski@linaro.org Sent: Tuesday, September 15, 2020 5:28 PM To: Larry Shen larry.shen@nxp.com Cc: lava-users@lists.lavasoftware.org Subject: [EXT] Re: [Lava-users] Issues about XMLRPC & lavacli.
On Tue, 15 Sep 2020 at 10:32, Larry Shen larry.shen@nxp.com wrote:
Looks really weird. We're also running eventlet gunicorn and it actually improved things a lot (no more weird timeouts). Maybe Remi has some better idea.
milosz
I just checked the logs; there appears to be nothing in the server log.
However, we run the master in a container, and we get the following from the docker logs. These entries show up when users hit the occasional job submission failures. Could they be the cause? What do they mean?
10.193.108.249 - - [15/Sep/2020:02:12:01 +0000] "POST /RPC2 HTTP/1.1" 200 587 "-" "lavacli v0.9.7"
10.192.244.28 - - [15/Sep/2020:02:12:01 +0000] "GET /scheduler/job/109711/job_status HTTP/1.1" 200 634 "http://lava-master.sw.nxp.com/scheduler/job/109711" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36"
10.192.244.28 - - [15/Sep/2020:02:12:01 +0000] "GET /scheduler/job/109711/log_pipeline_incremental?line=102 HTTP/1.1" 200 6071 "http://lava-master.sw.nxp.com/scheduler/job/109711" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36"
10.193.108.249 - - [15/Sep/2020:02:12:02 +0000] "POST /RPC2 HTTP/1.1" 200 433 "-" "lavacli v0.9.7"
ERROR:linaro-django-xmlrpc-dispatcher:Internal error in the XML-RPC dispatcher while calling method 'scheduler.jobs.show' with ('Unable',)
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/linaro_django_xmlrpc/models.py", line 441, in dispatch
    return impl(*params)
  File "/usr/lib/python3/dist-packages/lava_scheduler_app/api/jobs.py", line 383, in show
    job = TestJob.get_by_job_number(job_id)
  File "/usr/lib/python3/dist-packages/lava_scheduler_app/models.py", line 2010, in get_by_job_number
    job = query.get(pk=job_id)
  File "/usr/lib/python3/dist-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/django/db/models/query.py", line 371, in get
    clone = self.filter(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/django/db/models/query.py", line 787, in filter
    return self._filter_or_exclude(False, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/django/db/models/query.py", line 805, in _filter_or_exclude
    clone.query.add_q(Q(*args, **kwargs))
  File "/usr/lib/python3/dist-packages/django/db/models/sql/query.py", line 1250, in add_q
    clause, _ = self._add_q(q_object, self.used_aliases)
  File "/usr/lib/python3/dist-packages/django/db/models/sql/query.py", line 1276, in _add_q
    allow_joins=allow_joins, split_subq=split_subq,
  File "/usr/lib/python3/dist-packages/django/db/models/sql/query.py", line 1210, in build_filter
    condition = self.build_lookup(lookups, col, value)
  File "/usr/lib/python3/dist-packages/django/db/models/sql/query.py", line 1104, in build_lookup
    return final_lookup(lhs, rhs)
  File "/usr/lib/python3/dist-packages/django/db/models/lookups.py", line 24, in __init__
    self.rhs = self.get_prep_lookup()
  File "/usr/lib/python3/dist-packages/django/db/models/lookups.py", line 74, in get_prep_lookup
    return self.lhs.output_field.get_prep_value(self.rhs)
  File "/usr/lib/python3/dist-packages/django/db/models/fields/__init__.py", line 966, in get_prep_value
    return int(value)
ValueError: invalid literal for int() with base 10: 'Unable'
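For what it's worth, that traceback looks like a separate problem from the 502s: some client called scheduler.jobs.show with the literal string 'Unable' instead of a job number, as if an error message was scraped and reused as a job id. A sketch of a client-side guard (the helper name is mine, not LAVA API):

```python
def show_job(proxy, job_id):
    """Call scheduler.jobs.show only if job_id looks like a job number.

    Sketch only: LAVA job ids are integers (multinode sub-jobs look
    like "1234.1"); anything else would fail server-side exactly as
    in the traceback above, so reject it before sending.
    """
    if not str(job_id).replace(".", "", 1).isdigit():
        raise ValueError("not a LAVA job id: %r" % (job_id,))
    return proxy.scheduler.jobs.show(job_id)
```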
-----Original Message----- From: Milosz Wasilewski milosz.wasilewski@linaro.org Sent: Tuesday, September 15, 2020 5:39 PM To: Larry Shen larry.shen@nxp.com Cc: lava-users@lists.lavasoftware.org Subject: Re: [EXT] Re: [Lava-users] Issues about XMLRPC & lavacli.
Have you tried increasing the lavacli default timeout? Maybe the network connection to the server is flaky?
Rgds
On Tue, 15 Sep 2020 at 12:06, Larry Shen larry.shen@nxp.com wrote:
I just checked the log, looks there is nothing in server log.
Just, we use container master, and get next from docker logs, these happened when user sometimes submit job failure, will it possible be the cause? What does it mean?
{"log":"10.193.108.249 - - [15/Sep/2020:02:12:01 +0000] "POST /RPC2 HTTP/1.1" 200 587 "-" "lavacli v0.9.7"\n","stream":"stdout","time":"2020-09-15T02:12:02.222152631Z"} {"log":"10.192.244.28 - - [15/Sep/2020:02:12:01 +0000] "GET /scheduler/job/109711/job_status HTTP/1.1" 200 634 " http://lava-master.sw.nxp.com/scheduler/job/109711%5C" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36"\n","stream":"stdout","time":"2020-09-15T02:12:02.222155609Z"} {"log":"10.192.244.28 - - [15/Sep/2020:02:12:01 +0000] "GET /scheduler/job/109711/log_pipeline_incremental?line=102 HTTP/1.1" 200 6071 "http://lava-master.sw.nxp.com/scheduler/job/109711%5C" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36"\n","stream":"stdout","time":"2020-09-15T02:12:02.222178222Z"} {"log":"10.193.108.249 - - [15/Sep/2020:02:12:02 +0000] "POST /RPC2 HTTP/1.1" 200 433 "-" "lavacli v0.9.7"\n","stream":"stdout","time":"2020-09-15T02:12:02.22218312Z"} {"log":"ERROR:linaro-django-xmlrpc-dispatcher:Internal error in the XML-RPC dispatcher while calling method 'scheduler.jobs.show' with ('Unable',)\n","stream":"stderr","time":"2020-09-15T02:12:02.670767388Z"} {"log":"Traceback (most recent call last):\n","stream":"stderr","time":"2020-09-15T02:12:02.670790089Z"} {"log":" File "/usr/lib/python3/dist-packages/linaro_django_xmlrpc/models.py", line 441, in dispatch\n","stream":"stderr","time":"2020-09-15T02:12:02.670793859Z"} {"log":" return impl(*params)\n","stream":"stderr","time":"2020-09-15T02:12:02.670796867Z"} {"log":" File "/usr/lib/python3/dist-packages/lava_scheduler_app/api/jobs.py", line 383, in show\n","stream":"stderr","time":"2020-09-15T02:12:02.67079951Z"} {"log":" job = TestJob.get_by_job_number(job_id)\n","stream":"stderr","time":"2020-09-15T02:12:02.670802465Z"} {"log":" File "/usr/lib/python3/dist-packages/lava_scheduler_app/models.py", line 2010, in 
get_by_job_number\n","stream":"stderr","time":"2020-09-15T02:12:02.670805034Z"} {"log":" job = query.get(pk=job_id)\n","stream":"stderr","time":"2020-09-15T02:12:02.670807695Z"} {"log":" File "/usr/lib/python3/dist-packages/django/db/models/manager.py", line 85, in manager_method\n","stream":"stderr","time":"2020-09-15T02:12:02.670810275Z"} {"log":" return getattr(self.get_queryset(), name)(*args, **kwargs)\n","stream":"stderr","time":"2020-09-15T02:12:02.67081302Z"} {"log":" File "/usr/lib/python3/dist-packages/django/db/models/query.py", line 371, in get\n","stream":"stderr","time":"2020-09-15T02:12:02.670815631Z"} {"log":" clone = self.filter(*args, **kwargs)\n","stream":"stderr","time":"2020-09-15T02:12:02.67081854Z"} {"log":" File "/usr/lib/python3/dist-packages/django/db/models/query.py", line 787, in filter\n","stream":"stderr","time":"2020-09-15T02:12:02.670821466Z"} {"log":" return self._filter_or_exclude(False, *args, **kwargs)\n","stream":"stderr","time":"2020-09-15T02:12:02.670833942Z"} {"log":" File "/usr/lib/python3/dist-packages/django/db/models/query.py", line 805, in _filter_or_exclude\n","stream":"stderr","time":"2020-09-15T02:12:02.670836896Z"} {"log":" clone.query.add_q(Q(*args, **kwargs))\n","stream":"stderr","time":"2020-09-15T02:12:02.670839627Z"} {"log":" File "/usr/lib/python3/dist-packages/django/db/models/sql/query.py", line 1250, in add_q\n","stream":"stderr","time":"2020-09-15T02:12:02.670842124Z"} {"log":" clause, _ = self._add_q(q_object, self.used_aliases)\n","stream":"stderr","time":"2020-09-15T02:12:02.670844738Z"} {"log":" File "/usr/lib/python3/dist-packages/django/db/models/sql/query.py", line 1276, in _add_q\n","stream":"stderr","time":"2020-09-15T02:12:02.670847208Z"} {"log":" allow_joins=allow_joins, split_subq=split_subq,\n","stream":"stderr","time":"2020-09-15T02:12:02.670849936Z"} {"log":" File "/usr/lib/python3/dist-packages/django/db/models/sql/query.py", line 1210, in 
build_filter\n","stream":"stderr","time":"2020-09-15T02:12:02.670852387Z"} {"log":" condition = self.build_lookup(lookups, col, value)\n","stream":"stderr","time":"2020-09-15T02:12:02.670855092Z"} {"log":" File "/usr/lib/python3/dist-packages/django/db/models/sql/query.py", line 1104, in build_lookup\n","stream":"stderr","time":"2020-09-15T02:12:02.670857556Z"} {"log":" return final_lookup(lhs, rhs)\n","stream":"stderr","time":"2020-09-15T02:12:02.67086024Z"} {"log":" File "/usr/lib/python3/dist-packages/django/db/models/lookups.py", line 24, in __init__\n","stream":"stderr","time":"2020-09-15T02:12:02.670862672Z"} {"log":" self.rhs = self.get_prep_lookup()\n","stream":"stderr","time":"2020-09-15T02:12:02.670877253Z"} {"log":" File "/usr/lib/python3/dist-packages/django/db/models/lookups.py", line 74, in get_prep_lookup\n","stream":"stderr","time":"2020-09-15T02:12:02.670880038Z"} {"log":" return self.lhs.output_field.get_prep_value(self.rhs)\n","stream":"stderr","time":"2020-09-15T02:12:02.670882674Z"} {"log":" File "/usr/lib/python3/dist-packages/django/db/models/fields/__init__.py", line 966, in get_prep_value\n","stream":"stderr","time":"2020-09-15T02:12:02.670886231Z"} {"log":" return int(value)\n","stream":"stderr","time":"2020-09-15T02:12:02.67088904Z"} {"log":"ValueError: invalid literal for int() with base 10: 'Unable'\n","stream":"stderr","time":"2020-09-15T02:12:02.670891628Z"} {"log":"\n","stream":"stdout","time":"2020-09-15T02:12:03.22232776Z"} {"log":"==\u003e gunicorn.log \u003c==\n","stream":"stdout","time":"2020-09-15T02:12:03.222350349Z"}
-----Original Message----- From: Milosz Wasilewski milosz.wasilewski@linaro.org Sent: Tuesday, September 15, 2020 5:39 PM To: Larry Shen larry.shen@nxp.com Cc: lava-users@lists.lavasoftware.org Subject: Re: [EXT] Re: [Lava-users] Issues about XMLRPC & lavacli.
Caution: EXT Email
On Tue, 15 Sep 2020 at 10:32, Larry Shen larry.shen@nxp.com wrote:
Hi, Milosz,
See this: https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit. lavasoftware.org%2Flava%2Flava%2F-%2Fmerge_requests%2F1286%2Fdiffs%236 5156a95098dc512e7a4b7047ea511332947f649&data=02%7C01%7Clarry.shen% 40nxp.com%7Cc3cc54011c6f40b1781908d8595b2235%7C686ea1d3bc2b4c6fa92cd99 c5c301635%7C0%7C0%7C637357595236580077&sdata=wY2cJ%2F6kyfgkusZylxF QROEF8Dwew%2FvQg%2FaiBsPU8Og%3D&reserved=0
In fact I don't care if I can download big logs. Just care the issue "job submit failure", mentioned big log download
just looks maybe related to same issue...
I'm not sure what happened here, our environment or lava code change
related...?
Looks really weird. We're also running eventlet gunicorn and it actually improved things a lot (no more weird timeouts). Maybe Remi has some better idea.
milosz
-----Original Message----- From: Milosz Wasilewski milosz.wasilewski@linaro.org Sent: Tuesday, September 15, 2020 5:28 PM To: Larry Shen larry.shen@nxp.com Cc: lava-users@lists.lavasoftware.org Subject: [EXT] Re: [Lava-users] Issues about XMLRPC & lavacli.
Caution: EXT Email
On Tue, 15 Sep 2020 at 04:04, Larry Shen larry.shen@nxp.com wrote:
Meanwhile, a strange issue in case help:
In the past, when download big logs on the web, if the log too big, it
should be timeout, then failed to download.
But, now, we are still timeout in 2020.08, isn’t it should be ok with
async worker?
What’s your expect with async for this big file download? Possible our
local setting issues?
I'm not sure if this was enabled by default. In the
/lib/systemd/system/lava-server-gunicorn.service you should have WORKER_CLASS set to 'eventlet'. If this is not the case it's most likely the source of your trouble.
milosz
From: Larry Shen Sent: Tuesday, September 15, 2020 10:52 AM To: lava-users@lists.lavasoftware.org Subject: Issues about XMLRPC & lavacli.
Hi, guys,
We find an issue related to job submit:
- One team use “lavacli” to submit request, and sometimes it will
report next:
07-Sep-2020 16:37:35 Unable to connect:
HTTPConnectionPool(host='lava-master.sw.nxp.com', port=80): Read timed out. (read timeout=20.0)
Looks this error happens at next, what do you think about this issue?
try:
    # Create the Transport object
    parsed_uri = urlparse(uri)
    transport = RequestsTransport(
        parsed_uri.scheme,
        config.get("proxy"),
        config.get("timeout", 20.0),
        config.get("verify_ssl_cert", True),
    )
    # allow_none is True because the server does support it
    proxy = xmlrpc.client.ServerProxy(uri, allow_none=True, transport=transport)
    version = proxy.system.version()
except (OSError, xmlrpc.client.Error) as exc:
    print("Unable to connect: %s" % exc2str(exc))
    return 1
- Another team wrote their own Python code using XML-RPC to submit jobs, something like the following, and it reports:
ERROR in XMLRPC.py:submitJob:63 msg: Failed to submit job, reason: <ProtocolError for chuan.su:chuan.su@lava-master.sw.nxp.com/RPC2: 502 Bad Gateway>!
try:
    job_id = self.connection.scheduler.submit_job(job)
    self.logger.debug("Succeeded to submit job, job_id: %d, platform: %s!", job_id, platform)
    return job_id
except Exception as e:
    self.logger.error("Failed to submit job, reason: %s!", str(e))
    return None
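As a sketch of that submission path with an explicit timeout, so that a slow server fails with a clear read-timeout error instead of an opaque hang (TimeoutTransport and submit_job are hypothetical names for illustration; only scheduler.submit_job comes from the thread):

```python
import xmlrpc.client


class TimeoutTransport(xmlrpc.client.Transport):
    """Transport that applies a per-connection socket timeout."""

    def __init__(self, timeout=300.0):
        super().__init__()
        self.timeout = timeout

    def make_connection(self, host):
        # Reuse the stdlib connection setup, then set the socket timeout
        conn = super().make_connection(host)
        conn.timeout = self.timeout
        return conn


def submit_job(uri, job_definition, timeout=300.0):
    # Hypothetical helper mirroring the snippet above: returns the job id
    # on success, None on failure, but with an explicit timeout.
    proxy = xmlrpc.client.ServerProxy(
        uri, allow_none=True, transport=TimeoutTransport(timeout)
    )
    try:
        return proxy.scheduler.submit_job(job_definition)
    except (OSError, xmlrpc.client.Error) as exc:
        print("Failed to submit job, reason: %s!" % exc)
        return None
```

This is essentially what lavacli's RequestsTransport does with its config.get("timeout", 20.0); raising the timeout only changes when the failure is reported if the server is genuinely stuck.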
We are currently using LAVA server version 2020.08. People told me that in past days we also encountered similar failures, but with very low probability; recently the probability has become very high.
I'd like to know whether this could be related to your switch to the gunicorn eventlet worker, or whether there are other possible reasons.
Thanks,
Larry
Lava-users mailing list Lava-users@lists.lavasoftware.org https://lists.lavasoftware.org/mailman/listinfo/lava-users
Yes, Remi, we tried it, like this:

staging:
  token: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  uri: http://lava-staging.sw.nxp.com/RPC2
  username: larry.shen
  timeout: 300.0

The only result is that the error message becomes:

Unable to connect: HTTPConnectionPool(host='lava-master.sw.nxp.com', port=80): Read timed out. (read timeout=300.0)

And the other team, which uses XML-RPC directly to submit without setting a timeout, gets a 502 right away. I think the root cause is the same: the server is handling connections but cannot respond to the client, so lavacli times out even at 300 seconds.
From: Remi Duraffort <remi.duraffort@linaro.org>
Sent: Tuesday, September 15, 2020 8:14 PM
To: Larry Shen <larry.shen@nxp.com>
Cc: Milosz Wasilewski <milosz.wasilewski@linaro.org>; lava-users@lists.lavasoftware.org
Subject: Re: [Lava-users] [EXT] Re: Issues about XMLRPC & lavacli.
Have you tried increasing the lavacli default timeout? Maybe the network connection to the server is flaky?
Rgds
On Tue, 15 Sep 2020 at 12:06, Larry Shen <larry.shen@nxp.com> wrote:
I just checked the log; it looks like there is nothing in the server log.
We run the master in a container, and we get the following from docker logs. These entries appeared at the times when users failed to submit jobs. Could this be the cause? What does it mean?
{"log":"10.193.108.249 - - [15/Sep/2020:02:12:01 +0000] "POST /RPC2 HTTP/1.1" 200 587 "-" "lavacli v0.9.7"\n","stream":"stdout","time":"2020-09-15T02:12:02.222152631Z"}
{"log":"10.192.244.28 - - [15/Sep/2020:02:12:01 +0000] "GET /scheduler/job/109711/job_status HTTP/1.1" 200 634 "http://lava-master.sw.nxp.com/scheduler/job/109711" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36"\n","stream":"stdout","time":"2020-09-15T02:12:02.222155609Z"}
{"log":"10.192.244.28 - - [15/Sep/2020:02:12:01 +0000] "GET /scheduler/job/109711/log_pipeline_incremental?line=102 HTTP/1.1" 200 6071 "http://lava-master.sw.nxp.com/scheduler/job/109711" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36"\n","stream":"stdout","time":"2020-09-15T02:12:02.222178222Z"}
{"log":"10.193.108.249 - - [15/Sep/2020:02:12:02 +0000] "POST /RPC2 HTTP/1.1" 200 433 "-" "lavacli v0.9.7"\n","stream":"stdout","time":"2020-09-15T02:12:02.22218312Z"}
{"log":"ERROR:linaro-django-xmlrpc-dispatcher:Internal error in the XML-RPC dispatcher while calling method 'scheduler.jobs.show' with ('Unable',)\n","stream":"stderr","time":"2020-09-15T02:12:02.670767388Z"}
{"log":"Traceback (most recent call last):\n","stream":"stderr","time":"2020-09-15T02:12:02.670790089Z"}
{"log":"  File "/usr/lib/python3/dist-packages/linaro_django_xmlrpc/models.py", line 441, in dispatch\n","stream":"stderr","time":"2020-09-15T02:12:02.670793859Z"}
{"log":"    return impl(*params)\n","stream":"stderr","time":"2020-09-15T02:12:02.670796867Z"}
{"log":"  File "/usr/lib/python3/dist-packages/lava_scheduler_app/api/jobs.py", line 383, in show\n","stream":"stderr","time":"2020-09-15T02:12:02.67079951Z"}
{"log":"    job = TestJob.get_by_job_number(job_id)\n","stream":"stderr","time":"2020-09-15T02:12:02.670802465Z"}
{"log":"  File "/usr/lib/python3/dist-packages/lava_scheduler_app/models.py", line 2010, in get_by_job_number\n","stream":"stderr","time":"2020-09-15T02:12:02.670805034Z"}
{"log":"    job = query.get(pk=job_id)\n","stream":"stderr","time":"2020-09-15T02:12:02.670807695Z"}
{"log":"  File "/usr/lib/python3/dist-packages/django/db/models/manager.py", line 85, in manager_method\n","stream":"stderr","time":"2020-09-15T02:12:02.670810275Z"}
{"log":"    return getattr(self.get_queryset(), name)(*args, **kwargs)\n","stream":"stderr","time":"2020-09-15T02:12:02.67081302Z"}
{"log":"  File "/usr/lib/python3/dist-packages/django/db/models/query.py", line 371, in get\n","stream":"stderr","time":"2020-09-15T02:12:02.670815631Z"}
{"log":"    clone = self.filter(*args, **kwargs)\n","stream":"stderr","time":"2020-09-15T02:12:02.67081854Z"}
{"log":"  File "/usr/lib/python3/dist-packages/django/db/models/query.py", line 787, in filter\n","stream":"stderr","time":"2020-09-15T02:12:02.670821466Z"}
{"log":"    return self._filter_or_exclude(False, *args, **kwargs)\n","stream":"stderr","time":"2020-09-15T02:12:02.670833942Z"}
{"log":"  File "/usr/lib/python3/dist-packages/django/db/models/query.py", line 805, in _filter_or_exclude\n","stream":"stderr","time":"2020-09-15T02:12:02.670836896Z"}
{"log":"    clone.query.add_q(Q(*args, **kwargs))\n","stream":"stderr","time":"2020-09-15T02:12:02.670839627Z"}
{"log":"  File "/usr/lib/python3/dist-packages/django/db/models/sql/query.py", line 1250, in add_q\n","stream":"stderr","time":"2020-09-15T02:12:02.670842124Z"}
{"log":"    clause, _ = self._add_q(q_object, self.used_aliases)\n","stream":"stderr","time":"2020-09-15T02:12:02.670844738Z"}
{"log":"  File "/usr/lib/python3/dist-packages/django/db/models/sql/query.py", line 1276, in _add_q\n","stream":"stderr","time":"2020-09-15T02:12:02.670847208Z"}
{"log":"    allow_joins=allow_joins, split_subq=split_subq,\n","stream":"stderr","time":"2020-09-15T02:12:02.670849936Z"}
{"log":"  File "/usr/lib/python3/dist-packages/django/db/models/sql/query.py", line 1210, in build_filter\n","stream":"stderr","time":"2020-09-15T02:12:02.670852387Z"}
{"log":"    condition = self.build_lookup(lookups, col, value)\n","stream":"stderr","time":"2020-09-15T02:12:02.670855092Z"}
{"log":"  File "/usr/lib/python3/dist-packages/django/db/models/sql/query.py", line 1104, in build_lookup\n","stream":"stderr","time":"2020-09-15T02:12:02.670857556Z"}
{"log":"    return final_lookup(lhs, rhs)\n","stream":"stderr","time":"2020-09-15T02:12:02.67086024Z"}
{"log":"  File "/usr/lib/python3/dist-packages/django/db/models/lookups.py", line 24, in __init__\n","stream":"stderr","time":"2020-09-15T02:12:02.670862672Z"}
{"log":"    self.rhs = self.get_prep_lookup()\n","stream":"stderr","time":"2020-09-15T02:12:02.670877253Z"}
{"log":"  File "/usr/lib/python3/dist-packages/django/db/models/lookups.py", line 74, in get_prep_lookup\n","stream":"stderr","time":"2020-09-15T02:12:02.670880038Z"}
{"log":"    return self.lhs.output_field.get_prep_value(self.rhs)\n","stream":"stderr","time":"2020-09-15T02:12:02.670882674Z"}
{"log":"  File "/usr/lib/python3/dist-packages/django/db/models/fields/__init__.py", line 966, in get_prep_value\n","stream":"stderr","time":"2020-09-15T02:12:02.670886231Z"}
{"log":"    return int(value)\n","stream":"stderr","time":"2020-09-15T02:12:02.67088904Z"}
{"log":"ValueError: invalid literal for int() with base 10: 'Unable'\n","stream":"stderr","time":"2020-09-15T02:12:02.670891628Z"}
{"log":"\n","stream":"stdout","time":"2020-09-15T02:12:03.22232776Z"}
{"log":"==\u003e gunicorn.log \u003c==\n","stream":"stdout","time":"2020-09-15T02:12:03.222350349Z"}
-----Original Message-----
From: Milosz Wasilewski <milosz.wasilewski@linaro.org>
Sent: Tuesday, September 15, 2020 5:39 PM
To: Larry Shen <larry.shen@nxp.com>
Cc: lava-users@lists.lavasoftware.org
Subject: Re: [EXT] Re: [Lava-users] Issues about XMLRPC & lavacli.
On Tue, 15 Sep 2020 at 10:32, Larry Shen <larry.shen@nxp.com> wrote:
Hi, Milosz,
See this: https://git.lavasoftware.org/lava/lava/-/merge_requests/1286/diffs#65156a95098dc512e7a4b7047ea511332947f649
--
Rémi Duraffort
LAVA Architect
Linaro
Are you in the same network as the server?
On Wed, 16 Sep 2020 at 04:37, Larry Shen <larry.shen@nxp.com> wrote:
--
Rémi Duraffort
LAVA Architect
Linaro
No, not in the same network.
And I checked the log again; it seems that every time the 502 issue happens, there is something like this:

{"log":"[Tue Sep 15 02:14:58.023302 2020] [proxy:error] [pid 332:tid 139822047946496] (32)Broken pipe: [client 10.192.244.203:54182] AH01084: pass request body failed to 127.0.0.1:8000 (127.0.0.1)\n","stream":"stdout","time":"2020-09-15T02:14:58.319479074Z"}
{"log":"[Tue Sep 15 02:14:58.023352 2020] [proxy_http:error] [pid 332:tid 139822047946496] [client 10.192.244.203:54182] AH01097: pass request body failed to 127.0.0.1:8000 (127.0.0.1) from 10.192.244.203 ()\n","stream":"stdout","time":"2020-09-15T02:14:58.319481826Z"}
What is 127.0.0.1:8000? The reverse proxy in the LAVA setup? Any suggestions?
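The AH01084/AH01097 errors come from apache's mod_proxy: 127.0.0.1:8000 is the backend it forwards requests to, which in a LAVA setup is typically the gunicorn (lava-server-gunicorn) listener. A minimal sketch of such a reverse-proxy vhost, as an illustration rather than LAVA's shipped configuration (ServerName and paths taken from the logs above; the directives are standard mod_proxy):

```
<VirtualHost *:80>
    ServerName lava-master.sw.nxp.com

    # Forward XML-RPC and everything else to gunicorn on 127.0.0.1:8000.
    # "Broken pipe: pass request body failed" means apache accepted the
    # request but the gunicorn side closed the socket before the body was
    # fully forwarded, which the client then sees as a 502 Bad Gateway.
    ProxyPass "/" "http://127.0.0.1:8000/"
    ProxyPassReverse "/" "http://127.0.0.1:8000/"
    ProxyTimeout 300
</VirtualHost>
```

Raising ProxyTimeout only delays when the 502 is reported; if the gunicorn workers are blocked, the proxy will still eventually fail.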
From: Remi Duraffort <remi.duraffort@linaro.org>
Sent: Thursday, September 17, 2020 3:37 PM
To: Larry Shen <larry.shen@nxp.com>
Cc: Milosz Wasilewski <milosz.wasilewski@linaro.org>; lava-users@lists.lavasoftware.org
Subject: Re: [Lava-users] [EXT] Re: Issues about XMLRPC & lavacli.
Are you in the same network as the server?
Le mer. 16 sept. 2020 à 04:37, Larry Shen <larry.shen@nxp.commailto:larry.shen@nxp.com> a écrit : Yes, Remi, we tried it like next:
staging: token: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx uri: http://lava-staging.sw.nxp.com/RPC2https://eur01.safelinks.protection.outlook.com/?url=http%3A%2F%2Flava-staging.sw.nxp.com%2FRPC2&data=02%7C01%7Clarry.shen%40nxp.com%7Cf46948d338214d1d5c8e08d85adc6e39%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C637359250090161749&sdata=1ZtMgbOE78ZyUhrEcEMifWJ5MtHjCu8JHbB1KiXO8Vs%3D&reserved=0 username: larry.shen timeout: 300.0 The only result is the error message becomes: Unable to connect: HTTPConnectionPool(host='lava-master.sw.nxp.comhttps://eur01.safelinks.protection.outlook.com/?url=http%3A%2F%2Flava-master.sw.nxp.com%2F&data=02%7C01%7Clarry.shen%40nxp.com%7Cf46948d338214d1d5c8e08d85adc6e39%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C637359250090171744&sdata=k4eCWmoPupCDmg%2FzUrX%2BSXu6xAcSNfqLE31v9BFIc9M%3D&reserved=0', port=80): Read timed out. (read timeout=300.0)
Other users who submit directly over XML-RPC without setting a timeout get a 502 right away; I think the root cause is the same. The server is holding connections it cannot answer, so lavacli times out even with 300 seconds.
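Until the server-side cause is found, one client-side mitigation is to retry transient failures with backoff around the XML-RPC call. This is a minimal sketch, not part of lavacli; the helper name and parameters are illustrative:

```python
import time
import xmlrpc.client


def call_with_retries(fn, retries=3, delay=2.0, backoff=2.0):
    """Call fn(), retrying on transient transport-level errors.

    Retries on OSError (timeouts, connection resets) and on
    xmlrpc.client.ProtocolError (e.g. a 502 from a reverse proxy).
    An xmlrpc.client.Fault is a real answer from the server, so it
    is deliberately not retried.
    """
    attempt = 0
    while True:
        try:
            return fn()
        except (OSError, xmlrpc.client.ProtocolError):
            attempt += 1
            if attempt > retries:
                raise
            time.sleep(delay)
            delay *= backoff


# Usage sketch (server URI is an example):
# proxy = xmlrpc.client.ServerProxy("http://lava-master.example.com/RPC2")
# job_id = call_with_retries(lambda: proxy.scheduler.submit_job(job))
```

This does not fix the underlying 502s, but it keeps batch submitters alive through short outages.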
From: Remi Duraffort <remi.duraffort@linaro.org> Sent: Tuesday, September 15, 2020 8:14 PM To: Larry Shen <larry.shen@nxp.com> Cc: Milosz Wasilewski <milosz.wasilewski@linaro.org>; lava-users@lists.lavasoftware.org Subject: Re: [Lava-users] [EXT] Re: Issues about XMLRPC & lavacli.
Have you tried increasing the lavacli default timeout? Maybe the network connection to the server is flaky?
Rgds
On Tue, 15 Sep 2020 at 12:06, Larry Shen <larry.shen@nxp.com> wrote:
I just checked the logs; there is nothing in the server log.
We run the master in a container, and docker logs shows the following around the times users fail to submit jobs. Could this be the cause? What does it mean?
10.193.108.249 - - [15/Sep/2020:02:12:01 +0000] "POST /RPC2 HTTP/1.1" 200 587 "-" "lavacli v0.9.7"
10.192.244.28 - - [15/Sep/2020:02:12:01 +0000] "GET /scheduler/job/109711/job_status HTTP/1.1" 200 634 "http://lava-master.sw.nxp.com/scheduler/job/109711" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36"
10.192.244.28 - - [15/Sep/2020:02:12:01 +0000] "GET /scheduler/job/109711/log_pipeline_incremental?line=102 HTTP/1.1" 200 6071 "http://lava-master.sw.nxp.com/scheduler/job/109711" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36"
10.193.108.249 - - [15/Sep/2020:02:12:02 +0000] "POST /RPC2 HTTP/1.1" 200 433 "-" "lavacli v0.9.7"
ERROR:linaro-django-xmlrpc-dispatcher:Internal error in the XML-RPC dispatcher while calling method 'scheduler.jobs.show' with ('Unable',)
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/linaro_django_xmlrpc/models.py", line 441, in dispatch
    return impl(*params)
  File "/usr/lib/python3/dist-packages/lava_scheduler_app/api/jobs.py", line 383, in show
    job = TestJob.get_by_job_number(job_id)
  File "/usr/lib/python3/dist-packages/lava_scheduler_app/models.py", line 2010, in get_by_job_number
    job = query.get(pk=job_id)
  File "/usr/lib/python3/dist-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/django/db/models/query.py", line 371, in get
    clone = self.filter(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/django/db/models/query.py", line 787, in filter
    return self._filter_or_exclude(False, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/django/db/models/query.py", line 805, in _filter_or_exclude
    clone.query.add_q(Q(*args, **kwargs))
  File "/usr/lib/python3/dist-packages/django/db/models/sql/query.py", line 1250, in add_q
    clause, _ = self._add_q(q_object, self.used_aliases)
  File "/usr/lib/python3/dist-packages/django/db/models/sql/query.py", line 1276, in _add_q
    allow_joins=allow_joins, split_subq=split_subq,
  File "/usr/lib/python3/dist-packages/django/db/models/sql/query.py", line 1210, in build_filter
    condition = self.build_lookup(lookups, col, value)
  File "/usr/lib/python3/dist-packages/django/db/models/sql/query.py", line 1104, in build_lookup
    return final_lookup(lhs, rhs)
  File "/usr/lib/python3/dist-packages/django/db/models/lookups.py", line 24, in __init__
    self.rhs = self.get_prep_lookup()
  File "/usr/lib/python3/dist-packages/django/db/models/lookups.py", line 74, in get_prep_lookup
    return self.lhs.output_field.get_prep_value(self.rhs)
  File "/usr/lib/python3/dist-packages/django/db/models/fields/__init__.py", line 966, in get_prep_value
    return int(value)
ValueError: invalid literal for int() with base 10: 'Unable'

==> gunicorn.log <==
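A note on that traceback: it is a separate, smaller problem from the 502s. Some client called scheduler.jobs.show with the string 'Unable' instead of a job number, and the Django pk lookup fails on int('Unable'). Clients can guard against passing a non-numeric value before making the RPC call; a sketch, where the validator is purely illustrative (not a LAVA API):

```python
def valid_job_id(value):
    """Return True if value looks like a LAVA job id.

    Accepts a plain integer id ("1234") or, as an assumption about
    multinode jobs, a dotted sub-id of the form "1234.0".
    """
    parts = str(value).split(".")
    # every dot-separated piece must be purely numeric, and there may
    # be at most one dot
    return len(parts) in (1, 2) and all(p.isdigit() for p in parts)


# A caller that accidentally passed an error-message word would skip
# the RPC call instead of triggering the server-side ValueError:
# valid_job_id("Unable") -> False
```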
-----Original Message----- From: Milosz Wasilewski <milosz.wasilewski@linaro.org> Sent: Tuesday, September 15, 2020 5:39 PM To: Larry Shen <larry.shen@nxp.com> Cc: lava-users@lists.lavasoftware.org Subject: Re: [EXT] Re: [Lava-users] Issues about XMLRPC & lavacli.
On Tue, 15 Sep 2020 at 10:32, Larry Shen <larry.shen@nxp.com> wrote:
Hi, Milosz,
See this: https://git.lavasoftware.org/lava/lava/-/merge_requests/1286/diffs#65156a95098dc512e7a4b7047ea511332947f649
In fact I don't care whether I can download big logs; I only care about the job-submit failures. I mentioned the big-log download because it looks like it may be related to the same issue. I'm not sure what is happening here: is it our environment, or related to a LAVA code change?
Looks really weird. We're also running gunicorn with the eventlet worker and it actually improved things a lot (no more weird timeouts). Maybe Remi has a better idea.
milosz
-----Original Message----- From: Milosz Wasilewski <milosz.wasilewski@linaro.org> Sent: Tuesday, September 15, 2020 5:28 PM To: Larry Shen <larry.shen@nxp.com> Cc: lava-users@lists.lavasoftware.org Subject: [EXT] Re: [Lava-users] Issues about XMLRPC & lavacli.
On Tue, 15 Sep 2020 at 04:04, Larry Shen <larry.shen@nxp.com> wrote:
Meanwhile, a strange issue, in case it helps:

In the past, when downloading big logs from the web UI, a log that was too big would time out and the download would fail. But we still hit that timeout in 2020.08; shouldn't it be fixed by the async worker? What behaviour do you expect from the async worker for big file downloads? Could it be a local settings issue on our side?
I'm not sure if this was enabled by default. In /lib/systemd/system/lava-server-gunicorn.service you should have WORKER_CLASS set to 'eventlet'. If it isn't, that is most likely the source of your trouble.
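A quick way to verify which worker class is in effect is to inspect the unit file and the running process. This is a diagnostic sketch; the paths are the Debian-package defaults and may differ on a containerised install:

```shell
# Show how the unit file configures the worker class (path may vary)
grep -i -A2 WORKER_CLASS /lib/systemd/system/lava-server-gunicorn.service 2>/dev/null || true

# If the unit passes -k/--worker-class on the gunicorn command line,
# the active worker class is visible in the process list
ps aux | grep '[g]unicorn' || true

# After changing WORKER_CLASS to "eventlet":
# sudo systemctl daemon-reload && sudo systemctl restart lava-server-gunicorn
```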
milosz
From: Larry Shen Sent: Tuesday, September 15, 2020 10:52 AM To: lava-users@lists.lavasoftware.org Subject: Issues about XMLRPC & lavacli.
Hi, guys,
We found an issue related to job submission:

- One team uses lavacli to submit requests, and sometimes it reports:
07-Sep-2020 16:37:35 Unable to connect: HTTPConnectionPool(host='lava-master.sw.nxp.com', port=80): Read timed out. (read timeout=20.0)
It looks like this error happens in the following lavacli code; what do you think about this issue?
try:
    # Create the Transport object
    parsed_uri = urlparse(uri)
    transport = RequestsTransport(
        parsed_uri.scheme,
        config.get("proxy"),
        config.get("timeout", 20.0),
        config.get("verify_ssl_cert", True),
    )
    # allow_none is True because the server does support it
    proxy = xmlrpc.client.ServerProxy(uri, allow_none=True, transport=transport)
    version = proxy.system.version()
except (OSError, xmlrpc.client.Error) as exc:
    print("Unable to connect: %s" % exc2str(exc))
    return 1
- Another team wrote their own Python code using XML-RPC to submit jobs, doing something like the following, and it reports:

ERROR in XMLRPC.py:submitJob:63 msg: Failed to submit job, reason: <ProtocolError for chuan.su:chuan.su@lava-master.sw.nxp.com/RPC2: 502 Bad Gateway>!
try:
    job_id = self.connection.scheduler.submit_job(job)
    self.logger.debug("Succeeded to submit job, job_id: %d, platform: %s!", job_id, platform)
    return job_id
except Exception as e:
    self.logger.error("Failed to submit job, reason: %s!", str(e))
    return None
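Since the 502 surfaces to Python as xmlrpc.client.ProtocolError, a submit wrapper like the team's could distinguish HTTP-level failures from genuine LAVA faults, which makes the logs far easier to read. A hedged sketch; the function and logger names are illustrative:

```python
import logging
import xmlrpc.client

logger = logging.getLogger("submit")


def submit_job(connection, job_definition):
    """Submit a job, separating proxy errors (e.g. 502) from server faults."""
    try:
        return connection.scheduler.submit_job(job_definition)
    except xmlrpc.client.ProtocolError as exc:
        # errcode/errmsg come from the HTTP layer, e.g. a reverse proxy
        # answering 502 Bad Gateway before the LAVA server ever replied
        logger.error("HTTP-level failure: %d %s", exc.errcode, exc.errmsg)
    except xmlrpc.client.Fault as exc:
        # A Fault is a real answer: the LAVA server rejected the job
        logger.error("Server fault %d: %s", exc.faultCode, exc.faultString)
    return None
```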
We are currently using LAVA server version 2020.08. Colleagues told me we also hit similar errors in the past, but with very low probability; recently the probability has become very high. I'd like to know whether this could be related to your changes to gunicorn/eventlet, or whether there are other possible reasons.
Thanks,
Larry
_______________________________________________ Lava-users mailing list Lava-users@lists.lavasoftware.org https://lists.lavasoftware.org/mailman/listinfo/lava-users
-- Rémi Duraffort LAVA Architect Linaro
Hello,
127.0.0.1:8000 is lava-server-gunicorn. Do you use apache2 as a reverse proxy?
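For context, the apache2 vhost in a standard LAVA install proxies requests to that port; an illustrative fragment follows (on Debian installs the real file is typically /etc/apache2/sites-available/lava-server.conf, but the path and exact directives on your instance may differ). The proxy timeout matters here: if gunicorn takes longer than it to answer, apache gives up and the client sees an error such as 502:

```apache
# Illustrative mod_proxy fragment, not copied from a LAVA install
ProxyPass / http://127.0.0.1:8000/
ProxyPassReverse / http://127.0.0.1:8000/
# Seconds apache waits for the backend before failing the request
ProxyTimeout 300
```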
Have you tried reverting to using "sync" and restart lava-server-gunicorn?
Rgds
Hi, Remi,
We found that only 2 sites have this issue; the other sites are OK.
After double-checking, they use something like "lavacli -i admin@validation jobs wait --polling 60 $job_id". As a temporary workaround, we have already asked them to reduce the RPC call rate with "--polling 300".
There has been no recurrence for the last week, so I will put this check on hold.
BTW, I remember we can monitor the job-finished event; would that remove the need to poll the master? Will it still work after this change: https://git.lavasoftware.org/lava/lava/-/merge_requests/1253 ?
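On the event side: LAVA publishes job and device events over ZMQ (the same stream "lavacli events listen" consumes), so a client can wait for its job to finish without polling the XML-RPC API at all. A sketch assuming pyzmq, the default publisher port 5500, and the usual five-frame event layout; the host name is an example and the exact layout should be checked against your instance:

```python
import json


def parse_event(parts):
    """Decode one LAVA event: [topic, uuid, datetime, username, data]."""
    topic, uuid, dt, username, data = (p.decode("utf-8") for p in parts)
    return {"topic": topic, "uuid": uuid, "datetime": dt,
            "username": username, "data": json.loads(data)}


def wait_for_job(job_id, uri="tcp://lava-master.example.com:5500"):
    """Block until the given job reaches Finished; return its health."""
    import zmq  # pyzmq; imported here so parse_event works without it

    ctx = zmq.Context()
    sock = ctx.socket(zmq.SUB)
    sock.setsockopt_string(zmq.SUBSCRIBE, "")  # subscribe to all topics
    sock.connect(uri)
    while True:
        event = parse_event(sock.recv_multipart())
        data = event["data"]
        # testjob events carry the job id (as a string) and its state
        if ".testjob" in event["topic"] and data.get("job") == job_id:
            if data.get("state") == "Finished":
                return data.get("health")
```

Compared with "jobs wait --polling", this is one long-lived subscription per client instead of a stream of XML-RPC requests hitting gunicorn.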
From: Remi Duraffort <remi.duraffort@linaro.org> Sent: Monday, September 21, 2020 8:38 PM To: Larry Shen <larry.shen@nxp.com> Cc: Milosz Wasilewski <milosz.wasilewski@linaro.org>; lava-users@lists.lavasoftware.org Subject: Re: [Lava-users] [EXT] Re: Issues about XMLRPC & lavacli.
Caution: EXT Email Hello,
127.0.0.1:8000https://eur01.safelinks.protection.outlook.com/?url=http%3A%2F%2F127.0.0.1%3A8000%2F&data=02%7C01%7Clarry.shen%40nxp.com%7Cdb422c58aab94480a62708d85e2b2f8f%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C1%7C637362886861965211&sdata=u5GnLGCX6V4lOW4s8UD%2FLl5M0lN1s4n4zcT92bxZpuQ%3D&reserved=0 is lava-server-gunicorn. Do you use apache2 as a reverse proxy?
Have you tried reverting to using "sync" and restart lava-server-gunicorn?
Rgds
Le ven. 18 sept. 2020 à 05:41, Larry Shen <larry.shen@nxp.commailto:larry.shen@nxp.com> a écrit : No, not in the same network.
And I check the log again, seems everytime when 502 issue happens, there is something like next: {"log":"[Tue Sep 15 02:14:58.023302 2020] [proxy:error] [pid 332:tid 139822047946496] (32)Broken pipe: [client 10.192.244.203:54182https://eur01.safelinks.protection.outlook.com/?url=http%3A%2F%2F10.192.244.203%3A54182%2F&data=02%7C01%7Clarry.shen%40nxp.com%7Cdb422c58aab94480a62708d85e2b2f8f%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C1%7C637362886861970194&sdata=OCy3YH7nzNNgcI2bp9xjhV9a%2BHCPoYCuviKWDPh%2Bzu0%3D&reserved=0] AH01084: pass request body failed to 127.0.0.1:8000https://eur01.safelinks.protection.outlook.com/?url=http%3A%2F%2F127.0.0.1%3A8000%2F&data=02%7C01%7Clarry.shen%40nxp.com%7Cdb422c58aab94480a62708d85e2b2f8f%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C1%7C637362886861975169&sdata=jebxuD57FwKx%2FS4PHT%2FXEXB41qCTg73TjPsC4jbWWjc%3D&reserved=0 (127.0.0.1)\n","stream":"stdout","time":"2020-09-15T02:14:58.319479074Z"} {"log":"[Tue Sep 15 02:14:58.023352 2020] [proxy_http:error] [pid 332:tid 139822047946496] [client 10.192.244.203:54182https://eur01.safelinks.protection.outlook.com/?url=http%3A%2F%2F10.192.244.203%3A54182%2F&data=02%7C01%7Clarry.shen%40nxp.com%7Cdb422c58aab94480a62708d85e2b2f8f%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C1%7C637362886861980147&sdata=OcGLygceb2LMAvF%2BS%2FPsOkM6dNIieaXvDS8Vtc65Lpg%3D&reserved=0] AH01097: pass request body failed to 127.0.0.1:8000https://eur01.safelinks.protection.outlook.com/?url=http%3A%2F%2F127.0.0.1%3A8000%2F&data=02%7C01%7Clarry.shen%40nxp.com%7Cdb422c58aab94480a62708d85e2b2f8f%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C1%7C637362886861985124&sdata=Pv8uPqDJ%2Fjp%2BoJkQ7ynLua6pZTsYS1LutLQOWDiX7vM%3D&reserved=0 (127.0.0.1) from 10.192.244.203 ()\n","stream":"stdout","time":"2020-09-15T02:14:58.319481826Z"}
What is 127.0.0.1:8000? A reverse proxy in the LAVA setup? Any suggestion?
From: Remi Duraffort <remi.duraffort@linaro.org> Sent: Thursday, September 17, 2020 3:37 PM To: Larry Shen <larry.shen@nxp.com> Cc: Milosz Wasilewski <milosz.wasilewski@linaro.org>; lava-users@lists.lavasoftware.org Subject: Re: [Lava-users] [EXT] Re: Issues about XMLRPC & lavacli.
Are you in the same network as the server?
On Wed, 16 Sep 2020 at 04:37, Larry Shen <larry.shen@nxp.com> wrote: Yes, Remi, we tried it like this:
staging:
  token: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  uri: http://lava-staging.sw.nxp.com/RPC2
  username: larry.shen
  timeout: 300.0

The only result is that the error message becomes: Unable to connect: HTTPConnectionPool(host='lava-master.sw.nxp.com', port=80): Read timed out. (read timeout=300.0)
And another team, which uses XMLRPC directly without setting a timeout, gets a 502 straight away; I think the root cause is the same: the server is holding connections without responding to the client, so lavacli times out even at 300 seconds.
From: Remi Duraffort <remi.duraffort@linaro.org> Sent: Tuesday, September 15, 2020 8:14 PM To: Larry Shen <larry.shen@nxp.com> Cc: Milosz Wasilewski <milosz.wasilewski@linaro.org>; lava-users@lists.lavasoftware.org Subject: Re: [Lava-users] [EXT] Re: Issues about XMLRPC & lavacli.
Have you tried increasing the lavacli default timeout? Maybe the network connection to the server is flaky?
Rgds
On Tue, 15 Sep 2020 at 12:06, Larry Shen <larry.shen@nxp.com> wrote: I just checked the log; there is nothing in the server log.
We use a containerized master, and the following appears in docker logs whenever a user's job submission fails. Could this be the cause? What does it mean?
10.193.108.249 - - [15/Sep/2020:02:12:01 +0000] "POST /RPC2 HTTP/1.1" 200 587 "-" "lavacli v0.9.7"
10.192.244.28 - - [15/Sep/2020:02:12:01 +0000] "GET /scheduler/job/109711/job_status HTTP/1.1" 200 634 "http://lava-master.sw.nxp.com/scheduler/job/109711" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36"
10.192.244.28 - - [15/Sep/2020:02:12:01 +0000] "GET /scheduler/job/109711/log_pipeline_incremental?line=102 HTTP/1.1" 200 6071 "http://lava-master.sw.nxp.com/scheduler/job/109711" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36"
10.193.108.249 - - [15/Sep/2020:02:12:02 +0000] "POST /RPC2 HTTP/1.1" 200 433 "-" "lavacli v0.9.7"
ERROR:linaro-django-xmlrpc-dispatcher:Internal error in the XML-RPC dispatcher while calling method 'scheduler.jobs.show' with ('Unable',)
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/linaro_django_xmlrpc/models.py", line 441, in dispatch
    return impl(*params)
  File "/usr/lib/python3/dist-packages/lava_scheduler_app/api/jobs.py", line 383, in show
    job = TestJob.get_by_job_number(job_id)
  File "/usr/lib/python3/dist-packages/lava_scheduler_app/models.py", line 2010, in get_by_job_number
    job = query.get(pk=job_id)
  File "/usr/lib/python3/dist-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/django/db/models/query.py", line 371, in get
    clone = self.filter(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/django/db/models/query.py", line 787, in filter
    return self._filter_or_exclude(False, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/django/db/models/query.py", line 805, in _filter_or_exclude
    clone.query.add_q(Q(*args, **kwargs))
  File "/usr/lib/python3/dist-packages/django/db/models/sql/query.py", line 1250, in add_q
    clause, _ = self._add_q(q_object, self.used_aliases)
  File "/usr/lib/python3/dist-packages/django/db/models/sql/query.py", line 1276, in _add_q
    allow_joins=allow_joins, split_subq=split_subq,
  File "/usr/lib/python3/dist-packages/django/db/models/sql/query.py", line 1210, in build_filter
    condition = self.build_lookup(lookups, col, value)
  File "/usr/lib/python3/dist-packages/django/db/models/sql/query.py", line 1104, in build_lookup
    return final_lookup(lhs, rhs)
  File "/usr/lib/python3/dist-packages/django/db/models/lookups.py", line 24, in __init__
    self.rhs = self.get_prep_lookup()
  File "/usr/lib/python3/dist-packages/django/db/models/lookups.py", line 74, in get_prep_lookup
    return self.lhs.output_field.get_prep_value(self.rhs)
  File "/usr/lib/python3/dist-packages/django/db/models/fields/__init__.py", line 966, in get_prep_value
    return int(value)
ValueError: invalid literal for int() with base 10: 'Unable'
-----Original Message----- From: Milosz Wasilewski <milosz.wasilewski@linaro.org> Sent: Tuesday, September 15, 2020 5:39 PM To: Larry Shen <larry.shen@nxp.com> Cc: lava-users@lists.lavasoftware.org Subject: Re: [EXT] Re: [Lava-users] Issues about XMLRPC & lavacli.
On Tue, 15 Sep 2020 at 10:32, Larry Shen <larry.shen@nxp.com> wrote:
Hi, Milosz,
See this: https://git.lavasoftware.org/lava/lava/-/merge_requests/1286/diffs#65156a95098dc512e7a4b7047ea511332947f649
In fact I don't care whether I can download big logs; I only care about the job-submit failures. I mentioned the big-log download because it may be related to the same issue... I'm not sure what happened here: something in our environment, or a LAVA code change?
Looks really weird. We're also running the eventlet gunicorn worker and it actually improved things a lot (no more weird timeouts). Maybe Remi has a better idea.
milosz
-----Original Message----- From: Milosz Wasilewski <milosz.wasilewski@linaro.org> Sent: Tuesday, September 15, 2020 5:28 PM To: Larry Shen <larry.shen@nxp.com> Cc: lava-users@lists.lavasoftware.org Subject: [EXT] Re: [Lava-users] Issues about XMLRPC & lavacli.
On Tue, 15 Sep 2020 at 04:04, Larry Shen <larry.shen@nxp.com> wrote:
Meanwhile, a strange issue, in case it helps:
In the past, when downloading big logs on the web, if the log was too big it would time out and the download would fail.
But now we still time out on 2020.08; shouldn't it be OK with the async worker?
What do you expect from async for big file downloads? Could it be a local settings issue?
I'm not sure if this was enabled by default. In /lib/systemd/system/lava-server-gunicorn.service you should have WORKER_CLASS set to 'eventlet'. If it is not, that is most likely the source of your trouble.
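For reference, the WORKER_CLASS setting can be overridden with a systemd drop-in instead of editing the packaged unit file. A sketch, assuming the unit reads the worker class from a WORKER_CLASS environment variable as described above (the drop-in path follows the standard systemd convention and is not LAVA-specific):

```ini
# /etc/systemd/system/lava-server-gunicorn.service.d/override.conf
[Service]
Environment=WORKER_CLASS=eventlet
```

After creating the file, run `systemctl daemon-reload && systemctl restart lava-server-gunicorn` for it to take effect.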
milosz
From: Larry Shen Sent: Tuesday, September 15, 2020 10:52 AM To: lava-users@lists.lavasoftware.org Subject: Issues about XMLRPC & lavacli.
Hi, guys,
We found an issue related to job submission:
- One team uses “lavacli” to submit requests, and sometimes it reports the following:
07-Sep-2020 16:37:35 Unable to connect: HTTPConnectionPool(host='lava-master.sw.nxp.com', port=80): Read timed out. (read timeout=20.0)
This error seems to happen in the following code; what do you think about this issue?
try:
    # Create the Transport object
    parsed_uri = urlparse(uri)
    transport = RequestsTransport(
        parsed_uri.scheme,
        config.get("proxy"),
        config.get("timeout", 20.0),
        config.get("verify_ssl_cert", True),
    )
    # allow_none is True because the server does support it
    proxy = xmlrpc.client.ServerProxy(uri, allow_none=True, transport=transport)
    version = proxy.system.version()
except (OSError, xmlrpc.client.Error) as exc:
    print("Unable to connect: %s" % exc2str(exc))
    return 1
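As the snippet shows, lavacli reads the timeout from its config (config.get("timeout", 20.0)), so raising it is a config change. A client that talks to /RPC2 directly can apply its own timeout with only the standard library; a minimal sketch (the server URL below is a placeholder, not a real instance):

```python
import xmlrpc.client


class TimeoutTransport(xmlrpc.client.Transport):
    """XML-RPC transport that applies a socket timeout to each connection."""

    def __init__(self, timeout=300.0):
        super().__init__()
        self.timeout = timeout

    def make_connection(self, host):
        conn = super().make_connection(host)
        conn.timeout = self.timeout  # honoured by http.client.HTTPConnection
        return conn


# Usage sketch (placeholder URL):
# proxy = xmlrpc.client.ServerProxy(
#     "http://lava-master.example.com/RPC2",
#     allow_none=True,
#     transport=TimeoutTransport(timeout=300.0),
# )
# print(proxy.system.version())
```

This only raises the client-side read timeout; if the server never answers, the call still fails, just later.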
- Another team wrote their own Python code using XMLRPC to submit jobs, something like the following, and it reports:
ERROR in XMLRPC.py:submitJob:63 msg: Failed to submit job, reason: <ProtocolError for chuan.su:chuan.su@lava-master.sw.nxp.com/RPC2: 502 Bad Gateway>!
try:
    job_id = self.connection.scheduler.submit_job(job)
    self.logger.debug("Succeeded to submit job, job_id: %d, platform: %s!", job_id, platform)
    return job_id
except Exception as e:
    self.logger.error("Failed to submit job, reason: %s!", str(e))
    return None
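Until the server-side cause is pinned down, the submitting script can at least ride out transient 502s from the proxy. A sketch (the helper name and retry policy are illustrative, not part of any LAVA API; HTTP-level failures such as 502 surface as xmlrpc.client.ProtocolError):

```python
import time
import xmlrpc.client


def submit_with_retry(submit, job, retries=3, delay=5.0):
    """Call submit(job) (e.g. connection.scheduler.submit_job),
    retrying on transport errors and HTTP-level failures such as 502."""
    last_exc = None
    for attempt in range(1, retries + 1):
        try:
            return submit(job)
        except (OSError, xmlrpc.client.ProtocolError) as exc:
            last_exc = exc
            if attempt < retries:
                time.sleep(delay * attempt)  # simple linear backoff
    raise last_exc
```

Note this retries blindly, so if the first attempt actually reached the scheduler before the proxy returned 502, a duplicate job could be submitted; that trade-off needs to be acceptable for the workload.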
We are currently using LAVA server version 2020.08. Colleagues told me we also encountered similar failures in the past, but with very low probability; recently the probability has become very high.
I'd like to know whether this could be related to your change to the gunicorn eventlet worker, or whether there are other possible causes?
Thanks,
Larry
Lava-users mailing list
Lava-users@lists.lavasoftware.org
https://lists.lavasoftware.org/mailman/listinfo/lava-users
-- Rémi Duraffort LAVA Architect Linaro
Hello,
On Thu, 24 Sep 2020 at 04:21, Larry Shen <larry.shen@nxp.com> wrote:
Hi, Remi,
We found that only 2 sites have this issue; the other sites are OK.
After double-checking, they use something like “lavacli -i admin@validation jobs wait --polling 60 $job_id”.
As a temporary solution, we have already asked them to reduce the RPC calls with “--polling 300”.
Looks like the server is struggling to cope with the load.
We have not seen this issue for the last week, so I will put this check on hold.
BTW, I remember we can monitor the job-finished event; would that remove the polling load on the master?
Will it also work after this https://git.lavasoftware.org/lava/lava/-/merge_requests/1253 ?
Yes the zmq event stream is still available. It's also possible to use web sockets to get the event stream.
Rgds
Hi, Remi, thanks.
- the zmq event stream is still available.
Q: Does this mean zeromq is still used for lava-publisher, and only for lava-publisher?
- It's also possible to use web sockets to get the event stream.
Q: Is there any documentation we can refer to, so that we can try this later?
From: Remi Duraffort remi.duraffort@linaro.org Sent: Thursday, September 24, 2020 3:09 PM To: Larry Shen larry.shen@nxp.com Cc: Milosz Wasilewski milosz.wasilewski@linaro.org; lava-users@lists.lavasoftware.org Subject: Re: [Lava-users] [EXT] Re: Issues about XMLRPC & lavacli.
On Fri, 25 Sep 2020 at 03:50, Larry Shen <larry.shen@nxp.com> wrote:
Hi, Remi, thanks.
- the zmq event stream is still available.
Q: Does this mean zeromq is still used for lava-publisher, and only for lava-publisher?
zmq is still used internally to send messages from lava-scheduler and lava-server-gunicorn to lava-publisher. lava-publisher will then publish events via zmq or websockets.
- It's also possible to use web sockets to get the event stream.
Q: Is there any documentation we can refer to, so that we can try this later?
See the lavacli source code. Using python3-aiohttp, it's fairly simple:
https://git.lavasoftware.org/lava/lavacli/-/blob/master/lavacli/commands/events.py#L235
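Such a listener is short with python3-aiohttp. A sketch, modelled loosely on lavacli's "events listen" command (the /ws/ endpoint path and the server URL are assumptions; check the lavacli source referenced above for the exact protocol):

```python
import asyncio
from urllib.parse import urlsplit, urlunsplit


def websocket_url(base_uri):
    """Derive the event-stream websocket URL from the server base URI
    (assumes events are published under /ws/)."""
    scheme, netloc, _, _, _ = urlsplit(base_uri)
    ws_scheme = "wss" if scheme == "https" else "ws"
    return urlunsplit((ws_scheme, netloc, "/ws/", "", ""))


async def listen(base_uri):
    # aiohttp imported lazily so websocket_url stays stdlib-only
    import aiohttp

    async with aiohttp.ClientSession() as session:
        async with session.ws_connect(websocket_url(base_uri)) as ws:
            async for msg in ws:
                if msg.type == aiohttp.WSMsgType.TEXT:
                    print(msg.data)  # raw event, e.g. job state changes


# asyncio.run(listen("http://lava-master.example.com"))
```

Listening to the event stream instead of calling "jobs wait --polling" would remove the periodic XML-RPC load on the master entirely.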
Rgds
Hi, Remi, our issue happened again even though we reduced the RPC calls.
Also, I noticed that with “eventlet”, when we download an 8 MB log from the web UI, Chrome reports “network error” and the server log shows: [Mon Sep 28 06:56:27.101163 2020] [proxy_http:error] [pid 189:tid 140256774096640] (70008)Partial results are valid but processing is incomplete: [client 10.192.244.206:56938] AH01110: error reading response
Does the above look similar to our job-submit issue?
Then I switched to “sync” mode, and now I can download the 8 MB log completely.
Could this be an environment issue? Yes, we have Apache proxying requests to gunicorn, but that setup comes from the LAVA release, right? (We use the docker solution.)
From: Remi Duraffort remi.duraffort@linaro.org Sent: Friday, September 25, 2020 3:28 PM To: Larry Shen larry.shen@nxp.com Cc: Milosz Wasilewski milosz.wasilewski@linaro.org; lava-users@lists.lavasoftware.org Subject: Re: [Lava-users] [EXT] Re: Issues about XMLRPC & lavacli.
Hi, Remi,
I just found this: https://github.com/aio-libs/aiohttp/issues/2687#issuecomment-580158992 , which looks similar to our issue. Is aiohttp used here? Could this be caused by LAVA choosing “apache proxy to gunicorn” rather than “nginx proxy to gunicorn”?
From: Larry Shen Sent: Monday, September 28, 2020 3:58 PM To: Remi Duraffort remi.duraffort@linaro.org Cc: Milosz Wasilewski milosz.wasilewski@linaro.org; lava-users@lists.lavasoftware.org Subject: RE: [Lava-users] [EXT] Re: Issues about XMLRPC & lavacli.
Hello,
I don't know why you have such issues. But this does not look related to the github issue that you pointed out.
Rgds