Gracefully handle DNS name lookup failures #3046

Open
@portante

We need to catch "Name or service not known" exceptions and handle them gracefully instead of printing a huge stack trace.

If a DNS name lookup fails, we should most likely enter a periodic retry loop with a configurable timeout, since these failures can occur sporadically and clear again almost immediately, depending on name server updates and resets in the environment.
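
As a rough illustration of the retry idea, here is a minimal sketch using only the standard library. The function name `resolve_with_retry` and its `timeout`, `interval`, `resolver`, `sleep`, and `clock` parameters are hypothetical and not part of pbench. It assumes that only `EAI_NONAME` (the `[Errno -2]` case on Linux) and `EAI_AGAIN` are worth retrying.

```python
import socket
import time


def resolve_with_retry(host, port, timeout=60.0, interval=5.0,
                       resolver=socket.getaddrinfo, sleep=time.sleep,
                       clock=time.monotonic):
    """Retry a DNS lookup until it succeeds or `timeout` seconds elapse.

    Hypothetical helper: the names and defaults are illustrative only.
    """
    deadline = clock() + timeout
    while True:
        try:
            return resolver(host, port, 0, socket.SOCK_STREAM)
        except socket.gaierror as exc:
            # Assumption: only these two lookup errors can plausibly
            # clear on their own; anything else is re-raised at once.
            if exc.errno not in (socket.EAI_NONAME, socket.EAI_AGAIN):
                raise
            remaining = deadline - clock()
            if remaining <= 0:
                # Out of time: let the caller see the last failure.
                raise
            # Never sleep past the deadline.
            sleep(min(interval, remaining))
```

The `resolver`, `sleep`, and `clock` arguments exist only so the loop can be exercised without real DNS or real waiting; a caller would normally leave them at their defaults and take `timeout` and `interval` from configuration.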

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/urllib3/connection.py", line 162, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw)
  File "/usr/lib/python3.6/site-packages/urllib3/util/connection.py", line 57, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/usr/lib64/python3.6/socket.py", line 745, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/pbench-server/lib/python3.8/site-packages/elasticsearch1/connection/http_urllib3.py", line 78, in perform_request
    response = self.pool.urlopen(method, url, body, retries=False, headers=self.headers, **kw)
  File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 638, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/lib/python3.6/site-packages/urllib3/util/retry.py", line 344, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/lib/python3.6/site-packages/urllib3/packages/six.py", line 693, in reraise
    raise value
  File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 354, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib64/python3.6/http/client.py", line 1273, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib64/python3.6/http/client.py", line 1319, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib64/python3.6/http/client.py", line 1268, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib64/python3.6/http/client.py", line 1044, in _send_output
    self.send(msg)
  File "/usr/lib64/python3.6/http/client.py", line 982, in send
    self.connect()
  File "/usr/lib/python3.6/site-packages/urllib3/connection.py", line 184, in connect
    conn = self._new_conn()
  File "/usr/lib/python3.6/site-packages/urllib3/connection.py", line 171, in _new_conn
    self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f28b7bc9f98>: Failed to establish a new connection: [Errno -2] Name or service not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/pbench-server/lib/pbench/indexer.py", line 506, in update_templates
    es, name=name, body=self.templates[name]
  File "/opt/pbench-server/lib/pbench/indexer.py", line 644, in es_put_template
    tmpl = es.indices.get_template(name=name)
  File "/opt/pbench-server/lib/python3.8/site-packages/elasticsearch1/client/utils.py", line 69, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/opt/pbench-server/lib/python3.8/site-packages/elasticsearch1/client/indices.py", line 549, in get_template
    name), params=params)
  File "/opt/pbench-server/lib/python3.8/site-packages/elasticsearch1/transport.py", line 307, in perform_request
    status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
  File "/opt/pbench-server/lib/python3.8/site-packages/elasticsearch1/connection/http_urllib3.py", line 89, in perform_request
    raise ConnectionError('N/A', str(e), e)
elasticsearch1.exceptions.ConnectionError: ConnectionError(<urllib3.connection.HTTPConnection object at 0x7f28b7bc9f98>: Failed to establish a new connection: [Errno -2] Name or service not known) caused by: NewConnectionError(<urllib3.connection.HTTPConnection object at 0x7f28b7bc9f98>: Failed to establish a new connection: [Errno -2] Name or service not known)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/pbench-server/bin/pbench-report-status.py", line 82, in <module>
    report.init_report_template()
  File "/opt/pbench-server/lib/pbench/report.py", line 104, in init_report_template
    self.templates.update_templates(self.es, "server-reports")
  File "/opt/pbench-server/lib/pbench/indexer.py", line 510, in update_templates
    raise TemplateError(e)
pbench.indexer.TemplateError: ConnectionError(<urllib3.connection.HTTPConnection object at 0x7f28b7bc9f98>: Failed to establish a new connection: [Errno -2] Name or service not known) caused by: NewConnectionError(<urllib3.connection.HTTPConnection object at 0x7f28b7bc9f98>: Failed to establish a new connection: [Errno -2] Name or service not known)
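
One possible way to replace the stack trace with a short message is to walk the exception chain from the outermost error down to the underlying `socket.gaierror`. The sketch below assumes the chain is linked through `__cause__`/`__context__`, as the "During handling of the above exception" lines in the traceback suggest. The helpers `find_dns_failure` and `run_gracefully` are hypothetical names, not existing pbench functions.

```python
import socket
import sys


def find_dns_failure(exc):
    """Return the socket.gaierror in exc's cause/context chain, or None."""
    seen = set()
    while exc is not None and id(exc) not in seen:
        if isinstance(exc, socket.gaierror):
            return exc
        seen.add(id(exc))  # guard against cyclic chains
        exc = exc.__cause__ or exc.__context__
    return None


def run_gracefully(action):
    """Run `action`; report a DNS failure as one line instead of a traceback.

    Hypothetical wrapper: returns 0 on success and 1 on a DNS lookup
    failure. Any other exception propagates unchanged.
    """
    try:
        action()
    except Exception as exc:
        dns_error = find_dns_failure(exc)
        if dns_error is None:
            raise
        print(f"DNS name lookup failed: {dns_error}", file=sys.stderr)
        return 1
    return 0
```

A script such as `pbench-report-status.py` could, under these assumptions, wrap its top-level call in something like `run_gracefully` and use the return value as its exit status.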
