Skip to content

template list.peerings timeout #25508

Open
@EtienneBruines

Description

@EtienneBruines

Nomad version

Nomad v1.9.7
BuildDate 2025-03-11T09:07:15Z
Revision f869597+CHANGES

Operating system and Environment details

Ubuntu 24.04.2 LTS on amd64

Issue

The Nomad job does not handle the wait-period for list.peerings correctly. Even though it wants to use a blocking-query to get the latest updates, it doesn't seem patient enough to wait for changes. Not sure whether this logic is in Nomad and how it uses Consul Template, or whether this is an issue within Consul Template v0.40.0. It only occurs in Nomad v1.9.7 and above, ever since the Consul Template dependency got a bump to v0.40.0

Template failed: list.peerings: Get "http://127.0.0.1:8500/v1/peerings?index=41156796&stale=&wait=300000ms": context deadline exceeded

Reproduction steps

  • Have a Consul cluster with cluster peering enabled (not sure if important)
  • Make that peered cluster export a service (not sure if important)
  • Have a Nomad job template block like this (see below)
  • Wait for 20-30 minutes (by default 12 attempts) to see the job fail with the context deadline exceeded error message
template {
        data = <<EOF
{{ range peerings }}
{{ range .StreamStatus.ImportedServices }}
# {{ . }}
{{ end }}
{{ end }}
EOF
        destination = "local/backends.cfg"
        env = false
        change_mode = "signal"
        change_signal = "SIGUSR2"
      }

Expected Result

No timeouts.

Actual Result

Timeouts which cause the job to fail.

Job file (if appropriate)

See related snippet above.

Nomad Server logs (if appropriate)

Not applicable.

Nomad Client logs (if appropriate)

This might contain a few (not many) allocs that are unrelated. The issue with list.peerings does clearly show.

{"@level":"info","@message":"(runner) rendered \"(dynamic)\" =\u003e \"/opt/nomad/data/alloc/b3e17da0-8fec-fb9b-51b9-b14ea2c404e2/haproxy/local/backends.cfg\"","@module":"agent","@timestamp":"2025-03-24T15:48:44.400586Z"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2025-03-24T15:48:44.823143Z","alloc_id":"b3e17da0-8fec-fb9b-51b9-b14ea2c404e2","failed":false,"msg":"Template re-rendered","task":"haproxy","type":"Signaling"}
{"@level":"warn","@message":"(view) list.peerings: Get \"http://127.0.0.1:8500/v1/peerings?index=41156796\u0026stale=\u0026wait=300000ms\": context deadline exceeded (retry attempt 1 after \"250ms\")","@module":"agent","@timestamp":"2025-03-24T15:48:53.277751Z"}
{"@level":"error","@message":"(runner) sending server error back to caller","@module":"agent","@timestamp":"2025-03-24T15:48:53.277803Z"}
{"@level":"info","@message":"(runner) rendered \"(dynamic)\" =\u003e \"/opt/nomad/data/alloc/b3e17da0-8fec-fb9b-51b9-b14ea2c404e2/haproxy/local/backends.cfg\"","@module":"agent","@timestamp":"2025-03-24T15:48:54.377757Z"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2025-03-24T15:48:55.970820Z","alloc_id":"b3e17da0-8fec-fb9b-51b9-b14ea2c404e2","failed":false,"msg":"Template re-rendered","task":"haproxy","type":"Signaling"}
{"@level":"error","@message":"(runner) sending server error back to caller","@module":"agent","@timestamp":"2025-03-24T15:49:53.529783Z"}
{"@level":"warn","@message":"(view) list.peerings: Get \"http://127.0.0.1:8500/v1/peerings?index=41156796\u0026stale=\u0026wait=300000ms\": context deadline exceeded (retry attempt 2 after \"500ms\")","@module":"agent","@timestamp":"2025-03-24T15:49:53.529873Z"}
{"@level":"warn","@message":"(view) list.peerings: Get \"http://127.0.0.1:8500/v1/peerings?index=41156796\u0026stale=\u0026wait=300000ms\": context deadline exceeded (retry attempt 3 after \"1s\")","@module":"agent","@timestamp":"2025-03-24T15:50:54.030772Z"}
{"@level":"error","@message":"(runner) sending server error back to caller","@module":"agent","@timestamp":"2025-03-24T15:50:54.030826Z"}
{"@level":"info","@message":"(runner) rendered \"(dynamic)\" =\u003e \"/opt/nomad/data/alloc/b3e17da0-8fec-fb9b-51b9-b14ea2c404e2/haproxy/local/backends.cfg\"","@module":"agent","@timestamp":"2025-03-24T15:51:29.773247Z"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2025-03-24T15:51:32.146192Z","alloc_id":"b3e17da0-8fec-fb9b-51b9-b14ea2c404e2","failed":false,"msg":"Template re-rendered","task":"haproxy","type":"Signaling"}
{"@level":"warn","@message":"(view) list.peerings: Get \"http://127.0.0.1:8500/v1/peerings?index=41156796\u0026stale=\u0026wait=300000ms\": context deadline exceeded (retry attempt 4 after \"2s\")","@module":"agent","@timestamp":"2025-03-24T15:51:55.032151Z"}
{"@level":"error","@message":"(runner) sending server error back to caller","@module":"agent","@timestamp":"2025-03-24T15:51:55.032215Z"}
{"@level":"info","@message":"(runner) rendered \"(dynamic)\" =\u003e \"/opt/nomad/data/alloc/b3e17da0-8fec-fb9b-51b9-b14ea2c404e2/haproxy/local/backends.cfg\"","@module":"agent","@timestamp":"2025-03-24T15:52:25.322546Z"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2025-03-24T15:52:26.641251Z","alloc_id":"b3e17da0-8fec-fb9b-51b9-b14ea2c404e2","failed":false,"msg":"Template re-rendered","task":"haproxy","type":"Signaling"}
{"@level":"info","@message":"(runner) rendered \"(dynamic)\" =\u003e \"/opt/nomad/data/alloc/b3e17da0-8fec-fb9b-51b9-b14ea2c404e2/haproxy/local/backends.cfg\"","@module":"agent","@timestamp":"2025-03-24T15:52:41.801646Z"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2025-03-24T15:52:45.268276Z","alloc_id":"b3e17da0-8fec-fb9b-51b9-b14ea2c404e2","failed":false,"msg":"Template re-rendered","task":"haproxy","type":"Signaling"}
{"@level":"warn","@message":"(view) list.peerings: Get \"http://127.0.0.1:8500/v1/peerings?index=41156796\u0026stale=\u0026wait=300000ms\": context deadline exceeded (retry attempt 5 after \"4s\")","@module":"agent","@timestamp":"2025-03-24T15:52:57.033830Z"}
{"@level":"error","@message":"(runner) sending server error back to caller","@module":"agent","@timestamp":"2025-03-24T15:52:57.033891Z"}
{"@level":"warn","@message":"(view) list.peerings: Get \"http://127.0.0.1:8500/v1/peerings?index=41156796\u0026stale=\u0026wait=300000ms\": context deadline exceeded (retry attempt 6 after \"8s\")","@module":"agent","@timestamp":"2025-03-24T15:54:01.035274Z"}
{"@level":"error","@message":"(runner) sending server error back to caller","@module":"agent","@timestamp":"2025-03-24T15:54:01.035344Z"}
{"@level":"warn","@message":"(view) list.peerings: Get \"http://127.0.0.1:8500/v1/peerings?index=41156796\u0026stale=\u0026wait=300000ms\": context deadline exceeded (retry attempt 7 after \"16s\")","@module":"agent","@timestamp":"2025-03-24T15:55:09.037003Z"}
{"@level":"error","@message":"(runner) sending server error back to caller","@module":"agent","@timestamp":"2025-03-24T15:55:09.037086Z"}
{"@level":"warn","@message":"(view) list.peerings: Get \"http://127.0.0.1:8500/v1/peerings?index=41156796\u0026stale=\u0026wait=300000ms\": context deadline exceeded (retry attempt 8 after \"32s\")","@module":"agent","@timestamp":"2025-03-24T15:56:25.039093Z"}
{"@level":"error","@message":"(runner) sending server error back to caller","@module":"agent","@timestamp":"2025-03-24T15:56:25.039145Z"}
{"@level":"warn","@message":"(view) list.peerings: Get \"http://127.0.0.1:8500/v1/peerings?index=41156796\u0026stale=\u0026wait=300000ms\": context deadline exceeded (retry attempt 9 after \"1m0s\")","@module":"agent","@timestamp":"2025-03-24T15:57:57.040381Z"}
{"@level":"error","@message":"(runner) sending server error back to caller","@module":"agent","@timestamp":"2025-03-24T15:57:57.040425Z"}
{"@level":"warn","@message":"(view) list.peerings: Get \"http://127.0.0.1:8500/v1/peerings?index=41156796\u0026stale=\u0026wait=300000ms\": context deadline exceeded (retry attempt 10 after \"1m0s\")","@module":"agent","@timestamp":"2025-03-24T15:59:57.041280Z"}
{"@level":"error","@message":"(runner) sending server error back to caller","@module":"agent","@timestamp":"2025-03-24T15:59:57.041330Z"}
{"@level":"warn","@message":"(view) list.peerings: Get \"http://127.0.0.1:8500/v1/peerings?index=41156796\u0026stale=\u0026wait=300000ms\": context deadline exceeded (retry attempt 11 after \"1m0s\")","@module":"agent","@timestamp":"2025-03-24T16:01:57.042940Z"}
{"@level":"error","@message":"(runner) sending server error back to caller","@module":"agent","@timestamp":"2025-03-24T16:01:57.042997Z"}
{"@level":"warn","@message":"(view) list.peerings: Get \"http://127.0.0.1:8500/v1/peerings?index=41156796\u0026stale=\u0026wait=300000ms\": context deadline exceeded (retry attempt 12 after \"1m0s\")","@module":"agent","@timestamp":"2025-03-24T16:03:57.044784Z"}
{"@level":"error","@message":"(runner) sending server error back to caller","@module":"agent","@timestamp":"2025-03-24T16:03:57.044858Z"}
{"@level":"error","@message":"(view) list.peerings: Get \"http://127.0.0.1:8500/v1/peerings?index=41156796\u0026stale=\u0026wait=300000ms\": context deadline exceeded (exceeded maximum retries)","@module":"agent","@timestamp":"2025-03-24T16:05:57.045917Z"}
{"@level":"error","@message":"(runner) sending server error back to caller","@module":"agent","@timestamp":"2025-03-24T16:05:57.045988Z"}
{"@level":"error","@message":"(runner) watcher reported error: list.peerings: Get \"http://127.0.0.1:8500/v1/peerings?index=41156796\u0026stale=\u0026wait=300000ms\": context deadline exceeded","@module":"agent","@timestamp":"2025-03-24T16:05:57.046013Z"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2025-03-24T16:05:57.046067Z","alloc_id":"b3e17da0-8fec-fb9b-51b9-b14ea2c404e2","failed":true,"msg":"Template failed: list.peerings: Get \"http://127.0.0.1:8500/v1/peerings?index=41156796\u0026stale=\u0026wait=300000ms\": context deadline exceeded","task":"haproxy","type":"Killing"}

Metadata

Metadata

Assignees

Type

No type

Projects

Status

In Progress

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions