Nomad 1.9.7 client does not start with "host_network" option set. #25591

Open
@neuroserve

Nomad version

Nomad v1.9.7
BuildDate 2025-03-11T09:07:15Z
Revision f869597+CHANGES

Operating system and Environment details

root@nomad-client-0:/etc/nomad# cat /etc/debian_version
11.11
root@nomad-client-0:/etc/nomad# uname -a
Linux nomad-client-0 5.10.0-34-cloud-amd64 #1 SMP Debian 5.10.234-1 (2025-02-24) x86_64 GNU/Linux

Issue

I had been running a Nomad client on version 1.9.7 for some time. Then I added

  host_network "overlay" {
    interface = "nebula1"
  }
  host_network "internal" {
    interface = "ens3"
  }

to the nomad.hcl config file and restarted Nomad. Nomad is started via systemd, so after the change it went into a restart loop. The client logged:

Apr 3 08:36:42 nomad-client-0 nomad[8142]: SDK 2025/04/03 08:36:42 WARN falling back to IMDSv1: operation error ec2imds: getToken, http response error StatusCode: 404, request to EC2 IMDS failed

and

Apr 3 08:42:55 nomad-client-0 nomad[8345]: 2025-04-03T08:42:55.038Z [ERROR] agent: error starting agent: error="client setup failed: fingerprinting failed: operation error ec2imds: GetMetadata, canceled, context deadline exceeded"
Apr 3 08:42:55 nomad-client-0 systemd[1]: nomad.service: Main process exited, code=exited, status=1/FAILURE
Apr 3 08:42:55 nomad-client-0 systemd[1]: nomad.service: Failed with result 'exit-code'.
Apr 3 08:42:57 nomad-client-0 systemd[1]: nomad.service: Scheduled restart job, restart counter is at 1.
Apr 3 08:42:57 nomad-client-0 systemd[1]: Stopped Nomad.
Apr 3 08:42:57 nomad-client-0 systemd[1]: Started Nomad.
Interestingly, even after stopping Nomad, deleting all contents of /opt/nomad, removing the new configuration options, and restarting, Nomad would not start.

Reproduction steps

Add a host_network configuration to your existing client configuration and restart Nomad 1.9.7.
I switched back to Nomad 1.8.3 (which was still installed on the client) and it came back up without problems using the host_network configuration.
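
For anyone trying to reproduce this, the following is a minimal sketch of where the two blocks sit in the client configuration. It assumes they are placed inside the `client` block, as the Nomad documentation describes for `host_network`. The `enabled = true` line and the surrounding layout are illustrative and not copied from the affected node's nomad.hcl.

```hcl
# Illustrative layout only; the rest of the real nomad.hcl is omitted.
client {
  enabled = true  # assumption: this node runs as a client

  # Host network bound to the Nebula overlay interface
  host_network "overlay" {
    interface = "nebula1"
  }

  # Host network bound to the primary NIC
  host_network "internal" {
    interface = "ens3"
  }
}
```

The interface names must exist on the host; here they are the ones from the report above.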

Expected Result

Nomad 1.9.7 should start without problems using a host_network configuration.
By the way, I have another client running Nomad 1.9.7 where this works. I have no idea why it does not work on this client.

Actual Result

See above. The client exits.
