Allow Agent Reload with API - /v1/agent/reload#27106
Allow Agent Reload with API - /v1/agent/reload#27106benjamin-lykins wants to merge 28 commits intohashicorp:mainfrom
Conversation
tgross
left a comment
There was a problem hiding this comment.
@benjamin-lykins there's a failing test not touched here but clearly related to this work. Let's get that resolved before we review:
=== FAIL: command/agent TestServer_Reload_TLS_Certificate (0.00s)
panic: runtime error: invalid memory address or nil pointer dereference [recovered, repanicked]
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x30fefa3]
goroutine 87962 [running]:
testing.tRunner.func1.2({0x3898300, 0x69a5e70})
/opt/hostedtoolcache/go/1.25.3/x64/src/testing/testing.go:1872 +0x237
testing.tRunner.func1()
/opt/hostedtoolcache/go/1.25.3/x64/src/testing/testing.go:1875 +0x35b
panic({0x3898300?, 0x69a5e70?})
/opt/hostedtoolcache/go/1.25.3/x64/src/runtime/panic.go:783 +0x132
github.com/hashicorp/nomad/command/agent.(*Agent).Reload(0xc00c83c700, 0xc00d476d08)
/home/runner/work/nomad/nomad/command/agent/agent.go:1533 +0x63
github.com/hashicorp/nomad/command/agent.TestServer_Reload_TLS_Certificate(0xc00a9f8540)
/home/runner/work/nomad/nomad/command/agent/agent_test.go:1197 +0x1d7
testing.tRunner(0xc00a9f8540, 0x42c8720)
/opt/hostedtoolcache/go/1.25.3/x64/src/testing/testing.go:1934 +0xea
created by testing.(*T).Run in goroutine 1
/opt/hostedtoolcache/go/1.25.3/x64/src/testing/testing.go:1997 +0x465
DONE 5542 tests, 12 skipped, 4 failures in 182.433s
I think I know what is causing this. I'll get it fixed. |
|
@tgross fixed the test, which then uncovered another broken test after another I changed the return response. both of those were cleaned up and looks happier now here. |
tgross
left a comment
There was a problem hiding this comment.
This is a good start @benjamin-lykins!
In addition to the comments I've left, make sure you've added a changelog entry file. You can do that via make cl. Once this is getting closer to being ready, we'll need a docs PR to go with this, which is now going to be over in the web-unified-docs repo.
|
|
||
| newConf := DefaultConfig() | ||
|
|
||
| for _, path := range currConf.Files { |
There was a problem hiding this comment.
If the -config flag in the command line arguments for this agent points to a directory, this Files field will contain only the files that were in that directory at the time we started the agent. If you add a new file, it won't be present on reload. We may need to figure out a way to get the original -config flag attached to the config for this to work.
There was a problem hiding this comment.
Thanks for catching this one, wasn't thinking of it or realized when I originally ran through it. Made some modifications to save the -config flag values and tested locally without issue. Tested if I passed over two -config as well and worked as expected. I was also able to reuse the logic in LoadConfig to check each path, which seems to work out well. Lastly, I tested adding files with bad configuration and got a decent response back, but I could also just send a canned message back
curl -k -X PUT \
https://localhost:4646/v1/agent/reload
Error loading /Users/benjamin.lykins/git/nomad/nomad/bin/config1/badfile.hcl: failed to decode HCL file /Users/benjamin.lykins/git/nomad/nomad/bin/config1/badfile.hcl: At 1:6: key 'asdf' expected start of object ('{') or assignment ('=')|
|
||
| newConf.Files = append([]string(nil), currConf.Files...) | ||
|
|
||
| if err := s.agent.Reload(newConf); err != nil { |
There was a problem hiding this comment.
If you take a look at the signal handler for reloading, you'll see that this step only reloads the agent configuration and won't push any changes to the server, client, or HTTP server configuration. We should at least have parity with the signal handler here, which calls (*Command) handleReload. We may need to figure out a way to extract that function so that it's reachable here.
There was a problem hiding this comment.
This still needs to be done, and I'll start looking ahead at how to get this going. I saw this when I was mucking around with TLS rotations with server/client, and it was only updating the more general agent config.
…w config values/files
|
tgross
left a comment
There was a problem hiding this comment.
I think I see where you're going with this, but there's an architectural issue. See my comment for more details.
Description
Provide an API endpoint to reload agent config without elevated permissions on the infrastructure running Nomad.
Testing
Reloading log level without ACL
curl -k -X PUT \ https://localhost:4646/v1/agent/reload {"message":"agent configuration reloaded"}Attempting without token (ACL Enabled):
curl -k -X PUT \ https://localhost:4646/v1/agent/reload Permission deniedAttempting with token (ACL Enabled;
Agent = Write):Attempting with token (ACL Enabled;
Agent = Read):curl -k -X PUT \ --header "X-Nomad-Token: 2cde0074-d0ac-53e1-a6dc-3f416a8e1737" \ https://localhost:4646/v1/agent/reload Permission deniedInvalid Config File:
curl -k -X PUT \ --header "X-Nomad-Token: 79d77f6a-e168-3bbb-8fa7-e668b2b09346" \ https://localhost:4646/v1/agent/reload Error loading /Users/benjamin.lykins/git/nomad/nomad/bin/config/general.hcl: failed to decode HCL file /Users/benjamin.lykins/git/nomad/nomad/bin/config/general.hcl: At 3:16: literal not terminatedLinks
Fixes: #24048
Contributor Checklist
changelog entry using the
make clcommand.ensure regressions will be caught.
and job configuration, please update the Nomad website documentation to reflect this. Refer to
the website README for docs guidelines. Please also consider whether the
change requires notes within the upgrade guide.
Reviewer Checklist
backporting document.
in the majority of situations. The main exceptions are long-lived feature branches or merges where
history should be preserved.
within the public repository.
Changes to Security Controls
Are there any changes to security controls (access controls, encryption, logging) in this pull request? If so, explain.