-
Notifications
You must be signed in to change notification settings - Fork 444
Add support for drop-in configs #1710
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
18681f2 to
8193205
Compare
|
Please note. This is a very quick first draft. |
fff1388 to
456ff8f
Compare
controllers/object_controls.go
Outdated
| topLevelConfigFile = value | ||
| } | ||
| // TODO: This should be a sane default. | ||
| dropInConfigFile := "/run/toolkit/config/99-nvidia.toml" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question -- what about /run/nvidia/toolkit/config/99-nvidia.toml? The toolkit container creates the toolkit.pid file at /run/nvidia/toolkit/toolkit.pid
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have updated this to /run/nvidia/toolkit/config/99-nvidia.toml.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tariq1890 and I discussed this offline. I have updated the host path to /etc/containerd/conf.d/99-nvidia.toml
controllers/object_controls.go
Outdated
| case gpuv1.Containerd.String(): | ||
| runtimeConfigFile = DefaultContainerdConfigFile | ||
| topLevelConfigFile := DefaultContainerdConfigFile | ||
| // TODO: We should also read RUNTIME_CONFIG here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack. Should RUNTIME_CONFIG, if defined, take precedence over CONTAINERD_CONFIG/CRIO_CONFIG?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the latter should have higher precedence as it is more specific than RUNTIME_CONFIG. If you're in agreement with this, can we resolve the TODO in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Let me update this...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated so we now read RUNTIME_CONFIG as well, with the more specific envvars, e.g. CONTAINERD_CONFIG / CRIO_CONFIG, taking precedence. I also added additional unit test cases for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
b3245de to
f60dadd
Compare
aad82d3 to
c5dc4f4
Compare
|
Why can't we mount the parent dir of the top-level config and the drop-in config dir? It seems like a lot of the container-toolkit implementation details are spilling into this PR? Any way to minimise? |
We are mounting the parent dir for both config files. |
c5dc4f4 to
764cdb3
Compare
This change adds explicit support for drop-in configs as supported by containerd and cri-o. Signed-off-by: Evan Lezar <elezar@nvidia.com> Co-authored-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
764cdb3 to
584c4c5
Compare
🚀 🚀 🚀 |
This change adds explicit support for drop-in configs as supported by containerd and cri-o.
See: