New kubernetes backend #21796
Conversation
cc @tgolsson who I know has done some k8s related work at https://github.com/tgolsson/pants-backends/tree/main/pants-plugins/k8s
Broad topic, not intended to derail this whole PR: something we have struggled with internally is that we would like to be able to generate the final reified k8s yaml for inspection or alternative deployments. (I have a PoC plugin that does this for helm.) This is along the lines of 'package', but not quite what helm means by 'package', and I'm unsure what to call it. It would be nice if all the Pants k8s generators could eventually both "spit out the yaml" and "deploy".
Haha, we have exactly this problem; I've called the goal …
Looks good!
My only real sticking point is about the semantics of the requirement for context
putative_targets = []

if k8s.tailor_source_targets:
    all_k8s_files_globs = req.path_globs("*.yaml")
include "*.yml" too? or maybe make this a customisable option? Not necessary for a first pass
Personally, I want consistent .yaml or .yml in the whole repo, so probably it should be customisable, yes
I agree on this being initially set to find both *.yaml and *.yml files. Users of this could end up confused why tailor is not finding their .yml files.
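A minimal sketch of the customisable variant discussed above, assuming a list-valued option on the k8s subsystem (the option name tailor_globs is hypothetical) and that req.path_globs accepts multiple patterns like the other tailor requests do:

from pants.option.option_types import StrListOption
from pants.option.subsystem import Subsystem


class K8sSubsystem(Subsystem):
    options_scope = "k8s"
    help = "Options for the kubernetes backend."

    # Hypothetical option: which globs `tailor` uses to find candidate manifest files.
    tailor_globs = StrListOption(
        default=["*.yaml", "*.yml"],
        help="File name globs used by `tailor` to discover k8s source files.",
    )


# In the tailor rule, expand the configured globs instead of hard-coding "*.yaml":
# all_k8s_files_globs = req.path_globs(*k8s.tailor_globs)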
if k8s.tailor_source_targets:
    all_k8s_files_globs = req.path_globs("*.yaml")
    all_k8s_files = await Get(Paths, PathGlobs, all_k8s_files_globs)
    unowned_k8s_files = set(all_k8s_files.files) - set(all_owned_sources)
I think that globbing every yaml file will be too broad, but I don't know that there's a standard way of detecting if a manifest is a k8s manifest (kubectl apply --validate=true --dry-run=client seems to need a server). Worst case people can turn off tailoring.
I'm thinking of the 2 most common use cases here:
- A developer's repo with code will probably use some kind of templating to generate k8s_sources (I'm planning to open a couple more PRs for that), so tailor for yaml won't be that useful. In this case I expect devs to disable tailoring.
- A DevOps repo with mostly k8s yaml files. In this case I expect tailor to work well.
Why not just also scan the file for the apiVersion string as a heuristic? The go tailor rules use a similar pattern, searching for package main to find putative go_binary targets:
if has_package_main(file_content.content) and has_go_mod_ancestor(dirname, all_go_mod_dirs):
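A rough sketch of that heuristic for the k8s case; the helper name has_api_version is made up here, and the commented-out lines only gesture at how it could slot into the tailor rule:

import re

_API_VERSION_RE = re.compile(rb"^apiVersion:\s*\S+", re.MULTILINE)


def has_api_version(content: bytes) -> bool:
    # Heuristic: treat a YAML file as a k8s manifest only if it declares a
    # top-level apiVersion key somewhere in the document.
    return bool(_API_VERSION_RE.search(content))


# In the tailor rule this would filter the unowned files before creating targets,
# e.g. by fetching their contents and keeping only matches:
# digest_contents = await Get(DigestContents, PathGlobs(unowned_k8s_files))
# k8s_manifest_files = [fc.path for fc in digest_contents if has_api_version(fc.content)]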
Also, given that YAML is used for many other purposes, I'm thinking that any user with this backend enabled is probably going to become annoyed if tailor creates false-positive k8s_source targets every time.
Also, would it make sense to have an option to restrict tailor usage to only certain directory trees?
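Restricting tailor to a subtree already works ad hoc by passing directory specs on the command line, but a dedicated option could look roughly like this; the option name tailor_search_dirs is invented, and the filtering is only a sketch alongside the subsystem sketched earlier:

# Hypothetical StrListOption on the k8s subsystem.
tailor_search_dirs = StrListOption(
    default=[],
    help="If non-empty, `tailor` only proposes k8s source targets under these directories.",
)

# In the rule, drop unowned files outside the configured roots:
# if k8s.tailor_search_dirs:
#     unowned_k8s_files = {
#         f for f in unowned_k8s_files
#         if any(f.startswith(d.rstrip("/") + "/") for d in k8s.tailor_search_dirs)
#     }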
Ok, it makes sense, but I don't want to do it here yet. Let me temporarily remove the whole tailor feature for now, and I'll open a separate PR where we can continue the discussion: 12ad882
@dataclass(frozen=True)
class VersionHash:
You could use pants.core.util_rules.external_tool.ExternalToolVersion
It's not quite the same thing, because platform here is not a predefined platform that pants uses, but a platform that needs to be mapped using url_platform_mapping. But I can make it work like this: 6efda67
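For context, a small illustration of the mapping direction being discussed: url_platform_mapping maps Pants platform names to the fragment used in download URLs, so a script that works backwards from downloaded artifacts has to invert it. The values below are illustrative, not necessarily the real kubectl mapping:

# Keys are Pants platform names, values are URL path fragments (illustrative).
platform_mapping = {
    "macos_arm64": "darwin/arm64",
    "macos_x86_64": "darwin/amd64",
    "linux_arm64": "linux/arm64",
    "linux_x86_64": "linux/amd64",
}

# The generator script starts from URLs/results, so it needs the inverse lookup:
backward_platform_mapping = {v: k for k, v in platform_mapping.items()}
assert backward_platform_mapping["linux/amd64"] == "linux_x86_64"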
)

backward_platform_mapping = {v: k for k, v in platform_mapping.items()}
for result in results:
You could have these output in semver order (instead of lexical order) by using from packaging.version import Version, collecting the versions, and then sorting with something like sorted(versions, key=lambda e: Version(e.version)).
Sure 9d84c9e
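A minimal, self-contained illustration of why the Version key matters (the version strings here are made up):

from packaging.version import Version

versions = ["1.31.0", "1.9.2", "1.30.1"]

# Lexical order puts "1.9.2" after "1.31.0" because "9" > "3" as a character.
print(sorted(versions))                # ['1.30.1', '1.31.0', '1.9.2']
print(sorted(versions, key=Version))   # ['1.9.2', '1.30.1', '1.31.0']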
    platform: Platform,
) -> DeployProcess:
    context = field_set.context.value
    if context is None:
I think I've missed something in the logic here. Why do we force the field_set.context to have a value even when kubectl.pass_context is False?
The idea is that setting context on a target is required. This is because we had recurring issues where somebody forgot to set the correct context on the target and it got deployed to a different context by accident. However, we've got CI agents running in the cluster itself, so they can only deploy k8s objects to the specific cluster specified with KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT; that's why we need to disable the --context argument in pants.ci.toml.
I understand this is a somewhat custom setup, so people might need some other behavior. However, in general, I think the context requirement is a good thing that can prevent misdeployments. I can probably move this context validation to a small linter and check the context field there, wdyt?
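A condensed sketch of the behaviour described in this thread; the error message and the exact kubectl argument assembly are illustrative, but the field and option names (context, kubectl.pass_context) come from this PR:

# The target must always declare a context, but --context is only forwarded to
# kubectl when [kubectl].pass_context is true; pants.ci.toml disables it so that
# in-cluster CI agents fall back to KUBERNETES_SERVICE_HOST/KUBERNETES_SERVICE_PORT.
context = field_set.context.value
if context is None:
    raise ValueError(
        f"The `context` field must be set on {field_set.address} to avoid "
        "deploying manifests to an unintended cluster."
    )

argv = ["apply", "-f", "manifest.yaml"]  # placeholder file name
if kubectl.pass_context:
    argv += ["--context", context]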
It makes sense to have a failsafe to prevent deploying things to the wrong clusters. I was thinking of how the Helm backend does it: it has an optional HelmDeploymentNamespaceField and no field for context.
I think we can just add a line in the docs explaining the intended workflow. Something like:
> To prevent accidentally deploying kubernetes manifests to the wrong cluster, the context field is required on k8s_bundles for deployment.
> For deploying the same k8s_bundle to multiple contexts, consider using parametrize, like k8s_bundle(context=parametrize("stg", "prd")).
> For CI agents which will only have access to a single context, set kubectl.pass_context to false to have them use their default context.
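A hypothetical BUILD file following that suggested workflow; k8s_sources is assumed to be the generator for the k8s_source targets mentioned above, and the field linking the bundle to its manifests is omitted because this thread does not show it:

k8s_sources(name="manifests")

k8s_bundle(
    name="app",
    context=parametrize("stg", "prd"),
)

# `pants deploy` can then address each variant separately,
# e.g. path/to:app@context=stg or path/to:app@context=prd.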
> I was thinking of how the Helm backend does it.
We have the exact same problem in the helm backend. I had to vendor it and keep maintaining the patch that adds the context requirement.
I'd be excited to see that!
@lilatomic Thank you for the review! Ready for another round.
Looks like some flaky test.
Opened a PR for …
3rdparty/python/requirements.txt
# Only used in build-support/
tqdm~=4.67.1
types-tqdm
Why not put these into their own resolve if they are only used internally in the repository? (For example, the pbs-script resolve exists for this reason for the src/python/pants/backend/python/providers/python_build_standalone/scripts/generate_urls.py script.)
Do you want me to parametrize the whole src/python/pants like this?
    **parametrize("python-default", resolve="python-default"),
    **parametrize("build-support", resolve="build-support"),
I'm importing ExternalTool-s in the pex, so the sources have to be present in the resolve. I believe all the other not-really-requirements are in this requirements.txt exactly because of this problem.
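For readers following along, that parametrize-by-resolve pattern looks roughly like this in a BUILD file (the target type is illustrative; the resolve names are the ones quoted above):

python_sources(
    name="lib",
    # Two parametrize groups: the same sources become available in both resolves.
    **parametrize("python-default", resolve="python-default"),
    **parametrize("build-support", resolve="build-support"),
)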
Anyway, I think it makes sense to move this build-support script to a separate PR: f87cb21
If/when this PR becomes ready: we've just branched for 2.25, so merging this pull request now means it will come out in 2.26; please move the release notes updates to …
> fd _category_ docs/docs/ --max-depth 2 --exec rg position '{}' --with-filename --line-number | sort -k 3,3 -n
docs/docs/introduction/_category_.json:3: "position": 1
docs/docs/getting-started/_category_.json:3: "position": 2
docs/docs/using-pants/_category_.json:3: "position": 3
docs/docs/python/_category_.json:3: "position": 4
docs/docs/go/_category_.json:3: "position": 5
docs/docs/jvm/_category_.json:3: "position": 6
docs/docs/shell/_category_.json:3: "position": 7
docs/docs/docker/_category_.json:3: "position": 8
docs/docs/kubernetes/_category_.json:3: "position": 9
docs/docs/helm/_category_.json:3: "position": 10
docs/docs/terraform/_category_.json:3: "position": 11
docs/docs/sql/_category_.json:3: "position": 12
docs/docs/ad-hoc-tools/_category_.json:3: "position": 13
docs/docs/javascript/_category_.json:3: "position": 13
docs/docs/writing-plugins/_category_.json:3: "position": 14
docs/docs/releases/_category_.json:3: "position": 15
docs/docs/contributions/_category_.json:3: "position": 16
docs/docs/tutorials/_category_.json:3: "position": 17
Co-authored-by: Daniel Goldman <[email protected]>
@tdyas Thank you for the review! I decided to remove some features for now and open separate PRs for those. Ready for another round.
@grihabor are you looking for any design feedback at this point, or is this good to go from your point of view?
@cburroughs We're heavily using the plugin with this design internally, so it would be nice to keep it roughly as is. But you can definitely share your thoughts if you have reasons to tweak something, or if you have features in mind that would be impossible with this design. But yeah, I think we're good to go (I'm planning to open a couple more PRs that generate …).
Looks like the check on macos failed because of some flaky behavior.
This had sat for a while and I was trying to shepherd it along. I don't have any concerns with the design and am personally excited to see the other PRs you mentioned. Thanks for the patches!
Here is the PR with …
While the helm backend already exists and works fine for kubernetes, creating a helm chart might be overkill for a small thing like a single configmap or a secret. The new kubernetes backend can deploy a single object easily given a yaml file. See the docs for more details on usage.
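As a rough illustration of that minimal workflow for a single ConfigMap manifest: only the context field and the k8s_source/k8s_bundle target names are confirmed by this PR description, the rest is assumed.

# BUILD file sketch for deploying one configmap.yaml.
k8s_sources(name="manifests")  # owns configmap.yaml

k8s_bundle(
    name="configmap",
    context="stg",  # required, as a guard against deploying to the wrong cluster
    # ...plus whichever field ties the bundle to its manifest; its name is not
    # shown in this conversation, so it is omitted here.
)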
I'm also planning to contribute our integration with the python_format_string target (I will open a PR), which can handle simple python templating in text files.
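The kind of templating meant here is presumably plain Python format strings applied to file contents, given the target name; a toy illustration, not the plugin's actual code:

manifest_template = """\
apiVersion: v1
kind: ConfigMap
metadata:
  name: {name}
data:
  ENVIRONMENT: {env}
"""

# Substitute values the same way str.format would in any text file.
print(manifest_template.format(name="my-config", env="stg"))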