refactor: unify inconsistent variables in apiserver endpoint#12897
refactor: unify inconsistent variables in apiserver endpoint#12897kajal-jotwani wants to merge 14 commits intokubernetes-sigs:masterfrom
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: kajal-jotwani The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Hi @kajal-jotwani. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
roles/kubernetes/control-plane/templates/kubeadm-config.v1beta3.yaml.j2
Outdated
Show resolved
Hide resolved
roles/kubernetes/control-plane/templates/kubeadm-config.v1beta4.yaml.j2
Outdated
Show resolved
Hide resolved
| - name: Update server field in kubelet kubeconfig - external lb | ||
| - name: Update server field in kubelet kubeconfig - when using control plane endpoint | ||
| lineinfile: | ||
| dest: "{{ kube_config_dir }}/kubelet.conf" | ||
| regexp: '^ server: https' | ||
| line: ' server: {{ kube_apiserver_endpoint }}' | ||
| line: ' server: {{ kube_apiserver_cluster_internal_endpoint }}' | ||
| backup: true | ||
| when: | ||
| - ('kube_control_plane' not in group_names) | ||
| - loadbalancer_apiserver is defined | ||
| - kubeadm_use_control_plane_endpoint | default(false) | ||
| notify: Kubeadm | restart kubelet |
There was a problem hiding this comment.
Got it! So should I remove this task entirely?
There was a problem hiding this comment.
You can leave it, if the other PR merge before that will need a rebase, no big deal.
There was a problem hiding this comment.
Sorry I was confused regarding the other PR. We can delete both theses tasks ("Update server field"), with the correct variables into kubeadm configuration this should not be needed anymore.
| https://{{ loadbalancer_apiserver.address | ansible.utils.ipwrap }}:{{ loadbalancer_apiserver.port | default(kube_apiserver_port) }} | ||
| {%- elif ('kube_control_plane' not in group_names) and loadbalancer_apiserver_localhost -%} | ||
| https://localhost:{{ loadbalancer_apiserver_port | default(kube_apiserver_port) }} | ||
| {%- elif 'kube_control_plane' in group_names -%} |
There was a problem hiding this comment.
This won't be necessary because kubeadm does it by itself see #12870
359dc46 to
19417e2
Compare
|
Thanks a lot for the detailed review! |
| Access API endpoints are evaluated automatically, as the following: | ||
|
|
||
| | Endpoint type | kube_control_plane | non-master | external | | ||
| |------------------------------|------------------------------------------|-------------------------|-----------------------| | ||
| | Local LB (default) | `https://dbip:sp` | `https://lc:nsp` | `https://m[0].aip:sp` | | ||
| | Local LB (default) + cbip | `https://cbip:sp` and `https://lc:nsp` | `https://lc:nsp` | `https://m[0].aip:sp` | | ||
| | Local LB + Unmanaged here LB | `https://dbip:sp` | `https://lc:nsp` | `https://ext` | | ||
| | External LB, no internal | `https://dbip:sp` | `<https://lb:lp>` | `https://lb:lp` | | ||
| | No ext/int LB | `https://dbip:sp` | `<https://m[0].aip:sp>` | `https://m[0].aip:sp` | | ||
|
|
||
| Where: | ||
|
|
||
| * `m[0]` - the first node in the `kube_control_plane` group; | ||
| * `lb` - LB FQDN, `apiserver_loadbalancer_domain_name`; | ||
| * `lb` - External loadbalancer address from `kube_apiserver_endpoint`; | ||
| * `ext` - Externally load balanced VIP:port and FQDN, not managed by Kubespray; | ||
| * `lc` - localhost; | ||
| * `cbip` - a custom bind IP, `kube_apiserver_bind_address`; | ||
| * `dbip` - localhost for the default bind IP '0.0.0.0'; | ||
| * `nsp` - nginx secure port, `loadbalancer_apiserver_port`, defers to `sp`; | ||
| * `sp` - secure port, `kube_apiserver_port`; | ||
| * `lp` - LB port, `loadbalancer_apiserver.port`, defers to the secure port; | ||
| * `lp` - External loadbalancer port from `kube_apiserver_endpoint`; | ||
| * `ip` - the node IP, defers to the ansible IP; | ||
| * `aip` - `access_ip`, defers to the ip. | ||
|
|
||
| The `kube_apiserver_endpoint` and `kube_apiserver_cluster_internal_endpoint` variables | ||
| automatically resolve to the appropriate values based on your loadbalancer configuration. | ||
|
|
There was a problem hiding this comment.
I've been of the impression this whole section was a bit too much complicated to be clear (way too much variables).
Also, since we're going towards more alignement with kubeadm (see #12870 ) it will be wrong for control plane (always using localhost).
I would maybe simplify this and just mention both variables (kube_**endpoint) and maybe link to the defaults file where we would put comment explaining the computed default ?
@tico88612 what do you think ? I kinda struggle to know what's the right amount of details ?
(It's kinda out of scope of the PR though so it's okay if we don't fix this here).
| - name: Update server field in kubelet kubeconfig - external lb | ||
| - name: Update server field in kubelet kubeconfig - when using control plane endpoint | ||
| lineinfile: | ||
| dest: "{{ kube_config_dir }}/kubelet.conf" | ||
| regexp: '^ server: https' | ||
| line: ' server: {{ kube_apiserver_endpoint }}' | ||
| line: ' server: {{ kube_apiserver_cluster_internal_endpoint }}' | ||
| backup: true | ||
| when: | ||
| - ('kube_control_plane' not in group_names) | ||
| - loadbalancer_apiserver is defined | ||
| - kubeadm_use_control_plane_endpoint | default(false) | ||
| notify: Kubeadm | restart kubelet |
There was a problem hiding this comment.
You can leave it, if the other PR merge before that will need a rebase, no big deal.
| first_kube_control_plane_address: "{{ hostvars[groups['kube_control_plane'][0]]['main_access_ip'] }}" | ||
| loadbalancer_apiserver_localhost: "{{ loadbalancer_apiserver is not defined }}" | ||
| # Loadbalancer configuration | ||
| loadbalancer_apiserver_localhost: true |
There was a problem hiding this comment.
This is kinda tricky to get right if we want to keep the current behavior, roughly.
(aka, defaulting to LB localhost if we don't explicitely define a LB) 🤔 .
Possible solution:
Use an internal var (in /vars, which can't be overriden from inventory.) for the actual value, with something like kube_apiserver_endpoint | d(first_master_endpoint).
This then means we can check if the user level variable is defined with is defined.
Wdyt ?
There was a problem hiding this comment.
Yep, that makes sense. I considered a true/false user option, but that would lose the distinction between explicit intent and defaults. Keeping the user var optional and resolving an internal effective value preserves current behavior and lets us detect explicit LB config via is defined. should I go ahead and implement this?
There was a problem hiding this comment.
Yeah, I think so. We should still have the variable in defaults but with "{{ undef() }}" which should have the same effect as not defining it but be more explicit.
VannTen
left a comment
There was a problem hiding this comment.
Incomplete review of some stuff:
But the test failure might be related to the parenthese stuff, not sure.
| # TODO: remove after release 2.31 | ||
| - name: Stop if legacy apiserver LB variables are used | ||
| assert: | ||
| that: | ||
| - loadbalancer_apiserver is not defined | ||
| - apiserver_loadbalancer_domain_name is not defined | ||
| - kube_apiserver_global_endpoint is not defined | ||
| - kubeadm_config_api_fqdn is not defined | ||
| fail_msg: |- | ||
| API server loadbalancer variables have been deprecated. | ||
|
|
||
| Please update your inventory to use: | ||
| - kube_apiserver_endpoint | ||
| - kube_apiserver_cluster_internal_endpoint | ||
|
|
||
| And optionally: | ||
| - loadbalancer_apiserver_localhost (true/false) | ||
| - loadbalancer_apiserver_port | ||
| run_once: true | ||
| when: not ignore_assert_errors | ||
|
|
There was a problem hiding this comment.
Now that #12942 landed you can use it directly 👍
|
|
||
| # Internal endpoint for cluster traffic. | ||
| # Defaults to the global endpoint, unless we are using a localhost API load balancer. | ||
| kube_apiserver_cluster_internal_endpoint: "{{ 'https://127.0.0.1:' ~ loadbalancer_apiserver_port if loadbalancer_apiserver_localhost else kube_apiserver_endpoint }}" |
There was a problem hiding this comment.
I think you need some
| kube_apiserver_cluster_internal_endpoint: "{{ 'https://127.0.0.1:' ~ loadbalancer_apiserver_port if loadbalancer_apiserver_localhost else kube_apiserver_endpoint }}" | |
| kube_apiserver_cluster_internal_endpoint: "{{ ('https://127.0.0.1:' ~ loadbalancer_apiserver_port) if loadbalancer_apiserver_localhost else kube_apiserver_endpoint }}" |
I'm not sure but the if else might be more closely binding than ~ which would probably not work.
3dc1e99 to
0072dcc
Compare
|
/test pull-kubespray-yamllint |
85c7ebd to
a36237b
Compare
| {% else %} | ||
| apiServerEndpoint: "{{ kubeadm_discovery_address }}" | ||
| {% endif %} | ||
| apiServerEndpoint: "{{ _kube_apiserver_cluster_internal_endpoint | urlsplit('netloc') }}" |
There was a problem hiding this comment.
This should be the _kube_apiserver_endpoint here, not the internal one.
The reason for this is that this is used for discovery, aka, for the control plane, before there is a local apiserver. I suspect this causes the CI failures.
There was a problem hiding this comment.
Changed this to _kube_apiserver_endpoint, thanks for pointing that!!
I tried to fix the CI failures as well but I’m still not able to resolve them. I’m working on it and trying to figure out what’s breaking.
| kubeadm_discovery_address: >- | ||
| {%- if "127.0.0.1" in kube_apiserver_endpoint or "localhost" in kube_apiserver_endpoint -%} | ||
| {%- if "127.0.0.1" in _kube_apiserver_endpoint or "localhost" in _kube_apiserver_endpoint -%} | ||
| {{ first_kube_control_plane_address | ansible.utils.ipwrap }}:{{ kube_apiserver_port }} | ||
| {%- else -%} | ||
| {{ kube_apiserver_endpoint | regex_replace('https://', '') }} | ||
| {{ _kube_apiserver_endpoint | regex_replace('https://', '') }} | ||
| {%- endif %} |
There was a problem hiding this comment.
I don't think we need this at all, that variable should just go and we'll use _kube_apiserver_endpoint instead.
There could be an argument to expose it to the user for a endpoint only for discovery, but that's out of scope.
There was a problem hiding this comment.
Let's delete the kubeadm_discovery_address variable.
From my reading, it's replaceable with _kube_apiserver_endpoint without issue.
fbb09c2 to
34b8923
Compare
922c608 to
7a56c8e
Compare
|
@VannTen CI is now passing after the latest fixes and I have also incorporated the suggested changes!! |
roles/kubernetes/control-plane/templates/kubeadm-controlplane.yaml.j2
Outdated
Show resolved
Hide resolved
| kubeadm_discovery_address: >- | ||
| {%- if "127.0.0.1" in kube_apiserver_endpoint or "localhost" in kube_apiserver_endpoint -%} | ||
| {%- if "127.0.0.1" in _kube_apiserver_endpoint or "localhost" in _kube_apiserver_endpoint -%} | ||
| {{ first_kube_control_plane_address | ansible.utils.ipwrap }}:{{ kube_apiserver_port }} | ||
| {%- else -%} | ||
| {{ kube_apiserver_endpoint | regex_replace('https://', '') }} | ||
| {{ _kube_apiserver_endpoint | regex_replace('https://', '') }} | ||
| {%- endif %} |
There was a problem hiding this comment.
Let's delete the kubeadm_discovery_address variable.
From my reading, it's replaceable with _kube_apiserver_endpoint without issue.
| {% else %} | ||
| controlPlaneEndpoint: "{{ main_ip | ansible.utils.ipwrap }}:{{ kube_apiserver_port }}" | ||
| {% endif %} | ||
| controlPlaneEndpoint: "{{ _kube_apiserver_endpoint | urlsplit('netloc') }}" |
There was a problem hiding this comment.
I think these ones should still use _kube_apiserver_cluster_internal_endpoint, or did that break CI ?
This is different from the JoinConfiguration (which is used when adding a node/control plane), as it should be for normal operations, so for example the localhost loadbalancer should be there (which should only be internal)
If that doesn't work it probably means we have and ordering problems somewhere 🤔
There was a problem hiding this comment.
Yes CI was failing when using _kube_apiserver_cluster_internal_endpoint, which is why I changed it to _kube_apiserver_endpoint. I'll dig into it to understand what’s causing the issue
| {% else %} | ||
| controlPlaneEndpoint: "{{ main_ip | ansible.utils.ipwrap }}:{{ kube_apiserver_port }}" | ||
| {% endif %} | ||
| controlPlaneEndpoint: "{{ _kube_apiserver_endpoint | urlsplit('netloc') }}" |
| - name: Update server field in kubelet kubeconfig - external lb | ||
| - name: Update server field in kubelet kubeconfig - when using control plane endpoint | ||
| lineinfile: | ||
| dest: "{{ kube_config_dir }}/kubelet.conf" | ||
| regexp: '^ server: https' | ||
| line: ' server: {{ kube_apiserver_endpoint }}' | ||
| line: ' server: {{ kube_apiserver_cluster_internal_endpoint }}' | ||
| backup: true | ||
| when: | ||
| - ('kube_control_plane' not in group_names) | ||
| - loadbalancer_apiserver is defined | ||
| - kubeadm_use_control_plane_endpoint | default(false) | ||
| notify: Kubeadm | restart kubelet |
There was a problem hiding this comment.
Sorry I was confused regarding the other PR. We can delete both theses tasks ("Update server field"), with the correct variables into kubeadm configuration this should not be needed anymore.
7a56c8e to
4e461ba
Compare
175fa57 to
ff949d3
Compare
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
This PR replaces several overlapping and inconsistently named API server
load balancer variables with two clear endpoint variables.
It updates defaults, templates, and documentation to consistently use these
endpoints, making it easier to understand which address is used externally and
which is used internally within the cluster, while keeping existing setups
working.
Which issue(s) this PR fixes:
Fixes #12883
Special notes for your reviewer:
Does this PR introduce a user-facing change?: