
Conversation

@lubronzhan
Contributor

@lubronzhan lubronzhan commented Mar 5, 2025

What this PR does / why we need it:
With the current script, install-cni.sh gets stuck in an infinite loop and never continues to ip-control-loop.
This PR extracts the loop into a background script, token-watcher.sh.
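A minimal sketch of what such a background watcher could look like, assuming checksum-based change detection and a hypothetical `generate_kubeconfig` helper (this is illustrative, not the exact script in the PR):

```shell
#!/bin/sh
# Illustrative sketch of a token-watcher.sh loop (not the exact PR script).
SA_DIR=/var/run/secrets/kubernetes.io/serviceaccount

# Cheap change detection via md5 checksums.
checksum() { md5sum "$1" 2>/dev/null | cut -d' ' -f1; }

token_watch() {
  token_sum=$(checksum "$SA_DIR/token")
  ca_sum=$(checksum "$SA_DIR/ca.crt")
  while true; do
    new_token=$(checksum "$SA_DIR/token")
    new_ca=$(checksum "$SA_DIR/ca.crt")
    if [ "$new_token" != "$token_sum" ] || [ "$new_ca" != "$ca_sum" ]; then
      # Assumed helper: rewrites the CNI kubeconfig with the fresh token/CA.
      generate_kubeconfig
      token_sum=$new_token
      ca_sum=$new_ca
    fi
    sleep 60
  done
}

# install-cni.sh would launch this in the background, e.g.
#   ./token-watcher.sh &
# so the main script can continue on to ip-control-loop instead of blocking.
```

The key point of the PR is the last step: because the loop runs in a separate backgrounded process, the main install script no longer blocks forever.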

Example whereabouts log

kubo@OuN5Mh8kyHu3u:~$ kubectl logs -n kube-system -f whereabouts-hq9fs
Done configuring CNI.  Sleep=false
Sleep and Watching for service account token and CA file changes...
2025-03-05T07:23:58Z [debug] Filtering pods with filter key 'spec.nodeName' and filter value 'wl-calico-controlplane-spbtt-7qzd7'
2025-03-05T07:23:58Z [verbose] pod controller created
2025-03-05T07:23:58Z [verbose] Starting informer factories ...
2025-03-05T07:23:58Z [verbose] Informer factories started
2025-03-05T07:23:58Z [verbose] starting network controller
2025-03-05T07:23:59Z [verbose] using expression: 30 4 * *

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #575

Special notes for your reviewer (optional):
Another option would be to add token refreshing in Go, like https://github.com/projectcalico/calico/pull/5910/files#diff-3bc859f09da6edba95b02904c6c5da879513d9d6e4c87e9c35ebc97b1afad6e6
Alternatively, rearchitect to the thick model, with the shim binary talking to the whereabouts daemon through a socket instead of directly to the apiserver.

@coveralls

coveralls commented Mar 12, 2025

Pull Request Test Coverage Report for Build 13906708718

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 8 unchanged lines in 1 file lost coverage.
  • Overall coverage decreased (-0.2%) to 51.312%

Files with coverage reduction:
  • pkg/controlloop/pod.go — 8 new missed lines (now 81.04%)

Totals:
  • Change from base Build 13898027075: -0.2%
  • Covered Lines: 1955
  • Relevant Lines: 3810

💛 - Coveralls

@mlguerrero12
Collaborator

I think this approach is good to fix what we currently have, but ideally, we should refresh it in code once we get a 401 error. Is that what Calico does?

@mlguerrero12 mlguerrero12 requested a review from bpickard22 March 12, 2025 17:11
@lubronzhan lubronzhan force-pushed the topic/lubron/refresh_token branch 2 times, most recently from 45421a4 to 1a65b6c Compare March 12, 2025 18:41
@lubronzhan
Contributor Author

lubronzhan commented Mar 12, 2025

> I think this approach is good to fix what we currently have, but ideally, we should refresh it in code once we get a 401 error. Is that what Calico does?

Calico just rotates the token in the kubeconfig if it is about to expire in 15 minutes or less, given the default token TTL of 1 hour (the buffer time is 1/4 of the TTL). Calico also generates a separate ServiceAccount for this kubeconfig.

Yeah, ideally doing it in Go would be better. For now this is just a simple working version.
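The rotation rule described above (refresh when the remaining lifetime drops to 1/4 of the TTL or less) can be sketched as a small shell predicate; the function name and argument layout here are illustrative, not Calico's actual code:

```shell
#!/bin/sh
# Illustrative: decide whether a token should be rotated, given its exp/iat
# claims (seconds since epoch) and the current time. The buffer is TTL / 4,
# which works out to 15 minutes for a 1-hour token.
should_rotate() {
  exp=$1; iat=$2; now=$3
  ttl=$(( exp - iat ))
  buffer=$(( ttl / 4 ))
  remaining=$(( exp - now ))
  [ "$remaining" -le "$buffer" ]
}
```

For a token issued at t=0 with a 1-hour TTL, this predicate starts returning true at t=45min, matching the "15 minutes or less" behavior described above.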

@lubronzhan
Contributor Author

lubronzhan commented Mar 15, 2025

Interesting. I decoded the token, and the TTL is actually 1 year.

/ # cat /var/run/secrets/kubernetes.io/serviceaccount/token
eyJhbGciOiJSUzI1NiIsImtpZCI6Ik9NZjBLWXlLSHlIRDhZWmx0SEFkNURxOE5YTVU3bDVjdi1fdjJ4bkJfZnMifQ.eyJhdWQiOlsiaHR0cHM6Ly9rdWJlcm5ldGVzLmRlZmF1bHQuc3ZjLmNsdXN0ZXIubG9jYWwiXSwiZXhwIjoxNzczNTQ3NDYxLCJpYXQiOjE3NDIwMTE0NjEsImlzcyI6Imh0dHBzOi8va3ViZXJuZXRlcy5kZWZhdWx0LnN2Yy5jbHVzdGVyLmxvY2FsIiwia3ViZXJuZXRlcy5pbyI6eyJuYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsInBvZCI6eyJuYW1lIjoid2hlcmVhYm91dHMtNnQ1Nm4iLCJ1aWQiOiI3MWFmNjFlMS0zZDNhLTQzNWEtYTUxZS01NWFkMDhhOTQ3NWMifSwic2VydmljZWFjY291bnQiOnsibmFtZSI6IndoZXJlYWJvdXRzIiwidWlkIjoiNTE2OTgyZTItZDYxNC00NTY5LTg0YjgtZDNjNzliN2FmNGYxIn0sIndhcm5hZnRlciI6MTc0MjAxNTA2OH0sIm5iZiI6MTc0MjAxMTQ2MSwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50Omt1YmUtc3lzdGVtOndoZXJlYWJvdXRzIn0.ff-UqSVH2RKpooWOiceaWGXGbDyh1JbcCPlwq36Eau0ffR6x2x2G99heHd6tnLcNIX6_GFk0PERejeYYlWN74MQtuh16TKtRI0cNKcEngyBZg_MvPxyZrbPEXl8mWuBH52hR8uuqwoNzoz30-jkQC_2ZxSNR1EQsuKhwEQpHawr2fRubB6JREscGsl5tkccYOQdMucIszKf-B6VPGcapf-eZYDuZTe0xch0ALkBJVE76-TLwpCWcq5QvuWQiEXHFKF6Y_BnlhQBja1e8tqUjYnPWkVMrDCcXnsGtBQHTxxh4yI0mDxakr4fuizoldve-ZnSzvNZg6F2KfksIjLTccw
[screenshot: decoded token claims]
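The TTL can be checked from the token's `exp` and `iat` claims (1773547461 − 1742011461 = 31536000 s = 365 days). A generic helper for that decoding, sketched here with a base64url re-padding step (this is not part of the PR, just a way to reproduce the check):

```shell
#!/bin/sh
# Illustrative helper: decode the payload (second dot-separated segment) of a
# JWT and print its TTL in days. The while loop restores the base64 padding
# that JWTs strip; tr maps base64url characters back to standard base64.
jwt_ttl_days() {
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  while [ $(( ${#payload} % 4 )) -ne 0 ]; do payload="${payload}="; done
  claims=$(printf '%s' "$payload" | base64 -d)
  exp=$(printf '%s' "$claims" | grep -o '"exp":[0-9]*' | cut -d: -f2)
  iat=$(printf '%s' "$claims" | grep -o '"iat":[0-9]*' | cut -d: -f2)
  echo $(( (exp - iat) / 86400 ))
}
```

Running this against the token above yields 365, i.e. a 1-year TTL rather than the 1-hour default.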

I didn't configure any token TTL settings in the kubelet:

 cat /var/lib/kubelet/config.yaml
apiVersion: kubelet.config.k8s.io/v1beta1
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 0s
    enabled: true
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 0s
    cacheUnauthorizedTTL: 0s
cgroupDriver: systemd
clusterDNS:
- 100.64.0.10
clusterDomain: cluster.local
containerRuntimeEndpoint: ""
cpuManagerReconcilePeriod: 0s
evictionPressureTransitionPeriod: 0s
fileCheckFrequency: 0s
healthzBindAddress: 127.0.0.1
healthzPort: 10248
httpCheckFrequency: 0s
imageMaximumGCAge: 0s
imageMinimumGCAge: 0s
kind: KubeletConfiguration
logging:
  flushFrequency: 0
  options:
    json:
      infoBufferSize: "0"
  verbosity: 0
memorySwap: {}
nodeStatusReportFrequency: 0s
nodeStatusUpdateFrequency: 0s
resolvConf: /run/systemd/resolve/resolv.conf
rotateCertificates: true
runtimeRequestTimeout: 0s
shutdownGracePeriod: 0s
shutdownGracePeriodCriticalPods: 0s
staticPodPath: /etc/kubernetes/manifests
streamingConnectionIdleTimeout: 0s
syncFrequency: 0s
volumeStatsAggPeriod: 0s
/usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cloud-provider=external --container-runtime-endpoint=unix:///var/run/containerd/containerd.sock --event-qps=50 --make-iptables-util-chains=true --node-ip=10.162.55.1 --node-labels=image-type=ova,os-arch=amd64,os-name=ubuntu,os-type=linux,os-version=2204, --pod-infra-container-image=pause:3.9 

@mlguerrero12
Collaborator

LGTM

Thanks @lubronzhan. Before I merge this, please organize the commits. Each one should have a relevant title and description. You could also squash them and have a single one.

* Calls token-watcher.sh in background
* Update the manifest
@lubronzhan lubronzhan force-pushed the topic/lubron/refresh_token branch from 1dc0bf1 to 47db6d8 Compare March 17, 2025 18:08
@mlguerrero12 mlguerrero12 merged commit 807db9d into k8snetworkplumbingwg:master Mar 19, 2025
11 checks passed
@lubronzhan lubronzhan deleted the topic/lubron/refresh_token branch March 19, 2025 16:20