re: separate the connect-agent binary into its own chart #26

Merged
madalazar merged 10 commits into main from i-27361 on May 1, 2025

@madalazar madalazar commented Apr 24, 2025

Description

This change separates the connect-agent binary into its own chart so that the CCG controller can be updated independently. It does not fix the fact that the connect-agent cannot be updated on the same machine: given the current implementation, in-place upgrade of the agent cannot be supported.

Any Newly Introduced Dependencies

Please describe any newly introduced 3rd party dependencies in this change. List their name, license information and how they are used in the project.

How Has This Been Tested?

Tested locally in two scenarios: an updated CCG and an updated connect agent.

Updated CCG

+++ b/cmd/connect-controller/main.go
@@ -238,6 +238,7 @@ func main() {
        }

   ..
+       setupLog.Info("**** starting manager v1.1.1 for testing. will mimic a version bump")
        if err := mgr.Start(ctx); err != nil {
...

Bumped the CCG chart version to 1.1.1 and left the agent at 0.1.0, still using image: "registry-rs.edgeorchestration.intel.com/edge-orch/cluster/connect-agent:1.0.6"

Ran make helm-build to package the chart as a .tgz file, then make docker-build to build the Docker images. Pulled everything onto a coder machine to use with the cluster tests, and installed the chart manually from the .tgz file using helm. After updating CCG, the agent image remained the same and the cluster stayed active and healthy.
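The steps above can be sketched as a short script. This is a reconstruction under assumptions: the make targets, image names and chart file name are taken from this description, while the run helper and the DRY_RUN and VERSION variables are hypothetical additions so the sequence can be previewed without a cluster. It also assumes the build and the kind cluster live on the same machine, which was not the case in the test above.

```shell
#!/bin/sh
# Sketch of the manual CCG upgrade flow described above.
# DRY_RUN, VERSION and run() are illustrative helpers, not part of the repository.
DRY_RUN="${DRY_RUN:-1}"
VERSION="${VERSION:-1.1.1}"

# run prints the command in dry-run mode and executes it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

upgrade_ccg() {
    # Package the chart and build the images.
    run make helm-build
    run make docker-build
    # Make the controller and gateway images available to the kind node.
    run kind load docker-image "localhost:5000/connect-controller:${VERSION}"
    run kind load docker-image "localhost:5000/connect-gateway:${VERSION}"
    # Upgrade the release from the packaged chart.
    run helm upgrade --install cluster-connect-gateway "./cluster-connect-gateway-${VERSION}.tgz"
}

upgrade_ccg
```

By default the script only prints the commands; setting DRY_RUN=0 would execute them.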

> make test
...
Switched to a new branch 'i-27361'
Running command: cd _workspace/cluster-connect-gateway && VERSION=v0.0.0 HELM_VERSION=v0.0.0 KIND_CLUSTER=kind NAMESPACE=default HELM_VARS="--set controller.privateCA.enabled=false --set agent.image.tag=latest" make docker-build
make[1]: Entering directory '/home/seu/open-edge-platform/cluster-tests/_workspace/cluster-connect-gateway'
GOPRIVATE=* go mod vendor

STEP: Checking that connect agent metric shows a successful connection @ 04/28/25 07:38:00.442
        found metric: websocket_connections_total{status="succeeded"} 1
Total time from cluster creation to fully active: 3m39.460222128s 🚀 ✅
• [227.141 seconds]
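
The metric check reported above can be approximated by hand. This is a minimal sketch, not the cluster test's own implementation: has_successful_connection is a hypothetical helper that reads Prometheus text-format metrics on stdin, and how the metrics are fetched from the agent is left out because the description does not say.

```shell
# has_successful_connection succeeds when the metrics read from stdin contain
# websocket_connections_total{status="succeeded"} with a value of at least 1.
# Hypothetical helper for illustration only.
has_successful_connection() {
    awk '
        # Match lines that start with the exact metric name and label set.
        index($0, "websocket_connections_total{status=\"succeeded\"}") == 1 {
            # The sample value is the second whitespace-separated field.
            if ($2 + 0 >= 1) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

printf '%s\n' 'websocket_connections_total{status="succeeded"} 1' |
    has_successful_connection && echo "agent connected"
```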

>kind load docker-image localhost:5000/connect-controller:1.1.1
Image: "" with ID "sha256:d963c07ee6d1baff7b326521d8732e61310498bb30fc4392a92cfb50853858da" not yet present on node "kind-control-plane", loading...
>  kind load docker-image  localhost:5000/connect-gateway:1.1.1
Image: "" with ID "sha256:9c6c2dd5c9657da632660834405188282346c7f520c7aa78bf55377eeffbe20c" not yet present on node "kind-control-plane", loading...

> kubectl describe pods  cluster-connect-gateway-gateway-8556b88fbd-842jm | grep -i image
    Image:           registry-rs.edgeorchestration.intel.com/edge-orch/cluster/connect-gateway:v0.0.0
    Image ID:        docker.io/library/import-2025-04-28@sha256:9352784131e9c86cacce9264b4aab5da68082c7413d0ba9740eaa79e2a53a4fa

> root@cluster-agent-0:/etc/enic# /var/lib/rancher/rke2/bin/kubectl  describe pods connect-agent-cluster-agent-0 -n kube-system --kubeconfig=/etc/rancher/rke2/rke2.yaml | grep -i image
    Image:           registry-rs.edgeorchestration.intel.com/edge-orch/cluster/connect-agent:1.0.6
    Image ID:        registry-rs.edgeorchestration.intel.com/edge-orch/cluster/connect-agent@sha256:b05e2137dd2566dc9da095896d6aa74cd93d45d62ae7b7f9e8569add5f611871

> helm upgrade --install cluster-connect-gateway ./cluster-connect-gateway-1.1.1.tgz
Release "cluster-connect-gateway" has been upgraded. Happy Helming!
...
REVISION: 2

> helm list -A
NAME                            NAMESPACE               REVISION        UPDATED                                 STATUS          CHART                                           APP VERSION
capi-operator                   capi-operator-system    1               2025-04-28 07:21:30.943926536 -0700 PDT deployed        cluster-api-operator-0.15.1                     0.15.1
cluster-connect-gateway         default                 2               2025-04-28 08:01:57.960466981 -0700 PDT deployed        cluster-connect-gateway-1.1.1                   1.1.1

>  kubectl describe pods  cluster-connect-gateway-gateway-79fb79f45-pj655 | grep -i image
    Image:           localhost:5000/connect-gateway:1.1.1
    Image ID:        docker.io/library/import-2025-04-28@sha256:b94cfd1307e4c2ae3bec0314cf4f5897e6057e1be4c6874d4dd99fcd8138f8cc


> root@cluster-agent-0:/etc/enic# /var/lib/rancher/rke2/bin/kubectl  describe pods connect-agent-cluster-agent-0 -n kube-system --kubeconfig=/etc/rancher/rke2/rke2.yaml | grep -i image
    Image:           registry-rs.edgeorchestration.intel.com/edge-orch/cluster/connect-agent:1.0.6


> clusterctl describe cluster demo-cluster -n 53cd37b9-66b2-4cc8-b080-3722ed7af64a
NAME                                                       READY  SEVERITY  REASON  SINCE  MESSAGE
Cluster/demo-cluster                                       True                     32m
├─ClusterInfrastructure - IntelCluster/demo-cluster-vgjll  True                     36m
└─ControlPlane - RKE2ControlPlane/demo-cluster-fcg7x       True                     32m
  └─Machine/demo-cluster-fcg7x-rkhmv                       True                     35m

> kubectl logs  cluster-connect-gateway-controller-55b68f78bd-ljr72
2025-04-28T15:01:59Z    INFO    setup   starting manager
2025-04-28T15:01:59Z    INFO    setup   **** starting manager v1.1.1 for testing. will mimic a version bump


>root@cluster-agent-0:/etc/enic# /var/lib/rancher/rke2/bin/kubectl  logs  connect-agent-cluster-agent-0 -n kube-system --kubeconfig=/etc/rancher/rke2/rke2.yaml
{"level":"info","ts":1745852526.6492581,"caller":"agent/agent.go:58","msg":"Token auth to gateway enabled","tunnel-id":"53cd37b9-66b2-4cc8-b080-3722ed7af64a-demo-cluster-vgjll"}
time="2025-04-28T15:02:06Z" level=info msg="Connecting to proxy" url="ws://cluster-connect-gateway.default.svc:8080/connect"
{"level":"info","ts":1745852526.6567497,"caller":"agent/agent.go:90","msg":"Connected to gateway","tunnel-id":"53cd37b9-66b2-4cc8-b080-3722ed7af64a-demo-cluster-vgjll"}
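
The agent log mixes two timestamp formats: epoch seconds in the JSON lines and RFC 3339 in the proxy line. A quick conversion shows that they describe the same second, a few seconds after the controller's 15:01:59Z start. The helper epoch_to_utc is hypothetical and only illustrates the arithmetic; it tries the GNU date syntax first and falls back to the BSD one.

```shell
# epoch_to_utc converts epoch seconds (fractional part ignored) to an
# RFC 3339 UTC timestamp. Hypothetical helper for illustration only.
epoch_to_utc() {
    secs="${1%%.*}"   # drop the fractional seconds
    date -u -d "@${secs}" '+%Y-%m-%dT%H:%M:%SZ' 2>/dev/null ||
        date -u -r "${secs}" '+%Y-%m-%dT%H:%M:%SZ'
}

epoch_to_utc 1745852526.6492581   # prints 2025-04-28T15:02:06Z
```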

> clusterctl describe cluster demo-cluster -n 53cd37b9-66b2-4cc8-b080-3722ed7af64a
NAME                                                       READY  SEVERITY  REASON  SINCE  MESSAGE
Cluster/demo-cluster                                       True                     3h1m
├─ClusterInfrastructure - IntelCluster/demo-cluster-vgjll  True                     3h5m
└─ControlPlane - RKE2ControlPlane/demo-cluster-fcg7x       True                     3h1m
  └─Machine/demo-cluster-fcg7x-rkhmv                       True                     3h3m

updated connect agent


--- a/cmd/connect-agent/main.go
+++ b/cmd/connect-agent/main.go
@@ -66,5 +66,6 @@ func main() {
                AuthToken:          authToken,
        }

+       logger.Info("**** about to run the connect agent, to mimic a version bump for the agent ****")
        agent.Run(ctx)
 }

>  docker load -i gateway-112.tar
> docker tag 988beb876946ca5d7e638f992d26c41337ba7da740e1466bf79e71fdd1330f0d localhost:5000/connect-gateway:1.1.2
>  docker load -i controller-112.tar
...
>  kind load docker-image  localhost:5000/connect-gateway:1.1.2
Image: "" with ID "sha256:988beb876946ca5d7e638f992d26c41337ba7da740e1466bf79e71fdd1330f0d" not yet present on node "kind-control-plane", loading...
>  kind load docker-image  localhost:5000/connect-controller:1.1.2
Image: "" with ID "sha256:05410e45cb361f1d799455cf7d93be88c2cdf832e62ae9401c4a665fa8ae9778" not yet present on node "kind-control-plane", loading...
>  kind load docker-image  localhost:5000/connect-agent:1.0.7
Image: "" with ID "sha256:860b5be0a19261e9d2d23c4b3ef3e5e5d0adcaf9affce5f1a3e984e18b9e0edf" not yet present on node "kind-control-plane", loading...

>  helm upgrade --install cluster-connect-gateway ./cluster-connect-gateway-1.1.2.tgz
Release "cluster-connect-gateway" has been upgraded. Happy Helming!
...
REVISION: 3
TEST SUITE: None

> helm list -A
NAME                            NAMESPACE               REVISION        UPDATED                                 STATUS          CHART                                           APP VERSION
... 
cluster-connect-gateway         default                 3               2025-04-28 11:06:41.74791032 -0700 PDT  deployed        cluster-connect-gateway-1.1.2                   1.1.2 


> kubectl describe pods  cluster-connect-gateway-controller-c87b874f5-sh9ls | grep -i image
    Image:           localhost:5000/connect-controller:1.1.2
    Image ID:        docker.io/library/import-2025-04-28@sha256:450b3307cb571db64c8b850fd82bac777f765a2be6740f09301c3ca95f8096c5

> clusterctl describe cluster demo-cluster -n 53cd37b9-66b2-4cc8-b080-3722ed7af64a
NAME                                                       READY  SEVERITY  REASON                                                       SINCE  MESSAGE
Cluster/demo-cluster                                       False  Warning   WaitingForMachineBinding @ Machine/demo-cluster-fcg7x-w254b  54s    1 of 2 completed
├─ClusterInfrastructure - IntelCluster/demo-cluster-vgjll  True                                                                          3h33m
└─ControlPlane - RKE2ControlPlane/demo-cluster-fcg7x       False  Warning   WaitingForMachineBinding @ Machine/demo-cluster-fcg7x-w254b  54s    1 of 2 completed
  ├─Machine/demo-cluster-fcg7x-rkhmv                       True                                                                          3h32m
  └─Machine/demo-cluster-fcg7x-w254b                       False  Warning   WaitingForMachineBinding                                     55s    1 of 2 completed


> kubectl get rke2config -A
NAMESPACE                              NAME                       AGE
53cd37b9-66b2-4cc8-b080-3722ed7af64a   demo-cluster-fcg7x-d8w79   3h37m
53cd37b9-66b2-4cc8-b080-3722ed7af64a   demo-cluster-fcg7x-gt7xf   4m39s

> kubectl get rke2config demo-cluster-fcg7x-d8w79 -n 53cd37b9-66b2-4cc8-b080-3722ed7af64a -o yaml | grep -i image
        sandbox_image = \"index.docker.io/rancher/mirrored-pause:3.6\"
          image: "registry-rs.edgeorchestration.intel.com/edge-orch/cluster/connect-agent:1.0.6"

> kubectl get rke2config demo-cluster-fcg7x-gt7xf -n 53cd37b9-66b2-4cc8-b080-3722ed7af64a -o yaml | grep -i image
        sandbox_image = \"index.docker.io/rancher/mirrored-pause:3.6\"
          image: "localhost:5000/connect-agent:1.0.7"
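
The two RKE2Config objects above pin different agent images, which is consistent with the note in the description that the agent cannot be upgraded in place. A small comparison of the two references can be sketched as follows; image_tag is a hypothetical helper, and it looks only at the last path component so that the registry port in localhost:5000 is not mistaken for a tag.

```shell
# image_tag prints the tag of an image reference, or an empty line when the
# reference has no tag. Hypothetical helper for illustration only.
image_tag() {
    name="${1##*/}"      # last path component, e.g. connect-agent:1.0.7
    name="${name%%@*}"   # drop a digest suffix such as @sha256:...
    case "$name" in
        *:*) echo "${name##*:}" ;;
        *)   echo "" ;;
    esac
}

old='registry-rs.edgeorchestration.intel.com/edge-orch/cluster/connect-agent:1.0.6'
new='localhost:5000/connect-agent:1.0.7'
if [ "$(image_tag "$old")" != "$(image_tag "$new")" ]; then
    echo "agent image tag changed: $(image_tag "$old") -> $(image_tag "$new")"
fi
```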

Checklist:

  • I agree to use the APACHE-2.0 license for my code changes
  • I have not introduced any 3rd party dependency changes
  • I have performed a self-review of my code

@madalazar madalazar changed the title from "[WIP]I 27361" to "re: separate the connect-agent binary into its own chart" Apr 29, 2025
@madalazar madalazar marked this pull request as ready for review April 29, 2025 09:56
Comment thread deployment/charts/cluster-connect-gateway/values.yaml Outdated

@MikolajKasprzak MikolajKasprzak left a comment

Minor comments

madalazar added 9 commits May 1, 2025 15:25
Signed-off-by: Lazar, Madalina <madalina.lazar@intel.com>
Signed-off-by: Lazar, Madalina <madalina.lazar@intel.com>
Signed-off-by: Lazar, Madalina <madalina.lazar@intel.com>
Signed-off-by: Lazar, Madalina <madalina.lazar@intel.com>
Signed-off-by: Lazar, Madalina <madalina.lazar@intel.com>
Signed-off-by: Lazar, Madalina <madalina.lazar@intel.com>
Signed-off-by: Lazar, Madalina <madalina.lazar@intel.com>
Signed-off-by: Lazar, Madalina <madalina.lazar@intel.com>
Signed-off-by: Lazar, Madalina <madalina.lazar@intel.com>
Signed-off-by: Lazar, Madalina <madalina.lazar@intel.com>
@madalazar madalazar merged commit 49bc2fe into main May 1, 2025
20 checks passed
@madalazar madalazar deleted the i-27361 branch May 1, 2025 16:02
