The primary goal of this project is to exercise Argo CD based GitOps deployment covering the full cycle - up to production via promotion, if you want to. Experimentation and production should not conflict.
The change process starts at localhost. Hence, we consider the kind experience very important. Given that, some elements may be useful in a CI context. Most things should play nicely in production environments as well.
Demo using terraform to bootstrap a single node kind cluster, showing deployments, statefulsets and daemonsets as they enter their desired state
- Speed: A fast cycle from localhost to production
- Fail early and loud (notifications)
- Scalability
- Simplicity (yes, really)
- Composability
- Targets: `kind`, vanilla Kubernetes and OpenShift (including `crc`)
Stage propagation is hard. Folders, branches, repos, ... you name it. All of those come with pros and cons, and it ends up being a tradeoff. Apparently, it is so hard that a dedicated project, kargo, was born to solve it. Long story short:
We have started using kargo, and we are attempting the migration following effective processes for monorepos while using a single long lived branch (unchanged, as we did before) and the "Rendered Config" pattern. Essential bits appear to be working with kargo (environment/stage), and we may even get away without changing the folder structure.
We use single level environment staging with one cluster per environment. We do not stage via names and namespaces in this context, and we don't even dare to do multi-tenancy in a single cluster (OLMv1 drops it). This should help with isolation and loose coupling, support the cattle model, and keep things simple. We want cluster scoped staging. Introducing another nested level causes issues ("Matrjoschka Architecture").
We prefer Pull over Push.
We focus on one "Platform Team" managing many clusters using a single repo. It should enable Argo CD embedding for application verticals.
Following the App of Apps pattern, our local root Application lives at `envs/local`. The root app kicks off various ApplicationSets covering similarly shaped (e.g. helm/kustomize) apps hosted in `apps`. Within that folder, we do not want Argo CD resources; this helps with separation and quick testing cycles.
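To make the pattern concrete, here is a minimal sketch of such a root Application (the repo URL and names are illustrative placeholders, not this repo's actual manifest):

```sh
# Hypothetical app-of-apps root: one Application pointing at envs/local,
# which in turn holds the ApplicationSets.
kubectl apply -n argocd -f - <<'EOF'
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/your-org/your-gitops-repo.git # placeholder
    targetRevision: main
    path: envs/local
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
EOF
```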
OLM has a bigger footprint than helm, and it comes with its own set of issues as well. But it is higher level and way more user friendly. With some components (e.g. Argo CD, Loki, LVM), helm is the second class citizen; with others (e.g. Rook), it's the opposite. We prefer first class citizens. Hence, we default to bringing in OLM when it is not there initially (such as on kind).
We cover deployments of:
- Argo CD (self managed)
- Argo CD Notifications
- Argo CD Image Updater
- Argo Rollouts
- Argo Events
- Operator Lifecycle Management
- MetalLB
- Kube-Prometheus
- Loki/Promtail
- Velero
- Cert-Manager
- AWS Credentials Sync
- Sealed Secrets
- SOPS Secrets
- Submariner
- Caretta
- LitmusChaos
Beyond deployments, we feature:
- `mise` aiming at a more uniform environment locally and in CI
- `make` based tasks
- GitHub Actions integration
- Prometheus Rule Unit Testing (see the sketch after this list)
- A bare bones alerting application, in case you want to send alerts to very custom receivers (like Matrix chat rooms)
- Open Cluster Management / Submariner Hub and Spoke Setup (WIP)
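For illustration, here is a self-contained sketch of Prometheus rule unit testing with `promtool` (rule names and contents are made up for this example, not taken from this repo):

```sh
# A hypothetical alerting rule to test
cat > alerting-rules.yaml <<'EOF'
groups:
  - name: demo
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 4m
        labels:
          severity: critical
EOF

# The matching unit test: feed a synthetic series and assert the alert fires
cat > rules-test.yaml <<'EOF'
rule_files:
  - alerting-rules.yaml
evaluation_interval: 1m
tests:
  - interval: 1m
    input_series:
      - series: 'up{job="demo", instance="demo:8080"}'
        values: '1 0 0 0 0 0'
    alert_rule_test:
      - eval_time: 5m
        alertname: InstanceDown
        exp_alerts:
          - exp_labels:
              severity: critical
              job: demo
              instance: demo:8080
EOF

promtool test rules rules-test.yaml
```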
Some opinions first:
- YAML at scale is ... terrible. Unfortunately, there is no way around it.
- CI/CD usually comes with horrible DX: ".. it's this amalgamation of scripts in YAML tied together with duct tape."
- CI/CD should enable basic inner loop local development. It should not be `git commit -m hoping-for-the-best && git push`
- Naming ... is hard
- Joining clusters is hard (e.g. Submariner)
- Beware of Magic (e.g. Argo CD helm release changes when Prometheus CRDs become available)
- Beware of helm shared values or kustomize base. We deploy `main`, and shared bits kick in on all environments.
- Versions/Refs: Pin or Float? It depends. We should probably pin things in critical environments and keep things floating a bit more elsewhere
- Don't try too hard modeling deps and ordering. Failing to start a few times can be perfectly fine. Honor this when modeling your alerts.
- We should propagate to production frequently.
- Rebuilding whole environments automatically from scratch matters a lot. Drift kicks in fast, and rebuilding helps with recovery.
- Bootstrapping OLM is painful - thank god, there is a helm chart these days.
- When using Kubernetes bits in Terraform (e.g. `helm`, `kustomize`, `kubectl`, `kubernetes` providers), only use the bare minimum (because deps are painful)

To get a local copy up and running, follow these simple steps.
- `make`
- `kubectl`
- `mise` (highly recommended)
- `docker` (if using `kind`)
- `terraform` (optional)
- `helm` (if not using terraform)
For basic demo purposes, you can use this public repo. If you want to run against your own, replace the git server reference with your own.
First, you should choose where to start, specifically whether you want to use terraform.
If you don't want to use terraform, start at the root folder. There is a Makefile with various ad hoc tasks. Simply running `make` should give you some help.
If you want to use terraform, you'll start similarly in the ./tf folder. The terraform module supports deployment to kind clusters.
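A minimal sketch of that flow, assuming the standard terraform workflow (no repo-specific variables shown):

```sh
cd tf
terraform init   # fetch providers/modules
terraform plan   # review what would be created
terraform apply  # bootstrap the kind cluster and deployments
```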
Our preferred approach to secrets is sealed-secrets (have a look at `gen-keys.sh` in case you'd like to use sops instead).
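For example, sealing a secret of your own with `kubeseal` could look like this (the secret name and values are illustrative, and the controller namespace is an assumption - point it at wherever your sealed-secrets controller runs):

```sh
kubectl create secret generic my-secret \
  --namespace argocd \
  --from-literal=token=replace-me \
  --dry-run=client -o yaml \
  | kubeseal --controller-namespace kube-system --format yaml \
  > my-sealedsecret.yaml
```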
If using GitHub, you may want to disable GitHub Actions and/or add a public deploy key:

`gh repo deploy-key add ...`
In the root folder (without terraform), you should be checking

`make -n argocd-helm-install-basic argocd-apply-root`

Run this without `-n` once you feel confident, to get the ball rolling.
The default local deployment will deploy a SealedSecret. Decryption will fail, because we won't be sharing our key. It is meant to be used with Argo CD Notifications, so it is not critical for a basic demo. Feel free to introduce your own bootstrap secret.
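As a hedged example, such a bootstrap secret for Argo CD Notifications could be created like this (the key name is illustrative - match it to your notifications configuration):

```sh
kubectl -n argocd create secret generic argocd-notifications-secret \
  --from-literal=slack-token=replace-me \
  --dry-run=client -o yaml | kubectl apply -f -
```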
We want the lifecycle of things (create/destroy) to be as fast as possible. Pulling images can slow things down significantly. Contrary to docker or a host based solution (such as k3s), these challenges are harder to address with kind. Make sure to understand the details of your pain points before implementing a solution.
- Local Registry
- Pull-through Docker registry on kind clusters (`registry:2` supports only one upstream registry per instance); see the sketch after this list
- `kind load` may address some use cases
- Remove everything in `kind` installed by Argo CD, so we can rebuild from cached images (s. `make argocd-destroy`)
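As a sketch of the pull-through option, a `registry:2` proxy can be wired into kind via a containerd mirror patch; the names and the docker.io upstream here are assumptions, loosely following the kind local registry guide:

```sh
# Ensure the docker network kind uses exists, then start a docker.io
# pull-through cache on it (registry:2 proxies exactly one upstream).
docker network inspect kind >/dev/null 2>&1 || docker network create kind
docker run -d --name kind-registry --net kind \
  -e REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io \
  registry:2

# Create a cluster whose containerd resolves docker.io via the cache
kind create cluster --config=- <<'EOF'
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
containerdConfigPatches:
  - |-
    [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
      endpoint = ["http://kind-registry:5000"]
EOF
```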
- Environment propagation: Try Kargo
- Try kro
- Operator Controller Should Provide a Standard Install Process
- Improve ad hoc task support (smart branching) for Red Hat OpenShift GitOps (ns, secrets), and Ingress (login)
- Introduce proper GitOps time travel support (tags/hashes)
- Improve Openshift harmonization (esp. with regards to naming/namespaces)
- `kind` based testing
- Improve Unit/Integration Test Coverage
- Prometheus based sync failure alerts (s. known issues)
- It appears odd that using OLM based installation of OCM still requires us to worry about the hub registration-operator.
- There are `TODO` tags in code (to provide context)
- It takes too long for Prometheus to come up
- `terraform` within Argo CD? (just like in `tf-controller`)
- crossplane
- For `kind`, we may want to replace MetalLB with `cloud-provider-kind`
- keycloak + sso (DNS) local trickery
- Aspire Dashboard? (ultralight OTel)
- Customer Use Case Demo litmus? Should probably bring the pure chaos bits to Argo CD
- `deas/kaos` helm job sample
- Argo CD Grafana Dashboard
- Argo CD Service Monitor (depends on prom)
- Canary/Blue-Green Deployment (Rollouts)
- default to auto update everything?
- Proper self management of Argo CD
- metrics-server
- contour?
- cilium
- OPA Policies: Gatekeeper vs usage in CI
- kubeconform in CI
- Argo CD +/vs ACM/open cluster management
- Notifications: Sync alerts to Slack/Matrix
- Manage Kubernetes Operators with Argo CD?
- Try Argo-CD Autopilot
- Proper cascaded removal. Argo CD should be last. Will likely involve terraform.
- Applications in any namespace (s. Known Issues)
- Service Account based OAuth integration on Openshift is nice - but tricky to implement: OpenShift Authentication Integration with Argo CD, Authentication using OpenShift
- Openshift Proxy/Global Pull Secrets, Global Pull Secrets, Ingress + API Server Certs, IDP Integration
- Improve Github Actions Quality Gates
- Tracing Solution (zipkin, tempo)
- OTel Sample
- More Grafana Dashboards / Integrations with Openshift Console Plugin
- Consider migrating `make` to `just`
- Dedupe/Modularize Makefile/Justfile
- OCM solutions

See the open issues for a full list of proposed features (and known issues).
- OCM: Integration with Argo CD
- Argo CD rbac/multi tenancy?
- ACM appears to auto approve CSRs. Open source auto-approvers appear to specifically target cert-manager (CRD) or kubelet. Introduce `csr-approver`
- Introduce IPv6 with `crc`/`kvm`
- Go deeper with `nix`/`devenv` - maybe even replace `mise`
- Alert only if a certain amount of time has passed
- Wildcards in Argo CD sourceNamespaces prevent resource creation
- `argocd` cli does not support apps with multiple sources.
- Support configuration of HTTP_PROXY, HTTPS_PROXY and NO_PROXY for Gateway DaemonSet
- There appears to be no straightforward way to make OLM Deployments use one pod per Deployment/Replica
- Create a dry run tool for ConfigurationPolicy
- RFE Create tools to assist in Policy development
- Operator cannot be upgraded with the error "Cannot update: CatalogSource was removed" while the CatalogSource exists in OpenShift 4
- OLMv1 Design Decisions
- Kustomized Helm (Application plugin)
- Bootstrapping: ApplicationSets vs App-of-apps vs Kustomize
- Argo CD 2.10: ApplicationSet full templating
- viaduct-ai/kustomize-sops
- Introduction to GitOps with Argo CD
- 3 patterns for deploying Helm charts with Argo CD
- Self Managed Argo CD β App Of Everything
- Setting up Argo CD with Helm
- terraform-argocd-bootstrap
- Argo CD with Kustomize and KSOPS using Age encryption
- https://blog.devgenius.io/argocd-with-kustomize-and-ksops-2d43472e9d3b
- https://github.com/majinghe/argocd-sops
- https://dev.to/callepuzzle/secrets-in-argocd-with-sops-fc9
- Argo CD Application Dependencies
- Progressive Syncs (alpha)
- Custom Root CAs in OpenShift
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!
- Fork the Project
- Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
- Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the Branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
Distributed under the MIT License. See LICENSE.txt for more information.
