Implement (simplified) Deployment controller and its model #611

Merged
merged 98 commits into from
May 15, 2025
98 commits
38a1b38
do not use docker builder
Catoverflow May 3, 2025
8653d86
fix sed in conformance test
Catoverflow May 3, 2025
8d9971e
add instructions of scripts
Catoverflow May 5, 2025
4f2c1a8
Minor fix of format
Catoverflow May 5, 2025
e9e1163
deployment controller skeleton init
Catoverflow Apr 14, 2025
983b1e4
add dc controller doc
Catoverflow Apr 16, 2025
519df71
add note on controlled rs
Catoverflow Apr 16, 2025
5799c7a
complete skeleton, model to be fulfulled
Catoverflow Apr 16, 2025
a50b420
add todo
Catoverflow Apr 16, 2025
94213cd
add corresponding model
Catoverflow Apr 17, 2025
7a71242
evolve over the spec for dc
Catoverflow Apr 17, 2025
24b0021
model for vdeployment controller complete, proceed to executable part
Catoverflow Apr 18, 2025
5eda9ee
add exec part for vdeployment controller
Catoverflow Apr 18, 2025
1b59982
update state machine model
Catoverflow Apr 21, 2025
39ec390
hack vd dependency, bugs fix, update reconciler to match new state ma…
Catoverflow Apr 21, 2025
5b5b6a1
add clone trait for vrs, bugs fix for vd
Catoverflow Apr 21, 2025
2c7182d
bugs fix; mask Option<replicas>, reconcile_core postcondition for the…
Catoverflow Apr 22, 2025
8025a91
bugs fix, unmask problems
Catoverflow Apr 22, 2025
dc2d225
changes made during the meeting
Catoverflow Apr 22, 2025
f2135d4
add Default for VRSSpec, move seq_filter_implies_contains to vstd_ext…
Catoverflow Apr 22, 2025
a3940e5
prove objects_to_vrs_list by reusing the proof for objects_to_pods
Catoverflow Apr 22, 2025
9f362bc
get rid of old lemma
Catoverflow Apr 23, 2025
2364db4
fed up with silly proofs
Catoverflow Apr 23, 2025
c7ad6e7
followup with the spec change
Catoverflow Apr 23, 2025
9adb16c
rewrite filter_old_and_new_vrs to follow the spec pattern, also try t…
Catoverflow Apr 23, 2025
1746219
no progress on make_replica_set
Catoverflow Apr 23, 2025
62a7d84
fix filter_vrs_list and Clone trait for VRS
codyjrivera Apr 24, 2025
4622b74
bug fix of inconsistency between - and _
Catoverflow Apr 25, 2025
17b9aa8
change filter_old_and_new_vrs signature, prove it
Catoverflow Apr 28, 2025
1892304
add well_formed to ObjevtMeta and VReplicaSet exec, prove VD
Catoverflow Apr 29, 2025
917b11a
add e2e test for VDeployment controller
Catoverflow Apr 29, 2025
8bb6691
remove duplicated e2e test
Catoverflow Apr 29, 2025
f916057
fix e2e bug partially
Catoverflow Apr 29, 2025
3f55127
support dual deploy for vd and vrs
Catoverflow Apr 29, 2025
da9319e
add vrs crd as required by vd
Catoverflow Apr 29, 2025
c3b64da
e2e test bug fix
Catoverflow Apr 29, 2025
8ae6f13
typo fix
Catoverflow Apr 29, 2025
d6f9169
add vrs controller to vd e2e deploy
Catoverflow Apr 29, 2025
f944cec
fix crd bug
Catoverflow Apr 29, 2025
db7f8c8
add a --no-build option so vargo will not always rebuild, causing doc…
Catoverflow Apr 29, 2025
4c3dd3f
bug fix
Catoverflow Apr 30, 2025
9f13857
bug fix, should move to new vd controller model later
Catoverflow Apr 30, 2025
0f7ec51
fixes according to Cody's suggestion
Catoverflow Apr 30, 2025
d079ce1
fix bug in e2e test, add unset method in object_meta
Catoverflow Apr 30, 2025
1c8cb2a
prove vd
Catoverflow May 1, 2025
eca80e8
strengthen state_validation (which can be proved from the current def…
Catoverflow May 1, 2025
9c0dfd2
add remove support
Catoverflow May 1, 2025
eb7a35d
make implicit result explicit for the ease of proof, which further st…
Catoverflow May 1, 2025
54e5c23
fix proof for vd
Catoverflow May 1, 2025
93b4b3c
add hash to vrs labels to follow k8s impl
Catoverflow May 2, 2025
9691bbf
fix inconsistency in labels added
Catoverflow May 5, 2025
32ea47c
add vd support after rebase
Catoverflow May 5, 2025
8989954
fix state machine
Catoverflow May 6, 2025
ed3a949
change no-build to be reusing existing docker image instead of contro…
Catoverflow May 6, 2025
9c19aa8
e2e passed; add template patch test
Catoverflow May 6, 2025
21dd055
add back loop in state machine
Catoverflow May 6, 2025
ab8494c
add triggers
Catoverflow May 6, 2025
ec75489
switch back to original model after bugs are identified
Catoverflow May 6, 2025
e22356c
fix proof
Catoverflow May 7, 2025
d23b80a
fix vd controller state machine
Catoverflow May 7, 2025
e9cabd4
fix vd e2e test ci
Catoverflow May 7, 2025
424dcd5
fix wrong commit picked during rebase
Catoverflow May 7, 2025
e0e50da
add log on correct res
Catoverflow May 7, 2025
b2c73cc
move cluster setup to deploy.sh for the ease of local test
Catoverflow May 7, 2025
75e0605
add more complicated and powerful
Catoverflow May 7, 2025
2b5b72f
fix admission controller test, we may want to merge them into e2e/ an…
Catoverflow May 7, 2025
eb13404
move cluster name to deploy.sh
Catoverflow May 7, 2025
e744b02
remove stale comment
Catoverflow May 7, 2025
c26007b
fix PR comments
Catoverflow May 7, 2025
3fd6c1b
move state machine to comment to make it easier to maintain consistency
Catoverflow May 7, 2025
c7c7fc9
fix postcondition
Catoverflow May 7, 2025
802550e
temporary fix for vrs.state_validation(), permenant fix should be pro…
Catoverflow May 8, 2025
f830f7d
rewrite vdeployment controller by vibe coding
Catoverflow May 9, 2025
9e821a8
proved 3 wrapper funcs
Catoverflow May 9, 2025
4f9ee49
add ref macro support
Catoverflow May 12, 2025
e8f669a
prove vd
Catoverflow May 12, 2025
d8209c6
add todo
Catoverflow May 12, 2025
5c06c37
add ESR for vd
Catoverflow May 12, 2025
d56bb1f
remove admission controller accidentally introduced during rebase
Catoverflow May 12, 2025
7ed6ae7
fix unfixed conflict left during rebase
Catoverflow May 12, 2025
2e1d91e
try to unify admission test (again)
Catoverflow May 12, 2025
770f604
try to merge admission test workflow to deploy.sh
Catoverflow May 13, 2025
0f9e2d8
remove unused validation introduced during rebase
Catoverflow May 13, 2025
7ed4d84
as map::empty() is no longer used, remove the comment
Catoverflow May 13, 2025
635882f
use default trait
Catoverflow May 13, 2025
ab124fd
fix ci
Catoverflow May 13, 2025
63ad822
fix admission controller testflow
Catoverflow May 13, 2025
d23f1c8
remove unused imports, temporarily fix flaky proof in VRS
Catoverflow May 13, 2025
2374736
move namespace.is_Some into CR's requirements; unset_label -> remove_…
Catoverflow May 13, 2025
81e5fef
add vd.state_validation; add comment on TODO in vd; fix comment in vrs
Catoverflow May 13, 2025
5b633e0
fix calling is_Some in exec code, which was masked by #[verifier(exte…
Catoverflow May 13, 2025
c0a77a0
follow up with changes on main(again), use deep view
Catoverflow May 13, 2025
7eabc26
prove reconcile_init_state
Catoverflow May 15, 2025
96cc751
vd.spec.template and vrs.spec.template are required
Catoverflow May 15, 2025
89befe5
remove accidentally introduced error log
Catoverflow May 15, 2025
e6ca625
fix e2e test
Catoverflow May 15, 2025
7710a19
make vrs.spec.template optional to be consistent with k8s-oneapi spec
Catoverflow May 15, 2025
ff7e62d
noop to trigger ci
Catoverflow May 15, 2025
42 changes: 28 additions & 14 deletions .github/workflows/ci.yml
@@ -405,13 +405,7 @@ jobs:
       - name: Install kind
         run: go install sigs.k8s.io/kind@v$kind_version
       - name: Deploy vreplicaset admission controller
-        run: |
-          VERUS_DIR="${HOME}/verus" ./build.sh v2_vreplicaset_admission_controller.rs --no-verify --time
-          docker build -f docker/controller/Dockerfile.local -t local/admission-controller:v0.1.0 --build-arg APP=v2_vreplicaset_admission .
-          kind create cluster --config deploy/kind.yaml
-          kind load docker-image local/admission-controller:v0.1.0
-          cd e2e
-          ./admission_setup.sh vreplicaset
+        run: VERUS_DIR="${HOME}/verus" ./local-test.sh v2-vreplicaset-admission --build
       - name: Run vreplicaset e2e tests for admission
         run: . "$HOME/.cargo/env" && cd e2e && cargo run -- v2-vreplicaset-admission
   vstatefulset-admission-e2e-test:
@@ -436,13 +430,7 @@ jobs:
       - name: Install kind
         run: go install sigs.k8s.io/kind@v$kind_version
       - name: Deploy vstatefulset admission controller
-        run: |
-          VERUS_DIR="${HOME}/verus" ./build.sh v2_vstatefulset_admission_controller.rs --no-verify --time
-          docker build -f docker/controller/Dockerfile.local -t local/admission-controller:v0.1.0 --build-arg APP=v2_vstatefulset_admission .
-          kind create cluster --config deploy/kind.yaml
-          kind load docker-image local/admission-controller:v0.1.0
-          cd e2e
-          ./admission_setup.sh vstatefulset
+        run: VERUS_DIR="${HOME}/verus" ./local-test.sh v2-vstatefulset-admission --build
       - name: Run vstatefulset e2e tests for admission
         run: . "$HOME/.cargo/env" && cd e2e && cargo run -- v2-vstatefulset-admission
   vreplicaset-e2e-test:
@@ -470,3 +458,29 @@ jobs:
         run: VERUS_DIR="${HOME}/verus" ./local-test.sh v2-vreplicaset --build-remote
       - name: Run vreplicaset e2e tests
         run: . "$HOME/.cargo/env" && cd e2e && cargo run -- v2-vreplicaset
+  vdeployment-e2e-test:
+    needs:
+      - build-and-cache-verus
+    runs-on: ubuntu-22.04
+    steps:
+      - uses: actions/checkout@v4
+      - name: Restore Verus cache
+        uses: actions/cache@v4
+        with:
+          path: |
+            ${{ env.home_dir }}/verus/source
+            ${{ env.home_dir }}/verus/dependencies
+            ${{ env.home_dir }}/.cargo
+            ${{ env.home_dir }}/.rustup
+          key: verus-${{ runner.os }}-${{ env.verus_commit }}-${{ hashFiles('rust-toolchain.toml') }}
+      - name: Setup Go
+        uses: actions/setup-go@v5
+        with:
+          go-version: "^1.20"
+      - name: Install kind
+        run: go install sigs.k8s.io/[email protected]
+      - name: Deploy v2-vdeployment controller
+        run: VERUS_DIR="${HOME}/verus" ./local-test.sh v2-vdeployment --build
+      - name: Run vdeployment e2e tests
+        run: . "$HOME/.cargo/env" && cd e2e && cargo run -- v2-vdeployment

15 changes: 13 additions & 2 deletions build.md
@@ -60,14 +60,25 @@ Make sure `VERUS_DIR` points to verus repo location and built binary exists, `<c
 3. Setup cluster, apply controller image using [kind](https://kind.sigs.k8s.io/).
 4. Apply test specified in `e2e/src` and workload in `deploy` by `deploy.sh`

-This process can be automated with
+This process can be automated with:

+**1-3**
+
 ```
 ./local-test.sh <controller_name> [--build|--build-remote]
 Usage:
 --build: Call ./build.sh to build the controller before test, should have VERUS_DIR specified
 --build-remote: Call ./build.sh to build the controller image using Verus builder. This is useful when host has different runtime environment from image (Ubuntu 22.04), for example, different glibc version
-unspecified: Just use existing host built controller to setup the controller image. Assume binary is ready in `src/<controller_name>`
+unspecified: Just use existing built controller image to set up kind cluster. Assume the image is named as `local/$app-controller:v0.1.0`
 ```
+
+If deployment/test failed, you can manually run `./deploy.sh <controller_name> [local|remote]` to reset the e2e test environment.
+
+**4**
+```
+cd e2e
+# ./admission_setup.sh <controller_name> for admission test
+cargo run -- <controller_name>
+```

 > You can find more examples in `.github/workflows/ci.yml`
46 changes: 43 additions & 3 deletions deploy.sh
@@ -4,7 +4,7 @@
 ##
 ## Requires a running Kubernetes cluster and kubectl to be installed.

-set -xeu
+set -xu

 YELLOW='\033[1;33m'
 GREEN='\033[1;32m'
@@ -13,10 +13,50 @@ NC='\033[0m'

 app=$(echo "$1" | tr '_' '-') # should be the controller's name (with words separated by dashes)
 app_filename=$(echo "$app" | tr '-' '_')
+cluster_name="${app}-e2e"
 registry=$2 # should be either remote or local

-## use imperative management for CRDs since metadata for PodTemplateSpec is too long.
-if cd deploy/$app_filename && kubectl create -f crd.yaml && kubectl apply -f rbac.yaml && kubectl apply -f deploy_$registry.yaml; then
+kind get clusters | grep $cluster_name > /dev/null 2>&1
+if [ $? -eq 0 ]; then
+    echo -e "${YELLOW}A kind cluster named \"$cluster_name\" already exists. Deleting...${NC}"
+    kind delete cluster --name $cluster_name
+fi
+
+set -xeu
+# Set up the kind cluster and load the image into the cluster
+kind create cluster --config deploy/kind.yaml --name $cluster_name
+kind load docker-image local/$app-controller:v0.1.0 --name $cluster_name
+
+# for VDeployment, need to deploy VReplicaSet as a dependency
+if [ "$app" == "v2-vdeployment" ]; then
+    kind load docker-image local/v2-vreplicaset-controller:v0.1.0 --name $cluster_name
+fi
+
+# admission controller has a different deployment process
+if [ $(echo $app | awk -F'-' '{print $NF}') == "admission" ]; then
+    app=${app%-admission}
+    app_filename=${app_filename%_admission}
+    set -o pipefail
+    kubectl create -f deploy/${app_filename}/crd.yaml
+    echo "Creating Webhook Server Certs"
+    mkdir -p certs
+    openssl genrsa -out certs/tls.key 2048
+    openssl req -new -key certs/tls.key -out certs/tls.csr -subj "/CN=admission-server.default.svc"
+    openssl x509 -req -extfile <(printf "subjectAltName=DNS:admission-server.default.svc") -in certs/tls.csr -signkey certs/tls.key -out certs/tls.crt
+
+    echo "Creating Webhook Server TLS Secret"
+    kubectl create secret tls admission-server-tls \
+        --cert "certs/tls.crt" \
+        --key "certs/tls.key"
+    echo "Creating Webhook Server Deployment"
+    sed -e 's@${APP}@'"${app}-admission-controller"'@g' <"e2e/manifests/admission_server.yaml" | kubectl create -f -
+    CA_PEM64="$(openssl base64 -A < certs/tls.crt)"
+    echo "Creating K8s Webhooks"
+    sed -e 's@${CA_PEM_B64}@'"$CA_PEM64"'@g' -e 's@${RESOURCE}@'"${app#v2-}"s'@g' <"e2e/manifests/admission_webhooks.yaml" | kubectl create -f -
+    exit 0
+fi
+
+if cd deploy/$app_filename && { for crd in $(ls crd*.yaml); do kubectl create -f "$crd"; done } && kubectl apply -f rbac.yaml && kubectl apply -f deploy_$registry.yaml; then
     echo ""
     echo -e "${GREEN}The $app controller is deployed in your Kubernetes cluster in namespace \"$app\".${NC}"
     echo -e "${GREEN}Run \"kubectl get pod -n $app\" to check the controller pod.${NC}"
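
The name handling in `deploy.sh` above can be sketched in isolation. This is a minimal illustration and not part of the PR: the function names (`to_app`, `to_app_filename`, `is_admission`, `describe`) are hypothetical, and `${1##*-}` stands in for the script's `awk -F'-' '{print $NF}'` check.

```shell
#!/usr/bin/env bash
# Hypothetical helpers mirroring the name handling in deploy.sh (names are not from the PR).

# Dashed controller name, e.g. v2_vdeployment -> v2-vdeployment.
to_app() { echo "$1" | tr '_' '-'; }

# Underscored name used for directories under deploy/.
to_app_filename() { echo "$1" | tr '-' '_'; }

# Succeeds when the last dash-separated field is "admission".
is_admission() { [ "${1##*-}" = "admission" ]; }

# Print the names deploy.sh would derive for a given controller.
describe() {
    local app app_filename base
    app=$(to_app "$1")
    app_filename=$(to_app_filename "$app")
    # The cluster name is computed before the "-admission" suffix is stripped.
    echo "cluster=${app}-e2e"
    if is_admission "$app"; then
        base=${app%-admission}
        echo "crd_dir=deploy/${app_filename%_admission}"
        echo "resource=${base#v2-}s"
    else
        echo "crd_dir=deploy/${app_filename}"
    fi
}

describe v2_vdeployment
describe v2_vreplicaset_admission
```

For example, `describe v2_vreplicaset_admission` reports the cluster name `v2-vreplicaset-admission-e2e`, the CRD directory `deploy/v2_vreplicaset`, and the webhook resource `vreplicasets`.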
Original file line number Diff line number Diff line change
@@ -9,7 +9,7 @@ spec:
     kind: VDeployment
     plural: vdeployments
     shortNames:
-    - vrs
+    - vd
     singular: vdeployment
   scope: Namespaced
   versions:
@@ -4743,3 +4743,4 @@ spec:
     served: true
     storage: true
     subresources: {}

1 change: 1 addition & 0 deletions deploy/v2_vdeployment/crd_vreplicaset.yaml
45 changes: 45 additions & 0 deletions deploy/v2_vdeployment/deploy_local.yaml
@@ -0,0 +1,45 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: v2-vdeployment-controller
  namespace: v2-vdeployment
  labels:
    app.kubernetes.io/name: v2-vdeployment-controller
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: v2-vdeployment-controller
  template:
    metadata:
      labels:
        app.kubernetes.io/name: v2-vdeployment-controller
    spec:
      containers:
        - image: local/v2-vdeployment-controller:v0.1.0
          imagePullPolicy: IfNotPresent
          name: controller
      serviceAccountName: v2-vdeployment-controller
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: v2-vreplicaset-controller
  namespace: v2-vdeployment
  labels:
    app.kubernetes.io/name: v2-vreplicaset-controller
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: v2-vreplicaset-controller
  template:
    metadata:
      labels:
        app.kubernetes.io/name: v2-vreplicaset-controller
    spec:
      containers:
        - image: local/v2-vreplicaset-controller:v0.1.0
          imagePullPolicy: IfNotPresent
          name: controller
      serviceAccountName: v2-vdeployment-controller
43 changes: 43 additions & 0 deletions deploy/v2_vdeployment/deploy_remote.yaml
@@ -0,0 +1,43 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: v2-vdeployment-controller
  namespace: v2-vdeployment
  labels:
    app.kubernetes.io/name: v2-vdeployment-controller
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: v2-vdeployment-controller
  template:
    metadata:
      labels:
        app.kubernetes.io/name: v2-vdeployment-controller
    spec:
      containers:
        - image: ghcr.io/anvil-verifier/anvil/v2-vdeployment-controller:latest
          name: controller
      serviceAccountName: v2-vdeployment-controller
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: v2-vreplicaset-controller
  namespace: v2-vdeployment
  labels:
    app.kubernetes.io/name: v2-vreplicaset-controller
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: v2-vreplicaset-controller
  template:
    metadata:
      labels:
        app.kubernetes.io/name: v2-vreplicaset-controller
    spec:
      containers:
        - image: ghcr.io/anvil-verifier/anvil/v2-vreplicaset-controller:latest
          name: controller
      serviceAccountName: v2-vdeployment-controller
55 changes: 55 additions & 0 deletions deploy/v2_vdeployment/rbac.yaml
@@ -0,0 +1,55 @@
apiVersion: v1
kind: Namespace
metadata:
  labels:
    app.kubernetes.io/name: v2-vdeployment
  name: v2-vdeployment
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: v2-vdeployment-controller
  namespace: v2-vdeployment
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app.kubernetes.io/name: v2-vdeployment-controller
  name: v2-vdeployment-controller-role
rules:
  - apiGroups:
      - anvil.dev
    resources:
      - "*"
    verbs:
      - "*"
  - apiGroups:
      - ""
    resources:
      - pods
      - replicasets
      - services
      - endpoints
      - persistentvolumeclaims
      - events
      - configmaps
      - secrets
      - serviceaccounts
    verbs:
      - "*"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    app.kubernetes.io/name: v2-vdeployment-controller
  name: v2-vdeployment-controller-rolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: v2-vdeployment-controller-role
subjects:
  - kind: ServiceAccount
    name: v2-vdeployment-controller
    namespace: v2-vdeployment
19 changes: 19 additions & 0 deletions deploy/v2_vdeployment/v2_vdeployment.yaml
@@ -0,0 +1,19 @@
apiVersion: anvil.dev/v1
kind: VDeployment
metadata:
  name: pause-deployment
  labels:
    app: pause-demo
spec:
  replicas: 3
  selector:
    matchLabels:
      app: pause-demo
  template:
    metadata:
      labels:
        app: pause-demo
    spec:
      containers:
        - name: pause
          image: k8s.gcr.io/pause:3.9
30 changes: 30 additions & 0 deletions discussion/kubernetes-model/deployment_controller.md
@@ -0,0 +1,30 @@
# Deployment Controller (DC)

## Reconciliation Logic

> In `pkg/controller/deployment/deployment_controller.go`

Reconciliation is performed by `syncDeployment`, which can be modeled as a state machine:

0. Orphaning and adoption. This step is not considered in the model.

1. [List all replicasets](https://github.com/kubernetes/kubernetes/blob/cdc807a9e849b651fb48c962cc18e25d39ec5edf/pkg/controller/deployment/deployment_controller.go#L629) (RS) owned by this DC from API Server

2. [Get all pods](https://github.com/kubernetes/kubernetes/blob/cdc807a9e849b651fb48c962cc18e25d39ec5edf/pkg/controller/deployment/deployment_controller.go#L638) managed by the controlled RS from the API Server and create an RS-to-pod map

3. Decide how to manage the controlled RS:
   x. Rollback. Not considered in the model.
   Compare the managed RS template with the DC template:
   **A**. If the number of replicas is different, this is a [scaling event](https://github.com/kubernetes/kubernetes/blob/cdc807a9e849b651fb48c962cc18e25d39ec5edf/pkg/controller/deployment/deployment_controller.go#L665): just scale the single newest controlled RS up or down.
   **B**. The application managed* by the DC should be updated (`spec.template.container`). Based on the configured update policy**:

      I. [rollout](https://github.com/kubernetes/kubernetes/blob/cdc807a9e849b651fb48c962cc18e25d39ec5edf/pkg/controller/deployment/rolling.go#L31): (create if non-existing and) scale up the new RS, then scale down the old RS.
      II. [recreate](https://github.com/kubernetes/kubernetes/blob/cdc807a9e849b651fb48c962cc18e25d39ec5edf/pkg/controller/deployment/recreate.go#L29): scale down the old RS, then (create if non-existing and) scale up the new RS.
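
The decision in step 3 can be sketched as a small shell function. This is a simplified illustration under stated assumptions, not the Kubernetes implementation or the verified model: `decide_action` and its action labels are hypothetical, `RollingUpdate` and `Recreate` are the Kubernetes `spec.strategy.type` values corresponding to rollout and recreate, rollback is ignored as in the model, and treating "template matches and replica counts agree" as a no-op is an assumption.

```shell
#!/usr/bin/env bash
# decide_action <template_matches: yes|no> <newest_rs_replicas> <desired_replicas> <strategy>
# Hypothetical sketch: prints the ordered actions the controller would take.
decide_action() {
    local template_matches=$1 rs_replicas=$2 desired=$3 strategy=$4

    if [ "$template_matches" = "yes" ]; then
        # Case A: same template, so only the replica count can differ.
        if [ "$rs_replicas" -ne "$desired" ]; then
            echo "scale-newest-rs:$desired"
        else
            echo "noop"   # assumption: nothing to reconcile
        fi
        return 0
    fi

    # Case B: the template changed, so the order depends on the update policy.
    case "$strategy" in
        RollingUpdate) echo "ensure-new-rs scale-up-new-rs scale-down-old-rs" ;;
        Recreate)      echo "scale-down-old-rs ensure-new-rs scale-up-new-rs" ;;
        *)             echo "unknown strategy: $strategy" >&2; return 1 ;;
    esac
}

decide_action yes 2 3 RollingUpdate
decide_action no 3 3 Recreate
```

The only difference between the two policies in this sketch is the ordering of the same three actions, which matches the description of I and II above.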

---

*: The deployment controller controls at most one new replica set, plus possibly multiple old replica sets.

> In k8s implementation it's [possible](https://github.com/kubernetes/kubernetes/blob/cdc807a9e849b651fb48c962cc18e25d39ec5edf/pkg/controller/deployment/util/deployment_util.go#L633-L634) to have multiple new replica sets in rare cases

**: The scale up/down process should satisfy `maxSurge` and `maxUnavailable` properties of DC, which means DC may not scale the controlled new & old RS to desired replicas in one step. Instead DC will scale up & down gradually in turn. Currently we do not model this feature.
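
The gradual scaling described in the second note can be illustrated with a little arithmetic. The sketch below is a simplification under stated assumptions, not the model in this PR (which does not cover this feature) and not the Kubernetes code: `max_scale_up` and `max_scale_down` are hypothetical names, `maxSurge` and `maxUnavailable` are taken as absolute pod counts rather than percentages, and the only bounds enforced are that total pods stay at or below desired + maxSurge and available pods stay at or above desired - maxUnavailable.

```shell
#!/usr/bin/env bash
# max_scale_up <desired> <max_surge> <new_replicas> <old_replicas>
# Prints how many replicas the new RS may gain in one step (hypothetical helper).
max_scale_up() {
    local desired=$1 max_surge=$2 new=$3 old=$4
    local room=$(( desired + max_surge - (new + old) ))   # headroom under the surge cap
    local missing=$(( desired - new ))                    # replicas the new RS still needs
    local step=$(( room < missing ? room : missing ))
    echo $(( step > 0 ? step : 0 ))
}

# max_scale_down <desired> <max_unavailable> <available_pods> <old_replicas>
# Prints how many old replicas may be removed in one step (hypothetical helper).
max_scale_down() {
    local desired=$1 max_unavailable=$2 available=$3 old=$4
    local spare=$(( available - (desired - max_unavailable) ))   # pods above the availability floor
    local step=$(( spare < old ? spare : old ))
    echo $(( step > 0 ? step : 0 ))
}

max_scale_up 3 1 0 3
max_scale_down 3 1 4 3
```

With desired = 3, maxSurge = 1, maxUnavailable = 1 and three old pods, the first step can add one new pod; once four pods are available, up to two old pods can be removed, and the two functions are then applied again in turn.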