69 commits
193e572
docs: design spec and implementation plan for eBPF credential injection
SoulKyu May 2, 2026
f412edc
feat(placeholder): fixed-length token generator and matcher
SoulKyu May 2, 2026
f770c84
feat(vault): add WrapValues and UnwrapValues on Connector
SoulKyu May 2, 2026
1cc5b2d
fix(vault): address review feedback on wrap/unwrap
SoulKyu May 2, 2026
733f6a0
feat(config): add BPFConfig and ModeBPF runtime mode
SoulKyu May 2, 2026
8ace2d8
refactor(config): inline BPFConfig defaults in NewConfig
SoulKyu May 2, 2026
11661bc
feat(k8smutator): wrap creds with placeholders when cfg.BPF.Enabled
SoulKyu May 2, 2026
798d51c
refactor(k8smutator): extract wrapAndAnnotate helper, share BPFMappin…
SoulKyu May 2, 2026
720e49d
feat(controller): add RunBPF skeleton and ModeBPF dispatch
SoulKyu May 2, 2026
2b6dc9f
fix(controller): propagate context.Err on RunBPF idle path
SoulKyu May 2, 2026
4deff7e
feat(bpf): resolve cgroup_id from podUID + containerID
SoulKyu May 2, 2026
418e4d3
refactor(bpf): document cgroup layout and improve error messages
SoulKyu May 2, 2026
189f876
feat(bpf): tmpfs persister for cross-restart mapping recovery
SoulKyu May 2, 2026
0b7660f
feat(bpf): LSM substitution program
SoulKyu May 2, 2026
6ee2898
feat(bpf): cilium/ebpf-based loader
SoulKyu May 2, 2026
c247794
ci: bump go-version to 1.24 to match go.mod
SoulKyu May 2, 2026
4222af6
refactor(bpf): cache cgroup_mappings map handle and wrap update errors
SoulKyu May 2, 2026
58ff899
feat(bpf): node-local runner watching local pods
SoulKyu May 2, 2026
cc7489b
fix(bpf): handle informer tombstones and reset processed on missing f…
SoulKyu May 2, 2026
72bf217
feat(helm): BPF DaemonSet and bpf.enabled switch
SoulKyu May 2, 2026
fcd3f09
fix(helm): wire BPF DaemonSet to its ConfigMap via --config arg
SoulKyu May 2, 2026
511ead0
build/ci: BPF compile stage in Dockerfile + integration workflow
SoulKyu May 2, 2026
9e8ba0f
chore: restore BPF_LIBBPF_INCLUDE with -I flag and CI workflow polish
SoulKyu May 2, 2026
08c8c53
docs: BPF mode operator and contributor documentation
SoulKyu May 2, 2026
63d63fd
fix(helm,config): correct BPF DaemonSet args, env prefix, ConfigMap s…
SoulKyu May 2, 2026
edd596a
fix(bpf,k8smutator): thread WrapTokenTTL and validate credential leng…
SoulKyu May 2, 2026
83ce85d
docs(bpf-mode): correct EAGAIN claim, placeholder length, and phantom…
SoulKyu May 2, 2026
dca15b9
fix(bpf): use ConnectAndRenew for Vault token auto-renewal
SoulKyu May 2, 2026
cd3a497
feat(bpf): add vault_injector_bpf_map_size gauge metric
SoulKyu May 2, 2026
d2bc75f
feat(bpf): program BPF map for all containers in a pod
SoulKyu May 2, 2026
bbb8156
fix(bpf): repopulate BPF map after DS restart using stored cgroup IDs
SoulKyu May 2, 2026
57d3ae4
fix(bpf): snapshot restored UIDs to avoid double-counting mappingsLoaded
SoulKyu May 2, 2026
fdc7a41
chore(deps): bump safe minor/patch dependencies
SoulKyu May 2, 2026
f32d308
chore(deps): bump hashicorp/vault stack and drop RC pin
SoulKyu May 2, 2026
665672b
chore(deps): bump k8s.io 0.32.1 → 0.36.0
SoulKyu May 2, 2026
dfa22e1
chore(deps): bump getsentry/sentry-go to latest
SoulKyu May 2, 2026
90a1d0d
ci: bump CI and Dockerfile Go version to 1.26
SoulKyu May 2, 2026
fd3eb8f
fix(bpf): scan envp byte-by-byte using bpf_loop (5.17+)
SoulKyu May 2, 2026
3fdaa28
fix(controller): start healthcheck and metrics servers in RunBPF
SoulKyu May 2, 2026
c0255bd
fix(ci): align bpf-integration workflow to Go 1.26
SoulKyu May 2, 2026
9c25aa8
fix(bpf): wire MaxMappingsPerNode to map, add save-rollback, cgroup p…
SoulKyu May 2, 2026
569a368
fix(webhook): short-circuit annotation collision in wrapAndAnnotate b…
SoulKyu May 2, 2026
a5b2ce1
docs(bpf): document kubectl exec limitation and pod hardening recomme…
SoulKyu May 2, 2026
834fd2e
fix(bpf): roll back partial PutMapping on multi-container failure
SoulKyu May 2, 2026
64732ce
chore(bpf): extend verify-bpf-object filter to catch scan_callback edits
SoulKyu May 2, 2026
35e8a96
chore(bpf): refresh committed arm64 BPF object
SoulKyu May 2, 2026
9765a13
fix(bpf): runtime-validate program — GPL license + integration test c…
SoulKyu May 2, 2026
aea8e87
feat(bpf): support cgroupfs driver and mount /sys/kernel/security
SoulKyu May 2, 2026
b4b5f40
fix(bpf): switch from LSM hook to tracepoint/sys_enter_execve
SoulKyu May 2, 2026
8c7767f
fix(bpf): runtime-validated fixes from k3d end-to-end test
SoulKyu May 2, 2026
acb040f
fix(bpf): pre-program pod-level cgroup to cover init/crash race
SoulKyu May 2, 2026
1b818cd
fix(bpf): persist tmpfs as hostPath /run so it survives DS restarts
SoulKyu May 2, 2026
b00aa7d
docs: add NRI migration spec superseding eBPF design
SoulKyu May 2, 2026
d0927d2
docs: add NRI migration implementation plan
SoulKyu May 2, 2026
a4dc493
refactor(nri): replace BPF substitution layer with NRI plugin
SoulKyu May 2, 2026
d7a420c
helm: replace BPF DaemonSet/configmap with NRI variants
SoulKyu May 2, 2026
dda0c60
chore: sweep remaining BPF references after NRI pivot
SoulKyu May 2, 2026
d0b0eb4
fix(nri): persist cache on tmpfs to survive plugin restarts
SoulKyu May 3, 2026
7fd352a
fix(nri): reject malformed placeholder keys in mapping
SoulKyu May 3, 2026
645ae92
fix(nri): node affinity gate + readiness label
SoulKyu May 3, 2026
cb524f7
Revert "fix(nri): node affinity gate + readiness label"
SoulKyu May 3, 2026
cf9075d
docs: NRI mode operator runbook with failure modes and sample alert
SoulKyu May 3, 2026
124915d
fix(nri): periodic cache sweep evicts force-deleted pods
SoulKyu May 3, 2026
74e2370
docs+helm: NRI socket hardening — Kyverno policy + README security se…
SoulKyu May 3, 2026
5b22875
feat(nri): pull-not-push refactor — no Vault token in PodSpec
SoulKyu May 3, 2026
2a7dd33
fix(nri): verify pod identity via K8s API to block annotation forgery…
SoulKyu May 4, 2026
55b013a
feat: add .claude
SoulKyu May 4, 2026
b274ef2
fix(nri): address pre-ship review findings (CRIT-1, CRIT-2, IMP-1, IM…
SoulKyu May 4, 2026
b44596e
helm: tolerate-all on NRI DaemonSet by default
SoulKyu May 4, 2026
1 change: 1 addition & 0 deletions .claude/scheduled_tasks.lock
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
{"sessionId":"1bb06834-abb1-4950-8435-0ad4c12c46f7","pid":3802262,"procStart":"62177279","acquiredAt":1777658765555}
337 changes: 337 additions & 0 deletions .claude/skills/desloppify/SKILL.md


2 changes: 2 additions & 0 deletions .gitattributes
@@ -0,0 +1,2 @@
pkg/bpf/*.bpf.o binary
pkg/bpf/c/headers/vmlinux.h linguist-generated=true
5 changes: 1 addition & 4 deletions .github/workflows/test.yml
@@ -1,6 +1,3 @@
# This workflow will build a golang project
# For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-go

name: Go

on:
@@ -19,7 +16,7 @@ jobs:
- name: Set up Go
uses: actions/setup-go@v4
with:
go-version: '1.22'
go-version: '1.26'

- name: Build
run: go build -v ./...
2 changes: 2 additions & 0 deletions .gitignore
@@ -1 +1,3 @@
.desloppify/
pkg/nri/.tmp
.claude/
47 changes: 47 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,47 @@
# Contributing to vault-db-injector

Thank you for your interest in contributing. This guide covers how to get a
working development environment and how to validate NRI mode locally on a k3d cluster.

For general project information, see the [documentation site](https://numberly.github.io/vault-db-injector).

---

## Getting started

```bash
git clone https://github.com/numberly/vault-db-injector.git
cd vault-db-injector
go build ./...
go test ./...
```

Standard unit tests require no external dependencies and run on any platform.

---

## Testing NRI mode locally

NRI mode requires a Kubernetes runtime that supports NRI (containerd ≥ 1.7
or CRI-O ≥ 1.26). Use the `scripts/enable-nri-on-k3d.sh` helper to enable
NRI on an existing k3d cluster, or pass the bundled `config.toml.tmpl` at
cluster creation:

```bash
K3D_FIX_DNS=0 k3d cluster create vault-db-test --servers 1 --agents 1 \
--image rancher/k3s:v1.34.1-k3s1 \
--volume "$PWD/scripts/k3d-containerd-config.toml.tmpl:/var/lib/rancher/k3s/agent/etc/containerd/config.toml.tmpl@all"
```

Verify the NRI socket exists on each node: `docker exec <node> ls /var/run/nri/nri.sock`.

## Pull request checklist

- [ ] `go test ./...` passes
- [ ] `go vet ./...` and `golangci-lint run` produce no errors
- [ ] New packages include unit tests
- [ ] If the PR changes webhook behavior, add a test case to
`pkg/k8smutator` for both `cfg.NRI.Enabled=false` and
`cfg.NRI.Enabled=true`
- [ ] Commit messages follow [Conventional Commits](https://www.conventionalcommits.org/)
(`feat:`, `fix:`, `chore:`, `docs:`, `perf:`)
7 changes: 4 additions & 3 deletions Dockerfile
@@ -1,5 +1,6 @@
# numberlyinfra/vault-injector
FROM golang:1.23.6-alpine3.21 AS build
# numberlyinfra/vault-db-injector

FROM golang:1.26-alpine AS build

WORKDIR /app
COPY go.mod go.sum ./
@@ -16,4 +17,4 @@ COPY --from=build /vault-db-injector /vault-db-injector
USER 65534
EXPOSE 8443 8080

ENTRYPOINT ["/vault-db-injector"]
ENTRYPOINT ["/vault-db-injector"]
27 changes: 27 additions & 0 deletions README.md
@@ -7,6 +7,7 @@ The Vault DB Injector relies on the database engine from Vault to generate crede
- Distribute credentials to workload using annotations and Kubernetes mutating webhook
- Renew credentials when necessary
- Revoke credentials when application pod is deleted
- Optionally protect credentials at the Kubernetes API layer using an NRI plugin substitution layer

## 2. <a name='Documentation'></a>Documentation

@@ -31,6 +32,32 @@ The demo environment is based on:
🧪 **Demo code used during the talk:**
https://github.com/SoulKyu/vault-db-injector-cnd

## 3.5. <a name='SecurityNRI'></a>Security: NRI mode hardening

NRI mode requires the plugin DaemonSet to mount `/var/run/nri/nri.sock` —
the same socket containerd uses for plugin registration. Any pod that
mounts this hostPath can register as an NRI plugin and mutate every
container created on the node (env, mounts, capabilities, args).

This is **inherent to NRI**, not specific to this project. The cluster
admin must restrict who can mount these paths.

**Required mitigations** (in order of strength):

1. **PodSecurityAdmission `restricted` or `baseline`** on user namespaces:
both forbid hostPath volumes. The plugin DS must run in a namespace
labeled `pod-security.kubernetes.io/enforce=privileged`.
2. **Kyverno ClusterPolicy** that blocks `/var/run/nri` and `/opt/nri`
hostPath mounts outside the trusted namespace. A reference policy is
provided at [helm/policies/kyverno-restrict-nri-socket.yaml](helm/policies/kyverno-restrict-nri-socket.yaml).
3. **SELinux/AppArmor**: on RHEL/CoreOS, leave SELinux enforcing;
do not run the plugin pod with `seLinuxOptions.type: spc_t`. The
default `container_runtime_t` socket label prevents user pods from
connecting even if they bypass the hostPath check.

See [docs/how-it-works/nri-mode.md](docs/how-it-works/nri-mode.md) for
the complete threat model.

## 4. <a name='Contribution'></a>Contribution

Contributions to the vault-db-injector are welcome. Please submit your pull requests or issues to the project's GitHub repository.
1 change: 1 addition & 0 deletions docs/getting-started/comparison.md
@@ -56,6 +56,7 @@ Here are our needs by importance in our research :
| **Lease Renew** | ✅ Yes | ✅ Yes | - | 🤔 With restarting | - |
| **Lease Revocation** | ✅ Yes | ❌ No | - | ❌ No | - |
| **Community Support** | 🌱 Growing | 🟢 Established | 🟠 Moderate | 🟠 Moderate | 🟢 Established |
| **Credentials invisible at K8s API layer (PodSpec / etcd / audit logs / GitOps)** | ✅ Yes (with NRI mode) | ❌ No | ❌ No | ❌ No | ❌ No |

### 4.1. <a name='Key'></a>Key

168 changes: 168 additions & 0 deletions docs/how-it-works/nri-mode.md
@@ -0,0 +1,168 @@
# NRI mode

NRI mode replaces credentials with opaque placeholders in the PodSpec and
substitutes them at container creation time via a node-local DaemonSet
plugin. Credentials never appear in any persisted Kubernetes resource
(PodSpec, etcd, audit logs, GitOps captures).

## Architecture (schema v2 — pull-not-push)

```
kube-apiserver
↓ pod admitted with annotation
↓ db-creds-injector.numberly.io/nri-mapping = {
↓ schema:2, db_path, db_role, placeholders, request_id,
↓ pod_namespace, pod_service_account
↓ } (NO Vault token, NO bearer credential)
kubelet
containerd (NRI enabled)
↓ /var/run/nri/nri.sock
[vault-db-injector NRI plugin DaemonSet]
↓ on CreateContainer:
↓ 1. read pod-sandbox annotation, parse NRIMapping
↓ 2. GET pod from kube-apiserver — verify UID, namespace, SA
↓ match annotation (defense vs annotation forgery)
↓ 3. authenticate to Vault as the plugin's OWN SA token
↓ (k8s auth method)
↓ 4. CanIGetRoles for the actual pod identity → confirms the
↓ Vault auth role binds this (namespace, SA)
↓ 5. GetDbCredentials — dynamic credential issued, lease tagged
↓ with pod UID for renewer/revoker correlation
↓ 6. emit ContainerAdjustment{env: substituted}
runc
↓ execve with real envp
app
```
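For reference, the schema-2 payload shown above can be modeled and parsed fail-closed roughly like this. This is a sketch: the struct and function names, and the exact JSON field types, are illustrative assumptions, not the plugin's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// NRIMapping models the schema-2 annotation payload described above.
// Field names mirror the annotation keys; the Go names are illustrative.
type NRIMapping struct {
	Schema            int               `json:"schema"`
	DBPath            string            `json:"db_path"`
	DBRole            string            `json:"db_role"`
	Placeholders      map[string]string `json:"placeholders"` // env var name -> placeholder token
	RequestID         string            `json:"request_id"`
	PodNamespace      string            `json:"pod_namespace"`
	PodServiceAccount string            `json:"pod_service_account"`
}

// parseMapping rejects anything but schema 2, mirroring the plugin's
// fail-closed behavior during upgrades.
func parseMapping(raw string) (*NRIMapping, error) {
	var m NRIMapping
	if err := json.Unmarshal([]byte(raw), &m); err != nil {
		return nil, fmt.Errorf("malformed nri-mapping annotation: %w", err)
	}
	if m.Schema != 2 {
		return nil, fmt.Errorf("unsupported nri-mapping schema version %d", m.Schema)
	}
	return &m, nil
}

func main() {
	raw := `{"schema":2,"db_path":"database/pg","db_role":"app-ro",` +
		`"placeholders":{"DB_PASSWORD":"vdbi-ph-abc123"},"request_id":"r-1",` +
		`"pod_namespace":"team-a","pod_service_account":"app"}`
	m, err := parseMapping(raw)
	fmt.Println(m.DBRole, err) // prints "app-ro <nil>"

	_, err = parseMapping(`{"schema":1}`)
	fmt.Println(err) // prints "unsupported nri-mapping schema version 1"
}
```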

## Components

- **Webhook** — generates placeholders, stamps the
`db-creds-injector.numberly.io/nri-mapping` annotation. Calls Vault
CanIGetRoles to fail-fast at admission if the pod's SA isn't bound to
the requested role. **Does not** call Vault sys/wrapping/wrap; no
credential or token is placed in the PodSpec.
- **DaemonSet (NRI plugin)** — node-local. Authenticates to Vault using
its own ServiceAccount token. Verifies pod identity against the K8s
API. Creates the dynamic credential. Substitutes placeholders in the
container env at `CreateContainer` (before runc).
- **Cache** — per-node tmpfs at `/run/vault-db-injector/nri/cache.json`
persists unwrapped credentials so they survive plugin pod restart but
not node reboot. A pod whose plugin DS restarts mid-CrashLoop continues
to receive the substituted env on retry instead of the placeholder.
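The plugin's identity-verification step (step 2 in the flow above) amounts to comparing the annotation's claims against what kube-apiserver actually reports for the sandbox. A minimal sketch, with illustrative type and function names and the API call stubbed out:

```go
package main

import (
	"errors"
	"fmt"
)

// PodIdentity is what the plugin reads back from kube-apiserver for the
// pod it is about to adjust. Names here are illustrative.
type PodIdentity struct {
	UID            string
	Namespace      string
	ServiceAccount string
}

// verifyIdentity enforces the anti-forgery checks: the live pod object,
// not the annotation, is the source of truth for namespace and SA.
func verifyIdentity(sandboxUID string, live PodIdentity, claimedNS, claimedSA string) error {
	if live.UID != sandboxUID {
		return errors.New("pod UID mismatch between NRI sandbox and kube-apiserver")
	}
	if live.Namespace != claimedNS || live.ServiceAccount != claimedSA {
		return fmt.Errorf("annotation claims %s/%s but apiserver reports %s/%s",
			claimedNS, claimedSA, live.Namespace, live.ServiceAccount)
	}
	return nil
}

func main() {
	live := PodIdentity{UID: "u-1", Namespace: "team-a", ServiceAccount: "app"}
	fmt.Println(verifyIdentity("u-1", live, "team-a", "app"))              // <nil>
	fmt.Println(verifyIdentity("u-1", live, "kube-system", "admin") != nil) // true
}
```

Only after this check passes does the plugin hand the K8s-attested (namespace, SA) pair to Vault's `CanIGetRoles`.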

## Failure modes and detection

The plugin emits Prometheus metrics:

- `vdbi_nri_substitutions_total` — successful adjustments emitted
- `vdbi_nri_unwrap_failures_total{reason}` — labels: `malformed_annotation`,
`fetch_error` (covers identity mismatch, Vault errors, missing pod, etc.)

### What can still go wrong

1. **No NRI plugin on the target node.** If the DS pod is missing on a
node (image pull, broken DS, post-install delay), pods scheduled there
start with the literal placeholder string in env. The app fails to
connect to the database with the placeholder as password and crashes
visibly. The plugin emits no metric for this case (it is not running
on that node).

**Detection** — alert when a pod has the `nri-mapping` annotation but
its node has no ready NRI plugin pod:

```yaml
- alert: NRIPluginMissingOnNode
expr: |
count by (node) (
kube_pod_annotations{annotation_db_creds_injector_numberly_io_nri_mapping!=""}
)
and on (node) (
count by (node) (
kube_pod_status_ready{condition="true",pod=~"vault-db-injector-nri-.*"}
) == 0
)
for: 1m
annotations:
summary: NRI plugin not ready on node {{ $labels.node }} — pods with credentials are starting unsubstituted
```

2. **Annotation forgery** (closed by Hunter finding #H6). An attacker
with `pods.create` or `pods.update` RBAC can craft an annotation
claiming any `pod_namespace`/`pod_service_account`. The plugin
defends against this in three layers:
- The pod's actual UID (from NRI sandbox) must match
`pod.metadata.uid` recorded by kube-apiserver.
- The pod's actual namespace and `spec.serviceAccountName` (from
kube-apiserver) must match the annotation's claim.
- Vault `CanIGetRoles` is called with the K8s-attested identity, not
the annotation's, so a mismatched claim fails authorization.

3. **Plugin DS pod and main container restart in the wrong order.** The
on-disk cache covers this: the second CreateContainer attempt for the
same pod UID reuses the stored credential without a second Vault round-trip.

4. **Force-deleted pod cache leak** — a pod deleted with
`--grace-period=0 --force` does not fire NRI's `RemovePodSandbox`
event. The plugin runs a periodic 5-minute sweep that lists pods on
its node via the K8s API and evicts cache entries whose UIDs no
longer exist.
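The sweep in case 4 reduces to a set difference between cached UIDs and the UIDs currently live on the node. A sketch, with the node-scoped K8s API listing stubbed out so only the eviction logic is shown (names are illustrative):

```go
package main

import "fmt"

// sweepCache drops cache entries whose pod UID no longer exists on the
// node, covering force-deleted pods that never fire RemovePodSandbox.
// liveUIDs would come from listing pods on this node via the K8s API;
// it is passed in here so the eviction logic is testable in isolation.
func sweepCache(cache map[string][]byte, liveUIDs map[string]struct{}) int {
	evicted := 0
	for uid := range cache {
		if _, ok := liveUIDs[uid]; !ok {
			delete(cache, uid)
			evicted++
		}
	}
	return evicted
}

func main() {
	cache := map[string][]byte{"u-1": nil, "u-2": nil, "u-3": nil}
	live := map[string]struct{}{"u-1": {}, "u-3": {}}
	fmt.Println(sweepCache(cache, live), len(cache)) // prints "1 2"
}
```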

## Schema versioning

The plugin only accepts annotations with `"schema":2`. Schema 1 (the
legacy `wrap_token` design) is rejected with a clear error so an
operator never silently runs in an inconsistent state during upgrade.

**Upgrade path** — when moving from a v1 webhook + v1 plugin
deployment to v2:

1. Set `nri.enabled: false` in helm values and apply. New pods now
inject literal credentials in PodSpec (legacy mode, byte-identical
to pre-NRI behavior).
2. Upgrade the webhook and plugin Deployment/DaemonSet images together.
3. Set `nri.enabled: true` and apply.

If you upgrade hot (without disabling NRI), pods admitted by an old
webhook just before the upgrade will hit the new plugin and be rejected
with `unsupported nri-mapping schema version 1`. The container starts
with the placeholder, the app crashes on the bad credential, and kubelet
restarts it. Within a few seconds the new webhook is admitting v2
annotations and recovery is automatic, but expect roughly 30 seconds of
pod CrashLoop noise during the window. Draining nodes before the upgrade
avoids the noise entirely.

## Hardening checklist

- Set resource requests on the DS so it is not OOM-killed on memory pressure
- Use `priorityClass: system-node-critical` to make eviction less likely
- Monitor `NRIPluginMissingOnNode` (above) and
`vdbi_nri_unwrap_failures_total{reason="fetch_error"}`
- Apply the Kyverno policy at
[helm/policies/kyverno-restrict-nri-socket.yaml](../../helm/policies/kyverno-restrict-nri-socket.yaml)
to block hostPath mounts of `/var/run/nri` and `/opt/nri` outside the
plugin's namespace
- On RHEL/CoreOS leave SELinux enforcing; do not run user pods with
`seLinuxOptions.type: spc_t`

## Trust posture

The cache file at `/run/vault-db-injector/nri/cache.json` contains
unwrapped credentials in cleartext, perms `0600 root:root`, on tmpfs.
The same posture applies to:

- kubelet's projected service-account tokens at
`/var/lib/kubelet/pods/<UID>/volumes/kubernetes.io~projected/...`
- Any Secret mounted as a volume

A root-on-node attacker can already read `/proc/<pid>/environ` of every
container, so the cache adds no new attack surface beyond what root already
has. The cache is **never on persistent disk** (tmpfs) and **never in
backups** (`/run` is excluded by typical node backup tooling).

A pod that mounts hostPath `/run` AND runs as UID 0 (root) can read the
cache. PSA `restricted` and `baseline` profiles forbid hostPath mounts
entirely, which is the recommended baseline for user namespaces. The
Kyverno policy referenced above does not currently include
`/run/vault-db-injector` because PSA covers it; if you must keep
`baseline` off and root-on-pod allowed, extend the Kyverno policy
manually.