Commit 9222814

feat(api-server): sync user account in OneCLI on login (#98)

* feat(api-server): sync user account in OneCLI on first authentication

  Calls POST /api/auth/sync on OneCLI (v0.0.8+) to ensure the user account
  exists. Uses an RxJS saga with distinct() dedup so the sync only fires
  once per user per server lifetime, without blocking requests.

  Assisted-By: Claude (Anthropic AI) <noreply@anthropic.com>
  Signed-off-by: Radek Ježek <radek.jezek@ibm.com>

* chore: bump onecli to v0.0.8

  Required for the /api/auth/sync endpoint used by the new user sync saga.

  Assisted-By: Claude (Anthropic AI) <noreply@anthropic.com>
  Signed-off-by: Radek Ježek <radek.jezek@ibm.com>

* fix: e2e cluster install timeout and OneCLI sync method

  - Add --no-restart flag to cluster:install, used in e2e to skip
    unnecessary pod restarts that caused a deadlock (OneCLI waiting for
    Keycloak while both are restarting)
  - Add --wait to helm upgrade so readiness is guaranteed without restarts
  - Only restart locally-built image deployments (not OneCLI/Keycloak)
  - Fix syncUser to use GET instead of POST (OneCLI returns 405 on POST)

  Assisted-By: Claude (Anthropic AI) <noreply@anthropic.com>
  Signed-off-by: Radek Ježek <radek.jezek@ibm.com>

* fix(e2e): remove restrictive resource overrides causing OOM on startup

  Delete values-test.yaml — its 512Mi limits for OneCLI and Keycloak were
  too tight during startup (Keycloak sits at 454Mi at idle alone, Next.js
  in OneCLI spikes during boot). Default requests total ~1.66 GiB which
  fits comfortably in the 3 GiB test VM.

  Keep --set domain=localtest.me --set port=5555 since test helpers
  hardcode those values.

  Assisted-By: Claude (Anthropic AI) <noreply@anthropic.com>
  Signed-off-by: Radek Ježek <radek.jezek@ibm.com>

* ci(e2e): maximize CI disk space and dump cluster diagnostics on failure

  - Port maximize-build-space action from kagenti-adk to free CI disk
    space for the Lima VM image and k3s state (removes
    dotnet/android/haskell/codeql tooling)
  - Add diagnostics dump (nodes, pods, events, describe, logs) on
    api-server:test failure to make CI debugging possible
  - Bump helm install timeout to 15m

  Assisted-By: Claude (Anthropic AI) <noreply@anthropic.com>
  Signed-off-by: Radek Ježek <radek.jezek@ibm.com>

* ci(e2e): re-checkout after maximize-build-space

  The action mounts an LVM volume at $GITHUB_WORKSPACE which wipes the
  initial checkout. Re-checkout after to restore the repo contents.

  Assisted-By: Claude (Anthropic AI) <noreply@anthropic.com>
  Signed-off-by: Radek Ježek <radek.jezek@ibm.com>

* fix(onecli): remove wait-for-keycloak init container

  It created a deadlock with helm --wait: the init container polled
  /realms/humr, but the humr realm is created by the keycloak-provision
  post-install hook, which runs AFTER --wait completes.

  OneCLI doesn't need Keycloak at boot — JWKS fetching is lazy, and by the
  time a user authenticates, the full stack is up.

  Assisted-By: Claude (Anthropic AI) <noreply@anthropic.com>
  Signed-off-by: Radek Ježek <radek.jezek@ibm.com>

* fix(e2e): restore wait-for-keycloak, drop helm --wait

  OneCLI's gateway fetches JWKS from /realms/humr at startup and crashes
  with "missing field keys" if the realm doesn't exist. So the
  wait-for-keycloak init container is actually necessary.

  The humr realm is imported by the keycloak-provision post-install hook.
  With helm --wait, hooks only run AFTER resources become Ready, creating
  a deadlock. Removing --wait lets the hook run immediately after
  resources are submitted, so the realm exists by the time OneCLI's init
  container polls for it.

  Replace --wait with an explicit kubectl wait on deployments after the
  install, so we still guarantee readiness before tests run.

  Assisted-By: Claude (Anthropic AI) <noreply@anthropic.com>
  Signed-off-by: Radek Ježek <radek.jezek@ibm.com>

* fix(test): share auth token across vitest globalSetup → workers

  The global authenticated client used module-level `_token` set by
  `setToken()` in globalSetup. But globalSetup runs in a separate process
  from test workers — module state can't cross that boundary, so `_token`
  was undefined in tests, and authenticated requests silently went out
  without a Bearer header. The 401 response then failed JSON parsing in
  tRPC's batch link as "Unable to transform response from server".

  Use vitest's `provide()`/`inject()` API to share the token across the
  process boundary instead.

  Assisted-By: Claude (Anthropic AI) <noreply@anthropic.com>
  Signed-off-by: Radek Ježek <radek.jezek@ibm.com>

---------

Signed-off-by: Radek Ježek <radek.jezek@ibm.com>
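
The final install flow described in the commit message (helm upgrade without --wait, followed by an explicit kubectl wait) could be sketched roughly as below. This is an illustration, not the repository's actual cluster:install task: the release name, namespace, the `run` dry-run helper, and waiting on all deployments with `--all` are assumptions. The chart path, the two `--set` values, and the 15m timeout come from the commit itself.

```shell
# Assumed defaults: release name and namespace are illustrative only.
RELEASE="${RELEASE:-humr}"
NAMESPACE="${NAMESPACE:-default}"
CHART_DIR="${CHART_DIR:-deploy/helm/humr}"

# run CMD [ARGS...]: print the command when DRY_RUN=1, otherwise execute it.
# Hypothetical helper so the flow can be previewed without a cluster.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

install_and_wait() {
  # No --wait: the post-install hook that imports the realm must be able
  # to run while the OneCLI init container is still polling for it.
  run helm upgrade --install "$RELEASE" "$CHART_DIR" \
    --namespace "$NAMESPACE" --timeout 15m \
    --set domain=localtest.me --set port=5555 || return 1

  # Readiness is checked explicitly after the release has been submitted.
  run kubectl wait --namespace "$NAMESPACE" \
    --for=condition=Available deployment --all --timeout=15m
}

# Preview the two commands without touching a cluster.
DRY_RUN=1 install_and_wait
```

Separating the two steps keeps the readiness guarantee that --wait provided while letting hooks start as soon as the manifests are applied.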
1 parent 6467467 commit 9222814

14 files changed

Lines changed: 372 additions & 94 deletions

Lines changed: 221 additions & 0 deletions
@@ -0,0 +1,221 @@
+# Modified version of https://github.com/easimon/maximize-build-space
+name: 'Maximize build disk space'
+description: 'Maximize the available disk space for your build job'
+branding:
+  icon: 'crop'
+  color: 'orange'
+inputs:
+  root-reserve-mb:
+    description: 'Space to be left free on the root filesystem, in Megabytes.'
+    required: false
+    default: '1024'
+  temp-reserve-mb:
+    description: 'Space to be left free on the temp filesystem (/mnt), in Megabytes.'
+    required: false
+    default: '100'
+
+  swap-size-mb:
+    description: 'Swap space to create, in Megabytes.'
+    required: false
+    default: '4096'
+  overprovision-lvm:
+    description: |
+      Create the LVM disk images as sparse files, making the space required for the LVM image files *appear* unused on the
+      hosting volumes until actually allocated. Use with care, this can lead to surprising out-of-disk-space situations.
+      You should prefer adjusting root-reserve-mb/temp-reserve-mb over using this option.
+    required: false
+    default: 'false'
+  build-mount-path:
+    description: 'Absolute path to the mount point where the build space will be available, defaults to $GITHUB_WORKSPACE if unset.'
+    required: false
+  build-mount-path-ownership:
+    description: 'Ownership of the mount point path, defaults to standard "runner" user and group.'
+    required: false
+    default: 'runner:runner'
+  pv-loop-path:
+    description: 'Absolute file path for the LVM image created on the root filesystem, the default is usually fine.'
+    required: false
+    default: '/pv.img'
+  tmp-pv-loop-path:
+    description: 'Absolute file path for the LVM image created on the temp filesystem, the default is usually fine. Must reside on /mnt'
+    required: false
+    default: '/mnt/tmp-pv.img'
+
+  remove-dotnet:
+    description: 'Removes .NET runtime and libraries. (frees ~17 GB)'
+    required: false
+    default: 'false'
+  remove-android:
+    description: 'Removes Android SDKs and Tools. (frees ~11 GB)'
+    required: false
+    default: 'false'
+  remove-haskell:
+    description: 'Removes GHC (Haskell) artifacts. (frees ~2.7 GB)'
+    required: false
+    default: 'false'
+  remove-codeql:
+    description: 'Removes CodeQL Action Bundles. (frees ~5.4 GB)'
+    required: false
+    default: 'false'
+  remove-docker-images:
+    description: 'Removes cached Docker images. (frees ~3 GB)'
+    required: false
+    default: 'false'
+runs:
+  using: "composite"
+  steps:
+    - name: Disk space report before modification
+      shell: bash
+      run: |
+        echo "Memory and swap:"
+        sudo free -h
+        echo
+        sudo swapon --show
+        echo
+
+        echo "Available storage:"
+        sudo df -h
+        echo
+
+    - name: Maximize build disk space
+      shell: bash
+      run: |
+        set -euo pipefail
+
+        BUILD_MOUNT_PATH="${{ inputs.build-mount-path }}"
+        if [[ -z "${BUILD_MOUNT_PATH}" ]]; then
+          BUILD_MOUNT_PATH="${GITHUB_WORKSPACE}"
+        fi
+
+        echo "Arguments:"
+        echo
+        echo " Root reserve: ${{ inputs.root-reserve-mb }} MiB"
+        echo " Temp reserve: ${{ inputs.temp-reserve-mb }} MiB"
+        echo " Swap space: ${{ inputs.swap-size-mb }} MiB"
+        echo " Overprovision LVM: ${{ inputs.overprovision-lvm }}"
+        echo " Mount path: ${BUILD_MOUNT_PATH}"
+        echo " Root PV loop path: ${{ inputs.pv-loop-path }}"
+        echo " Temp PV loop path: ${{ inputs.tmp-pv-loop-path }}"
+
+        echo -n " Removing: "
+        if [[ ${{ inputs.remove-dotnet }} == 'true' ]]; then
+          echo -n "dotnet "
+        fi
+        if [[ ${{ inputs.remove-android }} == 'true' ]]; then
+          echo -n "android "
+        fi
+        if [[ ${{ inputs.remove-haskell }} == 'true' ]]; then
+          echo -n "haskell "
+        fi
+        if [[ ${{ inputs.remove-codeql }} == 'true' ]]; then
+          echo -n "codeql "
+        fi
+        if [[ ${{ inputs.remove-docker-images }} == 'true' ]]; then
+          echo -n "docker "
+        fi
+        echo
+
+        # store owner of $GITHUB_WORKSPACE in case the action deletes it
+        WORKSPACE_OWNER="$(stat -c '%U:%G' "${GITHUB_WORKSPACE}")"
+
+        # ensure mount path exists before the action
+        sudo mkdir -p "${BUILD_MOUNT_PATH}"
+        sudo find "${BUILD_MOUNT_PATH}" -maxdepth 0 ! -empty -exec echo 'WARNING: directory [{}] is not empty, data loss might occur. Content:' \; -exec ls -al "{}" \;
+
+        echo "Removing unwanted software... "
+        if [[ ${{ inputs.remove-dotnet }} == 'true' ]]; then
+          sudo rm -rf /usr/share/dotnet
+        fi
+        if [[ ${{ inputs.remove-android }} == 'true' ]]; then
+          sudo rm -rf /usr/local/lib/android
+        fi
+        if [[ ${{ inputs.remove-haskell }} == 'true' ]]; then
+          sudo rm -rf /opt/ghc
+        fi
+        if [[ ${{ inputs.remove-codeql }} == 'true' ]]; then
+          sudo rm -rf /opt/hostedtoolcache/CodeQL
+        fi
+        if [[ ${{ inputs.remove-docker-images }} == 'true' ]]; then
+          sudo docker image prune --all --force
+        fi
+        echo "... done"
+
+        VG_NAME=buildvg
+
+        # github runners have an active swap file in /mnt/swapfile
+        # we want to reuse the temp disk, so first unmount swap and clean the temp disk
+        echo "Unmounting and removing swap file."
+        sudo swapoff -a
+        sudo rm -f /mnt/swapfile
+
+        echo "Creating LVM Volume."
+        echo " Creating LVM PV on root fs."
+        # create loop pv image on root fs
+        ROOT_RESERVE_KB=$(expr ${{ inputs.root-reserve-mb }} \* 1024)
+        ROOT_FREE_KB=$(df --block-size=1024 --output=avail / | tail -1)
+        ROOT_LVM_SIZE_KB=$(expr $ROOT_FREE_KB - $ROOT_RESERVE_KB)
+        ROOT_LVM_SIZE_BYTES=$(expr $ROOT_LVM_SIZE_KB \* 1024)
+        sudo touch "${{ inputs.pv-loop-path }}" && sudo fallocate -z -l "${ROOT_LVM_SIZE_BYTES}" "${{ inputs.pv-loop-path }}"
+        export ROOT_LOOP_DEV=$(sudo losetup --find --show "${{ inputs.pv-loop-path }}")
+        sudo pvcreate -f "${ROOT_LOOP_DEV}"
+
+
+
+        # create volume group from these pvs
+        # create pv on temp disk if it is on a different filesystem than root
+        TMP_LOOP_DEV=""
+
+        if mountpoint -q /mnt; then
+          echo " /mnt is a mountpoint. Creating LVM PV on temp fs."
+          TMP_RESERVE_KB=$(expr ${{ inputs.temp-reserve-mb }} \* 1024)
+          TMP_FREE_KB=$(df --block-size=1024 --output=avail /mnt | tail -1)
+          TMP_LVM_SIZE_KB=$(expr $TMP_FREE_KB - $TMP_RESERVE_KB)
+          TMP_LVM_SIZE_BYTES=$(expr $TMP_LVM_SIZE_KB \* 1024)
+          sudo touch "${{ inputs.tmp-pv-loop-path }}" && sudo fallocate -z -l "${TMP_LVM_SIZE_BYTES}" "${{ inputs.tmp-pv-loop-path }}"
+          export TMP_LOOP_DEV=$(sudo losetup --find --show "${{ inputs.tmp-pv-loop-path }}")
+          sudo pvcreate -f "${TMP_LOOP_DEV}"
+        else
+          echo " /mnt is NOT a mountpoint. Skipping LVM PV on temp fs."
+        fi
+
+        # create volume group from these pvs
+        if [[ -n "${TMP_LOOP_DEV}" ]]; then
+          sudo vgcreate "${VG_NAME}" "${TMP_LOOP_DEV}" "${ROOT_LOOP_DEV}"
+        else
+          sudo vgcreate "${VG_NAME}" "${ROOT_LOOP_DEV}"
+        fi
+
+        echo "Recreating swap"
+        # create and activate swap
+        sudo lvcreate -L "${{ inputs.swap-size-mb }}M" -n swap "${VG_NAME}"
+        sudo mkswap "/dev/mapper/${VG_NAME}-swap"
+        sudo swapon "/dev/mapper/${VG_NAME}-swap"
+
+        echo "Creating build volume"
+        # create and mount build volume
+        sudo lvcreate -l 100%FREE -n buildlv "${VG_NAME}"
+        if [[ ${{ inputs.overprovision-lvm }} == 'true' ]]; then
+          sudo mkfs.ext4 -m0 "/dev/mapper/${VG_NAME}-buildlv"
+        else
+          sudo mkfs.ext4 -Enodiscard -m0 "/dev/mapper/${VG_NAME}-buildlv"
+        fi
+        sudo mount "/dev/mapper/${VG_NAME}-buildlv" "${BUILD_MOUNT_PATH}"
+        sudo chown -R "${{ inputs.build-mount-path-ownership }}" "${BUILD_MOUNT_PATH}"
+
+        # if build mount path is a parent of $GITHUB_WORKSPACE, and has been deleted, recreate it
+        if [[ ! -d "${GITHUB_WORKSPACE}" ]]; then
+          sudo mkdir -p "${GITHUB_WORKSPACE}"
+          sudo chown -R "${WORKSPACE_OWNER}" "${GITHUB_WORKSPACE}"
+        fi
+
+    - name: Disk space report after modification
+      shell: bash
+      run: |
+        echo "Memory and swap:"
+        sudo free -h
+        echo
+        sudo swapon --show
+        echo
+
+        echo "Available storage:"
+        sudo df -h
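
The PV image sizes in the "Maximize build disk space" step are computed as free space minus the reserve, then converted from KiB to bytes. A minimal sketch of that arithmetic follows, using shell arithmetic expansion rather than `expr`. The function name `lvm_size_bytes` and the guard against a non-positive size are illustrative additions and are not part of the action.

```shell
# lvm_size_bytes FREE_KB RESERVE_MB
# Prints the byte size for a loop-device image: free space minus reserve.
# Hypothetical helper; the action inlines the same arithmetic with expr.
lvm_size_bytes() {
  free_kb="$1"
  reserve_mb="$2"
  reserve_kb=$(( reserve_mb * 1024 ))
  size_kb=$(( free_kb - reserve_kb ))

  # Refuse to return a zero or negative size, which fallocate cannot use.
  if [ "$size_kb" -le 0 ]; then
    echo "reserve (${reserve_mb} MiB) exceeds free space (${free_kb} KiB)" >&2
    return 1
  fi

  echo $(( size_kb * 1024 ))
}

# Example: 20 GiB free on /, with the 15360 MiB root reserve that e2e.yml passes.
lvm_size_bytes 20971520 15360   # prints 5368709120 (5 GiB)
```

One reason to consider arithmetic expansion here: `expr` exits with status 1 when its result is 0, which under `set -e` would abort the step with no explanatory message.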

.github/workflows/e2e.yml

Lines changed: 16 additions & 0 deletions
@@ -25,6 +25,22 @@ jobs:
     runs-on: ubuntu-latest
     timeout-minutes: 30
     steps:
+      # Initial checkout required to resolve the local maximize-build-space action.
+      - uses: actions/checkout@v4
+
+      - name: Maximize build disk space
+        uses: ./.github/actions/maximize-build-space
+        with:
+          root-reserve-mb: 15360
+          temp-reserve-mb: 2048
+          swap-size-mb: 1024
+          remove-dotnet: 'true'
+          remove-android: 'true'
+          remove-haskell: 'true'
+          remove-codeql: 'true'
+
+      # Re-checkout — maximize-build-space mounts an LVM volume at $GITHUB_WORKSPACE,
+      # replacing the earlier checkout contents.
       - uses: actions/checkout@v4

       - name: Enable KVM + install QEMU

deploy/helm/humr/templates/onecli/app.yaml

Lines changed: 4 additions & 0 deletions
@@ -79,6 +79,10 @@ spec:
           image: {{ .Values.controller.caCertInitImage | default "busybox:stable" }}
           command: ["sh", "-c"]
           args:
+            # Wait for the humr realm — OneCLI gateway fetches JWKS at startup and
+            # crashes if the realm doesn't exist yet. The realm is imported by the
+            # keycloak-provision post-install hook; helm install must be invoked
+            # without --wait so that hook runs before OneCLI tries to start.
             - |
               until wget -qO- http://{{ include "humr.keycloak.fullname" . }}:{{ .Values.keycloak.port }}/realms/{{ .Values.keycloak.realm }} >/dev/null 2>&1; do
                 echo "Waiting for Keycloak..."
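
The init container above polls the realm endpoint in an `until` loop; the visible part of the hunk shows no retry limit. For comparison, a bounded variant might look like the sketch below. It is written for a POSIX `sh` such as the one in busybox, and the `wait_until` function, its arguments, and the hostname and port in the usage comment are assumptions made for illustration, not the chart's actual script.

```shell
# wait_until MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Retries COMMAND until it succeeds; returns 1 once MAX_ATTEMPTS is reached.
# Hypothetical helper, not part of the Helm chart.
wait_until() {
  max_attempts="$1"
  delay="$2"
  shift 2
  attempt=1

  until "$@" >/dev/null 2>&1; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Giving up after ${attempt} attempts" >&2
      return 1
    fi
    echo "Waiting for Keycloak... (attempt ${attempt}/${max_attempts})"
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Usage (service name and port are placeholders):
#   wait_until 60 2 wget -qO- "http://keycloak:8080/realms/humr"
```

A bounded loop makes the init container fail visibly if the realm never appears, at the cost of relying on the pod's restart policy to try again.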

deploy/helm/humr/values-test.yaml

Lines changed: 0 additions & 69 deletions
This file was deleted.

deploy/helm/humr/values.yaml

Lines changed: 2 additions & 2 deletions
@@ -40,7 +40,7 @@ postgres:
 # -- OneCLI (credential proxy — gateway + web run in a single container)
 onecli:
   # Single Docker image containing both Rust gateway and Node.js web app
-  image: ghcr.io/kagenti/onecli:0.0.7
+  image: ghcr.io/kagenti/onecli:0.0.8
   replicas: 1

   # -- Database connection (empty host = use shared local postgres)
@@ -153,7 +153,7 @@ keycloak:

   # -- Test user bootstrapped via realm import.
   # DISABLED by default — production deployments must not ship a known credential.
-  # Local dev (values-local.yaml) and e2e tests (values-test.yaml) enable it.
+  # Local dev and e2e tests (via values-local.yaml) enable it.
   testUser:
     enabled: false
     username: ""
