|
1 | 1 | # Tenant isolation — open follow-ups |
2 | 2 |
|
3 | | -PR #70 closed the two original VULNs (namespace pin via `get_deployments`, |
4 | | -wizard-create existence check). The review found three more issues; two are |
5 | | -addressed by this same PR with augmented commits, one remains open and is |
6 | | -documented here. |
| 3 | +Three items uit de review van PR #70 die buiten scope blijven: |
7 | 4 |
|
8 | | -## Status overview |
| 5 | +## TOCTOU op concurrente wizard-create |
9 | 6 |
|
10 | | -| Item | Status | |
11 | | -|---------------------------------------------------------------------|-------------------------------------| |
12 | | -| VULN 1: `get_deployments` namespace pin | ✅ in PR #70 (`enforce_namespace_pin`) | |
13 | | -| VULN 2: wizard-create existence check | ✅ in PR #70 (`simple_background`) | |
14 | | -| VULN 2 bypass: git-monitor path | ✅ in PR #70 (`git_monitor.check_and_create_namespaces`) | |
15 | | -| GAT 1: `task_handlers_project.handle_create_project` existence check | ✅ added in augmented commit | |
16 | | -| GAT 3: `extract_deployment_namespace` pin enforcement (9 callsites) | ✅ added in augmented commit (helper-level fix) | |
17 | | -| GAT 2: TOCTOU on concurrent wizard-create | ⏳ open — design needed | |
18 | | -| Refactor: reuse `git.check_overwrite_project_file` | ⏳ small follow-up | |
19 | | -| UX: ValueError → 400 instead of 500 | ⏳ small follow-up | |
| 7 | +`file_exists` leest de lokale clone; `push_changes` doet bij conflict tot |
| 8 | +3x rebase. Twee gelijktijdige creates voor dezelfde projectnaam: beide |
| 9 | +passen de bestaanscheck (lokale clone weet nog niets), beide pushen, |
| 10 | +tweede push wordt gerebased zonder conflict (verschillende content op |
| 11 | +hetzelfde pad) en wint. Last-write wins, stille overname. |
20 | 12 |
|
21 | | -## GAT 2 — TOCTOU race on concurrent create |
| 13 | +Vereist twee SSO-sessies die binnen seconden coördineren — lage |
| 14 | +waarschijnlijkheid, wel echt. |
22 | 15 |
|
23 | | -**The race.** `file_exists` reads the local working copy of the git |
24 | | -repository. `push_changes` (called by `create_or_update_file` when |
25 | | -`do_commit_and_push=True`) does up to `max_retries=3` with rebase on a |
26 | | -non-fast-forward push. Sequence under contention: |
| 16 | +Aanbevolen fix: na een gerapporteerde rebase nogmaals `file_exists` |
| 17 | +draaien. Als het bestand er nu wel staat, afbreken met dezelfde |
| 18 | +foutmelding. Andere opties (advisory lock via ConfigMap, atomic |
| 19 | +git-update-ref) zijn robuuster maar veel meer werk; alleen overwegen |
| 20 | +als we echte contention zien. |
27 | 21 |
|
28 | | -1. Tenant A submits create-project request for name `target`. OPI's |
29 | | - `file_exists` returns `False` (no such file in local clone). |
30 | | -2. Tenant B submits create-project request for the same name `target` at |
31 | | - roughly the same time. OPI's local clone still says no such file. |
32 | | -3. Both flows pass the existence-check guard. |
33 | | -4. Both flows attempt to commit and push. The first push succeeds. The |
34 | | - second push is rejected as non-fast-forward, OPI rebases, **the rebase |
35 | | - succeeds with no conflict because each create wrote a different content |
36 | | - to the same file path**, and the second push goes through. Last-write |
37 | | - wins, silent takeover. |
| 22 | +## Hergebruik `check_overwrite_project_file` |
38 | 23 |
|
39 | | -**Requirement gate before fixing.** Concurrent wizard-creates with the same |
40 | | -target name require two SSO sessions to coordinate within seconds. Low |
41 | | -likelihood, real if intentional. |
| 24 | +`opi/connectors/git.py:1196` heeft al een existence-check helper. PR #70 |
| 25 | +dupliceert het patroon inline in `simple_background.process_project_background` |
| 26 | +en `task_handlers_project.handle_create_project`. Pre-existing |
| 27 | +duplicatie, laagprioriteit. |
42 | 28 |
|
43 | | -**Possible designs.** |
| 29 | +## ValueError → 400 (UX) |
44 | 30 |
|
45 | | -1. **Re-check after rebase.** After `push_changes` reports a rebase |
46 | | - happened, call `file_exists` again. If the file now exists (i.e. the |
47 | | - other tenant's create won the race), abort with the same error message. |
48 | | - Simplest fix; works because rebase exposes the conflicting file. |
49 | | -2. **Cluster-side advisory lock.** A `LockSecret` / `LockConfigMap` in the |
50 | | - ops namespace, acquired with optimistic concurrency (`resourceVersion`) |
51 | | - before `file_exists` and released after `push`. Robust, more moving |
52 | | - parts. |
53 | | -3. **Switch from optimistic to per-project file-create-only semantics.** Use |
54 | | - git's atomic ref operations (e.g. `git update-ref` with expected |
55 | | - value) so a "create only if doesn't exist" semantic is enforced at the |
56 | | - git layer, not the file-system layer. Cleanest but rewires the |
57 | | - GitConnector. |
| 31 | +`extract_deployment_namespace` raise't `ValueError` bij namespace- |
| 32 | +mismatch. De callers (`backup_router`, `restore_router`, `backup_tasks`, |
| 33 | +`router_detail_edit`) vangen dat niet, dus FastAPI maakt er een 500 |
| 34 | +van. Geen info-leak, wel een verwarrende foutmelding voor legitieme |
| 35 | +gebruikers met een typo. |
58 | 36 |
|
59 | | -Recommend (1) as the practical fix; (2)/(3) only if we see real contention. |
60 | | - |
61 | | -## Refactor: reuse `git.check_overwrite_project_file` |
62 | | - |
63 | | -`opi/connectors/git.py:1196` already implements an existence-check helper |
64 | | -(`check_overwrite_project_file`). PR #70 added an inline check in |
65 | | -`simple_background.process_project_background` and the augmented commit |
66 | | -adds another inline check in `task_handlers_project.handle_create_project`. |
67 | | -Both should ideally call the existing helper for consistency, instead of |
68 | | -duplicating the `file_exists` + error-message pattern. Pre-existing |
69 | | -duplication, low priority. |
70 | | - |
71 | | -## UX: ValueError → 400 Bad Request |
72 | | - |
73 | | -After GAT 3, `extract_deployment_namespace` raises `ValueError` when the |
74 | | -declared namespace mismatches the project name. The callers (backup_router, |
75 | | -restore_router, backup_tasks, router_detail_edit) do not catch this; FastAPI |
76 | | -turns an uncaught `ValueError` into a 500 Internal Server Error. The body |
77 | | -content is generic, so no information leak, but the user sees an opaque |
78 | | -error instead of "namespace must match project name". |
79 | | - |
80 | | -Recommend a global exception handler that maps `ValueError` from these |
81 | | -paths to `HTTPException(status_code=400, detail=str(e))`. Better still: a |
82 | | -dedicated `TenantIsolationError` subclass so the handler is precise. |
83 | | - |
84 | | -Pre-existing pattern in this codebase: most endpoints already wrap their |
85 | | -own `ValueError`s into `HTTPException`. The new raise-paths in |
86 | | -`extract_deployment_namespace` should match. |
87 | | - |
88 | | -## Note on the documented error-message |
89 | | - |
90 | | -`enforce_namespace_pin` (and the augmented `extract_deployment_namespace`) |
91 | | -raises with both `project_name` and `declared_namespace` in the message. |
92 | | -This is not an information leak: the attacker submitted both values |
93 | | -themselves. The message is useful for legitimate users who typo'd. Keep |
94 | | -as-is. |
| 37 | +Aanbevolen: een `TenantIsolationError` subclass + global exception |
| 38 | +handler die hem mapt naar `HTTPException(400, str(e))`. |
0 commit comments