
Commit 581795e

admin: store operators in config store, drop SSO-group role mapping (#840)
* admin: store operators in config store, drop SSO-group role mapping

Replace the Google/Cognito group-based admin role mapping with an `operators` table in the config store. Roles are resolved per-request from that table: an SSO-authenticated `@posthog.com` identity is admin only if a matching `admin` operator row exists, else viewer. The break-glass internal token still grants admin.

The Cognito/Google federation never forwarded Workspace group membership, so the group claim was always empty and every SSO user fell through to viewer.

The operators table is self-service. The first SSO login auto-provisions a create-only viewer row for the caller, so an operator appears in the config store just by logging in. To make the first admin, log in over break-glass (internal-secret) and patch your row to admin under the new Admin -> Operators section of the config-store explorer.

- configstore: Operator model (runtime schema, AutoMigrate) + CRUD (OperatorRole, List/Upsert/Delete, CountAdmins, SeedOperator)
- admin/authz: RoleResolver replaces SSOConfig; @posthog.com + email_verified hardening; groups removed from Identity
- admin: admin-only operators API with last-admin guards; Admin sidebar group after Runtime
- multitenant: resolver over OperatorRole with create-only viewer auto-provision; env reads for DUCKGRES_ADMIN_SSO_GROUP/ALLOW_ALL removed
- ui: Operators management view in the config-store Admin group

* configstore: create operators via goose migration, not AutoMigrate

Operators are authoritative admin-console RBAC principals: losing the table means admin lockout, so it is not rebuildable/expirable runtime state. Move it from the runtime schema (AutoMigrate) to the config schema via a goose migration, alongside the other duckgres_ config tables.

- new migration 000006_create_operators.sql creating duckgres_operators
- Operator.TableName() -> "duckgres_operators"
- drop the operators entry from autoMigrateRuntimeTables
- operators CRUD queries the bare config-schema table (no runtime qualification)
- models explorer descriptor: Runtime flag false (config schema)

* configstore: fix operators migration test parity

The goose-migration integration test asserted the latest version was 5 and compared every migrated config table against GORM model metadata. Adding 000006 broke both:

- bump the expected latest goose version to 6 (and assert 6 recorded)
- add duckgres_operators to the four metadata-parity table lists and add Operator to the GORM comparison AutoMigrate set, so the table is covered by the model<->migration drift check
- align the Operator model and migration so they match exactly: size tags on the varchar columns, role NOT NULL, no string column default
1 parent 23a988d commit 581795e

20 files changed

Lines changed: 879 additions & 161 deletions

CLAUDE.md

Lines changed: 18 additions & 7 deletions
@@ -320,10 +320,18 @@ user). Design + decisions: `docs/design/admin-ui.md`; package details:
 **before** `go build`. Do not delete `.gitkeep` and do not commit `ui/dist`.
 - **Two-tier authz** (`authz.go`): `AuthMiddleware` resolves every `/api/v1`
 request to admin (valid `TokenSet` internal secret — service/break-glass) or to
-an SSO identity from the ALB `X-Amzn-Oidc-Data` JWT (group `DUCKGRES_ADMIN_SSO_GROUP`
-→ admin, else viewer). `RoleGate` requires admin for all mutating verbs + the
-audit GET. `AuditMiddleware` records every mutation. Keep new mutating routes
-under this gate; never add a write path that bypasses RoleGate/audit.
+an SSO identity from the ALB `X-Amzn-Oidc-Data` JWT. The SSO email
+(`@posthog.com` + `email_verified != false`, else 401) is mapped to a role
+**per-request** by a `RoleResolver` backed by the `duckgres_operators` config-schema
+table (goose migration `000006_create_operators.sql`) — `admin` row → admin, else
+viewer. Admins manage operators
+under **Admin → Operators** (`/api/v1/operators`); the first SSO login
+auto-provisions a create-only **viewer** row, and the first admin is minted by
+logging in over the break-glass internal token and patching that row to `admin`
+under **Admin → Operators**. `RoleGate` requires admin for
+all mutating verbs + the audit GET. `AuditMiddleware` records every mutation.
+Keep new mutating routes under this gate; never add a write path that bypasses
+RoleGate/audit.
 - **Impersonation is a real session** (`impersonate.go` + `admin_providers.go`):
 it reuses `SessionManager.CreateSessionWithProtocol` (workers trust the CP — no
 password) and **always** `DestroySession` in a defer. Admin-only, every
@@ -335,9 +343,12 @@ user). Design + decisions: `docs/design/admin-ui.md`; package details:
 panel KEY, PromQL is built server-side from `rangePanels` (never an open PromQL
 relay) and forwarded to `DUCKGRES_PROMETHEUS_URL`. Org-labelled panels keep
 slicing enforced.
-- **Env-only knobs**: `DUCKGRES_ADMIN_SSO_GROUP`, `DUCKGRES_PROMETHEUS_URL` (read
-in `multitenant.go`; set by the chart). The audit table `duckgres_admin_audit`
-is AutoMigrated at startup (operational state, not goose-migrated tenant config).
+- **Env-only knobs**: `DUCKGRES_PROMETHEUS_URL` (read in
+`multitenant.go`; set by the chart). The audit table `duckgres_admin_audit` is
+AutoMigrated at startup (operational state, not goose-migrated tenant config).
+The `duckgres_operators` table is authoritative access-control data, so it lives
+in the config schema via goose migration `000006_create_operators.sql`, not
+AutoMigrate.
 - `ManagedSession.Username` is populated at session create so the console can
 slice live sessions/queries by user; keep it set on every create path.
 - Touching any of the above → update `controlplane/admin/*_test.go` (esp
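
The per-request role resolution described in the CLAUDE.md change can be sketched as follows. This is a minimal illustration under stated assumptions, not the commit's code: `memOperators` and its `seed`, `upsert`, and `role` methods are hypothetical in-memory stand-ins for the `duckgres_operators` table, and only the `RoleResolver` function shape is taken from the diff.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// Role mirrors the admin/viewer split described in the commit.
type Role string

const (
	RoleViewer Role = "viewer"
	RoleAdmin  Role = "admin"
)

// RoleResolver has the same shape as the type added in authz.go.
type RoleResolver func(email string) Role

// memOperators is a hypothetical in-memory stand-in for the operators table.
type memOperators struct {
	mu   sync.Mutex
	rows map[string]Role
}

func newMemOperators() *memOperators {
	return &memOperators{rows: make(map[string]Role)}
}

// seed is create-only: it inserts a row when none exists and never
// overwrites an existing one, so a login cannot change anyone's role.
func (m *memOperators) seed(email string, role Role) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if _, ok := m.rows[email]; !ok {
		m.rows[email] = role
	}
}

// upsert overwrites the role (the path an admin would use).
func (m *memOperators) upsert(email string, role Role) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.rows[email] = role
}

// role returns the stored role and whether a row exists.
func (m *memOperators) role(email string) (Role, bool) {
	m.mu.Lock()
	defer m.mu.Unlock()
	r, ok := m.rows[email]
	return r, ok
}

// newResolver returns a RoleResolver that auto-provisions a viewer row on
// first sight and grants admin only when an admin row exists.
func newResolver(ops *memOperators) RoleResolver {
	return func(email string) Role {
		email = strings.ToLower(strings.TrimSpace(email))
		ops.seed(email, RoleViewer)
		if r, ok := ops.role(email); ok && r == RoleAdmin {
			return RoleAdmin
		}
		return RoleViewer
	}
}

func main() {
	ops := newMemOperators()
	resolve := newResolver(ops)

	fmt.Println(resolve("alice@posthog.com")) // first login: viewer
	ops.upsert("alice@posthog.com", RoleAdmin) // promotion by an admin
	fmt.Println(resolve("alice@posthog.com")) // admin
}
```

Because the lookup runs on every request, a promotion or demotion takes effect on the caller's next request without a new login.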

controlplane/admin/README.md

Lines changed: 13 additions & 3 deletions
@@ -28,9 +28,16 @@ with a `Role`:
 - A valid `TokenSet` token (`X-Duckgres-Internal-Secret` header or the
 `duckgres_admin_token` cookie) → **admin**. This is the service-to-service /
 break-glass path (`RegisterLogin` mints the cookie via `POST /login`).
-- Otherwise the ALB-injected `X-Amzn-Oidc-Data` JWT (Cognito/Google) is decoded
-to email + groups. Membership in `DUCKGRES_ADMIN_SSO_GROUP` → **admin**, else
-**viewer**. (Unset group → admin for any SSO user, logged — set it in prod.)
+- Otherwise the ALB-injected `X-Amzn-Oidc-Data` JWT (Cognito/Google) yields the
+caller's email (only `@posthog.com`, `email_verified != false`; otherwise
+treated as unauthenticated). The role is then resolved **per-request** from the
+`duckgres_operators` table in the config schema (goose migration
+`000006_create_operators.sql`): an `admin` row →
+**admin**, anything else (including no row) → **viewer**. Operators are managed
+by admins under **Admin → Operators** in the config-store explorer (and the
+`/api/v1/operators` API). The first SSO login auto-provisions a create-only
+**viewer** operator row; to mint the first admin, log in over the break-glass
+internal token and patch that row to `admin` under **Admin → Operators**.
 
 `RoleGate` enforces the split: mutating verbs (POST/PUT/PATCH/DELETE) and the
 audit-log GET require admin; other GETs allow viewer. `AuditMiddleware` records
@@ -61,6 +68,9 @@ Added for the console:
 | `GET /api/v1/orgs/:id/users/:username/secrets`, `DELETE .../:name` | viewer/admin | list/delete stored persistent secrets (ciphertext never returned) |
 | `POST /api/v1/orgs/:id/impersonate/query` | admin | run SQL as an org user on their worker |
 | `GET /api/v1/audit` | admin | admin action log |
+| `GET /api/v1/operators` | admin | list console operators (email → role) |
+| `POST /api/v1/operators` | admin | add/update an operator (`{email, role}`; last-admin demotion → 409) |
+| `DELETE /api/v1/operators/:email` | admin | remove an operator (removing the last admin → 409) |
 
 ### Cross-CP live-state aggregation (`live_aggregate.go` + `controlplane/live_aggregator.go`)

controlplane/admin/api.go

Lines changed: 3 additions & 0 deletions
@@ -78,6 +78,9 @@ func RegisterAPI(r *gin.RouterGroup, store *configstore.ConfigStore, info OrgSta
 	// the concrete store directly because it needs the runtime schema name and
 	// raw DB for tables the typed apiStore interface doesn't surface.
 	registerModelsAPI(r, store)
+	// Admin-only Operators management (the admin-console access list). Each
+	// route self-gates with RequireAdmin; mutations are audited via the group.
+	registerOperatorsAPI(r, store)
 }
 
 func registerAPIWithStore(r *gin.RouterGroup, store apiStore, info OrgStackInfo, fetcher PeerFetcher) {

controlplane/admin/authz.go

Lines changed: 61 additions & 90 deletions
@@ -39,24 +39,21 @@ const (
 	albOIDCIdentityHeader = "X-Amzn-Oidc-Identity"
 )
 
-// SSOConfig controls how the ALB/Cognito identity maps to a Role.
-type SSOConfig struct {
-	// AdminGroup is the Google Workspace / Cognito group whose members get the
-	// admin role. When empty, SSO users default to VIEWER (fail closed) unless
-	// AllowAllSSO is set. The break-glass internal-secret path is unaffected.
-	AdminGroup string
-	// AllowAllSSO, when true AND AdminGroup is empty, grants admin to every
-	// SSO-authenticated user (single-tier dev convenience). It must be opted
-	// into explicitly (DUCKGRES_ADMIN_SSO_ALLOW_ALL=true) so a missing group
-	// config in production cannot silently grant admin to everyone.
-	AllowAllSSO bool
-}
+// ssoEmailDomain is the only email domain accepted on the SSO path. SSO emails
+// outside this domain are treated as unauthenticated (defense in depth on top
+// of the ALB/Cognito allow-list and the operators table).
+const ssoEmailDomain = "@posthog.com"
+
+// RoleResolver maps an authenticated SSO email to a Role. It is injected into
+// AuthMiddleware so the auth layer stays free of any config-store dependency;
+// the control plane wires one backed by the operators table. A nil resolver
+// (or one that returns RoleViewer) means the caller is a viewer.
+type RoleResolver func(email string) Role
 
 // Identity is the resolved caller for an admin-UI request.
 type Identity struct {
-	Email  string   `json:"email"`
-	Groups []string `json:"groups"`
-	Role   Role     `json:"role"`
+	Email string `json:"email"`
+	Role  Role   `json:"role"`
 	// Source records how the identity was established (for audit): "sso" or
 	// "internal-secret".
 	Source string `json:"source"`
@@ -74,83 +71,84 @@ func IdentityFromContext(c *gin.Context) *Identity {
 
 // AuthMiddleware authenticates a request and resolves its Role. A valid
 // TokenSet bearer token (header/cookie) is the service-to-service / break-glass
-// path and always maps to admin. Otherwise the ALB-injected Cognito JWT is
-// decoded into an Identity. Unauthenticated requests are rejected 401.
-func AuthMiddleware(tokens TokenSet, sso SSOConfig) gin.HandlerFunc {
+// path and always maps to admin. Otherwise the ALB-injected Cognito JWT yields
+// the caller's email, and resolve (the operators-table lookup) maps that email
+// to a Role. Unauthenticated requests are rejected 401.
+func AuthMiddleware(tokens TokenSet, resolve RoleResolver) gin.HandlerFunc {
 	return func(c *gin.Context) {
-		// 1. Internal secret (header or login cookie) -> admin.
+		// 1. Internal secret (header or login cookie) -> admin (break-glass).
 		if tokens.Valid(requestAdminToken(c)) {
 			c.Set(ctxIdentityKey, &Identity{Email: "internal-secret", Role: RoleAdmin, Source: "internal-secret"})
 			c.Next()
 			return
 		}
-		// 2. ALB/Cognito SSO identity.
-		if id := identityFromOIDC(c, sso); id != nil {
-			c.Set(ctxIdentityKey, id)
-			c.Next()
+		// 2. ALB/Cognito SSO identity: extract the email, then resolve its role.
+		email := emailFromOIDC(c)
+		if email == "" {
+			c.AbortWithStatusJSON(http.StatusUnauthorized, gin.H{"error": "not authenticated"})
 			return
 		}
-		c.AbortWithStatusJSON(http.StatusUnauthorized, gin.H{"error": "not authenticated"})
+		role := RoleViewer
+		if resolve != nil {
+			role = resolve(email)
+		}
+		c.Set(ctxIdentityKey, &Identity{Email: email, Role: role, Source: "sso"})
+		c.Next()
 	}
 }
 
-// identityFromOIDC decodes the ALB OIDC data JWT (or falls back to the identity
-// header) into an Identity with a resolved Role. Returns nil if no SSO identity
-// is present.
+// emailFromOIDC extracts the caller's email from the ALB OIDC data JWT (falling
+// back to the `sub` claim, then to the identity-only header). It returns "" when
+// no usable SSO identity is present OR when the email fails domain hardening.
+//
+// Domain hardening: only @posthog.com emails are accepted, and a JWT
+// email_verified claim that is explicitly false rejects the identity. This is
+// defense in depth on top of the ALB/Cognito allow-list — a stray non-corporate
+// or unverified identity never becomes a logged-in (even viewer) caller.
 //
 // The JWT signature is NOT verified here: the request only reaches this pod via
 // the internal-scheme ALB (which signs and injects the header and strips
 // client-supplied copies) over a tailnet-restricted network. Verifying the
 // ALB's regional public key by `kid` is a hardening follow-up tracked in the
-// design doc; it does not change the role mapping below.
-func identityFromOIDC(c *gin.Context, sso SSOConfig) *Identity {
+// design doc.
+func emailFromOIDC(c *gin.Context) string {
 	raw := c.GetHeader(albOIDCDataHeader)
 	if raw == "" {
-		// Fallback: identity-only header (email/subject), no groups.
-		if email := c.GetHeader(albOIDCIdentityHeader); email != "" {
-			return resolveRole(&Identity{Email: email, Source: "sso"}, sso)
+		// Fallback: identity-only header (email/subject). The data JWT is absent,
+		// so there is no email_verified claim to consult — domain check still applies.
+		if email := c.GetHeader(albOIDCIdentityHeader); acceptableSSOEmail(email, true) {
+			return strings.ToLower(strings.TrimSpace(email))
 		}
-		return nil
+		return ""
 	}
 	claims, err := decodeJWTClaims(raw)
 	if err != nil {
 		slog.Warn("admin: failed to decode ALB OIDC data header", "error", err)
-		return nil
+		return ""
+	}
+	email := stringClaim(claims, "email")
+	if email == "" {
+		email = stringClaim(claims, "sub")
 	}
-	id := &Identity{
-		Email:  stringClaim(claims, "email"),
-		Groups: groupsClaim(claims),
-		Source: "sso",
+	// email_verified, when present, must not be false.
+	verified := true
+	if v, ok := claims["email_verified"]; ok {
+		if b, ok := v.(bool); ok {
+			verified = b
+		}
 	}
-	if id.Email == "" {
-		id.Email = stringClaim(claims, "sub")
+	if !acceptableSSOEmail(email, verified) {
+		return ""
 	}
-	return resolveRole(id, sso)
+	return strings.ToLower(strings.TrimSpace(email))
 }
 
-// resolveRole assigns admin/viewer based on group membership.
-func resolveRole(id *Identity, sso SSOConfig) *Identity {
-	if sso.AdminGroup == "" {
-		if sso.AllowAllSSO {
-			// Explicit opt-in (DUCKGRES_ADMIN_SSO_ALLOW_ALL): single-tier dev mode.
-			id.Role = RoleAdmin
-			slog.Warn("admin: DUCKGRES_ADMIN_SSO_GROUP unset and ALLOW_ALL set — granting admin to all SSO users", "email", id.Email)
-			return id
-		}
-		// Fail closed: without a configured admin group, SSO users are viewers.
-		// (The internal-secret / break-glass path still grants admin.)
-		id.Role = RoleViewer
-		slog.Warn("admin: DUCKGRES_ADMIN_SSO_GROUP unset — SSO user defaults to viewer (set the group, or DUCKGRES_ADMIN_SSO_ALLOW_ALL for dev)", "email", id.Email)
-		return id
+// acceptableSSOEmail enforces the domain + verification hardening rules.
+func acceptableSSOEmail(email string, verified bool) bool {
+	if !verified {
+		return false
 	}
-	id.Role = RoleViewer
-	for _, g := range id.Groups {
-		if strings.EqualFold(strings.TrimSpace(g), sso.AdminGroup) {
-			id.Role = RoleAdmin
-			break
-		}
-	}
-	return id
+	return strings.HasSuffix(strings.ToLower(strings.TrimSpace(email)), ssoEmailDomain)
 }
 
 // RoleGate enforces the viewer/admin split:
@@ -246,30 +244,3 @@ func stringClaim(claims map[string]any, key string) string {
 	}
 	return ""
 }
-
-// groupsClaim extracts group membership from the common claim shapes Cognito /
-// Google emit: a JSON array, or a space/comma-separated string. Checks several
-// claim names.
-func groupsClaim(claims map[string]any) []string {
-	for _, key := range []string{"cognito:groups", "custom:groups", "groups"} {
-		v, ok := claims[key]
-		if !ok {
-			continue
-		}
-		switch t := v.(type) {
-		case []any:
-			out := make([]string, 0, len(t))
-			for _, g := range t {
-				if s, ok := g.(string); ok {
-					out = append(out, s)
-				}
-			}
-			if len(out) > 0 {
-				return out
-			}
-		case string:
-			return strings.FieldsFunc(t, func(r rune) bool { return r == ' ' || r == ',' })
-		}
-	}
-	return nil
-}
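
To make the hardening rules in `authz.go` concrete, here is a small standalone sketch that exercises them outside of gin. `acceptableSSOEmail` and `ssoEmailDomain` are copied from the diff; `verifiedFromClaims` is a hypothetical helper that factors out the inline `email_verified` handling from `emailFromOIDC`, and the sample addresses are made up.

```go
package main

import (
	"fmt"
	"strings"
)

const ssoEmailDomain = "@posthog.com"

// acceptableSSOEmail is copied from the diff: the identity must be verified
// and the trimmed, lower-cased email must end in the accepted domain.
func acceptableSSOEmail(email string, verified bool) bool {
	if !verified {
		return false
	}
	return strings.HasSuffix(strings.ToLower(strings.TrimSpace(email)), ssoEmailDomain)
}

// verifiedFromClaims (hypothetical helper) mirrors the inline logic in
// emailFromOIDC: only a boolean false rejects; an absent or non-boolean
// claim leaves the identity treated as verified.
func verifiedFromClaims(claims map[string]any) bool {
	if b, ok := claims["email_verified"].(bool); ok {
		return b
	}
	return true
}

func main() {
	cases := []struct {
		email  string
		claims map[string]any
	}{
		{"Alice@PostHog.com", map[string]any{"email_verified": true}}, // accepted
		{"bob@posthog.com", map[string]any{}},                          // accepted: claim absent
		{"carol@posthog.com", map[string]any{"email_verified": false}}, // rejected: unverified
		{"mallory@notposthog.com", map[string]any{}},                   // rejected: wrong domain
	}
	for _, tc := range cases {
		ok := acceptableSSOEmail(tc.email, verifiedFromClaims(tc.claims))
		fmt.Printf("%-24s accepted=%v\n", tc.email, ok)
	}
}
```

Keeping the leading `@` in the suffix is what rejects look-alike domains such as `notposthog.com`. In this sketch a string-typed `"false"` is not a boolean and therefore counts as verified; whether the identity provider ever emits the claim in that form is not stated in the source.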
