Skip to content

feat(security): CI source-guard requiring a role check on every /v1/manager route#409

Open
lml2468 wants to merge 1 commit into
mainfrom
feat/oct20-manager-authz-guard
Open

feat(security): CI source-guard requiring a role check on every /v1/manager route#409
lml2468 wants to merge 1 commit into
mainfrom
feat/oct20-manager-authz-guard

Conversation

@lml2468

@lml2468 lml2468 commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

Summary

octo-server#366 Part 1. Authz on the admin /v1/manager surface is enforced per-handler by an inline c.CheckLoginRoleIsSuperAdmin() / c.CheckLoginRole() call — there is no central policy middleware yet (that's the SecurityEngineer's systemic workstream). With the check scattered across ~100 handlers, a new /v1/manager route that forgets the check silently ships a privilege-escalation hole: AuthMiddleware only authenticates, it does not gate admin role.

This PR is a thin, reversible CI safety net (no refactor): it makes a missing role check a build failure.

  • tools/lint-manager-authz — an AST guard that fails if any handler registered on a /v1/manager route group lacks a role check.
    • Route-group variables are resolved function-locally, so the admin group doesn't conflate with unrelated /v1 or /v1/space groups that reuse names like auth (this was a real false-positive trap — see the regression test).
    • A handler counts as covered if it (or a same-package helper it calls) invokes CheckLoginRole / CheckLoginRoleIsSuperAdmin, or its group has a role-enforcing middleware, or it's allowlisted.
    • Fail-closed: a handler that can't be resolved is treated as unchecked.
  • allowlist.txt — the 5 routes that live under /v1/manager by path but are intentionally not admin:
    • POST /v1/manager/login (public, issues the admin token pre-auth).
    • /v1/manager/secrets CRUD ×4 (user-scoped, owner == c.GetLoginUID(), not an admin surface).
  • CI: new manager-authz-lint job (mirrors personal-msgsendreq-lint) — runs the guard's unit tests, then the guard over ./modules.
  • Docs-as-interface: convention documented in the tool's package doc, the failure message, the allowlist header, and CLAUDE.md.

Inventory today: 109 /v1/manager routes, all role-checked or allowlisted.

How to satisfy the guard for a new /v1/manager route

Call a role check at the top of the handler before doing any work:

if err := c.CheckLoginRoleIsSuperAdmin(); err != nil { c.ResponseError(err); return }

(use CheckLoginRole for the admin∪superAdmin tier). If the route is deliberately not an admin route (public, or user-scoped by UID), add it to tools/lint-manager-authz/allowlist.txt with a one-line reason.

Test plan

  • go test ./tools/lint-manager-authz/... — 9 cases incl. the function-local-scoping regression, delegated check, subgroup composition, allowlist suppression, and the unchecked-route fixture that proves CI fails.
  • go run ./tools/lint-manager-authz ./modules → green (109 routes).
  • Live proof: injected an unchecked GET /v1/manager/backup/leak into a real module → guard exits 1; reverted → exits 0.
  • Confirmed the 5 allowlist entries are exactly the routes flagged with an empty allowlist (no stale/over-broad entries).
  • gofmt / go vet clean; ci.yml YAML validated.
  • CTO review as security owner before merge. Consider adding Manager Authz Lint to required status checks.

Scope guard: this is Part 1 only — no central authz refactor (held for the SecurityEngineer).

Closes #366

…anager route

Authz on the admin (/v1/manager) surface is enforced per-handler by an inline
c.CheckLoginRoleIsSuperAdmin() / c.CheckLoginRole() call — there is no central
policy middleware yet. A new /v1/manager route that forgets the check silently
ships a privilege-escalation hole, since AuthMiddleware only authenticates and
does not gate admin role.

Add tools/lint-manager-authz: an AST guard that fails the build if any handler
registered on a /v1/manager route group lacks a role check. Group variables are
resolved function-locally so the admin group does not conflate with unrelated
/v1 or /v1/space groups that reuse names like `auth`. Non-admin routes that live
under /v1/manager by path (public login, user-scoped /v1/manager/secrets CRUD)
are allowlisted with reasons in allowlist.txt.

Wire it into CI as the manager-authz-lint job (runs the guard's unit tests, then
the guard over ./modules). Document the convention in CLAUDE.md.

octo-server#366 Part 1.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
@lml2468 lml2468 requested a review from a team as a code owner June 17, 2026 16:11
@github-actions github-actions Bot added the size/XL PR size: XL label Jun 17, 2026
@lml2468 lml2468 changed the title security(authz): CI source-guard requiring a role check on every /v1/manager route feat(security): CI source-guard requiring a role check on every /v1/manager route Jun 17, 2026
@lml2468

lml2468 commented Jun 17, 2026

Copy link
Copy Markdown
Contributor Author

CTO security review — APPROVED (Part 1)

(Posting as a comment — GitHub blocks a formal approval since the PR is attributed to this token. Treat this as the security-owner sign-off for the OCT-20 / #366 review gate.)

Independently verified on branch feat/oct20-manager-authz-guard (go1.26.4):

Verification (reproduced locally, not just trusting the PR):

  • go test ./tools/lint-manager-authz/... → pass (9 cases, incl. the unchecked-route acceptance fixture + the function-local-scoping regression).
  • go run ./tools/lint-manager-authz ./modulesOK: all 109 /v1/manager route(s) are role-checked or allowlisted (5 allowlist entries).
  • Fail-closed proof: injected an unchecked GET /v1/manager/backup/leak into modules/backup/api_manager.go → guard exits 1 and pinpoints the route; reverted → exits 0.
  • Allowlist audited (the one place a real hole could hide) — sound:
    • POST /v1/manager/login — registered on the no-middleware pre-auth group (modules/user/api_manager.go:63); role genuinely cannot be checked before login. Correct.
    • /v1/manager/secrets CRUD ×4 (modules/usersecret/api.go) — every handler (create/list/update/delete) scopes by owner := c.GetLoginUID(). User-scoped, not an admin surface. Correct, not over-broad.

Design check: function-local route-group resolution is the right call — the 9 manager modules all reuse auth/v names across Route() methods, so a package-global index would have flooded false positives. Fail-closed on unresolved handlers is the correct security default.

Lenses: blast-radius LOW (additive CI job + tool, no runtime/refactor, two-way door); security-by-default ✅; docs-as-interface ✅ (failure message + allowlist header + package doc + CLAUDE.md).

Scope correctly held to Part 1 — no central authz refactor (Part 2 stays with the SecurityEngineer, tracked on #366).

Merge gate — one non-code blocker remains

All code/security checks are green (Manager Authz Lint, Build, Vet, Lint, Test). The only red check is check-sprint, an org board-hygiene gate. I added Closes #366 to the PR body, which moved it past "no linked issue"; it now reports "Linked issue #366 has no Sprint set." #366 is not on the Octo Board (project #2).

Action needed (Maintainer): add #366 to the Octo Board with Sprint W25 (or add this PR to the board as the gate's PR-fallback). My token lacks read:project, so I can't do it from here.

On merge: once check-sprint is green, merge, and add Manager Authz Lint to required status checks so the guard is enforced on every future PR.

@mochashanyao mochashanyao left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Octo-Q · automated review]

Verdict: Approve — no blocking findings; notes below (data-flow traced).


octo-server PR#409 Review Report — security(authz): CI source-guard requiring a role check on every /v1/manager route

Reviewer: Octo-Q (automated review)
Head SHA: ee7e24b71ae6e950a477190180128c2bc39c674a
Routing: complexity=security_sensitive (automated review leg)


1. Verification Conclusions

Item Status Evidence
CI job mirrors personal-msgsendreq-lint pattern .github/workflows/ci.yml lines 309-338: identical needs, if, runs-on, timeout-minutes, pinned action SHAs
changes job dependency resolves changes job at ci.yml:42-78 exports code output; GO_VERSION at ci.yml:37 in workflow scope
Tool detects all 13 real /v1/manager groups Verified against 12 files in modules/: variable names (auth, manager, user, mgr) and receivers (r, l) all tracked by localGroupVars
Sub-prefix groups handled /v1/manager/workplace, /v1/manager/dashboard, /v1/manager/secrets matched by strings.HasPrefix(pd.basePath, managerPathPrefix) at main.go:243
Delegated role checks (helper wrappers) bodyHasRoleCheck recurses into same-package helpers up to maxHelperDepth=2; covers opanalytics.requireAdmin() pattern (6 call sites)
Allowlist entries valid All 5 entries verified against source: POST /v1/manager/login (user/api_manager.go:64, pre-auth), 4× /v1/manager/secrets (usersecret/api.go:99-102, owner-scoped CRUD)
Allowlist path resolves in CI os.Open("tools/lint-manager-authz/allowlist.txt") CWD-relative; CI runs from repo root after actions/checkout
Test suite adequate 9 tests covering: unchecked flagging, both role-check tiers, function-local scoping regression, subgroup composition, delegated check, allowlist suppression, role middleware, non-manager sibling exclusion
CLAUDE.md addition consistent Line 150 bullet aligns with tool doc comment, CI comment block, and allowlist header

2. Findings

F1 — P2: stringLit uses strings.Trim instead of strconv.Unquote (main.go:370-375)

Diff-scope: new (this PR)

strings.Trim(lit.Value, " + "" + "")strips all leading/trailing characters in the cutset, which is not equivalent to Go string unquoting. A double-quoted group path containing Unicode escapes (e.g.,"\u002fv1\u002fmanager") would resolve to /v1/manager` at runtime but fail to match in the tool.

Practical risk: Low. No current codebase usage of escape sequences in route paths. All routes use plain ASCII string literals.

Recommendation: Replace with strconv.Unquote(lit.Value) for correctness. One-line change.

F2 — P2: Group passed as function parameter is invisible (design limitation)

Diff-scope: new (this PR — design limitation, not a bug in current codebase)

If a Route() method creates a manager group and passes it to a helper function, the helper's parameter is not tracked by localGroupVars (which only resolves local := assignments from .Group() calls). Routes registered in the helper would be invisible.

func (m *Manager) Route(r R) {
    auth := r.Group("/v1/manager", AuthMiddleware())
    m.registerExtra(auth)  // group escapes local scope
}
func (m *Manager) registerExtra(auth *RouteGroup) {
    auth.POST("/dangerous", m.dangerous)  // invisible to the tool
}

Practical risk: Low. No current api_manager*.go file uses this pattern. All 13 manager groups are consumed within their declaring function.

Recommendation: Consider a warning when a local group variable appears as an argument to a function call (group escapes local scope).

F3 — P2: Variable shadowing could confuse first-assignment-wins tracking (design limitation)

Diff-scope: new (this PR — design limitation)

localGroupVars fixed-point loop uses first-assignment-wins (if _, done := groups[pd.name]; done { continue }). If a variable name is shadowed in an inner block with a different group path, the tool tracks the outer assignment.

Practical risk: Low. No current variable shadowing in any Route() method. The regression test (TestFunctionLocalGroupScoping) covers the cross-function case (same name auth in different methods), which is the real-world pattern.

F4 — P2: httpVerbs map lacks Gin's Match() method (main.go:72-74)

Diff-scope: new (this PR — omission)

Gin's IRoutes interface exposes Match(methods []string, path string, handlers ...HandlerFunc). Not in the map. A route registered via auth.Match([]string{"GET","POST"}, "/path", handler) would be invisible.

Practical risk: Very low. No .Match( usage found anywhere in modules/.

Recommendation: Add "Match": true to httpVerbs preemptively.

F5 — P2: Allowlist path is CWD-relative (main.go:56, 555)

Diff-scope: new (this PR introduces the first allowlist-bearing lint tool)

os.Open("tools/lint-manager-authz/allowlist.txt") resolves CWD-relative. Works in CI (runs from repo root). A developer running from a subdirectory silently loads an empty allowlist (os.IsNotExist returns empty map), which would produce false positives rather than false negatives (fail-closed).

Recommendation: Consider runtime.Caller(0) or os.Executable() resolution for robustness.

F6 — P2: collectPackages skips any directory named tools at any depth (main.go:165)

Diff-scope: new (this PR)

filepath.SkipDir on any directory named tools, vendor, or testdata. The tools skip is broader than needed (intended to exclude the project's own tools/ directory). If a module creates a tools/ subdirectory with Go files containing manager routes, those would be silently skipped.

Practical risk: Very low. No tools/ subdirectory exists under modules/.

3. Severity Assessment (R1-R4 Rubric)

R1 (P1 hard definition): No finding makes a currently-working path unavailable, produces user-visible incorrect data, or makes the system worse than before the PR. The tool is a CI-time safety net, not a runtime authorization mechanism. All bypass vectors are theoretical (no current codebase pattern exploits them). The tool correctly covers all 109 existing routes.

R2 (diff-scope): All findings are new (introduced by this PR as design limitations of the new tool). None are amplified or pre-existing.

R3 (distributed side-effects): N/A — this is a CI lint tool, not a runtime system with fan-out/observer/emitter patterns.

R4 (verdict mapping): All findings are P2 (design limitations and robustness improvements). No P0 or P1. → APPROVE.

4. Additional Findings (out of scope but noted)

  • Two existing routes use paths without leading slashes: modules/message/api_manager.go:55 ("message/sendfriends") and modules/user/api_manager.go:79 ("user/online"). Gin auto-prepends /, so these work, but they are inconsistent with every other route. Pre-existing, not introduced by this PR.

5. Data-Flow Trace

allowlist.txt (on disk)
  └─ os.Open(allowlistRelPath)  [main.go:555]
     └─ loadAllowlist() → map[string]bool keyed by "<VERB> <path>"  [main.go:553-579]
        └─ main() → pkg.collectViolations(allow)  [main.go:103]
           └─ for each route: allow[rt.sig()] short-circuits violation  [main.go:267]
              └─ rt.sig() = "<Method> <full path>"  [main.go:137]

Route detection pipeline:
  collectPackages(roots) → walk dirs, parse .go files, index funcsByName
    └─ pkg.managerRoutes() → per FuncDecl:
         localGroupVars(fn.Body) → resolve .Group() assignments to groupInfo
           └─ fixed-point loop resolves subgroups from parent groups
         ast.Inspect(fn.Body) → find verb calls on known group variables
           └─ route{file, line, method, path, handler, enclosingRecvType, groupHasRoleMW}
    └─ pkg.collectViolations(allow) → for each route:
         allow[rt.sig()]? → skip
         rt.groupHasRoleMiddleware? → skip
         handlerHasRoleCheck(handler, recvType)? → skip (else violation)
           └─ FuncLit → bodyHasRoleCheck(body)
           └─ Ident/SelectorExpr → namedFuncHasRoleCheck(name, recvType, depth=2)
              └─ bodyHasRoleCheck → ast.Inspect for roleCheckNames selector calls
                 └─ recurse into helper calls if depth > 0

6. R5 Blindspot Checklist (security_sensitive — mandatory)

  • C1 (dual-path parity): N/A — this PR adds a CI guard, not a symmetric add/remove or create/delete path. No paired operations to verify.
  • C2 (control-flow ordering / nested reuse): N/A — the lint tool is a leaf CLI command. Its internals are not reused by other code paths. No nested composite calls or safety-control bypass risk.
  • C3 (authorization boundary ≠ capability boundary): N/A — the tool enforces a coding convention on other code but does not itself expose a runtime capability or endpoint. No authorization boundary is modified.
  • C4 (authorization lifecycle / container-member cascade): N/A — no runtime authorization path is modified. The tool is purely static analysis at CI time.

7. Cross-Round Blocker Recheck (R6)

N/A — this is the first review of this PR.


[Octo-Q] verdict: APPROVE — All findings are P2 (design limitations and robustness improvements for follow-up). No P0/P1. The tool correctly covers all 109 existing /v1/manager routes, the CI integration is a faithful mirror of the existing personal-msgsendreq-lint job, the allowlist is accurate and minimal, and the test suite covers the key patterns including the function-local scoping regression. The six P2 notes are recommended improvements for Part 2, not landing blockers.

@yujiawei yujiawei left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review — PR #409 (octo-server)

Verdict: CHANGES_REQUESTED (one P1 blocker on an idiomatic registration path; see below). Reviewed at head ee7e24b7, merge-base fb4763d0, against main. Diff is 5 files / +884, exactly as described.

What's good

  • The guard does what it claims for the current surface: go run ./tools/lint-manager-authz ./modules is green and counts all 109 /v1/manager routes; go test ./tools/lint-manager-authz/... passes (9 cases). I re-ran both locally.
  • The allowlist is justified by the code. With an empty allowlist exactly 5 routes flag — POST /v1/manager/login (public, pre-auth) and the 4 /v1/manager/secrets CRUD ops. I verified all 4 secret handlers scope every operation by owner = c.GetLoginUID() (modules/usersecret/api.go create/list/update/delete), so allowlisting them is correct, not a hidden IDOR.
  • Function-local group resolution is the right call and the regression test (TestFunctionLocalGroupScoping) locks in the real false-positive trap (reused auth var across Route() methods).
  • Honest scoping: this is a thin, reversible net, no authz refactor. Good.

P1 (blocking) — root-level route registrations are silently invisible to the guard

tools/lint-manager-authz/main.go:333-383 (managerRoutes) only inspects functions that contain a local .Group(...) assignment and only matches verb calls whose receiver resolves to that group variable. A route registered directly on the root router is never seen:

r.GET("/v1/manager/export_all_users", m.dump)   // no role check -> guard stays GREEN

This is not an exotic shape — it is idiomatic in this repo. Root-level r.GET/POST/Any("/v1/...") is used in modules/webhook/api.go, modules/usersecret/api.go:113, modules/app_bot/app_bot.go:139, modules/bot_api/bot_api.go:204, modules/base/app/api.go, modules/botfather/api.go. In the pinned octo-lib (f62e626a34db) WKHttp exposes GET/POST/Any at the root, so an admin route added this way compiles, serves, and passes CI with no role check.

Why this blocks: the PR also adds to CLAUDE.md that the guard "fails the build on any /v1/manager route whose handler lacks a check." That promise is not kept for root-level registration. A future engineer trusts the doc, registers an admin route on r directly (or via r.Any), and ships a privilege-escalation hole behind a green security gate — worse than no guard, because the green check manufactures false confidence. For a security control, a documented guarantee that the code does not honor on an idiomatic path is a correctness defect, not a nit.

Cheap fix: also walk top-level r.VERB("/v1/manager...", handler) calls (string-literal path starting with /v1/manager), not just group-relative ones. Any is already in httpVerbs, so include it in the root scan too.

P1 / P2 — two more real false negatives (fix or document as known gaps)

  • [P1] Group passed to a helper (main.go:341-343). localGroupVars is computed per-function, so auth := r.Group("/v1/manager", …); m.register(auth) where register does auth.GET(...) registers routes the guard never inspects (helper has no local Group). No manager route uses this today, but it is a normal Go pattern and nothing prevents it. At minimum the walk should follow a manager group passed as an argument, or fail-closed when a /v1/manager group escapes into a call.

  • [P2] Presence-not-enforcement (main.go:438-478). A handler passes as soon as the identifier CheckLoginRole/CheckLoginRoleIsSuperAdmin appears anywhere in its AST. An ignored return value or a late check passes:

    c.CheckLoginRoleIsSuperAdmin()   // err discarded
    doAdminWork()                    // runs regardless -> guard GREEN

    All current handlers act on the error (verified), so this is advisory, but it is the difference between "a check is present" and "a check gates execution." Worth a comment in the tool doc noting the guard verifies presence, not control flow.

Refuted finding (so it isn't chased)

A cross-tool pass flagged v1 := r.Group("/v1"); mgr := v1.Group("/manager") composition as a dropped path. Not reachable in this framework: octo-lib RouterGroup exposes only GET/POST/PUT/DELETE and has no Group method, so subgroups cannot be nested off a group — every group is created directly from WKHttp.Group. The guard's parent-composition logic and TestManagerSubgroupInScope exercise a shape the real router type cannot produce. Harmless, but it is dead coverage, not protection.

Coverage notes (manual verification beyond the diff)

  • Confirmed all 14 manager-route files use the single in-body x := r.Group("/v1/manager…") pattern the guard handles — that is why the current 109-route inventory is fully covered. The guard's correctness today rests entirely on that convention holding.
  • httpVerbs lists PATCH/HEAD/OPTIONS/Handle that the pinned router type does not expose. Harmless, but the verb set is not aligned to the actual framework surface.
  • Not re-verified here (flagging for human attention on a security PR): whether CheckLoginRole/CheckLoginRoleIsSuperAdmin in octo-lib are themselves correct and revocation-aware — the guard trusts these primitives by name only.

Path to approval

The blocker is small relative to the guard's value. Either: (a) extend the AST walk to root-level r.VERB("/v1/manager…") registrations (closes P1 #1), or (b) if you want to keep scope truly minimal, soften the CLAUDE.md / tool-doc wording so it states the guard's actual coverage and lists the unsupported patterns (root-level, group-passed-to-helper, ignored-error) as known gaps with a tracking issue. (a) is preferred — it's the idiomatic gap. Fixing the helper-passing case (P1 #2) or fail-closing on it would make the net trustworthy enough to rely on.

@yujiawei

Copy link
Copy Markdown
Contributor

Two small additions to the review above, both in the same "the walk should match what the framework actually allows" spirit as the blocker:

  • stringLit should use strconv.Unquote (main.go:368-375). strings.Trim(lit.Value, """)strips a cutset, not Go quoting. Plain ASCII paths are fine today, so this is a correctness nit, not a live gap — but for a guard whose whole job is exact path matching,strconv.Unquote` is the right primitive.

  • Add Match to httpVerbs (main.go:79-82) if you keep the verb-allowlist approach. The router can register a route via Match([]string{...}, path, h); it's unused today, but it's the same shape of future blind spot as the root-level/helper cases — a real registration verb the walk doesn't recognize.

Neither changes the verdict on its own; folding them in while you address the root-level registration gap keeps the guard's coverage aligned with the actual router surface.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

authz: centralized authorization layer + CI guard requiring a role check on every /v1/manager route

3 participants