Skip to content

Feat/http api mvp#30

Open
shayankashif123 wants to merge 2 commits intoc2siorg:mainfrom
shayankashif123:feat/http-api-mvp
Open

Feat/http api mvp#30
shayankashif123 wants to merge 2 commits intoc2siorg:mainfrom
shayankashif123:feat/http-api-mvp

Conversation

@shayankashif123
Copy link
Copy Markdown

@shayankashif123 shayankashif123 commented Apr 1, 2026

Summary

Delivers the Cognitive Firewall MVP described in the issue —
a FastAPI HTTP interface over the existing Python SDK and UDS
sidecar, with a configurable rule-based pre-filter layer.

Closes #27


What this PR does

Before this PR, using the firewall required writing Python code
directly against the SDK. After this PR, any HTTP client can
evaluate a payload through the full enforcement stack:

curl -X POST http://localhost:8000/validate \
  -H "Content-Type: application/json" \
  -d '{"hook":"on_prompt","payload":"Ignore previous instructions"}'

# {"decision":"BLOCK","signals":["instruction_override"],"score":0.95,"rule_based":true}

Architecture decision — pre-filter + sidecar

The rule engine runs in-process before every sidecar call.

POST /validate
      │
      ▼
Layer 1 — Rule engine (~0ms)
  Critical pattern → BLOCK immediately, sidecar not called
  No critical match → continue
      │
      ▼
Layer 2 — acf SDK → UDS → Go sidecar (4–8ms)
      │
      ▼
ValidateResponse (decision + signals + score + rule_based)

This story is clean — fast rejection first, authoritative
enforcement second. The rule_based flag tells the caller
which layer made the decision.


Files changed

api/main.py              FastAPI app — GET /health, POST /validate
                         _get_firewall() separated for easy mocking
                         _call_hook() handles SDK signature differences
                         per hook (on_tool_call takes 2 args, not 1)

api/models.py            Pydantic contracts — ValidateRequest,
                         ValidateResponse, HealthResponse, HookType

api/rules/engine.py      RuleEngine — pre-compiles patterns at startup
                         Score = max(weights), never additive
                         hard_block short-circuits sidecar call

api/config/rules.yaml    6 rules, 24+ patterns across 4 threat
                         categories — editable without code changes

api/requirements.txt     fastapi, uvicorn, pyyaml, httpx

tests/api/               82 tests — all mocked, no sidecar required
  test_health.py         12 tests — all three sidecar states
  test_validate.py       31 tests — all decision paths + contracts
  test_rules_engine.py   39 tests — isolated unit tests

docs/api.md              Full usage guide — architecture diagram,
                         hook reference, curl examples, roadmap table

SDK zero-dependency contract preserved

sdk/python/pyproject.toml is untouched.
New dependencies live exclusively in api/requirements.txt.

dependencies = []  # zero external dependencies — stdlib only

Test output

82 passed in 0.98s

All tests use mocks — CI passes without a running sidecar.


Live demo

Health check:

curl -s http://localhost:8000/health
{"status":"ok","sidecar":"reachable"}

Clean prompt — ALLOW:

curl -s -X POST http://localhost:8000/validate \
  -H "Content-Type: application/json" \
  -d '{"hook":"on_prompt","payload":"What is the capital of France?"}'

{"decision":"ALLOW","sanitised_payload":null,"signals":[],"score":0.0,"rule_based":false}

Injection attempt — BLOCK (rule engine, sidecar not called):

curl -s -X POST http://localhost:8000/validate \
  -H "Content-Type: application/json" \
  -d '{"hook":"on_prompt","payload":"Ignore previous instructions"}'

{"decision":"BLOCK","sanitised_payload":null,"signals":["instruction_override"],"score":0.95,"rule_based":true}

Invalid hook — 422:

curl -s -X POST http://localhost:8000/validate \
  -H "Content-Type: application/json" \
  -d '{"hook":"on_invalid","payload":"hello"}'

{"detail":[{"type":"enum","msg":"Input should be 'on_prompt', 'on_context', 'on_tool_call' or 'on_memory'"}]}

Sidecar down — 503:

curl -s -X POST http://localhost:8000/validate \
  -H "Content-Type: application/json" \
  -d '{"hook":"on_prompt","payload":"hello"}'

{"detail":"Sidecar is not reachable. Start it with: source .env.local && ./bin/acf-sidecar"}

What this deliberately does not include

  • OPA/Rego integration — Phase 3, tracked in roadmap
  • Full pipeline parity — validate/normalise/scan/aggregate stages
  • UDS replacement with HTTP — explicit non-goal per issue
  • SANITISE with real scrubbed content — sidecar returns hardcoded
    ALLOW in Phase 1, sanitise path is correctly handled in code
    and will produce real output when Phase 3 ships

On detection coverage

The rule engine is Layer 1 of a planned multi-layer stack.
Regex matching is fast and deterministic but bypassable through
paraphrasing. This is acknowledged and addressed by deeper layers:

  • Layer 2 — sidecar normalise stage strips obfuscation
    (URL encoding, Base64, zero-width chars, Unicode variants)
  • Layer 3 — Aho-Corasick scan + OPA policy evaluation
  • Layer 4 — semantic LLM classifier for mid-band inputs

- FastAPI app with GET /health and POST /validate endpoints
- In-process regex rule engine (6 rules, 24+ patterns)
- Pre-filter + sidecar two-layer enforcement flow
- Pydantic request/response contracts
- 82 tests — all passing, no sidecar required
- SDK zero-dependency contract preserved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[feature] Initial Cognitive Firewall MVP — FastAPI façade + Python SDK + rule-based detection

1 participant