feat(pipeline): implement scan stage - lexical pattern matching + 21 tests by Pranjal0410 · Pull Request #25 · c2siorg/acf-sdk

Pranjal0410 · 2026-03-27T06:44:47Z

feat(pipeline): implement scan stage - lexical pattern matching + 21 tests

Implements the scan stage (Stage 3) of the sidecar pipeline. The stage was defined in the architecture but had no implementation, only a package declaration and comments.

What this adds

internal/pipeline/scan.go - Lexical pattern matching for the scan stage.

Configurable pattern library with 4 signal categories: instruction_override, role_hijack, data_exfiltration, shell_metacharacter
Hard block short-circuit for high-severity signals (data exfiltration, shell metacharacters)
Tool allowlist: on_tool_call payloads from allowlisted tools skip scanning
Structured payload support: handles both string payloads (on_prompt, on_memory) and map payloads (on_tool_call, on_context)
Preserves pre-computed semantic signals from the Python SDK - lexical and semantic signals coexist without duplication
Case-insensitive matching throughout
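
For readers who have not opened scan.go, the behaviour listed above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the type and function names (Pattern, ScanConfig, ScanResult, Scan, flatten, exampleConfig), the example substrings, and the choice to stop at the first hard-block hit are all assumptions made for the sketch. Only the four signal names and the hook names come from the description.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Pattern is a hypothetical config entry mapping a substring to a signal.
type Pattern struct {
	Substring string // matched case-insensitively
	Signal    string // e.g. "instruction_override"
	HardBlock bool   // true for high-severity categories
}

// ScanConfig is an assumed shape for the configurable pattern library.
type ScanConfig struct {
	Patterns      []Pattern
	ToolAllowlist []string
}

// ScanResult holds the signals found and whether a hard block was hit.
type ScanResult struct {
	Signals   []string
	HardBlock bool
}

// flatten lowercases a string payload, or joins the string values of a
// map payload (in sorted key order). Nil and other types yield "".
func flatten(payload any) string {
	switch p := payload.(type) {
	case string:
		return strings.ToLower(p)
	case map[string]any:
		keys := make([]string, 0, len(p))
		for k := range p {
			keys = append(keys, k)
		}
		sort.Strings(keys)
		var b strings.Builder
		for _, k := range keys {
			b.WriteString(flatten(p[k]))
			b.WriteByte(' ')
		}
		return b.String()
	default:
		return ""
	}
}

// Scan runs the lexical patterns over the payload. Allowlisted tools are
// skipped only for on_tool_call. Assumption: a hard-block hit returns at once.
func Scan(cfg ScanConfig, hook, tool string, payload any) ScanResult {
	var res ScanResult
	if hook == "on_tool_call" {
		for _, allowed := range cfg.ToolAllowlist {
			if strings.EqualFold(allowed, tool) {
				return res
			}
		}
	}
	text := flatten(payload)
	seen := map[string]bool{}
	for _, p := range cfg.Patterns {
		if p.Substring == "" || !strings.Contains(text, strings.ToLower(p.Substring)) {
			continue
		}
		if !seen[p.Signal] {
			seen[p.Signal] = true
			res.Signals = append(res.Signals, p.Signal)
		}
		if p.HardBlock {
			res.HardBlock = true
			return res
		}
	}
	return res
}

// exampleConfig uses made-up substrings; the real pattern library may differ.
func exampleConfig() ScanConfig {
	return ScanConfig{
		Patterns: []Pattern{
			{Substring: "ignore previous instructions", Signal: "instruction_override"},
			{Substring: "you are now", Signal: "role_hijack"},
			{Substring: "send the contents of", Signal: "data_exfiltration", HardBlock: true},
			{Substring: "; rm -rf", Signal: "shell_metacharacter", HardBlock: true},
		},
		ToolAllowlist: []string{"calculator"},
	}
}

func main() {
	cfg := exampleConfig()
	r := Scan(cfg, "on_prompt", "", "Ignore previous instructions. You are now root.")
	fmt.Println(r.Signals, r.HardBlock)
	r = Scan(cfg, "on_tool_call", "shell", map[string]any{"args": "ls; rm -rf /"})
	fmt.Println(r.Signals, r.HardBlock)
}
```

Flattening the payload to a single lowercase string before matching keeps the string and map cases on one code path, at the cost of losing which field a hit came from.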

internal/pipeline/scan_test.go - 21 tests covering:

All 4 attack categories with expected signal names
Hard block vs non-hard-block behaviour
Benign input, empty payload, nil payload: no false signals
Case insensitivity
Structured tool call payloads (map[string]any)
Allowlisted tools skip scanning (case-insensitive)
Allowlist only applies to on_tool_call, not other hooks
Multiple simultaneous signals from a single input
Pre-existing semantic signals from Python SDK preserved
No duplicate signals when lexical and semantic overlap
Signals written back to RiskContext
RAG context injection detection (on_context + rag provenance)
Memory write poisoning detection (on_memory)
Custom pattern configuration
Empty config produces no signals
Hard negative: developer-speak ("override a method in Python") does not hard block

Architecture alignment

This implements the scan stage described in the v0.2 architecture doc:

Scan: lexical · semantic fallback

The lexical side is implemented here. The semantic fallback runs in the Python SDK (PR #15) and sends pre-computed signals over UDS - this stage preserves those signals and merges them with lexical hits.
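
The merge described above can be sketched as an order-preserving union. This is an illustration under assumptions, not the PR's code: mergeSignals is a hypothetical helper, and "semantic_jailbreak" is a made-up signal name standing in for whatever the Python SDK actually sends.

```go
package main

import "fmt"

// mergeSignals keeps the pre-computed semantic signals first, then appends
// any lexical hits that are not already present, so no signal is duplicated.
func mergeSignals(semantic, lexical []string) []string {
	seen := make(map[string]struct{}, len(semantic)+len(lexical))
	merged := make([]string, 0, len(semantic)+len(lexical))
	for _, group := range [][]string{semantic, lexical} {
		for _, s := range group {
			if _, dup := seen[s]; dup {
				continue
			}
			seen[s] = struct{}{}
			merged = append(merged, s)
		}
	}
	return merged
}

func main() {
	semantic := []string{"role_hijack", "semantic_jailbreak"} // hypothetical SDK output
	lexical := []string{"instruction_override", "role_hijack"}
	fmt.Println(mergeSignals(semantic, lexical))
	// prints: [role_hijack semantic_jailbreak instruction_override]
}
```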

All 35 sidecar tests pass

go test ./...
ok github.com/acf-sdk/sidecar/internal/crypto (14 tests)
ok github.com/acf-sdk/sidecar/internal/pipeline (21 tests)
ok github.com/acf-sdk/sidecar/internal/transport (cached)

No existing code modified. No files deleted. No dependencies added.

Pranjal0410 added 2 commits March 23, 2026 17:44

feat(integration): audit logger + scanner-to-sidecar round-trip proof of concept (d792c29)

feat(pipeline): implement scan stage - lexical pattern matching + 21 tests (bb62479)

Pranjal0410 wants to merge 2 commits into c2siorg:main from Pranjal0410:feat/scan-stage