CLAUDE.md — Wairz Codebase Guide

This file is for AI agents (Claude Code, etc.) working on the Wairz codebase. It describes the architecture, conventions, and patterns you need to follow when making changes.

What is Wairz? An open-source, browser-based firmware reverse engineering and security assessment platform. Users upload firmware; the tool unpacks it and provides a unified interface for filesystem exploration, binary analysis, emulation, fuzzing, and security assessment, augmented by an AI assistant connected via MCP (Model Context Protocol). See README.md for user-facing documentation.

Architecture Overview

Claude Code / Claude Desktop
        │
        │ MCP (stdio)
        ▼
┌─────────────────┐     ┌──────────────────────────────────┐
│   wairz-mcp     │────▶│         FastAPI Backend           │
│  (MCP server)   │     │                                    │
│  90+ tools      │     │  Services: firmware, file,         │
│                 │     │  analysis, emulation, fuzzing,     │
│  Entry point:   │     │  sbom, uart, finding, export...    │
│  wairz-mcp CLI  │     │                                    │
└─────────────────┘     │  Ghidra headless · QEMU · AFL++    │
                        └──────────┬───────────────────────┘
                                   │
┌──────────────┐    ┌──────────────┼──────────────┐
│   React SPA  │───▶│  PostgreSQL  │  Redis       │
│  (Frontend)  │    │              │              │
└──────────────┘    └──────────────┴──────────────┘

Host machine (optional):
  wairz-uart-bridge.py ←─ TCP:9999 ─→ Docker backend

Frontend: React 19 + Vite + TypeScript, shadcn/ui + Tailwind, Monaco Editor, ReactFlow, xterm.js, Zustand
Backend: Python 3.12 + FastAPI (async), SQLAlchemy 2.0 (async) + Alembic, pydantic-settings
MCP Server: wairz-mcp CLI entry point (app.mcp_server:main), stdio transport, 90+ tools
Database: PostgreSQL 16 (JSONB for analysis cache)
Containers: Docker Compose — backend, postgres, redis, emulation (QEMU), fuzzing (AFL++)

Directory Structure

wairz/
├── backend/
│   ├── pyproject.toml           # Entry point: wairz-mcp
│   ├── alembic/versions/        # Database migrations (auto-run on container start)
│   └── app/
│       ├── main.py              # FastAPI app + router registration
│       ├── config.py            # Settings via pydantic-settings
│       ├── database.py          # Async engine, session factory, get_db dependency
│       ├── mcp_server.py        # MCP server with dynamic project switching
│       ├── models/              # SQLAlchemy ORM models
│       ├── schemas/             # Pydantic request/response schemas
│       ├── routers/             # FastAPI REST endpoint routers
│       ├── services/            # Business logic layer
│       ├── workers/             # Background tasks (firmware unpacking)
│       ├── ai/
│       │   ├── __init__.py      # Tool registry factory — registers all tool categories
│       │   ├── tool_registry.py # ToolContext + ToolRegistry framework
│       │   ├── system_prompt.py # MCP system prompt for Claude
│       │   └── tools/           # Tool handlers by category
│       └── utils/
│           ├── sandbox.py       # Path traversal prevention (CRITICAL)
│           └── truncation.py    # Output truncation (30KB max)
├── frontend/
│   └── src/
│       ├── pages/               # Route pages, registered in App.tsx
│       ├── components/          # UI components organized by feature
│       ├── api/                 # Axios API client functions
│       ├── stores/              # Zustand state management
│       └── types/               # TypeScript type definitions
├── ghidra/
│   ├── Dockerfile
│   └── scripts/                 # Custom Java analysis scripts for headless Ghidra
├── emulation/
│   ├── Dockerfile               # QEMU + kernels (ARM, MIPS, MIPSel, AArch64)
│   └── scripts/                 # start-user-mode.sh, start-system-mode.sh, serial-exec.sh
├── fuzzing/
│   └── Dockerfile               # AFL++ with QEMU mode
├── harness-build/
│   ├── Dockerfile               # Bootlin old-glibc cross toolchains (armhf/armel/aarch64/mips/mipsel)
│   └── build-harness.sh         # Cross-compile a harness linked against a firmware .so
└── scripts/
    └── wairz-uart-bridge.py     # Host-side serial bridge (standalone, pyserial only)

How to Add Things

Adding a New MCP Tool

Create or edit a handler in backend/app/ai/tools/<category>.py:

async def _handle_my_tool(input: dict, context: ToolContext) -> str:
    # Available on context: project_id, firmware_id, extracted_path,
    # storage_path (for RTOS / blob-only firmware), extraction_dir,
    # carved_path, db
    path = context.resolve_path(input.get("path", "/"))  # validates against sandbox
    # ... do work ...
    return "result string (max 30KB, truncated automatically)"

registry.register(name="my_tool", description="...", input_schema={...}, handler=_handle_my_tool)

Tag kind-specific tools via applies_to. It defaults to ALL_KINDS, i.e. ("linux", "rtos", "unknown"); set it only when the tool requires something one kind has and another lacks:
```
registry.register(
    name="enumerate_rtos_tasks",
    description="...", input_schema={...}, handler=_handle_enumerate_rtos_tasks,
    applies_to=("rtos",),   # hidden from list_tools when active project is Linux
)
```
The MCP server filters list_tools by the active firmware's kind and rejects mismatched call_tool invocations as defense in depth. switch_project emits notifications/tools/list_changed so clients re-fetch automatically.
If it's a new category file, import and call register_<category>_tools(registry) in backend/app/ai/__init__.py.
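
The kind filter described above can be pictured with a small stand-in registry. This is a minimal sketch, not the actual ToolRegistry in tool_registry.py: MiniRegistry, ToolSpec, names_for_kind, and call are hypothetical names, and only applies_to and ALL_KINDS come from this guide.

```python
from dataclasses import dataclass
from typing import Callable

ALL_KINDS = ("linux", "rtos", "unknown")


@dataclass(frozen=True)
class ToolSpec:
    name: str
    handler: Callable[[dict], str]
    applies_to: tuple = ALL_KINDS


class MiniRegistry:
    """Hypothetical stand-in for ToolRegistry, reduced to kind filtering."""

    def __init__(self):
        self._tools = {}

    def register(self, name, handler, applies_to=ALL_KINDS):
        self._tools[name] = ToolSpec(name, handler, tuple(applies_to))

    def names_for_kind(self, kind):
        # What a list_tools call would expose for the active firmware kind.
        return sorted(
            name for name, spec in self._tools.items() if kind in spec.applies_to
        )

    def call(self, name, kind, args):
        # Defense in depth: reject a call even if the client bypassed list_tools.
        spec = self._tools[name]
        if kind not in spec.applies_to:
            raise PermissionError(f"{name} does not apply to {kind} firmware")
        return spec.handler(args)
```

Filtering at list time keeps irrelevant tools out of the agent's view, while the check at call time covers clients that cached an older tool list.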

Adding a New REST Endpoint

Create router: backend/app/routers/<name>.py

router = APIRouter(prefix="/api/v1/projects/{project_id}/<name>", tags=["<name>"])

Register in backend/app/main.py: app.include_router(<name>.router)
Create Pydantic schemas in backend/app/schemas/<name>.py (use from_attributes=True for ORM compatibility)
Create service in backend/app/services/<name>_service.py

Adding a Database Table

Create model in backend/app/models/<name>.py:
- Use SQLAlchemy Mapped/mapped_column style
- UUID primary key with dual defaults: default=uuid.uuid4 + server_default=func.gen_random_uuid()
- Foreign keys, with cascade="all, delete-orphan" set on the corresponding relationship() (cascade is a relationship option, not a ForeignKey option)
Create Alembic migration: alembic revision --autogenerate -m "description"
Migrations run automatically on container startup

Adding a Frontend Page

Create page component in frontend/src/pages/<Name>Page.tsx
Register route in frontend/src/App.tsx
Create API client functions in frontend/src/api/<name>.ts
Use Zustand stores (frontend/src/stores/) for shared state
UI components from shadcn/ui + Tailwind

Critical Rules

Security

Path traversal prevention is mandatory. Every file access must be validated via app/utils/sandbox.py (os.path.realpath() + prefix check against the extracted root). The MCP ToolContext.resolve_path() method handles this — always use it.
Never execute firmware binaries on the host. Emulation runs inside an isolated QEMU Docker container. Fuzzing runs inside an isolated AFL++ Docker container. Both have resource limits (memory, CPU).
No LLM API keys are stored in the backend. The Anthropic API key is user-provided via their Claude Code/Desktop configuration and never touches Wairz.
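
The realpath-plus-prefix check named in the first rule can be sketched in a few lines. This illustrates the technique only and is not the contents of app/utils/sandbox.py: resolve_in_sandbox and PathTraversalError are hypothetical names. In tool handlers, keep using context.resolve_path().

```python
import os


class PathTraversalError(ValueError):
    """Raised when a requested path resolves outside the extracted root."""


def resolve_in_sandbox(extracted_root: str, requested: str) -> str:
    # Resolve the root first so symlinked temp or storage dirs compare correctly.
    root = os.path.realpath(extracted_root)
    # Treat the request as relative to the root, even if it starts with "/".
    candidate = os.path.realpath(os.path.join(root, requested.lstrip("/")))
    # Compare against root + separator so "/data/fw2" cannot pass for "/data/fw".
    if candidate != root and not candidate.startswith(root + os.sep):
        raise PathTraversalError(f"{requested!r} escapes the firmware root")
    return candidate
```

Because realpath() follows symlinks before the comparison, a symlink inside the firmware that points at the host filesystem is rejected in the same way as a literal "../" sequence.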

Performance

Cache Ghidra decompilations — each run takes 30-120s. Cached by binary hash + function name in the analysis_cache table.
Cache radare2 analysis — aaa can take 10-30s. LRU session caching in the analysis service.
Lazy-load the file tree — firmware can have 10K+ files. Load children on expand, never the full tree at once.
Truncate MCP tool outputs — keep under 30KB (app/utils/truncation.py). Large outputs break MCP clients.
Firmware unpacking is non-blocking: the unpack endpoint returns 202 and hands the work to asyncio.create_task(). The frontend polls every 2s until the status is no longer "unpacking".
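
As an illustration of the output-truncation rule, here is a minimal sketch of byte-budgeted truncation. It is an assumption about the approach, not the code in app/utils/truncation.py; truncate_output and the wording of the notice are hypothetical.

```python
def truncate_output(text: str, max_kb: int = 30) -> str:
    """Return text unchanged if it fits, else cut it to at most max_kb KiB."""
    limit = max_kb * 1024
    raw = text.encode("utf-8")
    if len(raw) <= limit:
        return text
    notice = f"\n... [truncated: {len(raw)} bytes total, limit {limit}]"
    # Reserve room for the notice so the final string still fits the limit.
    budget = limit - len(notice.encode("utf-8"))
    # errors="ignore" drops a multi-byte character cut in half at the boundary.
    head = raw[:budget].decode("utf-8", errors="ignore")
    return head + notice
```

Measuring in encoded bytes rather than characters matters for firmware strings, which often contain multi-byte sequences.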

Conventions

Backend: Async everywhere (SQLAlchemy async sessions, asyncio.create_subprocess_exec for subprocesses). Use async_session_factory from database.py for DB access outside request context (e.g., background tasks).
Frontend: Zustand for state, API functions in src/api/, pages poll with useEffect + setInterval for long-running operations (see EmulationPage, FuzzingPage, ProjectDetailPage for the pattern).
Docker: Backend has access to Docker socket for managing emulation/fuzzing containers. Emulation containers run on an internal emulation_net network.
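
The async subprocess convention can be sketched as follows. This is a minimal example under stated assumptions: run_tool is a hypothetical helper, and the 120-second default simply mirrors the GHIDRA_TIMEOUT default listed under Environment Variables.

```python
import asyncio


async def run_tool(*argv: str, timeout: float = 120.0):
    """Run an external command without blocking the event loop."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,  # merge stderr into stdout
    )
    try:
        out, _ = await asyncio.wait_for(proc.communicate(), timeout=timeout)
    except asyncio.TimeoutError:
        # Do not leave a runaway analysis process behind.
        proc.kill()
        await proc.wait()
        raise
    return proc.returncode, out.decode("utf-8", errors="replace")
```

Passing the arguments as separate strings (exec style, no shell) avoids shell injection through firmware-derived file names.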

MCP Server

Entry point: wairz-mcp = "app.mcp_server:main" (defined in pyproject.toml)

The server uses a mutable ProjectState dataclass so all project context (project_id, firmware_id, extracted_path, storage_path, firmware_kind, rtos_flavor) can be switched dynamically via the switch_project tool without restarting the MCP process. When the firmware kind changes, the server emits notifications/tools/list_changed so clients re-fetch the visible tool set.

--project-id is optional. When omitted, the server boots with an empty ProjectState (project_id = zero UUID, firmware_kind="unknown"). In that state the firmware-kind filter hides every analysis tool, so only list_projects, switch_project, get_project_info, and list_firmware_versions remain callable, and build_system_prompt returns a short directive telling the agent to pick a project before doing anything else. This mode is intended for shared/team Wairz instances where multiple users connect to one server and each switches between projects independently with switch_project.
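
A mutable state object of this shape might look like the sketch below. The field names come from this guide; the field types, the defaults other than the zero UUID and "unknown", and the switch() helper are assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass
from typing import Optional

ZERO_UUID = uuid.UUID(int=0)


@dataclass
class ProjectState:
    project_id: uuid.UUID = ZERO_UUID
    firmware_id: Optional[uuid.UUID] = None
    extracted_path: Optional[str] = None
    storage_path: Optional[str] = None
    firmware_kind: str = "unknown"
    rtos_flavor: Optional[str] = None

    def switch(self, **new_values) -> bool:
        """Mutate in place; return True when the firmware kind changed."""
        old_kind = self.firmware_kind
        for key, value in new_values.items():
            if not hasattr(self, key):
                raise AttributeError(f"unknown ProjectState field: {key}")
            setattr(self, key, value)
        # A True result is the cue to emit notifications/tools/list_changed.
        return self.firmware_kind != old_kind
```

Mutating one shared instance, rather than replacing it, means every handler holding a reference sees the new project without the MCP process restarting.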

Firmware kind discriminator

Every firmware row carries firmware_kind (linux | rtos | unknown) plus an optional rtos_flavor (freertos | zephyr | baremetal-cortexm) and firmware_kind_source (detected | manual). Auto-detection runs in app/services/rtos_detection_service.py at the tail of unpack and only writes when firmware_kind_source != 'manual' — the dropdown override on the project page always wins. Kind plumbs through to the MCP system prompt (kind-aware blocks in app/ai/system_prompt.py), the tool registry filter (registry.for_kind(kind)), and the frontend (Project.firmware_kind from the projects-list endpoint, used by Sidebar to filter analysis tabs).
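
The "manual always wins" rule reduces to a single guard. The sketch below is hypothetical: FirmwareRow stands in for the ORM model, apply_detection is not a function in rtos_detection_service.py, and the "detected" default for firmware_kind_source is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FirmwareRow:
    firmware_kind: str = "unknown"
    rtos_flavor: Optional[str] = None
    firmware_kind_source: str = "detected"


def apply_detection(
    row: FirmwareRow, detected_kind: str, detected_flavor: Optional[str] = None
) -> bool:
    """Write auto-detected values unless a user override is in place."""
    if row.firmware_kind_source == "manual":
        return False  # the override from the project page is kept
    row.firmware_kind = detected_kind
    row.rtos_flavor = detected_flavor
    row.firmware_kind_source = "detected"
    return True
```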

Path resolution for RTOS firmware

RTOS projects have no extracted_path (no rootfs to mount). FileService recognises this "blob-only" mode and exposes the firmware blob via:

/firmware/<basename> — the canonical virtual path
/<basename> and bare <basename> — also resolve, for forgiving callers

Tools that take a binary_path / path argument and call context.resolve_path() work transparently across Linux and RTOS; context.storage_path is the underlying real path when an RTOS-specific tool needs to bypass the virtual layer (e.g. enumerate_rtos_tasks uses pyelftools directly on it).
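
The three accepted spellings of the blob path can be summarized in code. This is a sketch of the mapping only: resolve_blob_path is a hypothetical name, and the actual FileService may normalize paths differently.

```python
import os


def resolve_blob_path(virtual_path: str, storage_path: str) -> str:
    """Map a virtual path onto the single firmware blob of an RTOS project."""
    basename = os.path.basename(storage_path)
    accepted = {
        f"/firmware/{basename}",  # canonical virtual path
        f"/{basename}",           # forgiving: rooted basename
        basename,                 # forgiving: bare basename
    }
    if virtual_path not in accepted:
        raise FileNotFoundError(virtual_path)
    return storage_path
```

For a blob stored at /data/fw/router.bin (an example path), "/firmware/router.bin", "/router.bin", and "router.bin" all resolve to the stored file, and anything else is rejected.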

Tool Categories (90+)

| Category | File | Tools |
|---|---|---|
| Project | `tools/filesystem.py` | `get_project_info`, `switch_project`, `list_projects` |
| Filesystem | `tools/filesystem.py` | `list_directory`, `read_file`, `search_files`, `file_info`, `find_files_by_type`, `get_component_map`, `get_firmware_metadata`, `extract_bootloader_env` |
| Strings | `tools/strings.py` | `extract_strings`, `search_strings`, `find_crypto_material`, `find_hardcoded_credentials` |
| Binary | `tools/binary.py` | `list_functions`, `disassemble_function`, `decompile_function`, `list_imports`, `list_exports`, `xrefs_to`, `xrefs_from`, `get_binary_info`, `check_binary_protections`, `check_all_binary_protections`, `find_string_refs`, `resolve_import`, `find_callers`, `search_binary_content`, `get_stack_layout`, `get_global_layout`, `trace_dataflow`, `cross_binary_dataflow` |
| Security | `tools/security.py` | `check_known_cves`, `analyze_config_security`, `check_setuid_binaries`, `analyze_init_scripts`, `check_filesystem_permissions`, `analyze_certificate` |
| SBOM | `tools/sbom.py` | `generate_sbom`, `get_sbom_components`, `check_component_cves`, `run_vulnerability_scan` |
| Emulation | `tools/emulation.py` | `start_emulation`, `run_command_in_emulation`, `stop_emulation`, `check_emulation_status`, `get_emulation_logs`, `enumerate_emulation_services`, `diagnose_emulation_environment`, `troubleshoot_emulation`, `get_crash_dump`, `run_gdb_command`, `save_emulation_preset`, `list_emulation_presets`, `start_emulation_from_preset` |
| Fuzzing | `tools/fuzzing.py` | `analyze_fuzzing_target`, `generate_fuzzing_dictionary`, `generate_seed_corpus`, `generate_fuzzing_harness`, `build_fuzz_harness`, `patch_function_return`, `start_fuzzing_campaign`, `check_fuzzing_status`, `stop_fuzzing_campaign`, `triage_fuzzing_crash`, `diagnose_fuzzing_campaign` |
| Comparison | `tools/comparison.py` | `list_firmware_versions`, `diff_firmware`, `diff_binary`, `diff_decompilation` |
| UART | `tools/uart.py` | `uart_connect`, `uart_send_command`, `uart_read`, `uart_send_break`, `uart_send_raw`, `uart_disconnect`, `uart_status`, `uart_get_transcript` |
| Reporting | `tools/reporting.py` | `add_finding`, `list_findings`, `get_finding`, `update_finding`, `read_project_instructions`, `list_project_documents`, `read_project_document` |
| Code | `tools/documents.py` | `save_code_cleanup` |
| RTOS (applies_to=`("rtos",)`) | `tools/rtos.py` | `detect_rtos_kernel`, `enumerate_rtos_tasks`, `analyze_vector_table`, `recover_base_address`, `analyze_memory_map` |

Linux-only tools are tagged applies_to=("linux",) in tools/emulation.py (15 tools), tools/security.py (4 of the 6 — analyze_config_security, check_setuid_binaries, analyze_init_scripts, check_filesystem_permissions), and tools/filesystem.py (get_component_map). All other tools default to ALL_KINDS.

UART Bridge Architecture

The bridge runs on the host (not in Docker) because USB serial adapters can't easily pass through to containers.

How it works:

Host: scripts/wairz-uart-bridge.py is a standalone TCP server (only requires pyserial). It listens on TCP 9999 and proxies serial I/O.
Docker: uart_service.py in the backend container connects to the bridge via host.docker.internal:9999
Protocol: Newline-delimited JSON, request/response matched by id field
Important: The bridge does NOT take a serial device path or baudrate on its command line. Those are specified by the MCP uart_connect tool at connection time.
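
To make the framing concrete, here is a minimal sketch of newline-delimited JSON with id matching. Only the id field and the one-object-per-line framing come from this guide; the command, device_path, and baudrate keys, the integer ids, and both function names are assumptions for illustration.

```python
import itertools
import json
from typing import Optional

_ids = itertools.count(1)


def encode_request(command: str, **params):
    """Return (request_id, wire_bytes) for one request line."""
    request_id = next(_ids)
    message = {"id": request_id, "command": command, **params}
    # json.dumps emits no raw newlines, so one message is exactly one line.
    return request_id, (json.dumps(message) + "\n").encode("utf-8")


def match_response(buffer: bytes, request_id: int) -> Optional[dict]:
    """Find the reply for request_id among the complete lines received so far."""
    complete, _, _partial = buffer.rpartition(b"\n")  # skip a trailing partial line
    for line in complete.splitlines():
        if not line.strip():
            continue
        reply = json.loads(line)
        if reply.get("id") == request_id:
            return reply
    return None
```

Matching on id rather than on arrival order lets a slow serial read complete without blocking replies to later requests.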

Starting the bridge:

python3 scripts/wairz-uart-bridge.py --bind 0.0.0.0 --port 9999

The bridge will print "UART bridge listening on ..." when ready. It waits for connection commands from the backend.

Connecting via MCP: Call uart_connect with the device_path (e.g., /dev/ttyUSB0) and baudrate (e.g., 115200). The backend sends these to the bridge, which opens the serial port.

Common setup issues (Bridge unreachable):

UART_BRIDGE_HOST in .env must be host.docker.internal (NOT localhost — localhost inside Docker refers to the container, not the host)
An iptables rule is required to allow Docker bridge traffic to reach the host:
```
sudo iptables -I INPUT -i docker0 -p tcp --dport 9999 -j ACCEPT
```
After changing .env, restart the backend: docker compose restart backend
After restarting the backend, reconnect MCP (e.g., /mcp in Claude Code)
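
When diagnosing an unreachable bridge, a plain TCP connect test from inside the backend container separates network problems from protocol problems. This is a generic sketch using the standard library; bridge_reachable is a hypothetical helper, not part of uart_service.py.

```python
import socket


def bridge_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures alike.
        return False
```

Run from the backend container, bridge_reachable("host.docker.internal", 9999) returning False points at the .env host setting or the iptables rule rather than at the serial device.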

Environment Variables

See .env.example for defaults. Key variables:

| Variable | Description |
|---|---|
| `DATABASE_URL` | PostgreSQL connection string (asyncpg) |
| `REDIS_URL` | Redis connection string |
| `STORAGE_ROOT` | Where firmware files are stored on disk |
| `MAX_UPLOAD_SIZE_MB` | Maximum firmware upload size (default 500) |
| `MAX_TOOL_OUTPUT_KB` | MCP tool output truncation limit (default 30) |
| `GHIDRA_PATH` / `GHIDRA_SCRIPTS_PATH` | Ghidra headless installation paths |
| `GHIDRA_TIMEOUT` | Decompilation timeout in seconds (default 120) |
| `EMULATION_IMAGE` / `EMULATION_NETWORK` | Docker image and network for QEMU containers |
| `FUZZING_IMAGE` / `FUZZING_TIMEOUT_MINUTES` | Docker image and timeout for AFL++ containers |
| `UART_BRIDGE_HOST` / `UART_BRIDGE_PORT` | Host-side UART bridge connection |
| `NVD_API_KEY` | Optional, for higher NVD rate limits during CVE scanning |

Enterprise / Cloud Deployment (`enterprise/`)

The enterprise/ directory is a self-contained AWS deployment target (Terraform) that runs Wairz elastically: SPA on S3/CloudFront, FastAPI on Fargate, Aurora Serverless v2, ElastiCache, EFS-shared firmware storage, and Ghidra decompilation bursting onto scale-to-zero AWS Batch workers. It also adds an optional custom domain + Cognito/OIDC auth (SSO-ready) and a remote Streamable-HTTP MCP transport.

The non-negotiable contract when touching app code: every cloud behavior is config-gated and defaults to the local behavior. The single-host docker compose workflow must keep working unchanged with the default settings, and the existing test suite must stay green with defaults. Concretely:

Cloud features are toggled by settings (e.g. auth_enabled, Batch dispatch, Redis-backed analysis lock, allowed_hosts/allowed_origins, mcp_http_enabled) whose defaults reproduce the original local behavior.
Firmware storage stays a POSIX path (EFS) — do not migrate STORAGE_ROOT to an S3-only abstraction.
Keep the existing async job protocol (analysis_cache / ghidra_analysis_run + poll tools); the enterprise change only moves where Ghidra runs and what backs the cross-process lock.
docker.sock features (fuzzing, emulation, carving) are out of scope for the cloud MVP but must gate off gracefully, not be hard-removed.

Start at enterprise/PLAN.md — its "Codebase Ground Truth" and "Guardrails for agents" sections are required reading before changing anything in this subtree. Operations/cost detail is in enterprise/docs/.

Testing Firmware

Good images for development and testing:

OpenWrt (MIPS, ARM) — well-structured embedded Linux with lots of components
DD-WRT — similar to OpenWrt
DVRF (Damn Vulnerable Router Firmware) — intentionally vulnerable, great for security tool testing
