Commit 0224ead

digitalandrew and claude committed
Add MkDocs documentation site for wairz.ai
MkDocs Material theme with dark mode, GitHub Actions workflow for GitHub Pages deployment, and docs seeded from README content covering all features, MCP tools reference, configuration, and architecture.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 428623a commit 0224ead

21 files changed

Lines changed: 1415 additions & 0 deletions

.github/workflows/docs.yml

Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
+name: Deploy Docs
+on:
+  push:
+    branches: [main]
+    paths:
+      - 'docs/**'
+      - 'mkdocs.yml'
+
+permissions:
+  contents: read
+  pages: write
+  id-token: write
+
+concurrency:
+  group: "pages"
+  cancel-in-progress: false
+
+jobs:
+  build:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: '3.12'
+      - run: pip install mkdocs-material
+      - run: mkdocs build --strict
+      - uses: actions/upload-pages-artifact@v3
+        with:
+          path: site/
+
+  deploy:
+    needs: build
+    runs-on: ubuntu-latest
+    environment:
+      name: github-pages
+      url: ${{ steps.deployment.outputs.page_url }}
+    steps:
+      - id: deployment
+        uses: actions/deploy-pages@v4

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -29,5 +29,8 @@ Thumbs.db
 node_modules/
 frontend/dist/
 
+# MkDocs
+site/
+
 # Docker
 docker-compose.override.yml

docs/CNAME

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+wairz.ai

docs/architecture.md

Lines changed: 132 additions & 0 deletions
@@ -0,0 +1,132 @@
+# Architecture
+
+## System Overview
+
+```
+Claude Code / Claude Desktop
+        |
+        | MCP (stdio)
+        v
++------------------+     +------------------------------------+
+|   wairz-mcp      |---->|         FastAPI Backend            |
+|  (MCP server)    |     |                                    |
+|  60+ tools       |     |  Services: firmware, analysis,     |
++------------------+     |  emulation, fuzzing, sbom, uart    |
+                         |                                    |
+                         |  Ghidra headless - QEMU - AFL++    |
+                         +-----------+------------------------+
+                                     |
++--------------+    +----------------+----------------+
+|  React SPA   |--->|  PostgreSQL    |     Redis      |
+|  (Frontend)  |    |                |                |
++--------------+    +----------------+----------------+
+
+Optional:
+  wairz-uart-bridge.py (host) <-- TCP:9999 --> Docker backend
+```
+
+## Tech Stack
+
+| Layer | Technology |
+|-------|------------|
+| Frontend | React 19, Vite, TypeScript, Tailwind CSS, shadcn/ui |
+| Code Viewer | Monaco Editor |
+| Component Graph | ReactFlow + Dagre |
+| Terminal | xterm.js |
+| State Management | Zustand |
+| Backend | Python 3.12, FastAPI, SQLAlchemy 2.0 (async), Alembic |
+| Database | PostgreSQL 16 |
+| Cache | Redis 7 |
+| Firmware Extraction | binwalk, sasquatch, jefferson, ubi_reader, cramfs-tools |
+| Binary Analysis | radare2 (r2pipe), pyelftools |
+| Decompilation | Ghidra 11.3.1 (headless) with custom analysis scripts |
+| Emulation | QEMU user-mode + system-mode (ARM, MIPS, MIPSel, AArch64) |
+| Fuzzing | AFL++ with QEMU mode |
+| SBOM | CycloneDX, NVD API (nvdlib) |
+| UART | pyserial (host-side bridge) |
+| AI Integration | MCP (Model Context Protocol) |
+| Containers | Docker + Docker Compose |
+
+## Project Structure
+
+```
+wairz/
+├── backend/
+│   ├── app/
+│   │   ├── main.py              # FastAPI application
+│   │   ├── config.py            # Settings (pydantic-settings)
+│   │   ├── database.py          # Async SQLAlchemy engine/session
+│   │   ├── mcp_server.py        # MCP server with dynamic project switching
+│   │   ├── models/              # SQLAlchemy ORM models
+│   │   ├── schemas/             # Pydantic request/response schemas
+│   │   ├── routers/             # REST API endpoints
+│   │   ├── services/            # Business logic
+│   │   ├── ai/                  # MCP tool registry + 60+ tool implementations
+│   │   │   └── tools/           # Organized by category
+│   │   └── utils/               # Path sandboxing, output truncation
+│   ├── alembic/                 # Database migrations
+│   └── pyproject.toml
+├── frontend/
+│   ├── src/
+│   │   ├── pages/               # Route pages
+│   │   ├── components/          # UI components
+│   │   ├── api/                 # API client functions
+│   │   ├── stores/              # Zustand state management
+│   │   └── types/               # TypeScript types
+│   └── package.json
+├── ghidra/
+│   ├── Dockerfile               # Ghidra headless container
+│   └── scripts/                 # Custom Java analysis scripts
+├── emulation/
+│   ├── Dockerfile               # QEMU container
+│   └── scripts/                 # Emulation helper scripts
+├── fuzzing/
+│   └── Dockerfile               # AFL++ container with QEMU mode
+├── scripts/
+│   └── wairz-uart-bridge.py     # Host-side UART serial bridge
+├── docker-compose.yml
+├── launch.sh
+├── .env.example
+└── CLAUDE.md
+```
+
+## Key Design Decisions
+
+### MCP as the AI Interface
+
+Rather than embedding an LLM in the backend, Wairz exposes analysis tools through MCP. This means:
+
+- Users bring their own Claude subscription (no API keys stored server-side)
+- The AI assistant runs in the user's Claude Code or Claude Desktop
+- Tools are composable — Claude can chain them together for complex analysis workflows
+
+### Isolated Execution Environments
+
+Firmware binaries are never executed on the host. All execution happens in isolated Docker containers:
+
+- **Emulation** — QEMU runs inside a dedicated container with resource limits
+- **Fuzzing** — AFL++ runs in a separate container
+- Both are on an isolated Docker network
+
+### Async Everything
+
+The backend is fully async:
+
+- SQLAlchemy async sessions with asyncpg
+- `asyncio.create_subprocess_exec` for running Ghidra, binwalk, etc.
+- Background tasks for long-running operations (firmware unpacking)
+- Non-blocking API endpoints
+
+### Caching Strategy
+
+Analysis results are cached aggressively:
+
+- **Ghidra decompilation** — Cached by binary hash + function name in PostgreSQL
+- **SBOM data** — Cached after first generation
+- **Firmware metadata** — Extracted once during unpacking
+
+### Security Boundaries
+
+- **Path traversal prevention** — All file access validated against the extracted firmware root via `sandbox.py`
+- **Output truncation** — MCP tool outputs are capped at 30 KB by default (`MAX_TOOL_OUTPUT_KB`) to prevent client issues
+- **Resource limits** — Emulation and fuzzing containers have memory and CPU limits
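
Note on the path traversal prevention described in `docs/architecture.md`: the commit does not include `sandbox.py`, so the following is only a minimal sketch of how such a check is commonly written. The function name `resolve_in_sandbox` and its behavior for firmware-absolute paths are assumptions for illustration, not the project's actual API.

```python
from pathlib import Path


def resolve_in_sandbox(root: Path, user_path: str) -> Path:
    """Resolve user_path under root and reject anything that escapes it.

    Hypothetical helper: the name and signature are illustrative only.
    """
    root = root.resolve()
    # Treat firmware-absolute paths such as "/etc/passwd" as relative to
    # the extracted firmware root rather than the host filesystem.
    candidate = (root / user_path.lstrip("/")).resolve()
    # resolve() collapses ".." segments and follows symlinks, so a path
    # that ends up outside the root is caught here.
    if not candidate.is_relative_to(root):
        raise ValueError(f"path escapes firmware root: {user_path}")
    return candidate
```

Resolving before comparing matters: a plain string-prefix check on the unresolved path would accept inputs like `bin/../../etc/passwd`.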

docs/assets/wairz_logo.png

121 KB

docs/configuration.md

Lines changed: 82 additions & 0 deletions
@@ -0,0 +1,82 @@
+# Configuration
+
+All Wairz settings are configured via environment variables or a `.env` file in the project root. Copy `.env.example` to `.env` to get started:
+
+```bash
+cp .env.example .env
+```
+
+## Environment Variables
+
+### Core
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `DATABASE_URL` | `postgresql+asyncpg://wairz:wairz@postgres:5432/wairz` | PostgreSQL connection string (asyncpg driver) |
+| `REDIS_URL` | `redis://redis:6379/0` | Redis connection string |
+| `STORAGE_ROOT` | `/data/firmware` | Directory where firmware files are stored on disk |
+| `LOG_LEVEL` | `INFO` | Logging level (DEBUG, INFO, WARNING, ERROR) |
+
+### Firmware
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `MAX_UPLOAD_SIZE_MB` | `500` | Maximum firmware upload size in megabytes |
+| `MAX_TOOL_OUTPUT_KB` | `30` | MCP tool output truncation limit in kilobytes |
+
+### Ghidra
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `GHIDRA_PATH` | `/opt/ghidra` | Ghidra headless installation path |
+| `GHIDRA_SCRIPTS_PATH` | `/opt/ghidra-scripts` | Custom Ghidra analysis scripts path |
+| `GHIDRA_TIMEOUT` | `120` | Decompilation timeout in seconds |
+
+### Emulation
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `EMULATION_IMAGE` | `wairz-emulation` | Docker image for QEMU containers |
+| `EMULATION_NETWORK` | `emulation_net` | Docker network for emulation containers |
+
+### Fuzzing
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `FUZZING_IMAGE` | `wairz-fuzzing` | Docker image for AFL++ containers |
+| `FUZZING_TIMEOUT_MINUTES` | `120` | Maximum fuzzing campaign duration in minutes |
+| `FUZZING_MAX_CAMPAIGNS` | `1` | Maximum concurrent fuzzing campaigns per project |
+
+### UART Bridge
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `UART_BRIDGE_HOST` | `host.docker.internal` | Hostname of the UART bridge on the host machine |
+| `UART_BRIDGE_PORT` | `9999` | TCP port the UART bridge listens on |
+
+### External APIs
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `NVD_API_KEY` | *(empty)* | Optional NVD API key for higher rate limits during CVE scanning |
+
+## Docker Compose
+
+The default `docker-compose.yml` starts all services. Key port mappings:
+
+| Service | Host Port | Container Port |
+|---------|-----------|----------------|
+| Frontend | 3000 | 3000 |
+| Backend API | 8000 | 8000 |
+| PostgreSQL | 5432 | 5432 |
+| Redis | 6379 | 6379 |
+
+## Local Development
+
+For local development without Docker, set the database and Redis URLs to point to your local instances:
+
+```env
+DATABASE_URL=postgresql+asyncpg://wairz:wairz@localhost:5432/wairz
+REDIS_URL=redis://localhost:6379/0
+STORAGE_ROOT=./data/firmware
+```
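
Note on how the documented defaults in `docs/configuration.md` interact with environment overrides: the architecture doc says the backend's `config.py` uses pydantic-settings, which is not shown in this commit. The sketch below uses only the standard library to illustrate the same idea for a few of the documented variables. The `Settings` class and `load_settings` function are hypothetical names, not the project's code.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """A subset of the documented variables (illustrative only)."""
    database_url: str
    redis_url: str
    max_upload_size_mb: int
    max_tool_output_kb: int
    ghidra_timeout: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from env, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        database_url=env.get(
            "DATABASE_URL",
            "postgresql+asyncpg://wairz:wairz@postgres:5432/wairz",
        ),
        redis_url=env.get("REDIS_URL", "redis://redis:6379/0"),
        # Numeric values arrive as strings and are converted explicitly.
        max_upload_size_mb=int(env.get("MAX_UPLOAD_SIZE_MB", "500")),
        max_tool_output_kb=int(env.get("MAX_TOOL_OUTPUT_KB", "30")),
        ghidra_timeout=int(env.get("GHIDRA_TIMEOUT", "120")),
    )
```

Passing a plain dict instead of `os.environ` keeps the sketch easy to exercise, for example `load_settings({"GHIDRA_TIMEOUT": "300"})` to raise the decompilation timeout.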

docs/features/binary-analysis.md

Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
+# Binary Analysis
+
+Wairz provides deep binary analysis using Ghidra headless for decompilation and custom analysis scripts for cross-references, dataflow tracing, and more.
+
+## Functions
+
+List all functions in an ELF binary, sorted by size (largest first). Large custom functions are typically the most interesting for security analysis.
+
+!!! note
+    The first analysis of a binary triggers Ghidra headless processing, which takes 1-3 minutes. Subsequent calls use cached results.
+
+## Decompilation
+
+View Ghidra pseudo-C decompilation of any function. The decompiled output is much easier to read than assembly and is the primary tool for understanding binary logic.
+
+Claude can also clean up decompiled code — renaming variables, adding comments — and save the result for viewing in the web UI's "AI Cleaned" toggle.
+
+## Disassembly
+
+View raw assembly instructions for any function. Useful for verifying decompilation accuracy or analyzing low-level behavior.
+
+## Cross-References
+
+- **Xrefs To** — Find all locations that call or reference a given function
+- **Xrefs From** — Find all functions called by a given function
+- **Find Callers** — Find all call sites of a function across the binary, including aliases
+
+## Dataflow Tracing
+
+Trace data from user-controlled sources to dangerous sinks:
+
+- **Sources:** `websGetVar`, `getenv`, `recv`, `read`, `nvram_get`, `fgets`
+- **Sinks:** `system`, `popen`, `exec*`, `sprintf`, `strcpy`
+
+This is the highest-impact tool for finding vulnerabilities in embedded web interfaces (e.g., router `httpd` binaries with `goform` handlers).
+
+## Cross-Binary Dataflow
+
+Trace data flows across multiple firmware binaries via IPC mechanisms:
+
+- **nvram** — `nvram_get`/`nvram_set` pairs
+- **config** — Config get/set operations
+- **file** — File I/O between binaries
+
+## String References
+
+Find all functions that reference strings matching a pattern. Useful for tracing interesting strings (URLs, format strings, parameter names) back to the code that uses them.
+
+## Stack & Global Layout
+
+- **Stack Layout** — View local variables, offsets, sizes, and buffer-to-return-address distances for overflow analysis
+- **Global Layout** — Map global variables around a target symbol to understand overflow impact
+
+## Binary Protections
+
+Check security protections (equivalent to `checksec`):
+
+| Protection | Description |
+|------------|-------------|
+| NX | No-execute (DEP) |
+| RELRO | Read-only relocations |
+| Canary | Stack canaries |
+| PIE | Position-independent executable |
+| Fortify | Fortify Source |
+| Stripped | Symbol table removed |
+
+Use `check_all_binary_protections` to scan all binaries and sort by protection score (least protected first).
+
+## MCP Tools
+
+| Tool | Description |
+|------|-------------|
+| `list_functions` | List functions sorted by size |
+| `decompile_function` | Ghidra pseudo-C decompilation |
+| `disassemble_function` | Assembly instructions |
+| `list_imports` / `list_exports` | Imported and exported symbols |
+| `xrefs_to` / `xrefs_from` | Cross-references |
+| `find_callers` | All call sites of a function |
+| `find_string_refs` | Functions referencing matching strings |
+| `trace_dataflow` | Source-to-sink dataflow analysis |
+| `cross_binary_dataflow` | Cross-binary IPC tracing |
+| `get_stack_layout` / `get_global_layout` | Memory layout analysis |
+| `check_binary_protections` | Security protections check |
+| `resolve_import` | Find and decompile imported functions |
+| `search_binary_content` | Search for byte/string/disasm patterns |
+| `get_binary_info` | ELF metadata and linked libraries |
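
Note on the "protection score" ordering mentioned in `docs/features/binary-analysis.md`: the commit does not say how `check_all_binary_protections` computes its score. The sketch below assumes a simple count of enabled mitigations, with full RELRO counting double. The `Protections` record, the weights, and both function names are hypothetical and only illustrate the "least protected first" ordering.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Protections:
    """checksec-style flags for one binary (illustrative record)."""
    path: str
    nx: bool
    relro: str  # "none", "partial", or "full"
    canary: bool
    pie: bool
    fortify: bool


def protection_score(p: Protections) -> int:
    """Assumed scoring: one point per mitigation, two for full RELRO."""
    relro_points = {"none": 0, "partial": 1, "full": 2}[p.relro]
    return int(p.nx) + relro_points + int(p.canary) + int(p.pie) + int(p.fortify)


def least_protected_first(binaries: List[Protections]) -> List[Protections]:
    """Sort ascending by score so the weakest binaries are listed first."""
    return sorted(binaries, key=protection_score)
```

The Stripped flag is left out of the score here because removing symbols hinders analysis rather than exploitation. Whether the real tool counts it is not stated in the docs.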

docs/features/comparison.md

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
+# Firmware Comparison
+
+Wairz can compare firmware versions to identify changes between releases, which is useful for patch analysis and for understanding what a vendor fixed or modified.
+
+## Filesystem Diff
+
+Compare two firmware versions' filesystems to see:
+
+- **Added files** — New files in the newer version
+- **Removed files** — Files deleted in the newer version
+- **Modified files** — Files with changed content (compared by hash)
+- **Permission changes** — Files with changed permissions
+
+## Binary Diff
+
+Compare a specific binary between two firmware versions at the function level:
+
+- **Added functions** — New functions in the newer version
+- **Removed functions** — Functions deleted in the newer version
+- **Modified functions** — Functions with size changes
+
+## Decompilation Diff
+
+Side-by-side decompilation comparison: decompile the same function from two firmware versions and produce a unified diff that shows exactly what changed in the pseudo-C code.
+
+This is the most detailed comparison level, useful for understanding precisely what a vendor patched.
+
+## Usage
+
+1. Upload multiple firmware versions to the same project
+2. Use `list_firmware_versions` to see available versions and their IDs
+3. Run comparison tools with the firmware IDs
+
+## MCP Tools
+
+| Tool | Description |
+|------|-------------|
+| `list_firmware_versions` | List uploaded firmware versions |
+| `diff_firmware` | Compare filesystem trees |
+| `diff_binary` | Compare binary functions |
+| `diff_decompilation` | Side-by-side decompilation diff |
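
Note on the filesystem diff categories in `docs/features/comparison.md`: the implementation of `diff_firmware` is not part of this commit, so the following is a minimal sketch of a hash-based tree comparison that produces the same four categories. The function names `snapshot` and `diff_trees`, the choice of SHA-256, and the decision to skip symlinks are assumptions for illustration.

```python
import hashlib
import stat
from pathlib import Path
from typing import Dict, List, Tuple


def snapshot(root: Path) -> Dict[str, Tuple[str, int]]:
    """Map each regular file's relative path to (sha256 hex, permission bits)."""
    result = {}
    for path in root.rglob("*"):
        # Symlinks are skipped so the sketch never reads outside the tree.
        if path.is_symlink() or not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        mode = stat.S_IMODE(path.lstat().st_mode)
        result[path.relative_to(root).as_posix()] = (digest, mode)
    return result


def diff_trees(old: Path, new: Path) -> Dict[str, List[str]]:
    """Classify paths as added, removed, modified, or permission-changed."""
    a, b = snapshot(old), snapshot(new)
    common = a.keys() & b.keys()
    return {
        "added": sorted(b.keys() - a.keys()),
        "removed": sorted(a.keys() - b.keys()),
        "modified": sorted(p for p in common if a[p][0] != b[p][0]),
        "permissions": sorted(p for p in common if a[p][1] != b[p][1]),
    }
```

Reading whole files into memory keeps the sketch short. A real tool working on large firmware images would more likely hash in chunks.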
