Commit 4ce5238: fix(config): harden init boundaries and docs

Parent: c36c06c

15 files changed: 955 additions & 654 deletions

AGENTS.md

Lines changed: 75 additions & 134 deletions
````diff
@@ -1,147 +1,88 @@
-# AGENTS.md — AI Agent Technical Context
+# AGENTS.md
 
-## Project Overview
+## Project Identity
 
-**attnres** is the first Rust implementation of Attention Residuals (MoonshotAI/Kimi paper) using the [burn](https://github.com/tracel-ai/burn) deep learning framework. It provides a drop-in replacement for standard residual connections in Transformers.
+`attnres` is a Rust library that implements Attention Residuals for burn-based
+Transformer experiments, plus examples, benchmarks, and a web demo.
 
-## Tech Stack
+## Current State
 
-| Component | Technology | Version |
-|-------------|-----------------|----------|
-| Language | Rust | 2021 edition (1.80+) |
-| ML Framework| burn | 0.20 |
-| Test Backend| NdArray | (CPU, deterministic) |
-| Testing | cargo test + proptest + criterion ||
-| Linting | clippy + rustfmt ||
-| CI | GitHub Actions | test, clippy, fmt, build-examples |
+- Status: alpha as of March 16, 2026.
+- Suitable for: research, examples, local experimentation, integration work on
+  trusted inputs.
+- Not yet suitable for: production inference services, validated GPU claims,
+  PyTorch checkpoint interchange, or a stable 1.0 API promise.
+- Important gap: there is no dedicated `spec.md` in this checkout. Use
+  [ARCHITECTURE.md](ARCHITECTURE.md), README, module docs, and tests as the
+  current source of truth.
 
-## Project Structure
+## Verified Commands
 
-```
-src/
-├── lib.rs            # Public API re-exports + module declarations
-├── config.rs         # AttnResConfig — validated builder pattern (JSON save/load)
-├── attn_res_op.rs    # Core AttnRes operation (depth-wise softmax attention)
-├── block_state.rs    # BlockState — cumulative block representation tracking
-├── layer.rs          # AttnResLayer — transformer layer with dual AttnRes
-├── model.rs          # AttnResTransformer — full model with standard + two-phase forward
-├── rms_norm.rs       # RMSNorm implementation
-├── serialization.rs  # Model weight save/load (NamedMpk, binary, compact formats)
-├── two_phase.rs      # Two-phase inference primitives (phase1_batched, online_softmax_merge)
-├── attention.rs      # Multi-head self-attention
-├── feed_forward.rs   # Two-layer MLP with GELU activation
-└── utils.rs          # Causal mask generation helpers
-
-tests/
-├── unit_tests.rs         # Core algorithm correctness tests
-├── differential_tests.rs # PyTorch reference comparison tests
-├── property_tests.rs     # proptest property-based tests
-└── integration_tests.rs  # Full model training loop tests
-
-examples/
-├── train_tiny.rs         # Train a small model on synthetic data
-├── compare_residuals.rs  # Compare AttnRes vs standard residuals
-└── visualize_weights.rs  # Visualize depth attention patterns
-
-benches/
-└── attn_res_benchmark.rs # Criterion benchmarks
-
-fixtures/                 # Reference outputs from PyTorch
-├── attn_res_forward.json
-└── block_state_tracking.json
-
-web-demo/                 # Interactive web demo (WASM + Vite)
-├── crate/                # Rust WASM crate (pure-Rust AttnRes reimplementation)
-│   ├── Cargo.toml
-│   └── src/lib.rs        # wasm-bindgen exports: AttnResEngine
-├── src/                  # TypeScript frontend
-│   ├── main.ts           # App entry point
-│   ├── style.css         # Academic-grade styling
-│   ├── viz.ts            # Canvas 2D heatmaps, charts
-│   └── diagrams.ts       # Static architectural diagrams
-├── index.html            # Single-page app
-├── package.json          # Vite + TypeScript
-└── vite.config.ts        # Build config
-```
-
-## Commands
+These commands were run successfully during the latest quality pass:
 
 ```bash
-cargo build                           # Build the project
-cargo test --all-features             # Run all 87 tests
-cargo test test_name                  # Run specific test
-cargo clippy -- -D warnings           # Lint (warnings = errors)
-cargo fmt                             # Format code
-cargo fmt -- --check                  # Check formatting without modifying
-cargo bench                           # Run Criterion benchmarks
-cargo run --example train_tiny        # Train example
-cargo run --example compare_residuals # Comparison example
-cargo run --example visualize_weights # Visualization example
-
-# Web demo
-cd web-demo && npm run build:wasm     # Build WASM crate
-cd web-demo && npm run dev            # Start Vite dev server
-cd web-demo && npm run build          # Production build (WASM + Vite)
+cargo fmt -- --check
+cargo clippy -- -D warnings
+cargo test --all-features
+cargo build --examples
+cd web-demo && npm run build
 ```
 
-## Architecture Essentials
-
-### Core Algorithm (AttnRes)
-
-Standard residual: `x_{l+1} = x_l + f_l(x_l)` (fixed unit weights)
-
-AttnRes: `x_{l+1} = Σ α_i · v_i` where α = softmax(w_l · RMSNorm(V)) over depth dimension
+Additional useful commands:
 
-Key invariants:
-1. **Zero-init pseudo-queries** → starts as uniform averaging (standard residual behavior)
-2. **Two AttnRes per transformer layer** — one before self-attention, one before MLP
-3. **Softmax over depth** (block/layer dimension), NOT over sequence tokens
-4. **RMSNorm on keys** to prevent magnitude domination
-5. **Block boundaries** at every `block_size/2` sublayers
-
-### Data Flow
-
-```
-Input IDs → Embedding → [AttnResLayer × N] → RMSNorm → LM Head → Logits
-
-AttnResOp(pre-attn) → RMSNorm → MultiHeadAttention
-AttnResOp(pre-mlp) → RMSNorm → FeedForward
+```bash
+cargo bench
+cargo doc --open
 ```
 
-### Configuration
-
-`AttnResConfig::new(d_model, num_layers, num_blocks)` where:
-- `d_model`: Hidden dimension
-- `num_layers`: Number of **sublayers** (transformer layers × 2)
-- `num_blocks`: Number of blocks for Block AttnRes (set = num_layers for Full AttnRes)
-
-## Boundaries
-
-### Read-Only (never modify)
-- `spec.md`, `paper.md`, `research_report.md`, `implementation_plan.md`, `LICENSE`
-
-### Gated (requires approval)
-- `Cargo.toml` (dependency changes)
-- `.github/workflows/` (CI changes)
-- `cargo publish`
-
-## Source of Truth
-
-`spec.md` is the authoritative specification. All algorithm implementations must match the pseudocode and equations defined there.
-
-## Web Demo
-
-The `web-demo/` directory contains a fully interactive browser-based demo. The WASM crate (`web-demo/crate/`) is a pure-Rust reimplementation of the core AttnRes algorithm (no burn dependency for WASM portability), faithfully mirroring `src/attn_res_op.rs`. It exposes:
-
-- `AttnResEngine` — model creation, forward pass, training simulation
-- `compute_attn_res()` — interactive core operation with custom pseudo-queries
-- `train_step()` — simulated training showing depth attention pattern emergence
-
-Frontend: Vite + TypeScript with Canvas 2D visualizations (heatmaps, bar charts, loss curves). Academic design with full algorithm explanation.
-
-## Known Gaps
-
-- No PyTorch checkpoint loading (safetensors format)
-- GPU backends (wgpu, CUDA, Metal) untested
-- No distributed training support
-- Pre-trained weight import/export utilities
+## Architecture Map
+
+- `src/config.rs`: `AttnResConfig`, `ConfigError`, validation helpers.
+- `src/attn_res_op.rs`: core depth-attention residual operator.
+- `src/block_state.rs`: completed blocks + current partial block.
+- `src/layer.rs`: one Transformer layer with two AttnRes operations.
+- `src/model.rs`: full model, hidden-state forward, two-phase forward.
+- `src/two_phase.rs`: batched inter-block pass + online softmax merge.
+- `src/attention.rs`: multi-head self-attention.
+- `src/feed_forward.rs`: two-layer GELU MLP.
+- `src/rms_norm.rs`: RMSNorm for 3D and 4D tensors.
+- `src/serialization.rs`: burn-record save/load helpers.
+- `tests/`: unit, integration, property, and differential coverage.
+- `examples/demo_tui.rs`: terminal demo with live routing visualization.
+- `web-demo/`: WASM crate plus Vite frontend.
+
+## Non-Negotiable Invariants
+
+- Pseudo-query vectors start at zero.
+- Depth softmax is over block/layer sources, not tokens.
+- Each Transformer layer uses two AttnRes operations.
+- Block boundaries are defined in sublayer space.
+- `BlockState.blocks[0]` is the embedding block.
+- Internal invariant failures should panic loudly rather than produce silent
+  wrong outputs.
+
+## Conventions
+
+- Prefer `try_validate`, `try_init_model`, `try_init_layer`, and `try_init_op`
+  for untrusted config input. Panic-based constructors remain for trusted,
+  hard-coded configs.
+- Keep tensor shape comments short and accurate when code would otherwise be
+  hard to parse.
+- Add tests for every algorithm or boundary-condition change.
+- Keep README and roadmap claims tied to commands or tests that actually ran.
+
+## Constraints
+
+- Do not modify `LICENSE`.
+- Do not change dependency versions in `Cargo.toml` without approval.
+- Do not change `.github/workflows/` without approval.
+- Do not claim backend support, benchmark numbers, or checkpoint compatibility
+  unless the repository validates them.
+
+## Gotchas
+
+- `num_layers` counts sublayers, not full Transformer blocks.
+- Full AttnRes means `num_blocks == num_layers`, so block boundaries can occur
+  between attention and MLP inside one Transformer layer.
+- The web demo is a separate pure-Rust reimplementation for WASM portability;
+  do not assume it automatically stays in sync with `src/` without verification.
````
ARCHITECTURE.md

Lines changed: 122 additions & 0 deletions
````diff
@@ -0,0 +1,122 @@
+# Architecture
+
+`attnres` is a library-first reference implementation of Attention Residuals
+for burn-based Transformer experiments.
+
+## Scope
+
+The repository contains three primary surfaces:
+
+- The core Rust crate under `src/`.
+- Rust examples and benchmarks under `examples/` and `benches/`.
+- A browser demo under `web-demo/`.
+
+This document is the current architecture reference for the repository. A
+dedicated formal `spec.md` is not present in this checkout.
+
+## Forward Path
+
+Standard model forward:
+
+```text
+input_ids
+  -> embedding
+  -> BlockState { blocks: [embedding], partial_block: None }
+  -> for each AttnResLayer:
+       1. depth attention before self-attention
+       2. self-attention sublayer
+       3. depth attention before MLP
+       4. MLP sublayer
+       5. block-state update
+  -> final RMSNorm
+  -> LM head
+  -> logits
+```
+
+Two-phase forward:
+
+```text
+completed blocks
+  -> phase1_batched: inter-block attention statistics for each sublayer
+  -> sequential intra-block updates
+  -> online_softmax_merge
+  -> same final RMSNorm + LM head
+```
+
+## Module Map
+
+- `src/config.rs`
+  - Owns `AttnResConfig`.
+  - Validates user-supplied config.
+  - Exposes `ConfigError`, `try_validate`, `try_init_model`, and panic-based
+    compatibility helpers.
+
+- `src/attn_res_op.rs`
+  - Implements the core depth-attention residual operator.
+  - Stacks completed blocks plus the optional partial block.
+  - Applies RMSNorm to keys before computing depth logits.
+
+- `src/block_state.rs`
+  - Tracks completed blocks and the currently accumulating block.
+  - Treats the token embedding as the first completed block.
+
+- `src/layer.rs`
+  - Implements one Transformer layer with two AttnRes calls.
+  - Handles block-boundary transitions at sublayer granularity.
+
+- `src/model.rs`
+  - Builds the full model.
+  - Provides standard forward, two-phase forward, and hidden-state forward.
+
+- `src/attention.rs`
+  - Standard multi-head self-attention.
+  - Assumes additive masks with large negative values for masked positions.
+
+- `src/feed_forward.rs`
+  - Two-layer GELU MLP.
+
+- `src/rms_norm.rs`
+  - RMSNorm for both `[B, T, D]` and `[N, B, T, D]` tensors.
+
+- `src/two_phase.rs`
+  - Batched phase-1 inter-block attention.
+  - Online softmax merge for intra-block values.
+
+- `src/serialization.rs`
+  - Save/load helpers for burn recorders.
+  - Accepts `Path`-like inputs rather than forcing UTF-8 strings.
+
+## Invariants
+
+These are the invariants that most directly affect correctness:
+
+- Pseudo-query vectors are zero-initialized.
+- Attention weights are normalized over the depth dimension.
+- Each Transformer layer performs two AttnRes operations.
+- Block boundaries are defined in sublayer space, not Transformer-layer space.
+- `BlockState.blocks[0]` is always the token embedding block.
+- `partial_block` is expected to exist after at least one sublayer has run.
+
+If one of these invariants is violated by internal code, the crate prefers a
+direct panic with a specific message rather than silently returning a wrong
+result.
+
+## Error Model
+
+- Invalid caller-supplied configuration should use `ConfigError`.
+- Serialization failures return `SerializationError`.
+- Internal invariant breaks still panic because they represent library bugs, not
+  recoverable runtime conditions.
+
+## Verification
+
+The current repository-wide baseline checked during the March 16, 2026 quality
+pass is:
+
+```bash
+cargo fmt -- --check
+cargo clippy -- -D warnings
+cargo test --all-features
+cargo build --examples
+cd web-demo && npm run build
+```
````
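The two-phase forward path described above ends in an online softmax merge. The idea can be sketched in pure Rust with flash-attention-style partial aggregates; the struct and function names here are hypothetical and the crate's real `online_softmax_merge` signature may differ:

```rust
// Each partial aggregate carries the running max logit `m`, the sum of
// shifted exponentials `l`, and the exp-weighted value accumulator `acc`.
// Merging two partials rescales both sides to a shared max, so softmax
// over all sources can be computed incrementally and exactly.

struct SoftmaxPartial {
    m: f32,        // running max of logits seen so far
    l: f32,        // sum of exp(logit - m)
    acc: Vec<f32>, // sum of exp(logit - m) * value
}

fn partial(logit: f32, value: &[f32]) -> SoftmaxPartial {
    // A single source: exp(logit - logit) = 1.
    SoftmaxPartial { m: logit, l: 1.0, acc: value.to_vec() }
}

fn merge(a: &SoftmaxPartial, b: &SoftmaxPartial) -> SoftmaxPartial {
    let m = a.m.max(b.m);
    // Rescale each side's statistics to the new shared max.
    let (sa, sb) = ((a.m - m).exp(), (b.m - m).exp());
    let acc = a
        .acc
        .iter()
        .zip(&b.acc)
        .map(|(x, y)| sa * x + sb * y)
        .collect();
    SoftmaxPartial { m, l: sa * a.l + sb * b.l, acc }
}

fn finalize(p: &SoftmaxPartial) -> Vec<f32> {
    // Normalize the accumulator to obtain the softmax-weighted value.
    p.acc.iter().map(|x| x / p.l).collect()
}
```

Because `merge` is associative, the inter-block statistics from phase 1 can be combined with sequentially produced intra-block partials in any grouping and still equal a softmax computed over all sources at once.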

CHANGELOG.md

Lines changed: 25 additions & 0 deletions
````diff
@@ -0,0 +1,25 @@
+# Changelog
+
+All notable user-visible changes to this project will be documented here.
+
+The repository did not maintain a structured changelog before March 16, 2026.
+
+## [Unreleased]
+
+### Added
+
+- `ConfigError` with typed validation for invalid model configuration.
+- Fallible initialization helpers such as `try_validate`, `try_init_model`,
+  `try_init_layer`, and `try_init_op`.
+- `ARCHITECTURE.md` and `CONTRIBUTING.md`.
+
+### Changed
+
+- Serialization APIs now accept `Path`-like inputs instead of only `&str`.
+- README, roadmap, and agent context files now reflect the current alpha status
+  and verified commands instead of stale or aspirational claims.
+
+### Fixed
+
+- Explicit validation for `num_heads = 0`.
+- Explicit validation for out-of-range `layer_idx` values.
````
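The typed-validation pattern the changelog describes (a `ConfigError` enum plus fallible `try_*` helpers, with explicit checks for `num_heads = 0` and out-of-range `layer_idx`) can be sketched standalone. The variant names, fields, and `try_check_layer` helper below are illustrative, not the crate's actual definitions:

```rust
// Hypothetical sketch of typed config validation: invalid caller input
// yields a descriptive error value instead of a panic.

#[derive(Debug, PartialEq)]
enum ConfigError {
    ZeroHeads,
    HeadsDontDivideModel { d_model: usize, num_heads: usize },
    LayerIdxOutOfRange { layer_idx: usize, num_layers: usize },
}

struct AttnResConfig {
    d_model: usize,
    num_heads: usize,
    num_layers: usize,
}

impl AttnResConfig {
    fn try_validate(&self) -> Result<(), ConfigError> {
        // Check num_heads before dividing by it.
        if self.num_heads == 0 {
            return Err(ConfigError::ZeroHeads);
        }
        if self.d_model % self.num_heads != 0 {
            return Err(ConfigError::HeadsDontDivideModel {
                d_model: self.d_model,
                num_heads: self.num_heads,
            });
        }
        Ok(())
    }

    fn try_check_layer(&self, layer_idx: usize) -> Result<(), ConfigError> {
        // Reject indices past the last sublayer.
        if layer_idx >= self.num_layers {
            return Err(ConfigError::LayerIdxOutOfRange {
                layer_idx,
                num_layers: self.num_layers,
            });
        }
        Ok(())
    }
}
```

A panic-based constructor for trusted, hard-coded configs can then be a thin `try_validate().unwrap()` wrapper, which matches the convention of keeping both entry points.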
