Skip to content

Commit 708316c

Browse files
feat: add SWE-bench agents, C/Zig implementations, and benchmark tool
Major additions: - SWE-bench extended agents (Python, TypeScript) with 15 tools - Multi-file atomic edits with rollback - Intelligent code search (grep, find_definition) - Test framework auto-detection - Git integration (status, diff, log) - Think tool for complex reasoning - New language implementations: - C: 200 LOC, 17KB binary (smallest) - Zig: 92 LOC, uses curl for HTTPS - Benchmark tool for API performance testing: - TTFB, total time, tokens/sec metrics - Multi-endpoint comparison (anthropic, z.ai) - JSON output for analysis - Performance report with comprehensive testing data: - 16 tests across 4 languages - Token usage and cost analysis - ASCII art visualizations Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent a4f168a commit 708316c

File tree

16 files changed

+2570
-98
lines changed

16 files changed

+2570
-98
lines changed

implementations/.gitignore

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,13 @@
1+
# Binary outputs
2+
c/nano
3+
zig/zig-out/
4+
zig/.zig-cache/
5+
rust/target/
6+
go/nano
7+
8+
# IDE
9+
.vscode/
10+
.idea/
11+
12+
# OS
13+
.DS_Store

implementations/COMPARISON.md

Lines changed: 150 additions & 96 deletions
Original file line numberDiff line numberDiff line change
@@ -1,11 +1,20 @@
11
# Multi-Language Implementation Comparison
22

3+
## Quick Links
4+
5+
| Version | Description | LOC |
6+
|---------|-------------|-----|
7+
| **Minimal** | Basic agent (~5 tools) | 72-200 LOC |
8+
| **SWE-bench** | Extended agent (15 tools) | 250-350 LOC |
9+
| **Benchmark** | Performance testing tool | 300 LOC |
10+
311
## Summary
412

513
All implementations share the same core:
6-
1. **5 tools**: read_file, write_file, edit_file, bash, list_dir
14+
1. **Tools**: read_file, write_file, edit_file, bash, list_dir (varies by impl)
715
2. **Agent loop**: Send message → Execute tool calls → Repeat until done
8-
3. **Proxy support**: ANTHROPIC_BASE_URL for custom endpoints
16+
3. **Model**: `claude-sonnet-4-20250514` (configurable via MODEL env)
17+
4. **Proxy support**: ANTHROPIC_BASE_URL for custom endpoints
918

1019
## Line Count Comparison
1120

@@ -14,134 +23,179 @@ All implementations share the same core:
1423
| **nano.py** | 72 | Python | Zero (stdlib) |
1524
| **nano.go** | 85 | Go | Zero (stdlib) |
1625
| **nano-minimal.ts** | 86 | TypeScript | Zero (fetch) |
26+
| **nano.zig** | 92 | Zig | Zero (stdlib + curl) |
1727
| **nano.rs** | 118 | Rust | 3 crates |
18-
| **nano.ts** | 216 | TypeScript | SDK + glob |
28+
| **nano.c** | 200 | C | Zero (stdlib + curl) |
1929

20-
## Feature Comparison
30+
## Binary Size Comparison
2131

22-
| Feature | Python | TS Minimal | TS Full | Rust | Go |
23-
|---------|--------|------------|---------|------|-----|
24-
| Tools | 5 | 5 | 7 | 5 | 5 |
25-
| Interactive REPL ||||||
26-
| Streaming ||||||
27-
| Custom base URL ||||||
28-
| Zero dependencies ||||||
29-
| Single file ||||||
32+
| Language | Binary Size | Notes |
33+
|----------|-------------|-------|
34+
| **C** | **17 KB** | Smallest! Uses curl for HTTPS |
35+
| Rust | 2.0 MB | Static linking with TLS |
36+
| Zig | 2.2 MB | Static linking |
37+
| Go | 7.9 MB | Includes runtime |
3038

31-
## Performance (Estimated)
39+
## Startup Performance
3240

33-
| Metric | Python | TS Minimal | TS Full | Rust | Go |
34-
|--------|--------|------------|---------|------|-----|
35-
| Startup | ~50ms | ~30ms | ~80ms | ~5ms | ~10ms |
36-
| Memory | ~30MB | ~50MB | ~80MB | ~5MB | ~10MB |
37-
| Binary size | N/A | N/A | N/A | ~2MB | ~5MB |
38-
| Compile time | N/A | N/A | N/A | ~30s | ~2s |
41+
Measured as average of 3 runs (time to show usage/error):
42+
43+
| Language | Startup Time | Relative |
44+
|----------|-------------|----------|
45+
| **Rust** | **1 ms** | 1x (baseline) |
46+
| **Zig** | **1 ms** | 1x |
47+
| C | 4 ms | 4x |
48+
| Go | 4 ms | 4x |
49+
| TypeScript | 19 ms | 19x |
50+
| Python | 38 ms | 38x |
51+
52+
## Feature Comparison
53+
54+
| Feature | Python | TS | Go | Rust | Zig | C |
55+
|---------|--------|-----|-----|------|-----|---|
56+
| Tools | 5 | 5 | 5 | 5 | 3 | 4 |
57+
| Custom base URL |||||||
58+
| Zero runtime deps |||||||
59+
| Single file |||||||
60+
| Native TLS |||||||
3961

4062
## Platform Support
4163

42-
| Platform | Python | TS | Rust | Go |
43-
|----------|--------|-----|------|-----|
44-
| Linux x64 |||||
45-
| macOS ARM |||||
46-
| Windows |||||
47-
| Raspberry Pi |||||
48-
| ESP32/MCU ||| ⚠️ ||
49-
| WASM | ⚠️ ||||
50-
| Docker |||||
64+
| Platform | Python | TS | Go | Rust | Zig | C |
65+
|----------|--------|-----|-----|------|-----|---|
66+
| Linux x64 |||||||
67+
| macOS ARM |||||||
68+
| Windows |||||||
69+
| Raspberry Pi |||||||
70+
| ESP32/MCU |||| ⚠️ | ⚠️ | ⚠️ |
71+
| WASM | ⚠️ ||||| ⚠️ |
5172

5273
⚠️ = Possible with modifications
5374

54-
## Code Structure Comparison
75+
## Best Use Cases
5576

56-
### Python (72 LOC)
57-
```python
58-
# Config: 4 lines
59-
# Tools JSON: 5 lines
60-
# Tool execution: 17 lines
61-
# API call: 5 lines
62-
# Agent loop: 15 lines
63-
# Main: 6 lines
64-
```
77+
| Language | Best For |
78+
|----------|----------|
79+
| **Python** | Data science, ML, quick scripts, Raspberry Pi |
80+
| **TypeScript** | Web dev, VS Code extensions, Bun ecosystem |
81+
| **Go** | Cloud services, K8s, single binary deployment |
82+
| **Rust** | Embedded, WASM, high-performance, memory safety |
83+
| **Zig** | Embedded, C interop, freestanding targets |
84+
| **C** | Minimal size, legacy systems, maximum portability |
6585

66-
### Go (85 LOC)
67-
```go
68-
// Config/imports: 16 lines
69-
// Tools JSON: 6 lines
70-
// Types: 4 lines
71-
// Tool execution: 14 lines
72-
// API call: 10 lines
73-
// Agent loop: 12 lines
74-
// Main: 8 lines
75-
```
86+
## SWE-bench Extended Agents
87+
88+
For real-world software engineering tasks (GitHub issues, debugging, refactoring):
89+
90+
| File | Language | LOC | Tools |
91+
|------|----------|-----|-------|
92+
| `python/nano_swe.py` | Python | ~250 | 15 |
93+
| `typescript/nano-swe.ts` | TypeScript | ~300 | 15 |
7694

77-
### TypeScript Minimal (86 LOC)
78-
```typescript
79-
// Config: 5 lines
80-
// Tools array: 6 lines
81-
// Tool execution: 15 lines
82-
// Types: 2 lines
83-
// API call: 10 lines
84-
// Agent loop: 18 lines
85-
// Main: 4 lines
95+
### SWE-bench Tools (15 total)
96+
97+
```
98+
┌─────────────────────────────────────────────────────────────────────────────┐
99+
│ FILE OPERATIONS │ CODE SEARCH │ GIT OPERATIONS │
100+
│ ├─ read_file │ ├─ grep │ ├─ git_status │
101+
│ ├─ write_file │ ├─ find_files │ ├─ git_diff │
102+
│ ├─ edit_file │ └─ find_definition │ └─ git_log │
103+
│ └─ multi_edit │ │ │
104+
├─────────────────────────────────────────────────────────────────────────────┤
105+
│ DIRECTORY │ SHELL & TESTING │ PLANNING │
106+
│ ├─ list_dir │ ├─ bash │ └─ think │
107+
│ └─ tree │ └─ run_tests │ │
108+
└─────────────────────────────────────────────────────────────────────────────┘
86109
```
87110

88-
### Rust (118 LOC)
89-
```rust
90-
// Config/tools: 15 lines
91-
// Types: 8 lines
92-
// Tool execution: 30 lines
93-
// API call: 15 lines
94-
// Agent loop: 20 lines
95-
// Main: 15 lines
111+
### Key Features
112+
113+
- **Atomic multi-file edits** with rollback on failure
114+
- **Intelligent code search** with grep, find, and symbol definition lookup
115+
- **Test framework auto-detection** (pytest, npm test, cargo test, go test)
116+
- **Git integration** for status, diff, and history
117+
- **Think tool** for complex reasoning without action
118+
119+
### Usage
120+
121+
```bash
122+
# Python
123+
python nano_swe.py "fix the bug in issue #123"
124+
python nano_swe.py "add input validation to the login function"
125+
python nano_swe.py "refactor the database module to use connection pooling"
126+
127+
# TypeScript
128+
bun nano-swe.ts "fix the type error in src/utils.ts"
129+
bun nano-swe.ts "add error handling to the API endpoints"
96130
```
97131

98-
## Best Use Cases
132+
## Benchmark Tool
99133

100-
| Language | Best For |
101-
|----------|----------|
102-
| **Python** | Pi, data science, ML, quick scripts |
103-
| **Go** | Cloud, K8s, single binary deploy, servers |
104-
| **TypeScript** | Web dev, VS Code, Bun ecosystem |
105-
| **Rust** | Embedded, WASM, high-performance, no GC |
134+
Test API performance with the included benchmark script:
106135

107-
## Tested ✅
136+
```bash
137+
# Quick benchmark (5 iterations per test)
138+
python benchmark.py
108139

109-
| Implementation | API Test | Tool Test |
110-
|----------------|----------|-----------|
111-
| Python || ✅ read_file |
112-
| TypeScript Minimal || ✅ read_file |
113-
| TypeScript Full || ✅ read_file |
114-
| Go || ✅ read_file |
115-
| Rust | ⚠️ | ⚠️ (toolchain issue) |
140+
# Full benchmark (20 iterations)
141+
python benchmark.py --full
116142

117-
## Usage Examples
143+
# Compare endpoints
144+
python benchmark.py --compare
145+
146+
# Test specific endpoint
147+
python benchmark.py --endpoint z.ai
148+
python benchmark.py --endpoint anthropic
149+
```
150+
151+
### Metrics Measured
152+
153+
- **TTFB**: Time to first byte (network + inference start)
154+
- **Total Time**: Complete response time
155+
- **Tokens/sec**: Output throughput
156+
- **Error Rate**: Request failures
157+
158+
## Build Commands
118159

119160
```bash
120-
# Python
121-
ANTHROPIC_API_KEY=sk-... python nano.py "read config.json"
161+
# Python (no build needed)
162+
python3 python/nano.py "prompt"
122163

123164
# TypeScript (Bun)
124-
ANTHROPIC_API_KEY=sk-... bun nano-minimal.ts "list files"
165+
bun typescript/nano-minimal.ts "prompt"
125166

126167
# Go
127-
ANTHROPIC_API_KEY=sk-... go run nano.go "edit file.txt"
168+
cd go && go build -o nano nano.go
169+
./nano "prompt"
128170

129171
# Rust
130-
ANTHROPIC_API_KEY=sk-... cargo run -- "run tests"
172+
cd rust && cargo build --release
173+
./target/release/nano-opencode "prompt"
174+
175+
# Zig
176+
cd zig && zig build -Doptimize=ReleaseFast
177+
./zig-out/bin/nano "prompt"
131178

132-
# With proxy (all languages)
133-
ANTHROPIC_BASE_URL=https://proxy.example.com/v1 \
134-
ANTHROPIC_API_KEY=your-key \
135-
python nano.py "hello"
179+
# C
180+
cd c && make
181+
./nano "prompt"
136182
```
137183

184+
## Environment Variables
185+
186+
All implementations support:
187+
- `ANTHROPIC_API_KEY` or `ANTHROPIC_AUTH_TOKEN` - API key (required)
188+
- `ANTHROPIC_BASE_URL` - Custom API endpoint (optional)
189+
- `MODEL` - Model name (default: `claude-sonnet-4-20250514`)
190+
138191
## Conclusion
139192

140-
The core agent loop is **remarkably consistent** across languages:
193+
**Key findings:**
141194

142-
1. All implementations are **<120 LOC** (except TS Full with extras)
143-
2. All support **custom proxy URLs**
144-
3. All implement the **same 5 tools**
145-
4. All follow the **same agent loop pattern**
195+
1. **Smallest binary**: C at 17KB (uses external curl)
196+
2. **Fastest startup**: Rust/Zig at 1ms
197+
3. **Fewest lines**: Python at 72 LOC
198+
4. **Most portable**: Go (single static binary with TLS)
199+
5. **Best for embedded**: Zig/Rust (memory safety, no GC)
146200

147-
This proves that AI coding agents don't need to be complex. The essential functionality fits in ~100 lines of any language.
201+
The core agent loop is **remarkably consistent** across all 6 languages - proving that AI coding agents don't need to be complex. The essential functionality fits in ~100 lines of any language.

0 commit comments

Comments
 (0)