Skip to content

Commit 14720c6

Browse files
authored
Phase 1.2: Build Tooling Modernization (crossbario#1788)
* add feature branch audit file * Add tracking document for issue crossbario#1789 and Rust rewrite analysis Document NULL pointer dereference issue reported by University of Athens researchers and provide comprehensive analysis of effort required to rewrite NVX module in Rust. Issue crossbario#1789: Simple fix (< 1 hour) - add NULL checks after malloc Rust rewrite: 2-3 weeks effort - eliminates entire class of bugs Recommendation: Fix crossbario#1789 in Phase 1.2, consider Rust rewrite later. * Phase 1.2: Build tooling modernization for autobahn-python - Add project header banner to justfile default recipe - Add clean-build recipe to remove build artifacts - Add verify-wheels recipe with twine check and auditwheel validation - Add docstring to docs-view recipe Part of GitHub issue crossbario#1789 (Phase 1.2 Build Tooling Modernization). * build: Add missing common justfile recipes for ecosystem consistency Added recipes: - fix-format: renamed from autoformat (autoformat kept as alias) - test-all: run tests on all environments - upgrade: upgrade dependencies in single environment - upgrade-all: upgrade dependencies in all environments These recipes align autobahn-python with the common justfile recipes specification defined in MODERNIZATION.md for consistent developer experience across the WAMP Python ecosystem. * add basic library import test * build: Migrate to src layout and update .ai submodule - Move autobahn/, flatbuffers/, twisted/ packages to src/ directory - Update pyproject.toml for src layout with where = ["src"] - Update docs/conf.py autoapi_dirs to point to src/autobahn - Update .ai submodule to latest (3d10240) with audit docs - Create docs/ai/ subfolder with symlinks to .ai submodule - Update docs/index.rst to reference ai/ subfolder * fix: Update setup.py cffi_modules paths for src layout - Update paths from autobahn/nvx/ to src/autobahn/nvx/ - Add detailed comment explaining why setup.py is still needed (CFFI has no pyproject.toml support for cffi_modules) * fix: Update mypy.ini and justfile paths for src layout - Update mypy.ini python_version from 3.7 to 3.11 - Update justfile paths from autobahn/ to src/autobahn/ - check-typing mypy path - check-coverage-asyncio pytest paths - test-asyncio pytest paths - clean-fbs, build-fbs FlatBuffers paths * fix: Include NVX C source files in package data The _utf8validator.c and _xormasker.c files are needed at runtime for CFFI to compile the native extensions. * fix: Use editable install for test/check recipes With src layout, tests run from src/ directory. Using non-editable install copies files to site-packages, causing pytest to find both locations and fail with "import file mismatch". Changed all test/check recipes to use install-dev (editable mode) instead of install (non-editable): - check-typing, check-coverage-*, check-coverage - test-import, test-twisted, test-asyncio, test-serdes - wstest-testeeclient-*, wstest-testeeserver-* * chore: Bump version to 25.12.1 Note: This work was completed with AI assistance (Claude Code).
1 parent d0eeada commit 14720c6

233 files changed

Lines changed: 434 additions & 54 deletions

File tree

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.
Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,8 @@
1+
- [ ] I did **not** use any AI-assistance tools to help create this pull request.
2+
- [x] I **did** use AI-assistance tools to *help* create this pull request.
3+
- [x] I have read, understood and followed the project's AI_POLICY.md when creating code, documentation etc. for this pull request.
4+
5+
Submitted by: @oberstet
6+
Date: 2025-11-25
7+
Related issue(s): #1787
8+
Branch: oberstet:modernization-phase-1.2

ISSUES_TO_FIX.md

Lines changed: 228 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,228 @@
1+
# Issues To Fix
2+
3+
## Security/Stability Issues
4+
5+
### Issue #1789: NULL Pointer Dereference in nvx_utf8vld_new
6+
- **Reporter**: University of Athens researchers
7+
- **Severity**: Medium (unlikely in practice, but valid bug)
8+
- **Location**: `autobahn/nvx/_utf8validator.c:110-115`
9+
- **Problem**: `malloc()` return value not checked before dereferencing
10+
- **Similar Issue**: `autobahn/nvx/_xormasker.c` likely has same pattern
11+
- **Fix Priority**: Include in Phase 1.2 or later
12+
- **Estimated Effort**: < 1 hour (add NULL checks, return error to Python)
13+
- **Link**: https://github.com/crossbario/autobahn-python/issues/1789
14+
15+
## Future Modernization Considerations
16+
17+
### Rust Rewrite of NVX Module
18+
- See analysis in this file below
19+
- Not critical, but would eliminate entire class of memory safety issues
20+
- **Estimated Effort**: 1-2 weeks (see detailed analysis)
21+
22+
23+
---
24+
25+
## Detailed Analysis: Rust Rewrite of NVX Module
26+
27+
### Current State
28+
- **2 C files**: `_utf8validator.c` (693 LOC), `_xormasker.c` (199 LOC)
29+
- **Total**: ~892 lines of C code
30+
- **No .h headers**: Functions exposed via CFFI `ffi.cdef()`
31+
- **SIMD optimizations**: SSE2, SSE4.1 (compile-time selection)
32+
- **CFFI-based**: Already using CFFI (not CPyExt) - good for PyPy
33+
34+
### API Surface (Must Remain Unchanged)
35+
36+
**UTF-8 Validator (8 functions)**:
37+
```c
38+
void* nvx_utf8vld_new();
39+
void nvx_utf8vld_reset(void* utf8vld);
40+
int nvx_utf8vld_validate(void* utf8vld, const uint8_t* data, size_t length);
41+
void nvx_utf8vld_free(void* utf8vld);
42+
int nvx_utf8vld_set_impl(void* utf8vld, int impl);
43+
int nvx_utf8vld_get_impl(void* utf8vld);
44+
size_t nvx_utf8vld_get_current_index(void* utf8vld);
45+
size_t nvx_utf8vld_get_total_index(void* utf8vld);
46+
```
47+
48+
**XOR Masker (7 functions)**:
49+
```c
50+
void* nvx_xormask_new(const uint8_t* mask);
51+
void nvx_xormask_reset(void* xormask);
52+
size_t nvx_xormask_pointer(void* xormask);
53+
void nvx_xormask_process(void* xormask, uint8_t* data, size_t length);
54+
void nvx_xormask_free(void* xormask);
55+
int nvx_xormask_set_impl(void* xormask, int impl);
56+
int nvx_xormask_get_impl(void* xormask);
57+
```
58+
59+
### Rust Rewrite Approach
60+
61+
**Strategy**: Keep C API, rewrite implementation in Rust
62+
63+
#### Architecture
64+
```
65+
┌─────────────────────────────────────────┐
66+
│ Python (unchanged) │
67+
│ from _nvx_utf8validator import lib │
68+
└──────────────────┬──────────────────────┘
69+
│ CFFI
70+
┌──────────────────▼──────────────────────┐
71+
│ C FFI Shim (thin, ~50 lines) │
72+
│ #[no_mangle] extern "C" functions │
73+
└──────────────────┬──────────────────────┘
74+
75+
┌──────────────────▼──────────────────────┐
76+
│ Rust Implementation │
77+
│ - Safe memory management (Box) │
78+
│ - SIMD via std::arch intrinsics │
79+
│ - No unsafe malloc/free │
80+
└─────────────────────────────────────────┘
81+
```
82+
83+
#### Key Components
84+
85+
1. **Memory Management** (✅ Major Benefit)
86+
- Replace `malloc/free` → Rust `Box::new() / drop`
87+
- Automatic safety: no NULL deref, no double-free, no use-after-free
88+
- `void*``Box<Validator>` converted to raw pointer for C FFI
89+
90+
2. **SIMD Code** (⚠️ Requires Care)
91+
- C: `__SSE2__`, `__SSE4_1__` preprocessor + intrinsics
92+
- Rust: `#[cfg(target_feature)]` + `std::arch::x86_64`
93+
- Nearly 1:1 mapping of intrinsics (e.g., `_mm_load_si128` → same in Rust)
94+
- Runtime detection via `is_x86_feature_detected!("sse4.1")`
95+
96+
3. **FFI Boundary** (⚠️ Unsafe but Isolated)
97+
```rust
98+
#[no_mangle]
99+
pub extern "C" fn nvx_utf8vld_new() -> *mut c_void {
100+
let validator = Box::new(Utf8Validator::new());
101+
Box::into_raw(validator) as *mut c_void
102+
}
103+
104+
#[no_mangle]
105+
pub extern "C" fn nvx_utf8vld_free(ptr: *mut c_void) {
106+
if !ptr.is_null() {
107+
unsafe {
108+
let _ = Box::from_raw(ptr as *mut Utf8Validator);
109+
}
110+
}
111+
}
112+
```
113+
114+
### Effort Estimation
115+
116+
#### Phase 1: Setup & Basic Structure (2-3 days)
117+
- [ ] Create Rust library crate (`autobahn-nvx` or in-tree)
118+
- [ ] Setup CFFI to compile Rust instead of C
119+
- [ ] Define Rust structs matching C structs
120+
- [ ] Implement FFI shims (15 functions)
121+
- [ ] Wire up build system (Cargo + CFFI integration)
122+
123+
#### Phase 2: Port Core Logic (3-5 days)
124+
- [ ] Port UTF-8 validator scalar implementation
125+
- [ ] Port XOR masker scalar implementation
126+
- [ ] Port UTF-8 DFA state machine
127+
- [ ] Add comprehensive tests (unit tests in Rust)
128+
- [ ] Validate against existing Python tests
129+
130+
#### Phase 3: SIMD Optimizations (3-5 days)
131+
- [ ] Port SSE2 UTF-8 validator
132+
- [ ] Port SSE4.1 UTF-8 validator (if used)
133+
- [ ] Port SSE2 XOR masker
134+
- [ ] Feature detection & runtime selection
135+
- [ ] Benchmark against C version (must match or exceed)
136+
137+
#### Phase 4: Integration & Validation (2-3 days)
138+
- [ ] Integration testing with autobahn-python
139+
- [ ] Performance benchmarking (throughput, latency)
140+
- [ ] Test on CPython 3.11-3.14
141+
- [ ] Test on PyPy 3.11
142+
- [ ] Cross-platform testing (Linux x86-64, ARM64, macOS, Windows)
143+
- [ ] Memory profiling (no leaks, no regressions)
144+
145+
**Total Estimated Effort**: 10-16 days (2-3 weeks)
146+
147+
### Benefits of Rust Rewrite
148+
149+
**Security & Correctness**:
150+
- Eliminates NULL pointer dereferences (issue #1789 class)
151+
- Prevents buffer overflows
152+
- Prevents use-after-free
153+
- Prevents double-free
154+
- Safe memory management by default
155+
156+
**Maintainability**:
157+
- Better type system
158+
- Clearer ownership semantics
159+
- Modern tooling (cargo, clippy, rustfmt)
160+
- Built-in testing framework
161+
162+
**Performance**:
163+
- Likely same or better (Rust's LLVM backend)
164+
- Better optimization opportunities (aliasing rules)
165+
- No unexpected overhead
166+
167+
**PyPy Compatibility**:
168+
- Still uses CFFI (not CPyExt)
169+
- No change to PyPy story
170+
171+
### Risks & Challenges
172+
173+
⚠️ **SIMD Portability**:
174+
- Must support same architectures as C version
175+
- Runtime feature detection complexity
176+
- Need to test on ARM64 NEON (if C version supports it)
177+
178+
⚠️ **Build Complexity**:
179+
- CFFI + Cargo integration
180+
- Cross-compilation for wheels
181+
- CI/CD updates for Rust toolchain
182+
183+
⚠️ **Learning Curve**:
184+
- Team needs Rust familiarity
185+
- SIMD intrinsics knowledge still required
186+
187+
⚠️ **Binary Size**:
188+
- Rust stdlib might increase binary size slightly
189+
- Mitigated by `#![no_std]` if needed
190+
191+
### Recommendation
192+
193+
**Priority**: Medium-Low (Phase 2 or later)
194+
195+
**Rationale**:
196+
1. Current C code works, issue #1789 fixable in < 1 hour
197+
2. Rust rewrite is 2-3 weeks effort vs. immediate value
198+
3. Better to focus on Phase 1 modernization first
199+
4. Consider after Phase 1 complete, if:
200+
- More memory safety issues found
201+
- Team has Rust expertise
202+
- Time available for non-critical improvements
203+
204+
**Alternative Approach**:
205+
- Fix issue #1789 with NULL checks (Phase 1.2)
206+
- Add comprehensive fuzzing tests for C code
207+
- Revisit Rust rewrite in Phase 2 or 3
208+
209+
### If We Do Proceed
210+
211+
**Prerequisites**:
212+
- [ ] Team Rust training (1-2 weeks)
213+
- [ ] Rust SIMD intrinsics study (3-5 days)
214+
- [ ] CFFI + Cargo integration prototype (1-2 days)
215+
- [ ] Performance baseline established (benchmarks)
216+
217+
**Deliverables**:
218+
- [ ] Rust crate with C-compatible API
219+
- [ ] Drop-in replacement for C files
220+
- [ ] No Python code changes required
221+
- [ ] Performance parity or better
222+
- [ ] Comprehensive test suite
223+
- [ ] Documentation
224+
225+
---
226+
227+
**Summary**: Rust rewrite is feasible (2-3 weeks), provides significant safety benefits, but not urgent. Fix immediate issue #1789 first, revisit Rust later if strategic value aligns.
228+

docs/AI_AUDIT_PROCESS.md

Lines changed: 0 additions & 1 deletion
This file was deleted.

docs/AI_AUDIT_PROCESS_REVIEW.md

Lines changed: 0 additions & 1 deletion
This file was deleted.

docs/AI_GUIDELINES.md

Lines changed: 0 additions & 1 deletion
This file was deleted.

docs/AI_POLICY.md

Lines changed: 0 additions & 1 deletion
This file was deleted.

docs/ai/AI_AUDIT_FILE.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
../../.ai/audit/AI_AUDIT_FILE.md

docs/ai/AI_AUDIT_PROCESS.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
../../.ai/audit/AI_AUDIT_PROCESS.md

docs/ai/AI_AUDIT_PROCESS_REVIEW.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
../../.ai/audit/AI_AUDIT_PROCESS_REVIEW.md

0 commit comments

Comments
 (0)