Reconstruction Report: rs-<pkgname>

Generated after the port clears the parity gate and the Acceleration loop terminates. This is the structured "is the port complete?" audit.

1. Identity

| Field | Value |
|---|---|
| Rust crate / Python module | rs-<pkgname> / rs_<pkgname> |
| Upstream Python package | <UpstreamPyPkg> v<pinned-version> |
| Upstream source | <PyPI / GitHub URL> |
| Algorithm class | <embedding / clustering / ordinal / deterministic-…> |
| Parity threshold (pre-registered) | |
| Final parity value | |
| Audit class | <A / B / C> |
| Total LOC (Rust, excluding tests) | |
| Wall-clock speedup vs Python reference on <fixture> | |
| Memory / threading gain | <yes/no> + details |

2. Python API coverage audit

Auto-populated by engine/py_function_audit.py from the upstream's __all__ and its public defs/classes, parsed with ast. Every public Python function must appear in the table. Internal helpers reachable from the public API are listed as well.
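As a rough illustration of the extraction described above, the sketch below reads a module's public names with ast. It is not the actual engine/py_function_audit.py. The function name `public_api` and the fallback rule (top-level defs and classes without a leading underscore when no literal `__all__` is present) are assumptions made for this example.

```python
import ast


def public_api(source: str) -> list[str]:
    """Return the public names of a module, given its source text.

    Assumption: a literal ``__all__`` wins when present; otherwise every
    top-level function or class without a leading underscore is public.
    """
    tree = ast.parse(source)
    defs = []
    for node in tree.body:
        if isinstance(node, ast.Assign):
            targets = [t.id for t in node.targets if isinstance(t, ast.Name)]
            if "__all__" in targets:
                # Only a literal list/tuple of strings can be read statically;
                # ast.literal_eval raises ValueError for anything else.
                return list(ast.literal_eval(node.value))
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            if not node.name.startswith("_"):
                defs.append(node.name)
    return defs
```

A real audit would also have to follow re-exports across submodules and handle an `__all__` that is built dynamically, which this sketch deliberately leaves out.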

2.1 Public Python API

| Python name | Rust equivalent | Status | Tests | Notes |
|---|---|---|---|---|
| entry_point | rs_<pkgname>.entry_point | ✅ ported | test_exact_match.py | |
| <fn_a> | <fn_a> (#[pyfunction]) | ✅ ported | test_<fn_a>_parity.py | |
| <Class> | <Class> (#[pyclass]) | ✅ ported | test_smoke.py | method-chaining preserved |
| <rare_fn> | | ⛔ skipped | | deprecated upstream / never called |

2.2 Internal helpers reachable from the public API

| Python helper | Used by | Rust equivalent | Status |
|---|---|---|---|
| _kernel | entry_point | inlined into compute | |
| <other> | | | |

2.3 Coverage summary

| Category | Count | Coverage |
|---|---|---|
| Public Python API | <N_export> | <N_ported> / <N_export> = <%> |
| Internal helpers reachable | <N_internal> | <N_ported_internal> / <N_internal> = <%> |
| Total Python LOC | <N_py_loc> | |
| Total Rust LOC (src/*.rs) | <N_rs_loc> | ratio = |

A port is complete when ≥ 95% of public functions are ported AND every internal helper transitively reachable from a ported public function is itself ported (or inlined).
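The completeness rule can be written as a small check. The sketch below assumes a call graph is already available as a plain dict mapping each function to the names it calls. That input is hypothetical, since the template does not say how reachability is computed, and the names `reachable` and `port_is_complete` are illustrative only.

```python
def reachable(calls: dict[str, set[str]], roots: set[str]) -> set[str]:
    """Return every name transitively called from any of the roots."""
    seen: set[str] = set()
    stack = list(roots)
    while stack:
        name = stack.pop()
        for callee in calls.get(name, ()):
            if callee not in seen:
                seen.add(callee)
                stack.append(callee)
    return seen


def port_is_complete(public: set[str], ported: set[str],
                     calls: dict[str, set[str]],
                     min_coverage: float = 0.95) -> bool:
    """Apply both conditions: public coverage and helper reachability."""
    ported_public = public & ported
    coverage = len(ported_public) / len(public) if public else 1.0
    # Helpers are the non-public names reachable from ported public functions.
    helpers = reachable(calls, ported_public) - public
    return coverage >= min_coverage and helpers <= ported
```

In this sketch a skipped public function still counts against coverage, and a helper counts as ported whether it has its own Rust function or was inlined. Both readings are assumptions about how the template is meant to be applied.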

2.4 Deliberately skipped

| Python function | Reason for skipping |
|---|---|
| <fn> | <e.g., "plotting only", "deprecated shim", "only used by removed function X"> |

2.5 Dependencies reused from the ecosystem

From DISCOVERY.md. Each reused crate/port is code we didn't write twice.

| Python dep | Reused as | Reused how | Approx. LOC saved |
|---|---|---|---|
| numpy | ndarray + numpy crate | hard Cargo dep | the n-d array layer |
| scikit-learn | linfa-<x> | hard Cargo dep | ~XXX |
| <dep> | rs-<dep> | pyproject + Cargo dep | ~XXX |

Total saved by reuse: ~YYYY LOC.

Deps with no crate, kept in Python across the boundary or ported in-crate:

| Python dep | Handling | Reason |
|---|---|---|
| statsmodels | ported the specific routine into the crate | no canonical crate |
| matplotlib | out of scope | plotting stays in Python |

3. Parity evidence

3.1 Per-output parity (from manifest.yaml::outputs)

| Output | Class | Threshold | Final value | Pass |
|---|---|---|---|---|
| embedding | embedding | 0.95 | 1.0000 | |
| labels | clustering | 0.95 | 1.0000 | |
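To make the Pass column concrete for a clustering output, here is a minimal sketch of a threshold gate. The template does not name the clustering metric, so the adjusted Rand index used here is an assumption, chosen because it ignores how cluster labels are numbered. The function names are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(reference: list, candidate: list) -> float:
    """Adjusted Rand index between two labelings of the same items."""
    n = len(reference)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: nothing to disagree about
    sum_joint = sum(comb(c, 2) for c in Counter(zip(reference, candidate)).values())
    sum_ref = sum(comb(c, 2) for c in Counter(reference).values())
    sum_cand = sum(comb(c, 2) for c in Counter(candidate).values())
    expected = sum_ref * sum_cand / total_pairs
    max_index = (sum_ref + sum_cand) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both labelings are trivial partitions
    return (sum_joint - expected) / (max_index - expected)


def passes_parity(reference: list, candidate: list, threshold: float = 0.95) -> bool:
    """Gate a clustering output against its pre-registered threshold."""
    return adjusted_rand_index(reference, candidate) >= threshold
```

With this metric a candidate that produces the same partition under different label ids still scores 1.0, so renumbered clusters do not fail the gate.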

3.2 Per-fixture parity

| Fixture | Metric | Wall-clock (Rust) | Wall-clock (Python) | Speedup |
|---|---|---|---|---|
| pbmc3k (2700 × 1838) | Procrustes 1.0000 | 0.08 s | 1.9 s | 24× |
| paul15 (2730 × 3451) | Procrustes 0.9998 | 0.21 s | 6.0 s | 28× |
| | | | <t / OOM> | <Nx / tractability> |

3.3 Reference command (reproducible)

conda activate $PYTHON_REF_ENV
python tests/py_reference_driver.py data/fixture.h5ad data/reference_output.json

conda activate $RUST_TEST_ENV
maturin develop --release
python tests/_run_candidate.py data/fixture.h5ad data/candidate_output.json
pytest tests/test_exact_match.py -v

4. Acceleration evidence

4.1 Two-plot evaluation

[Figure: evolution]

  • Plot 1 (top): wall-clock vs iteration (log y). Iteration 0 = baseline Rust translation.
  • Plot 2 (bottom): parity metric vs iteration. Dips annotated with the math reason (usually a reordered reduction).

4.2 Accepted rewrites

| Iter | Section | Admissibility | Speedup | Accuracy delta |
|---|---|---|---|---|
| 0 | (baseline translation) | | | |
| 1 | §4.2 LTO + codegen-units=1 | E | 1.2× | 0.0000 |
| 2 | §2.2 buffer reuse | E | 1.5× | 0.0000 |
| 3 | §3.4 rayon outer-axis map | E | 3.8× | 0.0000 |
| 4 | §3.2 parallel reduction | B (n·eps·max\|x\|) | | |
| Final | | | | -0.00002 |
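The bounded (B) entry for the parallel reduction appears to state a budget of n·eps·max|x| for the reordered sum. The sketch below shows one way such a budget could be checked against the serial result. It takes eps to be the float64 machine epsilon and treats the budget exactly as written in the table, both of which are assumptions. It does not prove that the budget holds for arbitrary inputs, and all function names are hypothetical.

```python
import sys


def serial_sum(xs: list[float]) -> float:
    """Left-to-right accumulation, standing in for the reference order."""
    total = 0.0
    for x in xs:
        total += x
    return total


def chunked_sum(xs: list[float], chunks: int) -> float:
    """Sum per chunk, then combine: the order a parallel reduction would use."""
    size = max(1, -(-len(xs) // chunks))  # ceiling division, at least 1
    partials = [serial_sum(xs[i:i + size]) for i in range(0, len(xs), size)]
    return serial_sum(partials)


def within_reduction_budget(reference: float, candidate: float,
                            n: int, max_abs: float) -> bool:
    """Check |candidate - reference| <= n * eps * max|x| (budget as tabulated)."""
    budget = n * sys.float_info.epsilon * max_abs
    return abs(candidate - reference) <= budget
```

A gate built this way would compare `chunked_sum(xs, k)` against `serial_sum(xs)` with `n = len(xs)` and `max_abs = max(abs(x) for x in xs)`, and reject the rewrite if the check returns False on any fixture.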

4.3 Rejected rewrites

| Iter | Section | Reason for rejection |
|---|---|---|
| 5 | f32 accumulator | reference is f64; no admissible bound at this tier |
| 6 | approximate kNN (HNSW) | downstream is non-local; inadmissible |

5. Code quality audit

All items below are mandatory for release. "Deferred" is not a valid status.

| Check | Status |
|---|---|
| maturin build --release produces a wheel; pip install in fresh env | |
| cargo test (Rust-side unit tests) green | |
| pytest -q green (against --release) | ✅ / tests pass |
| examples/compare_Python_vs_Rust.ipynb (6-section schema; outputs committed) | |
| examples/tutorial_<dataset>.ipynb (one subsection per public function; outputs committed) | |
| examples/function_by_function_Python_parity.ipynb (Python⇄Rust param dictionary; outputs committed) | |
| examples/py_per_function_dump.py (Python driver for Notebook 3) | |
| examples/evolution.png rendered from ITERATION_LOG.md | |
| examples/evolution.ipynb (one header per iteration; outputs committed) | |
| README.md has all required sections | |
| MATH.md has perturbation bounds for every (B) rewrite | |
| ITERATION_LOG.md complete and parseable | |
| DISCOVERY.md committed (Phase 0.5 artefact) | |
| AUDIT.md produced by engine.py_function_audit | |
| License compatible with upstream | |
| Version pinned to 0.1.0 (Cargo.toml + pyproject.toml) | |
| GitHub repo created under <org>/ | |
| PyPI (maturin) + crates.io release | |

6. Known limitations

  • Fixture-level equivalence only — not proved over the full input domain.
  • <e.g., parallel-reduction path is bounded-(B), not bit-exact; serial path available via single-thread>
  • <e.g., a stochastic step uses a Rust RNG that mirrors NumPy PCG64; document any residual divergence>
  • <e.g., upstream Python bugs we did NOT replicate, by design — list them>

7. Integration

  • Crate published at: crates.io/crates/rs-<pkgname>
  • Wheel published at: pypi.org/project/rs-<pkgname>
  • Drop-in usage: import rs_<pkgname> as <pkgname>
  • Tutorial: examples/tutorial_<dataset>.ipynb

8. Sign-off

| Field | Value |
|---|---|
| Author | |
| Date | |
| Total port duration (active) | <hours/days> |
| Total Acceleration iterations (accepted) / (proposed) | |