Commit 8282795

Merge pull request #10 from 0xC000005/v0.14-binary-serialization
v0.14: Portable .pcb binary serialization format
2 parents ddbf275 + 3436e41 commit 8282795

17 files changed

Lines changed: 1893 additions & 49 deletions

File tree

CHANGELOG.md

Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,25 @@ All notable changes to PyChebyshev will be documented in this file.
55
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
66
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
77

8+
## [0.14.0] - 2026-04-24
9+
10+
### Added
11+
- Portable `.pcb` binary serialization format for `ChebyshevApproximation`
12+
and `ChebyshevSpline`. Closes the MoCaX cross-language serialization gap
13+
with a PyChebyshev-specific format.
14+
- New `format=` kwarg on `save()` (default `'pickle'`, opt in to `'binary'`).
15+
- `load()` auto-detects the format via the 4-byte magic header `b"PCB\x00"`. No
16+
behaviour change for existing pickle files.
17+
- C reference reader at `examples/binary_reader/` (not in CI).
18+
- Format spec at `docs/user-guide/binary-format.md`.
19+
- Stdlib `struct` + NumPy only — no new runtime dependencies.
20+
21+
### Restrictions
22+
- `format='binary'` requires flat `n_nodes` for `ChebyshevSpline`. Splines
23+
built with nested per-piece `n_nodes` raise `NotImplementedError` and
24+
must be saved with pickle instead.
25+
- `ChebyshevSlider` and `ChebyshevTT` remain pickle-only in v0.14.
26+
827
## [0.13.1] - 2026-04-24
928

1029
### Changed

CLAUDE.md

Lines changed: 3 additions & 1 deletion
@@ -12,7 +12,7 @@ PyChebyshev is a pip-installable Python library for multi-dimensional Chebyshev
1212
# Setup
1313
uv sync
1414

15-
# Run tests (~586 tests, ~110s due to 5D Black-Scholes builds)
15+
# Run tests (~662 tests, ~110s due to 5D Black-Scholes builds)
1616
uv run pytest tests/ -v
1717

1818
# Run a single test
@@ -46,6 +46,7 @@ The installable package. Public classes: `ChebyshevApproximation`, `ChebyshevSpl
4646
- **`_algebra.py`** — Shared helpers for Chebyshev arithmetic operators (compatibility validation, operator dispatch).
4747
- **`_extrude_slice.py`** — Shared helpers for extrusion and slicing (parameter validation, tensor manipulation, barycentric contraction).
4848
- **`_calculus.py`** — Shared helpers for Chebyshev calculus (Fejér-1 quadrature weights via DCT-III, sub-interval quadrature weights via Chebyshev antiderivatives, companion-matrix rootfinding, 1-D optimization). References: Waldvogel (2006), Trefethen (2013).
49+
- **`_binary.py`** — Private. `.pcb` portable binary serialization (v0.14). Reading/writing for `ChebyshevApproximation` and `ChebyshevSpline`. Stdlib `struct` + NumPy only.
4950
- **`_jit.py`** — Deprecated Numba JIT kernel with pure NumPy fallback. Used only by deprecated `fast_eval()`.
5051
- **`_version.py`** — Single source of truth for version string.
5152

@@ -79,6 +80,7 @@ Not part of the library. Compare Chebyshev barycentric against alternative metho
7980
- `test_from_values.py` — 65 tests: nodes() and from_values() for ChebyshevApproximation and ChebyshevSpline; bit-identical equivalence with build(); derivatives, calculus, algebra, extrude/slice, save/load; edge cases (NaN/Inf, shape mismatch, 1-node dim, build guard, 4D, boundary eval, negative/wide/tight domains, duplicate knots, algebra chains, domain validation).
8081
- `test_special_points.py` — 37 tests: `ChebyshevApproximation.__new__` dispatch to `ChebyshevSpline` when `special_points` declares any kink (option A, precedent `pathlib.Path`); validation of special_points shape + nested `n_nodes`; 1D/2D correctness (abs kink recovery to machine precision; plateau control); cross-feature (save/load, algebra, integrate, extrude/slice, from_values); edge cases.
8182
- `test_error_threshold.py` — 37 tests: v0.11 auto-N doubling loop, max_n cap, get_optimal_n1, semi-auto mixed-N paths, verbose prints, spline per-piece doubling.
83+
- `test_binary_format.py` — 76 tests: low-level helpers, header parsing, format detection, ChebyshevApproximation round-trip (incl. n=1 dim), ChebyshevSpline round-trip, save/load integration with `format=` kwarg + autodetect, golden vectors, corruption rejection, cross-feature (from_values, algebra, extrude, slice, 5D BS, min n_nodes).
8284

8385
### CI/CD (`.github/workflows/`)
8486

README.md

Lines changed: 2 additions & 0 deletions
@@ -77,6 +77,8 @@ The convergence plots demonstrate exponential error decay as node count increase
7777
- **Vectorized evaluation** using BLAS matrix-vector products (~0.065ms/query)
7878
- **Error-driven construction** — pass `error_threshold=ε` and PyChebyshev auto-picks node counts per dimension
7979
- **Special points in the core API** — declare kinks directly on `ChebyshevApproximation` via `special_points=[[...]]`; auto-dispatches to a piecewise Chebyshev spline.
80+
- **Portable `.pcb` binary format** for cross-language model sharing (C, Rust,
81+
Julia consumers can read PyChebyshev interpolants without Python).
8082
- **Pure Python** — NumPy + SciPy only, no compiled extensions needed
8183

8284
## Acknowledgments

docs/roadmap.md

Lines changed: 1 addition & 1 deletion
@@ -78,7 +78,7 @@ Adds guidance on when to choose **Cross vs SVD vs ALS** as the build method.
7878
**Closes MoCaX gaps:** rank-adaptive ALS, completion sweep, TT inner
7979
product, TT orthogonalization.
8080

81-
## v0.14 — Portable Binary Serialization :material-clock-outline:
81+
## v0.14 — Portable Binary Serialization :material-check:
8282

8383
A language-agnostic `.pcb` binary format alongside the existing pickle-based
8484
save/load. Consumers in C, Rust, or Julia can read PyChebyshev interpolants

docs/user-guide/binary-format.md

Lines changed: 210 additions & 0 deletions
@@ -0,0 +1,210 @@
1+
# Portable Binary Format (`.pcb`)
2+
3+
PyChebyshev v0.14 introduced a portable binary serialization format alongside
4+
the default pickle format. The goal: let consumers in **C, Rust, Julia, or
5+
any other language** read PyChebyshev interpolants without a Python runtime.
6+
7+
The format is intentionally minimal — a fixed header, length-prefixed
8+
sections, raw little-endian `f64` blobs. A C reference reader at
9+
`examples/binary_reader/` runs to ~240 lines.
10+
11+
## When to use which format
12+
13+
| Format | Use when |
14+
|---|---|
15+
| **Pickle** (default) | Python-only round-trips; need full fidelity (build metadata, error caches) |
16+
| **Binary** (`.pcb`) | Cross-language consumers; sharing models with C/Rust/Julia code; long-term archival without Python pickle compatibility risk |
17+
18+
Pickle stays the default because every existing user keeps working with no
19+
change. Opt into binary explicitly:
20+
21+
```python
22+
cheb.save("model.pcb", format='binary') # portable
23+
cheb.save("model.pkl") # pickle (default)
24+
25+
ChebyshevApproximation.load("model.pcb") # auto-detects
26+
```
27+
28+
`load()` sniffs the first 4 bytes — `b"PCB\x00"` routes to the binary
29+
reader, anything else to the pickle reader.
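
The sniffing step described above can be sketched in a few lines. This is an illustration only: `detect_format` is a hypothetical helper, not a function exposed by PyChebyshev, and the real `load()` may structure the check differently.

```python
PCB_MAGIC = b"PCB\x00"  # 4-byte magic header from the format spec


def detect_format(path):
    """Return 'binary' if the file starts with the .pcb magic, else 'pickle'.

    Hypothetical helper for illustration; not part of the PyChebyshev API.
    """
    with open(path, "rb") as fh:
        head = fh.read(4)  # may be shorter than 4 bytes for tiny files
    return "binary" if head == PCB_MAGIC else "pickle"
```

A file shorter than four bytes simply fails the comparison and is routed to the pickle reader, which then reports its own error.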
30+
31+
## Coverage in v0.14
32+
33+
- **`ChebyshevApproximation`** — full support.
34+
- **`ChebyshevSpline`** — full support, with one restriction: the spline
35+
must use **flat** `n_nodes` (a single `int` per dim, shared across pieces).
36+
Splines built with nested per-piece `n_nodes` (the `[[n00, n01], …]` form
37+
introduced for special points) cannot be saved as `.pcb` because the
38+
underlying `ChebyshevSpline.from_values()` factory does not yet support
39+
that shape; use pickle for those.
40+
- **`ChebyshevSlider`**, **`ChebyshevTT`** — pickle only in v0.14.
41+
42+
## Format specification (v1)
43+
44+
All multi-byte fields are **little-endian**. Numeric arrays are raw `f64`
45+
blobs in C-order.
46+
47+
### Header (12 bytes)
48+
49+
```
50+
offset size field
51+
0 4 magic = b"PCB\x00"
52+
4 1 major_version = 1
53+
5 1 minor_version = 0
54+
6 2 class_tag 1 = ChebyshevApproximation, 2 = ChebyshevSpline
55+
8 4 reserved = 0x00000000
56+
```
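
As a rough sketch, the 12-byte header maps onto a single `struct` format string. The function name `parse_header` and the exact error messages are assumptions made for illustration; only the field layout comes from the table above.

```python
import struct

# magic (4s), major (B), minor (B), class_tag (H), reserved (I); "<" = little-endian, no padding
HEADER = struct.Struct("<4sBBHI")  # 12 bytes in total


def parse_header(buf):
    """Validate a v1 .pcb header and return (major, minor, class_tag).

    Hypothetical helper; raises ValueError on anything the spec forbids.
    """
    if len(buf) < HEADER.size:
        raise ValueError("truncated header")
    magic, major, minor, class_tag, reserved = HEADER.unpack_from(buf, 0)
    if magic != b"PCB\x00":
        raise ValueError("bad magic")
    if major != 1:
        raise ValueError(f"unsupported major version {major}")
    if class_tag not in (1, 2):
        raise ValueError(f"unknown class tag {class_tag}")
    if reserved != 0:
        raise ValueError("reserved bytes must be zero in v1")
    return major, minor, class_tag
```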
57+
58+
### `ChebyshevApproximation` body (`class_tag = 1`)
59+
60+
```
61+
uint32 num_dimensions d
62+
f64[d] domain_lo [a_0, ..., a_{d-1}]
63+
f64[d] domain_hi [b_0, ..., b_{d-1}]
64+
uint32[d] n_nodes [n_0, ..., n_{d-1}]
65+
f64[prod(n_nodes)] tensor_values C-order
66+
```
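
A reader for this body layout can be sketched with the stdlib alone. The function name and the returned dictionary are hypothetical and are not the API of `_binary.py`; the sketch assumes the body starts right after the 12-byte header.

```python
import math
import struct


def parse_approximation_body(buf, offset=12):
    """Parse a class_tag = 1 body; return (fields, end_offset).

    Hypothetical helper for illustration. All fields are little-endian.
    """
    try:
        (d,) = struct.unpack_from("<I", buf, offset)
        offset += 4
        lo = struct.unpack_from(f"<{d}d", buf, offset)
        offset += 8 * d
        hi = struct.unpack_from(f"<{d}d", buf, offset)
        offset += 8 * d
        n_nodes = struct.unpack_from(f"<{d}I", buf, offset)
        offset += 4 * d
        count = math.prod(n_nodes)  # number of f64 entries in the tensor
        values = struct.unpack_from(f"<{count}d", buf, offset)
        offset += 8 * count
    except struct.error as exc:
        raise ValueError("truncated or malformed body") from exc
    fields = {"lo": lo, "hi": hi, "n_nodes": n_nodes, "values": values}
    return fields, offset
```

Returning the end offset lets a caller tolerate trailing bytes, in line with the versioning policy for optional trailing fields.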
67+
68+
`barycentric_weights` and `diff_matrices` are **not** stored; they are
69+
recomputed from `(domain, n_nodes)` on load (they are pure functions of
70+
those primitives).
71+
72+
### `ChebyshevSpline` body (`class_tag = 2`)
73+
74+
```
75+
uint32 num_dimensions d
76+
f64[d] domain_lo
77+
f64[d] domain_hi
78+
uint32[d] n_nodes shared across pieces
79+
uint32[d] num_knots_per_dim [k_0, ..., k_{d-1}]
80+
f64[k_0 + ... + k_{d-1}] knots_concatenated flat, dim-by-dim
81+
uint32 num_pieces P = prod(k_i + 1)
82+
83+
# P piece blocks, in C-order over the piece grid:
84+
for p in 0..P-1:
85+
f64[prod(n_nodes)] tensor_values_p C-order
86+
```
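
The spline body can be walked the same way. The sketch below is an assumption-laden illustration (hypothetical function name, hypothetical return shape); it follows the field order listed above and checks that `num_pieces` matches `prod(k_i + 1)`.

```python
import math
import struct


def parse_spline_body(buf, offset=12):
    """Parse a class_tag = 2 body; return (fields, end_offset). Illustrative only."""

    def take(fmt):
        # Read one little-endian group and advance the cursor.
        nonlocal offset
        try:
            out = struct.unpack_from("<" + fmt, buf, offset)
        except struct.error as exc:
            raise ValueError("truncated or malformed body") from exc
        offset += struct.calcsize("<" + fmt)
        return out

    (d,) = take("I")
    lo, hi = take(f"{d}d"), take(f"{d}d")
    n_nodes = take(f"{d}I")
    num_knots = take(f"{d}I")
    flat = take(f"{sum(num_knots)}d")
    knots, pos = [], 0
    for k in num_knots:  # split the concatenated knots dim-by-dim
        knots.append(list(flat[pos:pos + k]))
        pos += k
    (num_pieces,) = take("I")
    if num_pieces != math.prod(k + 1 for k in num_knots):
        raise ValueError("num_pieces does not match the knot counts")
    per_piece = math.prod(n_nodes)
    pieces = [take(f"{per_piece}d") for _ in range(num_pieces)]
    fields = {"lo": lo, "hi": hi, "n_nodes": n_nodes, "knots": knots, "pieces": pieces}
    return fields, offset
```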
87+
88+
### Versioning policy
89+
90+
- New required fields → bump **major**. v1 readers reject `major != 1`.
91+
- New optional trailing fields → bump **minor**. v1 readers ignore unknown
92+
trailing data.
93+
- Reserved header bytes MUST be zero in v1.
94+
95+
## Worked example: `f(x,y) = x + y`
96+
97+
Python side:
98+
99+
```python
100+
from pychebyshev import ChebyshevApproximation
101+
102+
cheb = ChebyshevApproximation(
103+
function=lambda pt, _: pt[0] + pt[1],
104+
num_dimensions=2,
105+
domain=[(-1.0, 1.0), (-1.0, 1.0)],
106+
n_nodes=[3, 3],
107+
)
108+
cheb.build()
109+
cheb.save("xy.pcb", format='binary')
110+
```
111+
112+
The resulting file is exactly **128 bytes**:
113+
114+
```
115+
12 header
116+
4 num_dimensions = 2
117+
16 domain_lo = [-1.0, -1.0]
118+
16 domain_hi = [ 1.0, 1.0]
119+
8 n_nodes = [3, 3]
120+
72 tensor_values (3 × 3 f64)
121+
```
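
The byte count can be checked independently by packing the same fields with `struct`. This is a sketch, not the library's writer: `pack_approximation` is a hypothetical name, and the tensor values are sampled at textbook first-kind nodes in an assumed order, so the bytes need not match what PyChebyshev writes. Only the length is guaranteed by the layout.

```python
import math
import struct


def pack_approximation(lo, hi, n_nodes, values):
    """Pack a class_tag = 1 file per the v1 layout (illustrative writer)."""
    d = len(n_nodes)
    if len(values) != math.prod(n_nodes):
        raise ValueError("tensor size does not match n_nodes")
    header = struct.pack("<4sBBHI", b"PCB\x00", 1, 0, 1, 0)
    body = struct.pack(f"<I{d}d{d}d{d}I", d, *lo, *hi, *n_nodes)
    tensor = struct.pack(f"<{len(values)}d", *values)
    return header + body + tensor


# Assumed node formula and ordering; placeholder values for f(x, y) = x + y.
nodes = [math.cos((2 * k + 1) * math.pi / 6) for k in range(3)]
values = [x + y for x in nodes for y in nodes]  # C-order: last index varies fastest
blob = pack_approximation([-1.0, -1.0], [1.0, 1.0], [3, 3], values)
print(len(blob))  # 128
```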
122+
123+
C reader:
124+
125+
```bash
126+
cd examples/binary_reader
127+
make
128+
./reader ../../xy.pcb 0.3 0.4
129+
# 0.69999999999999996
130+
```
131+
132+
This is the same IEEE-754 double that Python returns (`repr` prints the shortest string that round-trips, so it shows fewer digits):
133+
134+
```python
135+
cheb.eval([0.3, 0.4], [0, 0]) # 0.7
136+
```
137+
138+
The two strings render the same `float64` value `0x3fe6666666666666`. The
139+
C reader prints with `%.17g`, Python with `repr` — they agree bit-for-bit.
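
The equivalence of the two renderings can be confirmed from Python without the C reader. The snippet uses the literal `0.7` rather than an evaluated interpolant, so it demonstrates only the formatting claim.

```python
import struct

value = 0.7
print(repr(value))                     # 0.7
print("%.17g" % value)                 # 0.69999999999999996
print(struct.pack(">d", value).hex())  # 3fe6666666666666 (big-endian for readability)
print(float("0.69999999999999996") == value)  # True
```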
140+
141+
### Spline worked example: `|x|` on `[-1, 1]`
142+
143+
```python
144+
from pychebyshev import ChebyshevSpline
145+
146+
s = ChebyshevSpline(
147+
function=lambda pt, _: abs(pt[0]),
148+
num_dimensions=1,
149+
domain=[(-1.0, 1.0)],
150+
n_nodes=[3],
151+
knots=[[0.0]],
152+
)
153+
s.build()
154+
s.save("abs.pcb", format='binary')
155+
```
156+
157+
The resulting file is exactly **100 bytes**:
158+
159+
```
160+
12 header
161+
4 num_dimensions = 1
162+
8 domain_lo = [-1.0]
163+
8 domain_hi = [ 1.0]
164+
4 n_nodes = [3]
165+
4 num_knots = [1]
166+
8 knots = [0.0]
167+
4 num_pieces = 2
168+
48 piece tensor values (2 pieces × 3 × f64)
169+
```
170+
171+
Two pieces because one knot at `0.0` splits the domain `[-1, 1]` into `[-1, 0]`
172+
and `[0, 1]`. Each piece carries its own 3-node Chebyshev grid.
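
A spline reader needs one extra step before evaluation: finding which piece block a query point belongs to. A minimal sketch follows, assuming pieces are numbered in C-order over the piece grid (as the spec states) and assuming a point lying exactly on a knot goes to the piece on its right (the spec does not say). `piece_index` is a hypothetical name.

```python
from bisect import bisect_right


def piece_index(point, knots):
    """Return the flat C-order index of the piece containing `point`.

    `knots` is a list of sorted knot lists, one per dimension.
    """
    flat = 0
    for x, dim_knots in zip(point, knots):
        i = bisect_right(dim_knots, x)  # interval index in 0..len(dim_knots)
        flat = flat * (len(dim_knots) + 1) + i  # last dimension varies fastest
    return flat


print(piece_index([-0.5], [[0.0]]))  # 0
print(piece_index([0.5], [[0.0]]))   # 1
```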
173+
174+
## Writing a reader in another language
175+
176+
The format is small enough to implement in an afternoon:
177+
178+
1. Read 4 bytes; verify equal to `b"PCB\x00"`.
179+
2. Read major/minor version; reject unknown major.
180+
3. Read class tag; dispatch.
181+
4. For class 1: read `uint32 d`, then `d × f64` for `lo`, `d × f64` for `hi`,
182+
`d × uint32` for `n_nodes`, then `prod(n_nodes) × f64` for tensor values.
183+
5. To evaluate, generate Chebyshev first-kind nodes per dim, compute
184+
barycentric weights from node positions, evaluate by dim-by-dim collapse.
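
Step 5 can be sketched in pure Python. This is a minimal sketch under stated assumptions: the first-kind node formula is the textbook one, the descending node order is a guess (this diff does not show the order in which PyChebyshev lays out the tensor), and every function name is hypothetical.

```python
import math


def cheb1_nodes(n, a, b):
    """Textbook first-kind Chebyshev nodes mapped to [a, b] (order is an assumption)."""
    mid, half = 0.5 * (a + b), 0.5 * (b - a)
    return [mid + half * math.cos((2 * k + 1) * math.pi / (2 * n)) for k in range(n)]


def bary_weights(nodes):
    """Barycentric weights from node positions: w_j = 1 / prod_{k != j} (x_j - x_k)."""
    return [1.0 / math.prod(xj - xk for k, xk in enumerate(nodes) if k != j)
            for j, xj in enumerate(nodes)]


def bary_coeffs(nodes, weights, x):
    """Normalized coefficients c_j such that sum_j c_j * f_j interpolates at x."""
    for j, xj in enumerate(nodes):
        if x == xj:  # avoid division by zero exactly on a node
            return [1.0 if k == j else 0.0 for k in range(len(nodes))]
    terms = [w / (x - xj) for w, xj in zip(weights, nodes)]
    total = sum(terms)
    return [t / total for t in terms]


def evaluate(values, lo, hi, n_nodes, point):
    """Collapse a C-order tensor one dimension at a time, last dimension first."""
    data = list(values)
    for dim in reversed(range(len(n_nodes))):
        n = n_nodes[dim]
        nodes = cheb1_nodes(n, lo[dim], hi[dim])
        coeffs = bary_coeffs(nodes, bary_weights(nodes), point[dim])
        # The last axis is contiguous in C-order, so each chunk of n is one fibre.
        data = [sum(c * v for c, v in zip(coeffs, data[i:i + n]))
                for i in range(0, len(data), n)]
    return data[0]


lo, hi, n_nodes = [-1.0, -1.0], [1.0, 1.0], [3, 3]
grid = [cheb1_nodes(n, a, b) for n, a, b in zip(n_nodes, lo, hi)]
values = [x + y for x in grid[0] for y in grid[1]]
result = evaluate(values, lo, hi, n_nodes, [0.3, 0.4])
```

The sample tensor is built with the same node order the evaluator assumes, so the sketch is self-consistent; a real reader must match whatever order the writer used.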
185+
186+
`examples/binary_reader/reader.c` is the reference. It is intentionally
187+
minimal: ~240 lines, stdlib + `libm` only.
188+
189+
## What the format does not store
190+
191+
These fields are dropped when saving with `format='binary'`:
192+
193+
| Field | Handling |
194+
|---|---|
195+
| `function` | always dropped (also dropped by pickle) |
196+
| `barycentric_weights`, `diff_matrices` | recomputed on load |
197+
| `_cached_error_estimate` | recomputed lazily |
198+
| `build_time`, `n_evaluations`, `method` | not preserved (use pickle for full fidelity) |
199+
| `max_derivative_order` | resets to default `2` on load — re-set manually after `load()` if you need higher orders |
200+
201+
If you need any of those preserved, use pickle.
202+
203+
## Security
204+
205+
The binary reader does no `pickle.loads`-style code execution. It can be
206+
used to load files from untrusted sources — it will reject malformed
207+
files with a `ValueError`.
208+
209+
Pickle remains the default and **does** execute arbitrary code from
210+
loaded files. Treat pickle files like executable code.

examples/binary_reader/.gitignore

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
1+
reader
2+
*.o
3+
*.pcb

examples/binary_reader/Makefile

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
1+
CC ?= cc
2+
CFLAGS ?= -O2 -Wall -Wextra -std=c99
3+
4+
reader: reader.c
5+
$(CC) $(CFLAGS) -o reader reader.c -lm
6+
7+
clean:
8+
rm -f reader
9+
10+
.PHONY: clean

examples/binary_reader/README.md

Lines changed: 60 additions & 0 deletions
@@ -0,0 +1,60 @@
1+
# PyChebyshev `.pcb` Binary Reader (C Reference)
2+
3+
A ~240-line C program that reads a v1 `.pcb` file (`ChebyshevApproximation`
4+
class) and evaluates it at a query point. It is the reference proof that
5+
the format is implementable from scratch in another language. It assumes
6+
a little-endian host (matches the on-disk format); a port to big-endian
7+
hardware would need explicit byte-swapping.
8+
9+
`ChebyshevSpline` is **not** supported here — this is a minimal proof,
10+
not a full port.
11+
12+
## Build
13+
14+
```bash
15+
cd examples/binary_reader
16+
make
17+
```
18+
19+
Requires a C99 compiler (gcc, clang) and `libm`. No third-party deps.
20+
21+
## Usage
22+
23+
First, save a `.pcb` file from Python:
24+
25+
```python
26+
from pychebyshev import ChebyshevApproximation
27+
cheb = ChebyshevApproximation(
28+
function=lambda pt, _: pt[0] + pt[1],
29+
num_dimensions=2,
30+
domain=[(-1.0, 1.0), (-1.0, 1.0)],
31+
n_nodes=[3, 3],
32+
)
33+
cheb.build()
34+
cheb.save("test.pcb", format='binary')
35+
print("Python eval:", cheb.eval([0.3, 0.4], [0, 0]))
36+
```
37+
38+
Then read it from C:
39+
40+
```bash
41+
./reader test.pcb 0.3 0.4
42+
```
43+
44+
The two values should agree to within 1e-12.
45+
46+
## What it covers
47+
48+
- Header parse + validation (magic, major, class tag, reserved bytes)
49+
- ChebyshevApproximation body (num_dim, domain, n_nodes, tensor)
50+
- Chebyshev first-kind node reconstruction
51+
- Barycentric weight computation
52+
- N-D evaluation by dimension-by-dimension collapse
53+
54+
## What it does not cover
55+
56+
- ChebyshevSpline (pieces, knots) — requires an extra outer routing step
57+
- Derivatives — requires the differentiation matrix
58+
- Future format versions
59+
60+
Contributions adding spline support, or porting to Rust/Julia, are welcome.
