Skip to content

Commit 113f63c

Browse files
authored
docs: map existing codebase (#81)
1 parent 2456f01 commit 113f63c

7 files changed

Lines changed: 849 additions & 0 deletions

File tree

.planning/codebase/ARCHITECTURE.md

Lines changed: 130 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,130 @@
1+
# Architecture
2+
3+
**Analysis Date:** 2026-03-18
4+
5+
## Pattern Overview
6+
7+
**Overall:** Layered Go library for native ONNX Runtime interop, with a low-level FFI core in `ort/` and model-specific convenience layers in `embeddings/`.
8+
9+
**Key Characteristics:**
10+
- No-CGO design: native symbols are loaded dynamically through `purego`
11+
- Global runtime lifecycle managed once per process in `ort/environment.go`
12+
- Explicit resource ownership for tensors, sessions, memory info, and embedders
13+
- Higher-level embedding packages build reusable batching and post-processing on top of the raw ORT session API
14+
15+
## Layers
16+
17+
**FFI Runtime Layer (`ort/`):**
18+
- Purpose: map the ONNX Runtime C API into Go and expose safe-ish wrappers for environment, tensor, memory, and session lifecycle
19+
- Contains: `InitializeEnvironment`, `DestroyEnvironment`, `AdvancedSession`, `Tensor[T]`, bootstrap/download logic, OS-specific library loading
20+
- Depends on: `purego`, `unsafe`, native ONNX Runtime shared libraries, and generated `OrtApi` bindings
21+
- Used by: all example programs and all packages under `embeddings/`
22+
23+
**Model Adapter Layer (`embeddings/*`):**
24+
- Purpose: hide raw tensor/session wiring behind model-specific APIs
25+
- Contains: `minilm.Embedder`, `splade.Embedder`, `openclip.Embedder`, session cache management, tokenizer usage, pooling/post-processing
26+
- Depends on: `ort/`, `pure-tokenizers`, and model-specific local artifacts
27+
- Used by: example programs and downstream applications that want dense, sparse, or CLIP embeddings
28+
29+
**Utility Layer (`embeddings/internal/ortutil`):**
30+
- Purpose: small shared helpers for resource cleanup
31+
- Contains: `DestroyAll`
32+
- Depends on: only local interfaces and the Go standard library
33+
- Used by: embedding packages when tearing down grouped ORT resources
34+
35+
**Executable/Tooling Layer (`examples/`, `tools/`, `.github/workflows/`):**
36+
- Purpose: provide runnable demos, generators, and CI/release automation
37+
- Contains: basic/inference/openclip examples, OpenCLIP export tooling, `gen_ortapi.go`, GitHub Actions workflows
38+
- Depends on: lower library layers plus external services such as GitHub and Hugging Face
39+
- Used by: maintainers, CI, and users validating real-world flows
40+
41+
## Data Flow
42+
43+
**Core ORT Inference Flow:**
44+
45+
1. Caller resolves a runtime library path explicitly or via `EnsureOnnxRuntimeSharedLibrary()` in `ort/bootstrap.go`
46+
2. `ort.InitializeEnvironment()` loads the native library, registers function pointers, and creates a process-global ORT environment
47+
3. Caller creates tensors via `ort.NewTensor` / `ort.NewEmptyTensor`
48+
4. Caller constructs an `ort.AdvancedSession` with model path, input names, and output names
49+
5. `AdvancedSession.Run()` acquires session-local and global runtime locks, converts names/values into native handles, and invokes ORT
50+
6. Callers read tensor-backed output slices and then destroy sessions/tensors/environment in reverse order
51+
52+
**Embedding Flow:**
53+
54+
1. Caller initializes `ort` once for the process
55+
2. Embedder loads tokenizer/model metadata and validates artifact paths
56+
3. Inputs are tokenized or preprocessed into reusable backing buffers
57+
4. Package-specific session caches keyed by batch size reuse tensors and `AdvancedSession` instances
58+
5. Post-processing converts raw outputs into dense vectors, sparse vectors, or CLIP similarity matrices
59+
60+
**Bootstrap/Asset Resolution Flow:**
61+
62+
1. Bootstrap resolves cache directory and target artifact names from platform/runtime config
63+
2. Downloads are guarded by file/process locks
64+
3. Archives/files are validated for size, checksum, and path traversal
65+
4. Resolved local file paths are returned to the caller for normal `ort` or embedding setup
66+
67+
**State Management:**
68+
- Process-global ORT state lives in package globals in `ort/environment.go`
69+
- Session-level mutable state lives on `AdvancedSession` and embedder caches
70+
- Persistent state is file-based only (user cache directories and generated artifacts)
71+
72+
## Key Abstractions
73+
74+
**`AdvancedSession`:**
75+
- Purpose: wrap an ONNX Runtime session plus fixed input/output bindings
76+
- Examples: `ort/session.go`, used directly by `examples/inference/main.go` and all embedding packages
77+
- Pattern: stateful handle wrapper with explicit `Run()` / `Destroy()`
78+
79+
**`Tensor[T]`:**
80+
- Purpose: represent ORT values backed by Go slices pinned for native access
81+
- Examples: `ort/tensor.go`
82+
- Pattern: generic resource wrapper with finalizer safety net and explicit destroy semantics
83+
84+
**`BootstrapOption` / embedding `Option`:**
85+
- Purpose: configure bootstrap and embedder behavior without large constructors
86+
- Examples: `ort/bootstrap.go`, `embeddings/minilm/embedder.go`, `embeddings/splade/embedder.go`, `embeddings/openclip/embedder.go`
87+
- Pattern: functional options
88+
89+
## Entry Points
90+
91+
**Library Entry Points:**
92+
- `ort/environment.go` - global runtime initialization and teardown
93+
- `ort/session.go` - model session creation and inference
94+
- `ort/tensor.go` - tensor allocation and lifecycle
95+
96+
**Example Programs:**
97+
- `examples/basic/main.go` - minimal runtime initialization example
98+
- `examples/inference/main.go` - end-to-end single-model inference flow driven by env vars
99+
- `examples/openclip/main.go` - OpenCLIP text/image embedding demo with manifest-backed fixtures
100+
101+
**Tooling:**
102+
- `tools/gen_ortapi.go` - regenerates `ort/ortapi_generated.go` from ONNX Runtime headers
103+
104+
## Error Handling
105+
106+
**Strategy:** return Go `error` values aggressively, validate inputs early, and use deferred cleanup to keep native handles from leaking.
107+
108+
**Patterns:**
109+
- Wrap lower-level failures with `fmt.Errorf(...: %w)`
110+
- Translate ORT `OrtStatus` handles into Go strings via helper functions in `ort/environment.go`
111+
- Use `errors.Join` when cleanup steps can fail independently
112+
- Treat nil receivers as safe no-ops for most `Destroy()` methods
113+
114+
## Cross-Cutting Concerns
115+
116+
**Concurrency:**
117+
- `ort/environment.go` defines a lock hierarchy spanning global runtime state, session runs, and tensor lifetimes
118+
- Several tests in `ort/session_test.go` and `ort/environment_test.go` exist specifically to protect these invariants
119+
120+
**Memory Management:**
121+
- The code pins Go slice backing arrays while native ORT uses them (`ort/tensor.go`)
122+
- Finalizers are present as leak backstops, but the design still expects explicit `Destroy()` calls
123+
124+
**Security and Integrity:**
125+
- Bootstrap code validates checksums, path traversal, redirect safety, and download size limits in both `ort/bootstrap.go` and `embeddings/openclip/bootstrap.go`
126+
127+
---
128+
129+
*Architecture analysis: 2026-03-18*
130+
*Update when major patterns change*

.planning/codebase/CONCERNS.md

Lines changed: 105 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,105 @@
1+
# Codebase Concerns
2+
3+
**Analysis Date:** 2026-03-18
4+
5+
## Tech Debt
6+
7+
**Generated ORT API binding pipeline:**
8+
- Issue: `tools/gen_ortapi.go` uses regex parsing of upstream headers instead of a real C parser
9+
- Why: simple generation was fast to bootstrap
10+
- Impact: upstream header format changes could silently break `ort/ortapi_generated.go` generation or field ordering
11+
- Fix approach: replace the parser with a proper C AST approach or add stronger structural validation around generator output
12+
13+
**Stale maintainer guidance in `CLAUDE.md`:**
14+
- Issue: the file still references `main2.go` as the main implementation entry point, but that file does not exist in the current tree
15+
- Why: documentation drift as the repo evolved
16+
- Impact: maintainers or automation may follow outdated guidance
17+
- Fix approach: update `CLAUDE.md` to match the current `examples/`-based layout and active packages
18+
19+
**Partially implemented wrapper types:**
20+
- Issue: `ort/types.go` still contains placeholder `Status.GetErrorCode()` and `Status.GetErrorMessage()` implementations
21+
- Why: the public API surface was sketched before all wrappers were wired to real ORT calls
22+
- Impact: contributors can mistake these types for production-ready abstractions
23+
- Fix approach: either finish the implementation or clearly de-emphasize/remove unused wrapper surfaces
24+
25+
## Known Bugs
26+
27+
**Incorrect usage docs are possible around version support:**
28+
- Symptoms: docs and code can drift between `DefaultOnnxRuntimeVersion`, CI-pinned runtime versions, and README examples
29+
- Trigger: bumping runtime versions in one place but not the others
30+
- Workaround: treat `ort/bootstrap.go`, `.github/workflows/ci.yml`, and `README.md` as a synchronized set during upgrades
31+
- Root cause: multiple version pins live in different files for different purposes
32+
33+
## Security Considerations
34+
35+
**Native binary bootstrap paths must stay hardened:**
36+
- Risk: relaxing checksum, redirect, or path traversal checks in `ort/bootstrap.go` or `embeddings/openclip/bootstrap.go` would expose consumers to malicious binary/model downloads
37+
- Current mitigation: checksum validation, redirect policy enforcement, size limits, path sanitization, and lock-guarded extraction
38+
- Recommendations: preserve these checks during refactors and add tests first when changing bootstrap behavior
39+
40+
**Credential-bearing env vars are handled by runtime/test code:**
41+
- Risk: `GITHUB_TOKEN`, `GH_TOKEN`, `HF_TOKEN`, and release secrets could be leaked through unsafe logging or insecure redirects
42+
- Current mitigation: token use is limited and OpenCLIP bootstrap rejects insecure token-bearing base URLs
43+
- Recommendations: keep logs free of raw env values and treat any new download URL override as security-sensitive
44+
45+
## Performance Bottlenecks
46+
47+
**Session creation remains expensive compared with warm reuse:**
48+
- Problem: embedder packages build LRU caches to avoid repeatedly creating ORT sessions per batch size
49+
- Measurement: the existence of dedicated benchmarks in `ort/session_benchmark_test.go` and `embeddings/splade/benchmark_test.go` indicates session setup cost is material
50+
- Cause: native session creation and tensor wiring are heavier than reuse
51+
- Improvement path: preserve cache hit paths and benchmark any refactor that touches session lifecycle
52+
53+
**Bootstrap and integration flows download large artifacts:**
54+
- Problem: real-model tests and bootstrap code move hundreds of MB of runtime/model data
55+
- Measurement: CI allocates dedicated cache keys and download steps in `.github/workflows/ci.yml`
56+
- Cause: native runtime archives and model bundles are large by nature
57+
- Improvement path: keep cache keys stable, avoid redundant downloads, and be careful with checksum/version churn
58+
59+
## Fragile Areas
60+
61+
**Global ORT lifecycle and lock ordering:**
62+
- Why fragile: `ort/environment.go`, `ort/session.go`, and `ort/tensor.go` coordinate multiple locks and native handles with a strict ordering contract
63+
- Common failures: deadlocks, use-after-destroy bugs, or release calls after environment teardown
64+
- Safe modification: change these paths only with concurrency tests running and keep lock-order comments accurate
65+
- Test coverage: strong relative coverage in `ort/environment_test.go` and `ort/session_test.go`, but still high-risk code
66+
67+
**Embedder tensor/session cache internals:**
68+
- Why fragile: `embeddings/minilm`, `embeddings/splade`, and `embeddings/openclip` reuse backing slices and tensor handles across runs
69+
- Common failures: accidental slice reallocation, stale cache entries, or broken LRU eviction
70+
- Safe modification: preserve buffer ownership assumptions and run integration plus cache-behavior tests after changes
71+
- Test coverage: good package-local tests, but behavior still depends on real model contracts
72+
73+
## Scaling Limits
74+
75+
**Single-process global runtime model:**
76+
- Current capacity: one process-global ORT environment shared across sessions
77+
- Limit: alternative multi-runtime or per-tenant isolation models are not represented in the current architecture
78+
- Symptoms at limit: awkward lifecycle coordination for complex host applications
79+
- Scaling path: introduce a more explicit runtime/handle ownership model only with careful API redesign
80+
81+
## Dependencies at Risk
82+
83+
**Pure FFI dependence on ONNX Runtime ABI stability:**
84+
- Risk: upstream ABI or packaging changes can break symbol registration, archive naming, or runtime expectations
85+
- Impact: bootstrap, initialization, or session creation failures across supported platforms
86+
- Migration plan: keep `internal/c_api/` snapshots, generator output, CI runtime versions, and bootstrap rules aligned when upgrading
87+
88+
## Test Coverage Gaps
89+
90+
**Python tooling is effectively outside the Go test matrix:**
91+
- What's not tested: `tools/openclip_export_onnx.py`, `tools/openclip_generate_golden.py`, and `tools/splade_generate_golden.py`
92+
- Risk: export or dataset-generation regressions may not be caught until maintainers run them manually
93+
- Priority: medium
94+
- Difficulty to test: requires Python environment setup, heavyweight ML dependencies, and artifact generation time
95+
96+
**Release workflow behavior is mostly validated indirectly:**
97+
- What's not tested: end-to-end `.github/workflows/release.yml` behavior against real R2 credentials and signed artifact publishing
98+
- Risk: release-only failures can slip through normal PR CI
99+
- Priority: medium
100+
- Difficulty to test: depends on secrets, tag pushes, and external infrastructure
101+
102+
---
103+
104+
*Concerns audit: 2026-03-18*
105+
*Update as issues are fixed or new ones are discovered*

.planning/codebase/CONVENTIONS.md

Lines changed: 116 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,116 @@
1+
# Coding Conventions
2+
3+
**Analysis Date:** 2026-03-18
4+
5+
## Naming Patterns
6+
7+
**Files:**
8+
- lower_snake_case for implementation files: `environment.go`, `bootstrap_lock_unix.go`, `golden_dataset_parity_test.go`
9+
- `*_test.go` for all test files, often with specialized suffixes like `*_integration_test.go` or `*_benchmark_test.go`
10+
- generated code is named explicitly with `_generated`: `ort/ortapi_generated.go`
11+
12+
**Functions:**
13+
- Exported APIs use Go-standard PascalCase: `InitializeEnvironment`, `EnsureDefaultAssets`, `WithSequenceLength`
14+
- Unexported helpers use lowerCamelCase: `resolveRuntimeArtifact`, `validateFilePath`, `setupORTTestEnvironment`
15+
- Functional option constructors consistently start with `With` / `Without`
16+
17+
**Variables:**
18+
- lowerCamelCase for locals and fields
19+
- package-level constants are mostly PascalCase when exported and lowerCamelCase when internal
20+
- all-caps names are reserved for compatibility constants or env-oriented identifiers such as `ORT_API_VERSION`
21+
22+
**Types:**
23+
- PascalCase for structs, interfaces, and enums: `AdvancedSession`, `MemoryInfo`, `PoolingStrategy`
24+
- Package names stay short and lowercase: `ort`, `minilm`, `splade`, `openclip`
25+
26+
## Code Style
27+
28+
**Formatting:**
29+
- `gofmt` and `goimports` are the formatting source of truth (`.golangci.yml`, `Makefile`)
30+
- Standard Go formatting conventions apply: tabs, grouped imports, no manual alignment
31+
- Error strings are generally lowercase and sentence-fragment style
32+
33+
**Linting:**
34+
- `go vet`, `golangci-lint`, and `gosec` are the main static checks
35+
- `precommit` in `Makefile` mirrors CI blockers, with opt-out env vars for local workflows
36+
- `.golangci.yml` keeps the rule set small and targeted instead of enabling everything
37+
38+
## Import Organization
39+
40+
**Order:**
41+
1. Go standard library imports
42+
2. Repository-local imports such as `github.com/amikos-tech/pure-onnx/ort`
43+
3. Third-party imports such as `github.com/ebitengine/purego`
44+
45+
**Grouping:**
46+
- One blank line between import groups
47+
- No path aliases beyond short clarity-driven aliases like `tokenizers`
48+
- Side-effect imports are only used when required, for example image codecs in `examples/openclip/main.go`
49+
50+
**Path Aliases:**
51+
- None; imports use full module paths
52+
53+
## Error Handling
54+
55+
**Patterns:**
56+
- Validate aggressively at function entry and return early on bad inputs
57+
- Wrap underlying failures with `fmt.Errorf(...: %w)`
58+
- Join cleanup failures with `errors.Join` where multiple destroy steps can fail
59+
- Native-handle wrappers usually treat nil receivers as safe no-ops on `Destroy()`
60+
61+
**Error Types:**
62+
- Most code returns plain `error` rather than custom error structs
63+
- A few package-private sentinel or wrapper types exist where retry/permanence matters, such as `permanentBootstrapError` in `ort/bootstrap.go`
64+
- Tests assert on message content frequently, so wording changes can be user-visible
65+
66+
## Logging
67+
68+
**Framework:**
69+
- Standard library `log` package only
70+
71+
**Patterns:**
72+
- Library code logs sparingly, mainly for warnings or bootstrap path visibility
73+
- Examples use `log.Fatal` / `log.Printf` for CLI-style behavior
74+
- There is no structured logging abstraction
75+
76+
## Comments
77+
78+
**When to Comment:**
79+
- Explain FFI safety assumptions, lock ordering, and lifetime rules
80+
- Document why `unsafe` usage is acceptable in specific places
81+
- Keep obvious code uncommented
82+
83+
**Doc Comments:**
84+
- Exported types, constants, and functions are usually documented
85+
- Internal helpers receive comments when concurrency, lifecycle, or security behavior is non-obvious
86+
87+
**Security/Lint Markers:**
88+
- `#nosec` annotations are used narrowly to justify intentional `unsafe` pointer conversions or checksum-like constants
89+
90+
## Function Design
91+
92+
**Size:**
93+
- Public APIs tend to be moderate-sized with early validation and deferred cleanup
94+
- The longest functions are concentrated in bootstrap/download code and embedder setup paths
95+
96+
**Parameters:**
97+
- Constructors prefer explicit required parameters plus functional options
98+
- Helpers avoid large configuration structs at call sites unless state needs to persist
99+
100+
**Return Values:**
101+
- APIs nearly always return `(value, error)` or `error`
102+
- Cleanup methods return `error` rather than panicking
103+
104+
## Module Design
105+
106+
**Exports:**
107+
- Packages expose focused public APIs and keep helpers unexported
108+
- `internal/` is used only where cross-package reuse should remain private
109+
110+
**Generated Code:**
111+
- `ort/ortapi_generated.go` is treated as generated output; changes should come from `tools/gen_ortapi.go` and header inputs, not hand edits
112+
113+
---
114+
115+
*Convention analysis: 2026-03-18*
116+
*Update when patterns change*

0 commit comments

Comments
 (0)