| 1 | +# Architecture |
| 2 | + |
| 3 | +A compiler for command-line interface (CLI) specifications. It parses CLI definitions from various sources, optimizes the intermediate representation (IR), solves for minimal parameter bindings, and generates typed wrappers and schemas for multiple target languages. |
| 4 | + |
| 5 | +## Pipeline |
| 6 | + |
| 7 | +```mermaid |
| 8 | +flowchart LR |
| 9 | + subgraph Frontend |
| 10 | + F1[Boutiques] |
| 11 | + F2[Argparse] |
| 12 | + F3[...] |
| 13 | + end |
| 14 | + |
| 15 | + subgraph IR |
| 16 | + IR1[Expr Tree] |
| 17 | + IR2[Passes] |
| 18 | + end |
| 19 | + |
| 20 | + subgraph Solver |
| 21 | + S1[Solve] |
| 22 | + S2[Bindings] |
| 23 | + end |
| 24 | + |
| 25 | + subgraph Backend |
| 26 | + B1[Python] |
| 27 | + B2[R] |
| 28 | + B3[TypeScript] |
| 29 | + B4[JSON Schema] |
| 30 | + end |
| 31 | + |
| 32 | + F1 & F2 & F3 --> IR1 |
| 33 | + IR1 --> IR2 --> S1 --> S2 |
| 34 | + |
| 35 | + S2 --> B1 & B2 & B3 & B4 |
| 36 | + IR2 -.-> B1 & B2 & B3 |
| 37 | +``` |
| 38 | + |
| 39 | +## Core Concepts |
| 40 | + |
| 41 | +| Module | Purpose | |
| 42 | +| ------------- | -------------------------------------------------------------------------------------------------------------------------- | |
| 43 | +| **ir** | Canonical expression tree (Expr = Literal \| Sequence \| Alternative \| Optional \| Repeat \| Int \| Float \| Str \| Path) | |
| 44 | +| **ir/passes** | Optimization passes: flatten, simplify, canonicalize, remove-empty | |
| 45 | +| **bindings** | Solved types (BoundType = scalar \| bool \| count \| literal \| optional \| list \| struct \| union \| nullable) | |
| 46 | +| **solver** | IR → Bindings via pattern matching | |
| 47 | +| **manifest** | Optional metadata: Project > Package > App | |
| 48 | +| **frontend** | Parsers producing IR | |
| 49 | +| **backend** | Code generators consuming IR + Bindings | |
| 50 | + |
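As a rough illustration of the **ir** and **ir/passes** rows, the expression tree can be written as a discriminated union keyed on `kind`, with a pass as a plain tree-to-tree function. Only the variant list and the `kind` discriminant come from this document. The field names (`value`, `items`, `options`, `inner`) and the reading of "flatten" as "splice nested sequences into their parent" are assumptions made for this sketch, not the actual implementation.

```typescript
// Hypothetical shape of the IR; field names are assumptions.
type Expr =
  | { kind: "literal"; value: string }
  | { kind: "sequence"; items: Expr[] }
  | { kind: "alternative"; options: Expr[] }
  | { kind: "optional"; inner: Expr }
  | { kind: "repeat"; inner: Expr }
  | { kind: "int" | "float" | "str" | "path" };

// Assumed meaning of the "flatten" pass: a sequence nested directly inside
// another sequence is spliced into its parent. Other nodes are rebuilt with
// flattened children; literals and terminals are returned unchanged.
function flatten(expr: Expr): Expr {
  switch (expr.kind) {
    case "sequence": {
      const items: Expr[] = [];
      for (const item of expr.items) {
        const flat = flatten(item);
        if (flat.kind === "sequence") {
          items.push(...flat.items);
        } else {
          items.push(flat);
        }
      }
      return { kind: "sequence", items };
    }
    case "alternative":
      return { kind: "alternative", options: expr.options.map(flatten) };
    case "optional":
      return { kind: "optional", inner: flatten(expr.inner) };
    case "repeat":
      return { kind: "repeat", inner: flatten(expr.inner) };
    default:
      return expr;
  }
}

// Usage: two levels of nested sequences collapse into one.
const nested: Expr = {
  kind: "sequence",
  items: [
    { kind: "literal", value: "a" },
    {
      kind: "sequence",
      items: [
        { kind: "literal", value: "b" },
        { kind: "sequence", items: [{ kind: "literal", value: "c" }] },
      ],
    },
  ],
};
const flattened = flatten(nested);
```

Because each pass has the signature `Expr -> Expr`, a pipeline of passes can be expressed as ordinary function composition.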
| 51 | +## Solver Patterns |
| 52 | + |
| 53 | +| IR Pattern | BoundType | |
| 54 | +| ----------------------- | -------------------- | |
| 55 | +| `optional<literal>` | `bool` | |
| 56 | +| `repeat<literal>` | `count` | |
| 57 | +| `optional<T>` | `optional<solve(T)>` | |
| 58 | +| `repeat<T>` | `list<solve(T)>` | |
| 59 | +| `sequence<...named...>` | `struct<...>` | |
| 60 | +| `alternative<...>` | `union<...>` | |
| 61 | +| terminal | `scalar` | |
| 62 | + |
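The pattern table above can be read as a single recursive function. The sketch below is one way to write it, not the code in `solver/solver.ts`. It assumes every IR node may carry an optional `name`, that a sequence's struct fields are its named children, and that a bare literal solves to a `literal` bound type. All field names are hypothetical.

```typescript
// Hypothetical IR shape: field names and the optional `name` are assumptions.
type Expr = { name?: string } & (
  | { kind: "literal"; value: string }
  | { kind: "sequence"; items: Expr[] }
  | { kind: "alternative"; options: Expr[] }
  | { kind: "optional"; inner: Expr }
  | { kind: "repeat"; inner: Expr }
  | { kind: "int" | "float" | "str" | "path" }
);

// Subset of the BoundType variants listed under Core Concepts.
type BoundType =
  | { kind: "bool" }
  | { kind: "count" }
  | { kind: "literal"; value: string }
  | { kind: "scalar"; type: "int" | "float" | "str" | "path" }
  | { kind: "optional"; inner: BoundType }
  | { kind: "list"; item: BoundType }
  | { kind: "struct"; fields: Record<string, BoundType> }
  | { kind: "union"; variants: BoundType[] };

function solve(expr: Expr): BoundType {
  switch (expr.kind) {
    case "optional":
      // optional<literal> -> bool, otherwise optional<solve(T)>
      return expr.inner.kind === "literal"
        ? { kind: "bool" }
        : { kind: "optional", inner: solve(expr.inner) };
    case "repeat":
      // repeat<literal> -> count, otherwise list<solve(T)>
      return expr.inner.kind === "literal"
        ? { kind: "count" }
        : { kind: "list", item: solve(expr.inner) };
    case "sequence": {
      // Assumption: only named children become struct fields.
      const fields: Record<string, BoundType> = {};
      for (const item of expr.items) {
        if (item.name !== undefined) fields[item.name] = solve(item);
      }
      return { kind: "struct", fields };
    }
    case "alternative":
      return { kind: "union", variants: expr.options.map((o) => solve(o)) };
    case "literal":
      // Assumption: a bare literal maps to the `literal` bound type.
      return { kind: "literal", value: expr.value };
    default:
      // Remaining kinds are the terminals: int, float, str, path.
      return { kind: "scalar", type: expr.kind };
  }
}

// Usage: a flag and a repeated path argument.
const spec: Expr = {
  kind: "sequence",
  items: [
    { kind: "literal", value: "tool" },
    { kind: "optional", name: "verbose", inner: { kind: "literal", value: "-v" } },
    { kind: "repeat", name: "inputs", inner: { kind: "path" } },
  ],
};
const bound = solve(spec);
// bound is struct { verbose: bool, inputs: list<scalar path> }
```

The more specific patterns (`optional<literal>`, `repeat<literal>`) are checked before the general ones, mirroring the top-to-bottom order of the table.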
| 63 | +## Design Philosophy |
| 64 | + |
| 65 | +The key architectural improvement over Styx 1 is a clean separation of concerns for backends: |
| 66 | + |
| 67 | +- **Solved bindings -> parametrization** - the solver (`solver/solver.ts`) walks the IR once and pattern-matches into a `BoundType` tree (bool, count, optional, list, struct, union, etc.). Backends translate these into the typed parameter interface that users interact with. |
| 68 | +- **IR -> argument building logic** - the expr tree describes how to construct the command line (sequences, optionals, alternatives, literals). Backends translate the IR into runtime code that assembles CLI invocations, pulling values from the solved parameters to fill each slot. |
| 69 | + |
| 70 | +The IR is the skeleton of the command line; the bindings define the typed interface; the argument builder walks the IR and pulls from the parametrization to assemble the final invocation. In Styx 1, these concerns were entangled: each backend had to re-derive types from the IR via a complex `LanguageProvider` protocol. In Styx 2, backends receive both pieces pre-computed and only translate them into target-language constructs. |
| 71 | + |
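To make the "argument builder walks the IR" half concrete, here is a minimal interpreter-style sketch. A real backend would generate equivalent code in the target language rather than interpret the tree at runtime. The node field names, the optional `name` on each node, and the convention that parameter values are keyed by node name are assumptions for this example. Alternatives are left out to keep it short.

```typescript
// Hypothetical IR shape (alternatives omitted); field names are assumptions.
type Expr = { name?: string } & (
  | { kind: "literal"; value: string }
  | { kind: "sequence"; items: Expr[] }
  | { kind: "optional"; inner: Expr }
  | { kind: "repeat"; inner: Expr }
  | { kind: "int" | "float" | "str" | "path" }
);

// Walks the IR and fills each slot from `value`, which has the shape the
// solver would assign to `expr` (bool, count, list, struct, scalar, ...).
function buildArgs(expr: Expr, value: unknown): string[] {
  switch (expr.kind) {
    case "literal":
      return [expr.value];
    case "sequence": {
      // struct: each named child reads its own field; unnamed children get nothing.
      const fields = (value ?? {}) as Record<string, unknown>;
      const out: string[] = [];
      for (const item of expr.items) {
        const child = item.name === undefined ? undefined : fields[item.name];
        out.push(...buildArgs(item, child));
      }
      return out;
    }
    case "optional":
      // bool or optional<T>: emit nothing when the value is unset or false.
      if (value === undefined || value === null || value === false) return [];
      return buildArgs(expr.inner, value);
    case "repeat": {
      const inner = expr.inner;
      const out: string[] = [];
      if (inner.kind === "literal") {
        // count: repeat the literal `value` times.
        const count = Number(value ?? 0);
        for (let i = 0; i < count; i++) out.push(inner.value);
      } else {
        // list: one expansion of the inner expression per element.
        for (const item of (value ?? []) as unknown[]) {
          out.push(...buildArgs(inner, item));
        }
      }
      return out;
    }
    default:
      // Terminal slot: stringify the scalar value.
      if (value === undefined) {
        throw new Error(`missing value for ${expr.name ?? expr.kind}`);
      }
      return [String(value)];
  }
}

// Usage with a made-up tool definition.
const spec: Expr = {
  kind: "sequence",
  items: [
    { kind: "literal", value: "tool" },
    { kind: "optional", name: "verbose", inner: { kind: "literal", value: "-v" } },
    {
      kind: "optional",
      name: "threshold",
      inner: {
        kind: "sequence",
        items: [
          { kind: "literal", value: "--threshold" },
          { kind: "float", name: "value" },
        ],
      },
    },
    { kind: "repeat", name: "inputs", inner: { kind: "path" } },
  ],
};
const args = buildArgs(spec, {
  verbose: true,
  threshold: { value: 0.5 },
  inputs: ["a.nii", "b.nii"],
});
// args: ["tool", "-v", "--threshold", "0.5", "a.nii", "b.nii"]
```

The parameter object passed to `buildArgs` has exactly the shape the solver patterns assign to this IR (`bool`, `optional<struct>`, `list<scalar>`), which is why the two halves can be handed to a backend independently.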
| 72 | +## Styx 1 vs Styx 2 |
| 73 | + |
| 74 | +| | Styx 1 (Python) | Styx 2 (TypeScript) | |
| 75 | +|---|---|---| |
| 76 | +| **IR** | Dataclass hierarchy (`Param[T]` with body types) | Algebraic expr tree with `kind` discriminant | |
| 77 | +| **Optimization** | Minimal (string merging) | Pass-based pipeline (flatten, simplify, canonicalize) | |
| 78 | +| **Type resolution** | Direct mapping in frontend; each backend re-derives types via language provider protocol | Solver produces a universal `BoundType` tree; backends just translate it | |
| 79 | +| **Backends** | Python mature, TS/R partial; each implements a complex `LanguageProvider` protocol | All stubs (architecture in place); should be simpler since the solver does the heavy lifting | |
| 80 | +| **Output files** | First-class: path templates with param refs, suffix stripping, fallbacks | Not yet modeled to the same degree | |
| 81 | + |
| 82 | +Key Styx 1 features to eventually match: |
| 83 | +- **Output path templates** - `"output-[X].nii.gz"` parsed into literal + `OutputParamReference` tokens with suffix stripping and fallbacks |
| 84 | +- **Conditional groups** - command-line args only emitted when at least one param in the group is set |
| 85 | + |
| 86 | +## Roadmap |
| 87 | + |
| 88 | +Long-term, Boutiques will shift from being the primary frontend to serving mainly as a **backend** (for cross-compatibility and for bootstrapping NiWrap onto the new compiler). Planned frontends: |
| 89 | + |
| 90 | +- **Custom TypeScript-types-like language** - the intended primary way to define CLI specs |
| 91 | +- **Serialized Python argparse** - parse argparse definitions |
| 92 | +- **Boutiques** (current) - remains as both a frontend and a backend |
| 93 | + |
| 94 | +## Ecosystem Context |
| 95 | + |
| 96 | +This compiler is part of the **Styx/NiWrap ecosystem** ([niwrap.dev](https://niwrap.dev/)): |
| 97 | + |
| 98 | +- **Styx compiler** (this repo) - generates type-safe bindings from CLI tool descriptions |
| 99 | +- **NiWrap** - Boutiques descriptors for ~2,000 neuroimaging tools (FSL, FreeSurfer, ANTs, AFNI, MRtrix3, etc.) plus a build pipeline that feeds them through Styx to produce language-specific packages |
| 100 | +- **NiWrap packages** - generated Python/TypeScript/R wrappers with IDE autocompletion and type checking |
| 101 | +- **NiWrap Hub** - interactive web platform for exploring tools and generating code |