Refresh documentation for the single-pass multi-granularity converter. README.md files across the monorepo (root, `@lexbuild/cli`, `@lexbuild/usc`, `@lexbuild/ecfr`, `@lexbuild/core`, `apps/astro`) now document the `--granularities <list>` and `--output-<granularity>` flags on `convert-usc`/`convert-ecfr`, the new `granularities` option on `convertTitle`/`convertEcfrTitle`, the builder's `ReadonlySet<LevelType>` emit mode, and the updated `update-usc.sh`/`update-ecfr.sh` single-invocation pattern. Public docs on the Astro site and internal docs under `.claude/internal/docs/` were updated in parallel. No code changes in this release — the bump exists to ship the refreshed package README copy to npm.
CLAUDE.md
Lines changed: 1 addition & 0 deletions
@@ -303,6 +303,7 @@ Note: identifiers use `/us/cfr/` (content type) not `/us/ecfr/` (data source). B
 - **VPS PM2 logs live at `/home/ubuntu/pm2/logs/lexbuild/`**, not `~/.pm2/logs/`. The latter is legacy; only `pm2-logrotate-out.log` still writes there. Check the new path when debugging PM2-managed services.
 - **VPS has 6 GiB swap** at `/swapfile` (persisted in `/etc/fstab`). Added as defense against Meilisearch OOM during bulk upserts on a 7.6 GiB RAM Lightsail box. Don't remove.
 - **Stuck Meilisearch tasks crash-loop across restarts**: document-addition tasks that OOM Meilisearch are persisted in LMDB and re-attempted after every PM2 restart (observed ~60s crash cycle, 160+ restarts in 2.5 hours). Cancel via `curl -XPOST -H "Authorization: Bearer $MEILI_MASTER_KEY" "http://127.0.0.1:7700/tasks/cancel?uids=<list>"`; the cancellation typically executes during a healthy window even if the stuck task itself can't complete.
+- **`_meta.json` / `README.md` carry wall-clock timestamps**: Converter outputs include a `generated_at` field. Byte-parity tests comparing outputs across runs must skip these files (assert existence, not content).
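The byte-parity rule in the last bullet can be sketched as a small comparison helper. This is a minimal illustration, not the project's actual test code: `OutputTree`, `findParityMismatches`, and the in-memory representation (relative path mapped to file content) are all assumptions made for the example.

```typescript
// Hypothetical in-memory view of a converter output tree:
// keys are relative paths, values are file contents.
type OutputTree = Readonly<Record<string, string>>;

// Files that carry a wall-clock `generated_at` field, per the note above.
const TIMESTAMPED_BASENAMES: readonly string[] = ["_meta.json", "README.md"];

function baseName(path: string): string {
  const idx = path.lastIndexOf("/");
  return idx === -1 ? path : path.slice(idx + 1);
}

// Returns human-readable mismatches; an empty array means the trees match.
function findParityMismatches(expected: OutputTree, actual: OutputTree): string[] {
  const mismatches: string[] = [];
  for (const path of Object.keys(expected)) {
    if (!(path in actual)) {
      mismatches.push(`missing: ${path}`);
      continue;
    }
    // Timestamped files: assert existence only, never content.
    if (TIMESTAMPED_BASENAMES.indexOf(baseName(path)) !== -1) continue;
    if (actual[path] !== expected[path]) mismatches.push(`differs: ${path}`);
  }
  for (const path of Object.keys(actual)) {
    if (!(path in expected)) mismatches.push(`unexpected: ${path}`);
  }
  return mismatches;
}
```

Timestamped files still have to exist in both trees, so a run that silently stops writing `_meta.json` is reported as a mismatch even though its content is never compared.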
README.md
Lines changed: 26 additions & 0 deletions
@@ -143,6 +143,8 @@ Update scripts handle change detection, download, convert, and deploy in one com
 ./scripts/update-usc.sh --skip-deploy
 ```

+`update-usc.sh` and `update-ecfr.sh` convert every granularity in one parse using the `--granularities` flag (see below), so the convert step no longer scales with the number of output granularities.
+
 ---

 ## Commands
@@ -173,6 +175,13 @@ lexbuild convert-usc --all # All downloaded ti
apps/astro/README.md
Lines changed: 24 additions & 4 deletions
@@ -22,16 +22,36 @@ The web application for [LexBuild](https://github.com/chris-c-thomas/LexBuild)
 - **Node.js** >= 22
 - **pnpm** >= 10
-- **Converted content**: run the CLI to generate Markdown files before starting the app:
+- **Converted content**: run the CLI to generate Markdown files before starting the app. The app browses every granularity (section, chapter, title, and part for eCFR), so emit them all in a single parse using `--granularities`:
apps/astro/src/content/docs/architecture/conversion-pipeline.md
Lines changed: 18 additions & 0 deletions
@@ -90,6 +90,24 @@ const builder = new ASTBuilder({
 Levels above the emit level (for example, `title` and `chapter` when emitting at `section`) are tracked as lightweight `AncestorInfo` objects containing just the level type, identifier, number, and heading. Their child subtrees are never accumulated in memory.

+### Multi-Level Emit
+
+`emitAt` also accepts a `ReadonlySet<LevelType>`. Deeper levels fire first (sections before their containing title), and emitted nodes remain attached to their parents so a higher-level emission sees the full subtree. Attach-to-parent is gated by "any enclosing stack frame is itself an emit target". This live stack check is what keeps the logic correct for USLM's permissive level nesting (for example, an appendix inside a part), where hierarchy index ordering would be misleading.
+The converter uses this to collect per-level buckets in a single pass and write every requested granularity from the matching bucket. This is how `--granularities` on the CLI produces multiple output trees without re-parsing the XML.
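The attach-to-parent rule described above can be sketched with a small stack-based builder. This is an illustration under stated assumptions, not the real `ASTBuilder`: the class name `MultiLevelBuilder`, its `open`/`close` methods, and the `LevelNode` shape are hypothetical, and the `LevelType` union lists only the levels needed for the example.

```typescript
// Assumption: a reduced set of levels, just enough for the example.
type LevelType = "title" | "chapter" | "part" | "appendix" | "section";

interface LevelNode {
  level: LevelType;
  identifier: string;
  children: LevelNode[];
}

// Hypothetical builder driven by open/close events from an XML parser.
class MultiLevelBuilder {
  private readonly stack: LevelNode[] = [];
  private readonly emitAt: ReadonlySet<LevelType>;
  private readonly onEmit: (node: LevelNode) => void;

  constructor(emitAt: ReadonlySet<LevelType>, onEmit: (node: LevelNode) => void) {
    this.emitAt = emitAt;
    this.onEmit = onEmit;
  }

  open(level: LevelType, identifier: string): void {
    this.stack.push({ level, identifier, children: [] });
  }

  close(): void {
    const node = this.stack.pop();
    if (node === undefined) throw new Error("close() called with an empty stack");

    // Live stack check: is any enclosing frame itself an emit target?
    const parent = this.stack[this.stack.length - 1];
    const enclosedByTarget = this.stack.some((frame) => this.emitAt.has(frame.level));
    if (parent !== undefined && enclosedByTarget) {
      // Stay attached so the higher-level emission sees the full subtree.
      parent.children.push(node);
    }

    // Close events arrive bottom-up, so deeper levels fire first.
    if (this.emitAt.has(node.level)) this.onEmit(node);
  }
}
```

Because the check walks the live stack rather than comparing positions in a fixed hierarchy, an appendix nested inside a part is retained whenever the part (or anything above it) is an emit target, and dropped otherwise.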
+
 ## The Collect-Then-Write Pattern

 Both USC and eCFR converters collect all emitted nodes synchronously during parsing, then write files after parsing completes. The collect phase pushes `{ node, context }` pairs into an array; the write phase iterates this array to render and write each file.
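A minimal sketch of the two phases, extended to per-granularity buckets, might look like the following. Everything here is assumed for illustration: the `Collector` class, the `EmittedNode` and `EmitContext` shapes, the `render` function, and the `dir/ancestors/identifier.md` path layout are hypothetical, and writes go through an injected sink so the example stays free of filesystem calls.

```typescript
type Granularity = "section" | "chapter" | "title" | "part";

// Hypothetical shapes; the real node and context types are richer.
interface EmittedNode { identifier: string; text: string; }
interface EmitContext { ancestors: string[]; }
interface Collected { node: EmittedNode; context: EmitContext; }

// Injected writer, e.g. a wrapper around a filesystem call.
type FileSink = (path: string, content: string) => void;

function render(node: EmittedNode): string {
  return `# ${node.identifier}\n\n${node.text}\n`;
}

class Collector {
  private readonly buckets = new Map<Granularity, Collected[]>();

  // Collect phase: called synchronously from the parser's emit callback.
  collect(level: Granularity, node: EmittedNode, context: EmitContext): void {
    const bucket = this.buckets.get(level) ?? [];
    bucket.push({ node, context });
    this.buckets.set(level, bucket);
  }

  // Write phase: runs once parsing is complete; returns the file count.
  writeAll(outputDirs: Partial<Record<Granularity, string>>, sink: FileSink): number {
    let written = 0;
    this.buckets.forEach((bucket, level) => {
      const dir = outputDirs[level];
      if (dir === undefined) return; // this granularity was not requested
      for (const { node, context } of bucket) {
        const path = [dir, ...context.ancestors, `${node.identifier}.md`].join("/");
        sink(path, render(node));
        written += 1;
      }
    });
    return written;
  }
}
```

Keeping the collect phase synchronous means the parser never waits on I/O, and the write phase can be reordered or parallelized without touching parsing.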
apps/astro/src/content/docs/cli/commands.md
Lines changed: 28 additions & 1 deletion
@@ -52,6 +52,10 @@ These flags are available on all convert commands:
 |---|---|---|
 |`-o, --output <dir>`|`./output`| Output directory for converted Markdown |
 |`-g, --granularity <level>`|`section`| Output granularity (varies by source) |
+|`--granularities <list>`| -- | Comma-separated granularities for single-pass multi-granularity output. Mutually exclusive with `-g`. |
+|`--output-chapter <dir>`| -- | Output directory for chapter granularity (when using `--granularities`) |
+|`--output-title <dir>`| -- | Output directory for title granularity (when using `--granularities`) |
+|`--output-part <dir>`| -- | Output directory for part granularity, eCFR only (when using `--granularities`) |
 |`--link-style <style>`|`plaintext`| Cross-reference link style: `plaintext`, `relative`, or `canonical`|
 |`--dry-run`| off | Parse and report statistics without writing files |
 |`-v, --verbose`| off | Print detailed output including file paths |
@@ -63,6 +67,29 @@ These flags are available on all convert commands:
 > [!NOTE]
 > Setting any selective note flag (`--include-editorial-notes`, `--include-statutory-notes`, `--include-amendments`) automatically disables the broad `--include-notes` flag.

+### Multi-Granularity Single-Pass Mode
+
+Use `--granularities` to emit several granularity levels from a single parse of the source XML. Each listed granularity needs a matching output directory: section uses `--output` (or `--output-section`); chapter, title, and part (eCFR only) each take their own `--output-<granularity>` flag.
+
+```bash
+# USC: three granularities in one parse
+lexbuild convert-usc --all \
+  --granularities section,title,chapter \
+  --output ./output \
+  --output-title ./output-title \
+  --output-chapter ./output-chapter
+
+# eCFR: four granularities in one parse
+lexbuild convert-ecfr --all \
+  --granularities section,title,chapter,part \
+  --output ./output \
+  --output-title ./output-title \
+  --output-chapter ./output-chapter \
+  --output-part ./output-part
+```
+
+`--granularities` is mutually exclusive with `-g/--granularity`. The builder parses the XML once and emits at every requested level, so multi-granularity runs are roughly 40–50% faster than N separate single-granularity invocations.
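The flag rules above (mutual exclusivity, one output directory per listed granularity, `part` limited to eCFR) can be expressed as a small validation step. This is a sketch only and not the CLI's actual option handling: `ConvertFlags`, `resolveOutputDirs`, and the error messages are hypothetical, `undefined` stands for "flag not given", and preferring `--output-section` over `--output` when both are set is an assumption.

```typescript
type Granularity = "section" | "chapter" | "title" | "part";
type Source = "usc" | "ecfr";

// Hypothetical raw flag values; undefined means the flag was not passed.
interface ConvertFlags {
  granularity?: string;    // -g, --granularity
  granularities?: string;  // --granularities <list>
  output?: string;         // -o, --output
  outputSection?: string;  // --output-section
  outputChapter?: string;  // --output-chapter
  outputTitle?: string;    // --output-title
  outputPart?: string;     // --output-part
}

const KNOWN: readonly string[] = ["section", "chapter", "title", "part"];

function resolveOutputDirs(
  flags: ConvertFlags,
  source: Source,
): Partial<Record<Granularity, string>> {
  if (flags.granularities === undefined) {
    throw new Error("--granularities is required for this sketch");
  }
  if (flags.granularity !== undefined) {
    throw new Error("--granularities is mutually exclusive with -g/--granularity");
  }

  const dirFor: Record<Granularity, string | undefined> = {
    // Assumption: --output-section wins when both flags are given.
    section: flags.outputSection ?? flags.output,
    chapter: flags.outputChapter,
    title: flags.outputTitle,
    part: flags.outputPart,
  };

  const resolved: Partial<Record<Granularity, string>> = {};
  for (const raw of flags.granularities.split(",")) {
    const name = raw.trim();
    if (KNOWN.indexOf(name) === -1) throw new Error(`unknown granularity: ${name}`);
    const level = name as Granularity;
    if (level === "part" && source !== "ecfr") {
      throw new Error("part granularity is eCFR only");
    }
    const dir = dirFor[level];
    if (dir === undefined) throw new Error(`missing output directory for ${level}`);
    resolved[level] = dir;
  }
  return resolved;
}
```

Validating everything before parsing starts means a missing `--output-title` fails immediately instead of after the XML has already been read.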
+
 ## Full Pipeline Examples

 ### U.S. Code
@@ -110,7 +137,7 @@ For routine updates, wrapper scripts handle the full pipeline (detect changes, d
 ./scripts/update-usc.sh              # USC, checks release point
 ```

-Each script auto-detects what changed and only processes updates. See [Incremental Updates](/docs/guides/bulk-download#incremental-updates) for details.
+Each script auto-detects what changed and only processes updates. `update-usc.sh` and `update-ecfr.sh` convert all granularities in one call using `--granularities` (see above), so the convert step parses the XML once per title rather than once per granularity. See [Incremental Updates](/docs/guides/bulk-download#incremental-updates) for details.
apps/astro/src/content/docs/cli/sources/ecfr.md
Lines changed: 15 additions & 0 deletions
@@ -99,6 +99,21 @@ lexbuild convert-ecfr --all -g part
 lexbuild convert-ecfr --titles 17 -g title
 ```

+### Multi-Granularity (Single Pass)
+
+Emit all four granularities from one parse:
+
+```bash
+lexbuild convert-ecfr --all \
+  --granularities section,title,chapter,part \
+  --output ./output \
+  --output-title ./output-title \
+  --output-chapter ./output-chapter \
+  --output-part ./output-part
+```
+
+`--granularities` is mutually exclusive with `-g`. Because the XML is parsed once and fanned out to each requested directory, this is roughly 40–50% faster than running `convert-ecfr` four times with different `-g` values.
+
 ### Notes Filtering

 Notes filtering works the same as for the U.S. Code:
+Emit section, chapter, and title outputs from one parse:
+
+```bash
+lexbuild convert-usc --all \
+  --granularities section,title,chapter \
+  --output ./output \
+  --output-title ./output-title \
+  --output-chapter ./output-chapter
+```
+
+`--granularities` is mutually exclusive with `-g`. Because the XML is parsed once and fanned out to each requested directory, this is roughly 40–50% faster than running `convert-usc` three times with different `-g` values.
+
 ### Notes Filtering

 All notes (editorial, statutory, and amendment history) are included by default. Disable them entirely or filter selectively:
 Produces 54 files for USC, 50 for eCFR. Files can be large (1-100 MB). Title-level files include extra frontmatter fields: `chapter_count`, `section_count`, and `total_token_estimate`.

+### All Granularities in One Pass
+
+If you need more than one granularity, `--granularities` emits them from a single parse of the source XML (~40–50% faster than running `convert-*` N times):
+
+```bash
+# USC: section + chapter + title from one parse
+lexbuild convert-usc --all \
+  --granularities section,title,chapter \
+  --output ./output \
+  --output-title ./output-title \
+  --output-chapter ./output-chapter
+
+# eCFR: all four granularities from one parse
+lexbuild convert-ecfr --all \
+  --granularities section,title,chapter,part \
+  --output ./output \
+  --output-title ./output-title \
+  --output-chapter ./output-chapter \
+  --output-part ./output-part
+```
+
+`--granularities` is mutually exclusive with `-g/--granularity`.
+
 > [!NOTE]
 > The `-o` flag appends source subdirectories automatically. `convert-usc -o /some/path` writes to `/some/path/usc/`, not `/some/path/` directly.