
Shared infrastructure for the LexBuild legal-XML-to-Markdown pipeline. Provides streaming XML parsing, AST definitions, Markdown rendering, YAML frontmatter generation, and cross-reference link resolution used by all source packages.
Note: This is a foundational library. Most users should install @lexbuild/cli for the command-line tool, or a source package (@lexbuild/usc, @lexbuild/ecfr, @lexbuild/fr) for programmatic access.
npm install @lexbuild/core
# or
pnpm add @lexbuild/core
import { XMLParser, ASTBuilder, renderDocument, generateFrontmatter, createLinkResolver } from "@lexbuild/core";
import { createReadStream } from "node:fs";
// 1. Parse XML via streaming SAX
const parser = new XMLParser();
const builder = new ASTBuilder({
  emitAt: "section",
  onEmit: (node, context) => {
    // 2. Each completed section is emitted here
    const frontmatter = generateFrontmatter(/* ... */);
    const resolver = createLinkResolver("relative");
    const markdown = renderDocument(node, frontmatter, {
      linkStyle: "relative",
      resolveLink: resolver.resolve,
    });
    // 3. Write markdown to file
  },
});
// Forward SAX events from the parser to the AST builder
parser.on("openElement", (name, attrs) => builder.onOpenElement(name, attrs));
parser.on("closeElement", (name) => builder.onCloseElement(name));
parser.on("text", (text) => builder.onText(text));
await parser.parseStream(createReadStream("usc01.xml"));
`emitAt` also accepts a `ReadonlySet<LevelType>` for multi-level emission. Deeper levels fire first (sections before their containing title), and emitted nodes remain attached to their parents so that a higher-level emission sees the full subtree. A node stays attached to its parent whenever any enclosing stack frame is itself an emit target. Do not infer this from `LEVEL_TYPES` index ordering, which breaks on USLM's permissive nesting (for example, an appendix inside a part).
// Collect every emitted node, grouped by its level type
const byLevel = new Map<LevelType, LevelNode[]>();
const builder = new ASTBuilder({
  emitAt: new Set(["section", "chapter", "title"]),
  onEmit: (node, context) => {
    const bucket = byLevel.get(node.levelType) ?? [];
    bucket.push(node);
    byLevel.set(node.levelType, bucket);
  },
});
This is how the @lexbuild/usc and @lexbuild/ecfr converters produce multiple output granularities from a single parse.
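
The gating rule described above can be illustrated with a small stand-alone sketch. It is not the `ASTBuilder` implementation: `MultiLevelEmitSketch`, `SketchNode`, and the `open`/`close` methods are hypothetical names invented here, and the choice to drop an emitted node when no enclosing frame is an emit target is an assumption about how the streaming behavior might work.

```typescript
// Hypothetical node shape for illustration only (not the library's LevelNode).
interface SketchNode {
  level: string;
  id: string;
  children: SketchNode[];
}

// Toy stack-based builder showing emit order and the attach-to-parent gate.
class MultiLevelEmitSketch {
  private readonly stack: SketchNode[] = [];
  private readonly emitAt: ReadonlySet<string>;
  private readonly onEmit: (node: SketchNode) => void;

  constructor(emitAt: ReadonlySet<string>, onEmit: (node: SketchNode) => void) {
    this.emitAt = emitAt;
    this.onEmit = onEmit;
  }

  open(level: string, id: string): void {
    this.stack.push({ level, id, children: [] });
  }

  close(): void {
    const node = this.stack.pop();
    if (node === undefined) throw new Error("close() without matching open()");

    // Deeper levels close first, so they are emitted before their ancestors.
    const isTarget = this.emitAt.has(node.level);
    if (isTarget) this.onEmit(node);

    const parent = this.stack[this.stack.length - 1];
    if (parent === undefined) return; // root node: nothing to attach to

    // Gate: is ANY enclosing frame an emit target? No level-order comparison.
    const enclosingTarget = this.stack.some((frame) => this.emitAt.has(frame.level));

    // Assumption: an emitted node with no emit-target ancestor is dropped.
    if (!isTarget || enclosingTarget) parent.children.push(node);
  }
}

// Usage: emit at both "section" and "chapter".
const emitted: string[] = [];
const sketch = new MultiLevelEmitSketch(new Set(["section", "chapter"]), (node) => {
  emitted.push(`${node.level}:${node.id}(${node.children.length})`);
});

sketch.open("title", "1");
sketch.open("chapter", "1");
sketch.open("section", "1");
sketch.close(); // section 1 emitted, kept under chapter 1
sketch.open("section", "2");
sketch.close(); // section 2 emitted, kept under chapter 1
sketch.close(); // chapter 1 emitted with both sections attached
sketch.open("section", "3");
sketch.close(); // section 3 emitted; no enclosing emit target
sketch.close(); // title 1 is not an emit target

console.log(emitted);
```

Checking the stack rather than comparing level ranks means the gate still behaves when a level appears somewhere the nominal ordering would not predict.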
| Export | Description |
| --- | --- |
| `XMLParser` | Streaming SAX parser wrapping `saxes` with namespace normalization. Supports USLM (namespaced) and namespace-free XML (eCFR) via the `defaultNamespace` option. |
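
To make "namespace normalization" concrete, here is one possible scheme, written as a pure function so it runs without `saxes`. The two namespace URIs are the published USLM 1.0 and XHTML URIs. Everything else is an assumption for illustration: `normalizeName`, the `xhtml:` prefix, and the `{uri}local` fallback are hypothetical and may not match what `XMLParser` actually emits.

```typescript
// Published namespace URIs; the constant names here are local to this sketch.
const USLM_NS_SKETCH = "http://xml.house.gov/schemas/uslm/1.0";
const XHTML_NS_SKETCH = "http://www.w3.org/1999/xhtml";

// Hypothetical prefix table for namespaces other than the default one.
const KNOWN_PREFIXES = new Map<string, string>([[XHTML_NS_SKETCH, "xhtml"]]);

/**
 * Map a (namespace URI, local name) pair to a single normalized element name.
 * Elements in the default namespace, or in no namespace, keep their bare name.
 */
function normalizeName(uri: string, local: string, defaultNamespace: string): string {
  if (uri === defaultNamespace || uri === "") return local;
  const prefix = KNOWN_PREFIXES.get(uri);
  // Unknown namespaces fall back to a {uri}local form so no information is lost.
  return prefix === undefined ? `{${uri}}${local}` : `${prefix}:${local}`;
}

// USLM document: the default namespace is the USLM URI.
const uslmSection = normalizeName(USLM_NS_SKETCH, "section", USLM_NS_SKETCH);
const uslmTable = normalizeName(XHTML_NS_SKETCH, "table", USLM_NS_SKETCH);

// Namespace-free document (as with eCFR): the default namespace is empty.
const ecfrDiv = normalizeName("", "DIV8", "");

console.log(uslmSection, uslmTable, ecfrDiv);
```

With names normalized this way, a builder can compare plain strings regardless of whether the source XML declares namespaces.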

| Export | Description |
| --- | --- |
| `ASTBuilder` | Stack-based USLM XML-to-AST builder with configurable emit-at-level streaming. `emitAt` accepts `LevelType` (single) or `ReadonlySet<LevelType>` (multi-level emit). Handles the full USLM 1.0 element vocabulary. Source packages for other formats provide their own builders. |

| Export | Description |
| --- | --- |
| `renderDocument()` | Render a section node with frontmatter to a complete Markdown file |
| `renderSection()` | Render a section-level node to Markdown body text |
| `renderNode()` | Render any AST node to Markdown |
| `generateFrontmatter()` | Generate a YAML frontmatter block from `FrontmatterData` |
| `createLinkResolver()` | Create a cross-reference link resolver supporting USC, CFR, and fallback URLs |
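
As a rough picture of what "a YAML frontmatter block" means, the sketch below serializes a small object between `---` delimiters. It does not reproduce `generateFrontmatter()` or the real `FrontmatterData` shape: `FrontmatterSketch`, its three fields, and `frontmatterSketch` are hypothetical. Values are written with `JSON.stringify`, which yields double-quoted strings that are also valid YAML.

```typescript
// Hypothetical, deliberately tiny field set (the real FrontmatterData is larger).
interface FrontmatterSketch {
  title: string;
  source: "usc" | "ecfr" | "fr";
  format_version: string;
}

/** Serialize the fields as `key: "value"` lines between `---` delimiters. */
function frontmatterSketch(data: FrontmatterSketch): string {
  const lines = Object.entries(data).map(
    // JSON string syntax is valid YAML, so quoting and escaping come for free.
    ([key, value]) => `${key}: ${JSON.stringify(value)}`,
  );
  return ["---", ...lines, "---", ""].join("\n");
}

const block = frontmatterSketch({
  title: "Sample section",
  source: "usc",
  format_version: "1.1.0",
});

console.log(block);
```

Quoting every value avoids YAML's implicit typing, where an unquoted `1.10` or `no` would be read as a number or boolean.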

| Export | Description |
| --- | --- |
| `ASTNode` | Union type for all AST nodes |
| `LevelNode` | Hierarchical structural node (title, chapter, section, etc.) |
| `ContentNode` | Text content block (content, chapeau, continuation, proviso) |
| `InlineNode` | Inline text formatting (bold, italic, ref, footnoteRef, etc.) |
| `NoteNode` | Note block (editorial, statutory, amendment, etc.) |
| `TableNode` | Table with headers and rows |
| `SourceCreditNode` | Enactment source citation |
| `FrontmatterData` | Full frontmatter field definitions |
| `EmitContext` | Context passed with emitted nodes (ancestors, document metadata) |
| `SourceType` | `"usc" \| "ecfr" \| "fr"` |
| `LegalStatus` | `"official_legal_evidence" \| "official_prima_facie" \| "authoritative_unofficial"` |

| Export | Description |
| --- | --- |
| `FORMAT_VERSION` | Output format version (`"1.1.0"`) |
| `GENERATOR` | Generator string for frontmatter metadata |
| `LEVEL_TYPES` | Ordered array of level types (title → subsubitem) |
| `BIG_LEVELS` | Set of structural levels above section |
| `USLM_NS` | USLM namespace URI |
| `XHTML_NS` | XHTML namespace URI |

| Export | Description |
| --- | --- |
| `writeFile()` | Write with `ENFILE`/`EMFILE` retry and exponential backoff |
| `writeFileIfChanged()` | Write only if content differs. Returns `true` if written, `false` if skipped (mtime preserved). Used by converters for incremental updates. |
| `mkdir()` | Recursive mkdir with retry |
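
The write-only-if-changed behavior can be sketched with Node's built-in `fs` module. This is a simplified, synchronous illustration, not the library's code: `writeIfChangedSketch` is a hypothetical name, the real `writeFileIfChanged()` may well be asynchronous, and the `ENFILE`/`EMFILE` retry logic is omitted here.

```typescript
import { mkdtempSync, readFileSync, statSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

/**
 * Hypothetical stand-in: returns true when the file was written,
 * false when the existing content already matched and the write was skipped.
 */
function writeIfChangedSketch(path: string, content: string): boolean {
  let existing: string | undefined;
  try {
    existing = readFileSync(path, "utf8");
  } catch (err) {
    // A missing file just means there is nothing to compare against.
    if ((err as NodeJS.ErrnoException).code !== "ENOENT") throw err;
  }
  if (existing === content) return false; // skipped: file and mtime untouched
  writeFileSync(path, content, "utf8");
  return true;
}

// Usage: write into a fresh temporary directory.
const dir = mkdtempSync(join(tmpdir(), "lexbuild-sketch-"));
const target = join(dir, "section-1.md");

const firstWrite = writeIfChangedSketch(target, "# Section 1\n");
const mtimeAfterFirst = statSync(target).mtimeMs;

const secondWrite = writeIfChangedSketch(target, "# Section 1\n"); // identical content
const mtimeAfterSecond = statSync(target).mtimeMs;

const thirdWrite = writeIfChangedSketch(target, "# Section 1 (amended)\n");

console.log({ firstWrite, secondWrite, thirdWrite });
```

Because a skipped write never touches the file, tools that rely on modification times see only the files whose content actually changed.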
- Node.js >= 22
- ESM only (no CommonJS build)
- TypeScript: ships `.d.ts` type declarations
- Zero browser dependencies (Node.js runtime only)
This package is part of the LexBuild monorepo, managed with Turborepo and pnpm workspaces. All packages use changesets for lockstep versioning.
packages/
├── core/ ← you are here
├── usc/ # depends on core
├── ecfr/ # depends on core
├── fr/ # depends on core
└── cli/ # depends on core, usc, ecfr, fr
pnpm turbo build --filter=@lexbuild/core # Build
pnpm turbo test --filter=@lexbuild/core # Run tests
pnpm turbo typecheck --filter=@lexbuild/core
pnpm turbo lint --filter=@lexbuild/core
| Package | Description |
| --- | --- |
| `@lexbuild/cli` | CLI tool for downloading and converting legal XML |
| `@lexbuild/usc` | U.S. Code (USLM XML) converter and downloader |
| `@lexbuild/ecfr` | eCFR (Code of Federal Regulations) converter and downloader |
| `@lexbuild/fr` | Federal Register converter and downloader |
MIT