Merged
8 changes: 4 additions & 4 deletions .gitignore
@@ -137,11 +137,11 @@ tags
# law2md operational outputs
# -----------------------------------------------------------------------------

# Downloaded USC XML files (large, regenerable from OLRC)
xml/

# Legacy location for downloaded XML (kept for backwards compatibility)
# Downloaded source files (large, regenerable — e.g. downloads/usc/xml/)
downloads/

# Legacy locations for downloaded XML (kept for backwards compatibility)
xml/
fixtures/xml/

# Converted Markdown output (regenerable from source XML)
1 change: 1 addition & 0 deletions .prettierignore
@@ -2,6 +2,7 @@ dist
node_modules
pnpm-lock.yaml
*.md
downloads
xml
fixtures/xml
docs/reference
36 changes: 36 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,42 @@ and this project adheres to [Conventional Commits](https://www.conventionalcommi

## [Unreleased]

## [0.5.0]

### Added

#### Multi-Title Selection

- **`--titles <spec>` option** on both `download` and `convert` commands: supports single numbers (`29`), comma-separated lists (`1,3,8,11`), ranges (`1-5`), and mixed formats (`1-5,8,11`). Replaces the single-title `--title <n>` option on download. ([`3a29a8e`](../../commit/3a29a8e))
- **`--input-dir <dir>` option** on `convert` command: specifies the directory containing USC XML files when using `--titles` (default: `./downloads/usc/xml`) ([`3a29a8e`](../../commit/3a29a8e))
- **Multi-title convert output**: per-title summary tables with progress labels (`"Converting Title 1 (1/5)..."`) followed by an aggregate footer (`"Converted 5 titles (2,450 sections) in 3.2s"`) ([`3a29a8e`](../../commit/3a29a8e))
- **`parseTitles()` utility** (`packages/cli/src/parse-titles.ts`): title spec parser with validation (1-54 range, ascending ranges, deduplication, sorting) and 23 unit tests ([`3a29a8e`](../../commit/3a29a8e))
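
As a rough illustration of the validation rules listed for `parseTitles()` (1-54 range, ascending ranges, deduplication, sorting), here is a minimal sketch. It is not the code in `packages/cli/src/parse-titles.ts`; the helper name `checkTitle`, the error messages, and the choice to accept a single-element range such as `3-3` are all assumptions made for this example.

```typescript
// Hypothetical sketch of a title-spec parser; the real parse-titles.ts may differ.
const MIN_TITLE = 1;
const MAX_TITLE = 54;

// Assumed helper: rejects numbers outside the 1-54 title range.
function checkTitle(n: number): number {
  if (n < MIN_TITLE || n > MAX_TITLE) {
    throw new Error(`Title ${n} is out of range (${MIN_TITLE}-${MAX_TITLE})`);
  }
  return n;
}

function parseTitles(spec: string): number[] {
  const titles = new Set<number>(); // a Set removes duplicates
  for (const rawPart of spec.split(",")) {
    const part = rawPart.trim();
    // Each part is either a single number ("29") or a range ("1-5").
    const match = /^(\d+)(?:\s*-\s*(\d+))?$/.exec(part);
    if (match === null) {
      throw new Error(`Invalid title spec entry: "${part}"`);
    }
    const start = checkTitle(Number(match[1]));
    const end = match[2] === undefined ? start : checkTitle(Number(match[2]));
    if (start > end) {
      throw new Error(`Range must be ascending: "${part}"`);
    }
    for (let n = start; n <= end; n++) {
      titles.add(n);
    }
  }
  // Return the titles in ascending numeric order.
  return Array.from(titles).sort((a, b) => a - b);
}
```

With this sketch, `parseTitles("1-5,8,11")` yields `[1, 2, 3, 4, 5, 8, 11]`, while specs such as `55` or `5-1` throw.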

### Changed

- **`convert` command**: `<input>` argument is now optional — use either a file path or `--titles` ([`3a29a8e`](../../commit/3a29a8e))
- **`download` command**: `--title <n>` replaced by `--titles <spec>` ([`3a29a8e`](../../commit/3a29a8e))
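
One way the optional `<input>` argument and `--titles` could be reconciled is sketched below. The function names `titleFileName` and `resolveConvertInputs` are hypothetical, and rejecting the case where both a file path and `--titles` are supplied is an assumption; this changelog does not say how the CLI handles it. The `uscNN.xml` naming and the default input directory come from the project docs.

```typescript
// Hypothetical sketch; function names and the "both given" policy are assumptions.
const DEFAULT_INPUT_DIR = "./downloads/usc/xml";

// Files are named usc01.xml ... usc54.xml, so single digits are zero-padded.
function titleFileName(title: number): string {
  const padded = title < 10 ? `0${title}` : String(title);
  return `usc${padded}.xml`;
}

function resolveConvertInputs(
  input: string | undefined,
  titles: number[] | undefined,
  inputDir: string = DEFAULT_INPUT_DIR,
): string[] {
  if (input !== undefined && titles !== undefined) {
    // Assumed policy: treat the two input modes as mutually exclusive.
    throw new Error("Pass either an input file or --titles, not both");
  }
  if (input !== undefined) {
    return [input];
  }
  if (titles !== undefined && titles.length > 0) {
    const dir = inputDir.replace(/\/+$/, ""); // drop trailing slashes
    return titles.map((t) => `${dir}/${titleFileName(t)}`);
  }
  throw new Error("Provide an input file or --titles");
}
```

Under these assumptions, `resolveConvertInputs(undefined, [1, 5])` produces `./downloads/usc/xml/usc01.xml` and `./downloads/usc/xml/usc05.xml`, which matches where `download` saves files by default.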

---

## [0.4.1]

### Added

#### Terminal UI

- **Polished CLI output** (`packages/cli/src/ui.ts`): `chalk`, `ora`, and `cli-table3` for spinners, formatted summary blocks, and data tables in download and convert commands ([`a182dbe`](../../commit/a182dbe))

### Fixed

- **Default download/output locations**: adjusted default paths for `--output` on download and convert commands ([`9e15faf`](../../commit/9e15faf), [`5cbffd5`](../../commit/5cbffd5))

### Changed

- **Documentation cleanup**: renamed/reorganized docs, removed reference development docs, updated README with OLRC user guide details ([`52afb03`](../../commit/52afb03), [`0fc4a7e`](../../commit/0fc4a7e))

---

## [0.4.0] — Phase 4: Polish & Publish

### Added
8 changes: 6 additions & 2 deletions CLAUDE.md
@@ -12,7 +12,9 @@ law2md/
│ ├── core/ # @law2md/core — XML parsing, AST, Markdown rendering, shared utilities
│ ├── usc/ # @law2md/usc — U.S. Code-specific element handlers and downloader
│ └── cli/ # law2md — CLI binary (the published npm package users install)
├── xml/ # Full USC XML files (usc01.xml ... usc54.xml) — gitignored
├── downloads/
│ └── usc/
│     └── xml/ # Full USC XML files (usc01.xml ... usc54.xml) — gitignored
├── fixtures/
│ ├── fragments/ # Small synthetic XML snippets for unit tests
│ └── expected/ # Expected output snapshots for integration tests
@@ -66,7 +68,9 @@ pnpm turbo lint
pnpm turbo dev

# Run the CLI locally during development
node packages/cli/dist/index.js convert ./xml/usc01.xml -o ./test-output
node packages/cli/dist/index.js convert ./downloads/usc/xml/usc01.xml -o ./test-output
node packages/cli/dist/index.js convert --titles 1-5 -o ./test-output
node packages/cli/dist/index.js download --titles 1
```

## Code Conventions
5 changes: 3 additions & 2 deletions CONTRIBUTING.md
@@ -40,7 +40,8 @@ pnpm turbo test --filter=law2md

```bash
node packages/cli/dist/index.js convert path/to/usc01.xml -o ./output
node packages/cli/dist/index.js download --title 1
node packages/cli/dist/index.js download --titles 1 # saves to ./downloads/usc/xml/
node packages/cli/dist/index.js convert --titles 1-5 # convert multiple titles
```

### Formatting
@@ -100,7 +101,7 @@ Review the diff in `fixtures/expected/` to confirm only intended changes, then c

- `fixtures/fragments/` — Small synthetic XML snippets for unit tests (committed)
- `fixtures/expected/` — Pinned expected output for snapshot tests (committed)
- `xml/` — Full USC XML files (gitignored, download with `law2md download`)
- `downloads/usc/xml/` — Full USC XML files (gitignored, download with `law2md download`)

## Submitting Changes

58 changes: 39 additions & 19 deletions README.md
@@ -58,13 +58,13 @@ pnpm turbo build

```bash
# Download Title 1 (smallest title, good for testing)
law2md download --title 1 -o ./xml
law2md download --titles 1

# Convert to Markdown
law2md convert ./xml/usc01.xml -o ./output
law2md convert ./downloads/usc/xml/usc01.xml -o ./output

# Or do both in one shot
law2md download --title 1 -o ./xml && law2md convert ./xml/usc01.xml -o ./output
# Download and convert multiple titles at once
law2md download --titles 1-5 && law2md convert --titles 1-5
```

---
@@ -77,50 +77,69 @@ Fetch U.S. Code XML files directly from the Office of the Law Revision Counsel:

```bash
# Download a single title
law2md download --title 1 -o ./xml
law2md download --titles 1

# Download multiple titles (range)
law2md download --titles 1-5

# Download specific titles (mixed)
law2md download --titles 1-5,8,11

# Download all 54 titles
law2md download --all -o ./xml
law2md download --all

# Use a specific release point
law2md download --title 26 -o ./xml --release-point 119-73not60
law2md download --titles 26 --release-point 119-73not60
```

Or download manually from the [OLRC download page](https://uscode.house.gov/download/download.shtml).

### Convert

```bash
# Section-level output (default)
law2md convert ./xml/usc01.xml -o ./output
# Convert a single XML file
law2md convert ./downloads/usc/xml/usc01.xml -o ./output

# Convert by title number (uses default input directory)
law2md convert --titles 1

# Convert multiple titles
law2md convert --titles 1-5,8,11

# Convert with a custom input directory
law2md convert --titles 1-5 -i ./my-xml-files

# Chapter-level output
law2md convert ./xml/usc01.xml -o ./output -g chapter
law2md convert ./downloads/usc/xml/usc01.xml -o ./output -g chapter

# Cross-reference links resolved to OLRC URLs
law2md convert ./xml/usc05.xml -o ./output --link-style canonical
law2md convert ./downloads/usc/xml/usc05.xml -o ./output --link-style canonical

# Include only amendment notes
law2md convert ./xml/usc01.xml -o ./output --include-amendments
law2md convert ./downloads/usc/xml/usc01.xml -o ./output --include-amendments

# Exclude all notes
law2md convert ./xml/usc01.xml -o ./output --no-include-notes
law2md convert ./downloads/usc/xml/usc01.xml -o ./output --no-include-notes

# Dry-run: preview stats without writing files
law2md convert ./xml/usc42.xml -o ./output --dry-run
law2md convert ./downloads/usc/xml/usc42.xml -o ./output --dry-run
```

### CLI Reference

```bash
law2md convert <input> [options]
law2md convert [input] [options]
```

```text
Arguments:
input Path to a USC XML file
input Path to a USC XML file (optional if --titles is used)

Options:
--titles <spec> Title(s) to convert: single (1), range (1-5),
or mixed (1-5,8,11)
-i, --input-dir <dir> Input directory for XML files
(default: "./downloads/usc/xml")
-o, --output <dir> Output directory (default: "./output")
-g, --granularity <level> "section" or "chapter" (default: "section")
--link-style <style> "plaintext", "canonical", or "relative"
@@ -137,9 +156,10 @@ Options:
law2md download [options]

Options:
--title <number> Download a single title (1-54)
--titles <spec> Title(s) to download: single (1), range (1-5),
or mixed (1-5,8,11)
--all Download all 54 titles
-o, --output <dir> Output directory (default: "./xml")
-o, --output <dir> Output directory (default: "./downloads/usc/xml")
--release-point <point> OLRC release point (default: current)
-h, --help Display help
```
@@ -189,7 +209,7 @@ positive_law: true
currency: "119-73"
last_updated: "2025-12-03"
format_version: "1.0.0"
generator: "law2md@0.4.0"
generator: "law2md@0.5.0"
source_credit: "(Added Pub. L. 104-199, § 3(a), Sept. 21, 1996, ...)"
---
```
4 changes: 2 additions & 2 deletions docs/output-format.md
@@ -99,7 +99,7 @@ positive_law: true # Boolean
currency: "119-73" # Release point identifier
last_updated: "2025-12-03" # ISO date from XML generation
format_version: "1.0.0" # Output format version
generator: "law2md@0.4.0" # Generator version
generator: "law2md@0.5.0" # Generator version

# Optional
source_credit: "(July 30, 1947, ...)" # Full source credit text (included by default)
@@ -212,7 +212,7 @@ Complex tables (with colspan, rowspan, or nested content) render as fenced HTML:
```json
{
"format_version": "1.0.0",
"generator": "law2md@0.4.0",
"generator": "law2md@0.5.0",
"generated_at": "2025-12-03T12:00:00.000Z",
"identifier": "/us/usc/t1",
"title_number": 1,
21 changes: 21 additions & 0 deletions packages/cli/CHANGELOG.md
@@ -1,5 +1,26 @@
# law2md

## 0.5.0

### Minor Changes

- 3a29a8e: Add `--titles` multi-select option to download and convert commands. Supports ranges (`1-5`), comma-separated lists

### Patch Changes

- Updated dependencies [3a29a8e]
- @law2md/core@0.5.0
- @law2md/usc@0.5.0

## 0.4.1

### Patch Changes

- Add chalk, ora, and cli-table3 for polished terminal output with spinners and formatted
- Updated dependencies
- @law2md/core@0.4.1
- @law2md/usc@0.4.1

## 0.4.0

### Minor Changes
5 changes: 4 additions & 1 deletion packages/cli/package.json
@@ -1,6 +1,6 @@
{
"name": "law2md",
"version": "0.4.0",
"version": "0.5.0",
"description": "Convert U.S. legislative XML (USLM) to structured Markdown for AI/RAG ingestion",
"type": "module",
"main": "./dist/index.js",
@@ -29,7 +29,10 @@
"dependencies": {
"@law2md/core": "workspace:*",
"@law2md/usc": "workspace:*",
"chalk": "^5.6.2",
"cli-table3": "^0.6.5",
"commander": "^13.1.0",
"ora": "^8.2.0",
"pino": "^9.6.0",
"pino-pretty": "^13.0.0"
},