Enhance downloader functionality and version bump to v0.6.1#8
Enhance downloader functionality and version bump to v0.6.1#8chris-c-thomas merged 2 commits intomainfrom
Conversation
ℹ️ Recent review infoConfiguration used: Organization UI Review profile: CHILL Plan: Free 📒 Files selected for processing (10)
📝 WalkthroughWalkthroughVersion 0.6.1 release patch that optimizes the USC package downloader to fetch all 54 titles from a single bulk XML ZIP when all titles are downloaded, with automatic fallback to per-title downloads if unavailable. Includes version bumps and changelog entries across packages. Changes
Sequence DiagramsequenceDiagram
participant Client
participant Downloader as downloadTitles()
participant Check as isAllTitles()
participant BulkDL as downloadAndExtractAllTitles()
participant HTTP as HTTP Request
participant ZIP as ZIP Extractor
participant FileSystem as File System
participant Fallback as Per-title Downloads
Client->>Downloader: downloadTitles(titles?)
Downloader->>Check: isAllTitles(titles)
alt All 54 titles requested
Check-->>Downloader: true
Downloader->>BulkDL: downloadAndExtractAllTitles()
BulkDL->>HTTP: GET xml_uscAll@{releasePoint}.zip
HTTP-->>BulkDL: bulk ZIP file
BulkDL->>ZIP: extractAllXmlFromZip()
ZIP->>FileSystem: extract all uscNN.xml
FileSystem-->>ZIP: file paths
ZIP-->>BulkDL: DownloadedFile[]
BulkDL-->>Downloader: success
Downloader-->>Client: DownloadedFile[]
else Bulk download fails or partial titles
Check-->>Downloader: false / error
Downloader->>Fallback: downloadTitles() per-title
Fallback->>HTTP: GET per-title XMLs
HTTP-->>Fallback: individual files
Fallback-->>Downloader: DownloadedFile[]
Downloader-->>Client: DownloadedFile[]
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Note 🎁 Summarized by CodeRabbit FreeYour organization is on the Free plan. CodeRabbit will generate a high-level summary and a walkthrough for each pull request. For a comprehensive line-by-line review, please upgrade your subscription to CodeRabbit Pro by visiting https://app.coderabbit.ai/login. Comment |
There was a problem hiding this comment.
Pull request overview
This PR improves @law2md/usc downloader efficiency by adding a bulk-download path when all 54 USC titles are requested, alongside a repo-wide version bump to 0.6.1 and corresponding changelog updates.
Changes:
- Add
isAllTitleshelper and bulkxml_uscAll@{releasePoint}.zipdownload + extraction path with per-title fallback. - Add unit tests for
isAllTitlesand export it from@law2md/usc. - Bump package versions to
0.6.1and update changelogs.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| packages/usc/src/index.ts | Re-export isAllTitles from the downloader module. |
| packages/usc/src/downloader.ts | Implements bulk all-titles zip download/extract path and isAllTitles helper. |
| packages/usc/src/downloader.test.ts | Adds unit tests for isAllTitles. |
| packages/usc/package.json | Bumps @law2md/usc version to 0.6.1. |
| packages/usc/CHANGELOG.md | Documents downloader enhancement and dependency updates. |
| packages/core/package.json | Bumps @law2md/core version to 0.6.1. |
| packages/core/CHANGELOG.md | Adds a 0.6.1 entry (currently describing downloader enhancement). |
| packages/cli/package.json | Bumps CLI version to 0.6.1. |
| packages/cli/CHANGELOG.md | Notes downloader enhancement and dependency updates. |
| CHANGELOG.md | Adds top-level 0.6.1 release notes for bulk download behavior. |
| // Use bulk zip when all 54 titles are requested | ||
| if (options.titles === undefined || isAllTitles(titles)) { | ||
| try { | ||
| const files = await downloadAndExtractAllTitles(releasePoint, options.outputDir); | ||
| return { releasePoint, files, errors: [] }; | ||
| } catch { | ||
| // Fall back to per-title downloads | ||
| } |
There was a problem hiding this comment.
The bulk-download attempt swallows all errors and falls back silently, but it can leave a partially downloaded uscAll.zip and/or partially extracted XML files in outputDir. Consider narrowing the fallback to expected “bulk unavailable” failures (e.g., 404/410), and ensure temp artifacts are cleaned up on bulk failure (e.g., cleanup in a finally inside the bulk download/extract helper, or delete uscAll.zip + partial outputs before continuing).
| // Extract all XML files from zip | ||
| const extracted = await extractAllXmlFromZip(zipPath, outputDir); | ||
|
|
||
| // Clean up zip file | ||
| await unlink(zipPath); | ||
|
|
||
| // Stat each extracted file and build results | ||
| const files: DownloadedFile[] = []; | ||
| for (const { titleNumber, filePath } of extracted) { | ||
| const fileStat = await stat(filePath); |
There was a problem hiding this comment.
downloadAndExtractAllTitles treats any extraction result as success. If the bulk zip layout changes (or the regex fails), extractAllXmlFromZip can return an empty/partial list and downloadTitles will return { errors: [] }, silently skipping titles. Add a completeness check (e.g., ensure the extracted title numbers cover 1–54 and/or that 54 files were extracted) and throw if incomplete so the caller can fall back to per-title downloads.
| zipFile.on("end", () => { | ||
| ended = true; | ||
| maybeResolve(); | ||
| }); | ||
|
|
||
| zipFile.on("error", (zipErr: Error) => { | ||
| reject(new Error(`Zip error: ${zipErr.message}`)); | ||
| }); |
There was a problem hiding this comment.
extractAllXmlFromZip never closes the zipFile on the successful path (and also doesn’t close it in the zipFile.on("error") handler). This can leak file descriptors, especially when downloading all titles repeatedly. Ensure zipFile.close() is called when finishing (after end/pending resolves) and when handling zip-level errors.
| // Use bulk zip when all 54 titles are requested | ||
| if (options.titles === undefined || isAllTitles(titles)) { | ||
| try { | ||
| const files = await downloadAndExtractAllTitles(releasePoint, options.outputDir); | ||
| return { releasePoint, files, errors: [] }; | ||
| } catch { | ||
| // Fall back to per-title downloads | ||
| } | ||
| } |
There was a problem hiding this comment.
New bulk-download behavior (single uscAll zip, extraction of all titles, and fallback when unavailable) isn’t covered by tests here. Consider adding tests that mock fetch to (1) return a small in-memory zip with a few uscNN.xml entries to verify extraction/mapping, and (2) simulate a 404 for the bulk URL to verify the per-title fallback path is used.
|
|
||
| ### Patch Changes | ||
|
|
||
| - Enhance downloader for when all titles are downloaded |
There was a problem hiding this comment.
This changelog entry attributes the bulk USC downloader enhancement to @law2md/core, but the downloader implementation lives in @law2md/usc. To avoid misleading consumers, adjust this entry to reflect actual core changes in 0.6.1 (e.g., dependency bump / no functional changes) and leave the downloader note to the usc/cli changelogs.
| - Enhance downloader for when all titles are downloaded | |
| - Internal maintenance and dependency updates; no functional changes in core |
This pull request introduces an enhancement to the downloader logic for U.S. Code XML files: when all 54 titles are requested, the system now downloads a single bulk zip file (
xml_uscAll@{releasePoint}.zip) instead of making 54 individual HTTP requests, significantly improving efficiency. If the bulk zip is unavailable, it gracefully falls back to individual downloads. The change is reflected across the core, CLI, and USC packages, with updated tests and documentation.Bulk download enhancement
downloadTitlesinpackages/usc/src/downloader.tsto detect requests for all 54 titles and fetch a single bulk zip file, falling back to per-title downloads if the bulk file is unavailable.isAllTitlesto determine if a request covers all 54 titles, handling duplicates and arbitrary ordering.extractAllXmlFromZip,downloadAndExtractAllTitles) to handle bulk zip files and extract individual XML files for each title.Testing and exports
isAllTitlesinpackages/usc/src/downloader.test.tsto ensure correct detection of all-title requests.isAllTitlesinpackages/usc/src/index.tsfor use in other modules.Documentation and versioning
CHANGELOG.md,packages/cli/CHANGELOG.md,packages/core/CHANGELOG.md, andpackages/usc/CHANGELOG.mdto document the bulk download enhancement and dependency updates. [1] [2] [3] [4]0.6.1inpackages/cli/package.json,packages/core/package.json, andpackages/usc/package.json. [1] [2] [3]Summary by CodeRabbit
Release Notes – Version 0.6.1
New Features
Tests