meeting-schedules-parser parses JW Meeting Workbook (mwb_*) and Watchtower Study (w_*) publications from .jwpub and .epub files and returns normalized schedule objects.
Primary public API:
- `loadPub(...)` from:
  - Browser: `meeting-schedules-parser`
  - Node ESM: `meeting-schedules-parser/dist/node/index.js`
  - Node CJS: `meeting-schedules-parser/dist/node/index.cjs`
- Entry points:
  - `src/node/index.ts`
  - `src/browser/index.ts`
- Shared parser orchestrator:
  - `src/common/parser.ts`
- Input/file validation:
  - `src/common/file_validation.ts`
- Format-specific parsing:
  - EPUB: `src/common/epub_parser.ts`
  - JWPUB (DB + decrypt): `src/common/jwpub_parser.ts`
- HTML extraction + schedule shaping:
  - `src/common/html_utils.ts`
  - `src/common/html_validation.ts`
- Enhanced language-aware parsing:
  - `src/common/date_parser.ts`
  - `src/common/parsing_rules.ts`
  - `src/common/source_strategies.ts`
  - `src/common/language_rules.ts`
- Enhanced-language config:
  - `src/config/enhanced_languages.ts`
  - `src/config/language_profile_overrides.ts`
- Per-locale configuration (each enhanced language):
  - Crowdin translation file: `src/locales/*/text.json`
  - Parser profile with regex patterns and text overrides: `src/locales/*/profile.ts`
The code relies on a global meeting_schedules_parser object, initialized by:
- Node: `src/node/utils.node.ts`
- Browser: `src/browser/utils.browser.ts`
This global provides:
- locale rules (`languages`)
- path helpers (`path`)
- file IO in Node (`readFile`)
- SQL.js loader (`loadSQL`)
If parsing code is used without these initializers, it will fail.
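
A minimal sketch of a guard for the failure mode described above. It assumes the global is reachable under the name `meeting_schedules_parser` and exposes the four members listed. The `ParserGlobals` shape and the `assertParserGlobals` helper are hypothetical and not part of the package.

```typescript
// Hypothetical shape: only the four member names come from the list above;
// their types are assumptions made for illustration.
interface ParserGlobals {
  languages: unknown;
  path: unknown;
  readFile?: unknown; // described as Node-only, so treated as optional here
  loadSQL: unknown;
}

// Returns the global when it is present and has the expected members,
// otherwise throws a descriptive error instead of failing deep inside the parser.
function assertParserGlobals(scope: Record<string, unknown>): ParserGlobals {
  const candidate = scope["meeting_schedules_parser"];
  if (typeof candidate !== "object" || candidate === null) {
    throw new Error("meeting_schedules_parser global is not initialized");
  }
  const members = candidate as Record<string, unknown>;
  for (const key of ["languages", "path", "loadSQL"]) {
    if (!(key in members)) {
      throw new Error(`meeting_schedules_parser.${key} is missing`);
    }
  }
  return members as unknown as ParserGlobals;
}
```

A caller could run `assertParserGlobals(globalThis as unknown as Record<string, unknown>)` once before invoking any parsing code, so a missing initializer surfaces as a clear error.
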
- `loadPub` validates input.
- `startParse` enforces filename rules and issue minimums:
  - MWB >= `202401`
  - W >= `202310`
- Publication bytes are loaded (file path, Blob, or URL).
- Zip safety checks run (`max files`, `max total size`, path traversal).
- Branch by extension:
  - `.epub` -> XHTML scan and parse.
  - `.jwpub` -> decrypt DB documents and parse HTML payloads.
- Schedules are assembled in `html_utils.ts`.
- If language is in enhanced list (`src/config/enhanced_languages.ts`), date/source parts are normalized:
  - Language-specific profile (`src/locales/<locale>/profile.ts`) defines regex patterns for dates and sources.
  - Date parser uses these patterns to extract and structure date fields.
  - Source parser extracts and normalizes assignment information.
  - Text overrides apply locale-specific content fixes.
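
The zip safety step in the flow above (file count, total size, path traversal) can be sketched as follows. This is an illustration under stated assumptions, not the code in `src/common/file_validation.ts`: the `ZipEntryInfo` shape, the `checkZipSafety` name, and both limit values are hypothetical, since the real thresholds are not given here.

```typescript
// Entry metadata as it might be read from a zip directory (hypothetical shape).
interface ZipEntryInfo {
  entryPath: string;
  uncompressedSize: number;
}

// Placeholder limits: the real thresholds are not stated in this document.
const ASSUMED_MAX_FILES = 300;
const ASSUMED_MAX_TOTAL_BYTES = 100 * 1024 * 1024;

// Throws on the first violated rule; returns normally when the archive looks safe.
function checkZipSafety(entries: ZipEntryInfo[]): void {
  if (entries.length > ASSUMED_MAX_FILES) {
    throw new Error("zip contains too many files");
  }
  let totalBytes = 0;
  for (const entry of entries) {
    totalBytes += entry.uncompressedSize;
    if (totalBytes > ASSUMED_MAX_TOTAL_BYTES) {
      throw new Error("zip is too large when extracted");
    }
    // Normalize separators, then reject absolute paths and ".." segments.
    const normalized = entry.entryPath.replace(/\\/g, "/");
    const isAbsolute = normalized.charAt(0) === "/" || /^[A-Za-z]:/.test(normalized);
    const hasParentSegment = normalized.split("/").some((segment) => segment === "..");
    if (isAbsolute || hasParentSegment) {
      throw new Error(`unsafe path in zip: ${entry.entryPath}`);
    }
  }
}
```

Running the checks on directory metadata, before any entry is extracted, keeps a hostile archive from consuming memory or writing outside the target directory.
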
Types are in `src/types/index.ts`:
- `MWBSchedule`
- `WSchedule`
Enhanced fields are optional and language-dependent (for example locale date vs normalized date).
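
Because enhanced fields may be absent, consumers need a fallback. The sketch below shows one way to handle that. `SketchSchedule` and both of its field names are hypothetical stand-ins; the real field names are defined in `src/types/index.ts` and may differ.

```typescript
// Hypothetical, heavily reduced stand-in for a schedule type.
interface SketchSchedule {
  dateText: string; // date as printed in the publication (assumed always present)
  normalizedDate?: string; // assumed to be set only for enhanced languages
}

// Prefer the normalized value when it exists, otherwise fall back to the raw text.
function displayDate(schedule: SketchSchedule): string {
  return schedule.normalizedDate !== undefined
    ? schedule.normalizedDate
    : schedule.dateText;
}
```
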
- File naming is strict (`mwb_[A-Z]{1,3}_YYYYMM.*` / `w_[A-Z]{1,3}_YYYYMM.*`).
- JWPUB is expected to contain exactly 2 top-level files in zip.
- Locale text comes from Crowdin-managed JSON files.
- Locale parser behavior is defined in per-locale `profile.ts` and wired through `src/config/language_profile_overrides.ts`.
- Enhanced parsing logic is regex-driven and sensitive to punctuation/spacing changes; keep changes narrowly scoped and fixture-backed.
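
The strict naming rule, combined with the issue minimums from the parsing flow (MWB >= 202401, W >= 202310), can be sketched as a single check. This is a sketch under assumptions, not the project's `startParse` logic: `parsePubFileName`, `ParsedPubName`, and the reading of `YYYYMM` as exactly six digits are all hypothetical.

```typescript
type PubKind = "mwb" | "w";

interface ParsedPubName {
  kind: PubKind;
  languageCode: string;
  issue: number; // YYYYMM as a number, e.g. 202401
}

// Minimum supported issues, as listed in the parsing flow.
const MIN_ISSUE: Record<PubKind, number> = { mwb: 202401, w: 202310 };

// Anchored version of the documented naming rule; YYYYMM is read as six digits.
const PUB_NAME_PATTERN = /^(mwb|w)_([A-Z]{1,3})_(\d{6})/;

// Returns the parsed parts, or null when the name or the issue is not acceptable.
function parsePubFileName(fileName: string): ParsedPubName | null {
  const match = PUB_NAME_PATTERN.exec(fileName);
  if (match === null) return null;
  const kind = match[1] as PubKind;
  const issue = Number(match[3]);
  if (issue < MIN_ISSUE[kind]) return null;
  return { kind, languageCode: match[2], issue };
}
```

For example, `parsePubFileName("mwb_E_202401.jwpub")` yields a parsed result, while `parsePubFileName("w_E_202309.epub")` returns `null` because the issue predates the W minimum.
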
Common commands:
- `npm run build`
- `npm test`
- `npm run parse`
- `npm run locale:check`
- `npm run locale:new -- --code <CODE> --locale <locale> [--enhanced true|false]`
- `npm run cypress:open`
- `npm run cypress:run`
Notes:
- `test/unit/` contains unit tests for language-specific parsing.
- `test/e2e/01_standardParsing.test.js` and `test/e2e/02_enhancedParsing.test.js` are integration tests that pull data from JW CDN.
- Tests compare parser output with fixtures in `test/e2e/fixtures/`.
- Tests import from `dist/node/index.js`, so build first when changing source.
- Do not edit `dist/*` manually.
- Prefer edits in `src/common/*` for parser behavior.
- When adding/updating enhanced parsing for a language:
  - Ensure language code exists in globals (`utils.node.ts` and `utils.browser.ts`) and `src/config/enhanced_languages.ts`.
  - Sync/update locale tokens in `src/locales/<locale>/text.json` (Crowdin-managed).
  - Add/update `src/locales/<locale>/profile.ts` with:
    - Date regex patterns (`mwbDatePatterns`, `wDatePatterns`)
    - Source parsing patterns (`sourcePatternOptions`)
    - Text overrides for one-off content fixes (`textOverrides` object)
    - Locale-specific settings (direction, normalizers, etc.)
  - Validate with `npm run locale:check`, `npm run build`, and `npm run test`.
- Language-specific text overrides previously in `src/common/override.ts` (removed) are now handled per-locale via the `textOverrides` property in `src/locales/<locale>/profile.ts`.
- Prefer per-locale `textOverrides` in profile for one-off content fixes; use parser regex updates in `parsing_rules.ts` for recurring structural patterns.
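
To make the profile checklist above concrete, here is a rough sketch of what a per-locale profile and its use might look like. Only the property names (`mwbDatePatterns`, `wDatePatterns`, `sourcePatternOptions`, `textOverrides`, direction) come from the checklist. Every type, every sample value, and the two helper functions are assumptions for illustration and may not match the real `profile.ts` files.

```typescript
// Property names follow the checklist above; every type here is an assumption.
interface SketchLocaleProfile {
  direction: "ltr" | "rtl";
  mwbDatePatterns: RegExp[];
  wDatePatterns: RegExp[];
  sourcePatternOptions: Record<string, unknown>; // shape not described here
  textOverrides: Record<string, string>;
}

// Invented sample values, for illustration only.
const sketchProfile: SketchLocaleProfile = {
  direction: "ltr",
  // Accepts both "-" and the en dash (U+2013) between day numbers.
  mwbDatePatterns: [/^([A-Za-z]+) (\d{1,2})[-\u2013](\d{1,2})$/],
  wDatePatterns: [],
  sourcePatternOptions: {},
  textOverrides: { "Januray 1-7": "January 1-7" },
};

// Replace a known-bad string with its fix; pass other text through unchanged.
function applyTextOverride(profile: SketchLocaleProfile, text: string): string {
  const hasFix = Object.prototype.hasOwnProperty.call(profile.textOverrides, text);
  return hasFix ? profile.textOverrides[text] : text;
}

// Return the first pattern match, or null if no pattern applies.
function matchFirstPattern(patterns: RegExp[], text: string): RegExpExecArray | null {
  for (const pattern of patterns) {
    const found = pattern.exec(text);
    if (found !== null) return found;
  }
  return null;
}
```

In this sketch the override handles a one-off typo in a single string, while the regex covers the recurring structure. Note that `"January 1 - 7"` (extra spaces) does not match, which illustrates why regex changes should be narrowly scoped and fixture-backed.
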
- `src/common/parsing_rules.ts`: assignment parsing by regex (including RTL/bidi normalization work).
- `src/common/date_parser.ts`: strategy dispatch and date extraction flow.
- `src/common/source_strategies.ts`: regex assembly, bidi stripping, and numeral normalization helpers.
- `src/common/html_utils.ts`: DOM selector assumptions for multiple publication layouts.
- `src/common/jwpub_parser.ts`: decryption/DB extraction path.
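
For readers unfamiliar with bidi stripping and numeral normalization, the sketch below shows the general technique on plain strings. It is not the code in `src/common/source_strategies.ts`: the function names are hypothetical, and the choice to cover only Arabic-Indic and Extended Arabic-Indic digits is an assumption.

```typescript
// Unicode bidi control characters: LRM, RLM, ALM, embeddings/overrides, isolates.
const BIDI_CONTROLS = /[\u200E\u200F\u061C\u202A-\u202E\u2066-\u2069]/g;

// Remove invisible direction marks so regexes see only the visible text.
function stripBidiControls(text: string): string {
  return text.replace(BIDI_CONTROLS, "");
}

// Map Arabic-Indic (U+0660-0669) and Extended Arabic-Indic (U+06F0-06F9)
// digits to their ASCII equivalents; all other characters are left alone.
function normalizeDigits(text: string): string {
  return text.replace(/[\u0660-\u0669\u06F0-\u06F9]/g, (digit) => {
    const code = digit.charCodeAt(0);
    const base = code >= 0x06f0 ? 0x06f0 : 0x0660;
    return String(code - base);
  });
}
```

Applying both helpers before regex matching lets a single `\d`-based pattern handle text that mixes invisible direction marks with non-ASCII digits.
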
- Demo app is under `client/`.
- Browser setup requires copied wasm (`copy-wasm.js`) and bundler external config for Node built-ins.