npm install
npm run build
# Or use directly with tsx:
npx tsx bin/websource.ts <command>

Guided conversational setup to create a new data source. Analyzes the URL, proposes extractable fields, and saves a reusable config.

websource init https://books.toscrape.com

Analyze a website without creating a source. Shows page type, repeated blocks, suggested fields, pagination, and detail links.

websource scan https://example.com/products
websource scan https://example.com/jobs --rendered # Force browser rendering

List all registered sources.

websource sources list
websource sources list --status active

Show detailed info about a source: config, fields, schedule, recent runs.

websource sources show abc123

Dry-run extraction: fetches and extracts but does not save results.

websource preview abc123
websource preview abc123 --limit 5

Run extraction and save results. Automatically computes diff against the previous snapshot.

websource extract abc123

Show changes between the last two extractions.

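To make the idea of a snapshot diff concrete, here is a minimal sketch that compares two extraction runs by a key field. The `Row` type, the `diffSnapshots` function, and the choice of key are hypothetical and are not taken from websource's code; the tool's actual diff logic may work differently.

```typescript
// Hypothetical sketch: websource's real record shape and diff algorithm may differ.
type Row = Record<string, string | number | null>;

interface SnapshotDiff {
  added: Row[];
  removed: Row[];
  changed: { before: Row; after: Row }[];
}

// Index rows by the value of `key` so rows can be matched across runs.
function indexBy(rows: Row[], key: string): Map<string, Row> {
  const index = new Map<string, Row>();
  for (const row of rows) {
    index.set(String(row[key]), row);
  }
  return index;
}

// Shallow comparison over the union of both rows' field names.
function sameRow(a: Row, b: Row): boolean {
  const keys = Object.keys(a).concat(Object.keys(b));
  return keys.every((k) => a[k] === b[k]);
}

function diffSnapshots(previous: Row[], current: Row[], key: string): SnapshotDiff {
  const before = indexBy(previous, key);
  const after = indexBy(current, key);
  const diff: SnapshotDiff = { added: [], removed: [], changed: [] };

  // Rows present now: either new, or possibly changed since the previous run.
  after.forEach((row, id) => {
    const old = before.get(id);
    if (old === undefined) {
      diff.added.push(row);
    } else if (!sameRow(old, row)) {
      diff.changed.push({ before: old, after: row });
    }
  });

  // Rows that existed before but are missing now.
  before.forEach((row, id) => {
    if (!after.has(id)) diff.removed.push(row);
  });

  return diff;
}
```

Keying on a stable field (a URL or product ID, say) keeps a reordered page from showing up as a wall of changes.
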
websource diff abc123
websource diff abc123 --run-a <runId> --run-b <runId> # Compare specific runs

Set a refresh schedule. Supports presets (hourly, daily, weekly) or cron expressions.

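For readers less familiar with cron syntax, the sketch below expands the minute and hour fields of a five-field expression into the values they match. It is a simplified illustration of standard cron field syntax, not websource's scheduler; `expandCronField` and `firingTimes` are hypothetical names, and how the presets map to cron expressions is not covered here.

```typescript
// Simplified sketch of standard five-field cron syntax; not websource's scheduler.
// Supports "*", "*/n", "a", "a-b", "a-b/n", and comma-separated lists of those.
// Forms like "5/15" are not given any special meaning here.

function toInt(text: string): number {
  if (!/^\d+$/.test(text)) throw new Error(`Invalid cron number: "${text}"`);
  return Number(text);
}

function expandCronField(field: string, min: number, max: number): number[] {
  const values = new Set<number>();
  for (const part of field.split(",")) {
    const pieces = part.split("/");
    const range = pieces[0] ?? "";
    const step = pieces.length > 1 ? toInt(pieces[1] ?? "") : 1;

    // "*" covers the whole field; otherwise parse "a" or "a-b".
    let start = min;
    let end = max;
    if (range !== "*") {
      const bounds = range.split("-");
      start = toInt(bounds[0] ?? "");
      end = bounds.length > 1 ? toInt(bounds[1] ?? "") : start;
    }

    if (step < 1 || start < min || end > max || start > end) {
      throw new Error(`Cron field out of range: "${part}"`);
    }
    for (let v = start; v <= end; v += step) values.add(v);
  }
  return Array.from(values).sort((a, b) => a - b);
}

// Only the minute and hour fields are expanded; the remaining three are ignored.
function firingTimes(expression: string): { minutes: number[]; hours: number[] } {
  const fields = expression.trim().split(/\s+/);
  if (fields.length !== 5) throw new Error("Expected five cron fields");
  return {
    minutes: expandCronField(fields[0] ?? "", 0, 59),
    hours: expandCronField(fields[1] ?? "", 0, 23),
  };
}
```

With this sketch, `firingTimes("0 */6 * * *")` yields minutes `[0]` and hours `[0, 6, 12, 18]`, which is why that expression reads as "every 6 hours".
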
websource schedule abc123 daily
websource schedule abc123 "0 */6 * * *" # Every 6 hours

Start the local API server and scheduler.

websource serve
websource serve --port 4000
websource serve --scheduler-only # No HTTP, just run scheduled extractions

Export extracted data to JSON or CSV.

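If you post-process exported data yourself, the main CSV pitfall is quoting. The sketch below shows one way to serialize extracted rows with RFC 4180 style escaping. The `toCsv` helper and the row shape are hypothetical; nothing here describes how websource's own exporter is implemented.

```typescript
// Hypothetical helper, not part of websource: serializes rows to CSV text.
type Cell = string | number | boolean | null;

// Quote a value only when it contains a quote, comma, or line break;
// embedded quotes are doubled, as RFC 4180 describes.
function escapeCsv(value: Cell): string {
  const text = value === null ? "" : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// `columns` fixes the column order; fields missing from a row become empty cells.
function toCsv(rows: Record<string, Cell>[], columns: string[]): string {
  const header = columns.map(escapeCsv).join(",");
  const lines = rows.map((row) =>
    columns.map((column) => escapeCsv(row[column] ?? null)).join(","),
  );
  return [header, ...lines].join("\r\n");
}
```

Passing the column list explicitly keeps the output stable even when some rows lack a field.
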
websource export abc123 --format json
websource export abc123 --format csv --output data.csv
websource export abc123 --run <runId> # Export from a specific run

Run diagnostics: check database, Playwright, source reachability, selector health.

websource doctor

| Variable | Description | Default |
|---|---|---|
| WEBSOURCE_DATA_DIR | Data directory (DB, logs, locks) | ~/.local/share/websource |
| WEBSOURCE_CONFIG_DIR | Config directory | ~/.config/websource |
| LOG_LEVEL | Log level (debug, info, warn, error) | info |
| NODE_ENV | Set to production for file-based logging | (none) |
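
The table above can be read as a small resolution function: each variable overrides a default. The sketch below illustrates that reading. `Settings` and `resolveSettings` are hypothetical names, not websource's internals, and rejecting an unrecognized `LOG_LEVEL` is an assumption; the tool may fall back to a default instead.

```typescript
// Hypothetical sketch of how the documented variables and defaults could combine.
interface Settings {
  dataDir: string;
  configDir: string;
  logLevel: "debug" | "info" | "warn" | "error";
  logToFile: boolean;
}

type Env = Record<string, string | undefined>;

const LOG_LEVELS = ["debug", "info", "warn", "error"] as const;

// `home` stands in for the user's home directory (the "~" in the table).
function resolveSettings(env: Env, home: string): Settings {
  const requested = env["LOG_LEVEL"] ?? "info";
  const logLevel = LOG_LEVELS.find((level) => level === requested);
  if (logLevel === undefined) {
    // Assumption: this sketch rejects unknown levels rather than ignoring them.
    throw new Error(`Unsupported LOG_LEVEL: ${requested}`);
  }
  return {
    dataDir: env["WEBSOURCE_DATA_DIR"] ?? `${home}/.local/share/websource`,
    configDir: env["WEBSOURCE_CONFIG_DIR"] ?? `${home}/.config/websource`,
    logLevel,
    logToFile: env["NODE_ENV"] === "production",
  };
}
```

In a Node.js program you would typically call this with `process.env` and the result of `os.homedir()`.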