export-obsidian.mjs exports the Supabase database into an Obsidian-friendly
folder structure. It pulls issues, pages, and OCR generation history, then
writes Markdown notes that link to each other.
This script mirrors the .env loading pattern used in other root scripts.
- Issues (from public.issues)
- Pages (from public.pages)
- OCR generations (from public.ocr_generations)
Each issue is grouped into a year based on publication_date, title, or
volume (see "Year inference" below).
The export creates (or updates) these files under the target folder:
- Overview.md
- Years/<year>.md
- Issues/<issue-slug>.md
- Generations/<issue-slug>-page-###-gen-<id>.md
- Dashboards/Authors.md
- Dashboards/Tags.md
- Dashboards/Recent Issues.md
Where:
- <year> is a 4-digit year or unknown
- <issue-slug> is a lowercase, dash-separated slug derived from the issue title
- ### is the zero-padded page number (e.g., 001)
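The naming scheme for generation notes can be sketched as follows. This is a minimal illustration, not the script's actual code: generationNotePath is a hypothetical helper name, and the example slug and id are made up.

```javascript
// Hypothetical helper (not taken from export-obsidian.mjs): builds the
// relative path of a generation note from its three components.
function generationNotePath(issueSlug, pageNumber, generationId) {
  // Zero-pad the page number to three digits, e.g. 1 -> "001".
  const page = String(pageNumber).padStart(3, '0');
  return `Generations/${issueSlug}-page-${page}-gen-${generationId}.md`;
}

console.log(generationNotePath('kitan-1935-04', 1, 42));
// prints "Generations/kitan-1935-04-page-001-gen-42.md"
```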
Overview.md
- Links to each year and issue.
Years/<year>.md
- Links to all issues in that year.
Issues/<issue-slug>.md
- Metadata: id, volume, publication_date, authors, tags, created_at, updated_at
- Stats: total pages, pages with OCR text, pages with OCR generations
- One line per page with an online image link and OCR generation links
Generations/<issue-slug>-page-###-gen-<id>.md
- Metadata: generation id, page id, issue id, created_at, model, image_path
- OCR prompt/output from ocr_generations
Dashboards/Authors.md
- Groups issues by author, listing each author, the number of issues they appear on, and links to the related issue notes.
- Uses the authors field already fetched from issues (no extra API calls or dependencies).
Dashboards/Tags.md
- Groups issues by tag, showing each tag, how many issues use it, and links to those issue notes.
- Uses the tags field already fetched from issues (no extra API calls or dependencies).
Dashboards/Recent Issues.md
- Lists the most recently created issues (sorted by created_at) with links to their notes and any available authors and tags metadata inline.
- Uses existing issue metadata only; no new dependencies are required.
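A rough sketch of how the Recent Issues ordering might be computed from rows already in memory. The function name recentIssues and the limit of 10 are assumptions for illustration; the script may use a different cutoff or none at all.

```javascript
// Hypothetical helper: newest issues first, based on created_at.
// The default limit of 10 is an assumption, not a documented value.
function recentIssues(issues, limit = 10) {
  return [...issues] // copy so the caller's array is not reordered
    .sort((a, b) => new Date(b.created_at) - new Date(a.created_at))
    .slice(0, limit);
}

const sample = [
  { title: 'A', created_at: '2024-01-05T10:00:00Z' },
  { title: 'B', created_at: '2024-03-01T08:30:00Z' },
  { title: 'C', created_at: '2023-12-24T00:00:00Z' },
];
console.log(recentIssues(sample, 2).map((issue) => issue.title));
// prints [ 'B', 'A' ]
```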
Authors and tags are read directly from the issues rows. Arrays are used as-is; comma, semicolon, or newline-delimited strings are split into individual entries before grouping and linking.
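The normalization and grouping described above could look roughly like this. The names toList and groupIssuesBy are hypothetical, and trimming whitespace and dropping empty entries are assumptions beyond what is stated here.

```javascript
// Hypothetical helper: turn an authors/tags value into a list of entries.
function toList(value) {
  if (Array.isArray(value)) return value; // arrays are used as-is
  if (typeof value !== 'string') return []; // null/undefined: no entries
  return value
    .split(/[,;\n]/) // comma, semicolon, or newline delimited
    .map((entry) => entry.trim()) // assumption: trim surrounding whitespace
    .filter(Boolean); // assumption: drop empty entries
}

// Hypothetical helper: Map of entry -> issues that carry it.
function groupIssuesBy(issues, field) {
  const groups = new Map();
  for (const issue of issues) {
    for (const key of toList(issue[field])) {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(issue);
    }
  }
  return groups;
}

console.log(toList('Tanaka; Sato,  Ito\nMori'));
// prints [ 'Tanaka', 'Sato', 'Ito', 'Mori' ]
```

Calling groupIssuesBy(issues, 'authors') or groupIssuesBy(issues, 'tags') would then give the per-author and per-tag counts (the length of each group) without further queries.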
Required:
- VITE_SUPABASE_URL
- VITE_SUPABASE_ANON_KEY
Optional:
- VITE_IMAGE_BASE_URL (prefix for pages.image_path when linking images)
- OBSIDIAN_EXPORT_DIR (used if --out is not provided)
The script looks for .env in:
- ./.env (project root) if present
- ./kitanocr-web/.env otherwise
You can override with --env <path>.
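The lookup order can be summarized in a few lines. This is a sketch under assumptions: resolveEnvPath is a hypothetical name, and the existence check is passed in as a parameter (in practice it could be fs.existsSync) so the example runs without touching the file system.

```javascript
// Hypothetical helper: decide which env file to load.
// `exists` is any function that reports whether a path exists.
function resolveEnvPath(cliEnvPath, exists) {
  if (cliEnvPath) return cliEnvPath; // --env <path> takes precedence
  return exists('./.env') ? './.env' : './kitanocr-web/.env';
}

console.log(resolveEnvPath(undefined, () => true)); // prints "./.env"
console.log(resolveEnvPath(undefined, () => false)); // prints "./kitanocr-web/.env"
console.log(resolveEnvPath('custom.env', () => true)); // prints "custom.env"
```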
    node export-obsidian.mjs --out /path/to/ObsidianVault
Other CLI options:
- --out, --folder, or --vault to set the output folder
- --env <path> to set the env file
- -h or --help to print usage
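For orientation, option handling along these lines could be written as below. It is a sketch, not the script's actual parser: parseArgs and resolveOutDir are hypothetical names, and the behavior for unknown flags (ignored here) is an assumption.

```javascript
// Hypothetical parser for the flags listed above.
function parseArgs(argv) {
  const opts = { out: null, env: null, help: false };
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === '--out' || arg === '--folder' || arg === '--vault') {
      opts.out = argv[++i] ?? null; // the next token is the folder
    } else if (arg === '--env') {
      opts.env = argv[++i] ?? null; // the next token is the env file path
    } else if (arg === '-h' || arg === '--help') {
      opts.help = true;
    }
    // Assumption: unrecognized arguments are silently ignored.
  }
  return opts;
}

// Hypothetical helper: OBSIDIAN_EXPORT_DIR is used only if no folder flag was given.
function resolveOutDir(opts, env) {
  return opts.out ?? env.OBSIDIAN_EXPORT_DIR ?? null;
}

const opts = parseArgs(['--vault', '/tmp/vault', '--env', '.env.local']);
console.log(opts.out, opts.env); // prints "/tmp/vault .env.local"
```

In the real script the arguments would come from process.argv.slice(2) and the environment from process.env.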
Tables and columns used:
public.issues
- id, title, volume, publication_date, authors, tags, created_at, updated_at
- Dashboards assume authors and tags are available (arrays or delimited strings) so they can group and label issues without extra queries.
public.pages
- id, issue_id, page_number, image_path, status, ocr_text, created_at, updated_at
public.ocr_generations
- id, page_id, model, prompt, output, metadata, created_at
If columns are added/removed, update the select('*') fields or note templates.
Year is determined in this order:
- publication_date (parsed as UTC year)
- title/volume containing a 4-digit year
- title/volume that looks like YYYYMM or YYYYMMDD (uses the first 4 digits)
- Fallback to unknown
To change this behavior, update inferYear() in export-obsidian.mjs.
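As a reading aid, the order above might translate into something like the following. This is an approximation and not the real inferYear(): the name inferYearSketch is hypothetical, and the exact patterns (a standalone 4-digit run, then a standalone 6- or 8-digit run) are assumptions about how each step is matched.

```javascript
// Hypothetical approximation of the documented year inference order.
function inferYearSketch(issue) {
  // 1. publication_date, read as a UTC year.
  if (issue.publication_date) {
    const date = new Date(issue.publication_date);
    if (!Number.isNaN(date.getTime())) return String(date.getUTCFullYear());
  }
  const texts = [issue.title, issue.volume].filter((t) => typeof t === 'string');
  // 2. A standalone 4-digit number in title or volume.
  for (const text of texts) {
    const match = text.match(/(?<!\d)\d{4}(?!\d)/);
    if (match) return match[0];
  }
  // 3. A YYYYMM or YYYYMMDD run: keep the first four digits.
  for (const text of texts) {
    const match = text.match(/(?<!\d)(\d{4})(?:\d{2}|\d{4})(?!\d)/);
    if (match) return match[1];
  }
  // 4. Nothing matched.
  return 'unknown';
}

console.log(inferYearSketch({ publication_date: '1935-04-01' })); // prints "1935"
console.log(inferYearSketch({ title: 'Spring 1936 issue' })); // prints "1936"
console.log(inferYearSketch({ title: 'Issue', volume: '19370415' })); // prints "1937"
console.log(inferYearSketch({ title: 'Untitled' })); // prints "unknown"
```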
Issue slugs:
- Lowercase
- Non-alphanumeric characters replaced with -
- Multiple dashes collapsed
To change this behavior, update slugify() in export-obsidian.mjs.
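The three rules could be applied as in this sketch. It is not the real slugify(): slugifySketch is a hypothetical name, and treating "alphanumeric" as ASCII letters and digits only is an assumption.

```javascript
// Hypothetical approximation of the slug rules listed above.
function slugifySketch(title) {
  return String(title)
    .toLowerCase() // rule 1: lowercase
    .replace(/[^a-z0-9]/g, '-') // rule 2: non-alphanumeric -> dash (ASCII only, an assumption)
    .replace(/-+/g, '-'); // rule 3: collapse runs of dashes
}

console.log(slugifySketch('Kitan Club  1952/04'));
// prints "kitan-club-1952-04"
```

Note that this sketch leaves a leading or trailing dash in place if the title starts or ends with punctuation; whether the actual function trims them is not stated here.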
- Export pagination is fixed at 1000 rows per request.
- Obsidian notes are overwritten on each run.
- OCR generation history is written to separate generation notes.
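Paging in fixed blocks of 1000 rows can be sketched with a generic loop. This is an illustration under assumptions: fetchAllRows and fetchPage are hypothetical names, and the stop condition (a batch shorter than the page size) is one common approach, not necessarily what the script does.

```javascript
const PAGE_SIZE = 1000; // matches the fixed page size noted above

// Hypothetical helper: fetchPage(from, to) must resolve to an array of rows
// for the inclusive index range [from, to].
async function fetchAllRows(fetchPage) {
  const rows = [];
  for (let from = 0; ; from += PAGE_SIZE) {
    const batch = await fetchPage(from, from + PAGE_SIZE - 1);
    rows.push(...batch);
    if (batch.length < PAGE_SIZE) break; // a short batch means no more rows
  }
  return rows;
}

// Usage with an in-memory stand-in for a table of 2500 rows.
// With supabase-js, fetchPage could instead wrap a query such as
// supabase.from('issues').select('*').range(from, to) and return its data.
const fakeTable = Array.from({ length: 2500 }, (_, i) => ({ id: i }));
const fakeFetchPage = async (from, to) => fakeTable.slice(from, to + 1);

fetchAllRows(fakeFetchPage).then((rows) => console.log(rows.length));
// prints 2500
```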
If exports become large, consider:
- Adding filters (by year, issue, or page range)
- Splitting OCR generations into separate notes
- Using select() with only necessary columns to reduce payload size