Add CVFileLoader integration page #3969

Open
Ilansky (ilanoh) wants to merge 1 commit into langchain-ai:main from ilanoh:cvfile-loader

Conversation

@ilanoh

This PR adds a document loader integration page for langchain-cvfile, which loads .cv files (an open file format for resumes/CVs: PDF/A-3u with embedded Markdown/HTML/JSON payloads carried as PDF Associated Files).

Why

A .cv file is a PDF/A-3u document that carries a Markdown copy of its content (plus optional HTML and JSON Resume versions) as PDF Associated Files (ISO 32000-2 §14.13). For RAG/ATS use cases, the embedded Markdown is a much cleaner text representation than OCRing the visual PDF layer. This loader reads those embedded payloads directly and returns one Document per textual payload. The payload that the file's XMP metadata declares as cv:primaryPayload is flagged with metadata["primary"] = True.
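To make the payload-to-Document mapping concrete, here is a minimal sketch. It does not use langchain-cvfile and is not the loader's actual implementation: `Doc`, `payloads_to_docs`, and the `source` metadata key are hypothetical stand-ins, and the payloads are passed in as plain strings rather than extracted from a PDF.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document."""

    page_content: str
    metadata: dict = field(default_factory=dict)


def payloads_to_docs(payloads: dict[str, str], primary_name: str) -> list[Doc]:
    """Build one Doc per textual payload and flag the declared primary one."""
    return [
        Doc(
            page_content=text,
            # "source" is an assumed key; only "primary" is named in this PR.
            metadata={"source": name, "primary": name == primary_name},
        )
        for name, text in payloads.items()
    ]


# Illustrative payloads; a real .cv file carries these as PDF Associated Files.
docs = payloads_to_docs(
    {
        "resume.md": "# Jane Doe",
        "resume.html": "<h1>Jane Doe</h1>",
        "resume.json": "{}",
    },
    primary_name="resume.md",
)
```

In this sketch exactly one of the three documents ends up with `metadata["primary"]` set to `True`, mirroring the role of cv:primaryPayload described above.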

What this PR contains

A single new MDX page at src/oss/python/integrations/document_loaders/cvfile.mdx, following the template used by PyPDFLoader and similar loader pages (frontmatter, integration details table, loader features table, setup, initialization, load + lazy_load examples, metadata reference, API reference link).

Package details

Code sample is runnable

The example on the page works against any .cv file produced by the reference SDK. The sample output on the page was generated against an actual fixture; running loader.load() returns three Documents for a typical resume (resume.md primary, resume.html alternate, resume.json alternate).
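For an indexing pipeline that wants only one text per resume, the primary flag can be used to choose among the returned Documents. The following is a rough sketch under stated assumptions: `pick_primary` is a hypothetical helper, and the `loaded` list uses `SimpleNamespace` objects to imitate the shape of the three Documents described above rather than calling the loader itself.

```python
from types import SimpleNamespace


def pick_primary(docs):
    """Return the doc flagged primary, else the first doc, else None."""
    for doc in docs:
        if doc.metadata.get("primary"):
            return doc
    return docs[0] if docs else None


# Stand-ins shaped like the three Documents described for a typical resume.
loaded = [
    SimpleNamespace(page_content="# Jane Doe", metadata={"primary": True}),
    SimpleNamespace(page_content="<h1>Jane Doe</h1>", metadata={"primary": False}),
    SimpleNamespace(page_content="{}", metadata={"primary": False}),
]

best = pick_primary(loaded)
```

Falling back to the first document when nothing is flagged is a design choice of this sketch, not documented loader behavior.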

@github-actions github-actions Bot added the external User is not a member of langchain-ai label May 12, 2026
@github-actions
Contributor

Thanks for opening a docs PR, Ilansky (@ilanoh)! When it's ready for review, please add the relevant reviewers:

  • @mdrxy (Python integrations)

@github-actions github-actions Bot added langchain For docs changes to LangChain oss python For content related to the Python version of LangChain projects labels May 12, 2026
Labels

external User is not a member of langchain-ai langchain For docs changes to LangChain oss python For content related to the Python version of LangChain projects
