A Docling integration for LangChain.
Simply install langchain-docling from your package manager, e.g. pip:
pip install langchain-docling

To develop for Docling Core, you need Python >=3.9 <=3.13 and uv. You can then install from your local clone's root dir:
uv sync

Basic usage of DoclingLoader looks as follows:
from langchain_docling import DoclingLoader
FILE_PATH = ["https://arxiv.org/pdf/2408.09869"] # Docling Technical Report
loader = DoclingLoader(file_path=FILE_PATH)
docs = loader.load()

When initializing a DoclingLoader, you can use the following parameters:
- file_path: source as single str (URL or local file) or iterable thereof
- converter (optional): any specific Docling converter instance to use
- convert_kwargs (optional): any specific kwargs for conversion execution
- export_type (optional): export mode to use: ExportType.DOC_CHUNKS (default) or ExportType.MARKDOWN
- md_export_kwargs (optional): any specific Markdown export kwargs (for Markdown mode)
- chunker (optional): any specific Docling chunker instance to use (for doc-chunk mode)
- meta_extractor (optional): any specific metadata extractor to use
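
As a rough illustration of the parameters above, the sketch below switches the loader to Markdown export mode. The helper names as_source_list and load_as_markdown are hypothetical and not part of the package, and the import path langchain_docling.loader for ExportType is an assumption to check against your installed version. In Markdown mode each source should come back as a single document rather than as a series of chunks.

```python
from typing import Iterable, List, Union


def as_source_list(file_path: Union[str, Iterable[str]]) -> List[str]:
    """Normalize a single source or an iterable of sources to a list.

    Hypothetical helper for illustration only; DoclingLoader itself
    already accepts both forms for file_path.
    """
    if isinstance(file_path, str):
        return [file_path]
    return list(file_path)


def load_as_markdown(file_path: Union[str, Iterable[str]]):
    """Load the given sources using the Markdown export mode."""
    # Imported lazily so the sketch can be imported without the
    # package installed. The ExportType import path is an assumption.
    from langchain_docling import DoclingLoader
    from langchain_docling.loader import ExportType

    loader = DoclingLoader(
        file_path=as_source_list(file_path),
        export_type=ExportType.MARKDOWN,  # default is ExportType.DOC_CHUNKS
    )
    return loader.load()
```

Calling load_as_markdown("https://arxiv.org/pdf/2408.09869") would then return the loaded documents. Other options such as md_export_kwargs could be passed to DoclingLoader in the same way.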
The package also provides a helper to ingest PDF files into a Neo4j vector store using Docling's hybrid chunking strategy:
from langchain_openai import OpenAIEmbeddings
from langchain_docling import ingest_pdfs_to_neo4j
vector_store = ingest_pdfs_to_neo4j(
    file_paths=["/path/to/report.pdf"],
    embedding=OpenAIEmbeddings(),
    url="bolt://localhost:7687",
    username="neo4j",
    password="secret",
)

For more details and usage examples, check out this page.