In addition to the general contribution guidelines, there are a few extra things to consider when contributing third-party integrations to LangChain that will be covered here. The goal of this page is to help you draft PRs that take these considerations into account, and can therefore be merged sooner.
Integrations tend to fall into a set number of categories, each of which has its own section below. Please read the general guidelines, then see the integration-specific guidelines and example PRs section at the end of this page for additional information and examples.
The following guidelines apply broadly to all types of integrations:
You should generally not export your new module from an index.ts file that contains many other exports. Instead, you should add a separate entrypoint for your integration in libs/community/langchain-community/langchain.config.js within the entrypoints field in the config object:
export const config = {
  internals: [ ... ],
  entrypoints: {
    load: "load/index",
    ...
    "vectorstores/chroma": "vectorstores/chroma",
    "vectorstores/hnswlib": "vectorstores/hnswlib",
    ...
  },
  ...
}
The entrypoint name should conform to its path in the repo. For example, if you were adding a new vector store for a hypothetical provider "langco", you might create it under vectorstores/langco.ts. You should add it above as:
export const config = {
  internals: [ ... ],
  entrypoints: {
    load: "load/index",
    ...
    "vectorstores/chroma": "vectorstores/chroma",
    "vectorstores/hnswlib": "vectorstores/hnswlib",
    "vectorstores/langco": "vectorstores/langco",
    ...
  },
  ...
}
You may use third-party dependencies in new integrations, but they should be added as peerDependencies and devDependencies with an entry under peerDependenciesMeta in libs/community/langchain-community/package.json, not under any core dependencies list. This keeps the overall package size small, as only people who are using your integration will need to install them, and allows us to support a wider range of runtimes.
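As a rough illustration of the dependency layout described above, the sketch below models the relevant package.json fields as a TypeScript object and checks that each peer dependency also appears under devDependencies, has an entry under peerDependenciesMeta, and is not listed under dependencies. The `langco-client` package, its version range, and the `findPeerDependencyProblems` helper are all hypothetical; this is not a script from the LangChain repo. Writing the peerDependenciesMeta entry as `optional: true` is an assumption about how that entry would look.

```typescript
// Minimal shape of the package.json fields discussed above.
interface PackageManifest {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
  peerDependencies?: Record<string, string>;
  peerDependenciesMeta?: Record<string, { optional?: boolean }>;
}

// Hypothetical manifest fragment for the made-up "langco-client" package.
const manifest: PackageManifest = {
  dependencies: {},
  devDependencies: { "langco-client": "^1.2.0" },
  peerDependencies: { "langco-client": "^1.2.0" },
  // Assumption: the meta entry marks the peer dependency as optional.
  peerDependenciesMeta: { "langco-client": { optional: true } },
};

// Returns human-readable problems; an empty list means the layout
// matches the guideline for every declared peer dependency.
function findPeerDependencyProblems(pkg: PackageManifest): string[] {
  const problems: string[] = [];
  for (const name of Object.keys(pkg.peerDependencies ?? {})) {
    if (!(name in (pkg.devDependencies ?? {}))) {
      problems.push(`${name} is missing from devDependencies`);
    }
    if (pkg.peerDependenciesMeta?.[name]?.optional !== true) {
      problems.push(`${name} is not marked optional in peerDependenciesMeta`);
    }
    if (name in (pkg.dependencies ?? {})) {
      problems.push(`${name} should not be listed under dependencies`);
    }
  }
  return problems;
}

console.log(findPeerDependencyProblems(manifest));
```

A check like this could run locally before opening a PR, though the repo's own build and lint steps remain the authoritative validation.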
We suggest using caret syntax (^) for peer dependencies so that they work for a wider range of users and tolerate non-major version updates, since major version updates should (theoretically) be the only breaking ones.
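To make the effect of a caret range concrete, here is a simplified sketch of how such a range admits versions. It assumes plain `x.y.z` versions only (no prerelease tags or build metadata), and the helper names are made up for illustration. In practice the package manager resolves ranges for you, so this is only meant to show which versions a range like `^1.2.3` accepts.

```typescript
type SemVer = [number, number, number];

// Parses a plain "x.y.z" version; anything else is out of scope for this sketch.
function parseVersion(version: string): SemVer {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (!match) throw new Error(`Unsupported version string: ${version}`);
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// Negative if a < b, zero if equal, positive if a > b.
function compareVersions(a: SemVer, b: SemVer): number {
  if (a[0] !== b[0]) return a[0] - b[0];
  if (a[1] !== b[1]) return a[1] - b[1];
  return a[2] - b[2];
}

// Exclusive upper bound of a caret range: bump the left-most non-zero component.
function caretUpperBound(lower: SemVer): SemVer {
  const [major, minor, patch] = lower;
  if (major > 0) return [major + 1, 0, 0];
  if (minor > 0) return [0, minor + 1, 0];
  return [0, 0, patch + 1];
}

function satisfiesCaret(version: string, range: string): boolean {
  if (!range.startsWith("^")) {
    throw new Error("Expected a caret range such as ^1.2.3");
  }
  const lower = parseVersion(range.slice(1));
  const candidate = parseVersion(version);
  return (
    compareVersions(candidate, lower) >= 0 &&
    compareVersions(candidate, caretUpperBound(lower)) < 0
  );
}

console.log(satisfiesCaret("1.4.0", "^1.2.3")); // true: minor update is accepted
console.log(satisfiesCaret("2.0.0", "^1.2.3")); // false: major update is excluded
console.log(satisfiesCaret("0.3.0", "^0.2.3")); // false: for 0.x, a minor bump is excluded
```

Note the last case: for `0.x` versions a caret range is narrower, which matters for young third-party SDKs that have not yet reached 1.0.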
Please make sure all introduced dependencies are permissively licensed (MIT is recommended) and well-supported and maintained.
You must also add your new entrypoint under requiresOptionalDependency in the langchain.config.js file to avoid breaking the build:
export const config = {
  internals: [ ... ],
  entrypoints: {
    load: "load/index",
    ...
    "vectorstores/chroma": "vectorstores/chroma",
    "vectorstores/hnswlib": "vectorstores/hnswlib",
    "vectorstores/langco": "vectorstores/langco",
    ...
  },
  requiresOptionalDependency: [
    ...
    "vectorstores/langco",
    ...
  ],
  ...
}
If you have conformed to all of the above guidelines, you can just import your dependency as normal in your integration's file in the LangChain repo. Developers who import your entrypoint will then see an error message if they are missing the required peer dependency.
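One easy mistake is to list a name under requiresOptionalDependency that does not match any key in entrypoints (for example, because of a typo). The sketch below shows one way to check for that on an abbreviated copy of the config. The `BuildConfig` type and the `findUnregisteredEntrypoints` helper are hypothetical and not part of the LangChain build tooling, and the config object contains only the hypothetical "langco" entry plus the entries shown earlier.

```typescript
// Only the two fields relevant to this check are modeled here.
interface BuildConfig {
  entrypoints: Record<string, string>;
  requiresOptionalDependency: string[];
}

// Abbreviated, illustrative copy of the config; the real file has many more entries.
const config: BuildConfig = {
  entrypoints: {
    load: "load/index",
    "vectorstores/chroma": "vectorstores/chroma",
    "vectorstores/hnswlib": "vectorstores/hnswlib",
    "vectorstores/langco": "vectorstores/langco",
  },
  requiresOptionalDependency: ["vectorstores/langco"],
};

// Returns every name in requiresOptionalDependency that has no matching entrypoint key.
function findUnregisteredEntrypoints(cfg: BuildConfig): string[] {
  return cfg.requiresOptionalDependency.filter(
    (name) => !Object.keys(cfg.entrypoints).includes(name)
  );
}

const unregistered = findUnregisteredEntrypoints(config);
if (unregistered.length > 0) {
  console.error(`Missing entrypoints: ${unregistered.join(", ")}`);
} else {
  console.log("All optional-dependency entries have a matching entrypoint.");
}
```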
Many integrations initialize instances of third-party clients, which often require vendor-specific configuration and options in addition to LangChain-specific configuration. To avoid unnecessary repetition and desyncing, we suggest using imported third-party configuration types whenever available, unless there's a specific reason to only support a subset of these options.
Here's a simplified example:
import {
  LangCoClient,
  LangCoClientOptions,
} from "langco-client";
import { BaseDocumentLoader, DocumentLoader } from "../base.js";
export class LangCoDatasetLoader
  extends BaseDocumentLoader
  implements DocumentLoader
{
  protected langCoClient: LangCoClient;
  protected datasetId: string;
  protected verbose: boolean;
  constructor(
    datasetId: string,
    config: {
      // LangChain-specific option; optional, defaults to false below
      verbose?: boolean;
      // Passed through unchanged to the third-party client
      clientOptions?: LangCoClientOptions;
    }
  ) {
    super();
    this.datasetId = datasetId;
    this.langCoClient = new LangCoClient(config.clientOptions ?? {});
    this.verbose = config.verbose ?? false;
  }
  ...
}
Above, we have a document loader that we're sure will always require a specific datasetId, and then some config properties that could change in the future, including a LangChain-specific configuration property, verbose. We have also put a clientOptions parameter within that config that is passed directly into the third-party client. With this structure, if the underlying client adds new options, all we need to do is bump the version.
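To show the pass-through from the caller's side, here is a self-contained sketch that swaps the third-party SDK for a small stand-in so it can run on its own. The `LangCoClient` class, the `apiKey` and `timeoutMs` options, and the dataset ID are all hypothetical, and the LangChain base class is omitted for brevity. In a real integration the client and its options type would be imported from the vendor's package rather than declared locally.

```typescript
// Stand-in for the hypothetical vendor SDK types.
interface LangCoClientOptions {
  apiKey?: string;
  timeoutMs?: number;
}

class LangCoClient {
  readonly options: LangCoClientOptions;

  constructor(options: LangCoClientOptions) {
    this.options = options;
  }
}

interface LangCoDatasetLoaderConfig {
  verbose?: boolean; // LangChain-specific option
  clientOptions?: LangCoClientOptions; // forwarded untouched to the client
}

class LangCoDatasetLoader {
  readonly langCoClient: LangCoClient;
  readonly datasetId: string;
  readonly verbose: boolean;

  constructor(datasetId: string, config: LangCoDatasetLoaderConfig) {
    this.datasetId = datasetId;
    // The loader never inspects clientOptions, so options added by the
    // vendor later become available without changing this class.
    this.langCoClient = new LangCoClient(config.clientOptions ?? {});
    this.verbose = config.verbose ?? false;
  }
}

const loader = new LangCoDatasetLoader("my-dataset", {
  clientOptions: { apiKey: "placeholder-key", timeoutMs: 5000 },
});

console.log(loader.datasetId, loader.verbose, loader.langCoClient.options.timeoutMs);
```

Because the loader only forwards `clientOptions`, its own config type stays small and does not need to mirror each vendor option.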
We highly appreciate documentation and integration tests showing how to set up and use your integration. Providing this will make it much easier for reviewers to verify that your integration works and will streamline the review process.
As with all contributions, make sure you run pnpm lint and pnpm format so that everything conforms to our established style.
Below are links to guides with advice and tips for specific types of integrations:
- LLM providers (e.g. OpenAI's GPT-3)
- Chat model providers (TODO) (e.g. Anthropic's Claude, OpenAI's GPT-4)
- Memory (used to give an LLM or chat model context of past conversations, e.g. Motörhead)
- Vector stores (e.g. Pinecone)
- Persistent message stores (used to persistently store and load raw chat histories, e.g. Redis)
- Document loaders (used to load documents for later storage into vector stores, e.g. Apify)
- Embeddings (used to create embeddings of text documents or strings, e.g. Cohere)
- Tools (used for agents, e.g. the SERP API tool)
This is a living document, so please make a pull request if we're missing anything useful!