docs/src/oss/python/integrations/providers/databricks.mdx at a68c64c4261bcca8af135e6af96c1420bfc86b28 · langchain-ai/docs

title	Databricks integrations
description	Integrate with Databricks using LangChain Python.

Databricks provides a data and AI platform (often called the Databricks Lakehouse) for analytics, machine learning, and generative AI workloads.

Databricks embraces the LangChain ecosystem in various ways:

🚀 Model Serving - Access state-of-the-art LLMs, such as DBRX, Llama3, Mixtral, or your fine-tuned models on Databricks Model Serving, via a highly available and low-latency inference endpoint. LangChain provides LLM (Databricks), Chat Model (ChatDatabricks), and Embeddings (DatabricksEmbeddings) implementations, streamlining the integration of your models hosted on Databricks Model Serving with your LangChain applications.
📃 Vector Search - [Databricks Vector Search](https://www.databricks.com/product/machine-learning/ vector-search) is a serverless vector database seamlessly integrated within the Databricks Platform. Use the DatabricksVectorSearch class to connect LangChain to vector search indexes in your Databricks account.
📊 MLflow - MLflow is an open-source platform to manage full the ML lifecycle, including experiment management, evaluation, tracing, deployment, and more. MLflow's LangChain Integration streamlines the process of developing and operating modern compound ML systems.
🌐 SQL Database - Databricks SQL is integrated with SQLDatabase in LangChain, allowing you to access the auto-optimizing, exceptionally performant data warehouse.
💡 Open Models - Databricks open sources models, such as [DBRX](https://www.databricks.com/blog/ introducing-dbrx-new-state-art-open-llm), which are available through the [Hugging Face Hub](https:// huggingface.co/databricks/dbrx-instruct). These models can be directly utilized with LangChain, leveraging its integration with the transformers library.

Installation

First-party Databricks integrations are now available in the databricks-langchain partner package.

```bash pip pip install databricks-langchain ```

uv add databricks-langchain

The legacy langchain-databricks partner package is still available but will be soon deprecated.

Chat Model

ChatDatabricks is a Chat Model class to access chat endpoints hosted on Databricks, including state-of-the-art models such as Llama3, Mixtral, and DBRX, as well as your own fine-tuned models.

from databricks_langchain import ChatDatabricks

chat_model = ChatDatabricks(endpoint="databricks-meta-llama-3-70b-instruct")

See the usage example for more guidance on how to use it within your LangChain application.

LLM

Databricks is an LLM class to access completion endpoints hosted on Databricks.

Text completion models have been deprecated and the latest and most popular models are [chat completion models](/oss/langchain/models). Use `ChatDatabricks` chat model instead to use those models and advanced features such as tool calling.

from langchain_community.llm.databricks import Databricks

llm = Databricks(endpoint="your-completion-endpoint")

See the usage example for more guidance on how to use it within your LangChain application.

Embeddings

DatabricksEmbeddings is an Embeddings class to access text-embedding endpoints hosted on Databricks, including state-of-the-art models such as BGE, as well as your own fine-tuned models.

from databricks_langchain import DatabricksEmbeddings

embeddings = DatabricksEmbeddings(endpoint="databricks-bge-large-en")

See the usage example for more guidance on how to use it within your LangChain application.

Vector Search

Databricks Vector Search is a serverless similarity search engine that allows you to store a vector representation of your data, including metadata, in a vector database. With Vector Search, you can create auto-updating vector search indexes from Delta tables managed by Unity Catalog and query them with a simple API to return the most similar vectors.

from databricks_langchain import DatabricksVectorSearch

dvs = DatabricksVectorSearch(
    endpoint="<YOUT_ENDPOINT_NAME>",
    index_name="<YOUR_INDEX_NAME>",
    index,
    text_column="text",
    embedding=embeddings,
    columns=["source"]
)
docs = dvs.similarity_search("What is vector search?)

See the usage example for how to set up vector indices and integrate them with LangChain.

MLflow Integration

In the context of LangChain integration, MLflow provides the following capabilities:

Experiment Tracking: Tracks and stores models, artifacts, and traces from your LangChain experiments.
Dependency Management: Automatically records dependency libraries, ensuring consistency among development, staging, and production environments.
Model Evaluation Offers native capabilities for evaluating LangChain applications.
Tracing: Visually traces data flows through your LangChain application.

See MLflow LangChain Integration to learn about the full capabilities of using MLflow with LangChain through extensive code examples and guides.

SQLDatabase

To connect to Databricks SQL or query structured data, see the Databricks structured retriever tool documentation and to create an agent using the above created SQL UDF see Databricks UC Integration.

Open Models

To directly integrate Databricks's open models hosted on HuggingFace, you can use the HuggingFace Integration of LangChain.

from langchain_huggingface import HuggingFaceEndpoint

llm = HuggingFaceEndpoint(
    repo_id="databricks/dbrx-instruct",
    task="text-generation",
    max_new_tokens=512,
    do_sample=False,
    repetition_penalty=1.03,
)
llm.invoke("What is DBRX model?")

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Installation

Chat Model

LLM

Embeddings

Vector Search

MLflow Integration

SQLDatabase

Open Models

FilesExpand file tree

databricks.mdx

Latest commit

History

databricks.mdx

File metadata and controls

Installation

Chat Model

LLM

Embeddings

Vector Search

MLflow Integration

SQLDatabase

Open Models