Commit ac96278

update clickhouse integration docs
1 parent 5f2f984 commit ac96278

2 files changed

Lines changed: 18 additions & 15 deletions

src/oss/javascript/integrations/vectorstores/clickhouse.mdx

Lines changed: 9 additions & 5 deletions
@@ -9,12 +9,12 @@ description: "Integrate with the ClickHouse vector store using LangChain JavaScr
 Only available on Node.js.
 </Tip>
 
-[ClickHouse](https://clickhouse.com/) is a robust and open-source columnar database that is used for handling analytical queries and efficient storage, ClickHouse is designed to provide a powerful combination of vector search and analytics.
+[ClickHouse](https://clickhouse.com/) is an open-source columnar database for analytics that also supports vector search. For background on ClickHouse vector search (including approximate indexes), see [Exact and Approximate Vector Search](https://clickhouse.com/docs/engines/table-engines/mergetree-family/annindexes).
 
 ## Setup
 
-1. Launch a ClickHouse cluster. Refer to the [ClickHouse Installation Guide](https://clickhouse.com/docs/en/getting-started/install/) for details.
-2. After launching a ClickHouse cluster, retrieve the `Connection Details` from the cluster's `Actions` menu. You will need the host, port, username, and password.
+1. Launch a ClickHouse cluster. Refer to the [ClickHouse installation guide](https://clickhouse.com/docs/getting-started/install/) for details.
+2. After launching a ClickHouse cluster, retrieve the connection details. You will need the host, port, username, and password.
 3. Install the required Node.js peer dependency for ClickHouse in your workspace.
 
 You will need to install the following peer dependencies:
@@ -47,7 +47,9 @@ const vectorStore = await ClickHouseStore.fromTexts(
   new OpenAIEmbeddings(),
   {
     host: process.env.CLICKHOUSE_HOST || "localhost",
-    port: process.env.CLICKHOUSE_PORT || 8443,
+    port: process.env.CLICKHOUSE_PORT
+      ? Number.parseInt(process.env.CLICKHOUSE_PORT, 10)
+      : 8443,
     username: process.env.CLICKHOUSE_USER || "username",
     password: process.env.CLICKHOUSE_PASSWORD || "password",
     database: process.env.CLICKHOUSE_DATABASE || "default",
@@ -81,7 +83,9 @@ const vectorStore = await ClickHouseStore.fromExistingIndex(
   new OpenAIEmbeddings(),
   {
     host: process.env.CLICKHOUSE_HOST || "localhost",
-    port: process.env.CLICKHOUSE_PORT || 8443,
+    port: process.env.CLICKHOUSE_PORT
+      ? Number.parseInt(process.env.CLICKHOUSE_PORT, 10)
+      : 8443,
     username: process.env.CLICKHOUSE_USER || "username",
     password: process.env.CLICKHOUSE_PASSWORD || "password",
     database: process.env.CLICKHOUSE_DATABASE || "default",
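
Both hunks in this file change how `port` is derived from `CLICKHOUSE_PORT`. The likely motivation is that environment variables are always strings in Node.js, so the old `process.env.CLICKHOUSE_PORT || 8443` expression yields a string whenever the variable is set. The sketch below illustrates the difference using a plain object in place of `process.env`; `parsePort` is a hypothetical helper written for this note, not part of the LangChain or ClickHouse APIs, and its range check goes slightly beyond what the commit does.

```typescript
// Stand-in for process.env so the example runs without Node.js typings.
const env: Record<string, string | undefined> = { CLICKHOUSE_PORT: "9000" };

// Old expression: the result is the string "9000" when the variable is set.
const oldPort = env.CLICKHOUSE_PORT || 8443;

// Hypothetical helper (an assumption, not a library function): parse the
// value as a base-10 integer and fall back when it is missing or invalid.
function parsePort(raw: string | undefined, fallback: number): number {
  if (!raw) return fallback;
  const parsed = Number.parseInt(raw, 10);
  const valid = Number.isInteger(parsed) && parsed > 0 && parsed <= 65535;
  return valid ? parsed : fallback;
}

const newPort = parsePort(env.CLICKHOUSE_PORT, 8443);

console.log(typeof oldPort, typeof newPort);
```

The ternary used in the commit does the same parse inline; unlike this helper, it would pass `NaN` through if the variable held non-numeric text.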

src/oss/python/integrations/vectorstores/clickhouse.mdx

Lines changed: 9 additions & 10 deletions
@@ -3,16 +3,16 @@ title: "ClickHouse integration"
 description: "Integrate with the ClickHouse vector store using LangChain Python."
 ---
 
-> [ClickHouse](https://clickhouse.com/) is the fastest and most resource efficient open-source database for real-time apps and analytics with full SQL support and a wide range of functions to assist users in writing analytical queries. Lately added data structures and distance search functions (like `L2Distance`) as well as [approximate nearest neighbor search indexes](https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/annindexes) enable ClickHouse to be used as a high performance and scalable vector database to store and search vectors with SQL.
+> [ClickHouse](https://clickhouse.com/) is an open-source database for real-time apps and analytics with full SQL support. ClickHouse supports exact vector search (for example, using distance functions like `L2Distance`) and approximate vector search using vector similarity indexes (available in ClickHouse 25.8+). For details, see [Exact and Approximate Vector Search](https://clickhouse.com/docs/engines/table-engines/mergetree-family/annindexes).
 
-This notebook shows how to use functionality related to the `ClickHouse` vector store.
+This page shows how to use functionality related to the `ClickHouse` vector store.
 
 ## Setup
 
 First set up a local clickhouse server with docker:
 
 ```python
-! docker run -d -p 8123:8123 -p 9000:9000 --name langchain-clickhouse-server --ulimit nofile=262144:262144 -e CLICKHOUSE_SKIP_USER_SETUP=1 clickhouse/clickhouse-server:25.7
+! docker run -d -p 8123:8123 -p 9000:9000 --name langchain-clickhouse-server --ulimit nofile=262144:262144 -e CLICKHOUSE_SKIP_USER_SETUP=1 clickhouse/clickhouse-server:26.2
 ```
 
 You'll need to install `langchain-community` and `clickhouse-connect` to use this integration
@@ -153,9 +153,8 @@ Performing a simple similarity search can be done as follows:
 results = vector_store.similarity_search(
     "LangChain provides abstractions to make working with LLMs easy", k=2
 )
-for res in results:
-    page_content, metadata = res
-    print(f"* {page_content} [{metadata}]")
+for doc in results:
+    print(f"* {doc.page_content} [{doc.metadata}]")
 ```
 
 #### Similarity search with score
@@ -174,7 +173,7 @@ You can have direct access to ClickHouse SQL where statement. You can write `WHE
 
 **NOTE**: Please be aware of SQL injection, this interface must not be directly called by end-user.
 
-If you custimized your `column_map` under your setting, you search with filter like this:
+If you customized your `column_map` in your settings, you can search with a filter like this:
 
 ```python
 meta = vector_store.metadata_column
@@ -195,14 +194,14 @@ There are a variety of other search methods that are not covered in this noteboo
 
 You can also transform the vector store into a retriever for easier usage in your chains.
 
-Here is how to transform your vector store into a retriever and then invoke the retreiever with a simple query and filter.
+Here is how to transform your vector store into a retriever and then invoke the retriever with a simple query and filter.
 
 ```python
 retriever = vector_store.as_retriever(
     search_type="similarity_score_threshold",
-    search_kwargs={"k": 1, "score_threshold": 0.5},
+    search_kwargs={"k": 1, "score_threshold": 0.5, "where_str": "metadata.source = 'news'"},
 )
-retriever.invoke("Stealing from the bank is a crime", filter={"source": "news"})
+retriever.invoke("Stealing from the bank is a crime")
 ```
 
 ## Usage for retrieval-augmented generation
