2 changes: 1 addition & 1 deletion docs/docs/examples/examples/academic_papers_index.md
@@ -64,7 +64,7 @@ def paper_metadata_flow(
```

`flow_builder.add_source` will create a table with sub fields (`filename`, `content`).
<DocumentationButton url="https://cocoindex.io/docs/ops/sources" text="Sources" margin="0 0 16px 0" />
<DocumentationButton url="https://cocoindex.io/docs/sources" text="Sources" margin="0 0 16px 0" />

## Extract and collect metadata

2 changes: 1 addition & 1 deletion docs/docs/examples/examples/codebase_index.md
@@ -70,7 +70,7 @@ def code_embedding_flow(flow_builder: cocoindex.FlowBuilder, data_scope: cocoind
- Exclude files and directories starting `.`, `target` in the root and `node_modules` under any directory.

`flow_builder.add_source` will create a table with sub fields (`filename`, `content`).
<DocumentationButton url="https://cocoindex.io/docs/ops/sources" text="Sources" />
<DocumentationButton url="https://cocoindex.io/docs/sources" text="Sources" />


## Process each file and collect the information
2 changes: 1 addition & 1 deletion docs/docs/examples/examples/custom_targets.md
@@ -36,7 +36,7 @@ flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.DataScope
)
```
This ingestion creates a table with `filename` and `content` fields.
<DocumentationButton url="https://cocoindex.io/docs/ops/sources" text="Sources" />
<DocumentationButton url="https://cocoindex.io/docs/sources" text="Sources" />

## Process each file and collect

2 changes: 1 addition & 1 deletion docs/docs/examples/examples/docs_to_knowledge_graph.md
@@ -66,7 +66,7 @@ def docs_to_kg_flow(flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.D
Here `flow_builder.add_source` creates a [KTable](https://cocoindex.io/docs/core/data_types#KTable).
`filename` is the key of the KTable.

<DocumentationButton url="https://cocoindex.io/docs/ops/sources" text="Sources" margin="0 0 16px 0" />
<DocumentationButton url="https://cocoindex.io/docs/sources" text="Sources" margin="0 0 16px 0" />


### Add data collectors
4 changes: 2 additions & 2 deletions docs/docs/examples/examples/document_ai.md
@@ -98,7 +98,7 @@ data_scope["documents"] = flow_builder.add_source(
doc_embeddings = data_scope.add_collector()
```

<DocumentationButton url="https://cocoindex.io/docs/ops/sources" text="Source" margin="0 0 16px 0" />
<DocumentationButton url="https://cocoindex.io/docs/sources" text="Source" margin="0 0 16px 0" />

<DocumentationButton url="https://cocoindex.io/docs/ops/collectors" text="Collector" margin="0 0 16px 0" />

@@ -154,4 +154,4 @@ For a step-by-step walkthrough of each indexing stage and the query path, check

CocoIndex natively supports Google Drive, Amazon S3, Azure Blob Storage, and more with native incremental processing out of box - when new or updated files are detected, the pipeline will capture the changes and only process what's changed.

<DocumentationButton url="https://cocoindex.io/docs/ops/sources" text="Sources" margin="0 0 16px 0" />
<DocumentationButton url="https://cocoindex.io/docs/sources" text="Sources" margin="0 0 16px 0" />
4 changes: 2 additions & 2 deletions docs/docs/examples/examples/image_search.md
@@ -66,7 +66,7 @@ def image_object_embedding_flow(flow_builder, data_scope):

The `add_source` function sets up a table with fields like `filename` and `content`. Images are automatically re-scanned every minute.

<DocumentationButton url="https://cocoindex.io/docs/ops/sources/localfile" text="LocalFile" />
<DocumentationButton url="https://cocoindex.io/docs/sources/localfile" text="LocalFile" />


## Process Each Image and Collect the Embedding
@@ -266,6 +266,6 @@ One of CocoIndex’s core strengths is its ability to connect to your existing d
- Amazon S3 / SQS
- Azure Blob Storage

<DocumentationButton url="https://cocoindex.io/docs/ops/sources" text="Sources" margin="0 0 16px 0" />
<DocumentationButton url="https://cocoindex.io/docs/sources" text="Sources" margin="0 0 16px 0" />

Once connected, CocoIndex continuously watches for changes — new uploads, updates, or deletions — and applies them to your index in real time.
2 changes: 1 addition & 1 deletion docs/docs/examples/examples/manual_extraction.md
@@ -67,7 +67,7 @@ def manual_extraction_flow(
- `filename` (key, type: `str`): the filename of the file, e.g. `dir1/file1.md`
- `content` (type: `str` if `binary` is `False`, otherwise `bytes`): the content of the file
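
The field list above can be illustrated with a minimal sketch in plain Python. It is not the `LocalFile` implementation; `load_local_files` is a hypothetical helper that only mirrors the described shape: rows keyed by relative `filename`, with `content` as `str` or `bytes` depending on `binary`.

```python
from pathlib import Path
from typing import Dict, Union


def load_local_files(root: str, binary: bool = False) -> Dict[str, Union[str, bytes]]:
    """Hypothetical helper: map each relative filename to its content.

    The key plays the role of `filename` (e.g. `dir1/file1.md`); the value
    plays the role of `content` (`bytes` when `binary` is True, else `str`).
    """
    root_path = Path(root)
    rows: Dict[str, Union[str, bytes]] = {}
    for path in sorted(root_path.rglob("*")):
        if path.is_file():
            # Use forward slashes so keys look the same on every platform.
            key = path.relative_to(root_path).as_posix()
            rows[key] = path.read_bytes() if binary else path.read_text(encoding="utf-8")
    return rows
```

Calling `load_local_files("markdown_files")` on a directory of Markdown files would return one entry per file, keyed the way the `filename` field is described above; the directory name here is only an example.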

<DocumentationButton url="https://cocoindex.io/docs/ops/sources" text="LocalFile" margin="0 0 16px 0" />
<DocumentationButton url="https://cocoindex.io/docs/sources" text="LocalFile" margin="0 0 16px 0" />

## Parse Markdown

4 changes: 2 additions & 2 deletions docs/docs/examples/examples/multi_format_index.md
@@ -52,7 +52,7 @@ data_scope["documents"] = flow_builder.add_source(
cocoindex.sources.LocalFile(path="source_files", binary=True)
)
```
<DocumentationButton url="https://cocoindex.io/docs/ops/sources/localfile" text="LocalFile" margin="0 0 16px 0" />
<DocumentationButton url="https://cocoindex.io/docs/sources/localfile" text="LocalFile" margin="0 0 16px 0" />


## Convert Files to Pages
@@ -203,4 +203,4 @@ Follow the url `https://cocoindex.io/cocoinsight`. It connects to your local Co
## Connect to other sources
CocoIndex natively supports Google Drive, Amazon S3, Azure Blob Storage, and more.

<DocumentationButton url="https://cocoindex.io/docs/ops/sources" text="Sources" margin="0 0 16px 0" />
<DocumentationButton url="https://cocoindex.io/docs/sources" text="Sources" margin="0 0 16px 0" />
4 changes: 2 additions & 2 deletions docs/docs/examples/examples/patient_form_extraction.md
@@ -66,7 +66,7 @@ def patient_intake_extraction_flow(

`flow_builder.add_source` will create a table with a few sub fields.

<DocumentationButton url="https://cocoindex.io/docs/ops/sources" text="Sources" margin="0 0 16px 0" />
<DocumentationButton url="https://cocoindex.io/docs/sources" text="Sources" margin="0 0 16px 0" />


## Parse documents with different formats to Markdown
@@ -298,4 +298,4 @@ Click on the `markdown` column for `Patient_Intake_Form_Joe.pdf`, you could see
## Connect to other sources
CocoIndex natively supports Google Drive, Amazon S3, Azure Blob Storage, and more.

<DocumentationButton url="https://cocoindex.io/docs/ops/sources" text="Sources" margin="0 0 16px 0" />
<DocumentationButton url="https://cocoindex.io/docs/sources" text="Sources" margin="0 0 16px 0" />
4 changes: 2 additions & 2 deletions docs/docs/examples/examples/photo_search.md
@@ -65,8 +65,8 @@ def face_recognition_flow(flow_builder, data_scope):
This creates a table with `filename` and `content` fields. 📂


You can connect it to your [S3 Buckets](https://cocoindex.io/docs/ops/sources/amazons3) (with SQS integration, [example](https://cocoindex.io/blogs/s3-incremental-etl))
or [Azure Blob store](https://cocoindex.io/docs/ops/sources/azureblob).
You can connect it to your [S3 Buckets](https://cocoindex.io/docs/sources/amazons3) (with SQS integration, [example](https://cocoindex.io/blogs/s3-incremental-etl))
or [Azure Blob store](https://cocoindex.io/docs/sources/azureblob).

## Detect and Extract Faces

2 changes: 1 addition & 1 deletion docs/docs/examples/examples/postgres_source.md
@@ -59,7 +59,7 @@ CocoIndex incrementally sync data from Postgres. When new or updated rows are fo
- `notification` enables change capture based on Postgres LISTEN/NOTIFY. Each change triggers an incremental processing on the specific row immediately.
- Regardless if `notification` is provided or not, CocoIndex still needs to scan the full table to detect changes in some scenarios (e.g. between two `update` invocation), and the `ordinal_column` provides a field that CocoIndex can use to quickly detect which row has changed without reading value columns.

Check [Postgres source](https://cocoindex.io/docs/ops/sources/postgres) for more details.
Check [Postgres source](https://cocoindex.io/docs/sources/postgres) for more details.

If you use the Postgres database hosted by Supabase, please click Connect on your project dashboard and find the URL there. Check [DatabaseConnectionSpec](https://cocoindex.io/docs/core/settings#databaseconnectionspec)
for more details.
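
The role of `ordinal_column` described above can be sketched in plain Python. This is an illustration of the idea only, not CocoIndex's internal algorithm; `detect_changes` and the sample snapshots are hypothetical, and the ordinal is assumed to be a value that changes whenever a row changes.

```python
from typing import Dict, Hashable, Set, Tuple


def detect_changes(
    previous: Dict[Hashable, int],
    current: Dict[Hashable, int],
) -> Tuple[Set[Hashable], Set[Hashable]]:
    """Compare two {primary_key: ordinal} snapshots of a table.

    Returns (changed_or_new_keys, deleted_keys). Only keys and ordinals are
    compared, so the value columns never have to be read.
    """
    # A key is changed if it is new or its ordinal differs from the last scan.
    changed = {key for key, ordinal in current.items() if previous.get(key) != ordinal}
    # A key is deleted if it was seen before but is missing from this scan.
    deleted = set(previous) - set(current)
    return changed, deleted


previous_scan = {1: 100, 2: 100, 3: 105}
current_scan = {1: 100, 2: 130, 4: 131}  # row 2 updated, row 3 deleted, row 4 added

changed, deleted = detect_changes(previous_scan, current_scan)
print(sorted(changed), sorted(deleted))
```

Only the rows reported as changed would then need their value columns read and reprocessed, which is what makes a full-table scan cheap in this scheme.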
2 changes: 1 addition & 1 deletion docs/docs/examples/examples/product_recommendation.md
@@ -30,7 +30,7 @@ Product taxonomy is a way to organize product catalogs in a logical and hierarch

## Prerequisites
* [Install PostgreSQL](https://cocoindex.io/docs/getting_started/installation#-install-postgres). CocoIndex uses PostgreSQL internally for incremental processing.
* [Install Neo4j](https://cocoindex.io/docs/ops/storages#Neo4j), a graph database.
* [Install Neo4j](https://cocoindex.io/docs/targets/neo4j), a graph database.
* [Configure your OpenAI API key](https://cocoindex.io/docs/ai/llm#openai). Create a `.env` file from `.env.example`, and fill in `OPENAI_API_KEY`.

Alternatively, we have native support for Gemini, Ollama, LiteLLM. You can choose your favorite LLM provider and work completely on-premises.
2 changes: 1 addition & 1 deletion docs/docs/examples/examples/simple_vector_index.md
@@ -51,7 +51,7 @@ def text_embedding_flow(flow_builder: cocoindex.FlowBuilder, data_scope: cocoind
```

`flow_builder.add_source` will create a table with sub fields (`filename`, `content`)
<DocumentationButton url="https://cocoindex.io/docs/ops/sources" text="Source" />
<DocumentationButton url="https://cocoindex.io/docs/sources" text="Source" />


## Process each file and collect the embeddings
2 changes: 1 addition & 1 deletion docs/docs/getting_started/quickstart.md
@@ -64,7 +64,7 @@ doc_embeddings = data_scope.add_collector()

`flow_builder.add_source` will create a table with sub fields (`filename`, `content`)

<DocumentationButton url="https://cocoindex.io/docs/ops/sources" text="Source" />
<DocumentationButton url="https://cocoindex.io/docs/sources" text="Source" />

<DocumentationButton url="https://cocoindex.io/docs/core/flow_def#data-collector" text="Data Collector" />

1 change: 0 additions & 1 deletion docs/docs/targets/index.md
@@ -18,7 +18,6 @@ The way to map data from a data collector to a target depends on data model of t
| [Qdrant](/docs/targets/qdrant) | Vector Database, Keyword Search |
| [LanceDB](/docs/targets/lancedb) | Vector Database, Keyword Search |
| [Neo4j](/docs/targets/neo4j) | [Property graph](#property-graph-targets) |
| [Kuzu](/docs/targets/kuzu) | [Property graph](#property-graph-targets) |

If you are looking for targets beyond here, you can always use [custom targets](/docs/custom_ops/custom_targets) as building blocks.

4 changes: 3 additions & 1 deletion docs/docs/targets/kuzu.md
@@ -5,7 +5,9 @@ toc_max_heading_level: 4
---
import { ExampleButton } from '../../src/components/GitHubButton';

# Kuzu
# Kuzu (Archived)

Note: [Kuzu](https://github.com/kuzudb/kuzu), an embedded graph database, is no longer maintained.

Exports data to a [Kuzu](https://kuzu.com/) graph database.

7 changes: 2 additions & 5 deletions examples/docs_to_knowledge_graph/README.md
@@ -14,12 +14,11 @@ Please drop [Cocoindex on Github](https://github.com/cocoindex-io/cocoindex) a s

## Prerequisite
* [Install Postgres](https://cocoindex.io/docs/getting_started/installation#-install-postgres) if you don't have one.
* Install [Neo4j](https://cocoindex.io/docs/ops/targets#neo4j-dev-instance) or [Kuzu](https://cocoindex.io/docs/ops/targets#kuzu-dev-instance) if you don't have one.
* The example uses Neo4j by default for now. If you want to use Kuzu, find out the "SELECT ONE GRAPH DATABASE TO USE" section and switch the active branch.
* Install [Neo4j](https://cocoindex.io/docs/targets/neo4j).
* Install / configure LLM API. In this example we use Ollama, which runs LLM model locally. You need to get it ready following [this guide](https://cocoindex.io/docs/ai/llm#ollama). Alternatively, you can also follow the comments in source code to switch to OpenAI, and [configure OpenAI API key](https://cocoindex.io/docs/ai/llm#openai) before running the example.

## Documentation
You can read the official CocoIndex Documentation for Property Graph Targets [here](https://cocoindex.io/docs/ops/targets#property-graph-targets).
You can read the official CocoIndex Documentation for Property Graph Targets [here](https://cocoindex.io/docs/targets#property-graph-targets).

## Run

@@ -48,8 +47,6 @@ cocoindex update main
After the knowledge graph is built, you can explore the knowledge graph.

* If you're using Neo4j, you can open the explorer at [http://localhost:7474](http://localhost:7474), with username `neo4j` and password `cocoindex`.
* If you're using Kuzu, you can start a Kuzu explorer locally. See [Kuzu dev instance](https://cocoindex.io/docs/ops/targets#kuzu-dev-instance) for more details.

You can run the following Cypher query to get all relationships:

```cypher
17 changes: 0 additions & 17 deletions examples/docs_to_knowledge_graph/main.py
@@ -14,29 +14,12 @@
password="cocoindex",
),
)
kuzu_conn_spec = cocoindex.add_auth_entry(
"KuzuConnection",
cocoindex.targets.KuzuConnection(
api_server_url="http://localhost:8123",
),
)

# SELECT ONE GRAPH DATABASE TO USE
# This example can use either Neo4j or Kuzu as the graph database.
# Please make sure only one branch is live and others are commented out.

# Use Neo4j
GraphDbSpec = cocoindex.targets.Neo4j
GraphDbConnection = cocoindex.targets.Neo4jConnection
GraphDbDeclaration = cocoindex.targets.Neo4jDeclaration
conn_spec = neo4j_conn_spec

# Use Kuzu
# GraphDbSpec = cocoindex.targets.Kuzu
# GraphDbConnection = cocoindex.targets.KuzuConnection
# GraphDbDeclaration = cocoindex.targets.KuzuDeclaration
# conn_spec = kuzu_conn_spec


@dataclasses.dataclass
class DocumentSummary:
2 changes: 1 addition & 1 deletion examples/patient_intake_extraction/README.md
@@ -4,7 +4,7 @@ We appreciate a star ⭐ at [CocoIndex Github](https://github.com/cocoindex-io/c


This repo shows how to use LLM to extract structured data from patient intake forms with different formats - like PDF, Docx, etc.
CocoIndex supports multiple [sources](https://cocoindex.io/docs/ops/sources) and [LLM models](https://cocoindex.io/docs/ai/llm) natively.
CocoIndex supports multiple [sources](https://cocoindex.io/docs/sources) and [LLM models](https://cocoindex.io/docs/ai/llm) natively.

![Structured Data From Patient Intake Forms](https://github.com/user-attachments/assets/1f6afb69-d26d-4a08-8774-13982d6aec1e)

8 changes: 3 additions & 5 deletions examples/product_recommendation/README.md
@@ -8,13 +8,12 @@ Please drop [CocoIndex on Github](https://github.com/cocoindex-io/cocoindex) a s


## Prerequisite
* [Install Postgres](https://cocoindex.io/docs/getting_started/installation#-install-postgres) if you don't have one.
* Install [Neo4j](https://cocoindex.io/docs/ops/targets#neo4j-dev-instance) or [Kuzu](https://cocoindex.io/docs/ops/targets#kuzu-dev-instance) if you don't have one.
* The example uses Neo4j by default for now. If you want to use Kuzu, find out the "SELECT ONE GRAPH DATABASE TO USE" section and switch the active branch.
* [Install Postgres](https://cocoindex.io/docs/getting_started/installation#-install-postgres)
* Install [Neo4j](https://cocoindex.io/docs/targets/neo4j)
* [Configure your OpenAI API key](https://cocoindex.io/docs/ai/llm#openai).

## Documentation
You can read the official CocoIndex Documentation for Property Graph Targets [here](https://cocoindex.io/docs/ops/targets#property-graph-targets).
You can read the official CocoIndex Documentation for Property Graph Targets [here](https://cocoindex.io/docs/targets#property-graph-targets).

## Run

@@ -43,7 +42,6 @@ cocoindex update main
After the knowledge graph is built, you can explore the knowledge graph.

* If you're using Neo4j, you can open the explorer at [http://localhost:7474](http://localhost:7474), with username `neo4j` and password `cocoindex`.
* If you're using Kuzu, you can start a Kuzu explorer locally. See [Kuzu dev instance](https://cocoindex.io/docs/ops/targets#kuzu-dev-instance) for more details.

You can run the following Cypher query to get all relationships:

17 changes: 0 additions & 17 deletions examples/product_recommendation/main.py
@@ -15,29 +15,12 @@
password="cocoindex",
),
)
kuzu_conn_spec = cocoindex.add_auth_entry(
"KuzuConnection",
cocoindex.targets.KuzuConnection(
api_server_url="http://localhost:8123",
),
)

# SELECT ONE GRAPH DATABASE TO USE
# This example can use either Neo4j or Kuzu as the graph database.
# Please make sure only one branch is live and others are commented out.

# Use Neo4j
GraphDbSpec = cocoindex.targets.Neo4j
GraphDbConnection = cocoindex.targets.Neo4jConnection
GraphDbDeclaration = cocoindex.targets.Neo4jDeclaration
conn_spec = neo4j_conn_spec

# Use Kuzu
# GraphDbSpec = cocoindex.targets.Kuzu
# GraphDbConnection = cocoindex.targets.KuzuConnection
# GraphDbDeclaration = cocoindex.targets.KuzuDeclaration
# conn_spec = kuzu_conn_spec


# Template for rendering product information as markdown to provide information to LLMs
PRODUCT_TEMPLATE = """
2 changes: 1 addition & 1 deletion examples/text_embedding_qdrant/README.md
@@ -2,7 +2,7 @@

[![GitHub](https://img.shields.io/github/stars/cocoindex-io/cocoindex?color=5B5BD6)](https://github.com/cocoindex-io/cocoindex)

CocoIndex supports Qdrant natively - [documentation](https://cocoindex.io/docs/ops/targets#qdrant). In this example, we will build index flow from text embedding from local markdown files, and query the index. We will use **Qdrant** as the vector database.
CocoIndex supports Qdrant natively - [documentation](https://cocoindex.io/docs/targets/qdrant). In this example, we will build index flow from text embedding from local markdown files, and query the index. We will use **Qdrant** as the vector database.

We appreciate a star ⭐ at [CocoIndex Github](https://github.com/cocoindex-io/cocoindex) if this is helpful.
