diff --git a/docs/docs/examples/examples/academic_papers_index.md b/docs/docs/examples/examples/academic_papers_index.md index 0db8f156..278a1e4e 100644 --- a/docs/docs/examples/examples/academic_papers_index.md +++ b/docs/docs/examples/examples/academic_papers_index.md @@ -64,7 +64,7 @@ def paper_metadata_flow( ``` `flow_builder.add_source` will create a table with sub fields (`filename`, `content`). - + ## Extract and collect metadata diff --git a/docs/docs/examples/examples/codebase_index.md b/docs/docs/examples/examples/codebase_index.md index 0e1caa67..9863b1db 100644 --- a/docs/docs/examples/examples/codebase_index.md +++ b/docs/docs/examples/examples/codebase_index.md @@ -70,7 +70,7 @@ def code_embedding_flow(flow_builder: cocoindex.FlowBuilder, data_scope: cocoind - Exclude files and directories starting `.`, `target` in the root and `node_modules` under any directory. `flow_builder.add_source` will create a table with sub fields (`filename`, `content`). - + ## Process each file and collect the information diff --git a/docs/docs/examples/examples/custom_targets.md b/docs/docs/examples/examples/custom_targets.md index 3094f1a7..fa53b87d 100644 --- a/docs/docs/examples/examples/custom_targets.md +++ b/docs/docs/examples/examples/custom_targets.md @@ -36,7 +36,7 @@ flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.DataScope ) ``` This ingestion creates a table with `filename` and `content` fields. - + ## Process each file and collect diff --git a/docs/docs/examples/examples/docs_to_knowledge_graph.md b/docs/docs/examples/examples/docs_to_knowledge_graph.md index 9d90c196..d301ca46 100644 --- a/docs/docs/examples/examples/docs_to_knowledge_graph.md +++ b/docs/docs/examples/examples/docs_to_knowledge_graph.md @@ -66,7 +66,7 @@ def docs_to_kg_flow(flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.D Here `flow_builder.add_source` creates a [KTable](https://cocoindex.io/docs/core/data_types#KTable). `filename` is the key of the KTable. - + ### Add data collectors diff --git a/docs/docs/examples/examples/document_ai.md b/docs/docs/examples/examples/document_ai.md index 6ef86de8..d35fa06d 100644 --- a/docs/docs/examples/examples/document_ai.md +++ b/docs/docs/examples/examples/document_ai.md @@ -98,7 +98,7 @@ data_scope["documents"] = flow_builder.add_source( doc_embeddings = data_scope.add_collector() ``` - + @@ -154,4 +154,4 @@ For a step-by-step walkthrough of each indexing stage and the query path, check CocoIndex natively supports Google Drive, Amazon S3, Azure Blob Storage, and more with native incremental processing out of box - when new or updated files are detected, the pipeline will capture the changes and only process what's changed. - + diff --git a/docs/docs/examples/examples/image_search.md b/docs/docs/examples/examples/image_search.md index 3e468784..db5cb153 100644 --- a/docs/docs/examples/examples/image_search.md +++ b/docs/docs/examples/examples/image_search.md @@ -66,7 +66,7 @@ def image_object_embedding_flow(flow_builder, data_scope): The `add_source` function sets up a table with fields like `filename` and `content`. Images are automatically re-scanned every minute. - + ## Process Each Image and Collect the Embedding @@ -266,6 +266,6 @@ One of CocoIndex’s core strengths is its ability to connect to your existing d - Amazon S3 / SQS - Azure Blob Storage - + Once connected, CocoIndex continuously watches for changes — new uploads, updates, or deletions — and applies them to your index in real time. diff --git a/docs/docs/examples/examples/manual_extraction.md b/docs/docs/examples/examples/manual_extraction.md index 21c0367d..0d76bdc0 100644 --- a/docs/docs/examples/examples/manual_extraction.md +++ b/docs/docs/examples/examples/manual_extraction.md @@ -67,7 +67,7 @@ def manual_extraction_flow( - `filename` (key, type: `str`): the filename of the file, e.g. `dir1/file1.md` - `content` (type: `str` if `binary` is `False`, otherwise `bytes`): the content of the file - + ## Parse Markdown diff --git a/docs/docs/examples/examples/multi_format_index.md b/docs/docs/examples/examples/multi_format_index.md index 2b0e9e31..8880602f 100644 --- a/docs/docs/examples/examples/multi_format_index.md +++ b/docs/docs/examples/examples/multi_format_index.md @@ -52,7 +52,7 @@ data_scope["documents"] = flow_builder.add_source( cocoindex.sources.LocalFile(path="source_files", binary=True) ) ``` - + ## Convert Files to Pages @@ -203,4 +203,4 @@ Follow the url `https://cocoindex.io/cocoinsight`. It connects to your local Co ## Connect to other sources CocoIndex natively supports Google Drive, Amazon S3, Azure Blob Storage, and more. - + diff --git a/docs/docs/examples/examples/patient_form_extraction.md b/docs/docs/examples/examples/patient_form_extraction.md index 5068d54e..ab92d776 100644 --- a/docs/docs/examples/examples/patient_form_extraction.md +++ b/docs/docs/examples/examples/patient_form_extraction.md @@ -66,7 +66,7 @@ def patient_intake_extraction_flow( `flow_builder.add_source` will create a table with a few sub fields. - + ## Parse documents with different formats to Markdown @@ -298,4 +298,4 @@ Click on the `markdown` column for `Patient_Intake_Form_Joe.pdf`, you could see ## Connect to other sources CocoIndex natively supports Google Drive, Amazon S3, Azure Blob Storage, and more. - + diff --git a/docs/docs/examples/examples/photo_search.md b/docs/docs/examples/examples/photo_search.md index d17c998c..07a8ca3a 100644 --- a/docs/docs/examples/examples/photo_search.md +++ b/docs/docs/examples/examples/photo_search.md @@ -65,8 +65,8 @@ def face_recognition_flow(flow_builder, data_scope): This creates a table with `filename` and `content` fields. 📂 -You can connect it to your [S3 Buckets](https://cocoindex.io/docs/ops/sources/amazons3) (with SQS integration, [example](https://cocoindex.io/blogs/s3-incremental-etl)) -or [Azure Blob store](https://cocoindex.io/docs/ops/sources/azureblob). +You can connect it to your [S3 Buckets](https://cocoindex.io/docs/sources/amazons3) (with SQS integration, [example](https://cocoindex.io/blogs/s3-incremental-etl)) +or [Azure Blob store](https://cocoindex.io/docs/sources/azureblob). ## Detect and Extract Faces diff --git a/docs/docs/examples/examples/postgres_source.md b/docs/docs/examples/examples/postgres_source.md index 5f3d4914..d0b51292 100644 --- a/docs/docs/examples/examples/postgres_source.md +++ b/docs/docs/examples/examples/postgres_source.md @@ -59,7 +59,7 @@ CocoIndex incrementally sync data from Postgres. When new or updated rows are fo - `notification` enables change capture based on Postgres LISTEN/NOTIFY. Each change triggers an incremental processing on the specific row immediately. - Regardless if `notification` is provided or not, CocoIndex still needs to scan the full table to detect changes in some scenarios (e.g. between two `update` invocation), and the `ordinal_column` provides a field that CocoIndex can use to quickly detect which row has changed without reading value columns. -Check [Postgres source](https://cocoindex.io/docs/ops/sources/postgres) for more details. +Check [Postgres source](https://cocoindex.io/docs/sources/postgres) for more details. If you use the Postgres database hosted by Supabase, please click Connect on your project dashboard and find the URL there. Check [DatabaseConnectionSpec](https://cocoindex.io/docs/core/settings#databaseconnectionspec) for more details. diff --git a/docs/docs/examples/examples/product_recommendation.md b/docs/docs/examples/examples/product_recommendation.md index 110201b2..e912b594 100644 --- a/docs/docs/examples/examples/product_recommendation.md +++ b/docs/docs/examples/examples/product_recommendation.md @@ -30,7 +30,7 @@ Product taxonomy is a way to organize product catalogs in a logical and hierarch ## Prerequisites * [Install PostgreSQL](https://cocoindex.io/docs/getting_started/installation#-install-postgres). CocoIndex uses PostgreSQL internally for incremental processing. -* [Install Neo4j](https://cocoindex.io/docs/ops/storages#Neo4j), a graph database. +* [Install Neo4j](https://cocoindex.io/docs/targets/neo4j), a graph database. * - [Configure your OpenAI API key](https://cocoindex.io/docs/ai/llm#openai). Create a `.env` file from `.env.example`, and fill `OPENAI_API_KEY`. Alternatively, we have native support for Gemini, Ollama, LiteLLM. You can choose your favorite LLM provider and work completely on-premises. diff --git a/docs/docs/examples/examples/simple_vector_index.md b/docs/docs/examples/examples/simple_vector_index.md index 53a54388..017fcda5 100644 --- a/docs/docs/examples/examples/simple_vector_index.md +++ b/docs/docs/examples/examples/simple_vector_index.md @@ -51,7 +51,7 @@ def text_embedding_flow(flow_builder: cocoindex.FlowBuilder, data_scope: cocoind ``` `flow_builder.add_source` will create a table with sub fields (`filename`, `content`) - + ## Process each file and collect the embeddings diff --git a/docs/docs/getting_started/quickstart.md b/docs/docs/getting_started/quickstart.md index eb565699..72bfbd7b 100644 --- a/docs/docs/getting_started/quickstart.md +++ b/docs/docs/getting_started/quickstart.md @@ -64,7 +64,7 @@ doc_embeddings = data_scope.add_collector() `flow_builder.add_source` will create a table with sub fields (`filename`, `content`) - + diff --git a/docs/docs/targets/index.md b/docs/docs/targets/index.md index 36d117b7..c90d7654 100644 --- a/docs/docs/targets/index.md +++ b/docs/docs/targets/index.md @@ -18,7 +18,6 @@ The way to map data from a data collector to a target depends on data model of t | [Qdrant](/docs/targets/qdrant) | Vector Database, Keyword Search | | [LanceDB](/docs/targets/lancedb) | Vector Database, Keyword Search | | [Neo4j](/docs/targets/neo4j) | [Property graph](#property-graph-targets) | -| [Kuzu](/docs/targets/kuzu) | [Property graph](#property-graph-targets) | If you are looking for targets beyond here, you can always use [custom targets](/docs/custom_ops/custom_targets) as building blocks. diff --git a/docs/docs/targets/kuzu.md b/docs/docs/targets/kuzu.md index ae129ef3..441e9e78 100644 --- a/docs/docs/targets/kuzu.md +++ b/docs/docs/targets/kuzu.md @@ -5,7 +5,9 @@ toc_max_heading_level: 4 --- import { ExampleButton } from '../../src/components/GitHubButton'; -# Kuzu +# Kuzu (Archived) + +Note:[Kuzu](https://github.com/kuzudb/kuzu) - embedded graph database is no longer maintained. Exports data to a [Kuzu](https://kuzu.com/) graph database. diff --git a/examples/docs_to_knowledge_graph/README.md b/examples/docs_to_knowledge_graph/README.md index 714c2046..41b38ac1 100644 --- a/examples/docs_to_knowledge_graph/README.md +++ b/examples/docs_to_knowledge_graph/README.md @@ -14,12 +14,11 @@ Please drop [Cocoindex on Github](https://github.com/cocoindex-io/cocoindex) a s ## Prerequisite * [Install Postgres](https://cocoindex.io/docs/getting_started/installation#-install-postgres) if you don't have one. -* Install [Neo4j](https://cocoindex.io/docs/ops/targets#neo4j-dev-instance) or [Kuzu](https://cocoindex.io/docs/ops/targets#kuzu-dev-instance) if you don't have one. - * The example uses Neo4j by default for now. If you want to use Kuzu, find out the "SELECT ONE GRAPH DATABASE TO USE" section and switch the active branch. +* Install [Neo4j](https://cocoindex.io/docs/targets/neo4j). * Install / configure LLM API. In this example we use Ollama, which runs LLM model locally. You need to get it ready following [this guide](https://cocoindex.io/docs/ai/llm#ollama). Alternatively, you can also follow the comments in source code to switch to OpenAI, and [configure OpenAI API key](https://cocoindex.io/docs/ai/llm#openai) before running the example. ## Documentation -You can read the official CocoIndex Documentation for Property Graph Targets [here](https://cocoindex.io/docs/ops/targets#property-graph-targets). +You can read the official CocoIndex Documentation for Property Graph Targets [here](https://cocoindex.io/docs/targets#property-graph-targets). ## Run @@ -48,8 +47,6 @@ cocoindex update main After the knowledge graph is built, you can explore the knowledge graph. * If you're using Neo4j, you can open the explorer at [http://localhost:7474](http://localhost:7474), with username `neo4j` and password `cocoindex`. -* If you're using Kuzu, you can start a Kuzu explorer locally. See [Kuzu dev instance](https://cocoindex.io/docs/ops/targets#kuzu-dev-instance) for more details. - You can run the following Cypher query to get all relationships: ```cypher diff --git a/examples/docs_to_knowledge_graph/main.py b/examples/docs_to_knowledge_graph/main.py index 438f08b0..4dc9e505 100644 --- a/examples/docs_to_knowledge_graph/main.py +++ b/examples/docs_to_knowledge_graph/main.py @@ -14,29 +14,12 @@ password="cocoindex", ), ) -kuzu_conn_spec = cocoindex.add_auth_entry( - "KuzuConnection", - cocoindex.targets.KuzuConnection( - api_server_url="http://localhost:8123", - ), -) -# SELECT ONE GRAPH DATABASE TO USE -# This example can use either Neo4j or Kuzu as the graph database. -# Please make sure only one branch is live and others are commented out. - -# Use Neo4j GraphDbSpec = cocoindex.targets.Neo4j GraphDbConnection = cocoindex.targets.Neo4jConnection GraphDbDeclaration = cocoindex.targets.Neo4jDeclaration conn_spec = neo4j_conn_spec -# Use Kuzu -# GraphDbSpec = cocoindex.targets.Kuzu -# GraphDbConnection = cocoindex.targets.KuzuConnection -# GraphDbDeclaration = cocoindex.targets.KuzuDeclaration -# conn_spec = kuzu_conn_spec - @dataclasses.dataclass class DocumentSummary: diff --git a/examples/patient_intake_extraction/README.md b/examples/patient_intake_extraction/README.md index b25fe281..0043d55e 100644 --- a/examples/patient_intake_extraction/README.md +++ b/examples/patient_intake_extraction/README.md @@ -4,7 +4,7 @@ We appreciate a star ⭐ at [CocoIndex Github](https://github.com/cocoindex-io/c This repo shows how to use LLM to extract structured data from patient intake forms with different formats - like PDF, Docx, etc. -CocoIndex supports multiple [sources](https://cocoindex.io/docs/ops/sources) and [LLM models](https://cocoindex.io/docs/ai/llm) natively. +CocoIndex supports multiple [sources](https://cocoindex.io/docs/sources) and [LLM models](https://cocoindex.io/docs/ai/llm) natively. ![Structured Data From Patient Intake Forms](https://github.com/user-attachments/assets/1f6afb69-d26d-4a08-8774-13982d6aec1e) diff --git a/examples/product_recommendation/README.md b/examples/product_recommendation/README.md index c2e6e7f9..f3ce29b0 100644 --- a/examples/product_recommendation/README.md +++ b/examples/product_recommendation/README.md @@ -8,13 +8,12 @@ Please drop [CocoIndex on Github](https://github.com/cocoindex-io/cocoindex) a s ## Prerequisite -* [Install Postgres](https://cocoindex.io/docs/getting_started/installation#-install-postgres) if you don't have one. -* Install [Neo4j](https://cocoindex.io/docs/ops/targets#neo4j-dev-instance) or [Kuzu](https://cocoindex.io/docs/ops/targets#kuzu-dev-instance) if you don't have one. - * The example uses Neo4j by default for now. If you want to use Kuzu, find out the "SELECT ONE GRAPH DATABASE TO USE" section and switch the active branch. +* [Install Postgres](https://cocoindex.io/docs/getting_started/installation#-install-postgres) +* Install [Neo4j](https://cocoindex.io/docs/targets/neo4j) * [Configure your OpenAI API key](https://cocoindex.io/docs/ai/llm#openai). ## Documentation -You can read the official CocoIndex Documentation for Property Graph Targets [here](https://cocoindex.io/docs/ops/targets#property-graph-targets). +You can read the official CocoIndex Documentation for Property Graph Targets [here](https://cocoindex.io/docs/targets#property-graph-targets). ## Run @@ -43,7 +42,6 @@ cocoindex update main After the knowledge graph is built, you can explore the knowledge graph. * If you're using Neo4j, you can open the explorer at [http://localhost:7474](http://localhost:7474), with username `neo4j` and password `cocoindex`. -* If you're using Kuzu, you can start a Kuzu explorer locally. See [Kuzu dev instance](https://cocoindex.io/docs/ops/targets#kuzu-dev-instance) for more details. You can run the following Cypher query to get all relationships: diff --git a/examples/product_recommendation/main.py b/examples/product_recommendation/main.py index 4c2b9123..b63cf143 100644 --- a/examples/product_recommendation/main.py +++ b/examples/product_recommendation/main.py @@ -15,29 +15,12 @@ password="cocoindex", ), ) -kuzu_conn_spec = cocoindex.add_auth_entry( - "KuzuConnection", - cocoindex.targets.KuzuConnection( - api_server_url="http://localhost:8123", - ), -) -# SELECT ONE GRAPH DATABASE TO USE -# This example can use either Neo4j or Kuzu as the graph database. -# Please make sure only one branch is live and others are commented out. - -# Use Neo4j GraphDbSpec = cocoindex.targets.Neo4j GraphDbConnection = cocoindex.targets.Neo4jConnection GraphDbDeclaration = cocoindex.targets.Neo4jDeclaration conn_spec = neo4j_conn_spec -# Use Kuzu -# GraphDbSpec = cocoindex.targets.Kuzu -# GraphDbConnection = cocoindex.targets.KuzuConnection -# GraphDbDeclaration = cocoindex.targets.KuzuDeclaration -# conn_spec = kuzu_conn_spec - # Template for rendering product information as markdown to provide information to LLMs PRODUCT_TEMPLATE = """ diff --git a/examples/text_embedding_qdrant/README.md b/examples/text_embedding_qdrant/README.md index be307232..60d1dced 100644 --- a/examples/text_embedding_qdrant/README.md +++ b/examples/text_embedding_qdrant/README.md @@ -2,7 +2,7 @@ [![GitHub](https://img.shields.io/github/stars/cocoindex-io/cocoindex?color=5B5BD6)](https://github.com/cocoindex-io/cocoindex) -CocoIndex supports Qdrant natively - [documentation](https://cocoindex.io/docs/ops/targets#qdrant). In this example, we will build index flow from text embedding from local markdown files, and query the index. We will use **Qdrant** as the vector database. +CocoIndex supports Qdrant natively - [documentation](https://cocoindex.io/docs/targets/qdrant). In this example, we will build index flow from text embedding from local markdown files, and query the index. We will use **Qdrant** as the vector database. We appreciate a star ⭐ at [CocoIndex Github](https://github.com/cocoindex-io/cocoindex) if this is helpful.