cocoindex/examples/amazon_s3_embedding at main · 1fanwang/cocoindex

This example builds an embedding index from files stored in an Amazon S3 bucket. It continuously updates the index as files are added, updated, or deleted in the source bucket, so the index stays in sync with the bucket.

Prerequisite

Before running the example, you need to:

Install Postgres if you don't already have it.
Set up Amazon S3 access. See Setup for AWS S3 for more details.

Create a .env file with your Amazon S3 bucket name and (optionally) a prefix. Start by copying .env.example, then edit the copy to fill in your bucket name and prefix.

cp .env.example .env
$EDITOR .env

Example .env file:

# Database Configuration
DATABASE_URL=postgresql://localhost:5432/cocoindex

# Amazon S3 Configuration
AMAZON_S3_BUCKET_NAME=your-bucket-name
AMAZON_S3_SQS_QUEUE_URL=https://sqs.us-west-2.amazonaws.com/123456789/S3ChangeNotifications
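
As a rough illustration of how a script might consume these settings, here is a minimal sketch that reads them from the process environment. It is not taken from main.py: the `S3Settings` class and `load_s3_settings` function are hypothetical, the `AMAZON_S3_PREFIX` variable name is an assumption (the example file above does not show the prefix variable), and treating `DATABASE_URL` and the bucket name as required while the other two are optional is also an assumption. It further assumes the values from .env have already been exported into the environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class S3Settings:
    """Hypothetical container for the values from the .env file."""
    database_url: str
    bucket_name: str
    prefix: Optional[str]          # assumed variable: AMAZON_S3_PREFIX
    sqs_queue_url: Optional[str]   # AMAZON_S3_SQS_QUEUE_URL


def load_s3_settings(env: Mapping[str, str] = os.environ) -> S3Settings:
    """Read settings from a mapping (defaults to the process environment)."""
    required = ("DATABASE_URL", "AMAZON_S3_BUCKET_NAME")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError("missing required settings: " + ", ".join(missing))
    return S3Settings(
        database_url=env["DATABASE_URL"],
        bucket_name=env["AMAZON_S3_BUCKET_NAME"],
        # Empty strings are treated the same as unset.
        prefix=env.get("AMAZON_S3_PREFIX") or None,
        sqs_queue_url=env.get("AMAZON_S3_SQS_QUEUE_URL") or None,
    )
```

Passing the mapping as a parameter keeps the function easy to exercise with a plain dict instead of the real environment.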

Run

Install dependencies:

pip install -e .

Run:

python main.py

While running, it keeps watching the Amazon S3 bucket for changes and updates the index automatically. At the same time, it accepts queries from the terminal and runs each search against the up-to-date index.
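
To make the "update on change, search the latest state" behavior concrete, here is a toy in-memory sketch. It is not how CocoIndex is implemented: the `embed` function is a bag-of-words stand-in for a real embedding model, the index is a plain dict rather than Postgres, and `apply_change` and `search` are hypothetical names chosen for illustration.

```python
import math
from collections import Counter
from typing import Dict, List, Optional, Tuple


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: word-count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def apply_change(index: Dict[str, Counter], kind: str, key: str,
                 text: Optional[str] = None) -> None:
    """Apply one source change: "upsert" (added or updated) or "delete"."""
    if kind == "upsert":
        index[key] = embed(text or "")
    elif kind == "delete":
        index.pop(key, None)  # deleting an unknown key is a no-op
    else:
        raise ValueError(f"unknown change kind: {kind!r}")


def search(index: Dict[str, Counter], query: str,
           top_k: int = 3) -> List[Tuple[str, float]]:
    """Return up to top_k (key, score) pairs with a positive score."""
    q = embed(query)
    scored = sorted(((cosine(q, vec), key) for key, vec in index.items()),
                    reverse=True)
    return [(key, score) for score, key in scored[:top_k] if score > 0]
```

After an object is deleted from the source, a later query no longer returns it, which is the property the running example maintains against the real bucket.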

CocoInsight

CocoInsight is in Early Access now (Free) 😊 You found us! A quick 3-minute video tutorial about CocoInsight: Watch on YouTube.

Run CocoInsight to understand your RAG data pipeline:

cocoindex server -ci main

You can also add a -L flag to make the server keep updating the index to reflect source changes at the same time:

cocoindex server -ci -L main

Then open the CocoInsight UI at https://cocoindex.io/cocoinsight.
