Skip to content

Populate vectors to Vector DB from 3 different sources, S3 bucket, code repository, and list of URLs

Notifications You must be signed in to change notification settings

RHEcosystemAppEng/populate-vectors-pipeline

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Populate Vectors Pipeline

This repo compiled a pipeline.yaml that populates vectors from 3 different sources by the user choice:

  • S3 Bucket
  • Code Repository
  • List of URLs

Currently, the repository only supports processing PDFs. However, it can be extended to handle other data types as needed.

Upload the pipeline as a job

If you want to upload the complied pipeline using a job this can be done using this repo: ml-pipeline-importer-runner

How to execute

pip install -r requirements.txt python3 ./main.py

About

Populate vectors to Vector DB from 3 different sources, S3 bucket, code repository, and list of URLs

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages