Adds simple RAG example to contrib #673

Merged
merged 2 commits into from
Feb 2, 2024

skrawcz (Collaborator) commented Feb 1, 2024

This is a basic example showing the core mechanics of a RAG (retrieval-augmented generation) pipeline.

It uses an in-memory vector store, with FAISS for similarity search.
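
To illustrate the mechanics in general terms, here is a minimal sketch of the retrieve-then-prompt flow. It is not the code from this PR: the brute-force `InMemoryVectorStore` stands in for a FAISS index, `embed` is a toy bag-of-words function standing in for a real embedding model, and `build_prompt` and all other names are hypothetical.

```python
import math
from collections import Counter


def embed(text: str, vocab: list[str]) -> list[float]:
    """Toy bag-of-words embedding (assumption: stands in for a real embedding model)."""
    counts = Counter(text.lower().split())
    vec = [float(counts[word]) for word in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # avoid dividing by zero
    return [v / norm for v in vec]


class InMemoryVectorStore:
    """Brute-force inner-product search (assumption: stands in for a FAISS index)."""

    def __init__(self) -> None:
        self.vectors: list[list[float]] = []
        self.texts: list[str] = []

    def add(self, text: str, vector: list[float]) -> None:
        self.texts.append(text)
        self.vectors.append(vector)

    def search(self, query_vector: list[float], k: int = 1) -> list[str]:
        # Score every stored vector against the query, highest score first.
        scored = [
            (sum(a * b for a, b in zip(query_vector, vector)), text)
            for vector, text in zip(self.vectors, self.texts)
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]


def build_prompt(question: str, context: list[str]) -> str:
    """Combine retrieved context and the question into a prompt for an LLM."""
    joined = "\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


# Index a tiny corpus, retrieve the best match, and assemble the prompt.
docs = [
    "hamilton builds dataflows from python functions",
    "faiss performs similarity search over dense vectors",
]
vocab = sorted({word for doc in docs for word in doc.lower().split()})
store = InMemoryVectorStore()
for doc in docs:
    store.add(doc, embed(doc, vocab))

question = "what performs similarity search"
context = store.search(embed(question, vocab), k=1)
prompt = build_prompt(question, context)
```

In a real pipeline the prompt would then be sent to an LLM; that call is omitted here because it depends on the provider.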

For new dataflows:

Do you have the following?

  • Added a directory mapping to my GitHub username in the contrib/hamilton/contrib/user directory.
    • If my author name contains hyphens, I have replaced them with underscores.
    • If my author name starts with a number, I have prefixed it with an underscore.
    • If your author name is a Python reserved keyword, reach out to the maintainers for help.
    • Added an author.md file under my username directory and filled it out.
    • Added an __init__.py file under my username directory.
  • Added a new folder for my dataflow under my username directory.
    • Added a README.md file under my dataflow directory that follows the standard headings and is filled out.
    • Added an __init__.py file under my dataflow directory that contains the Hamilton code.
    • Added a requirements.txt under my dataflow directory that contains the required packages outside of Hamilton.
    • Added tags.json under my dataflow directory to curate my dataflow.
    • Added valid_configs.jsonl under my dataflow directory to specify the valid configurations.
    • Added a dag.png that shows one possible configuration of my dataflow.
  • I hereby acknowledge that, to the best of my ability, the code I have contributed contains correct attribution
    and notices as appropriate.
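
The author-name rules in the checklist above can be sketched as a small helper. This is only an illustration: `contrib_directory_name` is a hypothetical function, not part of Hamilton, and raising an error for reserved keywords is an assumption standing in for "reach out to the maintainers".

```python
import keyword


def contrib_directory_name(github_username: str) -> str:
    """Hypothetical helper: turn a GitHub username into a valid Python package name."""
    # Hyphens are not allowed in Python module names, so swap them for underscores.
    name = github_username.replace("-", "_")
    # Module names cannot start with a digit, so prefix an underscore.
    if name[:1].isdigit():
        name = "_" + name
    # Reserved keywords need a manual decision from the maintainers.
    if keyword.iskeyword(name):
        raise ValueError(f"{name!r} is a Python reserved keyword; ask the maintainers for help")
    return name
```

For example, under these assumptions `contrib_directory_name("my-user")` gives `my_user` and `contrib_directory_name("1user")` gives `_1user`.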

How I tested this

  • Ran this locally.

Notes

Checklist

  • PR has an informative and human-readable title (this will be pulled into the release notes)
  • Changes are limited to a single goal (no scope creep)
  • Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Dataflow documentation has been updated if adding/changing functionality.

skrawcz added the "contrib" label (Used for code related to contrib package) on Feb 1, 2024
skrawcz changed the title from "Adds simple RAG exmaple to contrib" to "Adds simple RAG example to contrib" on Feb 1, 2024
So that people know what they can adjust easily.
skrawcz merged commit 33c9e36 into main on Feb 2, 2024
24 checks passed
skrawcz deleted the contrib/rag branch on February 2, 2024 06:42
skrawcz (Collaborator, Author) commented Feb 2, 2024

@elijahbenizzy @zilto please still review, will fix things up as they come in.
