I am working on a RAG application that uses DSPy and ChromaDB over PDF files.
First, I extracted the text from the PDF, split it into chunks, and added the chunks to ChromaDB together with their embeddings, like this:
def store_document_in_chromadb(text):
    chunks = chunk_document(text)
    ids = [f'chunk_{i}' for i in range(len(chunks))]
    embeddings = [get_embedding(chunk).tolist() for chunk in chunks]
    collection.add(ids=ids, documents=chunks, embeddings=embeddings)
Then I try to retrieve the relevant chunks like this:
import dspy
from dspy.retrieve.chromadb_rm import ChromadbRM

# Retriever over the persisted collection; no embedding function is passed here.
retriever_model = ChromadbRM("contracts_collection", 'db/', k=2)
dspy.settings.configure(lm=llama2_model, rm=retriever_model)

class GenerateAnswer(dspy.Signature):
    """Answer the question based on the context given."""
    context = dspy.InputField(desc="may contain relevant context")
    question = dspy.InputField()
    answer = dspy.OutputField(desc="often between 5 to 10 words")

class RAG(dspy.Module):
    def __init__(self, num_passages=2):
        super().__init__()
        self.retrieve = dspy.Retrieve(k=num_passages)
        self.generate_answer = dspy.ChainOfThought(GenerateAnswer)

    def forward(self, question):
        # Retrieve passages for the question, then answer from them.
        context = self.retrieve(question).passages
        prediction = self.generate_answer(context=context, question=question)
        return dspy.Prediction(context=context, answer=prediction.answer)

with dspy.context(lm=llama2_model, rm=retriever_model):
    module = RAG()
    response = module("What is the Total Spend")
    print(response)
When I run this, I get the following error:
InvalidDimensionException: Embedding dimension 384 does not match collection dimensionality 768
However, when I add the chunks to ChromaDB without the embeddings, retrieval returns the relevant chunks correctly.
Does anyone know why retrieval fails when I store my own embeddings?
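The error message suggests that the vectors stored in the collection are 768-dimensional while the query is embedded with a 384-dimensional model. One possible explanation, which I have not confirmed, is that the retriever embeds the question with a different model than get_embedding, for example ChromaDB's default embedding function, which produces 384-dimensional vectors. A minimal sketch of a check for this kind of mismatch is below. The names embedding_dim, dims_match, fake_index_embedder and fake_query_embedder are hypothetical stand-ins and not part of DSPy or ChromaDB; in the real application the two embedders would be get_embedding and whatever the retriever uses for queries.

```python
from typing import Callable, Sequence

# An embedder maps a text to a vector of floats.
Embedder = Callable[[str], Sequence[float]]


def embedding_dim(embed: Embedder, probe: str = "dimension probe") -> int:
    """Return the length of the vector the embedder produces for a probe string."""
    return len(embed(probe))


def dims_match(index_embed: Embedder, query_embed: Embedder) -> bool:
    """True when the indexing and query embedders produce vectors of equal length."""
    return embedding_dim(index_embed) == embedding_dim(query_embed)


# Hypothetical stand-ins that only mimic the two sizes from the error message.
def fake_index_embedder(text: str) -> Sequence[float]:
    return [0.0] * 768  # assumed size of the vectors stored at indexing time


def fake_query_embedder(text: str) -> Sequence[float]:
    return [0.0] * 384  # assumed size of the vectors produced at query time


if __name__ == "__main__":
    print("index dim:", embedding_dim(fake_index_embedder))
    print("query dim:", embedding_dim(fake_query_embedder))
    print("compatible:", dims_match(fake_index_embedder, fake_query_embedder))
```

If the two sizes differ in the real setup, the same model would have to be used for both storing the chunks and embedding the question. That would also be consistent with retrieval working when no embeddings are passed to collection.add, since the collection and the retriever would then both fall back to the same default.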