Skip to content

Commit 036c1d9

Browse files
authored
Merge pull request #30 from UBC-MDS/docs_readme
Docs readme
2 parents f8f72cf + 4b68b44 commit 036c1d9

6 files changed

Lines changed: 149 additions & 170 deletions

File tree

README.md

Lines changed: 46 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -68,6 +68,23 @@ flowchart LR
6868
output --> app["App\nHTML"]
6969
```
7070

71+
### Hybrid
72+
73+
We will merge the above two into a hybrid retriever where we can give weights to the outputs of both retrievers, combining semantic similarity (FAISS) with keyword-based relevance (BM25) to produce a more robust and balanced ranking of documents. By changing the weights, we can control the trade-off between contextual understanding and exact term matching, and we landed on equal weights for our project.
74+
75+
```mermaid
76+
flowchart LR
77+
query["query"] --> sem["FAISS retriever"]
78+
query --> bm["BM25 retriever"]
79+
sem --> semtop["Top-k semantic"]
80+
bm --> bmtop["Top-k BM25"]
81+
semtop --->|50% weight| comb["Combined Output docs"]
82+
bmtop --->|50% weight| comb
83+
comb --> output["Output JSON\n(content + metadata)"]
84+
output --> app["App\nHTML"]
85+
metadata --->|metadata like image url| output
86+
```
87+
7188
## Setup
7289

7390
1. Clone the repository using HTTP
@@ -131,6 +148,35 @@ The app will automatically use the full local index if available, otherwise fall
131148

132149
Evaluation and exploration can be [generated here](./notebooks/milestone1_evaluate_retrieval.ipynb) and are [summarised here](./results/milestone1_discussion.md). We can see a few cases where BM25 was doing better, while in some FAISS was. We cannot compare the scores between them as they are on different scales, but we are able to see how they prioritise items. We have not implemented scoring based on other factors such as popularity or rating, and only rank the products based on their retrieval score.
133150

151+
## RAG and LLM Integration
152+
153+
We will query the online hosted LLMs through huggingface api and thus we have the liberty to select somewhat heavier and powerful models, and we ended up selecting [`meta-llama/Meta-Llama-3-8B-Instruct`](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) with a `max_token` limit of 512 tokens. Since our app is small scale, we expect to have enough free-tier API calls available, and the LLM performance was quite good over many iterations we tested.
154+
155+
We tested 2 kinds of retriever for the RAG- a fully [semantic](#faiss) (FAISS) retriever and a [hybrid](#hybrid) (BM25+semantic) retriever with equal weights. We discovered the hybrid to work very well in this case and it is the sole RAG retriever for this implementation. In the future we can implement a slider to control the ratio of weights.
156+
157+
Both semantic and hybrid can be explored in this [notebook](./notebooks/milestone2_rag.ipynb) with different prompts and parameters. The `rag_pipeline` object returns a tuple, where the second item will return the context retrieved from the retriever, so both can be tested simultaneously. The input `verbose=True` can also print the entire context which is being sent to the LLM, after each step, for more clear exploration.
158+
159+
Here is the basic workflow:
160+
161+
```mermaid
162+
flowchart LR
163+
reviews["Retriever\n(Hybrid or Semantic)"] --> docs["Top k\nDocuments"]
164+
docs --> embeddings["Create\nPage Context"]
165+
embeddings --> similar["Prompt"]
166+
sys_pro["SYSTEM Prompt"] --> similar
167+
query(["User"]) --> qembed["Query"]
168+
qembed --> reviews
169+
qembed --> similar
170+
similar --> response["LLM response"] --> output["Output JSON\n(content + metadata)"]
171+
metadata[("Metadata")] --->|metadata like image_url| output["Output JSON\n(llm output + page_content)"]
172+
```
173+
#### LLM Evaluation
174+
175+
Similar to Search function, some metrics and exploration can be [generated here](./notebooks/milestone2_evaluate_rag.ipynb) and are [summarised here](./results/milestone2_discussion.md). We found that while LLM was slightly unpredictable, for most simple searches it did pretty well. We tried to depend as less as possible on the output formatting to avoid breaking of code in edge cases, eg. when the LLM does not return the `parent_asin` numbers.
176+
177+
> **Disclaimer**
178+
> LLM-based pipelines may occasionally produce inaccurate or unexpected results. Since this application handles food and recipe-related queries, any guidance on cooking, storage, or handling should be independently verified before use. Prompting should be done carefully to avoid hallucinations.
179+
134180
## Authors
135181

136182
- Sarisha Das

app/app.py

Lines changed: 10 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -15,7 +15,7 @@
1515
from src.semantic import load_vector_store
1616
from src.rag_pipeline import run_rag
1717
from src.bm25 import load
18-
from src.hybrid import load_hybrid_retriever
18+
from src.hybrid import HybridRetriever
1919

2020
from dotenv import load_dotenv
2121
load_dotenv()
@@ -36,6 +36,8 @@
3636
FEEDBACK_CSV = ROOT / "results" / "feedback.csv"
3737
FEEDBACK_CSV.parent.mkdir(parents=True, exist_ok=True)
3838

39+
TOP_K = 5
40+
3941
HF_TOKEN = os.getenv('HF_TOKEN')
4042

4143
from datasets import load_dataset
@@ -125,16 +127,14 @@ def semantic_search(query: str, top_k: int = 3) -> list[dict]:
125127
results = enrich_search_results(vector_store, query, top_k, HF_DATASET['full'])
126128
return results
127129

128-
@st.cache_resource
129-
def load_hybrid_retriever_cached():
130-
return load_hybrid_retriever(
131-
bm25_index_path=ROOT_FOLDER / "data" / "processed" / "tokenisation" / "bm25_index_mini.pkl",
132-
faiss_store_path=ROOT_FOLDER / "data" / "processed" / "embeddings",
133-
k=5,
130+
hybrid_retriever = HybridRetriever(
131+
bm25_retriever=retriever,
132+
semantic_store=vector_store,
133+
k=TOP_K,
134+
bm25_weight=0.5,
135+
semantic_weight=0.5,
134136
)
135137

136-
hybrid_retriever = load_hybrid_retriever_cached()
137-
138138
def llm_retriever(query: str, top_k: int = 5):
139139
retriever = hybrid_retriever
140140
answer, docs = run_rag(retriever, query=query, hf_dataset=HF_DATASET['full'])
@@ -256,8 +256,6 @@ def render_results(results: list[dict], mode: str, query: str) -> None:
256256
unsafe_allow_html=True,
257257
)
258258

259-
TOP_K = 5
260-
261259
# ─── Search bar ───────────────────────────────────────────────────────────────
262260
query = st.text_input(
263261
"Search for a product or describe what you're looking for",
@@ -321,6 +319,7 @@ def render_results(results: list[dict], mode: str, query: str) -> None:
321319
)
322320
else:
323321
st.markdown(f"#### 🤖 AI Answer — *\"{st.session_state.last_query}\"*")
322+
st.caption("⚠️ AI responses may contain errors - please verify before relying on them.")
324323
html_response = markdown.markdown(
325324
st.session_state.llm_result,
326325
extensions=["tables", "fenced_code", "nl2br"],

0 commit comments

Comments
 (0)