docs : Improve README navigation and contribution guidance by tubajavedd · Pull Request #45 · AOSSIE-Org/Info

tubajavedd · 2026-02-13T18:21:37Z

✨ Improve README navigation and contribution guidance

Summary

This PR improves the README structure and usability for new contributors and community members.

Changes Made

Added a Table of Contents for easier navigation
Converted absolute GitHub links to relative links
Added a "How to Contribute" section
Improved minor naming and formatting consistency

Why This Is Useful

These changes make the documentation cleaner, easier to maintain, and more welcoming for new contributors and GSoC applicants.

Scope

Documentation-only changes
No policy or functional changes

Fixes #40

Summary by CodeRabbit

New Features
- Interactive Q&A CLI enabling intelligent search over markdown notes with semantic matching and source attribution.
Documentation
- Updated project README with contribution guidelines and improved navigation structure.
- Added architecture documentation for local-first markdown storage system.
- New documentation for Q&A search implementations.

coderabbitai · 2026-02-13T18:21:59Z

📝 Walkthrough

This PR establishes a local-first markdown notes storage system with retrieval-augmented generation (RAG) capabilities. It introduces design documentation, updates the README with improved navigation, and implements a pipeline for chunking, embedding, and semantically searching markdown notes through a CLI interface.

Changes

Cohort / File(s)	Summary
Documentation & Configuration `.gitignore`, `README.md`, `Design/local markdown storage.md`, `smart-notes-design/README.md`, `smart-notes/rag_mvp/README.md`	Added `.gitignore` entry for notes directory. Restructured README with Table of Contents, relative links, and "How to Contribute" section. Created design documents outlining local markdown storage architecture and landing page UI design. Added README for RAG MVP project describing Q&A and embeddings-based implementations.
Embeddings & Indexing Infrastructure `smart-notes/rag_mvp/embeddings/chunker.py`, `embedder.py`, `indexer.py`	Introduced text chunking utility supporting overlapping chunks. Created Embedder class wrapping SentenceTransformer for text-to-vector conversion. Implemented VectorIndexer class backed by FAISS for efficient semantic search over embeddings.
RAG Pipeline & CLI `smart-notes/rag_mvp/pipelines/embedding_pipeline.py`, `smart-notes/rag_mvp/qa_cli.py`	Added EmbeddingPipeline orchestrating end-to-end workflow: chunking, embedding generation, FAISS index creation, and semantic search. Implemented CLI module providing interactive note search with keyword-based fallback and embedding pipeline demonstration.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CLI as qa_cli
    participant Pipeline as EmbeddingPipeline
    participant Chunker as chunker
    participant Embedder as embedder
    participant Indexer as VectorIndexer

    User->>CLI: Enter query
    CLI->>Pipeline: process_notes(text)
    Pipeline->>Chunker: chunk_text(text)
    Chunker-->>Pipeline: List[str]
    Pipeline->>Embedder: embed(chunks)
    Embedder-->>Pipeline: np.ndarray
    Pipeline->>Indexer: add(embeddings, chunks)
    Indexer-->>Pipeline: ✓
    
    CLI->>Pipeline: semantic_search(query)
    Pipeline->>Embedder: embed(query)
    Embedder-->>Pipeline: np.ndarray
    Pipeline->>Indexer: search(query_embedding, k=3)
    Indexer-->>Pipeline: List[str]
    Pipeline-->>CLI: Results
    CLI-->>User: Display results
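
The flow in the diagram can be approximated without any heavy dependencies. The sketch below is a toy stand-in, not the PR's implementation: `ToyPipeline`, `toy_embed`, and `cosine` are hypothetical names, a bag-of-words counter replaces the SentenceTransformer embedding, a brute-force cosine ranking replaces the FAISS index, and paragraphs replace character-window chunks. Only the method names `process_notes` and `semantic_search` come from the diagram.

```python
import math
import re
from collections import Counter
from typing import List


def toy_embed(text: str) -> Counter:
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyPipeline:
    """Hypothetical miniature of the chunk -> embed -> index -> search flow."""

    def __init__(self) -> None:
        self.chunks: List[str] = []
        self.vectors: List[Counter] = []

    def process_notes(self, text: str) -> List[str]:
        # Assumption: chunk on blank lines instead of character windows.
        self.chunks = [p.strip() for p in text.split("\n\n") if p.strip()]
        self.vectors = [toy_embed(c) for c in self.chunks]
        return self.chunks

    def semantic_search(self, query: str, k: int = 3) -> List[str]:
        # Guard: nothing has been indexed yet.
        if not self.chunks:
            return []
        q = toy_embed(query)
        ranked = sorted(
            range(len(self.chunks)),
            key=lambda i: cosine(q, self.vectors[i]),
            reverse=True,
        )
        return [self.chunks[i] for i in ranked[:k]]
```

Calling `semantic_search` before `process_notes` returns an empty list here, which mirrors the guard requested for the real pipeline later in this review.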

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

docs: add role-based guidance and navigation links to README #9: Makes overlapping documentation changes to README.md with role-based guidance and navigation structure improvements.

Suggested reviewers

Zahnentferner

Poem

🐰 A warren of notes, now nestled with care,
Chunked into fragments that float through the air,
Embeddings dance through the vector space wide,
While FAISS indexes help seekers find answers inside! ✨

🚥 Pre-merge checks | ✅ 3 | ❌ 3

❌ Failed checks (3 warnings)

Check name	Status	Explanation	Resolution
Out of Scope Changes check	⚠️ Warning	The PR contains out-of-scope changes beyond issue `#40` requirements: .gitignore modification, new design documentation, and RAG MVP implementation code are unrelated to README improvements.	Separate non-documentation changes (embeddings, indexer, pipeline, CLI code) and architecture design into dedicated PRs to maintain focus on issue `#40` objectives.
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Merge Conflict Detection	⚠️ Warning	❌ Merge conflicts detected (3 files): ⚔️ `GSoC-Ideas/2026/MiniChain.md` (content) ⚔️ `GSoC-Ideas/2026/Rein.md` (content) ⚔️ `README.md` (content) These conflicts must be resolved before merging into `main`.	Resolve conflicts locally and push changes to this branch.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main change: improving README navigation and contribution guidance, which aligns with the PR objectives.
Linked Issues check	✅ Passed	The PR fulfills all coding-related objectives from issue `#40`: adds Table of Contents [`#40`], converts absolute to relative links [`#40`], adds How to Contribute section [`#40`], and improves formatting consistency [`#40`].

coderabbitai

Actionable comments posted: 13

🤖 Fix all issues with AI agents

In `@Design/local markdown storage.md`:
- Line 170: The section numbering skips from 10 to 12: rename the header "###
12. Summary" to "### 11. Summary" and adjust any following numbered section
headers so numbering is consecutive (e.g., decrement subsequent headings by
one); search for the literal header "### 12. Summary" and update it and any
later section numbers to maintain proper sequence.
- Line 47: Fix the mismatched and inconsistent quotation marks in the phrase
containing "in-memory index " and "local db(SQLite)"; remove the stray trailing
space inside the first quoted phrase so it reads "in-memory index", add the
missing closing quote after local db(SQLite) so it reads "local db(SQLite)", and
ensure both quoted phrases use the same quote style and spacing.
- Line 15: The phrase "NO Cloud Serives" contains a typo; update the text to
read "NO Cloud Services" (replace the string "NO Cloud Serives" with "NO Cloud
Services") so the word "Services" is spelled correctly.

In `@smart-notes/rag_mvp/embeddings/chunker.py`:
- Around line 9-29: The chunk_text function can loop infinitely when overlap >=
max_length; add an input validation or clamp near the top of chunk_text to
ensure overlap < max_length (e.g., raise a ValueError or set overlap =
min(overlap, max_length - 1)), and then continue using start/end as before;
reference the variables overlap, max_length and the function chunk_text to
locate where to add the guard.

In `@smart-notes/rag_mvp/embeddings/indexer.py`:
- Around line 34-39: FAISS returns -1 for empty result slots so the loop in the
search logic must skip negative indices and avoid indexing self.texts with -1;
update the call to self.index.search to assign distances to _distances (to
address the unused-variable warning) and then filter indices before using them
(e.g., only consider idx >= 0 and idx < len(self.texts)) when building results
from self.texts using the query_embedding search results and indices returned by
self.index.search.

In `@smart-notes/rag_mvp/pipelines/embedding_pipeline.py`:
- Line 10: The code hardcodes a Windows-only cache path when instantiating
SentenceTransformer (self.model = SentenceTransformer(model_name,
cache_folder="D:/models_cache")), which will fail on other OSes; change this by
removing the cache_folder argument to use the library default or compute a
cross-platform path (e.g., using os.path.join(os.path.expanduser("~"), ".cache",
"sentence_transformers") or respect an env var like MODEL_CACHE_DIR) and pass
that variable as cache_folder when creating SentenceTransformer with model_name
so the code works across platforms.
- Around line 14-24: Delete the duplicate chunk_text method from
embedding_pipeline.py and instead import and call the canonical chunk_text
utility from smart-notes/rag_mvp/embeddings/chunker.py; update any local calls
in EmbeddingPipeline (or wherever chunk_text was used) to use the imported
function and pass max_length/overlap explicitly if you need the 300/50 behavior
(the canonical utility includes empty-input/strip safeguards and fixes the
overlap infinite-loop risk). Ensure the import targets the exact function name
chunk_text and remove the local definition to avoid shadowing.
- Around line 41-46: semantic_search currently assumes self.index and
self.chunks are ready; add a guard at the top of semantic_search to raise or
return an empty list if self.index is None (i.e., process_notes / build_index
hasn't been run), and after calling self.index.search, filter out FAISS sentinel
-1 index values from indices before dereferencing self.chunks. Specifically, in
semantic_search, check self.index (and optionally self.chunks) and handle the
case where it's None, ensure the query vector is the expected 2D shape, then
post-process the returned indices array to ignore any -1 entries and only map
valid indices into self.chunks so you don't accidentally read self.chunks[-1].

In `@smart-notes/rag_mvp/qa_cli.py`:
- Around line 63-70: In search_notes, detect when query_words (computed from
query and QUESTION_WORDS) is empty and surface a clear message instead of
silently returning no matches; update the function search_notes to check if not
query_words and then either raise a ValueError or return a sentinel (e.g., an
empty list plus a message or a tuple) or print "No searchable keywords found" so
callers know the query had no searchable terms; ensure the behavior is
documented in the function docstring and that callers of search_notes handle the
chosen sentinel/exception appropriately.
- Line 4: Fix the typo in the inline comment that reads
"#-------------------emedding-pipeline-chunking concept" by changing "emedding"
to "embedding" so the comment becomes
"#-------------------embedding-pipeline-chunking concept"; this is purely a
documentation change in the top-of-file comment in qa_cli.py.
- Around line 74-76: The current loop uses substring checks (if any(word in
sentence_lower for word in query_words)) which yields false positives; change it
to perform whole-word matching by either compiling each query word into a
word-boundary regex (e.g., r'\b{word}\b') and testing against sentence_lower, or
tokenize sentence_lower into a set of words (split/punctuation-normalize) and
check membership against that set; update the loop that iterates over sentences
(variables: sentences, sentence_lower, query_words) to use the chosen whole-word
approach so queries like "AI" don't match substrings like "again".

In `@smart-notes/rag_mvp/README.md`:
- Around line 28-43: The README has an unclosed triple-backtick code fence after
the example starting with the command "python smart-notes/rag_mvp/qa_cli.py",
causing the subsequent example queries and rest of the document to be inside the
code block; fix this by adding a closing ``` after the example output (after the
sample responses such as the "[1] From test.md: Artificial Intelligence..."
lines) to properly terminate the code block.
- Around line 75-84: The README's project structure is out of date: it lists
embed.py, index.py, and utils.py but the code now lives under embeddings/ and
pipelines/; update the README to reflect the actual modules (e.g.,
embeddings/chunker.py, embeddings/embedder.py, embeddings/indexer.py, and
pipelines/embedding_pipeline.py), keep notes/, requirements.txt and top-level
README.md entries, and adjust any references to qa_cli or utils to point to the
new files or document where equivalent functionality lives (for example note
which file provides embedding, chunking, indexing, and the CLI/pipeline
entrypoints).
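
For the hardcoded cache path flagged in embedding_pipeline.py above, one cross-platform option might look like the following sketch. The helper name `resolve_model_cache_dir` is hypothetical; `MODEL_CACHE_DIR` is the environment variable name suggested in the comment, and the fallback directory is the one proposed there, which is not necessarily the library's own default.

```python
import os
from pathlib import Path


def resolve_model_cache_dir(env_var: str = "MODEL_CACHE_DIR") -> str:
    """Return a cache directory that works on any OS.

    An explicit environment variable wins; otherwise fall back to a
    folder under the user's home directory.
    """
    override = os.environ.get(env_var)
    if override:
        return str(Path(override).expanduser())
    return str(Path.home() / ".cache" / "sentence_transformers")


# Intended use (not executed here, requires sentence-transformers):
#   SentenceTransformer(model_name, cache_folder=resolve_model_cache_dir())
```

Dropping the `cache_folder` argument entirely is the simpler alternative if no custom location is needed.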

🧹 Nitpick comments (5)

smart-notes/rag_mvp/pipelines/embedding_pipeline.py (1)

3-4: Hard imports will crash at import time if dependencies are missing.

Unlike embedder.py and indexer.py which gracefully handle missing dependencies with try/except ImportError, this file uses bare imports. Consider aligning with the pattern used in the sibling modules, or reusing Embedder and VectorIndexer directly instead of reimplementing their logic.
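
A minimal sketch of a guarded import is shown below. The exact pattern used in embedder.py and indexer.py is not visible in this review, so the helper `optional_import` and the error message are assumptions for illustration only.

```python
import importlib
from types import ModuleType
from typing import Optional


def optional_import(name: str) -> Optional[ModuleType]:
    """Import a module if it is installed, otherwise return None."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Hypothetical use at module top level: defer the failure to the point of use.
faiss = optional_import("faiss")


def require_faiss() -> ModuleType:
    """Raise a readable error only when the dependency is actually needed."""
    if faiss is None:
        raise RuntimeError("faiss is not installed; semantic search is unavailable")
    return faiss
```

With this shape, importing the module never fails; only code paths that call `require_faiss()` do.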
smart-notes/rag_mvp/qa_cli.py (4)

16-16: Unused variable embeddings.

The unpacked embeddings is never used, as flagged by Ruff. Prefix with _ to signal intent.
Fix
-    chunks, embeddings = pipeline.process_notes(note_text)
+    chunks, _embeddings = pipeline.process_notes(note_text)

85-89: demo_embeddings_pipeline() runs unconditionally, blocking the keyword-search CLI.

The demo requires sentence-transformers and faiss to be installed and downloads a model on first run. This makes the interactive keyword search unusable if those heavy dependencies aren't available. Consider guarding the demo behind a flag or try/except.
Suggested guard
 if __name__ == "__main__":
-
-    demo_embeddings_pipeline()      # Temporary demo for embeddings pipeline
+    try:
+        demo_embeddings_pipeline()      # Temporary demo for embeddings pipeline
+    except Exception as e:
+        print(f"Skipping embedding demo: {e}")
 
     notes = load_notes()

39-39: Relative NOTES_DIR is fragile: it depends on the working directory.

If the script is invoked from a different directory, it will fail to find notes/. Consider resolving relative to the script's own location.
Suggested fix
-NOTES_DIR = "notes"
+NOTES_DIR = os.path.join(os.path.dirname(os.path.abspath(__file__)), "..", "notes")
Adjust the relative traversal ("..") to match your actual project layout.

93-98: No handling for EOFError / KeyboardInterrupt in the input loop.

If stdin is piped or the user presses Ctrl+C / Ctrl+D, the program crashes with an unhandled exception.
Suggested fix
     while True:
-        query = input(">> ").strip()
-
-        if query.lower() == "exit":
-            print("Goodbye 👋")
-            break
+        try:
+            query = input(">> ").strip()
+        except (EOFError, KeyboardInterrupt):
+            print("\nGoodbye 👋")
+            break
+
+        if query.lower() == "exit":
+            print("Goodbye 👋")
+            break

coderabbitai · 2026-02-13T18:24:54Z

Design/local markdown storage.md

+
+Offline-first experience : 
+    NO Internet,
+    NO Cloud Serives,


⚠️ Potential issue | 🟡 Minor

Typo: "Serives" → "Services".

-    NO Cloud Serives,
+    NO Cloud Services,

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

NO Cloud Serives,

NO Cloud Services,

🤖 Prompt for AI Agents

In `@Design/local markdown storage.md` at line 15, The phrase "NO Cloud Serives" contains a typo; update the text to read "NO Cloud Services" (replace the string "NO Cloud Serives" with "NO Cloud Services") so the word "Services" is spelled correctly.

coderabbitai · 2026-02-13T18:24:54Z

Design/local markdown storage.md

+
+The app maintains:
+
+  "in-memory index " and "local db(SQLite) 


⚠️ Potential issue | 🟡 Minor

Mismatched quotation marks.

The line has an opening " before local db(SQLite) but no closing ", and there's a stray space inside the first quoted phrase.

-  "in-memory index " and "local db(SQLite) 
+  "in-memory index" and "local db (SQLite)"

📝 Committable suggestion

Suggested change

"in-memory index " and "local db(SQLite)

"in-memory index" and "local db (SQLite)"

🤖 Prompt for AI Agents

In `@Design/local markdown storage.md` at line 47, Fix the mismatched and inconsistent quotation marks in the phrase containing "in-memory index " and "local db(SQLite)"; remove the stray trailing space inside the first quoted phrase so it reads "in-memory index", add the missing closing quote after local db(SQLite) so it reads "local db(SQLite)", and ensure both quoted phrases use the same quote style and spacing.

coderabbitai · 2026-02-13T18:24:54Z

Design/local markdown storage.md

+
+
+
+### 12. Summary


⚠️ Potential issue | 🟡 Minor

Section numbering skips from 10 to 12.

Section 11 is missing — ### 12. Summary should likely be ### 11. Summary, and subsequent sections renumbered accordingly.

🤖 Prompt for AI Agents

In `@Design/local markdown storage.md` at line 170, The section numbering skips from 10 to 12: rename the header "### 12. Summary" to "### 11. Summary" and adjust any following numbered section headers so numbering is consecutive (e.g., decrement subsequent headings by one); search for the literal header "### 12. Summary" and update it and any later section numbers to maintain proper sequence.

coderabbitai · 2026-02-13T18:24:54Z

smart-notes/rag_mvp/embeddings/chunker.py

+def chunk_text(text: str, max_length: int = 500, overlap: int = 50) -> List[str]:
+    if not text:
+        return []
+
+    chunks = []
+    start = 0
+    text = text.strip()
+
+    while start < len(text):
+        end = start + max_length
+        chunk = text[start:end].strip()
+
+        if chunk:
+            chunks.append(chunk)
+
+        if end >= len(text):
+            break
+
+        start = end - overlap
+        if start < 0:
+            start = 0


⚠️ Potential issue | 🟠 Major

Infinite loop when overlap >= max_length.

If overlap >= max_length, then start = end - overlap never advances past the current start, causing an infinite loop. Add a guard.

🐛 Proposed fix

 def chunk_text(text: str, max_length: int = 500, overlap: int = 50) -> List[str]:
-    if not text:
+    if not text or max_length <= 0:
+        return []
+    if overlap >= max_length:
+        raise ValueError("overlap must be less than max_length")
+    if not text.strip():
         return []

📝 Committable suggestion

Suggested change

-def chunk_text(text: str, max_length: int = 500, overlap: int = 50) -> List[str]:
-    if not text:
-        return []
-
-    chunks = []
-    start = 0
-    text = text.strip()
-
-    while start < len(text):
-        end = start + max_length
-        chunk = text[start:end].strip()
-
-        if chunk:
-            chunks.append(chunk)
-
-        if end >= len(text):
-            break
-
-        start = end - overlap
-        if start < 0:
-            start = 0
+def chunk_text(text: str, max_length: int = 500, overlap: int = 50) -> List[str]:
+    if not text or max_length <= 0:
+        return []
+    if overlap >= max_length:
+        raise ValueError("overlap must be less than max_length")
+    if not text.strip():
+        return []
+
+    chunks = []
+    start = 0
+    text = text.strip()
+
+    while start < len(text):
+        end = start + max_length
+        chunk = text[start:end].strip()
+
+        if chunk:
+            chunks.append(chunk)
+
+        if end >= len(text):
+            break
+
+        start = end - overlap
+        if start < 0:
+            start = 0

🤖 Prompt for AI Agents

In `@smart-notes/rag_mvp/embeddings/chunker.py` around lines 9 - 29, The chunk_text function can loop infinitely when overlap >= max_length; add an input validation or clamp near the top of chunk_text to ensure overlap < max_length (e.g., raise a ValueError or set overlap = min(overlap, max_length - 1)), and then continue using start/end as before; reference the variables overlap, max_length and the function chunk_text to locate where to add the guard.
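
The prompt above mentions clamping as an alternative to raising. A minimal sketch of that variant follows; `chunk_text_clamped` is a hypothetical name, and the explicit `stride` variable is introduced only to make the termination argument visible: each iteration advances by `max_length - overlap`, so the loop ends as long as that difference is at least 1.

```python
from typing import List


def chunk_text_clamped(text: str, max_length: int = 500, overlap: int = 50) -> List[str]:
    """Split text into overlapping chunks, clamping overlap instead of raising."""
    if not text or not text.strip() or max_length <= 0:
        return []

    # Clamp so the stride (max_length - overlap) is always at least 1.
    overlap = max(0, min(overlap, max_length - 1))
    stride = max_length - overlap

    text = text.strip()
    chunks: List[str] = []
    for start in range(0, len(text), stride):
        chunk = text[start:start + max_length].strip()
        if chunk:
            chunks.append(chunk)
        # Stop once the current window reaches the end of the text.
        if start + max_length >= len(text):
            break
    return chunks
```

Clamping silently changes what the caller asked for, so raising a `ValueError` as in the proposed fix may be the better choice when misconfiguration should be visible.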

coderabbitai · 2026-02-13T18:24:54Z

smart-notes/rag_mvp/embeddings/indexer.py

+        distances, indices = self.index.search(query_embedding.reshape(1, -1), k)
+        results = []
+
+        for idx in indices[0]:
+            if idx < len(self.texts):
+                results.append(self.texts[idx])


⚠️ Potential issue | 🔴 Critical

Bug: FAISS returns -1 for unfilled result slots, which passes the idx < len check.

When fewer than k vectors are in the index, FAISS fills remaining indices with -1. In Python, -1 < len(self.texts) is True, so self.texts[-1] silently returns the last chunk — a wrong result.

🐛 Proposed fix

-        distances, indices = self.index.search(query_embedding.reshape(1, -1), k)
+        _distances, indices = self.index.search(query_embedding.reshape(1, -1), k)
         results = []
 
         for idx in indices[0]:
-            if idx < len(self.texts):
+            if 0 <= idx < len(self.texts):
                 results.append(self.texts[idx])

This also addresses the Ruff warning about the unused distances variable (rename to _distances).

🧰 Tools

🪛 Ruff (0.15.0)

[warning] 34-34: Unpacked variable distances is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)

🤖 Prompt for AI Agents

In `@smart-notes/rag_mvp/embeddings/indexer.py` around lines 34 - 39, FAISS returns -1 for empty result slots so the loop in the search logic must skip negative indices and avoid indexing self.texts with -1; update the call to self.index.search to assign distances to _distances (to address the unused-variable warning) and then filter indices before using them (e.g., only consider idx >= 0 and idx < len(self.texts)) when building results from self.texts using the query_embedding search results and indices returned by self.index.search.
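
The pitfall can be reproduced without FAISS installed. In the sketch below the `indices` list is hand-written to imitate a search with k=3 over an index holding two vectors; the function names are hypothetical.

```python
from typing import List

texts = ["chunk A", "chunk B"]
# Simulated result row: two real hits, then the -1 sentinel for the empty slot.
indices = [1, 0, -1]


def lookup_buggy(indices: List[int], texts: List[str]) -> List[str]:
    # -1 < len(texts) is True, and texts[-1] is the last element.
    return [texts[i] for i in indices if i < len(texts)]


def lookup_fixed(indices: List[int], texts: List[str]) -> List[str]:
    # The lower bound rejects the sentinel before it is used as an index.
    return [texts[i] for i in indices if 0 <= i < len(texts)]


print(lookup_buggy(indices, texts))  # ['chunk B', 'chunk A', 'chunk B']
print(lookup_fixed(indices, texts))  # ['chunk B', 'chunk A']
```

The buggy version returns "chunk B" twice, which is easy to miss because no exception is raised.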

coderabbitai · 2026-02-13T18:24:54Z

smart-notes/rag_mvp/qa_cli.py

+import os
+import re
+
+#-------------------emedding-pipeline-chunking concept


⚠️ Potential issue | 🟡 Minor

Typo: "emedding" → "embedding".

-#-------------------emedding-pipeline-chunking concept
+#-------------------embedding-pipeline-chunking concept

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

#-------------------emedding-pipeline-chunking concept

#-------------------embedding-pipeline-chunking concept

🤖 Prompt for AI Agents

In `@smart-notes/rag_mvp/qa_cli.py` at line 4, Fix the typo in the inline comment that reads "#-------------------emedding-pipeline-chunking concept" by changing "emedding" to "embedding" so the comment becomes "#-------------------embedding-pipeline-chunking concept"; this is purely a documentation change in the top-of-file comment in qa_cli.py.

coderabbitai · 2026-02-13T18:24:54Z

smart-notes/rag_mvp/qa_cli.py

+def search_notes(query, notes):
+    results = []
+
+    query_words = [
+        word.lower()
+        for word in query.split()
+        if word.lower() not in QUESTION_WORDS
+    ]


⚠️ Potential issue | 🟡 Minor

Empty query_words silently returns no results — consider informing the user.

If the user's query consists entirely of stop/question words (e.g., "What is the?"), query_words will be empty and any(...) will always be False, returning zero matches with no indication of why. A short message like "No searchable keywords found" would improve UX.

🤖 Prompt for AI Agents

In `@smart-notes/rag_mvp/qa_cli.py` around lines 63 - 70, In search_notes, detect when query_words (computed from query and QUESTION_WORDS) is empty and surface a clear message instead of silently returning no matches; update the function search_notes to check if not query_words and then either raise a ValueError or return a sentinel (e.g., an empty list plus a message or a tuple) or print "No searchable keywords found" so callers know the query had no searchable terms; ensure the behavior is documented in the function docstring and that callers of search_notes handle the chosen sentinel/exception appropriately.
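
One way to surface the empty-keyword case is to return a message alongside the keywords, as sketched below. The contents of `QUESTION_WORDS` are not shown in this review, so the set here is a hypothetical stand-in, as are the function names; stripping trailing punctuation is also an assumption that the original list comprehension does not make.

```python
from typing import List, Tuple

# Hypothetical stop list; the real QUESTION_WORDS in qa_cli.py may differ.
QUESTION_WORDS = {"what", "is", "the", "how", "why", "a", "an"}


def extract_keywords(query: str) -> List[str]:
    """Lowercase the query, strip basic punctuation, drop question words."""
    words = (w.lower().strip("?.,!") for w in query.split())
    return [w for w in words if w and w not in QUESTION_WORDS]


def keywords_or_message(query: str) -> Tuple[List[str], str]:
    """Return (keywords, message); message is non-empty when nothing is searchable."""
    keywords = extract_keywords(query)
    if not keywords:
        return [], "No searchable keywords found"
    return keywords, ""
```

A caller can print the message and skip the search when the keyword list is empty, instead of showing an unexplained "no results".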

coderabbitai · 2026-02-13T18:24:54Z

smart-notes/rag_mvp/qa_cli.py

+        for sentence in sentences:
+            sentence_lower = sentence.lower()
+            if any(word in sentence_lower for word in query_words):


⚠️ Potential issue | 🟠 Major

Substring matching causes false positives in keyword search.

word in sentence_lower performs a substring check, not a whole-word match. For example, querying "AI" would match any sentence containing "contain", "again", "plain", etc., because "ai" is a substring of those words.

Use a word-boundary regex or tokenize sentences into word sets instead.

Proposed fix using word-boundary regex

-            if any(word in sentence_lower for word in query_words):
+            if any(re.search(r'\b' + re.escape(word) + r'\b', sentence_lower) for word in query_words):

🤖 Prompt for AI Agents

In `@smart-notes/rag_mvp/qa_cli.py` around lines 74 - 76, The current loop uses substring checks (if any(word in sentence_lower for word in query_words)) which yields false positives; change it to perform whole-word matching by either compiling each query word into a word-boundary regex (e.g., r'\b{word}\b') and testing against sentence_lower, or tokenize sentence_lower into a set of words (split/punctuation-normalize) and check membership against that set; update the loop that iterates over sentences (variables: sentences, sentence_lower, query_words) to use the chosen whole-word approach so queries like "AI" don't match substrings like "again".
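
The difference between the three approaches discussed above can be seen side by side in this small sketch. The function names are hypothetical, and the tokenizer uses `\w+`, which is one reasonable choice rather than the only one.

```python
import re
from typing import Iterable


def substring_match(sentence: str, words: Iterable[str]) -> bool:
    """Current behavior: plain substring test."""
    lowered = sentence.lower()
    return any(w in lowered for w in words)


def whole_word_match(sentence: str, words: Iterable[str]) -> bool:
    """Word-boundary regex, with the query word escaped."""
    lowered = sentence.lower()
    return any(re.search(r"\b" + re.escape(w) + r"\b", lowered) for w in words)


def token_match(sentence: str, words: Iterable[str]) -> bool:
    """Tokenize once into a set, then test membership."""
    tokens = set(re.findall(r"\w+", sentence.lower()))
    return any(w in tokens for w in words)


print(substring_match("We tried again", ["ai"]))   # True (false positive)
print(whole_word_match("We tried again", ["ai"]))  # False
print(token_match("AI is fun", ["ai"]))            # True
```

The token-set version does the tokenizing once per sentence, which may be preferable when a query has many words.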

coderabbitai · 2026-02-13T18:24:54Z

smart-notes/rag_mvp/README.md

+```bash
+python smart-notes/rag_mvp/qa_cli.py
+
+
+
+>> what is AI
+
+[1] From test.md:
+Artificial Intelligence (AI) is the simulation of human intelligence in machines.
+
+
+>>  what is machine learning
+how is machine learning used
+difference between AI and ML
+
+


⚠️ Potential issue | 🟡 Minor

Unclosed code block breaks rendering.

The code block opened at line 28 is never closed with a matching ```. The example queries (lines 33–41) bleed into the code block, and the rest of the document renders incorrectly.

Add a closing ``` after line 29 (or after the example output) to properly terminate the block.

🤖 Prompt for AI Agents

In `@smart-notes/rag_mvp/README.md` around lines 28 - 43, The README has an unclosed triple-backtick code fence after the example starting with the command "python smart-notes/rag_mvp/qa_cli.py", causing the subsequent example queries and rest of the document to be inside the code block; fix this by adding a closing ``` after the example output (after the sample responses such as the "[1] From test.md: Artificial Intelligence..." lines) to properly terminate the code block.

coderabbitai · 2026-02-13T18:24:54Z

smart-notes/rag_mvp/README.md

+```bash
+smart-notes/
+├── rag_mvp/
+│   ├── embed.py        # Embedding logic
+│   ├── index.py        # FAISS index creation
+│   ├── qa_cli.py       # CLI for asking questions
+│   └── utils.py        # Helper functions
+├── notes/              # Put your .txt notes here
+├── requirements.txt
+└── README.md


⚠️ Potential issue | 🟡 Minor

Project structure doesn't match actual files.

The documented structure references embed.py, index.py, and utils.py, but the actual modules in this PR are embeddings/chunker.py, embeddings/embedder.py, embeddings/indexer.py, and pipelines/embedding_pipeline.py. This will confuse contributors trying to navigate the codebase.

🤖 Prompt for AI Agents

In `@smart-notes/rag_mvp/README.md` around lines 75 - 84, The README's project structure is out of date: it lists embed.py, index.py, and utils.py but the code now lives under embeddings/ and pipelines/; update the README to reflect the actual modules (e.g., embeddings/chunker.py, embeddings/embedder.py, embeddings/indexer.py, and pipelines/embedding_pipeline.py), keep notes/, requirements.txt and top-level README.md entries, and adjust any references to qa_cli or utils to point to the new files or document where equivalent functionality lives (for example note which file provides embedding, chunking, indexing, and the CLI/pipeline entrypoints).

Tuba Javed added 7 commits February 9, 2026 00:18

Add local Q&A CLI MVP for markdown notes

564741c

embedding-pipeline-chunking

5238131

Design ui landing page of smart-notes

40aba7e

Local-markdown-storage-design

917ef7c

Local-markdown-storage-design changes

aa2b95c

do changes in architecture(trade-offs) of local markdown storage.md

a007c92

docs: improve README navigation and contribution guidance

f268eb3
