Skip to content

Commit 603a7e1

Browse files
committed
Add documentation for LLM chat flow
1 parent ce1ce03 commit 603a7e1

1 file changed

Lines changed: 86 additions & 0 deletions

File tree

doc/chat-llm-flow.md

Lines changed: 86 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,86 @@
1+
# LLM Chat Flow in Flaskmarks
2+
3+
This app uses a Retrieval-Augmented Generation (RAG) flow:
4+
1. Retrieve relevant bookmarks for the user.
5+
2. Send retrieved context plus question to Groq LLM.
6+
3. Return answer with source bookmarks.
7+
8+
## Request Flow
9+
10+
1. Frontend sends `POST /chat/send` with `query` and CSRF token.
11+
- `flaskmarks/templates/chat/index.html`
12+
2. Backend validates input via `ChatForm` (`3..1000` chars).
13+
- `flaskmarks/forms/chat.py`
14+
3. Route calls:
15+
- `rag_service.chat(query, user_id, chat_history)`
16+
- `flaskmarks/views/chat.py`
17+
18+
## RAG + LLM Flow
19+
20+
Inside `RAGService.chat`:
21+
22+
1. Check feature/config:
23+
- `RAG_ENABLED`
24+
- `GROQ_API_KEY`
25+
2. Embed the user query with sentence-transformers.
26+
3. Run pgvector similarity search on `marks.embedding`, scoped by `owner_id` (user isolation).
27+
4. Build context from top matches:
28+
- title, URL, tags, description, content excerpt
29+
5. Call Groq chat completions with:
30+
- system prompt
31+
- recent chat history
32+
- current question + retrieved context
33+
6. Return:
34+
- `answer`
35+
- `sources`
36+
- `tokens_used`
37+
38+
Relevant code:
39+
- `flaskmarks/core/rag/service.py`
40+
- `flaskmarks/core/rag/embeddings.py`
41+
42+
## Embeddings and Storage
43+
44+
Bookmark model stores vectors in:
45+
- `embedding` (`Vector(384)`)
46+
- `embedding_updated`
47+
48+
Defined in:
49+
- `flaskmarks/models/mark.py`
50+
51+
Database support is enabled by migration:
52+
- `migrations/versions/a1b2c3d4e5f6_add_embedding_column_for_rag.py`
53+
54+
## How Embeddings Are Generated
55+
56+
Embeddings are primarily generated via CLI:
57+
- `flask rag generate-embeddings`
58+
59+
Code:
60+
- `flaskmarks/cli.py`
61+
62+
If no relevant embedded bookmarks exist, chat returns a fallback message.
63+
64+
## Session Chat History
65+
66+
Chat history is stored in session as alternating user/assistant messages.
67+
History size is capped by `CHAT_MAX_HISTORY`.
68+
69+
Code:
70+
- `flaskmarks/views/chat.py`
71+
72+
## Main Config Knobs
73+
74+
From `config.py`:
75+
- `RAG_ENABLED`
76+
- `GROQ_API_KEY`
77+
- `GROQ_MODEL`
78+
- `GROQ_TEMPERATURE`
79+
- `GROQ_MAX_TOKENS`
80+
- `RAG_TOP_K`
81+
- `RAG_SIMILARITY_THRESHOLD`
82+
- `CHAT_MAX_HISTORY`
83+
84+
## Note
85+
86+
`RAG_SIMILARITY_THRESHOLD` is defined in config but currently not applied in retrieval logic; retrieval uses top-K results.

0 commit comments

Comments
 (0)