Version: 2025-09-23
Scope: Current simplified FastAPI + LangChain + Hybrid Retrieval + Minimal Enterprise UI (v7)
This project ingests raw log files, extracts error/warning lines ("issues"), builds a hybrid retrieval index (FAISS vectors + BM25 lexical), and provides on‑demand AI resolutions for each issue. The UI supports upload → inspect → resolve → batch operations → multi‑format export. Indexing may be asynchronous; readiness gating prevents premature resolution calls.
User Uploads Logs --> /upload
├─ Parse + Extract issue lines
├─ Chunk context (surrounding lines & full logs)
├─ (Async) Build / augment FAISS + BM25 stores
└─ Return issues immediately (QA may still be building)
User Selects Issue in UI
├─ UI fetches context sources (/context)
└─ User clicks Generate (/resolve) once QA ready
Resolution Returned
├─ Stored in ERROR_STORE[id]._answer
└─ Marked resolved (auto or manual batch mark)
Exports (/export) JSON / CSV / XLSX (optionally filtered or lite)
| Layer | File(s) | Responsibility |
|---|---|---|
| API | api.py |
Endpoints, global state, indexing orchestration |
| Retrieval | rag/vectorstore.py |
FAISS + BM25 management |
| Embeddings | rag/embeddings.py |
HuggingFace (or OpenAI) embedding backend |
| Split / Chunk | rag/splitter.py |
Context windows for retrieval |
| Loader | rag/loaders.py |
Log file reading & issue extraction (severity rules) |
| Retriever builder | rag/retriever.py |
Ensemble (weighted vector + BM25) and source fetch |
| QA Chain | rag/chain.py |
LangChain RetrievalQA construction + custom prompt |
| Wrapper | rag/wrapper.py |
Reusable ingestion/query abstraction (not yet wired into API) |
| UI | web/index.html, web/app.js, web/styles.css |
Interaction, progress, batch ops, accessibility |
| Utils | utils/logger.py |
Structured logging |
| Tests | tests/*.py |
Validation of taxonomy / end-to-end behavior |
VECTORSTORES: VectorStores # holds FAISS + BM25 retrievers
ENSEMBLE: EnsembleRetriever | None
QA_CHAIN: RetrievalQA | None
ERROR_STORE: dict[id -> issue dict]
BUILDING: bool # indexing in progress
LAST_BUILD_TS: float | NoneIssue record canonical fields:
{
id, severity, task_name, process_name, error_type,
message, file_id, run_id, line_no, snippet,
manual_resolved (bool), answer_ts (ISO), _answer (QA output, internal)
}
/uploadreads uploaded files into temp disk files.load_log_files()→ list[Document] each line (with metadata: source + line_no).extract_issue_docs()filters ERROR/WARN lines.- Builds a snippet (±2 line window, marking the focal line with
>>). split_context()creates larger retrieval fragments for embedding.- If first time:
VECTORSTORES.build(issue_docs, context_docs)else.add()(incremental augmentation). _rebuild_chain()builds ensemble retriever + QA chain.- If
ASYNC_BUILD=1, steps 6–7 run in a background thread; UI polls/statsuntilqa_ready.
| Component | Purpose | Notes |
|---|---|---|
| FAISS (cosine) | Semantic similarity | Dense embeddings (sentence-transformers default) |
| BM25 | Lexical keyword match | Captures exact token relevance |
| EnsembleRetriever | Weighted merge | Default weights (0.6 vector, 0.4 BM25) |
fetch_sources() merges top-k docs for context & builds citation list: [{source, line_no}].
A LangChain RetrievalQA ("stuff" chain) pattern with a custom system prompt composed of sections:
- Diagnosis
- Evidence (supported by retrieved context)
- Remediation Steps
- Impact
- Uncertainty (explicit self-estimation)
Model selection via env var (OPENAI_MODEL or fallback to a local LLM if replaced). The chain returns:
{
'answer': <string>,
'sources': [ { 'source': <file>, 'line_no': <int>, ... }, ... ]
}
| Method | Path | Summary | Key Response Fields |
|---|---|---|---|
| POST | /upload | Stage files & trigger indexing | errors[], index_building, qa_ready |
| GET | /errors | List issues + resolved status | errors[] (resolved, answer) |
| GET | /context?id= | Retrieve citations for an issue | context.context_text, citations[] |
| POST | /resolve | Run QA for a single issue | answer.text, answer.citations |
| POST | /resolve_batch | Batch QA generation | results[] per id |
| POST | /mark_resolved | Manual mark/unmark | updated[] |
| GET | /export | Multi-format export | JSON / file stream |
| GET | /stats | Build + readiness + counts | qa_ready, index_building, resolved_total |
| GET | /ready | Lightweight readiness | qa_ready |
| POST | /reset | Reset indices (optional keep issues) | status |
| GET | /health | Simple health probe | status, retriever_ready |
formatorfmt:json|md|csv|xlsxids: comma separated subsetlite=1: reduced columns (id, severity, message, resolved)
An issue is considered resolved if:
- It has
_answer(AI generated), or - It is manually marked (
manual_resolved=true)
answer_ts is recorded when the first resolution (AI or manual) occurs.
Batch operations:
/resolve_batch– invokes QA for each id (skips missing)./mark_resolved– flips manual flag (no QA call).
| Concept | Implementation |
|---|---|
| Layout | Two panes (issues list / detail) + top actions & global progress |
| State | In-browser arrays (ERRORS), sets (SELECTED), flags (QA_READY) |
| Batch | Checkbox selection + batch bar (resolve/mark/export) |
| Progress | Global bar (indexing), mini resolved % bar, inline resolution bar |
| Skeleton | Placeholder list while server processes upload parsing |
| Sources | Lazy-loaded citations, collapsible panel |
| Accessibility | ARIA roles: progressbar, status regions, focus styling |
| Interaction Lock | Temporary body .locked during indexing (search/filter exempt) |
| Export | Dropdown format + Lite toggle + selected subset export |
User selects files → click Upload → show overlay & skeleton
POST /upload → receive errors[] (possibly qa_ready=false)
Render issues → If BUILDING: poll /stats every 2s → when qa_ready → enable Generate
Select issue → fetch /context (citations) → user clicks Generate → POST /resolve
Show inline progress bar → replace with answer → mark resolved + update counts
2025-09-23 10:00:00 INFO Starting job X
2025-09-23 10:00:02 WARN Retryable network glitch
2025-09-23 10:00:05 ERROR Failed to connect to DB timeout=30s
2025-09-23 10:00:06 INFO Backoff scheduled
- WARN line and ERROR line become issue docs.
- Snippet for ERROR includes preceding WARN + following INFO lines (window ±2).
- Documents chunked (e.g., 1000 char windows).
- Embeddings generated → FAISS index updated → BM25 updated.
- Ensemble retriever constructed.
User selects ERROR → QA prompt contains:
- User Query: raw error line text.
- Context: weighted aggregate of top-k vector + BM25 chunks. Model responds with structured answer. Citations list ties back to source file & lines.
def upload(files):
docs = load_log_files(temp_paths)
issues = extract_issue_docs(docs)
build_snippets(issues, docs)
context_docs = split_context(docs)
if ASYNC_BUILD: background(_index_build, issues, context_docs)
else: VECTORSTORES.build(...); _rebuild_chain()
return issues, qa_ready=not ASYNC_BUILDdef resolve(id):
row = ERROR_STORE[id]
assert QA_CHAIN
res = run_qa(QA_CHAIN, row['message'])
row['_answer']=res; row['answer_ts']=utc_now()
return res['answer'], res['sources']| Variable | Default | Meaning |
|---|---|---|
| CHUNK_SIZE | 1000 | Max characters per context chunk |
| CHUNK_OVERLAP | 150 | Overlap between chunks |
| TOP_K | 8 | Retrieval depth (merged) |
| EMBED_BACKEND | hf | Embedding provider (customizable) |
| OPENAI_MODEL | gpt-3.5-turbo | LLM model name if OpenAI used |
| ASYNC_BUILD | 1 | Enable background indexing |
| DISABLE_PERSIST | (unset) | If implemented, skip persistence (future) |
Purpose: Reuse ingestion + retrieval outside this FastAPI service.
from rag.wrapper import RAGWrapper
rag = RAGWrapper(embedding_backend="hf")
rag.ingest_logs(["/path/app.log"]) # or ingest_texts([...])
rag.build() # explicit if not auto
res = rag.query("Failed to connect to DB timeout=30s")
print(res.answer, res.sources)Integration Steps:
- Instantiate once per tenant/session.
- Feed logs or documents.
- Call
build()(or rely on auto call inside ingest if implemented). - Use
query(); capture sources for UI. - Reset with
reset()when switching datasets.
| Mode | Purpose | Notes |
|---|---|---|
| JSON | Raw list of issues + flags | Good for pipeline chaining |
| Markdown | Table for quick reports | Omits long text trimming logic (basic) |
| CSV | Full or lite | lite=1 reduces size |
| XLSX | Spreadsheet consumption | Requires openpyxl |
| IDs Filter | ?ids=id1,id2 |
Combine with format + lite |
| Aspect | Mitigation / Strategy |
|---|---|
| Large logs | Streaming parse line-by-line (current: reads into docs; can stream) |
| Many issues | Use virtualized list (future) |
| Embedding latency | Batch embeddings; potential async concurrency |
| Repeated queries | Introduce local answer cache keyed by (message hash) |
| Memory | Offload FAISS to disk (persist) |
Potential future: Add reranker (e.g., BGE, Cohere ReRank) after initial recall.
| Code | Scenario |
|---|---|
| 400 | Missing files / invalid params / unsupported export format |
| 404 | Issue ID not found |
| 503 | QA chain not built yet (resolve early) |
| 500 | Internal errors (index build / XLSX save) |
UI automatically retries a 503 resolve once after a short delay.
| Feature | Status |
|---|---|
| ARIA progressbar | YES (global bar) |
| Live regions | YES (status, toasts) |
| Keyboard focus states | Unified via :focus-visible |
| Reduced motion toggle | NOT YET (planned) |
| Semantic buttons/labels | YES (copy, export, filters) |
Improvements: Add skip link, high-contrast toggle.
| Concern | Action |
|---|---|
| Arbitrary file upload | Enforce size caps & sanitize content |
| Prompt injection (logs) | Add guardrails / system prompt sanitation |
| Rate limiting | Add IP or token bucket (FastAPI middleware) |
| Auth | OAuth/JWT layer in front (e.g., API key header) |
| PII leakage | Optional redaction pass pre-index |
| Goal | Approach |
|---|---|
| Add streaming answers | Replace /resolve with SSE or WebSocket yielding partial tokens |
| Swap embedding model | Extend embeddings.py to branch on EMBED_BACKEND |
| Add reranker | After merged docs, apply cross-encoder scoring; reorder before chain input |
| Multi-tenant | Instance map: TENANTS[id] = { VECTORSTORES, ENSEMBLE, ... } + per-tenant API key |
| Persistence | Serialize FAISS + BM25 state per build; load at startup |
| Doc ingestion beyond logs | Extend loader to accept PDFs/HTML → unify into wrapper |
Current tests (examples):
tests/test_end_to_end.py: Validates end-to-end minimal ingestion & answer path.tests/test_taxonomy.py: Severity categorization rules.
Suggested additional tests:
- Export formatting integrity (CSV/XLSX columns)
- Batch resolve idempotency
- Async build gating (resolve returns 503 pre-ready)
# Upload two log files
curl -F "files=@app.log" -F "files=@worker.log" http://localhost:8000/upload
# Poll stats
curl http://localhost:8000/stats
# List errors
curl http://localhost:8000/errors
# Get context for one id
curl "http://localhost:8000/context?id=<ISSUE_ID>"
# Resolve one
curl -X POST -H 'Content-Type: application/json' -d '{"id":"<ISSUE_ID>"}' http://localhost:8000/resolve
# Batch resolve
curl -X POST -H 'Content-Type: application/json' -d '{"ids":["<ID1>","<ID2>"]}' http://localhost:8000/resolve_batch
# Export lite CSV for selected
curl "http://localhost:8000/export?format=csv&ids=<ID1>,<ID2>&lite=1" -o selected_lite.csv| Symptom | Likely Cause | Remedy |
|---|---|---|
| 503 on /resolve repeatedly | Index build failed | Check server logs; look for exception in _index_build |
| CSV empty | No issues extracted | Confirm logs have ERROR/WARN tokens |
| XLSX 400 | Missing dependency | pip install openpyxl |
| Low quality answers | Insufficient context | Increase CHUNK_SIZE or TOP_K |
| Slow build | Large file set | Set ASYNC_BUILD=1 & show progress; consider chunk batching |
- Streaming answer generation
- Reranker integration (quality uplift)
- Theming + light mode + high contrast
- Multi-tenant wrapper adoption in API
- Persistent on-disk index & warm start
- Source excerpt highlighting in answer (inline citations)
- Saved sessions & diff-based incremental indexing
| Term | Definition |
|---|---|
| Issue | Single log line categorized as ERROR or WARN |
| Snippet | Local window lines around issue for human inspection |
| Context Chunk | Larger aggregated text unit used for retrieval embeddings |
| QA Chain | LLM-powered retrieval augmented question answering pipeline |
| Ensemble Retrieval | Weighted combination of vector similarity + BM25 lexical matches |
import requests, time
# Upload
files = [('files', ('app.log', open('app.log','rb'), 'text/plain'))]
upload = requests.post('http://localhost:8000/upload', files=files).json()
issue_id = upload['errors'][0]['id']
# Wait for readiness
while True:
s = requests.get('http://localhost:8000/stats').json()
if s['qa_ready']: break
time.sleep(1)
# Resolve
ans = requests.post('http://localhost:8000/resolve', json={'id': issue_id}).json()
print(ans['answer']['text'])- Lean state: Only minimal metadata kept per issue; derived states (resolved flags) computed.
- Progressive disclosure: Sources hidden until toggled.
- Non-blocking ingestion: Async build path ensures fast UI feedback.
- Extensibility first: Wrapper isolates ingestion + retrieval surfaces.
- Accessibility baseline: ARIA roles, focus outlines, live regions.
| Question | Answer |
|---|---|
| Can I use a different embedding model? | Set EMBED_BACKEND or modify embeddings.py to select model. |
| How to add new severity levels? | Extend extraction logic (loader) + badge styles. |
| Why hybrid retrieval? | Lexical (BM25) catches raw token errors; semantic matches contextual synonyms. |
| Is streaming supported? | Not yet—convert /resolve to an SSE endpoint with incremental token yield. |
When modifying:
| Area | Check Also |
|---|---|
| Retrieval weights | Prompt output quality, speed |
| Chunk sizing | Embedding latency, recall quality |
| Export schema | Frontend batch export naming |
| State shape (ERROR_STORE) | /errors, /export, tests |
This codebase implements a pragmatic, production-lean RAG workflow for log diagnostics: fast ingestion, hybrid retrieval, structured LLM resolutions, and actionable UI ergonomics. The architecture isolates concerns cleanly, enabling incremental upgrade (reranking, streaming, multi-tenancy) without rewrites. Refer to this document as an onboarding and extension blueprint.
End of Document.