Shubhank2604/FinSight-AI

FinSight AI

Overview

FinSight AI is a local-first financial decision engine built using a Streamlit interface.

It is not a simple chatbot. Instead, it intelligently routes user queries through:

  • Hybrid document retrieval (RAG)
  • Multimodal reasoning (text + images)
  • Structured LLM reasoning (Gemini)
  • Automatic web grounding (for live data)
  • A verification layer (VeriFi) to ensure reliability

The system is designed for analysis and understanding, not for executing trades or providing licensed financial advice.


Core Idea

Instead of answering directly using an LLM, FinSight AI follows a structured pipeline:

User Query
→ Router (classify intent)
→ Retrieval / Multimodal / Web / Educational path
→ Gemini reasoning
→ VeriFi verification
→ Final Answer

This ensures that answers are:

  • grounded in data
  • logically structured
  • verifiable
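
The pipeline above can be sketched as a small orchestration function. This is a minimal illustration, not the project's actual code: `run_pipeline`, `toy_route`, `toy_verify`, the handler lambdas, and the 0.5 confidence threshold are all hypothetical names and values chosen for the example.

```python
from typing import Callable, Dict

FALLBACK = "Insufficient data to answer reliably."

def run_pipeline(
    query: str,
    route: Callable[[str], str],
    handlers: Dict[str, Callable[[str], dict]],
    verify: Callable[[dict], bool],
) -> str:
    """Route the query, build a draft answer on the chosen path, then verify it."""
    intent = route(query)
    handler = handlers.get(intent)
    if handler is None:          # e.g. an "abstain" intent with no handler
        return FALLBACK
    draft = handler(query)       # stands in for context gathering + model call
    return draft["answer"] if verify(draft) else FALLBACK

# Toy stand-ins (assumptions) so the sketch runs without any external service.
def toy_route(query: str) -> str:
    return "web_grounded_answer" if "current" in query.lower() else "educational_answer"

def toy_verify(draft: dict) -> bool:
    return bool(draft["answer"]) and draft["confidence"] >= 0.5

toy_handlers = {
    "educational_answer": lambda q: {"answer": f"Explanation for: {q}", "confidence": 0.9},
    "web_grounded_answer": lambda q: {"answer": "", "confidence": 0.1},
}

print(run_pipeline("What is an ETF?", toy_route, toy_handlers, toy_verify))
```

Passing the router, handlers, and verifier in as callables keeps each stage replaceable, which mirrors the separation between the stages listed above.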

Key Capabilities

1. Intelligent Query Routing

Every user query is classified into a specific type before processing:

  • compute_only
  • educational_answer
  • retrieve_then_answer
  • retrieve_then_compute_then_answer
  • multimodal_reasoning
  • web_grounded_answer
  • abstain

Routing each query to a dedicated path is intended to avoid generic LLM responses and improve accuracy.
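
As a rough illustration of how such a router might look, the sketch below maps a query to one of the seven intent labels using simple rules. The labels come from the list above; the `classify` function, its `has_image` and `has_documents` parameters, and the keyword rules are assumptions made for this example and may differ from what router/intent_router.py actually does.

```python
import re
from enum import Enum

class Intent(str, Enum):
    COMPUTE_ONLY = "compute_only"
    EDUCATIONAL_ANSWER = "educational_answer"
    RETRIEVE_THEN_ANSWER = "retrieve_then_answer"
    RETRIEVE_THEN_COMPUTE_THEN_ANSWER = "retrieve_then_compute_then_answer"
    MULTIMODAL_REASONING = "multimodal_reasoning"
    WEB_GROUNDED_ANSWER = "web_grounded_answer"
    ABSTAIN = "abstain"

LIVE_WORDS = {"current", "latest", "today", "recent"}   # from the web grounding list
MATH_WORDS = {"calculate", "compute", "sum", "total"}   # hypothetical rule set

def classify(query: str, has_image: bool = False, has_documents: bool = False) -> Intent:
    """Rule-based stand-in for the router; the rules are illustrative only."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    if not words:
        return Intent.ABSTAIN
    if has_image:
        return Intent.MULTIMODAL_REASONING
    if words & LIVE_WORDS:
        return Intent.WEB_GROUNDED_ANSWER
    # Treat a query as a computation only if it names an operation AND gives numbers.
    needs_math = bool(words & MATH_WORDS) and any(ch.isdigit() for ch in query)
    if has_documents:
        return (Intent.RETRIEVE_THEN_COMPUTE_THEN_ANSWER if needs_math
                else Intent.RETRIEVE_THEN_ANSWER)
    return Intent.COMPUTE_ONLY if needs_math else Intent.EDUCATIONAL_ANSWER

print(classify("What is the latest Fed interest rate?").value)
```

A production router could equally be an LLM call or a trained classifier; the point of the sketch is only that every query receives exactly one label before any answer is generated.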


2. Hybrid RAG (Retrieval-Augmented Generation)

FinSight AI uses two retrieval systems together:

Dense Retrieval

  • Uses embeddings
  • Powered by Qdrant
  • Captures semantic meaning

Sparse Retrieval (BM25)

  • Captures exact keywords
  • Handles finance-specific terminology

Fusion (RRF)

  • Combines both results into a final ranked set
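
Reciprocal Rank Fusion (RRF) scores each chunk by summing 1 / (k + rank) over every ranked list in which it appears. The sketch below shows the standard formula on two toy rankings; the function name `rrf_fuse`, the chunk ids, and the constant k = 60 (a commonly used default) are assumptions, not values taken from this repository.

```python
from collections import defaultdict
from typing import Dict, List

def rrf_fuse(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked id lists with Reciprocal Rank Fusion."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for deterministic output.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))

dense_hits = ["c3", "c1", "c7"]    # hypothetical dense (embedding) ranking
sparse_hits = ["c1", "c9", "c3"]   # hypothetical BM25 ranking
print(rrf_fuse([dense_hits, sparse_hits]))
```

Because RRF only uses rank positions, it does not need the dense and BM25 scores to be on a comparable scale, which is one reason it is a popular choice for hybrid retrieval.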

3. Context Packing

Before sending data to Gemini:

  • duplicate chunks are removed
  • relevant chunks are prioritized
  • table data is boosted for numerical queries
  • irrelevant content is filtered

This improves answer precision.
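
The packing steps above could look roughly like the following. This is a sketch under stated assumptions: the `ScoredChunk` type, the `pack_context` function, the score threshold, the table boost factor, and the heuristic for detecting a numerical query are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List

@dataclass
class ScoredChunk:
    chunk_id: str
    content: str
    kind: str      # "text", "table", or "image"
    score: float   # relevance score from retrieval

def pack_context(chunks: List[ScoredChunk], query: str, max_chunks: int = 3,
                 min_score: float = 0.2, table_boost: float = 1.5) -> List[ScoredChunk]:
    """Deduplicate, filter, boost tables for numeric queries, and keep the top chunks."""
    numeric = bool(re.search(r"\d|\b(total|percent|ratio|how much)\b", query.lower()))
    seen = set()
    kept = []
    for chunk in chunks:
        key = " ".join(chunk.content.lower().split())   # whitespace/case-insensitive key
        if key in seen or chunk.score < min_score:      # drop duplicates and weak matches
            continue
        seen.add(key)
        boost = table_boost if numeric and chunk.kind == "table" else 1.0
        kept.append((chunk.score * boost, chunk))
    kept.sort(key=lambda pair: -pair[0])                # stable sort, best first
    return [chunk for _, chunk in kept[:max_chunks]]

sample_chunks = [
    ScoredChunk("a", "Revenue grew in 2023.", "text", 0.60),
    ScoredChunk("b", "revenue  grew in 2023.", "text", 0.55),   # duplicate of "a"
    ScoredChunk("c", "Year | Revenue\n2023 | 10.2", "table", 0.50),
    ScoredChunk("d", "Unrelated footer", "text", 0.05),
]
print([c.chunk_id for c in pack_context(sample_chunks, "What was total revenue?")])
```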


4. Multimodal Reasoning

The system supports:

  • PDFs
  • Tables
  • Images
  • Charts
  • Screenshots

For image-based queries:

Image + Query + Context → Gemini → Explanation

Examples:

  • “Explain this portfolio screenshot”
  • “What trend is visible in this chart?”

5. Automatic Web Grounding

For live or current queries, the system automatically uses Gemini search grounding.

Web grounding is triggered when the query contains or asks about:

  • current / latest / today / recent
  • stock prices
  • exchange rates
  • market conditions
  • interest rates

Example:

"What is the current USD to INR rate?"

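A trigger check along these lines could decide when to take the web-grounded path. The keyword and topic lists are taken from the bullets above; the function name `needs_web_grounding` and the matching strategy (whole-word match for time words, substring match for topics) are assumptions for illustration.

```python
import re

LIVE_WORDS = {"current", "latest", "today", "recent"}
LIVE_TOPICS = ("stock price", "exchange rate", "market conditions", "interest rate")

def needs_web_grounding(query: str) -> bool:
    """Return True if the query looks like it needs live data."""
    text = query.lower()
    words = set(re.findall(r"[a-z]+", text))
    # Whole-word match for time words; substring match so plurals
    # such as "stock prices" still hit the topic list.
    return bool(words & LIVE_WORDS) or any(topic in text for topic in LIVE_TOPICS)

print(needs_web_grounding("What is the current USD to INR rate?"))
```

A pure keyword trigger is cheap but coarse: a conceptual question such as "How are interest rates set?" would also match, so a real router may need an extra check before spending a grounded call.
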
6. Structured Gemini Output

Gemini does not return free-form text.

Instead, it returns structured JSON:

{
  "answer": "...",
  "used_citation_ids": ["chunk-id"],
  "claims": [],
  "assumptions": [],
  "confidence": 0.0,
  "needs_more_data": false
}

This allows downstream validation.
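
As an example of the kind of downstream validation this enables, the sketch below parses the JSON into a typed object and rejects malformed payloads. The field names match the JSON above; the `StructuredAnswer` class, the `parse_answer` function, and the assumption that `confidence` lies in [0, 1] are hypothetical and may not match schemas.py.

```python
import json
from dataclasses import dataclass, field
from typing import List

@dataclass
class StructuredAnswer:
    answer: str
    used_citation_ids: List[str] = field(default_factory=list)
    claims: list = field(default_factory=list)        # element type not specified in the source
    assumptions: list = field(default_factory=list)   # element type not specified in the source
    confidence: float = 0.0
    needs_more_data: bool = False

def parse_answer(raw: str) -> StructuredAnswer:
    """Parse and validate the model's JSON reply; raise ValueError if malformed."""
    data = json.loads(raw)   # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict) or not isinstance(data.get("answer"), str):
        raise ValueError("reply must be a JSON object with a string 'answer'")
    confidence = float(data.get("confidence", 0.0))
    if not 0.0 <= confidence <= 1.0:   # assumed range
        raise ValueError("confidence must be between 0 and 1")
    return StructuredAnswer(
        answer=data["answer"],
        used_citation_ids=[str(c) for c in data.get("used_citation_ids", [])],
        claims=list(data.get("claims", [])),
        assumptions=list(data.get("assumptions", [])),
        confidence=confidence,
        needs_more_data=bool(data.get("needs_more_data", False)),
    )

reply = '{"answer": "Revenue rose.", "used_citation_ids": ["chunk-id"], "confidence": 0.8}'
print(parse_answer(reply))
```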


7. VeriFi (Verification Layer)

VeriFi ensures answer quality by checking:

  • claims are supported by retrieved context
  • citations are valid
  • missing data is handled properly
  • confidence level is sufficient

If verification fails, the system returns:

"Insufficient data to answer reliably."

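A simplified version of these checks might look like the sketch below. It covers citation validity, the missing-data flag, and the confidence threshold; it does not attempt to check whether each claim is actually supported by the cited text, which is the harder part of verification. The function name `verifi_check`, the 0.5 threshold, and the rule that at least one citation is required (which would only make sense on retrieval paths) are assumptions, not the behavior of verifier/verifi.py.

```python
from typing import Iterable

FALLBACK = "Insufficient data to answer reliably."

def verifi_check(answer: dict, retrieved_ids: Iterable[str],
                 min_confidence: float = 0.5) -> str:
    """Return the answer text if every check passes, else the fallback message."""
    known = set(retrieved_ids)
    cited = answer.get("used_citation_ids", [])
    checks = [
        bool(str(answer.get("answer", "")).strip()),             # non-empty answer
        not answer.get("needs_more_data", False),                # model did not ask for more data
        float(answer.get("confidence", 0.0)) >= min_confidence,  # confidence is sufficient
        len(cited) > 0,                                          # at least one citation (assumed rule)
        all(cid in known for cid in cited),                      # every citation was really retrieved
    ]
    return answer["answer"] if all(checks) else FALLBACK

good = {"answer": "Revenue rose 8%.", "used_citation_ids": ["c1"],
        "confidence": 0.8, "needs_more_data": False}
print(verifi_check(good, retrieved_ids=["c1", "c2"]))
```
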
System Architecture

         User
          ↓
      Streamlit UI
          ↓
        Router
          ↓
----------------------------------------------
| Retrieval | Multimodal | Web | Educational |
----------------------------------------------
          ↓
Gemini Structured Answer
          ↓
        VeriFi
          ↓
      Final Answer

Data Ingestion Pipeline

PDF / Image
→ Extract (text, tables, images)
→ Chunking
→ Embedding
→ Store in Qdrant
→ Build BM25 index

Each chunk includes:

  • content
  • type (text / table / image)
  • page number
  • section metadata
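
One possible shape for a chunk record, together with a simple chunking step, is sketched below. The four fields mirror the list above; the `Chunk` class, the `chunk_id` field, the `chunk_text` function, and the fixed-size character windows with overlap are assumptions for illustration and may differ from ingestion/chunker.py.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Chunk:
    chunk_id: str                  # hypothetical id, e.g. "p1-c0"
    content: str
    type: str                      # "text", "table", or "image"
    page_number: int
    section: Optional[str] = None  # section metadata, if known

def chunk_text(text: str, page_number: int, size: int = 200, overlap: int = 50,
               section: Optional[str] = None) -> List[Chunk]:
    """Split page text into fixed-size character windows that overlap."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    chunks: List[Chunk] = []
    start = 0
    while start < len(text):
        chunks.append(Chunk(
            chunk_id=f"p{page_number}-c{len(chunks)}",
            content=text[start:start + size],
            type="text",
            page_number=page_number,
            section=section,
        ))
        if start + size >= len(text):   # this window already reached the end
            break
        start += size - overlap
    return chunks

page = "".join(str(i % 10) for i in range(450))   # 450 characters of dummy text
pieces = chunk_text(page, page_number=1, section="Risk Factors")
print([(c.chunk_id, len(c.content)) for c in pieces])
```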

Repository Structure

app.py                  Main Streamlit app
config.py               Configuration
schemas.py              Data schemas

ingestion/
  extractor.py          Document parsing
  chunker.py            Chunk creation

retrieval/
  hybrid.py             Hybrid RAG implementation

router/
  intent_router.py      Query classification

verifier/
  verifi.py             Validation layer

data/
  uploads/originals/    Uploaded documents
  index/qdrant/         Vector DB
  index/chunks.json     Retrieval catalog

Example Queries

Educational

Tell me about SEC filings
How do I calculate retirement corpus?

Document-Based

What risks are discussed in the Apple filing?
Summarize the brokerage statement

Multimodal

Explain this portfolio screenshot
What does this chart show?

Web Grounded

What is the latest Fed interest rate?

Setup

python -m venv .venv
.\.venv\Scripts\activate
pip install -r requirements.txt

The activate command above is for Windows. On macOS or Linux, use source .venv/bin/activate instead.

Create .env:

GEMINI_API_KEY=your_key_here
GEMINI_TEXT_MODEL=gemini-3-flash-preview
GEMINI_EMBEDDING_MODEL=gemini-embedding-001
GEMINI_WEB_GROUNDING_MODEL=gemini-2.5-flash
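
For reference, these variables could be read in Python roughly as follows. This is a sketch, not the contents of config.py: the `GeminiSettings` class and `load_settings` function are hypothetical, the fallback model names simply repeat the values shown above, and the step that loads .env into the process environment is not shown.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

@dataclass(frozen=True)
class GeminiSettings:
    api_key: str
    text_model: str
    embedding_model: str
    web_grounding_model: str

def load_settings(env: Optional[Mapping[str, str]] = None) -> GeminiSettings:
    """Read Gemini settings from environment variables, failing fast on a missing key."""
    env = os.environ if env is None else env
    api_key = env.get("GEMINI_API_KEY", "")
    if not api_key or api_key == "your_key_here":
        raise RuntimeError("Set GEMINI_API_KEY in .env before starting the app")
    return GeminiSettings(
        api_key=api_key,
        text_model=env.get("GEMINI_TEXT_MODEL", "gemini-3-flash-preview"),
        embedding_model=env.get("GEMINI_EMBEDDING_MODEL", "gemini-embedding-001"),
        web_grounding_model=env.get("GEMINI_WEB_GROUNDING_MODEL", "gemini-2.5-flash"),
    )

settings = load_settings({"GEMINI_API_KEY": "example-key"})
print(settings.text_model)
```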

Run

streamlit run app.py

App runs at:

http://localhost:8510

Index Documents

python index_uploads.py --folder data/uploads/originals --embedding-provider local_hash

Evaluation

python eval_retrieval.py --top-k 5

Technical Highlights

  • Router-first architecture
  • Hybrid RAG (Qdrant + BM25)
  • RRF fusion strategy
  • Context-aware chunk selection
  • Structured LLM output
  • Multimodal reasoning
  • Automatic web grounding
  • Verification layer (VeriFi)
  • Quota-resilient design

About

An AI-powered, local-first financial decision engine that combines hybrid RAG, multimodal reasoning, and structured LLM outputs to deliver grounded, explainable insights. It routes queries, retrieves document context, and verifies responses using a confidence-based layer (VeriFi), with the aim of accurate, reliable, and transparent analysis.
