FIU CoAgents — Co-Investigator AI for AML Compliance

A multi-agent AI system that ingests AML transaction data, detects money laundering typologies, generates FinCEN-compliant SAR narrative drafts, and provides a human-in-the-loop investigator review interface. Inspired by the Co-Investigator AI research paper.

Key Features

Multi-Agent Orchestration — 14 specialized agents coordinated by a LangGraph StateGraph with supervisor routing and parallel execution via Send API
SAR Narrative Generation — DSPy-optimized Chain-of-Thought prompts produce FinCEN 5W1H-compliant narratives with per-section confidence scores
Human-in-the-Loop Review — LangGraph interrupt checkpoints pause the pipeline for investigator approval, with iterative feedback loops (up to 3 revision cycles)
PII Protection — Microsoft Presidio masks all sensitive data before LLM inference; reversible anonymization with PostgreSQL-backed mapping store and full audit trail
False Positive Bypass — Supervisor agent detects low-confidence cases (maximum typology confidence below 25%) and exits the pipeline early without generating unnecessary SAR narratives
Programmatic Prompt Optimization — DSPy's MIPROv2 optimizer tunes the prompts behind each DSPy signature against a golden evaluation dataset
Compliance Validation (Agent-as-Judge) — Rule-based checks and an LLM judge score each narrative on 5W1H completeness, factual grounding, regulatory keywords, and objective tone
Full Audit Trail — Every agent action, LLM call, confidence score, and human decision is logged to PostgreSQL for SR 11-7 model risk management compliance
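
The reversible anonymization idea from the PII Protection feature can be illustrated with a small standard-library sketch. This is not the project's Presidio guard: the regular expression, the placeholder format, and the in-memory mapping (standing in for the PostgreSQL-backed store) are all assumptions made for illustration.

```python
import re
from typing import Dict, Tuple

# Hypothetical pattern: 10-character hexadecimal account IDs such as "8000ECA410".
ACCOUNT_RE = re.compile(r"\b[0-9A-F]{10}\b")


def mask(text: str) -> Tuple[str, Dict[str, str]]:
    """Replace each distinct account ID with a stable placeholder.

    Returns the masked text and a placeholder -> original mapping
    (an in-memory stand-in for a persistent mapping store).
    """
    mapping: Dict[str, str] = {}  # placeholder -> original value
    reverse: Dict[str, str] = {}  # original value -> placeholder

    def _sub(match: "re.Match[str]") -> str:
        value = match.group(0)
        if value not in reverse:
            placeholder = f"[ACCOUNT-{len(reverse) + 1}]"
            reverse[value] = placeholder
            mapping[placeholder] = value
        return reverse[value]

    return ACCOUNT_RE.sub(_sub, text), mapping


def unmask(text: str, mapping: Dict[str, str]) -> str:
    """Restore the original values after LLM inference."""
    for placeholder, value in mapping.items():
        text = text.replace(placeholder, value)
    return text
```

Because the same value always maps to the same placeholder, the LLM can still reason about repeated counterparties without seeing the underlying identifiers.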

Architecture

Raw IBM AML CSV → Data Ingestion → PII Masking (Presidio) → Crime Type Detection
→ Planning Agent (Supervisor) → [4 Typology Agents in parallel] → External Intelligence (Mock MCP)
→ Narrative Generation (DSPy + CoT) → Compliance Validation (Agent-as-Judge)
→ Human Review (Streamlit) → Feedback Agent → Final SAR → Audit Log

                    ┌─────────────────────┐
                    │   Data Ingestion    │
                    │   & Structuring     │
                    └────────┬────────────┘
                             │
                    ┌────────▼────────────┐
                    │  AI-Privacy Guard   │
                    │  (Presidio)         │
                    └────────┬────────────┘
                             │
                    ┌────────▼────────────┐
                    │  Crime Type         │
                    │  Detection Agent    │
                    └────────┬────────────┘
                             │
                    ┌────────▼────────────┐
                    │  Planning Agent     │◄──── Dynamic Memory (FAISS)
                    │  (Supervisor)       │       ├── Regulatory Memory
                    └────────┬────────────┘       ├── Historical Narrative Memory
                             │                    └── Typology Pattern Memory
              ┌──────────────┼──────────────┐
              │              │              │
    ┌─────────▼──┐  ┌───────▼────┐  ┌──────▼───────┐
    │ Typology   │  │ Typology   │  │ External     │
    │ Agents x4  │  │ Agents     │  │ Intelligence │
    │ (parallel) │  │ (contd)    │  │ Agent (Mock) │
    └─────────┬──┘  └───────┬────┘  └──────┬───────┘
              │              │              │
              └──────────────┼──────────────┘
                             │
                    ┌────────▼────────────┐
                    │  Narrative          │
                    │  Generation Agent   │
                    │  (DSPy + CoT)       │
                    └────────┬────────────┘
                             │
                    ┌────────▼────────────┐
                    │  Compliance         │
                    │  Validation Agent   │
                    │  (Agent-as-Judge)   │
                    └────────┬────────────┘
                             │
                    ┌────────▼────────────┐      ┌──────────────────┐
                    │  Human Review       │◄────►│  Feedback Agent   │
                    │  (Streamlit UI)     │      │  (Iterative)      │
                    │  [INTERRUPT POINT]  │      └──────────────────┘
                    └────────┬────────────┘
                             │
                    ┌────────▼────────────┐
                    │  Final SAR Output   │
                    │  + Audit Trail      │
                    └─────────────────────┘
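
The parallel typology stage in the diagram can be approximated without LangGraph as a plain fan-out and merge. The sketch below uses a thread pool rather than the Send API the project describes, and the agent functions, the transaction shape, and the toy structuring heuristic are hypothetical placeholders, not the repository's detectors.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Transaction = Dict[str, float]  # hypothetical minimal shape: {"amount": ...}


def detect_structuring(txns: List[Transaction]) -> float:
    """Toy heuristic: fraction of amounts in [9000, 10000)."""
    if not txns:
        return 0.0
    near_threshold = sum(1 for t in txns if 9_000 <= t["amount"] < 10_000)
    return near_threshold / len(txns)


def placeholder_agent(txns: List[Transaction]) -> float:
    """Stand-in for agents not sketched here; always reports 0.0."""
    return 0.0


# Typology names follow the Agent Pipeline table; the functions are assumptions.
AGENTS: Dict[str, Callable[[List[Transaction]], float]] = {
    "structuring": detect_structuring,
    "layering": placeholder_agent,
    "round_tripping": placeholder_agent,
    "sanctions": placeholder_agent,
}


def run_typology_agents(txns: List[Transaction]) -> Dict[str, float]:
    """Run all agents concurrently and merge their confidence scores."""
    with ThreadPoolExecutor(max_workers=len(AGENTS)) as pool:
        futures = {name: pool.submit(fn, txns) for name, fn in AGENTS.items()}
        return {name: future.result() for name, future in futures.items()}
```

Each agent reads the same input and writes to its own key, so the merge step needs no locking.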

Technology Stack

Component	Technology	Purpose
Language	Python 3.11+	Primary language
Agent Orchestration	LangGraph	Multi-agent StateGraph with supervisor routing
Prompt Optimization	DSPy	Programmatic prompt tuning with MIPROv2
LLM (Primary)	Groq — Llama 3.3 70B	Reasoning, narrative generation
LLM (Fast)	Groq — Llama 3.1 8B	DSPy optimization, lightweight tasks
Embeddings	sentence-transformers (all-MiniLM-L6-v2)	Vector embeddings for RAG
Vector Store	FAISS	3-tier in-memory vector search
Database	PostgreSQL 16+	Structured data, audit logs, checkpoints
PII Masking	Microsoft Presidio + spaCy	Anonymize/de-anonymize PII
Experiment Tracking	MLflow	Prompt versioning, evaluation metrics
API Framework	FastAPI	REST API with SSE streaming
Frontend	Streamlit	Multi-page investigator review UI
Data Processing	Pandas + Pydantic	Schema validation, feature engineering
ORM / Migrations	SQLAlchemy + Alembic	Database abstraction, schema versioning

Quick Start

# 1. Clone
git clone https://github.com/yourusername/FIU_CoAgents.git
cd FIU_CoAgents

# 2. Environment
cp .env.example .env
# Edit .env — add your GROQ_API_KEY

# 3. Install
pip install -e ".[dev]"
python -m spacy download en_core_web_lg

# 4. Database (brew commands assume macOS with Homebrew; use your platform's PostgreSQL 16+ packages otherwise)
brew install postgresql@16 && brew services start postgresql@16
createdb fiu_coagents
make setup

# 5. Run
make run-api   # FastAPI on :8000
make run-ui    # Streamlit on :8501

Usage

Run an Investigation via API

curl -X POST http://localhost:8000/api/investigations/CASE-001/run \
  -H "Content-Type: application/json" \
  -d '{
    "case_id": "CASE-001",
    "enriched_transactions": [...],
    "case_summary": {"total_amount": 95000, "risk_level": "HIGH"},
    "account_profiles": {}
  }'
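
The same request can be prepared from Python using only the standard library. This is a sketch that mirrors the curl call above; the helper name `build_run_request` is hypothetical, and the transaction list is left for the caller to supply because its schema is not shown here.

```python
import json
import urllib.request
from typing import Any, Dict, List, Optional

BASE_URL = "http://localhost:8000"  # matches the FastAPI port from Quick Start


def build_run_request(
    case_id: str,
    enriched_transactions: List[Dict[str, Any]],
    case_summary: Dict[str, Any],
    account_profiles: Optional[Dict[str, Any]] = None,
) -> urllib.request.Request:
    """Build (but do not send) the POST request for one investigation run."""
    payload = {
        "case_id": case_id,
        "enriched_transactions": enriched_transactions,
        "case_summary": case_summary,
        "account_profiles": account_profiles or {},
    }
    return urllib.request.Request(
        f"{BASE_URL}/api/investigations/{case_id}/run",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# To send it against a running server:
#   with urllib.request.urlopen(build_run_request(...)) as resp:
#       print(json.load(resp))
```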

Stream Agent Progress (SSE)

curl -N http://localhost:8000/api/investigations/CASE-001/run/stream \
  -H "Content-Type: application/json" \
  -d '{"case_id": "CASE-001", "enriched_transactions": [...]}'
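
On the client side, the stream can be consumed with a small parser for the generic Server-Sent Events line format. The sketch below handles only the `event:` and `data:` style fields and comment lines; the event names and payloads this API actually emits are not documented here, so the ones in the usage note are made up.

```python
from typing import Dict, Iterable, Iterator, List


def iter_sse_events(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per SSE event from an iterable of text lines.

    A blank line terminates an event; lines starting with ":" are comments.
    Multiple "data:" lines in one event are joined with newlines.
    """
    event: Dict[str, str] = {}
    data_parts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if event or data_parts:
                event["data"] = "\n".join(data_parts)
                yield event
            event, data_parts = {}, []
            continue
        if line.startswith(":"):
            continue  # comment or keep-alive ping
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "data":
            data_parts.append(value)
        else:
            event[field] = value
```

For example, feeding it the lines `event: agent_update`, `data: {"agent": "crime_detection"}`, and a blank line yields a single event dict with those two fields. An event that is not terminated by a blank line is dropped.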

Streamlit UI

streamlit run ui/app.py --server.port 8501

Sample SAR Narrative Output

SUSPICIOUS ACTIVITY REPORT — NARRATIVE

FILING INSTITUTION: First National Bank
REPORT PERIOD: 2022-09-01 to 2022-09-15
CASE REFERENCE: CASE-001

SUMMARY
Multiple structured cash deposits totaling $94,500 detected across 12
transactions, systematically kept below the $10,000 CTR threshold.

SUBJECT IDENTIFICATION (WHO)
Account holder [REDACTED-1], operating accounts 8000ECA410 and
8000ED0210 at banks 11 and 15...

NATURE OF SUSPICIOUS ACTIVITY (WHAT)
Structuring / smurfing pattern detected with 92% confidence...
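
The parenthesized tags in the section headings above, such as (WHO) and (WHAT), lend themselves to a simple rule-based completeness check. The sketch below assumes every 5W1H section is labeled with the same "(TAG)" convention, which the truncated sample only confirms for WHO and WHAT; it is not the project's scorer.

```python
import re
from typing import List, Tuple

REQUIRED_TAGS = ("WHO", "WHAT", "WHEN", "WHERE", "WHY", "HOW")
TAG_RE = re.compile(r"\((WHO|WHAT|WHEN|WHERE|WHY|HOW)\)")


def completeness(narrative: str) -> Tuple[float, List[str]]:
    """Return (fraction of 5W1H tags present, list of missing tags)."""
    found = set(TAG_RE.findall(narrative))
    missing = [tag for tag in REQUIRED_TAGS if tag not in found]
    score = (len(REQUIRED_TAGS) - len(missing)) / len(REQUIRED_TAGS)
    return score, missing
```

Applied to the excerpt above, this would report WHEN, WHERE, WHY, and HOW as missing, which is expected because the sample is cut off after the WHAT section.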

Project Structure

FIU_CoAgents/
├── src/
│   ├── config.py                      # pydantic-settings configuration
│   ├── agents/                        # LangGraph agent nodes
│   │   ├── state.py                   # SARInvestigationState TypedDict
│   │   ├── graph.py                   # StateGraph definition + build_graph()
│   │   ├── supervisor.py              # Planning Agent (supervisor routing)
│   │   ├── crime_detection/           # Crime type detection (rule + LLM)
│   │   ├── typology/                  # 4 specialized typology agents
│   │   ├── intelligence/              # External intelligence (mock MCP)
│   │   ├── narrative/                 # DSPy narrative generation
│   │   ├── validation/                # Compliance validation (Agent-as-Judge)
│   │   └── feedback/                  # Iterative feedback agent
│   ├── data/                          # Data ingestion, schemas, case builder
│   ├── privacy/                       # Presidio guard, PII mapping, audit
│   ├── memory/                        # FAISS 3-tier RAG memory
│   ├── api/                           # FastAPI routes + middleware
│   └── db/                            # SQLAlchemy models, Alembic migrations
├── evaluation/                        # Golden dataset + 5 custom scorers
├── ui/                                # Streamlit multi-page app
├── tests/                             # 300 unit + integration tests
├── data/                              # Raw, processed, golden, regulatory
├── scripts/                           # Setup, load, build, optimize
└── docs/                              # Architecture, API, evaluation docs

Agent Pipeline

The supervisor routes each case through 8 phases, with a false positive bypass after the typology agents run:

Phase	Agent	Output
1	Crime Type Detection	`detected_crime_types`, confidence scores
2	4 Typology Agents (parallel)	Structuring, Layering, Round-Tripping, Sanctions
2a	False Positive Bypass	If max confidence < 25%, pipeline exits early
3	External Intelligence	Mock sanctions, PEP, adverse media checks
4	Narrative Generation (DSPy)	FinCEN 5W1H SAR narrative draft
5	Compliance Validation	Rule-based + LLM judge scoring
6	Human Review (interrupt)	Investigator approval/edit/reject
7	Feedback Agent	Structured revision instructions
8	Final Output	Approved SAR + audit trail
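
The two branching decisions in the table, the false positive bypass and the review loop, can be sketched as plain routing functions. The 25% threshold and the cap of 3 revision cycles come from this README; the node names, the handling of empty results, and the "escalate" outcome once the cap is reached are assumptions, since the source does not say what happens after the third revision.

```python
from typing import Dict

FALSE_POSITIVE_THRESHOLD = 0.25  # Phase 2a: exit if max confidence is below this
MAX_REVISIONS = 3                # Key Features: up to 3 revision cycles


def route_after_typology(confidences: Dict[str, float]) -> str:
    """Decide whether to continue the investigation or exit early."""
    # Assumption: no typology results at all is treated like a false positive.
    if not confidences or max(confidences.values()) < FALSE_POSITIVE_THRESHOLD:
        return "exit_false_positive"
    return "external_intelligence"


def route_after_review(decision: str, revisions_done: int) -> str:
    """Decide the next node after the investigator's review decision."""
    if decision == "approve":
        return "final_output"
    if revisions_done < MAX_REVISIONS:
        return "feedback_agent"
    return "escalate"  # hypothetical outcome once the revision cap is reached
```

In a LangGraph build, functions of this shape would typically be registered as conditional edges, with the returned strings mapped to node names.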

Evaluation

Nine scoring dimensions are evaluated against a golden dataset of 50 labeled cases:

Scorer	Type	Threshold
Typology Detection Accuracy	Rule + ML	>= 0.85
Confidence Calibration (ECE)	Statistical	<= 0.10
Narrative Completeness (5W1H)	Rule-based	>= 0.90
Factual Grounding	Rule + LLM	>= 0.95
Regulatory Compliance	Rule-based	>= 0.90
Narrative Quality (LLM Judge)	LLM	>= 3.5 / 5
PII Leakage Rate	Rule-based	0.00
End-to-End Latency	Timer	<= 120s
False Positive Rate	Statistical	<= 0.15

make eval   # Run full evaluation suite, logs to MLflow
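
For readers unfamiliar with the Confidence Calibration (ECE) row, expected calibration error is the weighted average gap between mean confidence and observed accuracy across confidence bins. The sketch below uses 10 equal-width bins, which is a common convention but an assumption here; the project's scorer may bin differently.

```python
from typing import List, Sequence, Tuple


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """ECE = sum over bins of (bin size / n) * |accuracy - mean confidence|."""
    n = len(confidences)
    if n == 0 or n != len(correct):
        raise ValueError("inputs must be non-empty and of equal length")

    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Clamp so that a confidence of exactly 1.0 lands in the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece
```

As a worked case, four predictions at 0.9 confidence with three correct give an accuracy of 0.75 in a single bin, so the ECE is |0.75 - 0.9| = 0.15, which would fail the 0.10 threshold in the table.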

Documentation

Architecture — System design, data flow, key decisions
Agent Specifications — Per-agent contracts and state fields
API Reference — Endpoint catalog with request/response models
Evaluation Methodology — Golden dataset, scoring, MLflow
Regulatory Alignment — FinCEN, BSA, SR 11-7 mapping
Deployment Guide — Setup, environment, running, testing

Development

make test          # pytest with coverage
make lint          # ruff check + format + mypy
make run-api       # FastAPI dev server
make run-ui        # Streamlit dev server
make eval          # Run evaluation suite
make clean         # Remove caches

Author

Yash Patel — GitHub

License

MIT License. See LICENSE for details.

Acknowledgments

Co-Investigator AI — Research paper inspiring the multi-agent architecture
IBM AML Dataset — Synthetic transaction data for development and evaluation
FinCEN SAR Guidelines — Regulatory framework for narrative structure
