SF Intelligence Platform - FastAPI backend with RAG, FAISS, vLLM

bledden/sf-intelligence-backend

SF GPT - San Francisco Government AI Assistant

AI-powered assistant for querying San Francisco government data, built for NVIDIA Spark Hack 2026.

Overview

SF GPT provides natural language access to San Francisco's 160,000+ government datasets from data.sfgov.org. Using NVIDIA AI models running locally on Dell GB10 (DGX Spark), it enables:

  1. Natural Language Queries - Ask questions about any SF government data
  2. Semantic Search - Find relevant information across all datasets
  3. Voice Input - Speak your questions using ASR
  4. SQL Generation - Generate SQL queries from natural language
  5. RAG Pipeline - Retrieval-augmented generation for accurate answers
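The RAG flow in item 5 can be sketched end to end: retrieve relevant context, build a prompt around it, and generate an answer. Everything below is an illustrative stub — the real pipeline lives in backend/ml/rag.py, and retrieve and generate here are hypothetical stand-ins for the FAISS search and the Nemotron call.

```python
def retrieve(question: str, corpus: list[str], top_k: int = 3) -> list[str]:
    # Stand-in for the FAISS semantic search: naive keyword-overlap scoring.
    words = set(question.lower().split())
    scored = sorted(corpus, key=lambda d: -len(words & set(d.lower().split())))
    return scored[:top_k]

def generate(prompt: str) -> str:
    # Stand-in for the Nemotron 3 Nano call; just echoes the prompt's last line.
    return "ANSWER based on: " + prompt.splitlines()[-1]

def answer(question: str, corpus: list[str]) -> str:
    # Retrieval-augmented generation: retrieved context is prepended to the question.
    context = "\n".join(retrieve(question, corpus))
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    return generate(prompt)

docs = ["SF population estimates", "MUNI ridership data", "building permits"]
print(answer("What is the population of SF?", docs))
```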

Models Used (January 2026)

| Model | HuggingFace ID | VRAM | Purpose |
| --- | --- | --- | --- |
| Omni-Embed-Nemotron-3B | nvidia/omni-embed-nemotron-3b | ~10 GB | Universal multimodal embeddings |
| Nemotron 3 Nano | nvidia/nemotron-3-nano | ~8 GB | Reasoning & generation |
| Nemotron Speech ASR | nvidia/nemotron-speech-asr | ~2 GB | Voice transcription |
| PersonaPlex-7B (optional) | nvidia/personaplex-7b-v1 | ~14 GB | Conversational voice |

Total VRAM: ~20 GB for the required models (~34 GB with the optional PersonaPlex-7B); the GB10's 128 GB of unified memory leaves ample headroom.
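The totals can be tallied from the table (figures copied from above; this is only a budgeting sketch, not a runtime memory check):

```python
# Approximate VRAM budget per model, in GB (from the models table)
MODEL_VRAM_GB = {
    "omni-embed-nemotron-3b": 10,
    "nemotron-3-nano": 8,
    "nemotron-speech-asr": 2,
    "personaplex-7b-v1": 14,  # optional conversational voice model
}

def required_vram(include_optional: bool = False) -> int:
    """Sum the VRAM budget, optionally including PersonaPlex-7B."""
    models = ["omni-embed-nemotron-3b", "nemotron-3-nano", "nemotron-speech-asr"]
    if include_optional:
        models.append("personaplex-7b-v1")
    return sum(MODEL_VRAM_GB[m] for m in models)

print(required_vram())      # 20
print(required_vram(True))  # 34
```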

Quick Start

Prerequisites

  • uv - Fast Python package manager
  • bun - Fast JavaScript runtime (for frontend)
  • Python 3.12+
  • CUDA-capable GPU (or run on CPU with reduced performance)

1. Install Dependencies

# Install uv (macOS/Linux)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone the repository and set up
git clone <repo-url>
cd 311ML
uv sync

# Activate virtual environment
source .venv/bin/activate

2. Download Models

# Download required models (embeddings, llm, asr)
uv run python scripts/download_models.py

# Or download specific models
uv run python scripts/download_models.py --models embeddings llm

# Download all including optional voice model
uv run python scripts/download_models.py --models all

3. Start the API

# Start FastAPI server
uv run uvicorn backend.main:app --host 0.0.0.0 --port 8000 --reload

# Or using python directly
uv run python -m backend.main

4. Test the API

# Health check
curl http://localhost:8000/api/health

# Query SF data (requires index to be built)
curl -X POST http://localhost:8000/api/query \
  -H "Content-Type: application/json" \
  -d '{"question": "What is the population of San Francisco?"}'

# Semantic search
curl -X POST http://localhost:8000/api/search \
  -H "Content-Type: application/json" \
  -d '{"query": "crime statistics Mission District"}'
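The same calls can be made from Python using only the standard library. Paths and payloads below are taken from the curl examples above; the helper names are illustrative, not part of the project.

```python
import json
import urllib.request

API = "http://localhost:8000"  # adjust host/port to your deployment

def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for the given API path."""
    return urllib.request.Request(
        API + path,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def post_json(path: str, payload: dict) -> dict:
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(build_request(path, payload)) as resp:
        return json.load(resp)

if __name__ == "__main__":
    print(post_json("/api/query", {"question": "What is the population of San Francisco?"}))
    print(post_json("/api/search", {"query": "crime statistics Mission District"}))
```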

API Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| /api/health | GET | System health and model status |
| /api/health/gpu | GET | Detailed GPU information |
| /api/query | POST | Natural language query with RAG |
| /api/query/sql | POST | Generate SQL from natural language |
| /api/query/examples | GET | Example queries |
| /api/search | POST | Semantic search across datasets |
| /api/search/status | GET | Search index status |
| /api/voice/transcribe | POST | Audio to text transcription |
| /api/voice/query | POST | Voice-based query (ASR + RAG) |
| /api/datasets | GET | List available datasets |
| /api/datasets/{id} | GET | Get dataset details |
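The request shapes for the POST query endpoints can be sketched as typed dictionaries. Only question and query appear in the curl examples; the response fields shown are assumptions — the authoritative Pydantic schemas live in backend/api/schemas.py.

```python
from typing import TypedDict

class QueryRequest(TypedDict):
    question: str  # natural language question for /api/query

class SearchRequest(TypedDict):
    query: str  # free-text search for /api/search

class QueryResponse(TypedDict, total=False):
    # Hypothetical response fields; check backend/api/schemas.py for the real ones.
    answer: str
    sources: list[str]

req: QueryRequest = {"question": "What is the population of San Francisco?"}
print(req["question"])
```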

Project Structure

311ML/
├── pyproject.toml              # Python dependencies (uv)
├── .python-version             # Python version (3.12)
├── backend/
│   ├── __init__.py
│   ├── main.py                 # FastAPI app entry point
│   ├── config.py               # Model & app configuration
│   ├── api/
│   │   ├── schemas.py          # Pydantic models
│   │   └── routes/
│   │       ├── health.py       # Health endpoints
│   │       ├── query.py        # RAG query endpoints
│   │       ├── search.py       # Semantic search
│   │       ├── voice.py        # ASR endpoints
│   │       └── datasets.py     # Dataset management
│   ├── ml/
│   │   ├── embeddings.py       # Omni-Embed + FAISS
│   │   ├── llm.py              # Nemotron LLM
│   │   ├── asr.py              # Speech recognition
│   │   └── rag.py              # RAG pipeline orchestrator
│   └── ingestion/
│       ├── sf_data_client.py   # SF Open Data API client
│       └── document_processor.py # Document chunking
├── scripts/
│   ├── download_models.py      # Model downloader
│   ├── process_data.py         # Data processing
│   └── setup_gb10.sh           # GB10 setup
├── frontend-resident/          # Public-facing app (bun)
├── frontend-dashboard/         # Internal dashboard (bun)
└── data/
    ├── raw/                    # Downloaded datasets
    ├── processed/              # Processed data
    └── embeddings/             # FAISS index
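document_processor.py handles document chunking before embedding. A minimal sliding-window chunker looks like the following — the window and overlap sizes are illustrative, not the project's actual defaults:

```python
def chunk_text(text: str, size: int = 400, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows for embedding.

    Overlap keeps sentences that straddle a boundary retrievable
    from at least one chunk.
    """
    chunks = []
    step = size - overlap
    for start in range(0, max(len(text) - overlap, 1), step):
        chunks.append(text[start:start + size])
    return chunks

sample = "".join(str(i % 10) for i in range(1000))
print(len(chunk_text(sample)))  # 3 windows over 1000 characters
```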

Data Categories

SF GPT can query data across all San Francisco government categories:

  • City Infrastructure - Streets, utilities, facilities
  • Public Safety - Police, fire, emergency services
  • Health and Social Services - Healthcare, social programs
  • Transportation - MUNI, parking, traffic
  • Housing and Buildings - Permits, inspections, housing
  • Economy and Community - Businesses, jobs, events
  • Energy and Environment - Sustainability, parks
  • Geographic Locations - Boundaries, districts, maps

Development

Backend Commands

# Install dependencies
uv sync

# Run tests
uv run pytest

# Format code
uv run ruff format .

# Lint code
uv run ruff check .

Building the Search Index

# Ingest SF Open Data (downloads and processes datasets)
uv run python scripts/ingest_sfdata.py

# Build FAISS embeddings index
uv run python scripts/build_index.py
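What build_index.py produces is an embedding index queried by inner product. The dependency-free sketch below shows that flow with a toy hash-based embedder standing in for Omni-Embed-Nemotron-3B and a brute-force scan standing in for the FAISS index; it only illustrates the mechanics, not the project's code.

```python
import math

def embed(text: str, dim: int = 8) -> list[float]:
    # Toy stand-in for Omni-Embed-Nemotron-3B: hash character bigrams
    # into a fixed-size vector, then L2-normalize it.
    vec = [0.0] * dim
    for a, b in zip(text, text[1:]):
        vec[(ord(a) * 31 + ord(b)) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def search(query: str, corpus: list[str], top_k: int = 2) -> list[str]:
    # Brute-force inner-product search, the same scoring FAISS's
    # IndexFlatIP performs over precomputed embeddings.
    q = embed(query)
    scored = sorted(
        corpus,
        key=lambda doc: -sum(a * b for a, b in zip(q, embed(doc))),
    )
    return scored[:top_k]

docs = ["crime reports by district", "muni arrival times", "crime statistics 2024"]
print(search("crime data", docs))
```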

GB10 Deployment

# SSH into GB10
ssh <username>@<gb10-ip>

# Clone repository
git clone <repo-url>
cd 311ML

# Run setup script
bash scripts/setup_gb10.sh

# Start API
uv run uvicorn backend.main:app --host 0.0.0.0 --port 8000

Hackathon Bounties

NVIDIA Nemotron Track

  • Omni-Embed-Nemotron-3B for universal embeddings
  • Nemotron 3 Nano for reasoning and generation
  • All NVIDIA models running locally on GB10

Arm Architecture Innovation

  • Dell GB10 (DGX Spark) is built on the NVIDIA GB10 Grace Blackwell superchip with an Arm-based CPU
  • 128GB unified memory shared by CPU and GPU, eliminating explicit host-device copies
  • Optimized for edge deployment

Human Impact

  • Access to 160,000+ government datasets via natural language
  • Democratizes access to city data
  • Supports government transparency

Team

| Role | Focus |
| --- | --- |
| Backend/ML Lead | NVIDIA models, RAG pipeline, API |
| Frontend - Resident | Public query interface |
| Frontend - Dashboard | Admin & analytics dashboard |
| Data Engineering | Ingestion, embeddings, infrastructure |

License

MIT License - Built for NVIDIA Spark Hack 2026
