A multi-scale conversation analysis platform for Google Meet transcripts with AI-powered insights, cognitive bias detection, and advanced visualization.
Live Conversational Threads transforms conversation transcripts into interactive, multi-scale graph visualizations that reveal both temporal flow and thematic relationships. The application supports Google Meet transcripts with speaker diarization, allowing users to explore conversations at five discrete zoom levels, from individual sentences to narrative arcs, while simultaneously viewing both timeline and contextual network views.
Built with FastAPI (Python backend) and React + Vite (frontend), the platform leverages LLM-powered analysis to detect Simulacra levels, identify cognitive biases, extract implicit frames, and generate comprehensive speaker analytics.
- Key Features
- Demo
- Architecture Overview
- Project Structure
- Prerequisites
- Local Setup (Recommended)
- Running the Application
- Environment Variables
- Database Setup
- API Documentation
- Documentation
- Development Roadmap
- Troubleshooting
- Contributing
- License
Google Meet Transcript Import
- Parse PDF/TXT transcripts with speaker diarization
- Automatic speaker detection and turn segmentation
- Timestamp extraction and duration calculation
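The import step above can be sketched as a minimal parser. This is an illustrative sketch only, not the backend's actual implementation: it assumes a plain-text line layout of `HH:MM:SS Speaker: text`, while the real parser in `lct_python_backend` also handles PDF extraction and other Google Meet layouts.

```python
import re
from dataclasses import dataclass

# Hypothetical TXT layout: "00:00:03 Alice: Let's review the roadmap."
# Only this simple shape is covered here.
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2})\s+(.+?):\s+(.*)$")

@dataclass
class Utterance:
    timestamp: str
    speaker: str
    text: str

def parse_transcript(raw: str) -> list[Utterance]:
    """Extract (timestamp, speaker, text) triples from a plain-text transcript."""
    utterances = []
    for line in raw.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            utterances.append(Utterance(*m.groups()))
    return utterances

sample = """00:00:03 Alice: Let's review the roadmap.
00:00:09 Bob: Agreed, starting with parsing."""
for u in parse_transcript(sample):
    print(u.timestamp, u.speaker)
```

The lazy `(.+?):` group stops at the first colon followed by whitespace, so colons inside the utterance text do not split the speaker name.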
Dual-View Visualization
- Timeline View (15%): Linear temporal progression of conversation
- Contextual Network View (85%): Thematic clustering and idea relationships
- Synchronized navigation and selection across views
- Resizable split with user-customizable proportions
5-Level Zoom System
- Level 1 (Sentence): Individual utterances and speaker turns
- Level 2 (Turn): Aggregated speaker contributions
- Level 3 (Topic): Semantic topic segments
- Level 4 (Theme): Major thematic clusters
- Level 5 (Arc): Narrative arcs and conversation structure
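The lowest coarse-graining step, Level 1 to Level 2, is purely mechanical: consecutive utterances by the same speaker collapse into a single turn. The sketch below shows that step under assumed dict keys (`speaker`, `text`); the higher levels (topic, theme, arc) come from LLM clustering and are not shown.

```python
from itertools import groupby

def utterances_to_turns(utterances: list[dict]) -> list[dict]:
    """Collapse consecutive same-speaker utterances (Level 1) into turns (Level 2)."""
    turns = []
    # itertools.groupby groups *consecutive* items with equal keys,
    # which is exactly the turn-segmentation behavior we want.
    for speaker, group in groupby(utterances, key=lambda u: u["speaker"]):
        texts = [u["text"] for u in group]
        turns.append({"speaker": speaker, "text": " ".join(texts)})
    return turns

level1 = [
    {"speaker": "Alice", "text": "First point."},
    {"speaker": "Alice", "text": "Second point."},
    {"speaker": "Bob", "text": "A reply."},
]
print(utterances_to_turns(level1))
```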
Advanced AI Analysis
- Simulacra Level Detection: Classify utterances by communication intent (Levels 1-4)
- Cognitive Bias Detection: Identify 25+ types of biases and logical fallacies
- Implicit Frame Analysis: Uncover hidden worldviews and normative assumptions
- Speaker Analytics: Role detection, time distribution, topic dominance
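Since model output is not guaranteed to be well-formed, analysis results need defensive parsing. The sketch below assumes the analysis prompt asks the model for a JSON object like `{"simulacra_level": 2, "biases": [...]}`; the field names are illustrative, not the project's real schema.

```python
import json

def parse_analysis(raw: str) -> dict:
    """Parse an LLM analysis response, falling back to a neutral result."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = None
    if not isinstance(data, dict):
        return {"simulacra_level": None, "biases": []}
    level = data.get("simulacra_level")
    if level not in (1, 2, 3, 4):  # the Simulacra framework defines levels 1-4
        level = None
    return {"simulacra_level": level, "biases": list(data.get("biases", []))}

print(parse_analysis('{"simulacra_level": 3, "biases": ["strawman"]}'))
print(parse_analysis("not json at all"))
```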
Customizable AI Prompts
- Externalized prompts in JSON configuration
- User-editable via Settings UI
- A/B testing support for prompt variations
- Version history and rollback capability
- Performance metrics per prompt (cost, latency, accuracy)
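Externalized prompts boil down to loading a JSON configuration and filling in template parameters at call time. A minimal sketch, assuming a hypothetical `prompts.json` shape (the real file's schema may differ):

```python
import json

# Illustrative prompts.json content; field names are assumptions.
PROMPTS_JSON = """
{
  "bias_detection": {
    "version": 2,
    "template": "Identify cognitive biases in: {utterance}"
  }
}
"""

def render_prompt(config: dict, name: str, **params: str) -> str:
    """Look up a prompt template by name and fill in its parameters."""
    entry = config[name]
    return entry["template"].format(**params)

config = json.loads(PROMPTS_JSON)
print(render_prompt(config, "bias_detection", utterance="Everyone knows X."))
```

Because templates live in data rather than code, the Settings UI can edit them, version them, and A/B test variants without a redeploy.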
Cost Tracking & Instrumentation
- Real-time LLM API cost tracking
- Latency monitoring (p50, p95, p99)
- Token usage analytics by feature
- Cost per conversation dashboards
- Automated alerts for threshold breaches
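The roll-up behind those dashboards can be sketched as a percentile/cost aggregation over logged LLM calls (the `api_calls_log` table). This sketch uses nearest-rank percentiles; real dashboards may interpolate differently, and the record fields here are illustrative.

```python
def percentile(values: list[float], p: float) -> float:
    """Nearest-rank percentile (p in 0..100) over a list of samples."""
    ordered = sorted(values)
    k = max(0, min(len(ordered) - 1, round(p / 100 * (len(ordered) - 1))))
    return ordered[k]

calls = [
    {"latency_ms": 420, "cost_usd": 0.0031},
    {"latency_ms": 610, "cost_usd": 0.0044},
    {"latency_ms": 380, "cost_usd": 0.0028},
    {"latency_ms": 1900, "cost_usd": 0.0120},
]
lat = [c["latency_ms"] for c in calls]
summary = {
    "p50": percentile(lat, 50),
    "p95": percentile(lat, 95),
    "total_cost": round(sum(c["cost_usd"] for c in calls), 4),
}
print(summary)
```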
Edit Mode & Training Data Export
- Manual correction of AI-generated nodes/edges
- All edits logged for future model training
- Export formats: JSONL (fine-tuning), CSV (analysis), Markdown (review)
- Feedback annotation for continuous improvement
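The JSONL export is one JSON object per line, typically in a prompt/completion-style layout for fine-tuning. A hedged sketch with illustrative field names (the backend defines the actual export schema):

```python
import json

def edits_to_jsonl(edits: list[dict]) -> str:
    """Serialize logged human corrections as fine-tuning JSONL."""
    lines = []
    for e in edits:
        record = {
            "input": e["ai_output"],    # what the model originally produced
            "output": e["corrected"],   # the human-corrected version
            "feature": e.get("feature", "graph"),
        }
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines)

edits = [
    {"ai_output": "node: Budget", "corrected": "node: Q3 Budget", "feature": "nodes"},
    {"ai_output": "edge: A->B", "corrected": "edge: A->C"},
]
print(edits_to_jsonl(edits))
```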
Note: The video reflects an earlier version of the application; the current version includes the dual-view architecture, zoom levels, and advanced analysis features.
```
┌─────────────────────────────────────────────────────────────┐
│                   React Frontend (Vite)                     │
│  ┌──────────────┐  ┌─────────────────────────────────────┐  │
│  │ Timeline View│  │     Contextual Network View         │  │
│  │ (15% height) │  │         (85% height)                │  │
│  └──────────────┘  └─────────────────────────────────────┘  │
└──────────────────────────────┬──────────────────────────────┘
                               │ REST API
┌──────────────────────────────┴──────────────────────────────┐
│                      FastAPI Backend                        │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────┐   │
│  │ Parsers      │  │ AI Services  │  │ Instrumentation  │   │
│  │ - Google Meet│  │ - Clustering │  │ - Cost Tracking  │   │
│  │              │  │ - Bias Det.  │  │ - Metrics        │   │
│  └──────────────┘  └──────────────┘  └──────────────────┘   │
└──────────────────────────────┬──────────────────────────────┘
                               │
          ┌────────────────────┼────────────────────┐
          │                    │                    │
    ┌─────▼─────┐      ┌───────▼──────┐      ┌──────▼───────┐
    │PostgreSQL │      │  OpenAI API  │      │ GCS Storage  │
    │ Database  │      │  Anthropic   │      │(Transcripts) │
    └───────────┘      └──────────────┘      └──────────────┘
```
- Import: User uploads Google Meet transcript (PDF/TXT)
- Parsing: Backend extracts speakers, utterances, timestamps
- AI Analysis: LLM generates nodes, edges, clusters (via prompts.json)
- Storage: Conversation data saved to PostgreSQL, files to GCS
- Visualization: Frontend fetches graph data, renders dual-view
- Interaction: User explores zoom levels, selects nodes, views analytics
- Editing: User corrections logged to the `edits_log` table
- Export: Training data exported in JSONL format for fine-tuning
```
live_conversational_threads/
├── lct_python_backend/        # FastAPI backend
│   ├── backend.py             # App shell + router mounting
│   ├── *_api.py               # Router modules (import, stt, llm, graph, etc.)
│   ├── services/              # Processing, LLM/STT, persistence helpers
│   ├── alembic/               # Database migrations
│   ├── tests/                 # Unit + integration coverage
│   └── prompts.json           # Prompt configuration
├── lct_app/                   # React frontend (JSX)
│   ├── src/pages/             # Route-level screens
│   ├── src/components/        # Graph/audio/settings UI
│   ├── src/services/          # API clients
│   ├── package.json
│   └── vite.config.js
├── docs/                      # ADRs, plans, runbooks
├── setup-once.command         # First-time setup
├── start.command              # Daily startup
├── AGENTS.md                  # Operating protocol
└── README.md
```
- Python 3.9+ (with `venv` or Conda)
- Node.js 18+ and npm 9+
- PostgreSQL 15+ (or Docker via `docker compose up -d`)
- Optional API keys (depend on provider mode):
  - Local mode: none required
  - Online LLMs: `GEMINI_KEY`/`GEMINI_API_KEY`/`GOOGLEAI_API_KEY`, `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `OPENROUTER_API_KEY`, `PERPLEXITY_API_KEY`
  - Cloud persistence (optional): `GCS_BUCKET_NAME`, `GOOGLE_APPLICATION_CREDENTIALS`
Use the streamlined scripts from the repo root:

```bash
./setup-once.command
```

This installs dependencies, initializes local PostgreSQL (`.postgres_data`, port 5433), prepares `lct_python_backend/.env`, and runs migrations.

```bash
./start.command
```

This performs a clean start, runs migrations, then starts backend + frontend with prefixed terminal logs.

`start.command` now attempts shared Parakeet STT autostart by default (`STT_AUTOSTART=1`) and uses backend-owned STT routing. It also checks local LLM reachability at `${LOCAL_LLM_BASE_URL:-http://100.81.65.74:1234}/v1/models` during startup.

To disable STT autostart for a run:

```bash
STT_AUTOSTART=0 ./start.command
```

By default this reuses the sibling Parakeet repo/container and the shared Docker volume `parakeet-models`.

See `docs/LOCAL_SETUP.md` for detailed setup behavior and troubleshooting.

To run the application:

```bash
./start.command
```

- Backend API docs: http://localhost:8000/docs
- Backend health: http://localhost:8000/api/import/health
- Frontend: http://localhost:5173
- Navigate to http://localhost:5173
- Click "Import Transcript" button
- Upload a Google Meet transcript (PDF or TXT format)
- Wait for AI-powered graph generation (~30-60 seconds)
- Explore the conversation using dual-view interface!
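Since graph generation takes roughly 30-60 seconds of server-side AI processing, a client can poll until the graph is ready. The sketch below injects the fetch function so it stays testable; a real client would wrap something like `requests.get(f"{base}/conversations/{conversation_id}")`, and the `graph_ready` flag is a hypothetical field, not the API's confirmed response shape.

```python
import time

def wait_for_graph(fetch_status, timeout_s: float = 90.0, interval_s: float = 2.0):
    """Poll `fetch_status()` until it reports a ready graph or the deadline passes."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = fetch_status()
        if status.get("graph_ready"):  # hypothetical readiness flag
            return status
        time.sleep(interval_s)
    raise TimeoutError("graph generation did not finish in time")

# Stubbed fetcher: reports ready on the third poll.
responses = iter([
    {"graph_ready": False},
    {"graph_ready": False},
    {"graph_ready": True, "nodes": 42},
])
result = wait_for_graph(lambda: next(responses), interval_s=0.0)
print(result)
```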
| Variable | Description | Example |
|---|---|---|
| `DATABASE_URL` | PostgreSQL connection string | `postgresql://lct_user:lct_password@localhost:5433/lct_dev` |
| `DEFAULT_LLM_MODE` | Local/online default mode | `local` |
| `LOCAL_LLM_BASE_URL` | Local LLM endpoint | `http://100.81.65.74:1234` |
| Variable | Description | Default |
|---|---|---|
| `OPENAI_API_KEY` | OpenAI key (online mode) | unset |
| `ANTHROPIC_API_KEY` | Anthropic key (online mode) | unset |
| `GEMINI_KEY` / `GEMINI_API_KEY` / `GOOGLEAI_API_KEY` | Gemini key aliases (online mode) | unset |
| `OPENROUTER_API_KEY` | OpenRouter key (online mode) | unset |
| `PERPLEXITY_API_KEY` | Perplexity key (fact-checking) | unset |
| `GCS_BUCKET_NAME` | Cloud bucket for conversation JSON | unset |
| `GCS_FOLDER` | Cloud folder path | unset |
| `GOOGLE_APPLICATION_CREDENTIALS` | ADC/service account path | unset |
| `LOG_LEVEL` | Logging level | `INFO` |
| `TRACE_API_CALLS` | Backend outbound call tracing | `true` |
| `API_LOG_PREVIEW_CHARS` | Trace preview truncation length | `280` |
| Variable | Description | Default |
|---|---|---|
| `VITE_BACKEND_API_URL` | Backend API base URL (service clients) | `http://localhost:8000` |
| `VITE_AUTH_TOKEN` | Optional bearer token for protected backends | unset |
| `VITE_API_TRACE` | Frontend request/response console tracing | dev-mode on |
Use `./setup-once.command` for first-time setup and `./start.command` for daily runs. These scripts initialize local Postgres (default `localhost:5433`) and run Alembic migrations automatically.

From `lct_python_backend/`:

```bash
alembic upgrade head
```

The default local connection in this repo is:

```
postgresql://lct_user:lct_password@localhost:5433/lct_dev
```

The schema is migration-driven (`alembic upgrade head`) and evolves over time. Current core entities include: `conversations`, `utterances`, `nodes`, `relationships`, `transcript_events`, `app_settings`, `bookmarks`, `fact_checks`, `api_calls_log`.

For field-level details, see:

- ORM models: `lct_python_backend/models.py`
- Migrations: `lct_python_backend/alembic/versions/`
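Before running migrations, it can help to sanity-check that `DATABASE_URL` points where you expect. A small sketch using only stdlib URL parsing (no connection is made):

```python
from urllib.parse import urlsplit

def describe_db_url(url: str) -> dict:
    """Break a PostgreSQL connection string into its components."""
    parts = urlsplit(url)
    return {
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }

print(describe_db_url("postgresql://lct_user:lct_password@localhost:5433/lct_dev"))
```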
Once the backend server is running:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
```
GET  /api/import/health
POST /api/import/google-meet
POST /api/import/from-text
GET  /conversations/{conversation_id}
POST /save_json/
GET  /api/settings/stt
PUT  /api/settings/stt
GET  /api/settings/llm
PUT  /api/settings/llm
GET  /api/settings/llm/models
GET  /api/graph/health
POST /api/graph/generate
WS   /ws/transcripts
```
| Document | Description |
|---|---|
| ROADMAP.md | 14-week implementation plan with instrumentation, metrics, storage, and testing strategies |
| TIER_1_DECISIONS.md | Foundational architectural decisions (Google Meet format, zoom levels, dual-view, prompts) |
| TIER_2_FEATURES.md | Detailed specifications for 6 major features (Node Detail Panel, Speaker Analytics, Prompts Config, etc.) |
| FEATURE_SIMULACRA_LEVELS.md | Simulacra level detection, cognitive bias analysis, implicit frames, rhetorical profiling |
| DATA_MODEL_V2.md | Complete database schema with all tables, indexes, and relationships |
| PRODUCT_VISION.md | High-level product strategy and user personas |
| FEATURE_ROADMAP.md | ROI analysis and feature prioritization |
| ADR | Title | Status |
|---|---|---|
| ADR-001 | Google Meet Transcript Support | Proposed |
| ADR-002 | Hierarchical Coarse-Graining for Multi-Scale Visualization | Proposed |
| ADR-003 | Observability, Metrics, and Storage Baseline | Proposed |
| ADR-004 | Dual-View Architecture (Timeline + Contextual Network) | Approved |
| ADR-005 | Externalized Prompts Configuration System | Approved |
| ADR-006 | Testing Strategy & Quality Assurance | Proposed |
| ADR-007 | System Invariants & Data Integrity | Proposed |
| ADR-008 | Local STT & Append-Only Transcript Events | Approved |
| ADR-009 | Local-First LLM Defaults | Proposed |
| ADR-010 | Minimal Conversation Schema for Pause/Resume and Thread Legibility | Proposed |
| ADR-011 | Minimal Live Conversation UI Redesign | Draft |
| ADR-012 | Real-Time Speaker Diarization Sidecar for Local Speech-to-Graph | Proposed |
See docs/adr/INDEX.md for the complete ADR index.
- ✅ Database schema migration (DATA_MODEL_V2)
- ✅ Instrumentation & cost tracking
- 🚧 Google Meet transcript parser
- 🚧 Initial graph generation with prompt engineering
- 📋 Dual-view architecture (Timeline + Contextual)
- 📋 5-level zoom system
- 📋 Node detail panel with editing
- 📋 Speaker analytics view
- 📋 Prompts configuration UI
- 📋 Edit history & training data export
- 📋 Simulacra level detection
- 📋 Cognitive bias detection (25 types)
- 📋 Implicit frame analysis
- 📋 Final integration & polish

Legend:
- ✅ Completed
- 🚧 In Progress
- 📋 Planned
See docs/ROADMAP.md for detailed sprint-by-sprint breakdown.
Database connection errors:

```bash
# Check PostgreSQL is running
pg_ctl status

# Test connection
psql -U your_user -d lct_db
```

LLM API errors:

```bash
# Verify API keys are set
echo $OPENAI_API_KEY
echo $ANTHROPIC_API_KEY

# Check API key validity
curl https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY"
```

Import errors:

```bash
# Reinstall dependencies
pip install --force-reinstall -r requirements.txt

# Check Python version (must be 3.9+)
python --version
```

Port conflicts:

```bash
# Kill process on port 5173
lsof -ti:5173 | xargs kill -9

# Or use a different port
npm run dev -- --port 3000
```

CORS errors:

- Backend is configured to allow `http://localhost:5173`
- If using a different port, update CORS settings in `backend.py`

Build errors:

```bash
# Clear cache and reinstall
rm -rf node_modules package-lock.json
npm install
```

Slow graph generation:

- Check the `api_calls_log` table for high latency
- Consider using GPT-3.5-turbo for cheaper/faster clustering
- Reduce `max_tokens` in `prompts.json`

High LLM costs:

- Check the `/api/cost-tracking/stats` endpoint
- Review `prompts.json` for token-heavy templates
- Enable prompt caching (coming in Week 9)
We welcome contributions! Please follow these guidelines:

- Create a feature branch from `main`:

  ```bash
  git checkout -b feature/your-feature-name
  ```

- Follow the commit message format (see `.claude/CLAUDE.md`):

  ```
  [TYPE]: Brief summary (50 chars max)

  MOTIVATION:
  - Why this change was needed

  APPROACH:
  - How the solution works

  CHANGES:
  - file1.py: Specific changes made

  IMPACT:
  - What functionality is added/changed

  TESTING:
  - How to verify the changes work
  ```

- Write tests:
  - Unit tests: `pytest tests/unit/test_your_feature.py`
  - Integration tests: `pytest tests/integration/`
  - Maintain 85%+ coverage

- Run linters:

  ```bash
  # Python
  black .
  flake8 .
  mypy .

  # Frontend
  npm run lint
  ```

- Create a Pull Request to `main`:
  - Fill out the PR template
  - Link related issues
  - Request review from maintainers

- No direct commits to `main`; all changes go via PR
- Test coverage: 85%+ for new code
- Documentation: Update relevant `docs/` files
- ADRs: Create an ADR for significant architectural decisions
- Prompts: Externalize new LLM prompts to `prompts.json`
Python:
- Black formatter (line length 100)
- Type hints for all functions
- Docstrings (Google style)
TypeScript:
- Prettier formatter
- ESLint rules enforced
- Prefer functional components with hooks
This project is licensed under the GNU General Public License v3.0 (GPLv3).
You are free to use, modify, and distribute this software under the terms of the GPLv3, which ensures that derivative works remain open source.
Key Points:
- ✅ Use freely for personal, academic, or open-source projects
- ✅ Modify and distribute under GPLv3 terms
- ❌ Cannot use in proprietary/closed-source software without a commercial license
If you would like to use this software in a closed-source or commercial product, or if you need a commercial license without the GPL's copyleft requirements, please contact:
- Email: adityaadiga6@gmail.com
- GitHub: https://github.com/aditya-adiga

- Maintainer: Aditya Adiga (adityaadiga6@gmail.com, GitHub: @aditya-adiga)
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Zvi Mowshowitz – Simulacra Levels framework
- LessWrong Community – Cognitive bias taxonomies
- OpenAI & Anthropic – LLM APIs powering analysis
- React Flow – Graph visualization library
- FastAPI – Python web framework
Last Updated: 2026-02-13 Version: 2.1.0 (Local STT, local-first LLM, security hardening)