Releases: jvogan/biovoice
v0.1.0 — First public release
BioVoice — Talk to your protein structures. Voice control for PyMOL, ChimeraX, AlphaFold, and Rosetta built on the OpenAI Realtime API with structured tool calling. The backend and your molecular files run on your machine; the voice layer streams to OpenAI.
Highlights
- 7 Realtime function tools wired to a WebRTC voice session: `run_pymol_actions`, `run_chimerax_actions`, `run_scientific_workflow`, `get_target_state`, `run_recipe_step`, `export_artifact`, `capture_view`
- 9 task-level AlphaFold and Rosetta workflows exposed behind a single `run_scientific_workflow` tool and compiled into target-specific adapter calls
- Production-grade JSON Schema selectors with chain-aware residue ranges, ligand / cofactor handles, proximity selections, and semantic reference handles like `predictedModel`, `scaffoldChainA`, `binderChainA`
- Two-adapter pattern: PyMOL over XML-RPC and ChimeraX over REST, driven by the same typed action schema
- Offline rehearsal mode — explore the full tool surface without an OpenAI key or a live mic
- Conservative Realtime guardrails: idle disconnect, session-duration cap, response/transcription caps, billable-token cap with pre-disconnect warning, concurrent-session cap, and a default raw-command gate
- Comprehensive docs: Getting Started, First Live Session, How Tool Calling Works, AlphaFold / Rosetta / ligand-pocket / cryo-EM tutorials, architecture, and FAQ
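
As a rough illustration of the two-adapter pattern listed above, the sketch below compiles one typed action into a command string for each target. It is not BioVoice's actual schema or adapter code: `Selection`, `ColorAction`, `to_pymol`, `to_chimerax`, and `compile_action` are hypothetical names chosen for this example, and only the PyMOL and ChimeraX command syntax is taken from those tools. No network calls are made here.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Selection:
    """Hypothetical chain-aware residue range (not the real BioVoice schema)."""
    chain: str
    start: int
    end: int


@dataclass(frozen=True)
class ColorAction:
    """Hypothetical typed action: color one selection."""
    selection: Selection
    color: str


def to_pymol(action: ColorAction) -> str:
    # PyMOL selection algebra, e.g. "chain A and resi 10-20"
    s = action.selection
    return f"color {action.color}, chain {s.chain} and resi {s.start}-{s.end}"


def to_chimerax(action: ColorAction) -> str:
    # ChimeraX atom specifier, e.g. "/A:10-20"
    s = action.selection
    return f"color /{s.chain}:{s.start}-{s.end} {action.color}"


# One adapter per target; both consume the same typed action.
ADAPTERS: Dict[str, Callable[[ColorAction], str]] = {
    "pymol": to_pymol,
    "chimerax": to_chimerax,
}


def compile_action(target: str, action: ColorAction) -> str:
    """Compile a typed action into a target-specific command string."""
    try:
        return ADAPTERS[target](action)
    except KeyError:
        raise ValueError(f"unknown target: {target!r}") from None


if __name__ == "__main__":
    action = ColorAction(Selection(chain="A", start=10, end=20), color="red")
    for target in ADAPTERS:
        print(target, "->", compile_action(target, action))
```

In a setup like the one described, the compiled strings would then be handed to a transport layer, for example PyMOL's XML-RPC server or ChimeraX's REST interface. Keeping the compile step free of I/O also makes it easy to exercise without either program running, which fits the offline rehearsal mode mentioned above.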
Notes
- Live voice support today uses OpenAI Realtime only. Anthropic live voice, Gemini live voice, and an on-device speech stack are not implemented yet
- macOS is the best-supported autolaunch path; Linux and Windows can run the backend and UI but require starting PyMOL / ChimeraX manually
- Molecular files stay on your machine; live voice sends audio, transcripts, and tool-call text to OpenAI
Full changelog: CHANGELOG.md.