Structured Outputs + Playground + Better Transcription

@ymrohit released this 05 Jan 04:59

Changelog:

Added: structured output dataclasses with JSON schema export, plus analysis_result.schema.json and a CLI --schema option.
Added: new openscenesense-ollama CLI with caching, structured output, retry/backoff, and prompt overrides.
Added: Playground demo script for deep tuning and debugging; expanded tests and CI coverage.
Changed: default models are now ministral-3:latest for vision/summaries and openai/whisper-small for transcription.
Changed: transcription now uses ffmpeg extraction with configurable segment length/beam/temperature and optional repetition collapse.
Fixed: frame selection edge cases, more accurate duration calculation, and retry handling for Ollama timeouts; context prompting now avoids meta responses.
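
For readers unfamiliar with two of the ideas named above, retry/backoff on timeouts and repetition collapse in transcripts, here is a minimal Python sketch of each. It is not the project's code: the function names `retry_with_backoff` and `collapse_repetitions`, their parameters, and the choice to retry only on `TimeoutError` are all assumptions made for illustration, and the real implementation may differ.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Hypothetical helper: run `call`, retrying on TimeoutError.

    The delay doubles after each failed attempt (exponential backoff).
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except TimeoutError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last timeout
            sleep(base_delay * (2 ** attempt))  # base, 2*base, 4*base, ...
    raise ValueError("max_attempts must be at least 1")


def collapse_repetitions(segments: List[str]) -> List[str]:
    """Hypothetical helper: drop a segment that repeats the one before it.

    Comparison ignores case and surrounding whitespace.
    """
    collapsed: List[str] = []
    for segment in segments:
        text = segment.strip()
        if collapsed and collapsed[-1].lower() == text.lower():
            continue  # consecutive duplicate, skip it
        collapsed.append(text)
    return collapsed
```

Passing `sleep` as a parameter keeps the retry helper testable without real waiting. The repetition collapse only removes consecutive duplicates, so a phrase that legitimately recurs later in a transcript is kept.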