
Split text-and-speech into separate translation and speech stages #243

Open
elasticsounds wants to merge 3 commits into main from elasticsounds/split-translate-tts


@elasticsounds
Contributor

Summary

  • Splits the monolithic text-and-speech pipeline stage into two independent DAG stages: translation (text-catalog + catalog-translation) and speech (tts), allowing each to be re-run independently
  • Adds a DELETE /books/:label/stages/:stageName API endpoint for stage-level reset, enabling the UI to clear and re-run individual stages
  • Creates new SpeechSettings, SpeechView, and TranslationSpeechView components, while slimming down TranslationsSettings and TranslationsView to translation-only concerns
  • Expands supported languages from ~20 to 80+ and improves LanguagePicker UX (skips country selection for languages without regional variants, shows book language badge)
  • Updates all pipeline references, effects, tests, i18n catalogs (en/es/pt-BR), and architecture docs
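As an illustration of the stage-level reset described above, here is a minimal, framework-agnostic sketch of how a `DELETE /books/:label/stages/:stageName` request might be handled. Only the route shape and the stage names `translation` and `speech` come from this PR; `handleDeleteStage`, `StageStore`, the in-memory storage, and the status codes are hypothetical, and the real pipeline likely has more stages than the two listed here.

```typescript
// Hypothetical sketch: these names are illustrative, not the PR's actual code.
type StageName = "translation" | "speech";
const STAGE_NAMES: readonly StageName[] = ["translation", "speech"];

interface ResetResult {
  status: number;
  body: { ok: boolean; error?: string };
}

// In-memory stand-in for real storage: book label -> completed stages.
type StageStore = Map<string, Set<StageName>>;

// Matches /books/:label/stages/:stageName
const ROUTE = /^\/books\/([^/]+)\/stages\/([^/]+)$/;

function handleDeleteStage(path: string, store: StageStore): ResetResult {
  const match = ROUTE.exec(path);
  if (!match) {
    return { status: 404, body: { ok: false, error: "route not found" } };
  }
  const label = decodeURIComponent(match[1]);
  const stageName = decodeURIComponent(match[2]);

  // Reject stage names the pipeline does not know about.
  if (!STAGE_NAMES.includes(stageName as StageName)) {
    return { status: 400, body: { ok: false, error: `unknown stage: ${stageName}` } };
  }
  const completed = store.get(label);
  if (!completed) {
    return { status: 404, body: { ok: false, error: `unknown book: ${label}` } };
  }

  // Clear only the requested stage so it can be re-run on its own.
  completed.delete(stageName as StageName);
  return { status: 200, body: { ok: true } };
}
```

In this sketch, resetting `speech` leaves `translation` marked as completed, which matches the goal of re-running audio generation without redoing translations. Whether the real endpoint also clears dependent stages is not stated in the PR.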

Test plan

  • Verify translation stage runs text-catalog and catalog-translation steps independently
  • Verify speech stage runs TTS and correctly reads text catalog from storage
  • Test stage-level reset via the new DELETE endpoint
  • Confirm the combined "Text & Speech" overview card shows correct state
  • Test language picker with new languages and regional variant flow
  • Run pnpm test and pnpm typecheck
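For the regional variant flow in the test plan, the following sketch shows one way the picker could decide whether to show the country step. The `LanguageOption` shape, `nextStep`, `withCountry`, and the sample entries are all assumptions made for illustration; only the `pt-BR` locale format is taken from the i18n catalogs named in this PR.

```typescript
// Hypothetical data shape: the PR does not show the real language list format.
interface LanguageOption {
  code: string;         // e.g. "pt"
  name: string;
  countries?: string[]; // regional variants, if any
}

type PickerStep =
  | { kind: "pick-country"; language: string; countries: string[] }
  | { kind: "done"; locale: string };

// Decide what the picker shows after a language is chosen.
function nextStep(lang: LanguageOption): PickerStep {
  const countries = lang.countries ?? [];
  if (countries.length === 0) {
    // No regional variants: skip country selection entirely.
    return { kind: "done", locale: lang.code };
  }
  return { kind: "pick-country", language: lang.code, countries };
}

// Combine a language and a chosen country into a locale tag such as "pt-BR".
function withCountry(language: string, country: string): string {
  return `${language}-${country}`;
}

// Illustrative entries only.
const portuguese: LanguageOption = { code: "pt", name: "Portuguese", countries: ["BR", "PT"] };
const esperanto: LanguageOption = { code: "eo", name: "Esperanto" };
```

A test for the flow could then assert that `nextStep(esperanto)` finishes immediately while `nextStep(portuguese)` asks for a country first.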

Separate the monolithic text-and-speech pipeline stage into two independent
DAG stages: translation (text-catalog + catalog-translation) and speech (tts).
This lets users re-run audio generation without re-running translations and
vice versa.

- Add speech stage to PIPELINE with dependsOn: ["translation"]
- Split runTextAndSpeechStep into runTranslationStep and runSpeechStep
- Add DELETE /books/:label/stages/:stageName endpoint for stage-level reset
- Create SpeechSettings and SpeechView components for the new stage
- Add TranslationSpeechView for combined translation+audio browsing
- Expand supported languages list (~20 → 80+) and improve LanguagePicker UX
- Update all pipeline references, tests, i18n catalogs, and docs
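The dependency between the two stages can be sketched as a small DAG. The `dependsOn: ["translation"]` entry and the step names come from the commit message above; the `StageDef` shape, the empty dependency list for `translation`, and the `stagesToRun` helper are assumptions for illustration, not the actual `PIPELINE` definition.

```typescript
// Hypothetical shape: the real PIPELINE definition is not shown in this PR.
interface StageDef {
  name: string;
  dependsOn: string[];
  steps: string[];
}

const PIPELINE: StageDef[] = [
  // Assumption: upstream dependencies of "translation" are omitted here.
  { name: "translation", dependsOn: [], steps: ["text-catalog", "catalog-translation"] },
  { name: "speech", dependsOn: ["translation"], steps: ["tts"] },
];

// Return the stages that must run, in order, to (re)produce `target`,
// skipping anything already in `completed`.
function stagesToRun(
  target: string,
  completed: ReadonlySet<string>,
  pipeline: StageDef[] = PIPELINE,
): string[] {
  const byName = new Map(pipeline.map((s) => [s.name, s] as const));
  const order: string[] = [];
  const visiting = new Set<string>();

  const visit = (name: string): void => {
    if (completed.has(name) || order.includes(name)) return;
    const stage = byName.get(name);
    if (!stage) throw new Error(`unknown stage: ${name}`);
    if (visiting.has(name)) throw new Error(`dependency cycle at: ${name}`);
    visiting.add(name);
    stage.dependsOn.forEach(visit); // dependencies run first
    visiting.delete(name);
    order.push(name);
  };

  visit(target);
  return order;
}
```

With translation already complete, `stagesToRun("speech", new Set(["translation"]))` yields only the speech stage, which is the independent re-run the split is meant to allow.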
…view

Replace the combined TranslationSpeechView with a tabbed container
(TranslationStageView) that shows Translation and Speech as separate tabs.

The Speech tab includes:
- Expanded waveform display (60 bars, eager loading, seek support)
- Both base language and translated text shown per entry
- Per-item regeneration menu (three-dot dropdown)
- Word-level timecodes placeholder for future highlighting support
- Catalog type filters (text, captions, activities, answers, glossary, quizzes)
- Speech control panel with provider info and regenerate/clear actions
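To make the 60-bar waveform with seek support concrete, here is a rough sketch of reducing decoded audio samples to a fixed number of peak bars and mapping a clicked bar back to a playback time. The function names, the peak-per-bucket approach, and the `Float32Array` input are assumptions; the PR does not describe how the component actually computes its bars.

```typescript
// Hypothetical helpers: not taken from the PR's SpeechView implementation.

// Reduce raw samples (values in [-1, 1]) to `barCount` peak amplitudes.
function toWaveformBars(samples: Float32Array, barCount = 60): number[] {
  const bars: number[] = [];
  for (let i = 0; i < barCount; i++) {
    // Each bar covers an equal slice of the sample buffer.
    const start = Math.floor((i * samples.length) / barCount);
    const end = Math.floor(((i + 1) * samples.length) / barCount);
    let peak = 0;
    for (let j = start; j < end; j++) {
      peak = Math.max(peak, Math.abs(samples[j] ?? 0));
    }
    bars.push(peak); // stays 0 when the slice is empty
  }
  return bars;
}

// Map a clicked bar index to a seek position in seconds.
function barToSeekTime(barIndex: number, durationSec: number, barCount = 60): number {
  return (barIndex / barCount) * durationSec;
}
```

Clicking bar 30 of a 10-second clip would seek to `barToSeekTime(30, 10)`, the midpoint of the audio.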

Also fixes missing Nepali country data in the language list.

Add back the styled Translation and Speech info cards (with language
summary, provider/voice details, and action buttons) that show when
speech hasn't been generated yet. Restore the side-by-side translation
(pink) and speech (rose) control panels above the entry list.
