This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
FUTO Voice Input is an Android application that provides speech-to-text functionality through third-party keyboards and generic speech-to-text APIs. It integrates with various speech recognition providers including local Whisper models and remote services like Soniox.
# Build release APK for standalone distribution
./gradlew assembleStandaloneRelease
# Build specific flavor variants
./gradlew assembleDevRelease # Development build with all features
./gradlew assemblePlayStoreRelease # Play Store build without auto-update
./gradlew assembleFDroidRelease # F-Droid build without Google services
./gradlew assembleDevSameIdRelease # Dev build with same app ID as release
# Clean and rebuild
./gradlew clean assembleStandaloneRelease
The project uses Android product flavors for different distribution channels:
- dev/devSameId: Development builds with all payment methods and update checking
- playStore: Play Store builds with only Play Store billing, no auto-update
- standalone: Standalone builds with PayPal billing and auto-update
- fDroid: F-Droid builds with PayPal billing, no auto-update, no Google services
# Run unit tests
./gradlew test
# Run instrumentation tests
./gradlew connectedAndroidTest
AudioRecognizer (Abstract Base Class)
- Handles audio recording via AudioRecord API
- Implements Voice Activity Detection (VAD) using WebRTC GMM model
- Manages audio focus and permissions
- Provides template for different recognition providers
- Location: app/src/main/java/org/futo/voiceinput/AudioRecognizer.kt
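
The "template for different recognition providers" role can be sketched roughly as below. This is not the project's AudioRecognizer: the names SketchRecognizer, transcribe, and isSpeech are hypothetical, and a naive RMS energy threshold stands in for the WebRTC GMM VAD purely to keep the example dependency-free.

```kotlin
import kotlin.math.sqrt

// Hypothetical base class: owns the shared pipeline, defers decoding to subclasses.
abstract class SketchRecognizer(private val energyThreshold: Float = 0.01f) {

    // Provider-specific step (local model, remote service, ...).
    protected abstract fun transcribe(samples: FloatArray): String

    // Stand-in for VAD: root-mean-square energy of one frame against a fixed threshold.
    // The real app uses WebRTC's GMM-based detector instead.
    private fun isSpeech(frame: FloatArray): Boolean {
        if (frame.isEmpty()) return false
        var sum = 0.0
        for (s in frame) sum += s * s
        return sqrt(sum / frame.size) > energyThreshold
    }

    // Template method: drop silent frames, then hand the rest to the provider.
    fun recognize(frames: List<FloatArray>): String {
        val voiced = frames.filter { isSpeech(it) }
        if (voiced.isEmpty()) return ""
        val joined = FloatArray(voiced.sumOf { it.size })
        var offset = 0
        for (f in voiced) {
            f.copyInto(joined, offset)
            offset += f.size
        }
        return transcribe(joined)
    }
}

// Toy provider used only to exercise the template.
class SampleCountingRecognizer : SketchRecognizer() {
    override fun transcribe(samples: FloatArray): String = "${samples.size} voiced samples"
}
```

The point of the shape is that recording, VAD, and focus handling live once in the base class, while each provider only supplies its own decoding step.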
VoiceInputMethodService
- Android InputMethodService implementation for keyboard integration
- Manages Compose UI lifecycle and input method lifecycle
- Handles text insertion via InputConnection API
- Location: app/src/main/java/org/futo/voiceinput/VoiceInputMethodService.kt
RecognizerView (Abstract)
- Base class for recognition UI components
- Manages recognition state machine and UI updates
- Handles result processing and error states
- Drives provider selection: Whisper local, Soniox async, Soniox realtime
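
A recognition state machine of the kind described above might be modelled as a pure transition function. The states and events here are hypothetical and chosen only for illustration; the real RecognizerView may track more or different ones.

```kotlin
// Hypothetical states; not taken from the project's source.
sealed interface RecognitionState {
    object Idle : RecognitionState
    object Recording : RecognitionState
    object Processing : RecognitionState
    data class Done(val text: String) : RecognitionState
    data class Failed(val reason: String) : RecognitionState
}

// Hypothetical events that move the machine forward.
sealed interface RecognitionEvent {
    object MicGranted : RecognitionEvent
    object SpeechEnded : RecognitionEvent
    data class Result(val text: String) : RecognitionEvent
    data class Error(val reason: String) : RecognitionEvent
}

// Pure transition function: given a state and an event, return the next state.
// Events that make no sense in the current state leave it unchanged.
fun next(state: RecognitionState, event: RecognitionEvent): RecognitionState = when (event) {
    is RecognitionEvent.Error -> RecognitionState.Failed(event.reason)
    RecognitionEvent.MicGranted ->
        if (state == RecognitionState.Idle) RecognitionState.Recording else state
    RecognitionEvent.SpeechEnded ->
        if (state == RecognitionState.Recording) RecognitionState.Processing else state
    is RecognitionEvent.Result ->
        if (state == RecognitionState.Processing) RecognitionState.Done(event.text) else state
}
```

Keeping transitions in one function makes it easy to unit-test error paths without any UI attached.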
Local Whisper Models
- Uses whisper.cpp via JNI for on-device recognition
- GGML quantized models stored in app/src/main/ml/
- C++ implementation in app/src/main/cpp/ with CMake build system
- Supports multiple languages and model sizes
Soniox Provider
- Remote speech recognition service integration
- Both async and real-time recognition modes
- Located in app/src/main/java/org/futo/voiceinput/providers/soniox/
- Classes: SonioxAsyncRecognizer, SonioxRealtimeRecognizer, RealtimeSttClient
- Realtime uses wss://stt-rt.soniox.com/transcribe-websocket via OkHttp; partial tokens stream into IME composing text, and onRealtimeFinalResult consolidates the final transcript
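
How partial tokens can feed composing text before a final result replaces them is sketched below. It assumes each streamed token carries a flag saying whether it is final; SttToken and ComposingBuffer are hypothetical names, not the project's classes, and no network code is shown.

```kotlin
// Hypothetical token shape: a text fragment plus a flag saying whether it can still change.
data class SttToken(val text: String, val isFinal: Boolean)

// Keeps committed text separate from the provisional tail shown as composing text.
class ComposingBuffer {
    private val committed = StringBuilder()
    private var provisional = ""

    // Each message appends newly final tokens and replaces the provisional tail.
    fun onTokens(tokens: List<SttToken>) {
        tokens.filter { it.isFinal }.forEach { committed.append(it.text) }
        provisional = tokens.filterNot { it.isFinal }.joinToString("") { it.text }
    }

    // What an IME could show as composing text while recognition is running.
    fun composingText(): String = committed.toString() + provisional

    // Counterpart of a "final result" callback: drop the provisional tail
    // and return the consolidated transcript.
    fun finish(): String {
        provisional = ""
        return committed.toString().trim()
    }
}
```

Separating committed from provisional text means a revised partial never duplicates words that were already finalized.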
DataStore-based Settings
- Uses Android DataStore for preferences persistence
- Centralized settings management in app/src/main/java/org/futo/voiceinput/settings/Settings.kt
- Coroutine-based async settings operations with blocking fallbacks
- Type-safe settings keys with defaults
- Relevant keys: STT_PROVIDER, SONIOX_MODE ("async" | "realtime"), SONIOX_API_KEY, LANGUAGE_TOGGLES, PERSONAL_DICTIONARY, ENABLE_SOUND, VERBOSE_PROGRESS
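
The idea of type-safe keys with defaults can be illustrated with a small sketch. An in-memory map stands in for DataStore here, SettingKey and InMemorySettings are hypothetical names, and the default values are illustrative guesses rather than the app's actual defaults.

```kotlin
// Hypothetical typed key: the type parameter ties a key name to its value type and default.
data class SettingKey<T : Any>(val name: String, val default: T)

// In-memory stand-in for the persistence layer; the real app persists through DataStore.
class InMemorySettings {
    private val values = mutableMapOf<String, Any>()

    // Returns the stored value, or the key's default when nothing has been written yet.
    @Suppress("UNCHECKED_CAST")
    fun <T : Any> get(key: SettingKey<T>): T = (values[key.name] as T?) ?: key.default

    // The compiler rejects a value whose type does not match the key.
    fun <T : Any> set(key: SettingKey<T>, value: T) {
        values[key.name] = value
    }
}

// Key names follow the settings listed in this section; the defaults are assumptions.
val SONIOX_MODE = SettingKey("SONIOX_MODE", "async")
val ENABLE_SOUND = SettingKey("ENABLE_SOUND", true)
```

With this shape, reading ENABLE_SOUND always yields a Boolean and reading SONIOX_MODE always yields a String, so callers never cast.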
Theme System
- Jetpack Compose Material 3 theming
- Dynamic color support (Android 12+)
- Multiple theme presets in app/src/main/java/org/futo/voiceinput/theme/presets/
- Theme selection UI with live preview
whisper.cpp Integration
- GGML-based Whisper implementation
- JNI wrapper in voiceinput.cpp and jni_common.cpp
- Optimized for mobile ARM processors with NEON instructions
- CMake build system with Android NDK
Audio Processing Libraries
- WebRTC VAD for voice activity detection (prebuilt AAR in libs/)
- PocketFFT for audio feature extraction (prebuilt AAR in libs/)
Multi-platform Payment Support
- Play Store billing for Google Play distribution
- PayPal integration for direct sales via FutoPay module
- Conditional compilation based on build flavor
- Billing logic in app/src/main/java/org/futo/voiceinput/payments/
- Heavy use of Kotlin coroutines throughout the app
- withContext(Dispatchers.Default) for background processing
- withContext(Dispatchers.Main) for UI updates
- Proper lifecycle-aware coroutine scoping
- Single-activity architecture with Compose navigation
- Lifecycle-aware ViewModels where appropriate
- Custom Compose components for recognition UI
- Material 3 design system implementation
- ACRA crash reporting (configurable via build config)
- Graceful degradation for permission errors
- Out-of-memory handling for model loading
- Network error handling for remote providers
- Lazy loading of machine learning models
- Model migration system for updates
- Download manager for obtaining models
- Memory management with proper model cleanup
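
The model-handling points above (lazy loading, out-of-memory handling, cleanup) could be combined in a holder along these lines. LazyModel is a hypothetical helper written for this note, not a class from the repository, and the loader is whatever function actually builds the model.

```kotlin
// Hypothetical holder: loads a model on first use and releases it on demand.
class LazyModel<T : AutoCloseable>(private val loader: () -> T) {
    private var model: T? = null

    // Returns the model, loading it once; returns null if loading runs out of memory.
    @Synchronized
    fun getOrNull(): T? {
        model?.let { return it }
        return try {
            loader().also { model = it }
        } catch (e: OutOfMemoryError) {
            null // caller can fall back to a smaller model or report the failure
        }
    }

    // Frees the model so the next getOrNull() reloads it.
    @Synchronized
    fun release() {
        model?.close()
        model = null
    }
}
```

Returning null on an out-of-memory failure keeps the decision about what to do next (smaller model, error message) with the caller.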
Source Sets by Flavor:
- src/main/ - Common code for all flavors
- src/dev/ - Development-specific code
- src/playStoreBilling/ - Google Play billing implementation
- src/payPalBilling/ - PayPal billing implementation
- src/withUpdateChecking/ - Auto-update functionality
- src/withoutUpdateChecking/ - Builds without auto-update
Critical Configuration Files:
- app/build.gradle - Complex multi-flavor build configuration
- app/src/main/cpp/CMakeLists.txt - Native code build setup
- libs/ - Prebuilt AAR libraries for audio processing
- app/src/main/AndroidManifest.xml - IME service, Recognize activity, accessibility insertion service
- Changes should maintain compatibility across all build flavors
- Test both local Whisper and remote provider functionality
- Consider memory implications when modifying model loading
- Ensure proper lifecycle management in UI components
- Test keyboard integration via IME APIs
- Validate permission handling, especially for microphone access
Key Libraries:
- Jetpack Compose (UI framework)
- Kotlin Coroutines (async operations)
- DataStore (settings persistence)
- OkHttp (network operations for remote providers)
- ACRA (crash reporting)
- WebRTC VAD (voice activity detection)
- Material 3 (design system)
Build Tools:
- Android Gradle Plugin (from project)
- Kotlin 2.1.0 with Compose compiler plugin
- CMake 3.22.1 for native builds
- Android NDK for C++ compilation
# Core builds
./gradlew assembleStandaloneRelease
./gradlew assemblePlayStoreRelease
./gradlew assembleFDroidRelease
./gradlew assembleDevRelease
# Tests & lint
./gradlew test
./gradlew testDevDebugUnitTest
./gradlew connectedAndroidTest
./gradlew lint