A comprehensive skill for integrating RunAnywhere on-device AI into your applications across all supported platforms.
This skill enables Claude Code to assist with integrating RunAnywhere's privacy-first, on-device AI capabilities into Swift (iOS/macOS), Kotlin (Android), Web (WebAssembly), React Native, and Flutter applications.
RunAnywhere lets you add AI features to your app that run entirely on-device:
- LLM Text Generation - Run LFM2, Llama, Mistral, Qwen, SmolLM locally via llama.cpp
- Vision Language Models (VLM) - On-device visual understanding with camera/image input (iOS/Web)
- Speech-to-Text - Whisper-based transcription
- Text-to-Speech - Neural voice synthesis via Piper
- Voice Agent Pipeline - Complete VAD → STT → LLM → TTS orchestration
- Tool Calling & Structured Output - Function calling and JSON schema-guided generation
All processing happens locally: no cloud round-trips, no network latency, and no data leaves the device.
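The VAD → STT → LLM → TTS ordering of the voice agent pipeline can be sketched as a straight chain of stages. This is a minimal illustrative sketch with stub stages; none of these function names are RunAnywhere SDK APIs:

```python
# Illustrative sketch of a VAD -> STT -> LLM -> TTS voice pipeline.
# All stage functions are placeholder stubs, NOT RunAnywhere SDK calls.

def vad(audio_frames):
    """Voice activity detection: keep only frames flagged as speech."""
    return [f for f in audio_frames if f["speech"]]

def stt(speech_frames):
    """Speech-to-text: transcribe speech frames (stub joins attached words)."""
    return " ".join(f["word"] for f in speech_frames)

def llm(prompt):
    """Language model: generate a reply (stub echoes the prompt)."""
    return f"You said: {prompt}"

def tts(text):
    """Text-to-speech: synthesize audio (stub returns a waveform label)."""
    return f"<audio:{text}>"

def voice_agent_turn(audio_frames):
    # One conversational turn: each stage feeds directly into the next.
    return tts(llm(stt(vad(audio_frames))))

frames = [
    {"speech": True, "word": "hello"},
    {"speech": False, "word": "(noise)"},
    {"speech": True, "word": "world"},
]
print(voice_agent_turn(frames))  # <audio:You said: hello world>
```

In the real SDKs the orchestration is event-driven rather than a blocking chain, but the stage ordering is the same.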
- Swift (iOS/macOS) - Complete integration guide with Swift Package Manager setup
- Kotlin (Android) - Full Gradle configuration and usage examples
- Web (Browser) - 3-package WebAssembly SDK with VLM, streaming, and Web Worker support
- React Native - Cross-platform mobile integration
- Flutter - Dart/Flutter implementation guide
- Installation & Setup - Platform-specific dependency management
- Model Selection - Device-appropriate model recommendations with quantization guidance
- Complete API Reference - All SDK methods with working code examples
- Error Handling - Common issues and solutions for each platform
- Performance Optimization - Memory management, streaming patterns, best practices
- Vision Language Models - Camera-based VLM with Web Worker architecture (iOS/Web)
- Voice Agent Pipelines - Complete STT → LLM → TTS workflows
The skill uses a three-tier information architecture:
- SKILL.md - Core workflow and platform selection
- Platform Guides - Detailed references loaded on-demand (swift.md, kotlin.md, web.md, etc.)
- Model Guide - Model selection with LFM2, VLM, and quantization guidance
This minimizes context usage while providing complete information when needed.
- Download the packaged skill: `runanywhere-ai.skill`
- Install via the Claude Code CLI: `claude-code install runanywhere-ai.skill`
- The skill will automatically trigger when working with:
- On-device AI features
- Local LLM inference
- Vision Language Models (VLM) / on-device vision
- RunAnywhere SDK integration
- GGUF models or llama.cpp
- Offline AI processing
- On-device speech processing (STT/TTS)
# Clone into your Claude Code skills directory
git clone https://github.com/RunanywhereAI/runanywhere-skill.git ~/.claude/skills/runanywhere-ai/
├── SKILL.md # Main skill file with core workflow
├── references/ # Platform-specific detailed guides
│ ├── swift.md # iOS/macOS integration (781 lines)
│ ├── kotlin.md # Android integration (573 lines)
│ ├── web.md # Browser/WebAssembly guide (795 lines)
│ ├── react-native.md # React Native guide (743 lines)
│ ├── flutter.md # Flutter guide (800 lines)
│ └── models.md # Model selection guide (450 lines)
└── README.md # This file
User: "Help me integrate RunAnywhere's LLM into my Swift iOS app"
Claude:
1. Reads swift.md reference
2. Provides Swift Package Manager setup
3. Shows SDK initialization with LlamaCPP registration
4. Demonstrates model download and loading
5. Provides text generation examples
6. Suggests streaming for better UX
User: "Set up speech-to-text in my React Native app"
Claude:
1. Reads react-native.md reference
2. Shows npm installation for @runanywhere/core and @runanywhere/onnx
3. Demonstrates ONNX module registration
4. Shows Whisper model setup
5. Provides transcription examples with progress tracking
User: "Build a voice assistant pipeline on Android"
Claude:
1. Reads kotlin.md reference
2. Shows Gradle dependencies for LlamaCPP and ONNX
3. Demonstrates voice agent configuration
4. Shows voice session handling with events
5. Provides complete implementation example
User: "Add on-device vision to my React web app"
Claude:
1. Reads web.md reference
2. Shows 3-package npm install (@runanywhere/web, web-llamacpp, web-onnx)
3. Sets up Vite config with copyWasmPlugin and COOP/COEP headers
4. Creates VLM Web Worker with startVLMWorkerRuntime()
5. Wires VLMWorkerBridge to RunAnywhere.setVLMLoader()
6. Registers LFM2-VL 450M model (model + mmproj files)
7. Shows VideoCapture for camera frames and VLMWorkerBridge.shared.process()
User: "Add an on-device chatbot to my web app"
Claude:
1. Reads web.md reference
2. Installs 3-package Web SDK
3. Configures bundler and cross-origin headers
4. Registers LlamaCPP/ONNX backends and model catalog
5. Downloads and loads LFM2 350M via ModelManager
6. Uses TextGeneration.generateStream() with { stream, result, cancel }
7. Renders tokens in real-time as they generate
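The `{ stream, result, cancel }` handle in step 6 follows a common streaming-generation pattern: a handle that exposes an incrementally yielded token stream plus a way to cancel mid-generation. The sketch below shows the pattern generically in Python; it is purely illustrative and is not the Web SDK's TypeScript API:

```python
import asyncio

# Generic sketch of a cancellable token-stream handle, mirroring the shape
# of a { stream, result, cancel } interface. NOT the RunAnywhere SDK itself.

class StreamHandle:
    def __init__(self, tokens):
        self._tokens = tokens
        self._cancelled = False

    def cancel(self):
        # Stop yielding further tokens as soon as possible.
        self._cancelled = True

    async def stream(self):
        for tok in self._tokens:
            if self._cancelled:
                break
            await asyncio.sleep(0)  # yield control, as a real stream would
            yield tok

async def render(handle):
    # Consume tokens as they arrive instead of waiting for the full result.
    out = []
    async for tok in handle.stream():
        out.append(tok)  # a real UI would append each token to the DOM here
    return "".join(out)

handle = StreamHandle(["Hel", "lo", ", ", "world"])
print(asyncio.run(render(handle)))  # Hello, world
```

Rendering tokens as they arrive keeps perceived latency low even when full-response generation takes several seconds on-device.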
Models are compressed using quantization:
- Q4_0: Smallest files, fastest inference, lowest quality (~4.5 bits/weight effective)
- Q5_K_M: Balanced size and quality (~5.5 bits/weight)
- Q8_0: Largest files, best quality, slower (~8.5 bits/weight)
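A quantized model's file size can be estimated as parameter count × effective bits per weight ÷ 8. The sketch below uses approximate effective rates (block scales included); real GGUF files add metadata and keep some tensors at higher precision, so treat the result as a ballpark:

```python
# Rough GGUF file-size estimate: parameters x effective bits-per-weight / 8.
# bpw figures are approximations; actual files vary by architecture.

BITS_PER_WEIGHT = {"Q4_0": 4.5, "Q5_K_M": 5.5, "Q8_0": 8.5}

def approx_size_mb(params_millions: float, quant: str) -> float:
    bits = params_millions * 1e6 * BITS_PER_WEIGHT[quant]
    return bits / 8 / 1e6  # bits -> bytes -> MB (decimal)

# e.g. a 1.2B-parameter model at Q4_0:
print(f"~{approx_size_mb(1200, 'Q4_0'):.0f} MB")  # ~675 MB
```

The same 1.2B model at Q5_K_M comes out around 825 MB, which is roughly in line with the ~800 MB figure listed for the 1.2B model below.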
- LLM: GGUF format (via llama.cpp)
- VLM: GGUF format (model + mmproj files)
- STT: ONNX format (Whisper models)
- TTS: ONNX format (Piper voices)
- VAD: ONNX format (Silero VAD v5)
Rule of thumb: the device should have at least 2× the model's file size in available RAM
- LFM2 350M (~250MB) → Need 500MB+ RAM
- LFM2-VL 450M (~500MB) → Need 1GB+ RAM
- SmolLM2 360M (~400MB) → Need 800MB+ RAM
- LFM2 1.2B Tool (~800MB) → Need 1.5GB+ RAM
- Llama 3.2 1B (~1GB) → Need 2GB+ RAM
- Mistral 7B (~4GB) → Need 8GB+ RAM
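The table above is a direct application of the 2× rule; the arithmetic can be sketched as:

```python
# The 2x rule of thumb: minimum device RAM ~= 2 x model file size.
# Model sizes are the approximate on-disk figures listed above.

def min_ram_mb(model_size_mb: float) -> float:
    return 2 * model_size_mb

models_mb = {"LFM2 350M": 250, "SmolLM2 360M": 400, "Mistral 7B": 4000}
for name, size in models_mb.items():
    print(f"{name}: need ~{min_ram_mb(size):.0f} MB RAM")
```

The headroom covers the KV cache, activations, and the host app's own footprint, which is why the multiplier is 2× rather than 1×.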
This skill was created following the Claude Code Skill Creator guidelines:
- Understanding Phase - Analyzed RunAnywhere SDKs across all platforms
- Planning Phase - Identified reusable resources (platform guides, model selection)
- Implementation Phase - Created comprehensive reference documentation
- Validation Phase - Audited all APIs against official RunAnywhere documentation
- Iteration Phase - Fixed discrepancies and removed misaligned components
All API methods and code examples were verified against:
- Official RunAnywhere Repository
- Platform-specific README files
- CLAUDE.md repository guidelines
Corrections Made:
- Fixed Swift API method names (transcribe, synthesize, voice agent)
- Removed misaligned download script (SDKs have built-in download)
- Verified all platform APIs match official documentation
All information is derived from official sources:
- RunAnywhere GitHub Repository
- Swift SDK Documentation
- Kotlin SDK Documentation
- React Native Documentation
- Flutter Documentation
As RunAnywhere SDKs evolve, this skill should be updated:
- Monitor RunAnywhere Releases
- Check for API changes in platform-specific READMEs
- Update reference files as needed
- Re-validate against official documentation
Contributions are welcome! To improve this skill:
- Report Issues - Found incorrect API usage? Open an issue
- Update Documentation - SDK changed? Submit a pull request
- Add Examples - Have useful patterns? Share them
Before submitting updates:
- Verify APIs against official RunAnywhere docs
- Test code examples when possible
- Keep platform guides under 1000 lines (progressive disclosure)
- Maintain consistent formatting and structure
- Update README.md if structure changes
This skill is licensed under Apache 2.0, matching the RunAnywhere SDK license.
- Website: runanywhere.ai
- Documentation: docs.runanywhere.ai
- GitHub: github.com/RunanywhereAI/runanywhere-sdks
- Discord: discord.gg/N359FBbDVd
- iOS: examples/ios/RunAnywhereAI
- Android: examples/android/RunAnywhereAI
- Web: web-starter-app (Chat, Vision, Voice demo)
- React Native: examples/react-native/RunAnywhereAI
- Flutter: examples/flutter/RunAnywhereAI
- Skill Creator Guide: claude-code/skills/skill-creator
- Claude Code Documentation: claude.ai/code
For issues specific to this skill:
- Open an issue on GitHub Issues
For RunAnywhere SDK questions:
- Join RunAnywhere Discord
- Open an issue on RunAnywhere GitHub
Built with ❤️ using Claude Code