Cactus is a high-performance inference library optimized for ARM processors. When contributing:
- C++ Standard: Use C++20 features where appropriate
- Formatting: Follow the existing code style in the project
- ARM NEON: When writing SIMD code, ensure proper alignment and use the appropriate intrinsics
- Comments: Add comments for complex algorithms, especially in kernel implementations
- Benchmark Your Changes: Test the performance impact, especially for kernel functions
- Memory Efficiency: Minimize memory allocations in hot paths
- SIMD Optimization: Use ARM NEON intrinsics where beneficial
- Cache Awareness: Consider cache line sizes and memory access patterns
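The cache-awareness point can be illustrated with a small sketch (plain Python for brevity; in the C++ kernels the same principle means the innermost loop should walk the contiguous dimension of the buffer):

```python
# Illustrative only: traversal order over a row-major 2-D buffer.
# Walking the contiguous (inner) dimension in the inner loop touches
# memory sequentially, so each fetched cache line is fully used;
# swapping the loops strides across rows and wastes most of each line.

ROWS, COLS = 256, 256
matrix = [[r * COLS + c for c in range(COLS)] for r in range(ROWS)]

def sum_row_major(m):
    # Cache-friendly: inner loop walks contiguous elements of one row.
    total = 0
    for row in m:
        for x in row:
            total += x
    return total

def sum_column_major(m):
    # Cache-hostile on a row-major layout: consecutive accesses are
    # COLS elements apart in memory.
    total = 0
    for c in range(COLS):
        for r in range(ROWS):
            total += m[r][c]
    return total

# Both orders compute the same result; only the access pattern differs.
assert sum_row_major(matrix) == sum_column_major(matrix)
```

In compiled kernels the row-major order is also what makes contiguous NEON loads possible.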
Before submitting a PR:
- Ensure all existing tests pass
- Add tests for new functionality
- Test on ARM hardware if possible (Apple Silicon, Raspberry Pi, etc.)
- Verify quantized operations maintain acceptable accuracy
- Fork the repository and create your branch from `main`
- Make your changes following the guidelines above
- Sign off all commits using the DCO (`git commit -s`)
- Update documentation if you change APIs
- Open a Pull Request with a clear title and description
## Description
Brief description of what this PR does
## Type of Change
- [ ] Bug fix
- [ ] New feature
- [ ] Performance improvement
- [ ] Documentation update
## Testing
- [ ] Tests pass locally
- [ ] Tested on ARM hardware
- [ ] Benchmarked performance impact
## Checklist
- [ ] All commits are signed-off (DCO)
- [ ] Code follows project style
- [ ] Comments added where necessary
- [ ] Documentation updated if needed

When reporting issues, please include:
- System information (OS, CPU architecture, ARM variant)
- Cactus version or commit hash
- Minimal code to reproduce the issue
- Expected vs actual behavior
- Any relevant logs or error messages
We especially welcome contributions in these areas:
- Kernel Optimizations: SIMD implementations for ARM architectures
- Quantization: Improved quantization techniques (INT8, INT4)
- Model Support: Support for additional model architectures
- NPU Integration: Apple Neural Engine and other NPU backends
- Documentation: Tutorials, examples, and API documentation
- Testing: Test coverage and benchmarking infrastructure
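To give a flavor of the quantization area, here is a sketch of packing two signed INT4 values per byte (pure Python, illustrative only; Cactus's actual storage layout may differ):

```python
# Hypothetical INT4 packing: two signed 4-bit values per byte,
# low nibble first. Inputs must already be quantized to [-8, 7].
def pack_int4(values):
    assert len(values) % 2 == 0
    packed = bytearray()
    for lo, hi in zip(values[::2], values[1::2]):
        packed.append((lo & 0x0F) | ((hi & 0x0F) << 4))
    return bytes(packed)

def unpack_int4(packed):
    values = []
    for byte in packed:
        for nibble in (byte & 0x0F, byte >> 4):
            # Sign-extend the 4-bit two's-complement value.
            values.append(nibble - 16 if nibble >= 8 else nibble)
    return values

vals = [-8, 7, 0, -1, 3, 5]
assert unpack_int4(pack_int4(vals)) == vals
```

Packing halves memory traffic relative to INT8, which is often the bottleneck for on-device inference.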
```shell
# Clone the repository
git clone https://github.com/yourusername/cactus.git
cd cactus

# Set up the environment (installs dependencies and activates venv)
./setup
```

You can run these commands directly on M-series MacBooks since they are ARM-based. A base M3 running CPU-only can serve Qwen3-600m-INT8 at 60-70 toks/sec.
```shell
cactus test                     # Run unit tests and benchmarks
cactus test --model <hf-name>   # Use a specific model for tests
cactus test --ios               # Run tests on connected iPhone
cactus test --android           # Run tests on connected Android device

cactus build                    # Build for ARM chips (libcactus.a)
cactus build --apple            # Build for Apple platforms (iOS/macOS)
cactus build --android          # Build for Android

cactus download <hf-name>       # Download and convert model weights
cactus run <hf-name>            # Download, build, and run playground
```

Cactus provides Python bindings for quick scripting and research. After setup:
```shell
cactus build
cactus download LiquidAI/LFM2-VL-450M
cactus download openai/whisper-small
cd python && python example.py
```

Available functions:
```python
from cactus import (
    cactus_init,         # Load a model
    cactus_complete,     # Text/VLM completion
    cactus_transcribe,   # Audio transcription (Whisper)
    cactus_embed,        # Text embeddings
    cactus_image_embed,  # Image embeddings
    cactus_audio_embed,  # Audio embeddings
    cactus_reset,        # Reset model state
    cactus_destroy       # Free model memory
)
```

Quick example:
```python
import json
from cactus import cactus_init, cactus_complete, cactus_destroy

# Load model
model = cactus_init("../weights/lfm2-vl-450m")

# Text completion
messages = json.dumps([{"role": "user", "content": "What is 2+2?"}])
response = cactus_complete(model, messages)
print(json.loads(response)["response"])

# VLM - describe image
messages = json.dumps([{"role": "user", "content": "Describe this image", "images": ["path/to/image.png"]}])
response = cactus_complete(model, messages)

cactus_destroy(model)
```

Whisper transcription:
```python
from cactus import cactus_init, cactus_transcribe, cactus_destroy

whisper = cactus_init("../weights/whisper-small")
prompt = "<|startoftranscript|><|en|><|transcribe|><|notimestamps|>"
response = cactus_transcribe(whisper, "audio.wav", prompt=prompt)
cactus_destroy(whisper)
```

See python/example.py for a complete working example.
If you have questions about contributing, feel free to:
- Open an issue for discussion
- Check existing issues and PRs
- Review the codebase documentation
By contributing to Cactus, you agree that your contributions will be licensed under the same license as the project (check LICENSE file).
Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project maintainers. All complaints will be reviewed and investigated promptly and fairly.
Thank you for contributing to Cactus!