🚀 What's New in v0.2.0
This release adds edge device support for NVIDIA Jetson hardware and automatic cluster health monitoring.
✨ New Features
Edge Device Support
- TensorRT-LLM Runner - Native support for NVIDIA Jetson devices (Orin Nano, NX, AGX)
- Memory/Compute Validation - Validates compositions against device constraints before deployment
- Quantization Support - INT4, INT8, FP16 quantization for memory-constrained devices
- Model Selection Guide - Documentation for choosing optimal models per device
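
To illustrate what validating a composition against a device memory constraint could look like, here is a minimal sketch. It is not the code in src/config/validation.rs: the names `Quantization`, `weight_bytes`, and `fits_in_memory` are hypothetical, and the estimate covers model weights only (KV cache, activations, and runtime overhead are ignored).

```rust
/// Quantization formats named in the release notes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Quantization {
    Int4,
    Int8,
    Fp16,
}

impl Quantization {
    /// Bits used to store a single weight in this format.
    pub fn bits_per_weight(self) -> u64 {
        match self {
            Quantization::Int4 => 4,
            Quantization::Int8 => 8,
            Quantization::Fp16 => 16,
        }
    }
}

/// Approximate bytes needed for the weights alone, rounded up.
/// Assumption: KV cache, activations, and runtime overhead are not counted.
pub fn weight_bytes(param_count: u64, quant: Quantization) -> u64 {
    (param_count * quant.bits_per_weight() + 7) / 8
}

/// Hypothetical pre-deployment check: returns the estimated footprint when it
/// fits in `budget_bytes`, otherwise an error describing the shortfall.
pub fn fits_in_memory(
    param_count: u64,
    quant: Quantization,
    budget_bytes: u64,
) -> Result<u64, String> {
    let needed = weight_bytes(param_count, quant);
    if needed <= budget_bytes {
        Ok(needed)
    } else {
        Err(format!(
            "model needs about {} bytes but the device budget is {} bytes",
            needed, budget_bytes
        ))
    }
}
```

Under these assumptions, a 7-billion-parameter model needs roughly 14 GB of weights at FP16 and 3.5 GB at INT4, so `fits_in_memory(7_000_000_000, Quantization::Fp16, 8 * 1024 * 1024 * 1024)` fails while the same call with `Quantization::Int4` succeeds.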
Cluster Health Monitoring
- Active Health Probing - Automatic HTTP health checks every 5 seconds for all deployed replicas
- Professional Status Table - Formatted CLI output with NODE, PIPELINE, STATUS, UPTIME, LATENCY, and ERRORS columns
- Health State Tracking - Consecutive failure/success counting with configurable thresholds
- API Enhancement - The /v1/status endpoint now returns a detailed health summary
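
Consecutive failure/success counting with thresholds can be sketched as a small state machine. This is an illustration only, not the code in src/cluster/health_checker.rs: `HealthTracker`, `HealthState`, the threshold parameter names, and the choice to start in the healthy state are all assumptions.

```rust
/// Hypothetical two-state health model for a single replica.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HealthState {
    Healthy,
    Unhealthy,
}

/// Tracks consecutive probe results and flips state only after a
/// configurable number of results in a row (assumed design).
pub struct HealthTracker {
    state: HealthState,
    consecutive_failures: u32,
    consecutive_successes: u32,
    failure_threshold: u32,
    success_threshold: u32,
}

impl HealthTracker {
    /// Assumption: a replica starts out healthy.
    pub fn new(failure_threshold: u32, success_threshold: u32) -> Self {
        HealthTracker {
            state: HealthState::Healthy,
            consecutive_failures: 0,
            consecutive_successes: 0,
            failure_threshold,
            success_threshold,
        }
    }

    /// Records one probe result and returns the state after applying it.
    pub fn record(&mut self, probe_ok: bool) -> HealthState {
        if probe_ok {
            // A success breaks any failure streak.
            self.consecutive_successes += 1;
            self.consecutive_failures = 0;
            if self.state == HealthState::Unhealthy
                && self.consecutive_successes >= self.success_threshold
            {
                self.state = HealthState::Healthy;
            }
        } else {
            // A failure breaks any success streak.
            self.consecutive_failures += 1;
            self.consecutive_successes = 0;
            if self.state == HealthState::Healthy
                && self.consecutive_failures >= self.failure_threshold
            {
                self.state = HealthState::Unhealthy;
            }
        }
        self.state
    }

    /// Current state without recording a new probe.
    pub fn state(&self) -> HealthState {
        self.state
    }
}
```

Requiring several results in a row before changing state keeps a single dropped probe from marking a replica unhealthy, and a single lucky response from marking it healthy again. With `HealthTracker::new(3, 2)`, three failed probes in a row are needed to mark a replica unhealthy and two successful ones to recover it.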
📦 Supported Platforms
| Platform | Binary |
|---|---|
| macOS ARM64 (Apple Silicon) | llmnet-darwin-aarch64 |
| macOS x64 (Intel) | llmnet-darwin-x86_64 |
| Linux x64 (glibc) | llmnet-linux-x86_64-gnu |
| Linux x64 (musl/static) | llmnet-linux-x86_64-musl |
| Linux ARM64 (glibc) | llmnet-linux-aarch64-gnu |
| Linux ARM64 (musl/static) | llmnet-linux-aarch64-musl |
| Raspberry Pi (glibc) | llmnet-linux-armv7-gnueabihf |
| Raspberry Pi (musl/static) | llmnet-linux-armv7-musleabihf |
| FreeBSD x64 | llmnet-freebsd-x86_64 |
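
As a rough aid for picking a binary from the table, the mapping can be written as a lookup keyed on the values Rust exposes in `std::env::consts::OS` and `std::env::consts::ARCH`. The function `binary_name` and its `musl` flag are hypothetical and not part of llmnet; treating `"arm"` as ARMv7 is also an assumption, since that value covers other 32-bit ARM targets too.

```rust
/// Hypothetical helper: maps an OS/architecture pair (in the spelling used by
/// `std::env::consts::OS` and `std::env::consts::ARCH`) to a release binary
/// name from the table. `musl` selects the static musl build on Linux.
pub fn binary_name(os: &str, arch: &str, musl: bool) -> Option<&'static str> {
    match (os, arch, musl) {
        ("macos", "aarch64", _) => Some("llmnet-darwin-aarch64"),
        ("macos", "x86_64", _) => Some("llmnet-darwin-x86_64"),
        ("linux", "x86_64", false) => Some("llmnet-linux-x86_64-gnu"),
        ("linux", "x86_64", true) => Some("llmnet-linux-x86_64-musl"),
        ("linux", "aarch64", false) => Some("llmnet-linux-aarch64-gnu"),
        ("linux", "aarch64", true) => Some("llmnet-linux-aarch64-musl"),
        // Assumption: "arm" is treated as ARMv7 with hardware floating point.
        ("linux", "arm", false) => Some("llmnet-linux-armv7-gnueabihf"),
        ("linux", "arm", true) => Some("llmnet-linux-armv7-musleabihf"),
        ("freebsd", "x86_64", _) => Some("llmnet-freebsd-x86_64"),
        // Any other combination has no binary in this release.
        _ => None,
    }
}
```

Calling `binary_name(std::env::consts::OS, std::env::consts::ARCH, false)` returns the glibc build name for the current host, or `None` when the platform is not in the table.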
📚 Documentation
🔧 Technical Details
Files Added:
- src/runtime/tensorrt_llm.rs - TensorRT-LLM runner implementation
- src/config/validation.rs - Device constraint validation
- src/cluster/health_checker.rs - Active health probing module
Full Changelog: v0.0.1alpha...v-0.2.0