v-0.2.0: Edge Device Support & Cluster Health Monitoring

Released by @github-actions on 31 Dec 21:10 (commit 450ea7d)

🚀 What's New in v0.2.0

This release adds comprehensive edge device support and automatic cluster health monitoring.

✨ New Features

Edge Device Support

  • TensorRT-LLM Runner - Native support for NVIDIA Jetson devices (Orin Nano, NX, AGX)
  • Memory/Compute Validation - Validates compositions against device constraints before deployment
  • Quantization Support - INT4, INT8, FP16 quantization for memory-constrained devices
  • Model Selection Guide - Documentation for choosing optimal models per device
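As a rough illustration of what validating a model against device memory can look like, the sketch below estimates the weight footprint for each quantization level and compares it with a device's memory budget. It is not the code in src/config/validation.rs: the names `Quantization`, `DeviceConstraints`, `weight_bytes`, and `validate` are hypothetical, the estimate counts weights only, and `reserve_bytes` is an assumed allowance for KV cache, activations, and the runtime.

```rust
/// Quantization levels named in the release notes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Quantization {
    Int4,
    Int8,
    Fp16,
}

impl Quantization {
    /// Number of bits used to store a single weight.
    pub fn bits_per_weight(self) -> u64 {
        match self {
            Quantization::Int4 => 4,
            Quantization::Int8 => 8,
            Quantization::Fp16 => 16,
        }
    }
}

/// Hypothetical device description; the field names are illustrative only.
pub struct DeviceConstraints {
    pub name: &'static str,
    pub memory_bytes: u64,
}

/// Weight-only footprint in bytes: parameters * bits per weight / 8.
pub fn weight_bytes(params: u64, quant: Quantization) -> u64 {
    params * quant.bits_per_weight() / 8
}

/// Reject a model whose weights plus an assumed runtime reserve
/// do not fit in the device's memory.
pub fn validate(
    params: u64,
    quant: Quantization,
    device: &DeviceConstraints,
    reserve_bytes: u64,
) -> Result<(), String> {
    let needed = weight_bytes(params, quant) + reserve_bytes;
    if needed > device.memory_bytes {
        Err(format!(
            "{:?} model needs {} bytes but {} has {} bytes",
            quant, needed, device.name, device.memory_bytes
        ))
    } else {
        Ok(())
    }
}
```

Under these assumptions, a 7-billion-parameter model on a device with 8 GiB of memory and a 1 GiB reserve is rejected at FP16 (about 14 GB of weights) and accepted at INT4 (about 3.5 GB of weights).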

Cluster Health Monitoring

  • Active Health Probing - Automatic HTTP health checks every 5 seconds for all deployed replicas
  • Professional Status Table - Formatted CLI table with NODE, PIPELINE, STATUS, UPTIME, LATENCY, and ERRORS columns
  • Health State Tracking - Consecutive failure/success counting with configurable thresholds
  • API Enhancement - /v1/status endpoint now returns detailed health summary
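Consecutive failure/success counting with thresholds can be sketched as a small state machine. This is a minimal sketch, not the code in src/cluster/health_checker.rs: `HealthTracker`, `HealthState`, and the two threshold parameters are hypothetical names, and the choice to start in the healthy state is an assumption.

```rust
/// Two-state health model; the real module may track more states.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HealthState {
    Healthy,
    Unhealthy,
}

/// Tracks consecutive probe results for one replica (hypothetical type).
pub struct HealthTracker {
    state: HealthState,
    consecutive_failures: u32,
    consecutive_successes: u32,
    failure_threshold: u32,
    success_threshold: u32,
}

impl HealthTracker {
    /// Assumption: a replica starts out healthy.
    pub fn new(failure_threshold: u32, success_threshold: u32) -> Self {
        HealthTracker {
            state: HealthState::Healthy,
            consecutive_failures: 0,
            consecutive_successes: 0,
            failure_threshold,
            success_threshold,
        }
    }

    /// Record one probe result and return the resulting state.
    /// A success resets the failure streak and vice versa, so only
    /// uninterrupted runs of results can flip the state.
    pub fn record(&mut self, probe_ok: bool) -> HealthState {
        if probe_ok {
            self.consecutive_failures = 0;
            self.consecutive_successes = self.consecutive_successes.saturating_add(1);
            if self.state == HealthState::Unhealthy
                && self.consecutive_successes >= self.success_threshold
            {
                self.state = HealthState::Healthy;
            }
        } else {
            self.consecutive_successes = 0;
            self.consecutive_failures = self.consecutive_failures.saturating_add(1);
            if self.state == HealthState::Healthy
                && self.consecutive_failures >= self.failure_threshold
            {
                self.state = HealthState::Unhealthy;
            }
        }
        self.state
    }

    pub fn state(&self) -> HealthState {
        self.state
    }
}
```

With `HealthTracker::new(3, 2)`, a replica is marked unhealthy after three failed probes in a row and healthy again after two successful probes in a row. Requiring a streak in both directions keeps a single slow or dropped probe from flipping the reported status.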

📦 Supported Platforms

Platform                     Binary
macOS ARM64 (Apple Silicon)  llmnet-darwin-aarch64
macOS x64 (Intel)            llmnet-darwin-x86_64
Linux x64 (glibc)            llmnet-linux-x86_64-gnu
Linux x64 (musl/static)      llmnet-linux-x86_64-musl
Linux ARM64 (glibc)          llmnet-linux-aarch64-gnu
Linux ARM64 (musl/static)    llmnet-linux-aarch64-musl
Raspberry Pi (glibc)         llmnet-linux-armv7-gnueabihf
Raspberry Pi (musl/static)   llmnet-linux-armv7-musleabihf
FreeBSD x64                  llmnet-freebsd-x86_64
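For scripted installs, the platform list above can be expressed as a lookup keyed on the values Rust exposes in `std::env::consts::OS` and `std::env::consts::ARCH`. The function `binary_name` and its `musl` flag are hypothetical helpers, not part of llmnet. Note that Rust reports `"arm"` for every 32-bit ARM target, so mapping it to the armv7 binaries is an assumption that does not hold on ARMv6 boards.

```rust
/// Map an OS/architecture pair to a release binary name from the table above.
/// `musl` selects the static musl build on Linux and is ignored elsewhere.
/// Returns None for platforms that have no published binary.
pub fn binary_name(os: &str, arch: &str, musl: bool) -> Option<&'static str> {
    match (os, arch, musl) {
        ("macos", "aarch64", _) => Some("llmnet-darwin-aarch64"),
        ("macos", "x86_64", _) => Some("llmnet-darwin-x86_64"),
        ("linux", "x86_64", false) => Some("llmnet-linux-x86_64-gnu"),
        ("linux", "x86_64", true) => Some("llmnet-linux-x86_64-musl"),
        ("linux", "aarch64", false) => Some("llmnet-linux-aarch64-gnu"),
        ("linux", "aarch64", true) => Some("llmnet-linux-aarch64-musl"),
        // Assumption: a 32-bit ARM host is ARMv7 with hardware floating point.
        ("linux", "arm", false) => Some("llmnet-linux-armv7-gnueabihf"),
        ("linux", "arm", true) => Some("llmnet-linux-armv7-musleabihf"),
        ("freebsd", "x86_64", _) => Some("llmnet-freebsd-x86_64"),
        _ => None,
    }
}
```

Calling `binary_name(std::env::consts::OS, std::env::consts::ARCH, false)` picks the glibc build on Linux hosts; passing `true` selects the static musl build.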

📚 Documentation

🔧 Technical Details

Files Added:

  • src/runtime/tensorrt_llm.rs - TensorRT-LLM runner implementation
  • src/config/validation.rs - Device constraint validation
  • src/cluster/health_checker.rs - Active health probing module

Full Changelog: v0.0.1alpha...v-0.2.0