v1.0.0rc1
inference 1.0.0rc1 — Release Candidate
Today marks an important milestone for Inference.
Over the past years, Inference has grown from a lightweight prediction server into a widely adopted runtime used across local deployments, Docker, edge devices, and production systems. Hundreds of releases later, the project has matured significantly, and so has the need for a faster, more modular, and future-proof foundation.
inference 1.0.0rc1 is a preview of the 1.0.0 release, which will close one chapter and open another: this release introduces a new prediction engine that will become the foundation for all future development.
🚀 New prediction engine - inference-models
We are introducing inference-models, a redesigned execution engine focused on:
- faster model loading and inference
- improved resource utilization
- better modularity and extensibility
- cleaner separation between serving and model runtime
- stronger foundations for future major versions
The engine is already available today in:
- `inference-models` package → 0.18.6rc8 (RC)
- `inference` package and Docker → enabled with the env variable `USE_INFERENCE_MODELS=True`
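As a minimal sketch of how such an opt-in boolean env flag is typically read, the hypothetical helper below parses the variable named in these notes; the actual flag handling inside inference may differ:

```python
import os

def use_inference_models() -> bool:
    # Hypothetical helper: treat "true"/"1" (case-insensitive) as enabled.
    # The real parsing inside the inference package may differ.
    value = os.getenv("USE_INFERENCE_MODELS", "False")
    return value.strip().lower() in ("true", "1")

# Opt in before starting the server / importing the runtime.
os.environ["USE_INFERENCE_MODELS"] = "True"
print(use_inference_models())  # True
```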
inference-models wrapped within the old inference package is a drop-in replacement. This allows testing the new runtime without changing existing integrations.
Important
Predictions from your models may change - but generally for the better! inference-models is a completely new engine for running models: we have fixed many bugs and made it multi-backend, capable of running onnx, torch, and even trt models. It automatically negotiates with the Roboflow model registry to choose the best package for your environment. We have already migrated almost all Roboflow models to the new registry and are working hard to reach full coverage soon!
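For illustration only, backend negotiation of this kind can be sketched as a preference-ordered lookup over the packages available in the environment; the backend names come from the notes above, but the ordering and the registry logic here are assumptions, not the actual inference-models implementation:

```python
# Illustrative sketch: pick the most preferred backend whose model package
# is available locally. Preference order is an assumption (fastest first).
PREFERENCE = ("trt", "torch", "onnx")

def pick_backend(available: set[str]) -> str:
    for backend in PREFERENCE:
        if backend in available:
            return backend
    raise RuntimeError("no compatible model package for this environment")

print(pick_backend({"onnx", "torch"}))  # torch
```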
📅 What happens next
Next week
- Stable `inference` 1.0.0 release
- Stable `inference-models` release
- Roboflow platform updated to use `inference-models` as the default engine

In the coming weeks
- `inference-models` becomes the default engine for public builds (`USE_INFERENCE_MODELS` becomes opt-out, not opt-in)
- continued performance improvements and runtime optimizations
🔭 Looking forward - the road to 2.0
This engine refresh is only the first step. We are starting work toward Inference 2.0, a larger modernization effort similar in spirit to the changes introduced with inference-models.
Stay tuned for future updates!