Release Notes — @aws/ml-container-creator v0.2.3
⚠️ Breaking Changes
- Package renamed: `@aws/generator-ml-container-creator` → `@aws/ml-container-creator`
- Install command changed: `npm install -g yo @aws/generator-ml-container-creator` → `npm install -g @aws/ml-container-creator`
- Run command changed: `yo @aws/ml-container-creator` → `ml-container-creator`
- npx support: `npx @aws/ml-container-creator --help` (zero-install)
Standalone CLI (Yeoman Removal)
Removes the Yeoman framework entirely. The tool is now a standalone Node.js CLI using Commander.js for option parsing and @inquirer/prompts for interactive prompts. Template rendering uses ejs + tinyglobby + fs directly.
- Eliminates `yo` peer dependency and `generator-` naming convention
- Removes all CI root-user workarounds (`YEOMAN_ALLOW_ROOT`, `runuser`, `useradd`)
- Moves source from `generators/app/` → `src/` + `templates/`
- Replaces `yeoman-test`/`yeoman-assert` with custom test helpers
Deployment Targets
| Target | Description |
|---|---|
| SageMaker Managed Inference | Real-time endpoints with inference components |
| SageMaker Async Inference | Asynchronous inference for long-running predictions |
| SageMaker Batch Transform | Batch processing of large datasets |
| SageMaker HyperPod EKS | Kubernetes-based deployment on HyperPod clusters |
Serving Architectures
| Architecture | Model Servers |
|---|---|
| HTTP (traditional ML) | Flask, FastAPI |
| Transformers (LLMs) | vLLM, SGLang, TensorRT-LLM, DJL/LMI |
| Triton (multi-framework) | FIL, ONNX Runtime, TensorFlow, PyTorch, Python, vLLM, TensorRT-LLM |
| Diffusors (image generation) | vLLM Omni |
MCP Server Ecosystem
Five bundled MCP servers provide intelligent configuration assistance:
- model-picker — Resolves models from HuggingFace, SageMaker JumpStart, Bedrock Marketplace, TensorFlow Hub, ONNX Model Zoo
- base-image-picker — Selects appropriate container base images per model server and CUDA version
- instance-recommender — Recommends SageMaker instance types based on model requirements
- region-picker — Filters AWS regions by service availability with Bedrock-powered reasoning
- hyperpod-cluster-picker — Discovers available HyperPod EKS clusters
All catalog data is externalized to JSON files in `servers/*/catalogs/`.
do-framework Integration
Generated projects include a complete do/ script suite:
| Script | Purpose |
|---|---|
| `do/build` | Build Docker image (linux/amd64) |
| `do/push` | Push to Amazon ECR |
| `do/deploy` | Deploy to SageMaker (idempotent, credential-aware) |
| `do/submit` | Submit build to CodeBuild |
| `do/test` | Test deployed endpoint |
| `do/clean` | Tear down deployed resources |
| `do/export` | Export config as CLI flags or JSON |
| `do/register` | Register to local deployment registry (+ optional CI) |
| `do/ci` | CI report, status, trigger, dashboard |
| `do/manifest` | Track deployed AWS resource ARNs |
| `do/logs` | Tail CloudWatch logs |
| `do/run` | Run container locally |
Bootstrap Infrastructure
`ml-container-creator bootstrap` provisions shared AWS infrastructure:
- IAM execution role with SageMaker permissions (CloudFormation-managed)
- ECR repository for container images
- S3 buckets for async/batch workloads
- Named profiles persisted to `~/.ml-container-creator/config.json`
- `bootstrap status --verify` for drift detection
- `bootstrap scan` for discovering pre-existing tagged resources
CI Integration Harness
Serverless CI system for automated end-to-end validation:
- CDK stack: DynamoDB, Lambda scanner, SQS, EventBridge Pipe, Step Functions, CodeBuild
- Hourly scan for pending test configurations
- Two build strategies: `codebuild-submit` and `docker-in-docker`
- `do/register --ci` publishes configs to DynamoDB
- `do/ci report` shows test matrix coverage
CLI Configuration Parameters
Granular control over deployment infrastructure:
- `--endpoint-*` flags (instance count, data capture, volume size)
- `--ic-*` flags (CPU, memory, GPU count, copy count, model weight)
- `--model-env KEY=VALUE` (model-level environment variables)
- `--server-env KEY=VALUE` (engine-prefixed server environment variables)
- Schema-driven validation via bundled parameter schema
- Full propagation through `do/config`, `do/export`, `do/register`, and Dockerfile
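As an illustration of the `KEY=VALUE` convention used by `--model-env` and `--server-env`, here is a minimal Python sketch of how repeated pairs could be folded into a mapping. It is not the tool's actual parser (the CLI itself is Node.js); the function name `parse_env_pairs` and the variable names in the usage line are hypothetical, and the choice to split on the first `=` and reject malformed input is an assumption.

```python
def parse_env_pairs(pairs: list[str]) -> dict[str, str]:
    """Fold repeated KEY=VALUE strings into a dict (hypothetical helper)."""
    env: dict[str, str] = {}
    for pair in pairs:
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {pair!r}")
        env[key] = value  # assumption: a later duplicate key overrides an earlier one
    return env


# Illustrative variable names, not flags or settings documented by the tool.
model_env = parse_env_pairs(["MAX_MODEL_LEN=4096", "EXTRA_ARGS=--a=b"])
```

Splitting on the first `=` keeps values such as `--a=b` intact, which matters when an environment variable carries further flags.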
Model Loading Adapter
Generated serve scripts resolve provider-prefixed model URIs at runtime:
- `s3://bucket/path` → Downloads from S3 to `/opt/ml/model`
- `jumpstart://model-id` → Resolves via JumpStart manifest
- `registry://arn:...` → Resolves via `DescribeModelPackage`
- `huggingface-id` → Passed directly to serving engine
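The prefix dispatch described above can be sketched roughly as follows. This is an assumption-laden illustration, not the generated serve script: `ModelSource`, `classify_model_uri`, and the provider labels are hypothetical names, and only the classification step is shown (no S3 download, JumpStart lookup, or `DescribeModelPackage` call).

```python
from typing import NamedTuple


class ModelSource(NamedTuple):
    provider: str   # "s3", "jumpstart", "registry", or "huggingface"
    reference: str  # the remainder of the URI, handed to the provider-specific loader


# URI schemes are taken from the release notes; the provider labels are illustrative.
_PREFIXES = {
    "s3://": "s3",
    "jumpstart://": "jumpstart",
    "registry://": "registry",
}


def classify_model_uri(uri: str) -> ModelSource:
    """Decide which provider should resolve a model URI (hypothetical helper)."""
    for prefix, provider in _PREFIXES.items():
        if uri.startswith(prefix):
            return ModelSource(provider, uri[len(prefix):])
    # No recognized prefix: treat the value as a Hugging Face model ID
    # and pass it through unchanged.
    return ModelSource("huggingface", uri)


source = classify_model_uri("s3://bucket/path")
```

Treating the unprefixed case as the fallback mirrors the last bullet, where a bare Hugging Face ID goes straight to the serving engine.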
Deployment Registry
Captures successful deployments as structured records:
- `do/register` logs deployment config, model, instance, status
- `ml-container-creator registry list/search/export/import/replay`
- Per-profile asset manifests track deployed AWS resource ARNs
- `bootstrap status` shows deployed resources with drift detection
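To make "structured record" concrete, the sketch below models a record with the four pieces of information `do/register` is said to log. The real on-disk schema is not given in these notes, so the class name `DeploymentRecord`, the field names and types, the sample values, and the JSON encoding are all assumptions.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class DeploymentRecord:
    """Hypothetical record shape; field names are assumptions, not the tool's schema."""
    config: dict   # deployment configuration as key/value pairs
    model: str     # model identifier
    instance: str  # instance type
    status: str    # deployment outcome


def to_json(record: DeploymentRecord) -> str:
    """Serialize a record so it can be exported or replayed later."""
    return json.dumps(asdict(record), sort_keys=True)


def from_json(text: str) -> DeploymentRecord:
    """Rebuild a record from its JSON form."""
    return DeploymentRecord(**json.loads(text))


# Sample values are illustrative only.
record = DeploymentRecord(
    config={"deployTarget": "managed-inference"},
    model="org/some-model",
    instance="ml.g5.xlarge",
    status="success",
)
restored = from_json(to_json(record))
```

A round-trippable encoding of this kind is what would make operations such as export, import, and replay straightforward, whatever format the tool actually uses.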