Release Notes — @aws/ml-container-creator v0.2.3

⚠️ Breaking Changes

Package renamed: @aws/generator-ml-container-creator → @aws/ml-container-creator
Install command changed: npm install -g yo @aws/generator-ml-container-creator → npm install -g @aws/ml-container-creator
Run command changed: yo @aws/ml-container-creator → ml-container-creator
npx support added: npx @aws/ml-container-creator --help runs without a global install

Standalone CLI (Yeoman Removal)

Removes the Yeoman framework entirely. The tool is now a standalone Node.js CLI that uses Commander.js for option parsing and @inquirer/prompts for interactive prompts. Templates are rendered with ejs, using tinyglobby to find template files and Node's fs module for file I/O.

Eliminates the yo peer dependency and the generator- naming convention
Removes all CI root-user workarounds (YEOMAN_ALLOW_ROOT, runuser, useradd)
Moves source from generators/app/ → src/ + templates/
Replaces yeoman-test/yeoman-assert with custom test helpers

Deployment Targets

Target	Description
SageMaker Managed Inference	Real-time endpoints with inference components
SageMaker Async Inference	Asynchronous inference for long-running predictions
SageMaker Batch Transform	Batch processing of large datasets
SageMaker HyperPod EKS	Kubernetes-based deployment on HyperPod clusters

Serving Architectures

Architecture	Model Servers
HTTP (traditional ML)	Flask, FastAPI
Transformers (LLMs)	vLLM, SGLang, TensorRT-LLM, DJL/LMI
Triton (multi-framework)	FIL, ONNX Runtime, TensorFlow, PyTorch, Python, vLLM, TensorRT-LLM
Diffusors (image generation)	vLLM Omni

MCP Server Ecosystem

Five bundled MCP servers assist with configuration choices:

model-picker — Resolves models from HuggingFace, SageMaker JumpStart, Bedrock Marketplace, TensorFlow Hub, ONNX Model Zoo
base-image-picker — Selects appropriate container base images per model server and CUDA version
instance-recommender — Recommends SageMaker instance types based on model requirements
region-picker — Filters AWS regions by service availability with Bedrock-powered reasoning
hyperpod-cluster-picker — Discovers available HyperPod EKS clusters

All catalog data is now externalized to JSON files in servers/*/catalogs/.

do-framework Integration

Generated projects include a complete do/ script suite:

Script	Purpose
`do/build`	Build Docker image (linux/amd64)
`do/push`	Push to Amazon ECR
`do/deploy`	Deploy to SageMaker (idempotent, credential-aware)
`do/submit`	Submit build to CodeBuild
`do/test`	Test deployed endpoint
`do/clean`	Tear down deployed resources
`do/export`	Export config as CLI flags or JSON
`do/register`	Register to local deployment registry (+ optional CI)
`do/ci`	CI report, status, trigger, dashboard
`do/manifest`	Track deployed AWS resource ARNs
`do/logs`	Tail CloudWatch logs
`do/run`	Run container locally

Bootstrap Infrastructure

ml-container-creator bootstrap provisions shared AWS infrastructure:

IAM execution role with SageMaker permissions (CloudFormation-managed)
ECR repository for container images
S3 buckets for async/batch workloads
Named profiles persisted to ~/.ml-container-creator/config.json
bootstrap status --verify for drift detection
bootstrap scan for discovering pre-existing tagged resources

CI Integration Harness

Serverless CI system for automated end-to-end validation:

CDK stack: DynamoDB, Lambda scanner, SQS, EventBridge Pipe, Step Functions, CodeBuild
Hourly scan for pending test configurations
Two build strategies: codebuild-submit and docker-in-docker
do/register --ci publishes configs to DynamoDB
do/ci report shows test matrix coverage

CLI Configuration Parameters

Granular control over deployment infrastructure:

--endpoint-* flags (instance count, data capture, volume size)
--ic-* flags (CPU, memory, GPU count, copy count, model weight)
--model-env KEY=VALUE (model-level environment variables)
--server-env KEY=VALUE (engine-prefixed server environment variables)
Schema-driven validation via bundled parameter schema
Full propagation through do/config, do/export, do/register, and Dockerfile
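
As a rough illustration of how repeated KEY=VALUE arguments such as those given to --model-env or --server-env could be collected into a map, here is a minimal sketch. The helper name parseEnvPairs and the sample variable names are hypothetical and not taken from the tool's source. The real CLI also validates values against its bundled parameter schema, which this sketch does not attempt.

```typescript
// Hypothetical helper (not the tool's actual code): turn repeated
// KEY=VALUE arguments into a plain key/value map.
type EnvMap = Record<string, string>;

function parseEnvPairs(pairs: string[]): EnvMap {
  const env: EnvMap = {};
  for (const pair of pairs) {
    const idx = pair.indexOf("=");
    // Reject entries with no "=" or with an empty key.
    if (idx <= 0) {
      throw new Error(`Expected KEY=VALUE, got "${pair}"`);
    }
    const key = pair.slice(0, idx);
    // Split on the first "=" only, so the value may itself contain "=".
    const value = pair.slice(idx + 1);
    env[key] = value;
  }
  return env;
}

// Illustrative input only; these variable names are made up.
const sampleEnv = parseEnvPairs(["MAX_TOKENS=4096", "EXTRA_OPTS=mode=fast"]);
console.log(sampleEnv);
```

Splitting on the first "=" only is a deliberate choice here: it lets a value carry its own "=" characters without extra quoting rules.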

Model Loading Adapter

Generated serve scripts resolve provider-prefixed model URIs at runtime:

s3://bucket/path → Downloads from S3 to /opt/ml/model
jumpstart://model-id → Resolves via JumpStart manifest
registry://arn:... → Resolves via DescribeModelPackage
huggingface-id → Passed directly to serving engine
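
The prefix dispatch listed above can be sketched as a small classifier. This is only an illustration of the routing idea, not the generated serve script: the type and function names are hypothetical, and the actual S3 download, JumpStart manifest lookup, and DescribeModelPackage call are left out.

```typescript
// Hypothetical types and names; the generated serve scripts may differ.
type ModelSource =
  | { kind: "s3"; bucket: string; key: string }
  | { kind: "jumpstart"; modelId: string }
  | { kind: "registry"; arn: string }
  | { kind: "huggingface"; modelId: string };

function classifyModelUri(uri: string): ModelSource {
  if (uri.startsWith("s3://")) {
    // Split "bucket/path" into bucket and object key.
    const rest = uri.slice("s3://".length);
    const slash = rest.indexOf("/");
    const bucket = slash === -1 ? rest : rest.slice(0, slash);
    const key = slash === -1 ? "" : rest.slice(slash + 1);
    return { kind: "s3", bucket, key };
  }
  if (uri.startsWith("jumpstart://")) {
    return { kind: "jumpstart", modelId: uri.slice("jumpstart://".length) };
  }
  if (uri.startsWith("registry://")) {
    return { kind: "registry", arn: uri.slice("registry://".length) };
  }
  // No recognized prefix: treat the string as a HuggingFace model ID
  // and hand it to the serving engine unchanged.
  return { kind: "huggingface", modelId: uri };
}

console.log(classifyModelUri("s3://bucket/path"));
console.log(classifyModelUri("jumpstart://model-id"));
```

A tagged union keeps each branch's payload explicit, so a caller switching on kind cannot forget to handle one of the four providers.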

Deployment Registry

Captures successful deployments as structured records:

do/register logs deployment config, model, instance, status
ml-container-creator registry list/search/export/import/replay
Per-profile asset manifests track deployed AWS resource ARNs
bootstrap status shows deployed resources with drift detection
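
To make the idea of a structured deployment record concrete, here is a minimal sketch of one possible record shape and a search over it. The field names follow the four items that do/register is said to log (config, model, instance, status), but the interface, the searchRecords function, and the sample values are assumptions for illustration. The actual registry schema is not described in these notes.

```typescript
// Hypothetical record shape; the real registry format may differ.
interface DeploymentRecord {
  model: string;
  instanceType: string;
  status: string;
  config: Record<string, string>;
}

// Case-insensitive substring match on model or instance type.
function searchRecords(
  records: DeploymentRecord[],
  term: string,
): DeploymentRecord[] {
  const needle = term.toLowerCase();
  return records.filter(
    (r) =>
      r.model.toLowerCase().includes(needle) ||
      r.instanceType.toLowerCase().includes(needle),
  );
}

// Illustrative data only.
const sampleRecords: DeploymentRecord[] = [
  { model: "example-org/example-model", instanceType: "ml.g5.xlarge", status: "success", config: {} },
  { model: "example-org/other-model", instanceType: "ml.m5.large", status: "success", config: {} },
];
console.log(searchRecords(sampleRecords, "g5").length);
```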
