Goodbye-Yeoman

@dferguson992 released this 07 May 20:02 · 49 commits to main since this release · 746f9c0

Release Notes — @aws/ml-container-creator v0.2.3

⚠️ Breaking Changes

  • Package renamed: @aws/generator-ml-container-creator → @aws/ml-container-creator
  • Install command changed: npm install -g yo @aws/generator-ml-container-creator → npm install -g @aws/ml-container-creator
  • Run command changed: yo @aws/ml-container-creator → ml-container-creator
  • npx support: npx @aws/ml-container-creator --help (zero-install)

Standalone CLI (Yeoman Removal)

Removes the Yeoman framework entirely. The tool is now a standalone Node.js CLI using Commander.js for option parsing and @inquirer/prompts for interactive prompts. Template rendering uses ejs + tinyglobby + fs directly.

  • Eliminates yo peer dependency and generator- naming convention
  • Removes all CI root-user workarounds (YEOMAN_ALLOW_ROOT, runuser, useradd)
  • Moves source from generators/app/ → src/ + templates/
  • Replaces yeoman-test/yeoman-assert with custom test helpers

Deployment Targets

Target                      | Description
SageMaker Managed Inference | Real-time endpoints with inference components
SageMaker Async Inference   | Asynchronous inference for long-running predictions
SageMaker Batch Transform   | Batch processing of large datasets
SageMaker HyperPod EKS      | Kubernetes-based deployment on HyperPod clusters

Serving Architectures

Architecture                 | Model Servers
HTTP (traditional ML)        | Flask, FastAPI
Transformers (LLMs)          | vLLM, SGLang, TensorRT-LLM, DJL/LMI
Triton (multi-framework)     | FIL, ONNX Runtime, TensorFlow, PyTorch, Python, vLLM, TensorRT-LLM
Diffusors (image generation) | vLLM Omni

MCP Server Ecosystem

Five bundled MCP servers provide intelligent configuration assistance:

  • model-picker — Resolves models from HuggingFace, SageMaker JumpStart, Bedrock Marketplace, TensorFlow Hub, ONNX Model Zoo
  • base-image-picker — Selects appropriate container base images per model server and CUDA version
  • instance-recommender — Recommends SageMaker instance types based on model requirements
  • region-picker — Filters AWS regions by service availability with Bedrock-powered reasoning
  • hyperpod-cluster-picker — Discovers available HyperPod EKS clusters

All catalog data is externalized to JSON files in servers/*/catalogs/.

do-framework Integration

Generated projects include a complete do/ script suite:

Script      | Purpose
do/build    | Build Docker image (linux/amd64)
do/push     | Push to Amazon ECR
do/deploy   | Deploy to SageMaker (idempotent, credential-aware)
do/submit   | Submit build to CodeBuild
do/test     | Test deployed endpoint
do/clean    | Tear down deployed resources
do/export   | Export config as CLI flags or JSON
do/register | Register to local deployment registry (+ optional CI)
do/ci       | CI report, status, trigger, dashboard
do/manifest | Track deployed AWS resource ARNs
do/logs     | Tail CloudWatch logs
do/run      | Run container locally

Bootstrap Infrastructure

ml-container-creator bootstrap provisions shared AWS infrastructure:

  • IAM execution role with SageMaker permissions (CloudFormation-managed)
  • ECR repository for container images
  • S3 buckets for async/batch workloads
  • Named profiles persisted to ~/.ml-container-creator/config.json
  • bootstrap status --verify for drift detection
  • bootstrap scan for discovering pre-existing tagged resources

CI Integration Harness

Serverless CI system for automated end-to-end validation:

  • CDK stack: DynamoDB, Lambda scanner, SQS, EventBridge Pipe, Step Functions, CodeBuild
  • Hourly scan for pending test configurations
  • Two build strategies: codebuild-submit and docker-in-docker
  • do/register --ci publishes configs to DynamoDB
  • do/ci report shows test matrix coverage

CLI Configuration Parameters

Granular control over deployment infrastructure:

  • --endpoint-* flags (instance count, data capture, volume size)
  • --ic-* flags for inference components (CPU, memory, GPU count, copy count, model weight)
  • --model-env KEY=VALUE (model-level environment variables)
  • --server-env KEY=VALUE (engine-prefixed server environment variables)
  • Schema-driven validation via bundled parameter schema
  • Full propagation through do/config, do/export, do/register, and Dockerfile
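
To illustrate the repeatable KEY=VALUE flags listed above, here is a minimal Python sketch of how such flags could be collected into dictionaries. It is not the tool's implementation (the CLI itself is built on Commander.js); the helper names parse_key_value and collect_env and the program name kv-demo are hypothetical, and the sketch assumes each pair is split on the first "=" only.

```python
import argparse


def parse_key_value(pair: str) -> tuple[str, str]:
    """Split one KEY=VALUE argument on the first '=' only."""
    key, sep, value = pair.partition("=")
    if not sep or not key:
        raise argparse.ArgumentTypeError(f"expected KEY=VALUE, got {pair!r}")
    return key, value


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in parser; only the two flag names come from the notes.
    parser = argparse.ArgumentParser(prog="kv-demo")
    # action="append" lets each flag be repeated on the command line.
    parser.add_argument("--model-env", type=parse_key_value,
                        action="append", default=[])
    parser.add_argument("--server-env", type=parse_key_value,
                        action="append", default=[])
    return parser


def collect_env(argv: list[str]) -> dict[str, dict[str, str]]:
    """Return model-level and server-level variables as separate dicts."""
    args = build_parser().parse_args(argv)
    return {"model": dict(args.model_env), "server": dict(args.server_env)}


if __name__ == "__main__":
    # Variable names below are placeholders, not documented settings.
    print(collect_env(["--model-env", "EXAMPLE_KEY=1",
                       "--server-env", "ENGINE_EXAMPLE=a=b"]))
```

Splitting on the first "=" keeps values that themselves contain "=" intact, which matters for things like encoded tokens or nested option strings.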

Model Loading Adapter

Generated serve scripts resolve provider-prefixed model URIs at runtime:

  • s3://bucket/path → Downloads from S3 to /opt/ml/model
  • jumpstart://model-id → Resolves via JumpStart manifest
  • registry://arn:... → Resolves via DescribeModelPackage
  • huggingface-id → Passed directly to serving engine
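
The prefix rules above amount to a small dispatch step. The following Python sketch shows one way that step could look; it only classifies the URI and does not download or resolve anything. ModelSource and classify_model_uri are hypothetical names, and the generated serve scripts may structure this differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSource:
    provider: str   # "s3", "jumpstart", "registry", or "huggingface"
    reference: str  # the URI with its provider prefix removed


# Prefixes taken from the list above; the provider labels are illustrative.
_PREFIXES = {
    "s3://": "s3",
    "jumpstart://": "jumpstart",
    "registry://": "registry",
}


def classify_model_uri(uri: str) -> ModelSource:
    """Map a provider-prefixed model URI to the loader that should handle it."""
    for prefix, provider in _PREFIXES.items():
        if uri.startswith(prefix):
            return ModelSource(provider, uri[len(prefix):])
    # No recognised prefix: treat the value as a HuggingFace model ID
    # and hand it to the serving engine unchanged.
    return ModelSource("huggingface", uri)


if __name__ == "__main__":
    for example in ("s3://bucket/path", "jumpstart://model-id", "org/model-name"):
        print(example, "->", classify_model_uri(example))
```

A real adapter would follow the classification with the provider-specific step named in the list, for example copying S3 objects into /opt/ml/model.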

Deployment Registry

Captures successful deployments as structured records:

  • do/register logs deployment config, model, instance, status
  • ml-container-creator registry list/search/export/import/replay
  • Per-profile asset manifests track deployed AWS resource ARNs
  • bootstrap status shows deployed resources with drift detection
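
As a rough picture of what a structured deployment record might look like, the Python sketch below models the four pieces of information named above (config, model, instance, status) and shows a JSON round trip plus a simple search. The class, field types, and function names are assumptions for illustration; the actual registry schema is not described in these notes.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DeploymentRecord:
    # Field names mirror the items listed above; the types are assumed.
    model: str
    instance: str
    status: str
    config: dict = field(default_factory=dict)


def export_records(records: list[DeploymentRecord]) -> str:
    """Serialize records to a JSON string with stable key order."""
    return json.dumps([asdict(r) for r in records], indent=2, sort_keys=True)


def import_records(text: str) -> list[DeploymentRecord]:
    """Rebuild records from the JSON produced by export_records."""
    return [DeploymentRecord(**item) for item in json.loads(text)]


def search(records: list[DeploymentRecord], term: str) -> list[DeploymentRecord]:
    """Case-insensitive substring match on model and instance."""
    term = term.lower()
    return [r for r in records
            if term in r.model.lower() or term in r.instance.lower()]


if __name__ == "__main__":
    # Placeholder values, not real deployments.
    records = [DeploymentRecord("org/model-name", "ml.g5.xlarge", "success",
                                {"server": "vllm"})]
    restored = import_records(export_records(records))
    print(restored == records, search(restored, "G5"))
```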