awslabs
diff --git a/README.md b/README.md
Lines changed: 106 additions & 0 deletions
diff --git a/TROUBLESHOOTING.md b/TROUBLESHOOTING.md
Lines changed: 222 additions & 0 deletions
diff --git a/docs/EXAMPLES.md b/docs/EXAMPLES.md
Lines changed: 42 additions & 0 deletions
@@ -194,6 +194,7 @@ yo ml-container-creator --config=production.json --skip-prompts
 | `--region=<region>` | AWS region | `us-east-1`, etc. |
 | `--role-arn=<arn>` | AWS IAM role ARN | `arn:aws:iam::123456789012:role/SageMakerRole` |
 | `--project-dir=<dir>` | Output directory path | `./my-project` |
+| `--hf-token=<token>` | HuggingFace authentication token | `hf_abc123...` or `$HF_TOKEN` |
 
 ### Environment Variables Reference
 
@@ -270,6 +271,111 @@ yo ml-container-creator --config=production.json --skip-prompts
    - **SageMaker**: Run `./deploy/deploy.sh your-sagemaker-role-arn`
    - **CodeBuild**: Run `./deploy/submit_build.sh` then `./deploy/deploy.sh your-sagemaker-role-arn`
 
+## 🔐 HuggingFace Authentication
+
+### When is Authentication Needed?
+
+HuggingFace authentication is required for:
+- **Private models**: Models in private repositories
+- **Gated models**: Models requiring user agreement (e.g., Llama 2, Llama 3)
+- **Rate-limited access**: Avoiding rate limits on public models
+
+Public models like `openai/gpt-oss-20b` do not require authentication.
+
+### Providing Your HF_TOKEN
+
+When you manually enter a transformer model ID (rather than selecting one of the examples), the generator prompts for authentication. You can also supply the token through a CLI option or a configuration file:
+
+#### Option 1: Interactive Prompt (Recommended for Local Development)
+
+```
+🔐 HuggingFace Authentication
+⚠️  Security Note: The token will be baked into the Docker image.
+   For CI/CD, consider using "$HF_TOKEN" to reference an environment variable.
+
+? HuggingFace token (enter token, "$HF_TOKEN" for env var, or leave empty):
+```
+
+You can:
+- **Enter your token directly**: `hf_abc123...`
+- **Reference an environment variable**: `$HF_TOKEN`
+- **Leave empty for public models**: (press Enter)
+
+#### Option 2: CLI Option
+
+```bash
+# Direct token
+yo ml-container-creator my-llm-project \
+  --framework=transformers \
+  --model-name=meta-llama/Llama-2-7b-hf \
+  --model-server=vllm \
+  --hf-token=hf_abc123... \
+  --skip-prompts
+
+# Environment variable reference
+yo ml-container-creator my-llm-project \
+  --framework=transformers \
+  --model-name=meta-llama/Llama-2-7b-hf \
+  --model-server=vllm \
+  --hf-token='$HF_TOKEN' \
+  --skip-prompts
+```
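
A minimal sketch of why the second command above single-quotes `$HF_TOKEN`, assuming a POSIX-style shell. The token value is a made-up placeholder. Single quotes hand the generator the literal text `$HF_TOKEN`, whereas double quotes let the shell substitute the real token before the generator ever sees it:

```bash
# Placeholder value for illustration only, not a real token.
export HF_TOKEN=hf_example_value

literal='$HF_TOKEN'     # single quotes: the text $HF_TOKEN is passed through unchanged
expanded="$HF_TOKEN"    # double quotes: the shell substitutes the variable's value

printf 'literal:  %s\n' "$literal"
printf 'expanded: %s\n' "$expanded"
```

With double quotes the real token would end up in the command line (and possibly in shell history), which defeats the purpose of passing a reference.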
+
+#### Option 3: Configuration File
+
+```json
+{
+  "framework": "transformers",
+  "modelName": "meta-llama/Llama-2-7b-hf",
+  "modelServer": "vllm",
+  "hfToken": "$HF_TOKEN"
+}
+```
+
+### Security Best Practices
+
+⚠️ **Important Security Considerations:**
+
+1. **Tokens are baked into the image**: Anyone with access to your Docker image can extract the token, for example with `docker inspect` or `docker history`.
+
+2. **Use environment variable references for CI/CD**:
+   ```bash
+   export HF_TOKEN=hf_your_token_here
+   yo ml-container-creator --framework=transformers --hf-token='$HF_TOKEN' --skip-prompts
+   ```
+
+3. **Never commit tokens to version control**: Use `$HF_TOKEN` in config files, not actual tokens.
+
+4. **Rotate tokens regularly**: Generate new tokens periodically from your HuggingFace account.
+
+5. **Use read-only tokens**: Create tokens with minimal permissions (read-only access to specific models).
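
As a rough illustration of practice 3, the check below looks for a literal `hf_` token in a configuration file before it is committed. `has_literal_token` is a hypothetical helper, not part of ml-container-creator, and it assumes the `hfToken` key shown in the configuration file example:

```bash
# has_literal_token is a hypothetical helper, not part of ml-container-creator.
# It succeeds (exit 0) when the given JSON file stores a literal hf_ token in "hfToken".
has_literal_token() {
  grep -Eq '"hfToken"[[:space:]]*:[[:space:]]*"hf_' "$1"
}

# Demo with a throwaway config that uses the environment variable reference.
demo_config=$(mktemp)
printf '{\n  "hfToken": "$HF_TOKEN"\n}\n' > "$demo_config"

if has_literal_token "$demo_config"; then
  echo "literal token found: do not commit this file"
else
  echo "no literal token found"
fi
rm -f "$demo_config"
```

A check like this could run from a pre-commit hook, though a dedicated secret scanner is likely a better fit for larger repositories.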
+
+### Getting Your HF_TOKEN
+
+1. Go to https://huggingface.co/settings/tokens
+2. Click "New token"
+3. Give it a descriptive name (e.g., "sagemaker-deployment")
+4. Select "Read" access
+5. Copy the token (starts with `hf_`)
+
+### Troubleshooting Authentication
+
+**Error: "Repository not found" or "Access denied"**
+- Verify your token is valid and not expired
+- Ensure you've accepted the model's license agreement on HuggingFace
+- Check that your token has access to the model's organization
+
+**Error: "HF_TOKEN environment variable not set"**
+- You specified `$HF_TOKEN` but the environment variable is not set
+- Set it: `export HF_TOKEN=hf_your_token_here`
+- Or provide the token directly instead of using `$HF_TOKEN`
+
+**Container builds but fails at runtime**
+- The model requires authentication but no token was provided
+- Regenerate the project with the `--hf-token` option, then rebuild the image
+
+For more authentication troubleshooting, see the [Troubleshooting Guide](./TROUBLESHOOTING.md#huggingface-authentication-issues).
+
 ## 🛠️ Requirements
 
 ### For Users
 
@@ -347,6 +347,228 @@ docker build -t my-model .
 docker run my-model pip list | grep xgboost
 ```
 
+## HuggingFace Authentication Issues
+
+### Token Not Working
+
+**Problem:**
+Container fails to download model with authentication error:
+```
+Error: Repository not found or access denied
+```
+
+**Solutions:**
+
+1. **Verify token is valid**:
+   ```bash
+   # Test token with HuggingFace CLI
+   pip install huggingface-hub
+   huggingface-cli login
+   # Enter your token when prompted
+   
+   # Try downloading the model
+   huggingface-cli download meta-llama/Llama-2-7b-hf
+   ```
+
+2. **Check token permissions**:
+   - Go to https://huggingface.co/settings/tokens
+   - Verify token has "Read" access
+   - Ensure token is not expired
+
+3. **Accept model license**:
+   - Visit the model page on HuggingFace (e.g., https://huggingface.co/meta-llama/Llama-2-7b-hf)
+   - Click "Agree and access repository"
+   - Accept the license terms
+
+4. **Rebuild image**:
+   Token changes require rebuilding the Docker image:
+   ```bash
+   docker build -t my-model .
+   ```
+
+### Environment Variable Not Resolved
+
+**Problem:**
+Warning during generation:
+```
+⚠️  Warning: $HF_TOKEN specified but HF_TOKEN environment variable is not set
+```
+
+**Solutions:**
+
+1. **Set the environment variable**:
+   ```bash
+   export HF_TOKEN=hf_your_token_here
+   
+   # Verify it's set (without printing the token)
+   [ -n "$HF_TOKEN" ] && echo "HF_TOKEN is set"
+   ```
+
+2. **Check variable name** (case-sensitive):
+   ```bash
+   # Must be exactly HF_TOKEN, not hf_token or Hf_Token
+   export HF_TOKEN=hf_abc123...
+   ```
+
+3. **Verify in current shell**:
+   ```bash
+   # Check all environment variables
+   env | grep HF_TOKEN
+   
+   # If not found, set it again
+   export HF_TOKEN=hf_your_token_here
+   ```
+
+4. **Make it persistent** (optional):
+   ```bash
+   # Add to your shell profile (~/.bashrc, ~/.zshrc, etc.)
+   echo 'export HF_TOKEN=hf_your_token_here' >> ~/.bashrc
+   source ~/.bashrc
+   ```
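
The checks above can be folded into one pre-flight function to run before the generator. This is a sketch under assumptions: `check_hf_token` is a hypothetical helper, not something the generator ships, and it treats a missing `hf_` prefix as a warning rather than an error:

```bash
# check_hf_token is a hypothetical helper, not part of ml-container-creator.
# It fails when HF_TOKEN is unset or empty and never prints the token itself.
check_hf_token() {
  if [ -z "${HF_TOKEN:-}" ]; then
    echo "HF_TOKEN is not set" >&2
    return 1
  fi
  case "$HF_TOKEN" in
    hf_*) echo "HF_TOKEN is set and starts with hf_" ;;
    *)    echo "HF_TOKEN is set but does not start with hf_" >&2 ;;
  esac
  return 0
}
```

A possible use is `check_hf_token && yo ml-container-creator --hf-token='$HF_TOKEN' --skip-prompts`, so generation only starts when the variable is present.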
+
+### Token Format Warning
+
+**Problem:**
+Warning during generation:
+```
+⚠️  Warning: HuggingFace tokens typically start with "hf_"
+```
+
+**Solutions:**
+
+1. **Verify token format**:
+   - HuggingFace tokens should start with `hf_`
+   - Example: `hf_AbCdEfGhIjKlMnOpQrStUvWxYz1234567890`
+
+2. **Get a new token**:
+   - Go to https://huggingface.co/settings/tokens
+   - Click "New token"
+   - Copy the token (it will start with `hf_`)
+
+3. **If using a valid token**:
+   - The warning is non-blocking
+   - If your token works, you can ignore the warning
+   - The generator will continue with your provided token
+
+### Container Builds But Fails at Runtime
+
+**Problem:**
+Container builds successfully but fails when trying to load the model:
+```
+Error: Failed to download model from HuggingFace Hub
+```
+
+**Solutions:**
+
+1. **Token not provided**:
+   The model requires authentication but no token was provided during generation.
+   
+   Regenerate the project with a token, then rebuild the image:
+   ```bash
+   yo ml-container-creator my-llm-project \
+     --framework=transformers \
+     --model-name=meta-llama/Llama-2-7b-hf \
+     --model-server=vllm \
+     --hf-token=hf_your_token_here \
+     --skip-prompts
+   ```
+
+2. **Token expired**:
+   Generate a new token and rebuild:
+   ```bash
+   # Get new token from https://huggingface.co/settings/tokens
+   docker build -t my-model . --build-arg HF_TOKEN=hf_new_token
+   ```
+
+3. **Model requires agreement**:
+   - Visit the model page on HuggingFace
+   - Accept the license agreement
+   - Wait a few minutes for access to be granted
+   - Rebuild the container
+
+### Token Visible in Docker Image
+
+**Problem:**
+You are concerned that the token baked into the Docker image can be extracted.
+
+**Solutions:**
+
+1. **Use environment variable reference** (recommended for CI/CD):
+   ```bash
+   # Set the environment variable first so the generator can resolve the reference
+   export HF_TOKEN=hf_your_token_here
+   
+   # During generation, use $HF_TOKEN reference
+   yo ml-container-creator --hf-token='$HF_TOKEN' --skip-prompts
+   ```
+
+2. **Restrict image access**:
+   ```bash
+   # Use private ECR repository
+   aws ecr create-repository --repository-name my-private-models
+   
+   # Set repository policy to restrict access
+   aws ecr set-repository-policy \
+     --repository-name my-private-models \
+     --policy-text file://policy.json
+   ```
+
+3. **Rotate tokens regularly**:
+   - Generate new tokens periodically
+   - Revoke old tokens
+   - Rebuild images with new tokens
+
+4. **Use read-only tokens**:
+   - Create tokens with minimal permissions
+   - Only grant "Read" access
+   - Limit scope to specific models/organizations
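
The `policy.json` passed to `aws ecr set-repository-policy` in solution 2 is not shown. Below is a minimal sketch of what it might contain, assuming the goal is to let a single IAM role pull the image. The account ID and role ARN are placeholders reused from the README examples, and the statement ID is made up:

```bash
# Write an example ECR repository policy. The principal is a placeholder role ARN.
cat > policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowPullForOneRole",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:role/SageMakerRole"
      },
      "Action": [
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage",
        "ecr:BatchCheckLayerAvailability"
      ]
    }
  ]
}
EOF
```

Whether this is restrictive enough depends on the IAM identity policies in your account, since those can grant ECR access independently of the repository policy.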
+
+### Model Access Denied
+
+**Problem:**
+```
+Error: You don't have permission to access this model
+```
+
+**Solutions:**
+
+1. **Check organization membership**:
+   - Some models require organization membership
+   - Contact the model owner for access
+
+2. **Verify token scope**:
+   - Token may not have access to the specific model
+   - Create a new token with appropriate permissions
+
+3. **Check model visibility**:
+   - Model may be private or restricted
+   - Verify you have access on HuggingFace website
+
+### Rate Limiting Issues
+
+**Problem:**
+```
+Error: Rate limit exceeded
+```
+
+**Solutions:**
+
+1. **Use authentication**:
+   Authenticated requests have higher rate limits:
+   ```bash
+   yo ml-container-creator --hf-token=hf_your_token_here --skip-prompts
+   ```
+
+2. **Wait and retry**:
+   - Rate limits reset after a period
+   - Wait a few minutes and try again
+
+3. **Use cached models**:
+   - Download model once and cache it
+   - Upload to S3 and reference in SageMaker
+
+For more information on HuggingFace authentication, see the [README](./README.md#-huggingface-authentication).
+
 ## Performance Issues
 
 ### Slow Predictions
 
@@ -230,6 +230,48 @@ You want to deploy a Llama 2 7B model using vLLM for efficient inference.
 - Test types: `hosted-model-endpoint` (only option)
 - Instance type: `gpu-enabled` (required)
 
+### Example Model IDs
+
+When generating a transformer project, you'll be prompted to select a model. The generator offers several example models for which it **does not prompt for a HuggingFace token**, plus an option to enter a model ID manually:
+
+**Available Example Models:**
+- `openai/gpt-oss-20b` - Open-source GPT model (no authentication required)
+- `meta-llama/Llama-3.2-3B-Instruct` - Llama 3.2 3B instruction-tuned model
+- `meta-llama/Llama-3.2-1B-Instruct` - Llama 3.2 1B instruction-tuned model
+- `Custom (enter manually)` - Enter any model ID manually
+
+**Important Notes:**
+
+1. **Example models skip authentication prompts**: If you select one of the pre-configured example models, you will NOT be prompted for a HuggingFace token. These models are publicly accessible.
+
+2. **Custom models may require authentication**: If you select "Custom (enter manually)" and enter a model ID for a private or gated model (like `meta-llama/Llama-2-7b-hf`), you will be prompted for your HuggingFace token.
+
+3. **Case-insensitive matching**: Model ID matching is case-insensitive. For example, `OPENAI/GPT-OSS-20B` will be recognized as an example model.
+
+4. **When to use custom models**:
+   - Private models in your HuggingFace account
+   - Gated models requiring license agreement (e.g., Llama 2)
+   - Models not in the example list
+   - Fine-tuned models
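
The case-insensitive matching in note 3 can be pictured with the sketch below. It is a hypothetical re-creation in shell using the three example IDs listed above; the generator's actual implementation is not shown in this change and may differ:

```bash
# is_example_model is a hypothetical re-creation of the matching rule,
# not the generator's real code.
is_example_model() {
  # Lowercase the input so the comparison ignores case.
  model_id=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$model_id" in
    openai/gpt-oss-20b|meta-llama/llama-3.2-3b-instruct|meta-llama/llama-3.2-1b-instruct)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

if is_example_model "OPENAI/GPT-OSS-20B"; then
  echo "example model: no token prompt expected"
fi
```

Under this rule, `meta-llama/Llama-2-7b-hf` does not match, which is consistent with the custom-model flow that prompts for a token.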
+
+**Example: Using a Custom Model with Authentication**
+
+```bash
+yo ml-container-creator
+
+# When prompted:
+? Which model do you want to use? Custom (enter manually)
+? Enter the model name: meta-llama/Llama-2-7b-hf
+
+🔐 HuggingFace Authentication
+⚠️  Security Note: The token will be baked into the Docker image.
+   For CI/CD, consider using "$HF_TOKEN" to reference an environment variable.
+
+? HuggingFace token (enter token, "$HF_TOKEN" for env var, or leave empty): hf_abc123...
+```
+
+For more information on HuggingFace authentication, see the [HuggingFace Authentication section](configuration.md#huggingface-authentication) in the Configuration Guide.
+
 ### Step 2: Prepare Model Files
 
 Option A: Download from Hugging Face Hub