Commit 53ce9a5

Merge pull request #45 from dferguson992/main
Added built-in support for HuggingFace tokens
2 parents 3c96bc0 + 7d44307 commit 53ce9a5

23 files changed

Lines changed: 6008 additions & 35 deletions

README.md

Lines changed: 106 additions & 0 deletions
@@ -194,6 +194,7 @@ yo ml-container-creator --config=production.json --skip-prompts
| `--region=<region>` | AWS region | `us-east-1`, etc. |
| `--role-arn=<arn>` | AWS IAM role ARN | `arn:aws:iam::123456789012:role/SageMakerRole` |
| `--project-dir=<dir>` | Output directory path | `./my-project` |
| `--hf-token=<token>` | HuggingFace authentication token | `hf_abc123...` or `$HF_TOKEN` |

### Environment Variables Reference

@@ -270,6 +271,111 @@ yo ml-container-creator --config=production.json --skip-prompts
- **SageMaker**: Run `./deploy/deploy.sh your-sagemaker-role-arn`
- **CodeBuild**: Run `./deploy/submit_build.sh` then `./deploy/deploy.sh your-sagemaker-role-arn`

## 🔐 HuggingFace Authentication

### When is Authentication Needed?

HuggingFace authentication is required for:
- **Private models**: Models in private repositories
- **Gated models**: Models requiring user agreement (e.g., Llama 2, Llama 3)

It is optional but useful for:
- **Rate-limited access**: Authenticated requests are less likely to hit rate limits on public models

Public models like `openai/gpt-oss-20b` do not require authentication.

### Providing Your HF_TOKEN

When you manually enter a transformer model ID (instead of selecting one of the example models), the generator prompts for a token. You can answer the prompt, or supply the token through a CLI option or a configuration file.

#### Option 1: Interactive Prompt (Recommended for Local Development)

```
🔐 HuggingFace Authentication
⚠️ Security Note: The token will be baked into the Docker image.
For CI/CD, consider using "$HF_TOKEN" to reference an environment variable.

? HuggingFace token (enter token, "$HF_TOKEN" for env var, or leave empty):
```

You can:
- **Enter your token directly**: `hf_abc123...`
- **Reference an environment variable**: `$HF_TOKEN`
- **Leave empty for public models**: (press Enter)

#### Option 2: CLI Option

```bash
# Direct token
yo ml-container-creator my-llm-project \
  --framework=transformers \
  --model-name=meta-llama/Llama-2-7b-hf \
  --model-server=vllm \
  --hf-token=hf_abc123... \
  --skip-prompts

# Environment variable reference
# (single quotes keep the shell from expanding $HF_TOKEN before the generator sees it)
yo ml-container-creator my-llm-project \
  --framework=transformers \
  --model-name=meta-llama/Llama-2-7b-hf \
  --model-server=vllm \
  --hf-token='$HF_TOKEN' \
  --skip-prompts
```

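The quoting around `$HF_TOKEN` decides what the generator actually receives. The sketch below shows the difference between single and double quotes, plus one way a literal `$HF_TOKEN` reference could be resolved later. `resolve_hf_token` is a hypothetical helper written for illustration; it is not taken from the generator's source.

```bash
# Hypothetical helper: turn a token argument into the value to use.
# A literal "$HF_TOKEN" is treated as a reference to the environment variable.
resolve_hf_token() {
  local arg="$1"
  if [ "$arg" = '$HF_TOKEN' ]; then
    # Reference: look the variable up at resolution time.
    if [ -z "${HF_TOKEN:-}" ]; then
      echo "warning: HF_TOKEN is not set" >&2
      return 1
    fi
    printf '%s\n' "$HF_TOKEN"
  else
    # Anything else is taken as the token itself (possibly empty).
    printf '%s\n' "$arg"
  fi
}

export HF_TOKEN=hf_example_value   # placeholder, not a real token

# Single quotes: the argument is the literal string $HF_TOKEN.
literal='$HF_TOKEN'
# Double quotes: the shell substitutes the value before the command runs.
expanded="$HF_TOKEN"

echo "single-quoted argument: $literal"
echo "double-quoted argument: $expanded"
echo "resolved reference:     $(resolve_hf_token "$literal")"
```

With double quotes (or no quotes) the secret itself ends up on the command line, which defeats the purpose of passing a reference.
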
#### Option 3: Configuration File

```json
{
  "framework": "transformers",
  "modelName": "meta-llama/Llama-2-7b-hf",
  "modelServer": "vllm",
  "hfToken": "$HF_TOKEN"
}
```

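Configuration files tend to end up in version control, so it can help to check that `hfToken` holds a reference rather than a real token. The following is a minimal sketch of such a check. The function name is hypothetical, and the pattern assumes tokens look like `hf_` followed by letters and digits.

```bash
# Hypothetical pre-commit style check: exit 0 if the file contains something
# that looks like a literal HuggingFace token.
# Assumption: tokens look like "hf_" followed by at least 8 letters or digits.
contains_literal_hf_token() {
  grep -Eq 'hf_[A-Za-z0-9]{8,}' "$1"
}

# Demonstration with two throwaway config files.
tmpdir=$(mktemp -d)

cat > "$tmpdir/safe.json" <<'EOF'
{
  "framework": "transformers",
  "modelName": "meta-llama/Llama-2-7b-hf",
  "modelServer": "vllm",
  "hfToken": "$HF_TOKEN"
}
EOF

cat > "$tmpdir/unsafe.json" <<'EOF'
{
  "framework": "transformers",
  "hfToken": "hf_AbCdEfGhIjKlMnOp1234"
}
EOF

for f in "$tmpdir/safe.json" "$tmpdir/unsafe.json"; do
  if contains_literal_hf_token "$f"; then
    echo "$(basename "$f"): literal token found, do not commit"
  else
    echo "$(basename "$f"): ok"
  fi
done
```
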
### Security Best Practices

⚠️ **Important Security Considerations:**

1. **Tokens are baked into the image**: Anyone with access to your Docker image can extract the token using `docker inspect`.

2. **Use environment variable references for CI/CD**:
   ```bash
   export HF_TOKEN=hf_your_token_here
   yo ml-container-creator --framework=transformers --hf-token='$HF_TOKEN' --skip-prompts
   ```

3. **Never commit tokens to version control**: Use `$HF_TOKEN` in config files, not actual tokens.

4. **Rotate tokens regularly**: Generate new tokens periodically from your HuggingFace account.

5. **Use read-only tokens**: Create tokens with minimal permissions (read-only access to specific models).

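The first point can be checked directly. The sketch below assumes the generated image stores the token as an environment variable named `HF_TOKEN`, which this document does not confirm. The helper name is hypothetical, and it reads `KEY=VALUE` lines from standard input so it can be tried without Docker.

```bash
# Hypothetical helper: read KEY=VALUE lines on stdin and report whether a
# non-empty HF_TOKEN entry is present. Assumes the token is stored in the
# image as an environment variable named HF_TOKEN.
image_env_has_token() {
  grep -Eq '^HF_TOKEN=.+'
}

# Stand-in for an image's environment list, so the sketch runs without Docker.
sample_env='PATH=/usr/local/bin:/usr/bin
HF_TOKEN=hf_example_value
MODEL_NAME=meta-llama/Llama-2-7b-hf'

if printf '%s\n' "$sample_env" | image_env_has_token; then
  echo "token is readable from the image configuration"
else
  echo "no HF_TOKEN entry found"
fi
```

Against a real image, the same helper could be fed from `docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' my-model`, where `my-model` is a placeholder image name.
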
### Getting Your HF_TOKEN

1. Go to https://huggingface.co/settings/tokens
2. Click "New token"
3. Give it a descriptive name (e.g., "sagemaker-deployment")
4. Select "Read" access
5. Copy the token (starts with `hf_`)

### Troubleshooting Authentication

**Error: "Repository not found" or "Access denied"**
- Verify your token is valid and not expired
- Ensure you've accepted the model's license agreement on HuggingFace
- Check that your token has access to the model's organization

**Error: "HF_TOKEN environment variable not set"**
- You specified `$HF_TOKEN` but the environment variable is not set
- Set it: `export HF_TOKEN=hf_your_token_here`
- Or provide the token directly instead of using `$HF_TOKEN`

**Container builds but fails at runtime**
- The model requires authentication but no token was provided
- Regenerate the project with the `--hf-token` option, then rebuild the image

For more authentication troubleshooting, see the [Troubleshooting Guide](./TROUBLESHOOTING.md#huggingface-authentication-issues).

## 🛠️ Requirements

### For Users

TROUBLESHOOTING.md

Lines changed: 222 additions & 0 deletions
@@ -347,6 +347,228 @@ docker build -t my-model .
docker run my-model pip list | grep xgboost
```

## HuggingFace Authentication Issues

### Token Not Working

**Problem:**
The container fails to download the model with an authentication error:
```bash
Error: Repository not found or access denied
```

**Solutions:**

1. **Verify token is valid**:
   ```bash
   # Test token with HuggingFace CLI
   pip install huggingface-hub
   huggingface-cli login
   # Enter your token when prompted

   # Try downloading the model
   huggingface-cli download meta-llama/Llama-2-7b-hf
   ```

2. **Check token permissions**:
   - Go to https://huggingface.co/settings/tokens
   - Verify token has "Read" access
   - Ensure token is not expired

3. **Accept model license**:
   - Visit the model page on HuggingFace (e.g., https://huggingface.co/meta-llama/Llama-2-7b-hf)
   - Click "Agree and access repository"
   - Accept the license terms

4. **Rebuild image**:
   Token changes require rebuilding the Docker image:
   ```bash
   docker build -t my-model .
   ```

### Environment Variable Not Resolved

**Problem:**
Warning during generation:
```bash
⚠️ Warning: $HF_TOKEN specified but HF_TOKEN environment variable is not set
```

**Solutions:**

1. **Set the environment variable**:
   ```bash
   export HF_TOKEN=hf_your_token_here

   # Verify it's set
   echo $HF_TOKEN
   ```

2. **Check variable name** (case-sensitive):
   ```bash
   # Must be exactly HF_TOKEN, not hf_token or Hf_Token
   export HF_TOKEN=hf_abc123...
   ```

3. **Verify in current shell**:
   ```bash
   # Check all environment variables
   env | grep HF_TOKEN

   # If not found, set it again
   export HF_TOKEN=hf_your_token_here
   ```

4. **Make it persistent** (optional):
   ```bash
   # Add to your shell profile (~/.bashrc, ~/.zshrc, etc.)
   echo 'export HF_TOKEN=hf_your_token_here' >> ~/.bashrc
   source ~/.bashrc
   ```

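The checks above can be combined into a single pre-flight function to run before invoking the generator. This is a sketch only: `check_hf_token_env` is a hypothetical name, and unlike `echo $HF_TOKEN` it reports whether the variable is set without printing the secret.

```bash
# Hypothetical pre-flight check to run before the generator.
# Succeeds only if HF_TOKEN is set and non-empty; otherwise it looks for the
# mis-cased names mentioned above and prints a hint.
check_hf_token_env() {
  local name
  if [ -n "${HF_TOKEN:-}" ]; then
    echo "HF_TOKEN is set"
    return 0
  fi
  # Variable names are case-sensitive, so hf_token or Hf_Token do not count.
  for name in hf_token Hf_Token; do
    if env | grep -q "^${name}="; then
      echo "found ${name}; rename it to HF_TOKEN" >&2
    fi
  done
  echo "HF_TOKEN is not set; run: export HF_TOKEN=<your token>" >&2
  return 1
}

check_hf_token_env || echo "fix the environment before generating"
```
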
### Token Format Warning

**Problem:**
Warning during generation:
```bash
⚠️ Warning: HuggingFace tokens typically start with "hf_"
```

**Solutions:**

1. **Verify token format**:
   - HuggingFace tokens should start with `hf_`
   - Example: `hf_AbCdEfGhIjKlMnOpQrStUvWxYz1234567890`

2. **Get a new token**:
   - Go to https://huggingface.co/settings/tokens
   - Click "New token"
   - Copy the token (it will start with `hf_`)

3. **If using a valid token**:
   - The warning is non-blocking
   - If your token works, you can ignore the warning
   - The generator will continue with your provided token

453+
### Container Builds But Fails at Runtime
454+
455+
**Problem:**
456+
Container builds successfully but fails when trying to load the model:
457+
```bash
458+
Error: Failed to download model from HuggingFace Hub
459+
```
460+
461+
**Solutions:**
462+
463+
1. **Token not provided**:
464+
The model requires authentication but no token was provided during generation.
465+
466+
Rebuild with token:
467+
```bash
468+
yo ml-container-creator my-llm-project \
469+
--framework=transformers \
470+
--model-name=meta-llama/Llama-2-7b-hf \
471+
--model-server=vllm \
472+
--hf-token=hf_your_token_here \
473+
--skip-prompts
474+
```
475+
476+
2. **Token expired**:
477+
Generate a new token and rebuild:
478+
```bash
479+
# Get new token from https://huggingface.co/settings/tokens
480+
docker build -t my-model . --build-arg HF_TOKEN=hf_new_token
481+
```
482+
483+
3. **Model requires agreement**:
484+
- Visit the model page on HuggingFace
485+
- Accept the license agreement
486+
- Wait a few minutes for access to be granted
487+
- Rebuild the container
488+
### Token Visible in Docker Image

**Problem:**
You are concerned that the token can be read from the Docker image.

**Solutions:**

1. **Use environment variable reference** (recommended for CI/CD):
   ```bash
   # During generation, use $HF_TOKEN reference
   yo ml-container-creator --hf-token='$HF_TOKEN' --skip-prompts

   # Set environment variable before building
   export HF_TOKEN=hf_your_token_here
   ```

2. **Restrict image access**:
   ```bash
   # Use private ECR repository
   aws ecr create-repository --repository-name my-private-models

   # Set repository policy to restrict access
   aws ecr set-repository-policy \
     --repository-name my-private-models \
     --policy-text file://policy.json
   ```

3. **Rotate tokens regularly**:
   - Generate new tokens periodically
   - Revoke old tokens
   - Rebuild images with new tokens

4. **Use read-only tokens**:
   - Create tokens with minimal permissions
   - Only grant "Read" access
   - Limit scope to specific models/organizations

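The `policy.json` file passed to `aws ecr set-repository-policy` is not shown above. As a rough illustration, the sketch below writes a repository policy that allows a single IAM role to pull images. The account ID and role name are placeholders borrowed from the README's example ARN, and the statement ID is made up.

```bash
# Write an example repository policy that lets a single IAM role pull images.
# The account ID and role name are placeholders taken from the README's
# example ARN; substitute your own.
cat > policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowPullFromSageMakerRole",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:role/SageMakerRole"
      },
      "Action": [
        "ecr:BatchCheckLayerAvailability",
        "ecr:BatchGetImage",
        "ecr:GetDownloadUrlForLayer"
      ]
    }
  ]
}
EOF

echo "wrote policy.json"
```

This policy grants pull access to one role. On its own it does not deny other principals in the same account that already hold ECR permissions through IAM, so review those identity policies as well.
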
### Model Access Denied

**Problem:**
```bash
Error: You don't have permission to access this model
```

**Solutions:**

1. **Check organization membership**:
   - Some models require organization membership
   - Contact the model owner for access

2. **Verify token scope**:
   - Token may not have access to the specific model
   - Create a new token with appropriate permissions

3. **Check model visibility**:
   - Model may be private or restricted
   - Verify you have access on HuggingFace website

### Rate Limiting Issues

**Problem:**
```bash
Error: Rate limit exceeded
```

**Solutions:**

1. **Use authentication**:
   Authenticated requests have higher rate limits:
   ```bash
   yo ml-container-creator --hf-token=hf_your_token_here --skip-prompts
   ```

2. **Wait and retry**:
   - Rate limits reset after a period
   - Wait a few minutes and try again

3. **Use cached models**:
   - Download model once and cache it
   - Upload to S3 and reference in SageMaker

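"Wait and retry" can be automated with exponential backoff. The helper below is a generic sketch with a hypothetical name, not part of the generator. The demonstration uses a stand-in command that fails twice so that it runs without network access.

```bash
# Hypothetical helper: run a command, retrying with exponentially growing
# pauses. Usage: retry_with_backoff <max_attempts> <first_delay_seconds> <command...>
retry_with_backoff() {
  local max_attempts="$1"
  local delay="$2"
  local attempt=1
  shift 2
  while true; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# Demonstration: a stand-in command that fails twice, then succeeds.
counter_file=$(mktemp)
echo 0 > "$counter_file"
flaky_download() {
  local n
  n=$(( $(cat "$counter_file") + 1 ))
  echo "$n" > "$counter_file"
  [ "$n" -ge 3 ]
}

# A delay of 0 keeps the demonstration instant.
retry_with_backoff 5 0 flaky_download && echo "succeeded on attempt $(cat "$counter_file")"
```

In practice the wrapped command might be something like `retry_with_backoff 5 30 huggingface-cli download meta-llama/Llama-2-7b-hf`, with the first delay chosen to suit the rate limit you are hitting.
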
For more information on HuggingFace authentication, see the [README](./README.md#-huggingface-authentication).

## Performance Issues

### Slow Predictions

docs/EXAMPLES.md

Lines changed: 42 additions & 0 deletions
@@ -230,6 +230,48 @@ You want to deploy a Llama 2 7B model using vLLM for efficient inference.
- Test types: `hosted-model-endpoint` (only option)
- Instance type: `gpu-enabled` (required)

### Example Model IDs

When generating a transformer project, you'll be prompted to select a model. The generator provides several example models for which it **does not prompt for HuggingFace authentication**:

**Available Example Models:**
- `openai/gpt-oss-20b` - Open-source GPT model (no authentication required)
- `meta-llama/Llama-3.2-3B-Instruct` - Llama 3.2 3B instruction-tuned model
- `meta-llama/Llama-3.2-1B-Instruct` - Llama 3.2 1B instruction-tuned model
- `Custom (enter manually)` - Enter any model ID manually

**Important Notes:**

1. **Example models skip authentication prompts**: If you select one of the pre-configured example models, you will NOT be prompted for a HuggingFace token. The generator treats these models as publicly accessible. Note that `meta-llama` repositories are typically gated on HuggingFace, so if a download of one of the Llama examples fails, supply a token with `--hf-token`.

2. **Custom models may require authentication**: If you select "Custom (enter manually)" and enter a model ID for a private or gated model (like `meta-llama/Llama-2-7b-hf`), you will be prompted for your HuggingFace token.

3. **Case-insensitive matching**: Model ID matching is case-insensitive. For example, `OPENAI/GPT-OSS-20B` will be recognized as an example model.

4. **When to use custom models**:
   - Private models in your HuggingFace account
   - Gated models requiring license agreement (e.g., Llama 2)
   - Models not in the example list
   - Fine-tuned models

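The case-insensitive matching described in the notes can be pictured as follows. This is an illustrative sketch, not the generator's code: `is_example_model` is a hypothetical name, and the list simply mirrors the example models named above.

```bash
# Sketch of case-insensitive matching against the example list.
# The list mirrors the models named above; the function name is hypothetical.
is_example_model() {
  local wanted candidate
  wanted=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  for candidate in \
    "openai/gpt-oss-20b" \
    "meta-llama/Llama-3.2-3B-Instruct" \
    "meta-llama/Llama-3.2-1B-Instruct"; do
    # Lowercase both sides before comparing.
    if [ "$(printf '%s' "$candidate" | tr '[:upper:]' '[:lower:]')" = "$wanted" ]; then
      return 0
    fi
  done
  return 1
}

for id in "OPENAI/GPT-OSS-20B" "meta-llama/Llama-2-7b-hf"; do
  if is_example_model "$id"; then
    echo "$id: example model, no token prompt"
  else
    echo "$id: custom model, token prompt shown"
  fi
done
```
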
**Example: Using a Custom Model with Authentication**

```bash
yo ml-container-creator

# When prompted:
? Which model do you want to use? Custom (enter manually)
? Enter the model name: meta-llama/Llama-2-7b-hf

🔐 HuggingFace Authentication
⚠️ Security Note: The token will be baked into the Docker image.
For CI/CD, consider using "$HF_TOKEN" to reference an environment variable.

? HuggingFace token (enter token, "$HF_TOKEN" for env var, or leave empty): hf_abc123...
```

For more information on HuggingFace authentication, see the [HuggingFace Authentication section](configuration.md#huggingface-authentication) in the Configuration Guide.

### Step 2: Prepare Model Files

Option A: Download from Hugging Face Hub
