|
464 | 464 | "cell_type": "markdown", |
465 | 465 | "metadata": {}, |
466 | 466 | "source": [ |
467 | | - "## Step 4: API Key Configuration (NVIDIA & Brev)", |
468 | | - "", |
469 | | - "The Warehouse Operational Assistant uses NVIDIA NIMs (NVIDIA Inference Microservices) for AI capabilities. You have **two deployment options** for NIMs:", |
470 | | - "", |
471 | | - "### \ud83d\ude80 NIM Deployment Options", |
472 | | - "", |
473 | | - "**Option 1: Cloud Endpoints** (Easiest - Default)", |
474 | | - "- Use NVIDIA's cloud-hosted NIM services", |
475 | | - "- **No installation required** - just configure API keys", |
476 | | - "- Quick setup, perfect for development and testing", |
477 | | - "- Endpoints: `api.brev.dev` or `integrate.api.nvidia.com`", |
478 | | - "", |
479 | | - "**Option 2: Self-Hosted NIMs** (Recommended for Production)", |
480 | | - "- **Install NIMs on your own infrastructure** using Docker", |
481 | | - "- **Create custom endpoints** on your servers", |
482 | | - "- Benefits:", |
483 | | - " - \ud83d\udd12 **Data Privacy**: Keep sensitive data on-premises", |
484 | | - " - \ud83d\udcb0 **Cost Control**: Avoid per-request cloud costs", |
485 | | - " - \u2699\ufe0f **Custom Requirements**: Full control over infrastructure", |
486 | | - " - \u26a1 **Low Latency**: Reduced network latency", |
487 | | - "", |
488 | | - "**Self-Hosting Example:**", |
489 | | - "```bash", |
490 | | - "# Deploy LLM NIM on your server", |
491 | | - "docker run --gpus all -p 8000:8000 \\", |
492 | | - " nvcr.io/nvidia/nim/llama-3.3-nemotron-super-49b:latest", |
493 | | - "", |
494 | | - "# Then set in .env:", |
495 | | - "LLM_NIM_URL=http://your-server:8000/v1", |
496 | | - "```", |
497 | | - "", |
498 | | - "**\ud83d\udcdd Note**: This step configures API keys for cloud endpoints. If you're self-hosting NIMs, you can skip API keys (unless your NIMs require authentication) and just configure the endpoint URLs in Step 5.", |
499 | | - "", |
500 | | - "---", |
501 | | - "", |
502 | | - "### \u26a0\ufe0f Important: Two Types of API Keys (for Cloud Endpoints)", |
503 | | - "", |
504 | | - "**1. NVIDIA API Key** (starts with `nvapi-`)", |
505 | | - "- **Format**: `nvapi-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx`", |
506 | | - "- **Get from**: https://build.nvidia.com/", |
507 | | - "- **Works with**: Both `api.brev.dev` and `integrate.api.nvidia.com` endpoints", |
508 | | - "- **Required for**: Embedding service (always requires NVIDIA API key)", |
509 | | - "", |
510 | | - "**2. Brev API Key** (starts with `brev_api_`)", |
511 | | - "- **Format**: `brev_api_xxxxxxxxxxxxxxxxxxxxxxxxxxxxx`", |
512 | | - "- **Get from**: Your Brev account", |
513 | | - "- **Works with**: `api.brev.dev` endpoint only", |
514 | | - "- **Optional**: Can use NVIDIA API key instead", |
515 | | - "", |
516 | | - "### Configuration Options (Cloud Endpoints)", |
517 | | - "", |
518 | | - "**Option A: Use NVIDIA API Key for Everything** (Recommended)", |
519 | | - "- Set `NVIDIA_API_KEY` with your NVIDIA API key", |
520 | | - "- Leave `EMBEDDING_API_KEY` unset (will use `NVIDIA_API_KEY`)", |
521 | | - "- Works with both endpoints", |
522 | | - "", |
523 | | - "**Option B: Use Brev API Key for LLM + NVIDIA API Key for Embedding**", |
524 | | - "- Set `NVIDIA_API_KEY` with your Brev API key", |
525 | | - "- **MUST** set `EMBEDDING_API_KEY` with your NVIDIA API key (required!)", |
526 | | - "- Embedding service always requires NVIDIA API key", |
527 | | - "", |
528 | | - "### Getting Your API Keys (for Cloud Endpoints)", |
529 | | - "", |
530 | | - "**NVIDIA API Key:**", |
531 | | - "1. **Visit**: https://build.nvidia.com/", |
532 | | - "2. **Sign up** or log in to your NVIDIA account", |
533 | | - "3. **Navigate** to the \"API Keys\" section", |
534 | | - "4. **Create** a new API key", |
535 | | - "5. **Copy** the API key (starts with `nvapi-`)", |
536 | | - "", |
537 | | - "**Brev API Key (Optional):**", |
538 | | - "1. **Visit**: Your Brev account dashboard", |
539 | | - "2. **Navigate** to API Keys section", |
540 | | - "3. **Create** or copy your Brev API key (starts with `brev_api_`)", |
541 | | - "", |
542 | | - "### What You'll Get Access To", |
543 | | - "", |
544 | | - "- **LLM Service** (Llama 3.3 Nemotron Super 49B) - for chat and reasoning", |
545 | | - "- **Embedding Service** (llama-3_2-nv-embedqa-1b-v2) - for semantic search", |
546 | | - "- **Document Processing** - OCR and structured data extraction", |
547 | | - "- **Content Safety** - NeMo Guardrails for content moderation", |
548 | | - "", |
| 467 | + "## Step 4: API Key Configuration (NVIDIA & Brev)\n", |
| 468 | + "\n", |
| 469 | + "The Warehouse Operational Assistant uses NVIDIA NIMs (NVIDIA Inference Microservices) for AI capabilities. You have **two deployment options** for NIMs:\n", |
| 470 | + "\n", |
| 471 | + "### \ud83d\ude80 NIM Deployment Options\n", |
| 472 | + "\n", |
| 473 | + "**Option 1: Cloud Endpoints** (Easiest - Default)\n", |
| 474 | + "- Use NVIDIA's cloud-hosted NIM services\n", |
| 475 | + "- **No installation required** - just configure API keys\n", |
| 476 | + "- Quick setup, perfect for development and testing\n", |
| 477 | + "- Endpoints: `api.brev.dev` or `integrate.api.nvidia.com`\n", |
| 478 | + "\n", |
| 479 | + "**Option 2: Self-Hosted NIMs** (Recommended for Production)\n", |
| 480 | + "- **Install NIMs on your own infrastructure** using Docker\n", |
| 481 | + "- **Create custom endpoints** on your servers\n", |
| 482 | + "- Benefits:\n", |
| 483 | + " - \ud83d\udd12 **Data Privacy**: Keep sensitive data on-premises\n", |
| 484 | + " - \ud83d\udcb0 **Cost Control**: Avoid per-request cloud costs\n", |
| 485 | + " - \u2699\ufe0f **Custom Requirements**: Full control over infrastructure\n", |
| 486 | + " - \u26a1 **Low Latency**: Reduced network latency\n", |
| 487 | + "\n", |
| 488 | + "**Self-Hosting Example:**\n", |
| 489 | + "```bash\n", |
| 490 | + "# Deploy the LLM NIM on your server (image path is illustrative - check the NGC catalog for the exact name and tag)\n", |
| 491 | + "docker run --gpus all -p 8000:8000 \\\n", |
| 492 | + " nvcr.io/nvidia/nim/llama-3.3-nemotron-super-49b:latest\n", |
| 493 | + "\n", |
| 494 | + "# Then set in .env:\n", |
| 495 | + "LLM_NIM_URL=http://your-server:8000/v1\n", |
| 496 | + "```\n", |
| 497 | + "\n", |
| 498 | + "**\ud83d\udcdd Note**: This step configures API keys for cloud endpoints. If you're self-hosting NIMs, you can skip API keys (unless your NIMs require authentication) and just configure the endpoint URLs in Step 5.\n", |
| 499 | + "\n", |
| 500 | + "---\n", |
| 501 | + "\n", |
| 502 | + "### \u26a0\ufe0f Important: Two Types of API Keys (for Cloud Endpoints)\n", |
| 503 | + "\n", |
| 504 | + "**1. NVIDIA API Key** (starts with `nvapi-`)\n", |
| 505 | + "- **Format**: `nvapi-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx`\n", |
| 506 | + "- **Get from**: https://build.nvidia.com/\n", |
| 507 | + "- **Works with**: Both `api.brev.dev` and `integrate.api.nvidia.com` endpoints\n", |
| 508 | + "- **Required for**: the embedding service (it accepts only NVIDIA API keys)\n", |
| 509 | + "\n", |
| 510 | + "**2. Brev API Key** (starts with `brev_api_`)\n", |
| 511 | + "- **Format**: `brev_api_xxxxxxxxxxxxxxxxxxxxxxxxxxxxx`\n", |
| 512 | + "- **Get from**: Your Brev account\n", |
| 513 | + "- **Works with**: `api.brev.dev` endpoint only\n", |
| 514 | + "- **Optional**: Can use NVIDIA API key instead\n", |
| 515 | + "\n", |
| 516 | + "### Configuration Options (Cloud Endpoints)\n", |
| 517 | + "\n", |
| 518 | + "**Option A: Use NVIDIA API Key for Everything** (Recommended)\n", |
| 519 | + "- Set `NVIDIA_API_KEY` with your NVIDIA API key\n", |
| 520 | + "- Leave `EMBEDDING_API_KEY` unset (it will fall back to `NVIDIA_API_KEY`)\n", |
| 521 | + "- Works with both endpoints\n", |
| 522 | + "\n", |
| 523 | + "**Option B: Use Brev API Key for LLM + NVIDIA API Key for Embedding**\n", |
| 524 | + "- Set `NVIDIA_API_KEY` with your Brev API key\n", |
| 525 | + "- **Must** also set `EMBEDDING_API_KEY` to your NVIDIA API key\n", |
| 526 | + "- Embedding service always requires NVIDIA API key\n", |
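| | + "\n", |
| | + "As a concrete sketch, the two options map to `.env` entries like these (key values shown are placeholders):\n", |
| | + "\n", |
| | + "```bash\n", |
| | + "# Option A: one NVIDIA key for everything\n", |
| | + "NVIDIA_API_KEY=nvapi-xxxxxxxxxxxxxxxx\n", |
| | + "\n", |
| | + "# Option B: Brev key for the LLM, NVIDIA key for embeddings\n", |
| | + "# NVIDIA_API_KEY=brev_api_xxxxxxxxxxxxxxxx\n", |
| | + "# EMBEDDING_API_KEY=nvapi-xxxxxxxxxxxxxxxx\n", |
| | + "```\n", |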
| 527 | + "\n", |
| 528 | + "### Getting Your API Keys (for Cloud Endpoints)\n", |
| 529 | + "\n", |
| 530 | + "**NVIDIA API Key:**\n", |
| 531 | + "1. **Visit**: https://build.nvidia.com/\n", |
| 532 | + "2. **Sign up** or log in to your NVIDIA account\n", |
| 533 | + "3. **Navigate** to the \"API Keys\" section\n", |
| 534 | + "4. **Create** a new API key\n", |
| 535 | + "5. **Copy** the API key (starts with `nvapi-`)\n", |
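| | + "\n", |
| | + "Once you have the key, a quick way to check that it works (assuming the standard OpenAI-compatible `/v1/models` route on the NVIDIA cloud endpoint):\n", |
| | + "\n", |
| | + "```bash\n", |
| | + "curl -s https://integrate.api.nvidia.com/v1/models \\\n", |
| | + "  -H \"Authorization: Bearer $NVIDIA_API_KEY\"\n", |
| | + "```\n", |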
| 536 | + "\n", |
| 537 | + "**Brev API Key (Optional):**\n", |
| 538 | + "1. **Visit**: Your Brev account dashboard\n", |
| 539 | + "2. **Navigate** to API Keys section\n", |
| 540 | + "3. **Create** or copy your Brev API key (starts with `brev_api_`)\n", |
| 541 | + "\n", |
| 542 | + "### What You'll Get Access To\n", |
| 543 | + "\n", |
| 544 | + "- **LLM Service** (Llama 3.3 Nemotron Super 49B) - for chat and reasoning\n", |
| 545 | + "- **Embedding Service** (llama-3_2-nv-embedqa-1b-v2) - for semantic search\n", |
| 546 | + "- **Document Processing** - OCR and structured data extraction\n", |
| 547 | + "- **Content Safety** - NeMo Guardrails for content moderation\n", |
| 548 | + "\n", |
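| | + "For instance, the LLM service can be exercised through the OpenAI-compatible API (the model identifier below is illustrative - confirm the exact name on build.nvidia.com):\n", |
| | + "\n", |
| | + "```python\n", |
| | + "import os\n", |
| | + "from openai import OpenAI  # pip install openai\n", |
| | + "\n", |
| | + "client = OpenAI(\n", |
| | + "    base_url=\"https://integrate.api.nvidia.com/v1\",\n", |
| | + "    api_key=os.environ[\"NVIDIA_API_KEY\"],\n", |
| | + ")\n", |
| | + "resp = client.chat.completions.create(\n", |
| | + "    model=\"nvidia/llama-3.3-nemotron-super-49b-v1\",  # illustrative model id\n", |
| | + "    messages=[{\"role\": \"user\", \"content\": \"Say hello\"}],\n", |
| | + ")\n", |
| | + "print(resp.choices[0].message.content)\n", |
| | + "```\n", |
| | + "\n", |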
549 | 549 | "**\ud83d\udca1 For Self-Hosted NIMs**: See `DEPLOYMENT.md` section \"NVIDIA NIMs Deployment & Configuration\" for detailed self-hosting instructions." |
550 | 550 | ] |
551 | 551 | }, |
552 | 562 | { |
553 | 563 | "cell_type": "code", |
554 | 564 | "execution_count": null, |
|