Implemented support for the AWS Bedrock model catalog for both embedding and generation models. The approach remains backwards compatible with the API-key approach for Anthropic and OpenAI, which allows users to follow the tutorial instructions without modification.
Co-authored-by: Mike Tocci <[email protected]>
```bash
LD_SDK_KEY=your-launchdarkly-sdk-key # From step above
# Direct API Configuration
AUTH_METHOD=api-key # Use direct API keys (default)
OPENAI_API_KEY=your-openai-key # Required for RAG embeddings
ANTHROPIC_API_KEY=your-anthropic-key # Required for Claude models
```
This sets up a **LangGraph** application that uses LaunchDarkly to control AI behavior. Think of it like swapping actors, directors, even props mid-performance without stopping the show.
**Security Note:** Do not check the `.env` into your source control. Keep those secrets safe!
### 🚨 **Common AWS SSO Issue:**
If you get `AccessDeniedException` errors, verify your Python code is using the correct AWS profile:
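A minimal sketch of that check from Python (not the repo's code; `my-sso-profile` is a placeholder for your own SSO profile name):

```python
import boto3

# Placeholder profile name; substitute the SSO profile you configured.
session = boto3.Session(profile_name="my-sso-profile")

# Print the identity boto3 resolves; it should be your SSO role,
# not stale credentials picked up from a different profile.
print(session.client("sts").get_caller_identity()["Arn"])

# Clients created from this session inherit the profile.
bedrock = session.client("bedrock-runtime", region_name="us-east-1")
```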
### Bedrock Model ID Requirements

The system automatically converts direct model IDs to inference profile IDs to prevent `ValidationException` errors from Bedrock. The region prefix is determined by:
1. **`BEDROCK_INFERENCE_REGION`** env var (if set) - explicit user preference
Bedrock requires inference profile IDs for on-demand throughput. The region prefix (`us.`, `eu.`, `ap.`, etc.) enables cross-region inference profiles for better availability and failover.
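As a rough sketch of that conversion (the helper name and matching logic here are illustrative, not the project's actual implementation):

```python
import os

# Region prefixes that mark an ID as already being an inference profile.
KNOWN_PREFIXES = ("us.", "eu.", "ap.")

def to_inference_profile_id(model_id: str) -> str:
    """Prepend a region prefix so Bedrock resolves the ID to a
    cross-region inference profile, e.g.
    'anthropic.claude-3-7-sonnet-20250219-v1:0'
    -> 'us.anthropic.claude-3-7-sonnet-20250219-v1:0'."""
    if model_id.startswith(KNOWN_PREFIXES):
        return model_id  # already an inference profile ID
    prefix = os.getenv("BEDROCK_INFERENCE_REGION", "us")
    return f"{prefix}.{model_id}"
```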
## Step 2: Add Your Business Knowledge (2 minutes)
Turn your documents into searchable **RAG** knowledge:
```bash
uv run python initialize_embeddings.py --force
```
This builds your **RAG** (Retrieval-Augmented Generation) foundation using a FAISS vector database. The system automatically detects your authentication method:

- **Direct API Keys**: Uses OpenAI `text-embedding-3-small` (1536 dimensions)
- **AWS Bedrock**: Uses Amazon **Titan** embeddings
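A sketch of how that detection might look (illustrative only; this excerpt doesn't show the project's actual function or the non-default `AUTH_METHOD` values, so the fallback branch is an assumption):

```python
import os

def pick_embedding_backend() -> str:
    # Default matches the .env example above: direct API keys.
    if os.getenv("AUTH_METHOD", "api-key") == "api-key" and os.getenv("OPENAI_API_KEY"):
        return "openai/text-embedding-3-small"
    # Assumption: anything else falls back to Bedrock Titan embeddings.
    return "bedrock/amazon.titan-embed-text-v2:0"
```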
**RAG** converts documents into vector embeddings that capture semantic meaning rather than just keywords, making search actually understand context.
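To make that concrete, here is a toy example of semantic search with FAISS, using hand-made vectors in place of real model embeddings:

```python
import faiss
import numpy as np

# Toy 4-dimensional "embeddings"; real ones are e.g. 1536-dimensional.
docs = np.array(
    [[0.9, 0.1, 0.0, 0.0],   # doc about "application errors"
     [0.0, 0.1, 0.9, 0.0]],  # unrelated doc
    dtype="float32",
)
faiss.normalize_L2(docs)                  # normalize so inner product = cosine
index = faiss.IndexFlatIP(docs.shape[1])  # inner-product index
index.add(docs)

query = np.array([[0.8, 0.2, 0.0, 0.0]], dtype="float32")  # "my app is broken"
faiss.normalize_L2(query)
scores, ids = index.search(query, 1)
print(ids[0][0], scores[0][0])            # nearest doc and its similarity
```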
## Step 4: Define Your Tools (3 minutes)
The `reranking` tool takes search results from `search_v2` and reorders them for maximum relevance.
> **🔍 How Your RAG Architecture Works**
>
> Your **RAG** system works in two stages: `search_v2` performs semantic similarity search using FAISS by converting queries into the same vector space as your documents (via **OpenAI** or **Bedrock Titan** embeddings), while `reranking` reorders results for maximum relevance. This **RAG** approach significantly outperforms keyword search by understanding context, so asking "My app is broken" can find troubleshooting guides that mention "application errors" or "system failures."
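For intuition, here is a toy version of the reranking stage (real rerankers score with a cross-encoder or LLM rather than word overlap, and these function names are illustrative, not the actual tool signatures):

```python
def rerank(query: str, candidates: list[str]) -> list[str]:
    """Order candidates by naive word overlap with the query."""
    q_words = set(query.lower().split())
    return sorted(
        candidates,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )

hits = ["billing FAQ", "troubleshooting application errors", "release notes"]
print(rerank("application is broken with errors", hits))
# -> the troubleshooting doc ranks first
```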
## Step 5: Create Your AI Agents in LaunchDarkly (5 minutes)
Create LaunchDarkly AI Configs to control your **LangGraph** multi-agent system:
3. Name it `supervisor-agent`
4. Add this configuration:
>
> **variation:**
> ```
> supervisor-basic
> ```
>
> **Model configuration:**
> ```
> Anthropic
> ```
> ```
> claude-3-7-sonnet-latest
> ```
>
> **Note for Bedrock users:** The system auto-corrects direct model IDs to inference profiles:
> - Use either `claude-3-7-sonnet-latest` (auto-corrected) or `us.anthropic.claude-3-7-sonnet-20250219-v1:0` (explicit)
> - Control region prefix via `BEDROCK_INFERENCE_REGION` env var (defaults to `us`)
> - See "Bedrock Model ID Requirements" section above for details
>
> **Goal or task:**
> ```
> You are an intelligent routing supervisor for a multi-agent system. Your primary job is to assess whether user input likely contains PII (personally identifiable information) to determine the most efficient processing route.