Open
96 commits
a8d6854
Add AI Config validation and testing framework
sattensil Oct 20, 2025
edaf46e
Fix GitHub Actions workflow: install ld-aic-cicd from git, update all…
sattensil Oct 21, 2025
9173ff7
Trigger workflow with fixed YAML
sattensil Oct 21, 2025
76e450d
Test GitHub Actions workflow
sattensil Oct 21, 2025
bc9b828
Trigger workflow on enable-ai-config-tests branch for testing
sattensil Oct 21, 2025
315b22c
Upgrade to actions/upload-artifact@v4
sattensil Oct 21, 2025
f783678
Use GITHUB_TOKEN to install private ld-aic-cicd repo
sattensil Oct 21, 2025
0f78126
Fix git token format for private repo access
sattensil Oct 21, 2025
19d92f8
Use GH_PAT for private repo access
sattensil Oct 21, 2025
6b30ee6
Test AI Config validation workflow
sattensil Oct 21, 2025
f105210
Install ld-aic-cicd from feature branch
sattensil Oct 21, 2025
aabd88f
Skip dev dependencies with local paths in CI
sattensil Oct 21, 2025
481810e
Add --no-dev flag to all uv run commands
sattensil Oct 21, 2025
3bd519c
Remove --frozen flag to avoid lock file validation in CI
sattensil Oct 21, 2025
7cfa50c
Use uv pip install instead of sync to avoid dev dependencies
sattensil Oct 21, 2025
edac003
Create virtual environment before installing dependencies
sattensil Oct 21, 2025
f59dae4
Add setuptools package configuration to fix build
sattensil Oct 21, 2025
7a1c3c5
Install dependencies directly without package to avoid dev deps
sattensil Oct 21, 2025
a0ca20d
Use activated venv instead of uv run to avoid pyproject.toml parsing
sattensil Oct 21, 2025
273544d
Retrigger workflow with updated ld-aic-cicd
sattensil Oct 21, 2025
4427d4f
Use full venv paths instead of activation for persistence across comm…
sattensil Oct 21, 2025
31104fb
Add /health endpoint for workflow monitoring
sattensil Oct 21, 2025
acda208
Add API startup verification and better error logging
sattensil Oct 21, 2025
dbb34a1
Add PYTHONPATH and immediate log output for debugging
sattensil Oct 21, 2025
b53604e
Combine API start, tests, and stop into single step to maintain backg…
sattensil Oct 21, 2025
51c4667
Add debugging: test chat endpoint, verify directory, add verbose flag
sattensil Oct 21, 2025
31f53f4
Remove --verbose flag (not supported by ld-aic test)
sattensil Oct 21, 2025
d185025
Add local evaluator with 127.0.0.1 for better CI compatibility
sattensil Oct 21, 2025
312113b
Add evaluator documentation and port configuration comments
sattensil Oct 21, 2025
c3efdc5
Add detailed error logging and debug output to evaluator
sattensil Oct 21, 2025
c5bc507
Fix evaluator to match actual API response format
sattensil Oct 21, 2025
bf81897
Add detailed exception logging to find real error
sattensil Oct 21, 2025
98809d4
Add httpx direct test to diagnose connection issue
sattensil Oct 21, 2025
6569b67
Fix YAML syntax: use heredoc for Python script
sattensil Oct 21, 2025
1493f00
Evaluator: send user_context field to /chat payload
sattensil Oct 21, 2025
7773bfe
CI: fix YAML heredoc indentation for httpx test block
sattensil Oct 21, 2025
32482c6
CI safe mode: set env, upload API logs; skip MCP tools in CI; surface…
sattensil Oct 21, 2025
2d722e9
CI safe mode: prefer OpenAI when Anthropic key absent to keep LLM pat…
sattensil Oct 22, 2025
32bc8f2
API: return error response with traceback instead of raising to surfa…
sattensil Oct 22, 2025
f93e5b7
Fix: pass LD_SDK_KEY to ld-aic test command so framework can initiali…
sattensil Oct 22, 2025
0e9aedc
Fix: pass all required env vars (LD_SDK_KEY, API keys, CI_SAFE_MODE) …
sattensil Oct 22, 2025
dd3ce14
Add MISTRAL_API_KEY support for strict-security variation
sattensil Oct 22, 2025
9d37cb8
Fix: use export for env vars so they inherit to ld-aic subprocess; ad…
sattensil Oct 22, 2025
1d56df9
Fix: add LD_API_KEY and LD_PROJECT_KEY to evaluate job env vars
sattensil Oct 22, 2025
78bc092
Initialize vector embeddings before running API tests
sattensil Oct 22, 2025
1fc2864
Add evaluator diagnostic test to identify root cause of connection error
sattensil Oct 22, 2025
d6ec19b
Add standalone test script to diagnose evaluator connection error
sattensil Oct 22, 2025
2cff6dd
Fix: create .env file in CI so ld-aic judge can access OPENAI_API_KEY…
sattensil Oct 22, 2025
b5f22e1
Fix YAML syntax: use echo instead of heredoc for .env creation
sattensil Oct 22, 2025
5fcc059
Security: Add PR protection, environment approval, and .env cleanup
sattensil Oct 22, 2025
4b1afcd
WIP: Save current test configuration changes before switching branches
sattensil Oct 22, 2025
fdf91c3
Merge enable-ai-config-tests: Add comprehensive CI/CD infrastructure
sattensil Oct 22, 2025
e656958
Update CI/CD to use revised ld-aic-cicd framework
sattensil Oct 23, 2025
388a8fa
Enable evaluate-configs job to run on PRs
sattensil Oct 23, 2025
6421212
Switch to pull_request for testing
sattensil Oct 23, 2025
05fd0df
Fix repository name for ld-aic-cicd installation
sattensil Oct 23, 2025
1ad594b
Trigger workflow re-run with correct repository name
sattensil Oct 23, 2025
9623d71
Add boto3 and langchain-aws dependencies
sattensil Oct 23, 2025
7e17053
CI: Commit pre-built vector embeddings for reliability
sattensil Oct 23, 2025
bdf7333
Configure CI tests for 70% pass rate (30% error tolerance)
sattensil Oct 23, 2025
14644d6
Trigger workflow with updated ld-aic-cicd framework
sattensil Oct 23, 2025
f2bc519
Use validated production defaults from .ai_config_defaults.json
sattensil Oct 23, 2025
ff37f57
Fix: Remove tools parameter from LDAIAgentDefaults
sattensil Oct 23, 2025
d15b9b2
Document sync branch configuration in CI/CD workflow
sattensil Oct 23, 2025
62112a9
Improve evaluator debug logging: add user message and AI response to …
sattensil Oct 23, 2025
2ffc130
Use LaunchDarkly thresholds instead of hardcoded CLI values
sattensil Oct 23, 2025
210202a
Clean up workflow comments
sattensil Oct 23, 2025
d5d51df
Add test runner script for local testing
sattensil Oct 23, 2025
0a9569c
Fix: Create .env before starting API to enable search query embeddings
sattensil Oct 23, 2025
8d65468
Fix: Extract agent-specific variations instead of always using superv…
sattensil Oct 24, 2025
bed0708
Cleanup: Remove deprecated bootstrap files and old tool implementations
sattensil Oct 24, 2025
618c8de
Revert "Cleanup: Remove deprecated bootstrap files and old tool imple…
sattensil Oct 24, 2025
5f2b5c0
Trigger workflow: Test search and variation tracking fixes
sattensil Oct 24, 2025
1a19ad6
Test trigger: Verify workflow execution
sattensil Oct 24, 2025
ab6fb02
Test workflow: Verify search fixes and 0.40 error threshold
sattensil Oct 24, 2025
cc9d2c7
Update test data to match RL knowledge base
sattensil Oct 24, 2025
6b6dea3
Add MCP server installation to CI/CD workflow
sattensil Oct 24, 2025
7d2efac
Update PII test criteria to match silent protection behavior
sattensil Oct 24, 2025
a8e1479
Add human-readable test failure summarization to CI/CD
sattensil Oct 24, 2025
3e61842
Add country and plan attributes to support-agent tests
sattensil Oct 24, 2025
5a11668
Migrate CI/CD to use generic Direct evaluator with full tool support
sattensil Oct 28, 2025
b8b2990
Revert to HTTP evaluator for integration testing
sattensil Oct 28, 2025
5cb29cd
Switch to standardized evaluation criteria and generic HTTPEvaluator
sattensil Oct 28, 2025
7164aac
Use minimal payload mode for HTTPEvaluator
sattensil Oct 28, 2025
cd5fb4b
Fix failure summarizer to read new judge log format
sattensil Oct 28, 2025
e057ada
Add detailed error logging and fix API log path in CI
sattensil Oct 28, 2025
df0a54c
Fix GitHub Actions secret masking and add API logs artifact
sattensil Oct 28, 2025
8b7383f
Fix YAML syntax error in heredoc - use env vars instead of secrets
sattensil Oct 28, 2025
8a17551
Use printf instead of heredoc to avoid YAML parsing issues
sattensil Oct 28, 2025
129774b
Add API key verification diagnostics to debug connection errors
sattensil Oct 28, 2025
fd9fb8a
Fix trailing newline in API keys causing illegal HTTP headers
sattensil Oct 28, 2025
3e02e31
Export cleaned env vars before starting uvicorn to fix OpenAI header …
sattensil Oct 28, 2025
df66e38
Switch judge evaluator from GPT-4o to Claude to avoid OpenAI quota li…
sattensil Oct 28, 2025
aea5fea
Revert "Switch judge evaluator from GPT-4o to Claude to avoid OpenAI …
sattensil Oct 28, 2025
e2b6ffb
Update CI/CD pipeline to use renamed repository ld-aic-cicd-
sattensil Oct 29, 2025
54a9a3a
Revert repository name change - keep original scarlett_ai_configs_ci_cd-
sattensil Oct 29, 2025
182 changes: 182 additions & 0 deletions .ai_config_defaults.json
@@ -0,0 +1,182 @@
{
  "_metadata": {
    "generated_at": "2025-10-23T11:15:54.739693",
    "environment": "production",
    "project_key": "multi-agent-chatbot",
    "config_count": 3
  },
  "configs": {
    "security-agent": {
      "enabled": true,
      "model": {
        "name": "claude-3-5-haiku-20241022",
        "parameters": {}
      },
      "provider": {
        "name": "Anthropic"
      },
      "instructions": "You are a privacy agent that REMOVES direct PII. Focus on clearly personal identifiers:\n\nEmail addresses\nPhone numbers\nSocial Security Numbers\nFull names (but not generic titles)\nStreet addresses\nCredit card numbers\nDriver's license numbers\n\nResponse Format:\n\ndetected: true if any PII was found, false otherwise\ntypes: array of PII types found (e.g., ['email', 'name', 'phone'])\nredacted: the input text with PII replaced by [REDACTED], keeping the text readable and natural\n\nExamples:\n\nInput: \"I work at Acme Corp in Berlin as a manager\"\n\nOutput: detected=false, types=[], redacted='I work at Acme Corp in Berlin as a manager'\n\n\nInput: \"Contact John Smith at [email protected] or 555-1234\"\n\nOutput: detected=true, types=['name', 'email', 'phone'], redacted='Contact [REDACTED] at [REDACTED] or [REDACTED]'\n\n\nInput: \"The CEO from Microsoft contacted me\"\n\nOutput: detected=false, types=[], redacted='The CEO from Microsoft contacted me'"
    },
    "supervisor-agent": {
      "enabled": true,
      "model": {
        "name": "claude-3-7-sonnet-latest",
        "parameters": {}
      },
      "provider": {
        "name": "Anthropic"
      },
      "instructions": " You are an intelligent routing supervisor for a multi-agent system. Your primary job is to assess whether user input likely contains PII (personally identifiable information) to determine the most efficient processing route.\nPII Assessment:\n Analyze the user input and provide:\n - likely_contains_pii: boolean assessment\n - confidence: confidence score (0.0 to 1.0)\n - reasoning: clear explanation of your decision\n - recommended_route: either 'security_agent' or 'support_agent'\n\n Route to SECURITY_AGENT** if the text likely contains:\n - Email addresses, phone numbers, addresses\n - Names (first/last names, usernames)\n - Financial information (credit cards, SSNs, account numbers)\n - Sensitive personal data\n\n **Route to SUPPORT_AGENT** if the text appears to be:\n - General questions without personal details\n - Technical queries\n - Search requests\n - Educational content requests\n\n Analyze this user input and recommend the optimal route:\n"
    },
    "support-agent": {
      "enabled": true,
      "model": {
        "name": "claude-3-5-haiku-20241022",
        "parameters": {
          "tools": [
            {
              "description": "Simple keyword search through knowledge base",
              "name": "search_v1",
              "parameters": {
                "additionalProperties": false,
                "properties": {
                  "query": {
                    "description": "Search query for keyword matching",
                    "type": "string"
                  },
                  "top_k": {
                    "description": "Number of results to return",
                    "type": "number"
                  }
                },
                "required": [
                  "query"
                ],
                "type": "object"
              },
              "type": "function"
            },
            {
              "description": "Semantic search using vector embeddings",
              "name": "search_v2",
              "parameters": {
                "additionalProperties": false,
                "properties": {
                  "query": {
                    "description": "Search query for semantic matching",
                    "type": "string"
                  },
                  "top_k": {
                    "description": "Number of results to return",
                    "type": "number"
                  }
                },
                "required": [
                  "query"
                ],
                "type": "object"
              },
              "type": "function"
            },
            {
              "description": "Reorders results by relevance using BM25 algorithm",
              "name": "reranking",
              "parameters": {
                "additionalProperties": false,
                "properties": {
                  "query": {
                    "description": "Original query for scoring",
                    "type": "string"
                  },
                  "results": {
                    "description": "Results to rerank",
                    "type": "array"
                  }
                },
                "required": [
                  "query",
                  "results"
                ],
                "type": "object"
              },
              "type": "function"
            }
          ]
        }
      },
      "provider": {
        "name": "Anthropic"
      },
      "instructions": "You are a helpful assistant with access to RAG tools: search_v1 (basic search), search_v2 (semantic vector search), and reranking (BM25 relevance scoring). When search results are available, prioritize information from those results over your general knowledge. Provide balanced, well-researched responses for international users.",
      "tools": [
        {
          "description": "Simple keyword search through knowledge base",
          "name": "search_v1",
          "parameters": {
            "additionalProperties": false,
            "properties": {
              "query": {
                "description": "Search query for keyword matching",
                "type": "string"
              },
              "top_k": {
                "description": "Number of results to return",
                "type": "number"
              }
            },
            "required": [
              "query"
            ],
            "type": "object"
          },
          "type": "function"
        },
        {
          "description": "Semantic search using vector embeddings",
          "name": "search_v2",
          "parameters": {
            "additionalProperties": false,
            "properties": {
              "query": {
                "description": "Search query for semantic matching",
                "type": "string"
              },
              "top_k": {
                "description": "Number of results to return",
                "type": "number"
              }
            },
            "required": [
              "query"
            ],
            "type": "object"
          },
          "type": "function"
        },
        {
          "description": "Reorders results by relevance using BM25 algorithm",
          "name": "reranking",
          "parameters": {
            "additionalProperties": false,
            "properties": {
              "query": {
                "description": "Original query for scoring",
                "type": "string"
              },
              "results": {
                "description": "Results to rerank",
                "type": "array"
              }
            },
            "required": [
              "query",
              "results"
            ],
            "type": "object"
          },
          "type": "function"
        }
      ]
    }
  }
}
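A reviewer may want a quick way to sanity-check a defaults file of this shape before relying on it in CI. The following is a minimal sketch, not part of this PR or of the ld-aic-cicd framework: `validate_defaults` and `SAMPLE` are hypothetical names, and the checks only assume the keys visible in the diff above (`_metadata.config_count`, `configs`, `model.name`, `provider.name`, `instructions`, `tools`, and the optional copy of the tool list under `model.parameters.tools`).

```python
import json
from typing import Any


def validate_defaults(doc: dict[str, Any]) -> list[str]:
    """Return human-readable problems; an empty list means no issue was found."""
    problems: list[str] = []
    configs = doc.get("configs", {})

    # The metadata block records how many configs were exported.
    expected = doc.get("_metadata", {}).get("config_count")
    if expected != len(configs):
        problems.append(
            f"config_count is {expected}, but {len(configs)} configs are present"
        )

    for key, cfg in configs.items():
        model = cfg.get("model", {})
        if not model.get("name"):
            problems.append(f"{key}: missing model.name")
        if not cfg.get("provider", {}).get("name"):
            problems.append(f"{key}: missing provider.name")
        if not cfg.get("instructions", "").strip():
            problems.append(f"{key}: empty instructions")

        # Every parameter a tool marks as required must be declared in properties.
        for tool in cfg.get("tools", []):
            schema = tool.get("parameters", {})
            declared = set(schema.get("properties", {}))
            for name in schema.get("required", []):
                if name not in declared:
                    problems.append(
                        f"{key}: tool {tool.get('name')} requires "
                        f"undeclared parameter {name}"
                    )

        # When the tool list appears in both places, the two copies should agree.
        nested = model.get("parameters", {}).get("tools")
        top_level = cfg.get("tools")
        if nested is not None and top_level is not None and nested != top_level:
            problems.append(f"{key}: model.parameters.tools differs from tools")

    return problems


# Hypothetical, trimmed-down document with the same shape as the file above.
SAMPLE = {
    "_metadata": {"config_count": 1},
    "configs": {
        "support-agent": {
            "enabled": True,
            "model": {"name": "claude-3-5-haiku-20241022", "parameters": {}},
            "provider": {"name": "Anthropic"},
            "instructions": "You are a helpful assistant.",
            "tools": [
                {
                    "name": "search_v1",
                    "type": "function",
                    "parameters": {
                        "type": "object",
                        "properties": {"query": {"type": "string"}},
                        "required": ["query"],
                    },
                }
            ],
        }
    },
}

print(validate_defaults(SAMPLE))
```

To run the same checks against the committed file, the document could be loaded with `json.load` from `.ai_config_defaults.json` and passed to `validate_defaults`. The comparison between `model.parameters.tools` and the top-level `tools` list is included because the `support-agent` entry carries both copies; whether both are meant to be kept is not stated in the diff.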