Commit 4d23ebf

Zie619 and claude committed
Expand cloud scanner to 52 Terraform + 25 CloudFormation resource types
- Add 38 new Terraform resources: Bedrock (guardrails, flows, prompts), SageMaker (pipelines, notebooks, domains), Comprehend, Kendra, Lex, Rekognition, Azure OpenAI/AI Foundry/ML, Google Vertex AI (reasoning engine, datasets, feature stores), Dialogflow CX, Discovery Engine
- Add 20 new CloudFormation resources matching Terraform coverage
- Handle workflow ComponentType → orchestration UsageType
- Extract kind, display_name, description from Terraform metadata
- Add CloudFormation fallback names: AgentName, FlowName, GuardrailName, PipelineName
- Add Scan Levels documentation section to README
- Update demo data with Azure OpenAI, Vertex AI, and Bedrock guardrail
- Add CloudFormation test fixture and 11 new test methods (135 total)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 9f90fc2 commit 4d23ebf

File tree

7 files changed: +385 −16 lines

README.md

Lines changed: 32 additions & 4 deletions

@@ -9,15 +9,16 @@
   <a href="#demo">Demo</a> &nbsp;|&nbsp;
   <a href="#output-formats">Output Formats</a> &nbsp;|&nbsp;
   <a href="#n8n-workflow-scanning-first-of-its-kind">n8n Scanning</a> &nbsp;|&nbsp;
-  <a href="#risk-scoring">Risk Scoring</a>
+  <a href="#risk-scoring">Risk Scoring</a> &nbsp;|&nbsp;
+  <a href="#scan-levels">Scan Levels</a>
 </p>

 <!-- badges -->
 <p>
   <img src="https://img.shields.io/badge/license-Apache%202.0-blue.svg" alt="License" />
   <img src="https://img.shields.io/badge/python-3.10%2B-blue.svg" alt="Python" />
   <img src="https://img.shields.io/badge/CycloneDX-1.6-green.svg" alt="CycloneDX" />
-  <img src="https://img.shields.io/badge/tests-124%20passing-brightgreen.svg" alt="Tests" />
+  <img src="https://img.shields.io/badge/tests-135%20passing-brightgreen.svg" alt="Tests" />
   <img src="https://img.shields.io/badge/PRs-welcome-orange.svg" alt="PRs Welcome" />
 </p>
 </div>
@@ -68,7 +69,7 @@ ai-bom scan . --format cyclonedx --output ai-bom.json
 | Model References | gpt-4o, claude-3-5-sonnet, gemini-1.5-pro, llama-3 | Code |
 | API Keys | OpenAI (sk-\*), Anthropic (sk-ant-\*), HuggingFace (hf\_\*) | Code, Network |
 | AI Containers | Ollama, vLLM, HuggingFace, NVIDIA, ChromaDB | Docker |
-| Cloud AI | AWS Bedrock, SageMaker, Vertex AI, Azure Cognitive | Cloud |
+| Cloud AI | AWS Bedrock, SageMaker, Comprehend, Kendra, Lex \| Azure OpenAI, AI Foundry, ML \| Google Vertex AI, Dialogflow CX | Cloud |
 | AI Endpoints | api.openai.com, api.anthropic.com, localhost:11434 | Network |
 | n8n AI Nodes | AI Agents, LLM Chat, MCP Client, Tools, Embeddings | n8n |
 | MCP Servers | Model Context Protocol connections | Code, n8n |
@@ -199,6 +200,33 @@ Every component receives a risk score (0–100):
 | Deprecated model | +10 | Using deprecated AI model |
 | Unpinned model | +5 | Model version not pinned |

+## Scan Levels
+
+ai-bom's detection depth depends on the permissions available at scan time. Each level progressively reveals more shadow AI:
+
+| Level | Access Required | What It Finds | Scanner |
+|-------|----------------|---------------|---------|
+| **Level 1 — File System** | Read-only file access | Source code imports, dependency files, config files, IaC definitions, n8n workflow JSON | Code, Cloud, n8n |
+| **Level 2 — Docker** | + Docker socket access | Running AI containers, GPU allocations, AI model images | Docker |
+| **Level 3 — Network** | + Network/env file access | API endpoints, hardcoded API keys, .env configurations | Network |
+| **Level 4 — Cloud IAM** | + Cloud provider credentials | Managed AI services (Bedrock, SageMaker, Vertex AI, Azure OpenAI) provisioned at infrastructure level | Cloud |
+
+### What each level requires
+
+**Level 1 (default)** — Works out of the box. Just point ai-bom at a directory or Git URL:
+```bash
+ai-bom scan .
+ai-bom scan https://github.com/org/repo.git
+```
+
+**Level 2** — Requires access to the Docker socket or compose files in the scan path. No additional configuration is needed if Dockerfiles/compose files are in the repo.
+
+**Level 3** — Scans `.env`, `.env.local`, `.env.production`, and config files (`.yaml`, `.json`, `.toml`, `.ini`). Detects both endpoint URLs and hardcoded API keys. For maximum coverage, ensure environment files are accessible (they're often gitignored).
+
+**Level 4** — Scans Terraform (`.tf`) and CloudFormation (`.yaml`, `.json`) files for cloud-provisioned AI services. Covers 60+ AWS, Azure, and GCP resource types. Live cloud inventory (not yet available) would require IAM read permissions.
+
+> **Tip:** For CI/CD pipelines, Levels 1–3 are automatic. Level 4 requires IaC files in the repo (Terraform/CloudFormation). A future release will add live cloud API scanning with IAM credentials.
+
 ## Comparison

 How does ai-bom compare to existing supply chain tools?
@@ -258,7 +286,7 @@ git clone https://github.com/trusera/ai-bom.git
 cd ai-bom
 pip install -e ".[dev]"

-# Run tests (124 passing)
+# Run tests (135 passing)
 pytest tests/ -v

 # Run demo
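The Scan Levels table in the README hunk above maps each access level to a set of scanners. As a minimal sketch of that cumulative mapping, assuming hypothetical names (`SCANNERS_BY_LEVEL`, `select_scanners`) that are not part of the actual ai-bom API:

```python
# Illustrative level -> scanner mapping; names are assumptions, not ai-bom's API.
SCANNERS_BY_LEVEL = {
    1: ["code", "cloud", "n8n"],  # read-only file access
    2: ["docker"],                # + Docker socket access
    3: ["network"],               # + network/env file access
    4: ["cloud-iam"],             # + cloud provider credentials (future live scan)
}

def select_scanners(max_level: int) -> list[str]:
    """Return every scanner available at or below the given access level."""
    return [
        scanner
        for level, scanners in SCANNERS_BY_LEVEL.items()
        if level <= max_level
        for scanner in scanners
    ]

print(select_scanners(2))  # ['code', 'cloud', 'n8n', 'docker']
```

Each level strictly adds capability, so a Level-2 scan still runs everything Level 1 would.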

examples/demo-project/infra/main.tf

Lines changed: 45 additions & 0 deletions

@@ -32,3 +32,48 @@ resource "aws_sagemaker_model" "llm_model" {
     model_data_url = "s3://my-bucket/models/fine-tuned-model.tar.gz"
   }
 }
+
+# Azure OpenAI Deployment
+resource "azurerm_cognitive_deployment" "gpt4o" {
+  name                 = "gpt-4o-global"
+  cognitive_account_id = azurerm_cognitive_account.openai.id
+  model_name           = "gpt-4o"
+
+  model {
+    format  = "OpenAI"
+    name    = "gpt-4o"
+    version = "2024-05-13"
+  }
+
+  sku {
+    name     = "GlobalStandard"
+    capacity = 50
+  }
+}
+
+# Google Vertex AI Reasoning Engine
+resource "google_vertex_ai_reasoning_engine" "support_agent" {
+  display_name = "ai-support-agent"
+  description  = "Customer support reasoning engine powered by Gemini"
+  project      = "my-gcp-project"
+  location     = "us-central1"
+}
+
+# AWS Bedrock Guardrail
+resource "aws_bedrock_guardrail" "content_safety" {
+  name        = "production-content-filter"
+  description = "Block harmful and sensitive content in AI responses"
+
+  content_policy_config {
+    filters_config {
+      type            = "HATE"
+      input_strength  = "HIGH"
+      output_strength = "HIGH"
+    }
+    filters_config {
+      type            = "VIOLENCE"
+      input_strength  = "HIGH"
+      output_strength = "HIGH"
+    }
+  }
+}
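The demo resources above are exactly what a Level-1 IaC pass should detect. As a standalone sketch of regex-based Terraform resource detection, with an illustrative subset of the scanner's resource mapping (the helper name and set are assumptions, not the real code):

```python
import re

# Subset of the scanner's Terraform AI resource types, for illustration only.
AI_RESOURCE_TYPES = {
    "azurerm_cognitive_deployment",
    "google_vertex_ai_reasoning_engine",
    "aws_bedrock_guardrail",
}

# Matches Terraform resource headers: resource "<type>" "<name>" {
RESOURCE_HEADER = re.compile(r'resource\s+"([^"]+)"\s+"([^"]+)"')

def find_ai_resources(tf_text: str) -> list[tuple[str, str]]:
    """Return (resource_type, resource_name) pairs for known AI resource types."""
    return [m for m in RESOURCE_HEADER.findall(tf_text) if m[0] in AI_RESOURCE_TYPES]

sample = '''
resource "azurerm_cognitive_deployment" "gpt4o" {
  name = "gpt-4o-global"
}
resource "aws_s3_bucket" "logs" {}
'''
print(find_ai_resources(sample))  # [('azurerm_cognitive_deployment', 'gpt4o')]
```

Non-AI resources (like the S3 bucket) simply fall through the membership check.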

src/ai_bom/demo_data/infra/main.tf

Lines changed: 45 additions & 0 deletions

@@ -32,3 +32,48 @@ resource "aws_sagemaker_model" "llm_model" {
     model_data_url = "s3://my-bucket/models/fine-tuned-model.tar.gz"
   }
 }
+
+# Azure OpenAI Deployment
+resource "azurerm_cognitive_deployment" "gpt4o" {
+  name                 = "gpt-4o-global"
+  cognitive_account_id = azurerm_cognitive_account.openai.id
+  model_name           = "gpt-4o"
+
+  model {
+    format  = "OpenAI"
+    name    = "gpt-4o"
+    version = "2024-05-13"
+  }
+
+  sku {
+    name     = "GlobalStandard"
+    capacity = 50
+  }
+}
+
+# Google Vertex AI Reasoning Engine
+resource "google_vertex_ai_reasoning_engine" "support_agent" {
+  display_name = "ai-support-agent"
+  description  = "Customer support reasoning engine powered by Gemini"
+  project      = "my-gcp-project"
+  location     = "us-central1"
+}
+
+# AWS Bedrock Guardrail
+resource "aws_bedrock_guardrail" "content_safety" {
+  name        = "production-content-filter"
+  description = "Block harmful and sensitive content in AI responses"
+
+  content_policy_config {
+    filters_config {
+      type            = "HATE"
+      input_strength  = "HIGH"
+      output_strength = "HIGH"
+    }
+    filters_config {
+      type            = "VIOLENCE"
+      input_strength  = "HIGH"
+      output_strength = "HIGH"
+    }
+  }
+}

src/ai_bom/scanners/cloud_scanner.py

Lines changed: 105 additions & 12 deletions

@@ -28,38 +28,108 @@ class CloudScanner(BaseScanner):

     # Terraform resource type to (provider, component_type) mapping
     TERRAFORM_AI_RESOURCES = {
+        # --- AWS Bedrock ---
         "aws_bedrockagent_agent": ("AWS Bedrock", ComponentType.agent_framework),
         "aws_bedrockagent_knowledge_base": ("AWS Bedrock", ComponentType.tool),
+        "aws_bedrock_custom_model": ("AWS Bedrock", ComponentType.model),
+        "aws_bedrock_provisioned_model_throughput": ("AWS Bedrock", ComponentType.endpoint),
+        "aws_bedrock_guardrail": ("AWS Bedrock", ComponentType.tool),
+        "aws_bedrock_model_invocation_logging_configuration": ("AWS Bedrock", ComponentType.tool),
+        "aws_bedrockagent_agent_action_group": ("AWS Bedrock", ComponentType.tool),
+        "aws_bedrockagent_agent_alias": ("AWS Bedrock", ComponentType.agent_framework),
+        "aws_bedrockagent_data_source": ("AWS Bedrock", ComponentType.tool),
+        "aws_bedrockagent_flow": ("AWS Bedrock", ComponentType.workflow),
+        "aws_bedrockagent_prompt": ("AWS Bedrock", ComponentType.tool),
+        # --- AWS SageMaker ---
         "aws_sagemaker_endpoint": ("AWS SageMaker", ComponentType.endpoint),
         "aws_sagemaker_model": ("AWS SageMaker", ComponentType.model),
-        "aws_sagemaker_endpoint_configuration": (
-            "AWS SageMaker",
-            ComponentType.endpoint,
-        ),
+        "aws_sagemaker_endpoint_configuration": ("AWS SageMaker", ComponentType.endpoint),
+        "aws_sagemaker_notebook_instance": ("AWS SageMaker", ComponentType.tool),
+        "aws_sagemaker_domain": ("AWS SageMaker", ComponentType.container),
+        "aws_sagemaker_pipeline": ("AWS SageMaker", ComponentType.workflow),
+        "aws_sagemaker_feature_group": ("AWS SageMaker", ComponentType.tool),
+        "aws_sagemaker_space": ("AWS SageMaker", ComponentType.container),
+        "aws_sagemaker_app": ("AWS SageMaker", ComponentType.tool),
+        "aws_sagemaker_model_package_group": ("AWS SageMaker", ComponentType.model),
+        # --- AWS Comprehend ---
+        "aws_comprehend_document_classifier": ("AWS Comprehend", ComponentType.model),
+        "aws_comprehend_entity_recognizer": ("AWS Comprehend", ComponentType.model),
+        # --- AWS Kendra ---
+        "aws_kendra_index": ("AWS Kendra", ComponentType.tool),
+        # --- AWS Lex ---
+        "aws_lexv2models_bot": ("AWS Lex", ComponentType.agent_framework),
+        # --- AWS Rekognition ---
+        "aws_rekognition_project": ("AWS Rekognition", ComponentType.model),
+        # --- Google Vertex AI ---
         "google_vertex_ai_endpoint": ("Google Vertex AI", ComponentType.endpoint),
         "google_vertex_ai_featurestore": ("Google Vertex AI", ComponentType.tool),
         "google_vertex_ai_index": ("Google Vertex AI", ComponentType.tool),
         "google_vertex_ai_tensorboard": ("Google Vertex AI", ComponentType.tool),
+        "google_vertex_ai_dataset": ("Google Vertex AI", ComponentType.tool),
+        "google_vertex_ai_metadata_store": ("Google Vertex AI", ComponentType.tool),
+        "google_vertex_ai_deployment_resource_pool": ("Google Vertex AI", ComponentType.container),
+        "google_vertex_ai_index_endpoint": ("Google Vertex AI", ComponentType.endpoint),
+        "google_vertex_ai_feature_online_store": ("Google Vertex AI", ComponentType.tool),
+        "google_vertex_ai_reasoning_engine": ("Google Vertex AI", ComponentType.agent_framework),
+        "google_notebooks_instance": ("Google Vertex AI", ComponentType.tool),
+        "google_workbench_instance": ("Google Vertex AI", ComponentType.tool),
+        # --- Google ML Engine ---
         "google_ml_engine_model": ("Google ML Engine", ComponentType.model),
+        # --- Google Dialogflow CX ---
+        "google_dialogflow_cx_agent": ("Google Dialogflow CX", ComponentType.agent_framework),
+        # --- Google Discovery Engine ---
+        "google_discovery_engine_search_engine": (
+            "Google Discovery Engine",
+            ComponentType.endpoint,
+        ),
+        # --- Azure AI ---
         "azurerm_cognitive_account": ("Azure AI", ComponentType.llm_provider),
+        "azurerm_cognitive_deployment": ("Azure OpenAI", ComponentType.endpoint),
+        "azurerm_ai_services": ("Azure AI", ComponentType.llm_provider),
+        "azurerm_ai_foundry": ("Azure AI Foundry", ComponentType.tool),
+        "azurerm_ai_foundry_project": ("Azure AI Foundry", ComponentType.tool),
+        # --- Azure ML ---
         "azurerm_machine_learning_workspace": ("Azure ML", ComponentType.tool),
-        "azurerm_machine_learning_compute_cluster": (
-            "Azure ML",
-            ComponentType.container,
-        ),
-        "azurerm_machine_learning_compute_instance": (
-            "Azure ML",
-            ComponentType.container,
-        ),
+        "azurerm_machine_learning_compute_cluster": ("Azure ML", ComponentType.container),
+        "azurerm_machine_learning_compute_instance": ("Azure ML", ComponentType.container),
+        "azurerm_machine_learning_inference_cluster": ("Azure ML", ComponentType.endpoint),
+        "azurerm_machine_learning_synapse_spark": ("Azure ML", ComponentType.container),
+        "azurerm_machine_learning_datastore_blobstorage": ("Azure ML", ComponentType.tool),
     }

     # CloudFormation resource types to (provider, component_type) mapping
     CLOUDFORMATION_AI_RESOURCES = {
+        # --- Bedrock ---
         "AWS::Bedrock::Agent": ("AWS Bedrock", ComponentType.agent_framework),
         "AWS::Bedrock::KnowledgeBase": ("AWS Bedrock", ComponentType.tool),
+        "AWS::Bedrock::AgentAlias": ("AWS Bedrock", ComponentType.agent_framework),
+        "AWS::Bedrock::DataSource": ("AWS Bedrock", ComponentType.tool),
+        "AWS::Bedrock::Flow": ("AWS Bedrock", ComponentType.workflow),
+        "AWS::Bedrock::FlowAlias": ("AWS Bedrock", ComponentType.workflow),
+        "AWS::Bedrock::Guardrail": ("AWS Bedrock", ComponentType.tool),
+        "AWS::Bedrock::Prompt": ("AWS Bedrock", ComponentType.tool),
+        "AWS::Bedrock::ApplicationInferenceProfile": ("AWS Bedrock", ComponentType.endpoint),
+        # --- SageMaker ---
         "AWS::SageMaker::Endpoint": ("AWS SageMaker", ComponentType.endpoint),
         "AWS::SageMaker::Model": ("AWS SageMaker", ComponentType.model),
         "AWS::SageMaker::EndpointConfig": ("AWS SageMaker", ComponentType.endpoint),
+        "AWS::SageMaker::NotebookInstance": ("AWS SageMaker", ComponentType.tool),
+        "AWS::SageMaker::Domain": ("AWS SageMaker", ComponentType.container),
+        "AWS::SageMaker::Pipeline": ("AWS SageMaker", ComponentType.workflow),
+        "AWS::SageMaker::FeatureGroup": ("AWS SageMaker", ComponentType.tool),
+        "AWS::SageMaker::ModelPackage": ("AWS SageMaker", ComponentType.model),
+        "AWS::SageMaker::ModelPackageGroup": ("AWS SageMaker", ComponentType.model),
+        "AWS::SageMaker::InferenceComponent": ("AWS SageMaker", ComponentType.endpoint),
+        "AWS::SageMaker::Space": ("AWS SageMaker", ComponentType.container),
+        # --- Comprehend ---
+        "AWS::Comprehend::DocumentClassifier": ("AWS Comprehend", ComponentType.model),
+        "AWS::Comprehend::Flywheel": ("AWS Comprehend", ComponentType.workflow),
+        # --- Kendra ---
+        "AWS::Kendra::Index": ("AWS Kendra", ComponentType.tool),
+        # --- Lex ---
+        "AWS::Lex::Bot": ("AWS Lex", ComponentType.agent_framework),
+        # --- Rekognition ---
+        "AWS::Rekognition::Project": ("AWS Rekognition", ComponentType.model),
     }

     # Patterns for GPU instance types
@@ -292,6 +362,21 @@ def _extract_terraform_metadata(
         if endpoint_name_match:
             metadata["endpoint_name"] = endpoint_name_match.group(1)

+        # kind = "..." (common in GCP resources)
+        kind_match = re.search(r'kind\s*=\s*"([^"]+)"', block_text)
+        if kind_match:
+            metadata["kind"] = kind_match.group(1)
+
+        # display_name = "..." (common in Azure/GCP resources)
+        display_name_match = re.search(r'display_name\s*=\s*"([^"]+)"', block_text)
+        if display_name_match:
+            metadata["display_name"] = display_name_match.group(1)
+
+        # description = "..." (common across providers)
+        description_match = re.search(r'description\s*=\s*"([^"]+)"', block_text)
+        if description_match:
+            metadata["description"] = description_match.group(1)
+
         return metadata

     def _scan_cloudformation(self, file_path: Path) -> list[AIComponent]:
@@ -345,6 +430,10 @@ def _scan_cloudformation(self, file_path: Path) -> list[AIComponent]:
                 properties.get("ModelId", "")
                 or properties.get("ModelName", "")
                 or properties.get("FoundationModel", "")
+                or properties.get("AgentName", "")
+                or properties.get("FlowName", "")
+                or properties.get("GuardrailName", "")
+                or properties.get("PipelineName", "")
             )

             # Create metadata
@@ -495,6 +584,10 @@ def _infer_usage_type(
             # Default to completion for LLM endpoints
             return UsageType.completion

+        # Workflows are used for orchestration
+        if component_type == ComponentType.workflow:
+            return UsageType.orchestration
+
        # Tools are used for tool_use
        if component_type == ComponentType.tool:
            return UsageType.tool_use
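The two behavior changes in this file, metadata extraction and the workflow → orchestration branch, can be exercised in isolation. A sketch using the same regexes as the diff above, but with plain strings standing in for the real ComponentType/UsageType enums (the function names here are illustrative, not the scanner's actual methods):

```python
import re

def extract_terraform_metadata(block_text: str) -> dict:
    """Pull kind / display_name / description attributes out of an HCL block."""
    metadata = {}
    for key in ("kind", "display_name", "description"):
        # Same pattern shape as the diff: key = "value"
        match = re.search(rf'{key}\s*=\s*"([^"]+)"', block_text)
        if match:
            metadata[key] = match.group(1)
    return metadata

def infer_usage_type(component_type: str) -> str:
    """Simplified inference order: workflow -> orchestration is the new branch."""
    if component_type == "workflow":
        return "orchestration"
    if component_type == "tool":
        return "tool_use"
    return "completion"

block = '''
  display_name = "ai-support-agent"
  description  = "Customer support reasoning engine powered by Gemini"
'''
print(extract_terraform_metadata(block))
print(infer_usage_type("workflow"))  # orchestration
```

On the reasoning-engine block from the demo Terraform, this yields both `display_name` and `description`, which is what lets the scanner report a human-readable component name.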
Lines changed: 33 additions & 0 deletions

@@ -0,0 +1,33 @@
+AWSTemplateFormatVersion: "2010-09-09"
+Description: Sample CloudFormation template with AI resources for testing
+
+Resources:
+  OrderProcessingFlow:
+    Type: AWS::Bedrock::Flow
+    Properties:
+      Name: order-processing-flow
+      Description: AI flow for processing customer orders
+      ExecutionRoleArn: !Sub "arn:aws:iam::${AWS::AccountId}:role/BedrockFlowRole"
+
+  ContentGuardrail:
+    Type: AWS::Bedrock::Guardrail
+    Properties:
+      Name: content-safety
+      Description: Content filtering guardrail
+      BlockedInputMessaging: "Input blocked by guardrail"
+      BlockedOutputsMessaging: "Output blocked by guardrail"
+
+  TrainingPipeline:
+    Type: AWS::SageMaker::Pipeline
+    Properties:
+      PipelineName: model-training-pipeline
+      PipelineDescription: Automated model training pipeline
+      RoleArn: !Sub "arn:aws:iam::${AWS::AccountId}:role/SageMakerRole"
+
+  SearchIndex:
+    Type: AWS::Kendra::Index
+    Properties:
+      Name: knowledge-base-index
+      Description: Enterprise knowledge base search index
+      Edition: ENTERPRISE_EDITION
+      RoleArn: !Sub "arn:aws:iam::${AWS::AccountId}:role/KendraRole"