Commit 436c674

Update DDE and storage modules
- Add support for AWS-managed encryption (Bedrock Data Automation requires AWS-managed encryption)
- Add support for AWS-managed Bedrock Data Automation blueprints
1 parent d919bd1 commit 436c674

11 files changed: +110 -86 lines changed


infra/modules/document-data-extraction/resources/README.md

Lines changed: 49 additions & 16 deletions
@@ -1,13 +1,20 @@
 # Bedrock Data Automation Terraform Module

-This module provisions AWS Bedrock Data Automation resources, including the data automation project, blueprints, and associated IAM role for accessing S3 buckets.
+This module provisions AWS Bedrock Data Automation resources, including the data automation project and blueprints.
+

 ## Overview

 The module creates:
 - **Bedrock Data Automation Project** - Main project resource for data automation workflows
 - **Bedrock Blueprints** - Custom extraction blueprints configured via a map
-- **IAM Role** - Role for Bedrock service to assume with access to input/output S3 buckets
+
+## Important Notes
+
+- **BDA uses its own internal service role** - This module does not create a custom IAM role for BDA. Bedrock Data Automation uses an AWS-managed internal service role for S3 access.
+- **S3 bucket encryption** - S3 buckets used with BDA should use AWS-managed encryption (AES256), not customer-managed KMS keys.
+- **Lambda permissions** - Any Lambda function invoking BDA must have S3 permissions for both input and output buckets directly attached to its execution role.
+- **No bucket policies needed** - BDA does not require bucket policies allowing the `bedrock.amazonaws.com` service principal.

 ## Features
 - Creates resources required for Bedrock Data Automation workflows
@@ -17,14 +24,49 @@ The module creates:
 - Complies with Checkov recommendations for security and compliance
 - Designed for cross-layer usage (see project module conventions)

+## Usage
+
+```hcl
+module "bedrock_data_automation" {
+  source = "../../modules/document-data-extraction/resources"
+
+  name = "my-app-prod"
+
+  blueprints_map = {
+    invoice = {
+      schema = file("${path.module}/schemas/invoice.json")
+      type = "DOCUMENT"
+      tags = {
+        Environment = "production"
+        ManagedBy = "terraform"
+      }
+    }
+  }
+
+  standard_output_configuration = {
+    document = {
+      extraction = {
+        granularity = {
+          types = ["PAGE", "ELEMENT"]
+        }
+      }
+    }
+  }
+
+  tags = {
+    Environment = "production"
+    ManagedBy = "terraform"
+  }
+}
+```
+
 ## Inputs

 ### Required Variables

 | Name | Description | Type | Required |
 |-------|-------------|------|----------|
 | `name` | Prefix to use for resource names (e.g., "my-app-prod") | `string` | yes |
-| `data_access_policy_arns` | Map of policy ARNs for input and output locations to attach to the BDA role | `map(string)` | yes |
 | `blueprints_map` | Map of unique blueprints with keys as blueprint identifiers and values as blueprint objects | `map(object)` | yes |

 #### `blueprints_map` Object Structure
@@ -59,17 +101,16 @@ See `variables.tf` for complete structure details.
 | Name | Description |
 |------|-------------|
 | `bda_project_arn` | The ARN of the Bedrock Data Automation project |
-| `bda_role_name` | The name of the IAM role used by Bedrock Data Automation |
-| `bda_role_arn` | The ARN of the IAM role used by Bedrock Data Automation |
 | `access_policy_arn` | The ARN of the IAM policy for accessing the Bedrock Data Automation project |
-
+| `bda_profile_arn` | The profile ARN for cross-region inference |
+| `bda_blueprint_arns` | List of created blueprint ARNs |
+| `bda_blueprint_names` | List of created blueprint names |
+| `bda_blueprint_arn_to_name` | Map of blueprint ARNs to names |

 ## Resources Created

 - `awscc_bedrock_data_automation_project.bda_project` - Main BDA project
 - `awscc_bedrock_blueprint.bda_blueprint` - One or more blueprints (created from `blueprints_map`)
-- `aws_iam_role.bda_role` - IAM role for Bedrock service
-- `aws_iam_role_policy_attachment.role_policy_attachments` - Policy attachments for S3 access

 ## Project Conventions

@@ -95,11 +136,6 @@ module "bedrock_data_automation" {

   name = "my-app"

-  data_access_policy_arns = {
-    input = aws_iam_policy.input.arn
-    output = aws_iam_policy.output.arn
-  }
-
   blueprints_map = {} # No custom blueprints
 }
 ```
@@ -110,7 +146,6 @@ module "bedrock_data_automation" {
   source = "../../modules/document-data-extraction/resources"

   name = "my-app"
-  data_access_policy_arns = { /* ... */ }
   blueprints_map = { /* ... */ }

   standard_output_configuration = {
@@ -149,8 +184,6 @@ module "bedrock_data_automation" {
 - AWS provider configured
 - AWS Cloud Control provider (awscc) configured
 - Appropriate AWS permissions to create Bedrock and IAM resources
-- KMS keys
-- S3 bucket policies defined for input/output buckets

 ## References
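
The "Lambda permissions" note added to this README says that the invoking Lambda's execution role needs S3 access to both buckets. A minimal sketch of what that could look like follows. The bucket names, the role name, and the exact set of S3 actions are assumptions chosen for illustration; none of them come from this commit.

```hcl
# Hypothetical sketch: grant a Lambda execution role direct S3 access to the
# BDA input and output buckets. All names below are placeholders.
data "aws_iam_policy_document" "lambda_bda_s3_access" {
  statement {
    sid     = "ReadInputDocuments"
    effect  = "Allow"
    actions = ["s3:GetObject", "s3:ListBucket"]
    resources = [
      "arn:aws:s3:::example-input-bucket",
      "arn:aws:s3:::example-input-bucket/*",
    ]
  }

  statement {
    sid     = "ReadWriteExtractionResults"
    effect  = "Allow"
    actions = ["s3:GetObject", "s3:PutObject", "s3:ListBucket"]
    resources = [
      "arn:aws:s3:::example-output-bucket",
      "arn:aws:s3:::example-output-bucket/*",
    ]
  }
}

# Attach the policy inline to an existing execution role (name assumed).
resource "aws_iam_role_policy" "lambda_bda_s3_access" {
  name   = "bda-s3-access"
  role   = "example-lambda-execution-role"
  policy = data.aws_iam_policy_document.lambda_bda_s3_access.json
}
```

The action list is deliberately narrow; the permissions a given workload actually needs should be checked against the Bedrock Data Automation documentation.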

infra/modules/document-data-extraction/resources/access_control.tf

Lines changed: 3 additions & 1 deletion
@@ -9,14 +9,16 @@ data "aws_iam_policy_document" "bedrock_access" {
       "bedrock:InvokeModel",
       "bedrock:InvokeModelWithResponseStream",
       "bedrock:GetDataAutomationProject",
+      "bedrock:GetBlueprint",
       "bedrock:StartDataAutomationJob",
       "bedrock:GetDataAutomationJob",
       "bedrock:ListDataAutomationJobs"
     ]
     effect = "Allow"
     resources = [
       awscc_bedrock_data_automation_project.bda_project.project_arn,
-      "${awscc_bedrock_data_automation_project.bda_project.project_arn}/*"
+      "${awscc_bedrock_data_automation_project.bda_project.project_arn}/*",
+      "arn:aws:bedrock:*:*:blueprint/*"
     ]
   }
 }

infra/modules/document-data-extraction/resources/encryption.tf

Lines changed: 0 additions & 5 deletions
This file was deleted.

infra/modules/document-data-extraction/resources/main.tf

Lines changed: 21 additions & 41 deletions
@@ -7,58 +7,38 @@ locals {
     }
   ]

-  kms_encryption_context = {
-    Environment = lookup(var.tags, "environment", "unknown")
-  }
+  all_blueprints = concat(
+    # custom blueprints created from json schemas
+    [for k, v in awscc_bedrock_blueprint.bda_blueprint : {
+      blueprint_arn   = v.blueprint_arn
+      blueprint_stage = v.blueprint_stage
+    }],
+    # aws managed blueprints referenced by arn
+    var.aws_managed_blueprints != null ? [
+      for arn in var.aws_managed_blueprints : {
+        blueprint_arn   = arn
+        blueprint_stage = "LIVE"
+      }
+    ] : []
+  )
 }

 resource "awscc_bedrock_data_automation_project" "bda_project" {
   project_name                  = "${var.name}-project"
   project_description           = "Project for ${var.name}"
-  kms_encryption_context        = local.kms_encryption_context
-  kms_key_id                    = aws_kms_key.bedrock_data_automation.arn
   tags                          = local.bda_tags
   standard_output_configuration = var.standard_output_configuration
-  custom_output_configuration = {
-    blueprints = [for k, v in awscc_bedrock_blueprint.bda_blueprint : {
-      blueprint_arn   = v.blueprint_arn
-      blueprint_stage = v.blueprint_stage
-    }]
-  }
+  custom_output_configuration = length(local.all_blueprints) > 0 ? {
+    blueprints = local.all_blueprints
+  } : null
   override_configuration = var.override_configuration
 }

 resource "awscc_bedrock_blueprint" "bda_blueprint" {
   for_each = var.blueprints_map

-  blueprint_name         = "${var.name}-${each.key}"
-  schema                 = each.value.schema
-  type                   = each.value.type
-  kms_encryption_context = local.kms_encryption_context
-  kms_key_id             = aws_kms_key.bedrock_data_automation.arn
-  tags                   = local.bda_tags
+  blueprint_name = "${var.name}-${each.key}"
+  schema         = each.value.schema
+  type           = each.value.type
+  tags           = local.bda_tags
 }
-
-resource "aws_iam_role" "bda_role" {
-  name = "${var.name}-bda_role"
-
-  assume_role_policy = jsonencode({
-    Version = "2012-10-17"
-    Statement = [
-      {
-        Effect = "Allow"
-        Principal = {
-          Service = "bedrock.amazonaws.com"
-        }
-        Action = "sts:AssumeRole"
-      }
-    ]
-  })
-}
-
-resource "aws_iam_role_policy_attachment" "role_policy_attachments" {
-  for_each = var.data_access_policy_arns
-
-  role       = aws_iam_role.bda_role.name
-  policy_arn = each.value
-}

infra/modules/document-data-extraction/resources/variables.tf

Lines changed: 4 additions & 3 deletions
@@ -3,9 +3,10 @@ variable "name" {
   type = string
 }

-variable "data_access_policy_arns" {
-  description = "The set of policy ARNs for the input and output locations to attach to the BDA role."
-  type        = map(string)
+variable "aws_managed_blueprints" {
+  description = "List of AWS managed blueprint ARNs (stage defaults to LIVE)"
+  type        = list(string)
+  default     = null
 }

 variable "custom_output_config" {
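
As an illustration of the new `aws_managed_blueprints` input, a caller might pass a list of blueprint ARNs alongside an empty `blueprints_map`. This is a sketch only: the ARN below is a placeholder rather than a real AWS-managed blueprint, and the module path simply mirrors the README examples in this commit.

```hcl
module "bedrock_data_automation" {
  source = "../../modules/document-data-extraction/resources"

  name = "my-app"

  # No custom blueprints in this sketch; only AWS-managed ones are referenced.
  blueprints_map = {}

  # Placeholder ARN (assumption): replace with the ARN of an actual
  # AWS-managed blueprint available in the deployment region.
  aws_managed_blueprints = [
    "arn:aws:bedrock:us-east-1:aws:blueprint/example-managed-blueprint",
  ]
}
```

With this input, the `all_blueprints` local in `main.tf` would contain one entry per listed ARN, each with `blueprint_stage = "LIVE"`. Leaving the variable at its `null` default and passing an empty `blueprints_map` results in `custom_output_configuration` being `null`.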

infra/modules/storage/access_control.tf

Lines changed: 7 additions & 4 deletions
@@ -57,9 +57,12 @@ data "aws_iam_policy_document" "storage_access" {
       "arn:aws:s3:::${var.name}/*"
     ]
   }
-  statement {
-    actions   = ["kms:GenerateDataKey", "kms:Decrypt"]
-    effect    = "Allow"
-    resources = [aws_kms_key.storage.arn]
+  dynamic "statement" {
+    for_each = var.use_aws_managed_encryption ? [] : [1]
+    content {
+      actions   = ["kms:GenerateDataKey", "kms:Decrypt"]
+      effect    = "Allow"
+      resources = [aws_kms_key.storage[0].arn]
+    }
   }
 }
}

infra/modules/storage/encryption.tf

Lines changed: 4 additions & 2 deletions
@@ -1,4 +1,6 @@
 resource "aws_kms_key" "storage" {
+  count = var.use_aws_managed_encryption ? 0 : 1
+
   description = "KMS key for bucket ${var.name}"
   # The waiting period, specified in number of days. After the waiting period ends, AWS KMS deletes the KMS key.
   deletion_window_in_days = "10"
@@ -10,8 +12,8 @@ resource "aws_s3_bucket_server_side_encryption_configuration" "storage" {
   bucket = aws_s3_bucket.storage.id
   rule {
     apply_server_side_encryption_by_default {
-      kms_master_key_id = aws_kms_key.storage.arn
-      sse_algorithm     = "aws:kms"
+      kms_master_key_id = var.use_aws_managed_encryption ? null : aws_kms_key.storage[0].arn
+      sse_algorithm     = var.use_aws_managed_encryption ? "AES256" : "aws:kms"
     }
     bucket_key_enabled = true
   }

infra/modules/storage/main.tf

Lines changed: 1 addition & 0 deletions
@@ -9,4 +9,5 @@ resource "aws_s3_bucket" "storage" {
   # checkov:skip=CKV_AWS_144:Cross region replication not required by default
   # checkov:skip=CKV2_AWS_62:S3 bucket does not need notifications enabled
   # checkov:skip=CKV_AWS_21:Bucket versioning is not needed
+  # checkov:skip=CKV_AWS_145:AWS-managed encryption (AES256) used when use_aws_managed_encryption=true
 }

infra/modules/storage/variables.tf

Lines changed: 6 additions & 0 deletions
@@ -8,3 +8,9 @@ variable "name" {
   type        = string
   description = "Name of the AWS S3 bucket. Needs to be globally unique across all regions."
 }
+
+variable "use_aws_managed_encryption" {
+  description = "Use AWS-managed encryption (AES256) instead of customer-managed KMS keys"
+  type        = bool
+  default     = false
+}
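
To show how the new flag might be used, here is a hedged sketch of a storage module call for a bucket intended for Bedrock Data Automation. The module path, the module label, and the bucket name are assumptions for illustration, and any other inputs the storage module requires are omitted.

```hcl
module "bda_input_bucket" {
  # Path assumed relative to the calling configuration.
  source = "../../modules/storage"

  # Hypothetical bucket name; it must be globally unique.
  name = "my-app-prod-bda-input"

  # Skips the customer-managed KMS key and configures AES256 as the
  # bucket's default server-side encryption algorithm.
  use_aws_managed_encryption = true
}
```

Because the default is `false`, existing callers keep the customer-managed KMS key unless they opt in. When the flag is `true`, the `aws_kms_key.storage` resource has `count = 0` and the KMS statement is dropped from the storage access policy.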

infra/{{app_name}}/app-config/env-config/document_data_extraction.tf

Lines changed: 4 additions & 1 deletion
@@ -7,7 +7,10 @@ locals {
   # Blueprints path is relative to infra/{{app_name}}/service/ directory
   # Contains JSON schema files for custom Bedrock Data Automation blueprints
   # (e.g., not AWS-managed blueprints)
-  blueprints_path = "./document-data-extraction-blueprints/"
+  blueprints_path = "./document-data-extraction-blueprints/"
+
+  # Optional: List of AWS-managed blueprint ARNs to use for document extraction
+  aws_managed_blueprints = null

   # BDA can only be deployed to us-east-1, us-west-2, and us-gov-west-1
   # TODO(https://github.com/navapbc/template-infra/issues/993) Add GovCloud Support
