adds terraform scripts for deploying Bifrost#1636
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a new multi-cloud Terraform module for Bifrost with provider-specific submodules (AWS, GCP, Azure, Kubernetes), extensive variables/outputs, example stacks per platform, README/docs, .gitignore updates, and example Terraform lockfiles. Modules provision networking, secrets, IAM, storage, compute (ECS/EKS/GKE/AKS/Cloud Run/ACI), ingress, and autoscaling. Changes
Sequence Diagram(s)sequenceDiagram
actor Admin as Administrator
participant TF as Terraform CLI
participant Orch as Bifrost Orchestrator
participant Prov as Provider Module
participant Svc as Service Module
participant Cloud as Cloud/API
Admin->>TF: terraform init && terraform apply
TF->>Orch: instantiate module (cloud_provider + vars)
Orch->>Orch: read config_json_file? merge overrides (vars)
Orch->>Prov: select provider module (aws/gcp/azure/kubernetes) with config_json + inputs
Prov->>Cloud: provision shared infra (VPC/Networking, Secrets, IAM, Logs)
Prov->>Svc: instantiate service (ECS/EKS/GKE/AKS/CloudRun/ACI/K8s)
Svc->>Cloud: create compute resources (cluster, task/service, LB, disk, domain)
Cloud-->>Svc: return runtime values (endpoints, names)
Svc-->>Prov: expose module outputs
Prov-->>Orch: forward provider outputs
Orch-->>TF: aggregate outputs (service_url, health_check_url, config_json)
TF-->>Admin: display outputs
Estimated Code Review Effort🎯 5 (Critical) | ⏱️ ~120 minutes Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches🧪 Generate unit tests (beta)
Comment |
🧪 Test Suite AvailableThis PR can be tested by a repository admin. |
There was a problem hiding this comment.
Actionable comments posted: 16
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
🤖 Fix all issues with AI agents
In `@terraform/examples/gcp-gke/main.tf`:
- Around line 8-11: The kubernetes provider block is unconfigured and must be
wired to the GKE cluster credentials; update the provider "kubernetes"
declaration to use the GKE cluster endpoint and CA from
google_container_cluster.bifrost (use master_auth[0].cluster_ca_certificate and
base64decode it) and use a data source like
data.google_client_config.default.access_token for the token, then move that
provider block and the new data.google_client_config data source into the GKE
service module where the google_container_cluster.bifrost resource is created so
the provider references the local cluster outputs instead of relying on ambient
gcloud credentials.
In `@terraform/modules/bifrost/.terraform/modules/modules.json`:
- Line 1: Remove the generated modules.json from source control and prevent
future commits: delete the committed modules.json file from the repo index
(unstage/remove it from Git), add a rule ignoring the .terraform directory
(e.g., add a single line .terraform/ to .gitignore), and commit the changes so
the generated artifact is no longer tracked; verify by running git status to
ensure modules.json is gone from tracked files. Ensure you reference the
generated modules.json file (the JSON with the "Modules" key) when
removing/tracking changes.
In
`@terraform/modules/bifrost/.terraform/providers/registry.terraform.io/hashicorp/kubernetes/3.0.1/darwin_arm64/LICENSE.txt`:
- Around line 1-375: The repo has committed Terraform provider binaries under
terraform/modules/bifrost/.terraform which must be removed and excluded; run git
rm -r --cached on the .terraform directory (e.g.,
terraform/modules/bifrost/.terraform), add the patterns **/.terraform/ and
**/.terraform/* to the root .gitignore, commit the removal and updated
.gitignore, and ensure you do NOT remove or ignore .terraform.lock.hcl (keep
.terraform.lock.hcl tracked).
In `@terraform/modules/bifrost/aws/main.tf`:
- Around line 141-149: The aws_secretsmanager_secret resource
aws_secretsmanager_secret.bifrost_config currently sets recovery_window_in_days
= 0 which causes immediate permanent deletion; add a new variable (e.g.,
secret_recovery_window_days) with a safe default like 7 and replace the
hardcoded 0 with that variable, update any documentation/variables.tf to declare
variable "secret_recovery_window_days" (type number, default 7) so the recovery
window is configurable for production.
In `@terraform/modules/bifrost/aws/services/ecs/main.tf`:
- Around line 125-137: The ALB currently only supports HTTP; add optional HTTPS
support and harden the ALB: introduce variables (e.g., enable_https bool default
false and certificate_arn string) and create an aws_lb_listener "https" resource
that is created only when var.create_load_balancer && var.enable_https (and
certificate present), with port 443, protocol "HTTPS", certificate_arn set to
the variable, ssl_policy set to a TLS1.2+ policy (e.g.,
ELBSecurityPolicy-TLS13-1-2-2021-06) and default_action forwarding to
aws_lb_target_group.bifrost; also enable header hardening on aws_lb.bifrost by
setting drop_invalid_header_fields = true (and any other header-dropping
options) so listeners and load balancer enforce TLS and drop invalid headers.
- Around line 22-25: The locals block builds cluster_arn with a wildcard account
ID when var.create_cluster is false, which produces an invalid ARN; update the
logic in locals.cluster_arn (and related locals.cluster_name) to either (A)
resolve the actual account ID via a data source (e.g., data.aws_caller_identity)
and construct a valid ARN when not creating the cluster, or (B) add a new input
variable (e.g., var.existing_cluster_arn) and use that value whenever
var.create_cluster is false; ensure aws_ecs_cluster.this is still used when
create_cluster is true and fall back to the provided/derived ARN when false.
In `@terraform/modules/bifrost/aws/services/eks/main.tf`:
- Around line 125-145: The EKS cluster resource aws_eks_cluster.this needs
Kubernetes secrets envelope encryption enabled by adding an encryption_config
block referencing a KMS key ARN; add a new variable (e.g., var.kms_key_arn) or
create an aws_kms_key resource and supply its arn to
aws_eks_cluster.this.encryption_config[0].provider.kms.key_arn, update module
inputs/variables accordingly, and add a depends_on for the KMS key/resource so
the key is created before the cluster.
In `@terraform/modules/bifrost/azure/main.tf`:
- Around line 67-78: The NSG security_rule currently uses source_address_prefix
= "*" (in the security_rule block) which permits unrestricted ingress; change
this to a configurable CIDR by adding a variable (e.g., allowed_ingress_cidr)
and use it in place of "*" (together with existing destination_port_range =
tostring(var.container_port)); update any variable defaults and documentation to
use a safe default (such as the VNet CIDR or an empty list requiring explicit
set) and ensure validation or a conditional fallback exists so production
deployments must explicitly set allowed_ingress_cidr rather than allowing open
access.
- Around line 90-124: azurerm_key_vault.this is created with no network
restrictions and allows public access; update the azurerm_key_vault.this
resource to disable public networking and add a network_acls block that
restricts access to your VNet/subnets (set public_network_access_enabled = false
and add network_acls { default_action = "Deny" trusted_service_access_enabled =
false (or as appropriate) and virtual_network_subnet_ids = [aws/azure subnet
ids] or ip_rules as needed) so only the managed identity and approved networks
can reach the vault; ensure the existing access_policy blocks remain intact and
test that the azurerm_user_assigned_identity.this principal_id still has secret
Get/List access after the ACL changes.
In `@terraform/modules/bifrost/azure/services/aci/main.tf`:
- Around line 5-10: The cpu conversion in locals (cpu_cores) is inconsistent
with its comment and ACI allowed values: decide whether var.cpu is millicores or
cores and then fix the math and threshold (e.g., if millicores use var.cpu /
1024 to match the comment, or keep /1000 but update the comment), and add
validation or mapping to allowed ACI values (0.25,0.5,1.0,1.5,2.0,4.0) so
cpu_cores emits a supported value; keep memory_gb as var.memory / 1024 if
var.memory is MB or adjust if units differ. Also replace unsafe JSON emission
that uses echo with a safe printf-based serialization (use printf '%s'
"$BIFROST_CONFIG") where the config is written to shell to avoid interpretation
issues.
- Around line 47-52: The startup command in the commands array currently uses
echo to write BIFROST_CONFIG to /app/data/config.json which can mangle JSON
(backslashes, special chars) and adds a trailing newline; change the write to
use printf '%s' with the BIFROST_CONFIG variable (i.e. replace the echo
"$BIFROST_CONFIG" > /app/data/config.json portion) so the JSON is preserved
exactly, keeping the rest of the conditional and the exec of
/app/docker-entrypoint.sh /app/main intact; apply the same change in the AWS ECS
module where the same pattern appears.
In `@terraform/modules/bifrost/azure/services/aks/main.tf`:
- Around line 20-45: Update the azurerm_kubernetes_cluster resource
(azurerm_kubernetes_cluster.this) and its default_node_pool and network_profile
blocks to expose and set security flags: add cluster-level attributes
api_server_authorized_ip_ranges, role_based_access_control_enabled = true (or
driven by new variable), and only_critical_addons_enabled = true; set
network_profile.network_policy = var.network_policy (default "azure"); in
default_node_pool add enable_host_encryption = var.enable_host_encryption
(default true). Add matching module variables (api_server_authorized_ip_ranges,
role_based_access_control_enabled, only_critical_addons_enabled, network_policy,
enable_host_encryption) with sensible defaults and use them in the resource so
consumers can configure these hardening controls.
In `@terraform/modules/bifrost/gcp/main.tf`:
- Around line 62-78: The firewall resource
google_compute_firewall.bifrost_allow_ingress currently uses source_ranges =
["0.0.0.0/0"], allowing unrestricted ingress; change this to use a configurable
variable (e.g., var.allowed_source_ranges or var.allowed_cidrs) and default it
to a safe, limited set (or require it as input) so production deployments don't
open the port globally; update the resource to reference that variable (instead
of the literal "0.0.0.0/0") and document the new variable so callers provide the
allowed CIDR ranges for var.name_prefix-tagged targets and var.container_port.
In `@terraform/modules/bifrost/gcp/services/gke/main.tf`:
- Around line 56-68: The node_config currently exposes instance metadata because
it lacks workload metadata protection; inside the existing node_config block add
a workload_metadata_config block that enables the GKE Metadata Server (e.g.
workload_metadata_config { mode = "GKE_METADATA_SERVER" }) to prevent pods from
directly accessing node/service-account metadata; update the node_config where
it appears (the node_config block alongside machine_type, service_account,
labels, tags, disk_size_gb) so the cluster/node pool uses the metadata server
mode.
- Around line 24-44: Add network-hardening to the
google_container_cluster.bifrost resource by configuring
master_authorized_networks_config to restrict API server access (using
var.master_authorized_networks or similar) and enabling private cluster settings
via private_cluster_config (e.g., enable_private_nodes,
enable_private_endpoint=false/true per your design, and master_ipv4_cidr_block
where required); update any related variables (e.g.,
var.master_authorized_networks, var.enable_private_cluster) and ensure node pool
access (e.g., master_global_access_config) and necessary IAM/peering settings
are adjusted so the cluster remains reachable by authorized networks and private
nodes can access control plane as intended.
🟡 Minor comments (6)
terraform/examples/aws-ecs/terraform.tfvars.example-35-41 (1)
35-41:⚠️ Potential issue | 🟡 MinorAvoid encouraging plaintext API keys in tfvars examples.
Even with placeholders, this pattern can lead to real keys being committed. Consider adding a note and placeholder values that direct users to Secret Manager or TF_VAR-based injection.
🔐 Suggested tweak
-providers_config = { +# NOTE: Prefer injecting API keys via Secret Manager or TF_VAR_* env vars. +providers_config = { openai = { - api_key = "sk-..." + api_key = "<set via secret manager>" } anthropic = { - api_key = "sk-ant-..." + api_key = "<set via secret manager>" } }terraform/modules/bifrost/azure/services/aks/variables.tf-63-110 (1)
63-110:⚠️ Potential issue | 🟡 MinorAdd validation to prevent empty
subnet_idswhen creating a cluster.
subnet_ids[0]is accessed in theazurerm_kubernetes_clusterresource (main.tf line 32) inside acreate_clusterconditional. An empty list will fail at plan/apply time whencreate_clusteris true. Add a validation guard.Suggested validation
variable "subnet_ids" { description = "Subnet IDs for the AKS cluster." type = list(string) + validation { + condition = var.create_cluster == false || length(var.subnet_ids) > 0 + error_message = "subnet_ids must contain at least one subnet when create_cluster is true." + } }terraform/README.md-48-49 (1)
48-49:⚠️ Potential issue | 🟡 MinorProperty count mismatch.
Line 48 states "All 18 top-level config properties" but line 49 lists only 16 properties. Either update the count to 16 or add the 2 missing properties.
terraform/modules/bifrost/azure/outputs.tf-1-15 (1)
1-15:⚠️ Potential issue | 🟡 Minor
coalesce()will error if both modules are absent.If neither AKS nor ACI is deployed (both module counts are 0),
coalesce()receives onlynullvalues and throws an error. Consider wrapping withtry()or usingconcat+one()pattern to safely returnnull.Proposed fix
output "service_url" { description = "URL to access the Bifrost service." - value = coalesce( + value = try(coalesce( try(module.aks[0].service_url, null), try(module.aci[0].service_url, null), - ) + ), null) } output "health_check_url" { description = "URL to the Bifrost health check endpoint." - value = coalesce( + value = try(coalesce( try(module.aks[0].health_check_url, null), try(module.aci[0].health_check_url, null), - ) + ), null) }terraform/modules/bifrost/outputs.tf-1-17 (1)
1-17:⚠️ Potential issue | 🟡 Minor
coalesce()will error if all values are null.If no cloud provider module is instantiated (misconfiguration),
coalesce()with all null arguments will fail at plan/apply time with a cryptic error. Consider wrapping intry()or adding a fallback.🛡️ Proposed fix
output "service_url" { description = "URL to access the Bifrost service." - value = coalesce( + value = try(coalesce( try(module.aws[0].service_url, null), try(module.gcp[0].service_url, null), try(module.azure[0].service_url, null), - ) + ), null) } output "health_check_url" { description = "URL to the Bifrost health check endpoint." - value = coalesce( + value = try(coalesce( try(module.aws[0].health_check_url, null), try(module.gcp[0].health_check_url, null), try(module.azure[0].health_check_url, null), - ) + ), null) }terraform/modules/bifrost/azure/main.tf-126-130 (1)
126-130:⚠️ Potential issue | 🟡 MinorSecurity: Key Vault secret has no expiration date.
Secrets should have an expiration date to enforce rotation policies. Consider adding an
expiration_dateattribute or documenting that secret rotation is handled externally.🔐 Add expiration date
resource "azurerm_key_vault_secret" "config" { name = "${var.name_prefix}-config" value = var.config_json key_vault_id = azurerm_key_vault.this.id + expiration_date = timeadd(timestamp(), "8760h") # 1 year }Note: Using
timestamp()will cause the resource to update on every apply. Consider using a variable for the expiration date instead.
🧹 Nitpick comments (19)
terraform/examples/aws-ecs/terraform.tfvars.example (1)
6-6: Pinimage_tagfor reproducible applies.Using
latestmakes deployments non-deterministic; a fixed version tag or digest is safer for IaC examples.🔧 Suggested tweak
-image_tag = "latest" +image_tag = "vX.Y.Z"terraform/modules/bifrost/azure/services/aks/main.tf (1)
180-184: Pin the init container image instead ofbusybox:latest.Using
latestis non‑deterministic and can introduce supply‑chain risk. Make it a pinned digest or configurable variable.🧷 Example (variable-driven) change
- image = "busybox:latest" + image = var.init_container_imageterraform/README.md (1)
53-69: Add language specifier to fenced code block.The directory structure code block lacks a language specifier. Use
textorplaintextto satisfy linters and improve rendering consistency.Proposed fix
-``` +```text terraform/ modules/bifrost/ # Top-level module (the only thing you call)terraform/examples/azure-aks/terraform.tfvars.example (1)
36-44: Consider using environment variables or secret references for API keys.While this is an example file with placeholder values, users might copy real keys here. Consider adding a comment recommending environment variables (e.g.,
TF_VAR_openai_api_key) or referencing Azure Key Vault for production use.Suggested comment addition
+# WARNING: Do not commit real API keys to version control. +# Use environment variables (TF_VAR_*) or Azure Key Vault references instead. providers_config = { openai = { api_key = "sk-..." }terraform/examples/gcp-gke/terraform.tfvars.example (1)
36-44: Sensitive credentials in example file — add a warning comment.While this is an example file, the
api_key = "sk-..."pattern may encourage users to commit actual secrets. Consider adding a comment emphasizing that API keys should be managed via GCP Secret Manager or environment variables, not stored in tfvars files.💡 Suggested improvement
providers_config = { openai = { - api_key = "sk-..." + # WARNING: Never commit real API keys. Use Secret Manager references or env vars. + # api_key = "sk-..." # Provide via TF_VAR_providers_config or secure injection } vertex = { project_id = "my-gcp-project" region = "us-central1" } }terraform/modules/bifrost/azure/services/aks/outputs.tf (1)
26-42: Nested conditionals reduce readability — consider using locals.The triple-nested ternary expressions for
service_urlandhealth_check_urlare functionally correct but hard to follow. Extracting intermediate values tolocalswould improve maintainability.♻️ Suggested refactor using locals
locals { ingress_url = var.create_load_balancer ? try( "http://${kubernetes_ingress_v1.bifrost[0].status[0].load_balancer[0].ingress[0].ip}", null ) : null external_base_url = var.domain_name != null ? "http://${var.domain_name}" : local.ingress_url internal_base_url = "http://${kubernetes_service.bifrost.metadata[0].name}.${kubernetes_namespace.this.metadata[0].name}.svc.cluster.local" base_url = var.create_load_balancer ? local.external_base_url : local.internal_base_url } output "service_url" { description = "URL to access the Bifrost service." value = local.base_url } output "health_check_url" { description = "URL to the Bifrost health check endpoint." value = local.base_url != null ? "${local.base_url}${var.health_check_path}" : null }terraform/modules/bifrost/aws/services/ecs/main.tf (1)
95-99: Hardcodedassign_public_ip = truemay not suit all deployments.For private subnets with NAT gateways or VPC endpoints, public IP assignment is unnecessary and may pose a security risk. Consider making this configurable.
♻️ Suggested change
network_configuration { subnets = var.subnet_ids security_groups = var.security_group_ids - assign_public_ip = true + assign_public_ip = var.assign_public_ip }Add to variables.tf:
variable "assign_public_ip" { description = "Whether to assign a public IP to tasks. Set to false for private subnets with NAT." type = bool default = true }terraform/examples/aws-ecs/variables.tf (1)
7-11: Consider documenting thatlatesttag should be avoided in production.The default
"latest"is fine for examples, but users should be advised to pin to a specific version in production deployments for reproducibility and rollback safety. Consider adding a note in the description.terraform/modules/bifrost/aws/services/eks/main.tf (2)
320-323: Pin the init container image to a specific digest.Using
busybox:latestis a mutable tag that can change unexpectedly and poses a supply chain risk. Pin to a specific version or digest.♻️ Proposed fix
init_container { name = "fix-permissions" - image = "busybox:latest" + image = "busybox:1.36" command = ["sh", "-c", "chown -R 1000:1000 /app/data && chmod -R 755 /app/data"]
233-260: EBS volume is pinned to a single availability zone.The
availability_zoneis hardcoded to${var.region}a(line 32), which limits pod scheduling flexibility. If no nodes are available in that AZ, the pod cannot be scheduled.Consider either:
- Using an EBS CSI driver with dynamic provisioning
- Making the AZ configurable via a variable
- Documenting this limitation
terraform/modules/bifrost/gcp/services/gke/outputs.tf (1)
26-42: Consider using HTTPS for external domain URLs.When
domain_nameis set, the URLs usehttp://protocol. If TLS is configured on the ingress (which is typical for production), the external URLs should usehttps://for accuracy.♻️ Proposed adjustment
output "service_url" { description = "URL to access the Bifrost service." value = ( var.domain_name != null - ? "http://${var.domain_name}" + ? "https://${var.domain_name}" : "http://${kubernetes_service.bifrost.metadata[0].name}.${kubernetes_namespace.bifrost.metadata[0].name}.svc.cluster.local" ) } output "health_check_url" { description = "URL to the Bifrost health check endpoint." value = ( var.domain_name != null - ? "http://${var.domain_name}${var.health_check_path}" + ? "https://${var.domain_name}${var.health_check_path}" : "http://${kubernetes_service.bifrost.metadata[0].name}.${kubernetes_namespace.bifrost.metadata[0].name}.svc.cluster.local${var.health_check_path}" ) }Alternatively, add a variable to control the protocol.
terraform/modules/bifrost/aws/main.tf (1)
126-135: Unrestricted egress is intentional but worth documenting.The static analysis flags unrestricted outbound traffic (
0.0.0.0/0). This is typically necessary for containers to pull images and call external APIs, but consider adding a comment explaining this design decision for future maintainers.📝 Add explanatory comment
resource "aws_vpc_security_group_egress_rule" "all_outbound" { count = local.create_security_groups ? 1 : 0 security_group_id = aws_security_group.this[0].id - description = "Allow all outbound traffic" + description = "Allow all outbound traffic (required for ECR pulls, external API calls, etc.)" cidr_ipv4 = "0.0.0.0/0" ip_protocol = "-1"terraform/examples/azure-aks/variables.tf (1)
35-51: Consider using stricter type definitions instead ofany.Using
type = anyforconfig_store,logs_store, andproviders_configbypasses Terraform's type validation, which can lead to runtime errors from malformed input. Consider defining object types with expected keys, or at minimum usemap(any)orobject({...})to provide some structure validation.terraform/modules/bifrost/main.tf (1)
32-33: Shallow merge may cause unintended config overwrites.The
merge()function performs a shallow merge, so ifbase_configcontains nested objects (e.g.,providerswith multiple keys) andoverridesprovides only a partialprovidersvalue, the entireproviderskey frombase_configwill be replaced rather than merged. Consider documenting this behavior or using a deep merge approach if partial overrides are expected.terraform/modules/bifrost/gcp/services/gke/main.tf (1)
206-209: Consider using a pinned image tag for the init container.Using
busybox:latest(Line 208) may cause unpredictable behavior if the image changes. Consider pinning to a specific version for reproducibility.📌 Pin the busybox version
init_container { name = "fix-permissions" - image = "busybox:latest" + image = "busybox:1.36" command = ["sh", "-c", "chown -R 1000:1000 /app/data && chmod -R 755 /app/data"]terraform/modules/bifrost/gcp/variables.tf (1)
82-111: Consider adding default values for optional feature flags.Variables like
create_load_balancer,enable_autoscaling, and related thresholds lack defaults. If these are truly optional features, providing sensible defaults (e.g.,falsefor feature flags) would improve usability and prevent Terraform from requiring explicit values for every deployment.💡 Example defaults
variable "create_load_balancer" { description = "Create a load balancer (GKE Ingress / Cloud Run domain mapping)." type = bool + default = false } variable "enable_autoscaling" { description = "Enable autoscaling for the service." type = bool + default = false }terraform/modules/bifrost/aws/variables.tf (3)
2-5: Constrainserviceto supported values.Without validation, invalid targets can slip through and fail deeper in the module. Consider an explicit allowlist.
Proposed diff
variable "service" { description = "AWS service to deploy on (ecs or eks)." type = string + validation { + condition = contains(["ecs", "eks"], var.service) + error_message = "service must be \"ecs\" or \"eks\"." + } }
8-12: Validate thatconfig_jsonis valid JSON.This catches misconfigurations early and produces a clearer error than downstream decode failures.
Proposed diff
variable "config_json" { description = "Complete Bifrost config.json as a string." type = string sensitive = true + validation { + condition = can(jsondecode(var.config_json)) + error_message = "config_json must be valid JSON." + } }
88-144: Make autoscaling and EKS-only inputs conditional.Right now these variables are required even when
enable_autoscaling = falseorservice = "ecs", which forces callers to pass unused values and can break ECS-only plans. Consider defaulting tonull/falseand adding conditional validation (apply the same pattern to the related vars:max_capacity,autoscaling_*,node_count,volume_size_gb, etc.).Example pattern
variable "enable_autoscaling" { description = "Enable autoscaling for the service." type = bool + default = false } variable "min_capacity" { description = "Minimum number of replicas when autoscaling is enabled." type = number + default = null + validation { + condition = var.enable_autoscaling ? var.min_capacity != null : true + error_message = "min_capacity is required when enable_autoscaling = true." + } } variable "kubernetes_namespace" { description = "Kubernetes namespace to deploy into." type = string + default = null + validation { + condition = var.service == "eks" ? var.kubernetes_namespace != null : true + error_message = "kubernetes_namespace is required when service = \"eks\"." + } }
...erraform/providers/registry.terraform.io/hashicorp/kubernetes/3.0.1/darwin_arm64/LICENSE.txt
Outdated
Show resolved
Hide resolved
3fcfc12 to
bbe8b13
Compare
There was a problem hiding this comment.
Actionable comments posted: 9
🤖 Fix all issues with AI agents
In `@terraform/examples/gcp-gke/terraform.tfvars.example`:
- Around line 36-43: Add a short caution above the providers_config block
warning users not to commit secrets in tfvars and to keep tfvars out of VCS
(e.g., add to .gitignore) and to source sensitive values (like openai.api_key,
vertex.project_id/region) from a secure secret manager or environment variables;
keep the note concise and doc-only so it doesn’t change any runtime behavior.
In `@terraform/modules/bifrost/aws/main.tf`:
- Around line 113-124: Replace the hardcoded open CIDR with a configurable list
and create one ingress rule per CIDR: add a variable like allowed_ingress_cidrs
(type = list(string), default = ["0.0.0.0/0"]) and update the
aws_vpc_security_group_ingress_rule.container_port resource to use count =
local.create_security_groups ? length(var.allowed_ingress_cidrs) : 0 and set
cidr_ipv4 = var.allowed_ingress_cidrs[count.index] (preserve other attributes
like from_port/to_port/ip_protocol/tags) so you can override allowed CIDR blocks
for production.
In `@terraform/modules/bifrost/aws/services/ecs/main.tf`:
- Around line 43-46: The command block using echo to write $BIFROST_CONFIG to
/app/data/config.json is vulnerable to shell variable expansion (e.g., $schema)
and can corrupt JSON; update the command in the entryPoint/command definition to
use a safe literal writer like printf '%s' to preserve the exact JSON payload
when BIFROST_CONFIG is set (and keep the existing error branch and exec
/app/docker-entrypoint.sh /app/main behavior); apply the same change for the
analogous command in the Azure ACI module to ensure both use printf '%s' with
the same redirection target (/app/data/config.json) and preserve the exit
semantics.
In `@terraform/modules/bifrost/aws/services/eks/main.tf`:
- Around line 507-509: The kubernetes_ingress_v1.bifrost resource is being
created whenever var.domain_name is non-null even if var.create_load_balancer is
false; update the resource's count expression to require both conditions
(domain_name != null AND create_load_balancer == true) so the Ingress is only
created when a domain is provided and load balancer creation is enabled; locate
kubernetes_ingress_v1.bifrost and change its count accordingly to reference
var.domain_name and var.create_load_balancer.
- Around line 29-33: The local "availability_zone" is hardcoded to
"${var.region}a" which causes EBS volumes to be created in a fixed AZ and can
fail when pods land on nodes in other AZs; change the AZ derivation to pick the
AZ of a chosen subnet (e.g., use the aws_subnet or data.aws_subnet resource
referenced by your node group/subnet selection instead of var.region) and
replace the local "availability_zone" with that subnet.az (or a lookup of the
subnet's availability_zone). Then update the PersistentVolume declaration (the
PV resource that currently uses availability_zone) to include
spec.nodeAffinity.required.nodeSelectorTerms that constrain the PV to nodes with
the matching topology.kubernetes.io/zone value (matchExpressions key
"topology.kubernetes.io/zone", operator "In", values [computed subnet AZ]).
Ensure any references to locals.availability_zone are updated to the new
computed value.
- Around line 516-522: The ALB ingress is configured to force HTTPS via the
annotations map (see "alb.ingress.kubernetes.io/listen-ports") but doesn't
supply a certificate ARN; add a new variable certificate_arn in variables.tf
(string, default empty) and update the annotations block in the resource that
contains annotations (the map using var.health_check_path) to conditionally
include "alb.ingress.kubernetes.io/certificate-arn" = var.certificate_arn when
var.certificate_arn != "" and otherwise either remove/replace the HTTPS
listen-ports with an HTTP port entry (e.g., "[{\"HTTP\":80}]" ) so ALB
provisioning won't fail; ensure the variable name certificate_arn and the
annotations map are the referenced symbols in your change.
In `@terraform/modules/bifrost/kubernetes/main.tf`:
- Around line 285-301: The ingress resource kubernetes_ingress_v1.bifrost can
get host = null when var.create_load_balancer is true and var.domain_name is
unset; add a guard so domain_name is required when creating the LB: either add
validation to the variable "domain_name" (ensure var.create_load_balancer ==
false || var.domain_name != null with an appropriate error_message) or change
the ingress creation logic (e.g., include var.create_load_balancer &&
var.domain_name != null in the resource count or wrap host assignment) so the
ingress rule never receives a null host.
In `@terraform/modules/bifrost/variables.tf`:
- Around line 42-46: The variable "auth_config" holds credentials
(admin_password) but is not marked sensitive; update the Terraform variable
block for auth_config to include sensitive = true so its values are treated as
secrets (keeps default/null and type any intact) and will be redacted from
plan/output and handled securely in state; ensure the change is applied to the
variable declaration named auth_config.
In `@terraform/README.md`:
- Around line 54-72: Update the fenced block showing the directory structure in
the README so the opening triple-backtick includes a language identifier (use
"text")—i.e., change the code fence that begins with the "terraform/" directory
listing to ```text to satisfy markdownlint MD040; no other content changes
needed.
🧹 Nitpick comments (14)
terraform/examples/kubernetes/terraform.tfvars.example (1)
40-47: Add a warning about real API keys in tfvars/state.Terraform stores tfvars values in state; a short warning reduces accidental secret leakage.
💡 Suggested inline warning
providers_config = { openai = { - api_key = "sk-..." + # NOTE: Avoid committing real keys here; values are stored in Terraform state. + api_key = "sk-REPLACE_ME" } anthropic = { - api_key = "sk-ant-..." + # NOTE: Avoid committing real keys here; values are stored in Terraform state. + api_key = "sk-ant-REPLACE_ME" } }terraform/modules/bifrost/outputs.tf (1)
1-25: Verify unified outputs against the rest of the stack.Since this PR is stacked, please confirm upstream/downstream PRs don’t introduce conflicting output names or a different coalescing order for these unified URLs.
As per coding guidelines:
**: always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)terraform/modules/bifrost/gcp/services/gke/main.tf (1)
206-209: Pin the busybox image tag for reproducibility and security.Using
busybox:latestintroduces supply chain risk and non-deterministic builds. Pin to a specific digest or version tag.📌 Suggested fix
init_container { name = "fix-permissions" - image = "busybox:latest" + image = "busybox:1.36" command = ["sh", "-c", "chown -R 1000:1000 /app/data && chmod -R 755 /app/data"]terraform/modules/bifrost/kubernetes/main.tf (1)
102-105: Pin the busybox image tag for reproducibility and security.Using
busybox:latestintroduces supply chain risk. Pin to a specific version.📌 Suggested fix
init_container { name = "fix-permissions" - image = "busybox:latest" + image = "busybox:1.36" command = ["sh", "-c", "chown -R 1000:1000 /app/data && chmod -R 755 /app/data"]terraform/modules/bifrost/azure/services/aks/main.tf (2)
181-184: Pin the busybox image tag for reproducibility and security.Using
busybox:latestintroduces supply chain risk.📌 Suggested fix
init_container { name = "fix-permissions" - image = "busybox:latest" + image = "busybox:1.36" command = ["sh", "-c", "chown -R 1000:1000 /app/data && chmod -R 755 /app/data"]
368-370: Usespec.ingressClassNameinstead of deprecated annotation.The
kubernetes.io/ingress.classannotation is deprecated. Use theingress_class_namefield in the spec instead.♻️ Suggested fix
metadata { name = "${local.service_name}-ingress" namespace = kubernetes_namespace.this.metadata[0].name - annotations = { - "kubernetes.io/ingress.class" = "nginx" - } } spec { + ingress_class_name = "nginx" + rule {terraform/modules/bifrost/azure/main.tf (1)
126-130: Set an expiration date on the Key Vault secret.Azure best practice recommends setting expiration dates on secrets for rotation and compliance purposes.
🔒 Suggested fix
resource "azurerm_key_vault_secret" "config" { name = "${var.name_prefix}-config" value = var.config_json key_vault_id = azurerm_key_vault.this.id + expiration_date = timeadd(timestamp(), "8760h") # 1 year from now + + lifecycle { + ignore_changes = [expiration_date] # Prevent drift on each apply + } }Alternatively, add a variable for the expiration period.
terraform/examples/kubernetes/variables.tf (1)
39-55: Consider adding more specific types for configuration variables.Using
type = anyforconfig_store,logs_store, andproviders_configreduces validation. If the expected structure is known, defining object types would catch configuration errors earlier.💡 Example type definition
variable "config_store" { description = "Config store configuration (type: sqlite/postgres)." type = object({ type = string path = optional(string) dsn = optional(string) }) default = null }terraform/modules/bifrost/gcp/variables.tf (1)
2-5: Add validation forservicevariable.The
servicevariable accepts "gke or cloud-run" per the description, but lacks a validation block to enforce this constraint. This could lead to runtime errors if an invalid value is passed.Proposed validation
variable "service" { description = "GCP service to deploy on (gke or cloud-run)." type = string + validation { + condition = contains(["gke", "cloud-run"], var.service) + error_message = "service must be one of: gke, cloud-run" + } }terraform/examples/azure-aks/variables.tf (1)
7-11: Consider documenting production best practice for image tags.Using
"latest"as the default is acceptable for examples, but production deployments should pin to specific versions for reproducibility. Consider adding a comment in the description noting this.Optional enhancement
variable "image_tag" { - description = "Bifrost Docker image tag." + description = "Bifrost Docker image tag. Use a specific version (e.g. v1.2.3) in production." type = string default = "latest" }terraform/modules/bifrost/main.tf (2)
32-33: Document shallow merge behavior for configuration overrides.The
merge()function performs a shallow merge where overrides completely replace top-level keys frombase_config. If a user providesproviders_configvia variable while also having aproviderssection in their config file, the entireprovidersobject is replaced rather than deep-merged.This may be intentional, but users expecting deep merging (e.g., adding one provider while keeping others from the file) could be surprised. Consider documenting this behavior or using a deep merge approach if nested merging is desired.
35-37: Hardcodedcontainer_portandhealth_check_pathmay limit flexibility.The container port (8080) and health check path ("/health") are hardcoded in locals. While these are reasonable defaults for Bifrost, consider exposing them as variables with these defaults if users need to customize them for reverse proxy configurations or different health endpoints.
terraform/modules/bifrost/azure/variables.tf (1)
2-5: Add validation forservicevariable.Similar to the GCP module, the
servicevariable accepts "aks or aci" per the description but lacks a validation block. Adding validation would catch invalid inputs early duringterraform plan.Proposed validation
variable "service" { description = "Azure service to deploy on (aks or aci)." type = string + validation { + condition = contains(["aks", "aci"], var.service) + error_message = "service must be one of: aks, aci" + } }terraform/modules/bifrost/variables.tf (1)
1-18: Consider cross-validatingcloud_providerandservicecombinations.The
cloud_providerandservicevariables are validated independently, but certain combinations are invalid (e.g.,service = "ecs"withcloud_provider = "gcp"). Users won't receive a clear error until the conditional module block fails or behaves unexpectedly.Optional: Add cross-validation
variable "service" { description = "Cloud service to deploy on. AWS: ecs, eks. GCP: gke, cloud-run. Azure: aks, aci." type = string validation { condition = ( (var.cloud_provider == "aws" && contains(["ecs", "eks"], var.service)) || (var.cloud_provider == "gcp" && contains(["gke", "cloud-run"], var.service)) || (var.cloud_provider == "azure" && contains(["aks", "aci"], var.service)) || (var.cloud_provider == "kubernetes" && var.service == "deployment") ) error_message = "Invalid cloud_provider/service combination. AWS: ecs, eks. GCP: gke, cloud-run. Azure: aks, aci. Kubernetes: deployment." } }Note: This requires Terraform 1.9+ for variable cross-references in validation blocks, or can be implemented via a
localscheck withprecondition.
There was a problem hiding this comment.
Actionable comments posted: 8
🤖 Fix all issues with AI agents
In @.gitignore:
- Line 110: Remove the `.terraform.lock.hcl` entry from .gitignore so the
lockfile is committed (ensure `terraform init` / CI will recreate it and then
commit the file), and update .gitignore to explicitly ignore
`terraform.tfstate.backup` and `*.tfvars` while allowing example var files by
adding an exception for `!*.tfvars.example`; target the top-level .gitignore
entry that currently lists `.terraform.lock.hcl` and replace it accordingly so
the repo tracks the lockfile but still ignores state backups and real var files.
In `@terraform/examples/kubernetes/terraform.tfvars.example`:
- Around line 24-37: Update the SQLite file paths in the example tfvars so they
match the Kubernetes module's PVC mount at /app/data: change the
config_store.config.path and logs_store.config.path values from /data/bifrost.db
and /data/bifrost-logs.db to use /app/data (e.g., /app/data/bifrost.db and
/app/data/bifrost-logs.db) so SQLite writes land inside the mounted volume.
In `@terraform/modules/bifrost/azure/services/aks/main.tf`:
- Around line 362-376: The kubernetes_ingress_v1.bifrost resource currently uses
count based only on var.create_load_balancer while its spec references
var.domain_name, risking failure when domain_name is null; change the resource's
count expression to use the existing local.ingress_enabled (which is defined as
var.create_load_balancer && var.domain_name != null) so the ingress is only
created when both conditions hold. Locate the kubernetes_ingress_v1.bifrost
block and replace the count assignment to reference local.ingress_enabled
instead of var.create_load_balancer, leaving the rest of the resource unchanged.
In `@terraform/modules/bifrost/gcp/main.tf`:
- Around line 16-24: Add validation in the variable declaration for
existing_vpc_id/existing_subnet_ids to enforce they are provided together:
require either both are null or existing_vpc_id != null and existing_subnet_ids
is non-empty; this prevents local.subnet_id from resolving to null or a
mismatched subnet when computing subnet_id from existing_subnet_ids. Update the
variables.tf entry for variable "existing_vpc_id" (and/or "existing_subnet_ids")
to include a validation block that checks (var.existing_vpc_id == null &&
var.existing_subnet_ids == null) || (var.existing_vpc_id != null &&
var.existing_subnet_ids != null && length(var.existing_subnet_ids) > 0) and
provide a clear error_message like "Provide both existing_vpc_id and
existing_subnet_ids (non-empty), or leave both null." to stop invalid
single-value combinations.
In `@terraform/modules/bifrost/gcp/services/gke/main.tf`:
- Around line 105-165: The PV is created from a zonal GCE disk
(google_compute_disk.bifrost pinned to "${var.region}-a") but the cluster is
regional, so add node affinity on kubernetes_persistent_volume.bifrost to
restrict scheduling to the same zone (e.g., topology.kubernetes.io/zone =
"${var.region}-a") so pods bound by kubernetes_persistent_volume_claim.bifrost
will only land on nodes in zone-a; alternatively, convert the disk to a regional
disk (use a regional disk resource) and/or enable dynamic provisioning, but if
you keep google_compute_disk.bifrost as zonal then add the PV
spec.node_affinity.required.node_selector_terms matching the zone to fix
attachment failures.
In `@terraform/modules/bifrost/gcp/variables.tf`:
- Around line 69-80: The variables existing_vpc_id and existing_subnet_ids must
be provided together or both left null to avoid mismatched wiring; add a
validation block on existing_subnet_ids that enforces (var.existing_vpc_id ==
null && var.existing_subnet_ids == null) || (var.existing_vpc_id != null &&
var.existing_subnet_ids != null && length(var.existing_subnet_ids) > 0) and
supply a clear error_message like "Provide both existing_vpc_id and
existing_subnet_ids (non-empty), or leave both null."; update the variable
existing_subnet_ids declaration accordingly so the module rejects mismatched
inputs.
In `@terraform/modules/bifrost/variables.tf`:
- Around line 60-64: The variable declaration variable "providers_config"
currently exposes LLM provider secrets; update the Terraform variable block for
providers_config to mark it sensitive by adding sensitive = true so API keys
won't be shown in plan/output or stored in cleartext, keeping type = any and
default = null unchanged; locate the variable "providers_config" block and add
the sensitive = true attribute.
In `@terraform/README.md`:
- Around line 49-50: The README claim "All 18 top-level config properties" is
inconsistent with the listed 16 properties; either change the count to "16" or
add the two missing top-level properties to the list (currently: encryption_key,
auth_config, client, framework, providers_config, governance, mcp, vector_store,
config_store, logs_store, cluster_config, saml_config, load_balancer_config,
guardrails_config, plugins, audit_logs) so the number matches; update the
sentence or append the two missing property names accordingly.
🧹 Nitpick comments (5)
terraform/modules/bifrost/azure/services/aci/outputs.tf (1)
16-24: Consider documenting the HTTP protocol choice.Both
service_urlandhealth_check_urluse thehttp://scheme. This is often acceptable when TLS termination is handled by an upstream load balancer or when these are internal endpoints. Consider either:
- Adding a note in the descriptions that these are HTTP endpoints, or
- Optionally exposing an HTTPS variant if the container handles TLS directly.
If HTTP is intentional (e.g., for internal health checks behind a reverse proxy), the current implementation is fine.
📝 Optional: Add clarifying note to descriptions
output "service_url" { - description = "URL to access the Bifrost service." + description = "HTTP URL to access the Bifrost service (TLS termination expected at load balancer)." value = "http://${azurerm_container_group.bifrost.fqdn}:${var.container_port}" } output "health_check_url" { - description = "URL to the Bifrost health check endpoint." + description = "HTTP URL to the Bifrost health check endpoint." value = "http://${azurerm_container_group.bifrost.fqdn}:${var.container_port}${var.health_check_path}" }terraform/modules/bifrost/kubernetes/main.tf (1)
104-104: Pin the busybox image to a specific version for reproducibility.Using
busybox:latestin the init container can lead to inconsistent behavior across deployments if the upstream image changes. Consider pinning to a specific version.🔧 Suggested fix
- image = "busybox:latest" + image = "busybox:1.36"terraform/examples/aws-ecs/terraform.tfvars.example (1)
35-42: Consider adding a note about secret management for production.While placeholder values are fine for an example file, it would be helpful to add a comment noting that for production deployments, API keys should be managed via AWS Secrets Manager rather than stored in tfvars files.
📝 Suggested documentation addition
+# NOTE: For production, manage API keys via AWS Secrets Manager instead of +# storing them in tfvars. The module supports secret references. providers_config = { openai = { api_key = "sk-..." }terraform/examples/gcp-gke/variables.tf (1)
12-16: Prefer a pinned image tag in examples.Using
"latest"makes deployments non-reproducible. Consider forcing explicit input or using a pinned tag.♻️ Suggested change
variable "image_tag" { description = "Bifrost Docker image tag." type = string - default = "latest" + default = null }terraform/modules/bifrost/variables.tf (1)
11-18: Consider adding validation to ensureservicematchescloud_provider.The validation allows any service regardless of the selected cloud provider. A user could specify
cloud_provider = "azure"withservice = "ecs", which would fail later in the module. A cross-variable validation would catch this at plan time.Example validation approach (Terraform 1.9+)
variable "service" { description = "Cloud service to deploy on. AWS: ecs, eks. GCP: gke, cloud-run. Azure: aks, aci." type = string validation { condition = contains(["ecs", "eks", "gke", "cloud-run", "aks", "aci", "deployment"], var.service) error_message = "service must be one of: ecs, eks, gke, cloud-run, aks, aci, deployment" } } # In main.tf or a separate validation block: locals { valid_service_combinations = { aws = ["ecs", "eks"] gcp = ["gke", "cloud-run"] azure = ["aks", "aci"] kubernetes = ["deployment"] } } # Use a check block or precondition in resource
bb8db77 to
cdf161d
Compare
There was a problem hiding this comment.
Actionable comments posted: 11
🤖 Fix all issues with AI agents
In `@terraform/examples/kubernetes/variables.tf`:
- Around line 3-7: The default for variable "kubeconfig_path" uses
"~/.kube/config" which Terraform won't expand; update usage to call
pathexpand(var.kubeconfig_path) where the value is consumed (e.g., in the
kubernetes provider's config_path) or change the variable default to an absolute
path and document the requirement; specifically, replace direct references to
var.kubeconfig_path with pathexpand(var.kubeconfig_path) in the provider
configuration (or add guidance in the variable description to require an
absolute path).
In `@terraform/modules/bifrost/aws/services/ecs/main.tf`:
- Around line 97-101: The network_configuration block currently hardcodes
assign_public_ip = true, which prevents deployments into private subnets; add a
new Terraform variable named assign_public_ip (default = true) in the module's
variables (variables.tf) and replace the literal true with var.assign_public_ip
inside the network_configuration block so callers can set assign_public_ip =
false for private‑only deployments; ensure the variable has a short description
and default true to preserve current behavior.
In `@terraform/modules/bifrost/aws/variables.tf`:
- Around line 76-80: The variable allowed_cidr currently defaults to "0.0.0.0/0"
which makes ingress public by default; remove the default value so the variable
becomes required (no default) and update the variable description to note that
callers must explicitly provide a CIDR for ingress; locate the declaration of
variable "allowed_cidr" in variables.tf and delete the default = "0.0.0.0/0"
line (and adjust any module call sites or documentation to pass an explicit
CIDR).
In `@terraform/modules/bifrost/azure/services/aks/main.tf`:
- Around line 111-116: The azure_disk block for the AKS PV is setting
caching_mode = "ReadOnly", which will prevent SQLite from performing writes;
update the azure_disk block (the azure_disk configuration that references
azurerm_managed_disk.bifrost_disk and uses data_disk_uri =
azurerm_managed_disk.bifrost_disk.id) to use caching_mode = "ReadWrite" (or
"None" if you have a specific requirement) so the disk allows read-write
operations, then re-plan/apply Terraform and verify the disk mount/permissions
for the SQLite data path.
In `@terraform/modules/bifrost/azure/variables.tf`:
- Around line 63-68: The variable "allowed_cidr" currently defaults to "*" which
opens ingress; remove the default (i.e., make the variable required) so users
must explicitly provide a CIDR, and update the variable "allowed_cidr"
description to state that a specific CIDR range is required for ingress rather
than a wildcard; ensure no other code assumes a default value for allowed_cidr.
In `@terraform/modules/bifrost/gcp/services/cloud-run/variables.tf`:
- Around line 92-96: The variable "create_load_balancer" has a misleading
description about unauthenticated access; update the variable's description to
accurately reflect its purpose (e.g., "Create an external HTTP(S) load balancer
to expose the service publicly" or similar) or rename the variable to something
like "allow_unauthenticated" if the intent is to control public access; locate
the declaration of variable "create_load_balancer" and change either the
description text to match load balancer creation or rename the variable and
update all references accordingly (e.g., in modules that consume
create_load_balancer).
In `@terraform/modules/bifrost/gcp/services/gke/main.tf`:
- Around line 400-401: The ingress resource kubernetes_ingress_v1.bifrost is
gated only by var.domain_name != null, which is inconsistent with other modules;
update the resource's count condition to require both var.create_load_balancer
and var.domain_name (e.g., count = var.create_load_balancer && var.domain_name
!= null ? 1 : 0) so the ingress is created only when load balancer creation is
enabled and a domain is provided; modify any related references or tests
expecting the previous behavior to ensure consistency with AWS EKS and
Kubernetes modules.
In `@terraform/modules/bifrost/gcp/services/gke/variables.tf`:
- Around line 145-149: The variable "master_authorized_cidr" currently defaults
to "0.0.0.0/0" which exposes the GKE control plane; remove the default so the
variable is required (no default) and update its description to indicate this
must be explicitly set to a scoped CIDR for production. Additionally, add a
validation rule on master_authorized_cidr (using the variable block's
validation) to reject "0.0.0.0/0" (and empty values) to prevent accidental
public exposure, and ensure any downstream references to master_authorized_cidr
remain unchanged.
In `@terraform/modules/bifrost/kubernetes/variables.tf`:
- Around line 66-70: Change the storage_class_name variable to default to null
(variable "storage_class_name") so it becomes opt‑in, and update any places that
render the PVC spec (where you set storageClassName) to only include
storageClassName when var.storage_class_name is not null (e.g., guard the
template or resource attribute that currently assigns var.storage_class_name).
Specifically: set default = null for variable "storage_class_name" and modify
the PVC/template/resource code that uses storageClassName to conditionally add
the field only when var.storage_class_name != null.
In `@terraform/modules/bifrost/variables.tf`:
- Around line 203-207: The variable "allowed_cidr" currently defaults to
"0.0.0.0/0" which makes ingress public by default; remove the default so the
variable is required (i.e., no default = "0.0.0.0/0") and optionally add a
validation block on allowed_cidr to reject the literal "0.0.0.0/0" if you want
to enforce a non-public CIDR; update any call sites or module consumers to pass
an explicit CIDR value.
In `@terraform/README.md`:
- Around line 7-15: The README's Terraform module reference pins source in
module "bifrost" to ?ref=v1.4.6 which lacks the new Terraform modules; update
the source ref in the module "bifrost" declaration (the source string) to the
correct release tag that contains this stack's Terraform modules or temporarily
use ?ref=main so the modules are available (ensure image_tag and ref match the
chosen release tag).
🧹 Nitpick comments (7)
terraform/modules/bifrost/gcp/services/gke/outputs.tf (1)
21-41: Consider external URL fallback whendomain_nameis unset.
Todayservice_urldefaults to the cluster-local DNS, which won’t be reachable externally even when an ingress IP exists. Consider usingingress_ipas the next fallback to provide a usable URL.♻️ Possible update
+locals { + ingress_ip = try(google_compute_global_address.bifrost[0].address, null) +} + output "service_url" { description = "URL to access the Bifrost service." value = ( var.domain_name != null ? "http://${var.domain_name}" - : "http://${kubernetes_service.bifrost.metadata[0].name}.${kubernetes_namespace.bifrost.metadata[0].name}.svc.cluster.local" + : local.ingress_ip != null + ? "http://${local.ingress_ip}" + : "http://${kubernetes_service.bifrost.metadata[0].name}.${kubernetes_namespace.bifrost.metadata[0].name}.svc.cluster.local" ) } output "health_check_url" { description = "URL to the Bifrost health check endpoint." value = ( var.domain_name != null ? "http://${var.domain_name}${var.health_check_path}" - : "http://${kubernetes_service.bifrost.metadata[0].name}.${kubernetes_namespace.bifrost.metadata[0].name}.svc.cluster.local${var.health_check_path}" + : local.ingress_ip != null + ? "http://${local.ingress_ip}${var.health_check_path}" + : "http://${kubernetes_service.bifrost.metadata[0].name}.${kubernetes_namespace.bifrost.metadata[0].name}.svc.cluster.local${var.health_check_path}" ) }terraform/modules/bifrost/azure/services/aks/main.tf (1)
183-183: Pin the init container image tag for reproducibility.Using
busybox:latestcan lead to inconsistent deployments if the upstream image changes. Consider pinning to a specific version.📌 Suggested fix
init_container { name = "fix-permissions" - image = "busybox:latest" + image = "busybox:1.36" command = ["sh", "-c", "chown -R 1000:1000 /app/data && chmod -R 755 /app/data"]terraform/modules/bifrost/kubernetes/main.tf (1)
102-105: Pin the init container image tag for reproducibility.Using
busybox:latestcan lead to inconsistent deployments. Consider pinning to a specific version for reproducible infrastructure.📌 Suggested fix
init_container { name = "fix-permissions" - image = "busybox:latest" + image = "busybox:1.36" command = ["sh", "-c", "chown -R 1000:1000 /app/data && chmod -R 755 /app/data"]terraform/modules/bifrost/azure/main.tf (1)
140-144: Consider adding secret expiration for production deployments.The Key Vault secret lacks an
expiration_date, which is flagged by security scanners. For a starter module this is acceptable, but document that production deployments should set secret rotation/expiration policies.📝 Example with expiration
resource "azurerm_key_vault_secret" "config" { name = "${var.name_prefix}-config" value = var.config_json key_vault_id = azurerm_key_vault.this.id + expiration_date = timeadd(timestamp(), "8760h") # 1 year + + lifecycle { + ignore_changes = [expiration_date] + } }terraform/modules/bifrost/aws/services/eks/main.tf (1)
320-323: Pin the init container image tag for reproducibility.Using
busybox:latestcan cause deployment inconsistencies.📌 Suggested fix
init_container { name = "fix-permissions" - image = "busybox:latest" + image = "busybox:1.36"terraform/modules/bifrost/gcp/services/gke/main.tf (1)
217-220: Pin the init container image tag for reproducibility.Using
busybox:latestcan cause deployment inconsistencies across environments.📌 Suggested fix
init_container { name = "fix-permissions" - image = "busybox:latest" + image = "busybox:1.36"terraform/modules/bifrost/aws/main.tf (1)
62-69: Expose public‑subnet and egress defaults so production can lock down networking.For a baseline module, these defaults are fine, but production users should be guided or enabled to tighten them (e.g., private subnets or restricted egress CIDRs). Consider variables or explicit README notes for these defaults.
Based on learnings: The Bifrost Terraform modules are baseline starter modules, not production-hardened. In reviews of terraform/modules/bifrost/**/*.tf, verify that production-specific security configurations (e.g., AKS API server authorized IP ranges, Azure Policy, specific RBAC settings, AWS security group restrictions) are clearly left for users to add based on their deployment requirements, and that there are no baked-in defaults that could compromise security.
Also applies to: 126-133
e4388e2 to
4c7d024
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@terraform/modules/bifrost/azure/main.tf`:
- Around line 20-27: Add validation to ensure existing_vpc_id and
existing_subnet_ids are provided together: add a variable validation (or
precondition) that enforces either both existing_vpc_id and existing_subnet_ids
are null or existing_vpc_id is non-null and existing_subnet_ids is non-null and
non-empty; this prevents referencing azurerm_virtual_network.this[0] or
azurerm_subnet.this[0] when their resources were created with count = 0. Target
the variable block for existing_vpc_id (and/or existing_subnet_ids) and
implement the condition and clear error_message to instruct users to "Provide
both existing_vpc_id and existing_subnet_ids (non-empty), or leave both null."
In `@terraform/modules/bifrost/azure/services/aci/variables.tf`:
- Around line 62-66: The subnet_ids variable is currently required but unused by
the ACI module; make it optional by adding a default value and update the
description: modify the variable "subnet_ids" to include default = null and
change the description to clarify it's optional and only used when private
networking is required (e.g., "Optional list of subnet IDs for private
networking; set to null when using public IPs"). Ensure you only change the
variable block for subnet_ids so other modules referencing it are unaffected.
In `@terraform/modules/bifrost/variables.tf`:
- Around line 1-18: Update the variable "service" description to include the
missing "deployment" option so it matches the validation list; edit the
description string for variable "service" (the variable named service and its
validation block) to append or include "deployment" (e.g., mention "deployment"
as the generic Kubernetes option) so the human-readable description aligns with
the allowed values in the validation condition.
🧹 Nitpick comments (8)
terraform/modules/bifrost/azure/services/aks/main.tf (1)
180-194: Consider pinning the busybox image version.Using
busybox:latestin the init container works but may lead to inconsistent behavior across deployments if the upstream image changes.📌 Suggested pinned version
init_container { name = "fix-permissions" - image = "busybox:latest" + image = "busybox:1.36" command = ["sh", "-c", "chown -R 1000:1000 /app/data && chmod -R 755 /app/data"]terraform/modules/bifrost/aws/services/eks/main.tf (1)
319-333: Consider pinning the busybox image version.Same suggestion as the AKS module—using a specific tag improves reproducibility.
📌 Suggested pinned version
init_container { name = "fix-permissions" - image = "busybox:latest" + image = "busybox:1.36" command = ["sh", "-c", "chown -R 1000:1000 /app/data && chmod -R 755 /app/data"]terraform/modules/bifrost/gcp/services/cloud-run/variables.tf (1)
92-96: Variable description could be more accurate.The description says "Create a load balancer" but for Cloud Run, this variable actually controls whether to grant public (unauthenticated) access via IAM—Cloud Run services don't require a separate load balancer resource.
📝 Suggested description improvement
variable "create_load_balancer" { - description = "Create a load balancer for the Cloud Run service." + description = "Expose the Cloud Run service publicly (allow unauthenticated access)." type = bool }terraform/modules/bifrost/aws/variables.tf (1)
76-80: Consider enhancing the description with a production warning.The
0.0.0.0/0default is acceptable for starter module convenience, but the description could better communicate security implications.Based on learnings: "allowed_cidr variable defaults to 0.0.0.0/0 for dev/starter convenience. This is acceptable for baseline starter modules, but production usage must override it. Ensure this behavior is clearly documented."
📝 Suggested description improvement
variable "allowed_cidr" { - description = "CIDR block allowed for ingress traffic." + description = "CIDR block allowed for ingress traffic. Defaults to 0.0.0.0/0 for quick-start convenience. ⚠️ For production, restrict to trusted networks." type = string default = "0.0.0.0/0" }terraform/examples/azure-aks/main.tf (1)
8-17: Missing Kubernetes provider configuration for AKS.The
kubernetesprovider is declared but not configured. For AKS deployments, you typically need to configure it with cluster credentials after the AKS cluster is created. This is a common chicken-and-egg problem in Terraform.Consider adding provider configuration or documenting that users should run
terraform applyin two phases, or use a data source to fetch AKS credentials:💡 Example kubernetes provider configuration for AKS
data "azurerm_kubernetes_cluster" "bifrost" { count = var.create_cluster ? 0 : 1 name = var.existing_cluster_name resource_group_name = var.resource_group_name } provider "kubernetes" { host = var.create_cluster ? module.bifrost.cluster_host : data.azurerm_kubernetes_cluster.bifrost[0].kube_config[0].host client_certificate = base64decode(var.create_cluster ? module.bifrost.cluster_client_certificate : data.azurerm_kubernetes_cluster.bifrost[0].kube_config[0].client_certificate) client_key = base64decode(var.create_cluster ? module.bifrost.cluster_client_key : data.azurerm_kubernetes_cluster.bifrost[0].kube_config[0].client_key) cluster_ca_certificate = base64decode(var.create_cluster ? module.bifrost.cluster_ca_certificate : data.azurerm_kubernetes_cluster.bifrost[0].kube_config[0].cluster_ca_certificate) }terraform/modules/bifrost/kubernetes/main.tf (2)
102-115: Consider pinning the busybox image tag for reproducibility.Using
busybox:latestcan lead to inconsistent behavior across deployments if the upstream image changes. For production stability, consider pinning to a specific version.💡 Suggested fix
init_container { name = "fix-permissions" - image = "busybox:latest" + image = "busybox:1.36" command = ["sh", "-c", "chown -R 1000:1000 /app/data && chmod -R 755 /app/data"]
134-143: Resource limits use a fixed 2x multiplier which may be excessive for some workloads.The limits are set to
cpu * 2andmemory * 2. For a starter module this is reasonable, but be aware that if users set high base values (e.g.,cpu = 4000,memory = 8192), the limits become quite large (8 vCPU, 16Gi). This is fine for most use cases, but worth documenting.terraform/modules/bifrost/azure/main.tf (1)
142-146: Add optionalsecret_expiration_datevariable for Key Vault secret rotation.This aligns with the module's pattern for optional security features. The
expiration_dateattribute forazurerm_key_vault_secretaccepts RFC 3339 / ISO-8601 UTC format (YYYY-MM-DDTHH:MM:SSZ). Consider adding a nullablesecret_expiration_datevariable and conditionally wiring it to enable secret rotation policies for production deployments.Proposed change
resource "azurerm_key_vault_secret" "config" { name = "${var.name_prefix}-config" value = var.config_json key_vault_id = azurerm_key_vault.this.id + expiration_date = var.secret_expiration_date }
7340a22 to
9b0360f
Compare
Merge activity
|

Summary
Add Terraform modules for deploying Bifrost on AWS, GCP, and Azure with a unified interface. This enables infrastructure-as-code deployments across multiple cloud providers and services.
Changes
Type of change
Affected areas
How to test
Breaking changes
Security considerations
The modules handle sensitive configuration through cloud provider secret management services:
Configuration can be provided via Terraform variables or files, with proper sensitive marking to prevent exposure in logs.
Checklist