Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 8 additions & 3 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,15 +8,20 @@ to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

## [0.7.0] - 2025-12-12

## [0.6.0] - 2025-12-10
[Compare with previous version](https://github.com/sparkfabrik/terraform-google-services-monitoring/compare/0.6.0...0.7.0)
Comment thread
FabrizioCafolla marked this conversation as resolved.

[Compare with previous version](https://github.com/sparkfabrik/terraform-google-services-monitoring/compare/0.5.0...0.6.0)
### Added

### Changed
- refs platform/board#4071: add SSL certificate expiration alert configuration

## [0.6.0] - 2025-12-11

[Compare with previous version](https://github.com/sparkfabrik/terraform-google-services-monitoring/compare/0.5.0...0.6.0)

### Added

- refs platform/board#4052: add Typesense monitoring alerts and configuration for uptime checks and container checks

## [0.5.0] - 2025-12-01
Expand Down
3 changes: 3 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,7 @@ Supported services:
| <a name="input_kyverno"></a> [kyverno](#input\_kyverno) | Configuration for Kyverno monitoring alerts. Allows customization of cluster name, project, notification channels, alert documentation, metric thresholds, auto-close timing, enablement, extra filters, and namespace. | <pre>object({<br/> enabled = optional(bool, true)<br/> cluster_name = string<br/> project_id = optional(string, null)<br/> notification_enabled = optional(bool, true)<br/> notification_channels = optional(list(string), [])<br/> # Rate limit for notifications, e.g. "300s" for 5 minutes, used only for log match alerts<br/> logmatch_notification_rate_limit = optional(string, "300s")<br/> alert_documentation = optional(string, null)<br/> auto_close_seconds = optional(number, 3600)<br/> filter_extra = optional(string, "")<br/> namespace = optional(string, "kyverno")<br/> })</pre> | n/a | yes |
| <a name="input_notification_channels"></a> [notification\_channels](#input\_notification\_channels) | List of notification channel IDs to notify when an alert is triggered | `list(string)` | `[]` | no |
| <a name="input_project_id"></a> [project\_id](#input\_project\_id) | The Google Cloud project ID where logging exclusions will be created | `string` | n/a | yes |
| <a name="input_ssl_alert"></a> [ssl\_alert](#input\_ssl\_alert) | Configuration for SSL certificate expiration alerts. Allows customization of project, notification channels, alert thresholds, and user labels. | <pre>object({<br/> enabled = optional(bool, false)<br/> project_id = optional(string, null)<br/> notification_enabled = optional(bool, true)<br/> notification_channels = optional(list(string), [])<br/> threshold_days = optional(list(number), [15, 7])<br/> user_labels = optional(map(string), {})<br/> })</pre> | `{}` | no |
| <a name="input_typesense"></a> [typesense](#input\_typesense) | Configuration for Typesense monitoring alerts. Supports uptime checks for HTTP endpoints and container-level alerts (pod restarts) in GKE. Each app is identified by its name (map key). For container checks, the app name corresponds to the Kubernetes 'app' label; for apps with only uptime checks, this correspondence does not apply. | <pre>object({<br/> enabled = optional(bool, false)<br/> project_id = optional(string, null)<br/> notification_enabled = optional(bool, true)<br/> notification_channels = optional(list(string), [])<br/> cluster_name = optional(string, null) # GKE cluster name for container checks<br/><br/> # Apps configuration - map keyed by app_name<br/> apps = optional(map(object({<br/> # Uptime check configuration (optional)<br/> uptime_check = optional(object({<br/> enabled = optional(bool, true)<br/> host = string<br/> path = optional(string, "/readyz")<br/> }), null)<br/><br/> # Container check configuration for GKE (optional)<br/> container_check = optional(object({<br/> enabled = optional(bool, true)<br/> namespace = string<br/> pod_restart = optional(object({<br/> threshold = optional(number, 0)<br/> alignment_period = optional(number, 60)<br/> duration = optional(number, 0)<br/> auto_close_seconds = optional(number, 3600)<br/> }), {})<br/> }), null)<br/> })), {})<br/> })</pre> | `{}` | no |

## Outputs
Expand All @@ -49,6 +50,7 @@ Supported services:
| <a name="output_cloud_sql_cpu_utilization"></a> [cloud\_sql\_cpu\_utilization](#output\_cloud\_sql\_cpu\_utilization) | n/a |
| <a name="output_cloud_sql_disk_utilization"></a> [cloud\_sql\_disk\_utilization](#output\_cloud\_sql\_disk\_utilization) | n/a |
| <a name="output_cloud_sql_memory_utilization"></a> [cloud\_sql\_memory\_utilization](#output\_cloud\_sql\_memory\_utilization) | n/a |
| <a name="output_ssl_alert_policy_names"></a> [ssl\_alert\_policy\_names](#output\_ssl\_alert\_policy\_names) | n/a |

## Resources

Expand All @@ -59,6 +61,7 @@ Supported services:
| [google_monitoring_alert_policy.cloud_sql_disk_utilization](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/monitoring_alert_policy) | resource |
| [google_monitoring_alert_policy.cloud_sql_memory_utilization](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/monitoring_alert_policy) | resource |
| [google_monitoring_alert_policy.kyverno_logmatch_alert](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/monitoring_alert_policy) | resource |
| [google_monitoring_alert_policy.ssl_expiring_days](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/monitoring_alert_policy) | resource |
| [google_monitoring_alert_policy.typesense_pod_restart](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/monitoring_alert_policy) | resource |

## Modules
Expand Down
4 changes: 4 additions & 0 deletions outputs.tf
Original file line number Diff line number Diff line change
Expand Up @@ -9,3 +9,7 @@ output "cloud_sql_memory_utilization" {
output "cloud_sql_cpu_utilization" {
value = { for k, v in google_monitoring_alert_policy.cloud_sql_cpu_utilization : k => v.name }
}

output "ssl_alert_policy_names" {
value = { for days, alert in google_monitoring_alert_policy.ssl_expiring_days : days => alert.name }
}
37 changes: 37 additions & 0 deletions ssl_alert.tf
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
locals {
ssl_alert_project_id = var.ssl_alert.project_id != null ? var.ssl_alert.project_id : var.project_id

ssl_alert_notification_channels = var.ssl_alert.notification_enabled ? (length(var.ssl_alert.notification_channels) > 0 ? var.ssl_alert.notification_channels : var.notification_channels) : []
}

resource "google_monitoring_alert_policy" "ssl_expiring_days" {
for_each = var.ssl_alert.enabled ? toset([for days in var.ssl_alert.threshold_days : tostring(days)]) : []

display_name = "SSL certificate expiring soon (${each.value} days)"
combiner = "OR"
conditions {
condition_threshold {
filter = "metric.type=\"monitoring.googleapis.com/uptime_check/time_until_ssl_cert_expires\" AND resource.type=\"uptime_url\""
comparison = "COMPARISON_LT"
threshold_value = each.value
duration = "600s"
trigger {
count = 1
}
aggregations {
alignment_period = "1200s"
per_series_aligner = "ALIGN_NEXT_OLDER"
cross_series_reducer = "REDUCE_MEAN"
group_by_fields = [
"resource.label.*"
]
}
}
display_name = "SSL certificate expiring soon (${each.value} days)"
}

user_labels = var.ssl_alert.user_labels

notification_channels = local.ssl_alert_notification_channels
project = local.ssl_alert_project_id
}
14 changes: 13 additions & 1 deletion variables.tf
Original file line number Diff line number Diff line change
Expand Up @@ -102,7 +102,6 @@ variable "cert_manager" {
variable "typesense" {
description = "Configuration for Typesense monitoring alerts. Supports uptime checks for HTTP endpoints and container-level alerts (pod restarts) in GKE. Each app is identified by its name (map key). For container checks, the app name corresponds to the Kubernetes 'app' label; for apps with only uptime checks, this correspondence does not apply."
default = {}

type = object({
enabled = optional(bool, false)
project_id = optional(string, null)
Expand Down Expand Up @@ -152,3 +151,16 @@ variable "typesense" {
error_message = "When any app has container_check configured, 'cluster_name' must be provided at the typesense level."
}
}

variable "ssl_alert" {
description = "Configuration for SSL certificate expiration alerts. Allows customization of project, notification channels, alert thresholds, and user labels."
default = {}
type = object({
enabled = optional(bool, false)
project_id = optional(string, null)
notification_enabled = optional(bool, true)
notification_channels = optional(list(string), [])
threshold_days = optional(list(number), [15, 7])
user_labels = optional(map(string), {})
})
}