Terraform GCP Services Monitoring Module

This module creates a set of monitoring alerts for Google Cloud Platform services.

Supported services:

  • Cloud SQL

    • CPU usage
    • Storage usage
    • Memory usage
  • Kyverno

    • Error logs for admission-controller, background-controller, cleanup-controller, reports-controller
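  • cert-manager

    • Error logs for cert-manager controller when an Issuer or ClusterIssuer is missing
  • SSL certificates

    • Expiration alerts at configurable day thresholds (disabled by default)
  • Typesense

    • Uptime checks for HTTP endpoints (disabled by default)
    • Pod restart alerts for containers running in GKE (disabled by default)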
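
As a minimal sketch of how the module might be called, the block below sets only the inputs marked as required in the Inputs table. The `source` path, project ID, cluster name, and notification channel ID are placeholders chosen for illustration, not values taken from this repository.

```hcl
module "gcp_services_monitoring" {
  # Hypothetical local path; point this at the module's real location.
  source = "./modules/gcp-services-monitoring"

  project_id = "my-gcp-project" # placeholder project ID

  notification_channels = [
    # Placeholder notification channel ID.
    "projects/my-gcp-project/notificationChannels/1234567890",
  ]

  # Required input. An empty object keeps every default,
  # and `instances` then defaults to an empty map.
  cloud_sql = {}

  # `cluster_name` is the only attribute without a default.
  kyverno = {
    cluster_name = "my-gke-cluster"
  }

  cert_manager = {
    cluster_name = "my-gke-cluster"
  }
}
```

`ssl_alert` and `typesense` are left out here because both default to `{}` with `enabled = false`.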

Providers

Name Version
google >= 5.10

Requirements

Name Version
terraform >= 1.5
google >= 5.10
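
A root configuration that satisfies these constraints could look like the sketch below. The `hashicorp/google` source address is the standard registry address for the Google provider; the project ID is a placeholder.

```hcl
terraform {
  required_version = ">= 1.5"

  required_providers {
    google = {
      source  = "hashicorp/google"
      version = ">= 5.10"
    }
  }
}

provider "google" {
  project = "my-gcp-project" # placeholder project ID
}
```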

Inputs

Name Description Type Default Required
cert_manager Configuration for cert-manager missing issuer log alert. Allows customization of project, cluster, namespace, notification channels, alert documentation, enablement, extra filters, auto-close timing, and notification rate limiting.
object({
enabled = optional(bool, true)
cluster_name = string
project_id = optional(string, null)
namespace = optional(string, "cert-manager")
notification_enabled = optional(bool, true)
notification_channels = optional(list(string), [])
logmatch_notification_rate_limit = optional(string, "300s")
alert_documentation = optional(string, null)
auto_close_seconds = optional(number, 3600)
filter_extra = optional(string, "")
})
n/a yes
cloud_sql Configuration for Cloud SQL monitoring alerts. Supports customization of project, auto-close timing, notification channels, and per-instance alert thresholds for CPU, memory, and disk utilization.
object({
project_id = optional(string, null)
auto_close = optional(string, "86400s") # default 24h
notification_enabled = optional(bool, true)
notification_channels = optional(list(string), [])
instances = optional(map(object({
cpu_utilization = optional(list(object({
severity = optional(string, "WARNING"),
threshold = optional(number, 0.90)
alignment_period = optional(string, "120s")
duration = optional(string, "300s")
})), [
{
threshold = 0.85,
duration = "1200s",
},
{
severity = "CRITICAL",
threshold = 1,
duration = "300s",
alignment_period = "60s",
}
])
memory_utilization = optional(list(object({
severity = optional(string, "WARNING"),
threshold = optional(number, 0.90)
alignment_period = optional(string, "300s")
duration = optional(string, "300s")
})), [
{
severity = "WARNING",
},
{
severity = "CRITICAL",
threshold = 0.95,
}
])
disk_utilization = optional(list(object({
severity = optional(string, "WARNING"),
threshold = optional(number, 0.85)
alignment_period = optional(string, "300s")
duration = optional(string, "600s")
})), [
{
severity = "WARNING",
},
{
severity = "CRITICAL",
threshold = 0.95,
}
])
})), {})
})
n/a yes
kyverno Configuration for Kyverno monitoring alerts. Allows customization of cluster name, project, notification channels, alert documentation, metric thresholds, auto-close timing, enablement, extra filters, and namespace.
object({
enabled = optional(bool, true)
cluster_name = string
project_id = optional(string, null)
notification_enabled = optional(bool, true)
notification_channels = optional(list(string), [])
# Rate limit for notifications, e.g. "300s" for 5 minutes, used only for log match alerts
logmatch_notification_rate_limit = optional(string, "300s")
alert_documentation = optional(string, null)
auto_close_seconds = optional(number, 3600)
filter_extra = optional(string, "")
namespace = optional(string, "kyverno")
})
n/a yes
notification_channels List of notification channel IDs to notify when an alert is triggered list(string) [] no
project_id The Google Cloud project ID where the monitoring alert policies will be created string n/a yes
ssl_alert Configuration for SSL certificate expiration alerts. Allows customization of project, notification channels, alert thresholds, and user labels.
object({
enabled = optional(bool, false)
project_id = optional(string, null)
notification_enabled = optional(bool, true)
notification_channels = optional(list(string), [])
threshold_days = optional(list(number), [15, 7])
user_labels = optional(map(string), {})
})
{} no
typesense Configuration for Typesense monitoring alerts. Supports uptime checks for HTTP endpoints and container-level alerts (pod restarts) in GKE. Each app is identified by its name (map key). For container checks, the app name corresponds to the Kubernetes 'app' label; for apps with only uptime checks, this correspondence does not apply.
object({
enabled = optional(bool, false)
project_id = optional(string, null)
notification_enabled = optional(bool, true)
notification_channels = optional(list(string), [])
cluster_name = optional(string, null) # GKE cluster name for container checks

# Apps configuration - map keyed by app_name
apps = optional(map(object({
# Uptime check configuration (optional)
uptime_check = optional(object({
enabled = optional(bool, true)
host = string
path = optional(string, "/readyz")
}), null)

# Container check configuration for GKE (optional)
container_check = optional(object({
enabled = optional(bool, true)
namespace = string
pod_restart = optional(object({
threshold = optional(number, 0)
alignment_period = optional(number, 60)
duration = optional(number, 0)
auto_close_seconds = optional(number, 3600)
}), {})
}), null)
})), {})
})
{} no
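
The object types above can be exercised as in the following sketch, which overrides Cloud SQL thresholds for one instance and turns on the optional SSL and Typesense alerts. All names (`source` path, project, cluster, instance, app, host, namespace) are hypothetical, and the assumption that each `instances` key identifies a Cloud SQL instance is inferred from the input description rather than confirmed by the module source.

```hcl
module "gcp_services_monitoring" {
  # Hypothetical local path; point this at the module's real location.
  source = "./modules/gcp-services-monitoring"

  project_id = "my-gcp-project" # placeholder

  # Required inputs, kept minimal here.
  kyverno      = { cluster_name = "my-gke-cluster" }
  cert_manager = { cluster_name = "my-gke-cluster" }

  cloud_sql = {
    auto_close = "43200s" # 12h instead of the 24h default
    instances = {
      # Assumption: the map key identifies the Cloud SQL instance.
      "my-sql-instance" = {
        # A list given here replaces the default CPU list entirely.
        cpu_utilization = [
          { threshold = 0.80, duration = "600s" },     # severity defaults to "WARNING"
          { severity = "CRITICAL", threshold = 0.95 }, # duration defaults to "300s"
        ]
        # memory_utilization and disk_utilization are omitted,
        # so their default lists from the type definition apply.
      }
    }
  }

  ssl_alert = {
    enabled        = true
    threshold_days = [30, 14, 7]
  }

  typesense = {
    enabled      = true
    cluster_name = "my-gke-cluster" # used for container checks
    apps = {
      # The map key is the app name; for container checks it should
      # match the Kubernetes 'app' label.
      "typesense-search" = {
        uptime_check = { host = "search.example.com" } # path defaults to "/readyz"
        container_check = {
          namespace   = "search"
          pod_restart = { threshold = 2 }
        }
      }
    }
  }
}
```

Because every attribute inside the threshold objects is optional, each list entry only needs the values that differ from the per-attribute defaults.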

Outputs

Name Description
cloud_sql_cpu_utilization n/a
cloud_sql_disk_utilization n/a
cloud_sql_memory_utilization n/a
ssl_alert_policy_names n/a
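
The outputs carry no description, so their exact shape is not documented here. As a hedged sketch, they can be re-exported from a root configuration for inspection. The module label `gcp_services_monitoring` is a hypothetical name for a module block declared elsewhere in the same configuration.

```hcl
# Assumes a `module "gcp_services_monitoring"` block exists in this configuration.
output "cloud_sql_cpu_alerts" {
  value = module.gcp_services_monitoring.cloud_sql_cpu_utilization
}

output "ssl_alert_policy_names" {
  value = module.gcp_services_monitoring.ssl_alert_policy_names
}
```

Running `terraform output` after an apply then shows what each value contains.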

Resources

Name Type
google_monitoring_alert_policy.cert_manager_logmatch_alert resource
google_monitoring_alert_policy.cloud_sql_cpu_utilization resource
google_monitoring_alert_policy.cloud_sql_disk_utilization resource
google_monitoring_alert_policy.cloud_sql_memory_utilization resource
google_monitoring_alert_policy.kyverno_logmatch_alert resource
google_monitoring_alert_policy.ssl_expiring_days resource
google_monitoring_alert_policy.typesense_pod_restart resource

Modules

Name Source Version
typesense_uptime_checks github.com/sparkfabrik/terraform-sparkfabrik-gcp-http-monitoring 1.0.0