Merged
17 changes: 6 additions & 11 deletions README.md
@@ -31,9 +31,13 @@ This module supports configuring disk support for Materialize using local SSDs i
When using disk support for Materialize on GCP, you need to use machine types that support local SSD attachment. Here are some recommended machine types:

* [N2 series](https://cloud.google.com/compute/docs/general-purpose-machines#n2d_machine_types) with local NVMe SSDs:
* For memory-optimized workloads similar to AWS r7gd, consider `n2-highmem-16` or `n2-highmem-32` with local NVMe SSDs
* For memory-optimized workloads, consider `n2-highmem-16` or `n2-highmem-32` with local NVMe SSDs
* Example: `n2-highmem-32` with 2 or more local SSDs

* [N2D series](https://cloud.google.com/compute/docs/general-purpose-machines#n2d_machine_types) with local NVMe SSDs:
* For memory-optimized workloads, consider `n2d-highmem-16` or `n2d-highmem-32` with local NVMe SSDs
* Example: `n2d-highmem-32` with 2 or more local SSDs

### Enabling Disk Support

To enable disk support with default settings in your Terraform configuration:
@@ -105,15 +109,6 @@ The following table helps you determine the appropriate number of local SSDs bas
Remember that each local NVMe SSD in GCP provides 375 GB of storage.
Choose a `local_ssd_count` that makes your total disk space at least twice the RAM of your machine type, for optimal Materialize performance.
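
The 2x-RAM guideline can be sketched in configuration. This is illustrative only: the `source` path is a placeholder, and the sizing assumes `n2-highmem-32` (256 GB RAM), so two 375 GB local SSDs (750 GB total) clear the 512 GB target. Input names are taken from this module's variables; other required inputs are omitted.

```hcl
# Sizing sketch (illustrative): n2-highmem-32 has 256 GB RAM,
# so aim for >= 512 GB of local SSD; 2 x 375 GB = 750 GB suffices.
module "materialize" {
  source = "..." # placeholder path to this module

  enable_disk_support = true

  disk_support_config = {
    local_ssd_count = 2
  }

  gke_config = {
    node_count   = 1
    machine_type = "n2-highmem-32"
    disk_size_gb = 100
    min_nodes    = 1
    max_nodes    = 2
  }

  # ...remaining required inputs (e.g. database_config) omitted
}
```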

### Local SSD Limitations in GCP

Note that there are some differences between AWS NVMe instance store and GCP local SSDs:

1. GCP local NVMe SSDs have a fixed size of 375 GB each
2. Local SSDs must be attached at instance creation time
3. The number of local SSDs you can attach depends on the machine type
4. Data on local SSDs is lost when the instance stops or is deleted

## Requirements

| Name | Version |
@@ -151,7 +146,7 @@ No resources.
| <a name="input_cert_manager_install_timeout"></a> [cert\_manager\_install\_timeout](#input\_cert\_manager\_install\_timeout) | Timeout for installing the cert-manager helm chart, in seconds. | `number` | `300` | no |
| <a name="input_cert_manager_namespace"></a> [cert\_manager\_namespace](#input\_cert\_manager\_namespace) | The name of the namespace in which cert-manager is or will be installed. | `string` | `"cert-manager"` | no |
| <a name="input_database_config"></a> [database\_config](#input\_database\_config) | Cloud SQL configuration | <pre>object({<br/> tier = optional(string, "db-custom-2-4096")<br/> version = optional(string, "POSTGRES_15")<br/> password = string<br/> username = optional(string, "materialize")<br/> db_name = optional(string, "materialize")<br/> })</pre> | n/a | yes |
| <a name="input_disk_support_config"></a> [disk\_support\_config](#input\_disk\_support\_config) | Advanced configuration for disk support (only used when enable\_disk\_support = true) | <pre>object({<br/> install_openebs = optional(bool, true)<br/> run_disk_setup_script = optional(bool, true)<br/> local_ssd_count = optional(number, 1)<br/> create_storage_class = optional(bool, true)<br/> openebs_version = optional(string, "4.2.0")<br/> openebs_namespace = optional(string, "openebs")<br/> storage_class_name = optional(string, "openebs-lvm-instance-store-ext4")<br/> storage_class_provisioner = optional(string, "local.csi.openebs.io")<br/> storage_class_parameters = optional(object({<br/> storage = optional(string, "lvm")<br/> volgroup = optional(string, "instance-store-vg")<br/> }), {})<br/> })</pre> | `{}` | no |
| <a name="input_disk_support_config"></a> [disk\_support\_config](#input\_disk\_support\_config) | Advanced configuration for disk support (only used when enable\_disk\_support = true) | <pre>object({<br/> install_openebs = optional(bool, true)<br/> run_disk_setup_script = optional(bool, true)<br/> local_ssd_count = optional(number, 1)<br/> create_storage_class = optional(bool, true)<br/> openebs_version = optional(string, "4.2.0")<br/> openebs_namespace = optional(string, "openebs")<br/> storage_class_name = optional(string, "openebs-lvm-instance-store-ext4")<br/> storage_class_provisioner = optional(string, "local.csi.openebs.io")<br/> })</pre> | `{}` | no |
| <a name="input_enable_disk_support"></a> [enable\_disk\_support](#input\_enable\_disk\_support) | Enable disk support for Materialize using OpenEBS and local SSDs. When enabled, this configures OpenEBS, runs the disk setup script, and creates appropriate storage classes. | `bool` | `true` | no |
| <a name="input_gke_config"></a> [gke\_config](#input\_gke\_config) | GKE cluster configuration. Make sure to use large enough machine types for your Materialize instances. | <pre>object({<br/> node_count = number<br/> machine_type = string<br/> disk_size_gb = number<br/> min_nodes = number<br/> max_nodes = number<br/> })</pre> | <pre>{<br/> "disk_size_gb": 100,<br/> "machine_type": "n2-highmem-8",<br/> "max_nodes": 2,<br/> "min_nodes": 1,<br/> "node_count": 1<br/>}</pre> | no |
| <a name="input_helm_chart"></a> [helm\_chart](#input\_helm\_chart) | Chart name from repository or local path to chart. For local charts, set the path to the chart directory. | `string` | `"materialize-operator"` | no |
15 changes: 5 additions & 10 deletions docs/header.md
@@ -31,9 +31,13 @@ This module supports configuring disk support for Materialize using local SSDs i
When using disk support for Materialize on GCP, you need to use machine types that support local SSD attachment. Here are some recommended machine types:

* [N2 series](https://cloud.google.com/compute/docs/general-purpose-machines#n2d_machine_types) with local NVMe SSDs:
* For memory-optimized workloads similar to AWS r7gd, consider `n2-highmem-16` or `n2-highmem-32` with local NVMe SSDs
* For memory-optimized workloads, consider `n2-highmem-16` or `n2-highmem-32` with local NVMe SSDs
* Example: `n2-highmem-32` with 2 or more local SSDs

* [N2D series](https://cloud.google.com/compute/docs/general-purpose-machines#n2d_machine_types) with local NVMe SSDs:
* For memory-optimized workloads, consider `n2d-highmem-16` or `n2d-highmem-32` with local NVMe SSDs
* Example: `n2d-highmem-32` with 2 or more local SSDs

### Enabling Disk Support

To enable disk support with default settings in your Terraform configuration:
@@ -104,12 +108,3 @@ The following table helps you determine the appropriate number of local SSDs bas

Remember that each local NVMe SSD in GCP provides 375 GB of storage.
Choose a `local_ssd_count` that makes your total disk space at least twice the RAM of your machine type, for optimal Materialize performance.

### Local SSD Limitations in GCP

Note that there are some differences between AWS NVMe instance store and GCP local SSDs:

1. GCP local NVMe SSDs have a fixed size of 375 GB each
2. Local SSDs must be attached at instance creation time
3. The number of local SSDs you can attach depends on the machine type
4. Data on local SSDs is lost when the instance stops or is deleted
4 changes: 2 additions & 2 deletions main.tf
@@ -15,9 +15,9 @@ locals {
storage_class_name = lookup(var.disk_support_config, "storage_class_name", "openebs-lvm-instance-store-ext4")
storage_class_provisioner = lookup(var.disk_support_config, "storage_class_provisioner", "local.csi.openebs.io")
storage_class_parameters = {
storage = try(var.disk_support_config.storage_class_parameters.storage, "lvm")
storage = "lvm"
fsType = "ext4"
volgroup = try(var.disk_support_config.storage_class_parameters.volgroup, "instance-store-vg")
volgroup = "instance-store-vg"
}
}
}
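
For reference, with `storage` and `volgroup` now constants, these locals correspond to a StorageClass roughly like the following. This is a sketch assuming the module creates it via the Terraform `kubernetes` provider; the names and values are taken from the defaults above, not from the module's actual resource code.

```hcl
# Sketch of the StorageClass implied by the locals above (assumed to be
# created via the kubernetes provider; not the module's exact code).
resource "kubernetes_storage_class" "openebs_lvm" {
  metadata {
    name = "openebs-lvm-instance-store-ext4"
  }
  storage_provisioner = "local.csi.openebs.io"
  parameters = {
    storage  = "lvm"
    volgroup = "instance-store-vg"
    fsType   = "ext4"
  }
}
```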
37 changes: 8 additions & 29 deletions modules/gke/bootstrap.sh
@@ -6,9 +6,9 @@ echo "Starting GCP NVMe SSD setup"
# Install required tools
if command -v apt-get >/dev/null 2>&1; then
apt-get update
apt-get install -y lvm2=2.03.11-2.1
apt-get install -y lvm2
elif command -v yum >/dev/null 2>&1; then
yum install -y lvm2-2.03.11-5.el9
yum install -y lvm2
else
echo "No package manager found. Please install required tools manually."
exit 1
@@ -17,37 +17,16 @@ fi
# Find NVMe devices
SSD_DEVICE_LIST=()

# Try the standard GCP pattern first
devices=$(find /dev/disk/by-id/ -name "google-local-nvme-ssd-*" 2>/dev/null || true)
devices=$(find /dev/disk/by-id/ -name "google-local-ssd-*" 2>/dev/null || true)
if [ -n "$devices" ]; then
while read -r device; do
SSD_DEVICE_LIST+=("$device")
done <<<"$devices"
fi

# If no devices found via standard pattern, look for NVMe devices directly
if [ ${#SSD_DEVICE_LIST[@]} -eq 0 ]; then
echo "No Local NVMe SSD devices found via standard pattern. Checking direct NVMe devices..."

for device in /dev/nvme*n*; do
# Skip if not a block device or if it's a partition
if [[ -b "$device" && ! "$device" =~ "p"[0-9]+ ]]; then
# Check if size is approximately 375GB
size_bytes=$(blockdev --getsize64 $device 2>/dev/null || echo 0)
# 375GB = approximately 402653184000 bytes
if ((size_bytes > 400000000000 && size_bytes < 405000000000)); then
echo "Found potential NVMe local SSD: $device ($((size_bytes / (1024 * 1024 * 1024))) GB)"
SSD_DEVICE_LIST+=("$device")
fi
fi
done
fi

echo "Found ${#SSD_DEVICE_LIST[@]} NVMe SSD devices: ${SSD_DEVICE_LIST[*]:-none}"

if [ ${#SSD_DEVICE_LIST[@]} -eq 0 ]; then
echo "No usable NVMe SSD devices found"
exit 0
else
echo "ERROR: No Local SSD devices found at standard path /dev/disk/by-id/google-local-ssd-*"
echo "Please verify that local SSDs were properly attached to this instance"
echo "See: https://cloud.google.com/compute/docs/disks/local-ssd"
exit 1
fi

# Check if any of the devices are already in use by LVM
20 changes: 14 additions & 6 deletions modules/gke/main.tf
@@ -249,11 +249,9 @@ resource "kubernetes_daemonset" "disk_setup" {

resources {
limits = {
cpu = "200m"
memory = "256Mi"
memory = "128Mi"
}
requests = {
cpu = "100m"
Contributor

We still want at least some cpu request.

memory = "128Mi"
}
}
Comment on lines +250 to +261
Contributor

Memory requests and limits should always be equal.

CPU limits are generally frowned upon, as CPU can be returned extremely quickly. In the case of this bootstrap script, it will likely be idle for most of the time, so limiting CPU is likely not helpful.
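
Applying that guidance literally, the container's resources block would look roughly like this. This is a sketch of the reviewer's suggestion, not the committed diff, and the CPU request value is illustrative: equal memory request and limit, a CPU request retained, and no CPU limit.

```hcl
resources {
  limits = {
    memory = "128Mi" # limit equals the request, per the review guidance
  }
  requests = {
    cpu    = "100m"  # keep some CPU request (value illustrative)
    memory = "128Mi"
  }
}
```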

@@ -278,6 +276,11 @@ resource "kubernetes_daemonset" "disk_setup" {
name = "mnt"
mount_path = "/mnt"
}

volume_mount {
name = "dev"
mount_path = "/dev"
}
}

container {
@@ -286,11 +289,9 @@

resources {
limits = {
cpu = "50m"
memory = "64Mi"
memory = "32Mi"
}
requests = {
cpu = "10m"
memory = "32Mi"
}
}
@@ -318,6 +319,13 @@
path = "/mnt"
}
}

volume {
name = "dev"
host_path {
path = "/dev"
}
}
}
}
}
4 changes: 0 additions & 4 deletions variables.tf
@@ -221,10 +221,6 @@ variable "disk_support_config" {
openebs_namespace = optional(string, "openebs")
storage_class_name = optional(string, "openebs-lvm-instance-store-ext4")
storage_class_provisioner = optional(string, "local.csi.openebs.io")
Contributor

The storage_class_provisioner should be constant.

storage_class_parameters = optional(object({
storage = optional(string, "lvm")
volgroup = optional(string, "instance-store-vg")
}), {})
})
default = {}
}