Commit 7361356

Merge pull request #57 from MaterializeInc/swap_follow_up_improvements
Follow ups after enabling swap
2 parents 0004cdd + 9942c2f commit 7361356

6 files changed, +124 -146 lines changed


README.md

Lines changed: 30 additions & 55 deletions
Original file line number | Diff line number | Diff line change
@@ -91,58 +91,7 @@ This module requires an existing Azure Resource Group. You can either:
9191
resource_group_name = "your-existing-rg"
9292
```
9393

94-
## Disk Support for Materialize on Azure
95-
96-
This module supports configuring disks for Materialize on Azure using **local NVMe SSDs** available in specific VM families, along with **OpenEBS** and LVM for volume management.
97-
98-
### Recommended Azure VM Types with Local NVMe Disks
99-
100-
Materialize benefits from fast ephemeral storage and recommends a **minimum 2:1 disk-to-RAM ratio**. The [Epdsv6-series](https://learn.microsoft.com/en-us/azure/virtual-machines/sizes/memory-optimized/epdsv6-series?tabs=sizebasic#sizes-in-series) virtual machines offer a balanced combination of **high memory, local NVMe storage**.
101-
102-
#### Epdsv6-series
103-
104-
| VM Size | vCPUs | Memory | Ephemeral Disk | Disk-to-RAM Ratio |
105-
| -------------------- | ----- | ------- | -------------- | ----------------- |
106-
| `Standard_E2pds_v6` | 2 | 16 GiB | 75 GiB | ~4.7:1 |
107-
| `Standard_E4pds_v6` | 4 | 32 GiB | 150 GiB | ~4.7:1 |
108-
| `Standard_E8pds_v6` | 8 | 64 GiB | 300 GiB | ~4.7:1 |
109-
| `Standard_E16pds_v6` | 16 | 128 GiB | 600 GiB | ~4.7:1 |
110-
| `Standard_E32pds_v6` | 32 | 256 GiB | 1,200 GiB | ~4.7:1 |
111-
112-
> [!NOTE]
113-
> These VM types provide **ephemeral local NVMe SSD disks**. Data is lost when the VM is stopped or deleted, so they should only be used for **temporary or performance-critical data** managed by Materialize.
114-
115-
### Enabling Disk Support on Azure
116-
117-
When `enable_disk_support` is set to `true`, the module:
118-
119-
1. Uses a bootstrap container to identify and configure available NVMe disks
120-
1. Sets up **OpenEBS** with `lvm-localpv` to manage the ephemeral disks
121-
1. Creates a StorageClass for Materialize
122-
123-
Example configuration:
124-
125-
```hcl
126-
enable_disk_support = true
127-
128-
aks_config = {
129-
node_count = 2
130-
vm_size = "Standard_E4pds_v6"
131-
os_disk_size_gb = 100
132-
min_nodes = 2
133-
max_nodes = 4
134-
}
135-
136-
disk_support_config = {
137-
install_openebs = true
138-
run_disk_setup_script = true
139-
create_storage_class = true
140-
141-
openebs_version = "4.3.3"
142-
openebs_namespace = "openebs"
143-
storage_class_name = "openebs-lvm-instance-store-ext4"
144-
}
145-
```
94+
### Advanced Configuration
14695

14796
## `materialize_instances` variable
14897

@@ -188,7 +137,7 @@ No providers.
188137
| <a name="module_networking"></a> [networking](#module\_networking) | ./modules/networking | n/a |
189138
| <a name="module_operator"></a> [operator](#module\_operator) | github.com/MaterializeInc/terraform-helm-materialize | v0.1.35 |
190139
| <a name="module_storage"></a> [storage](#module\_storage) | ./modules/storage | n/a |
191-
| <a name="module_swap_nodepool"></a> [swap\_nodepool](#module\_swap\_nodepool) | ./modules/nodepool | n/a |
140+
| <a name="module_materialize_nodepool"></a> [materialize\_nodepool](#module\_materialize\_nodepool) | ./modules/nodepool | n/a |
192141

193142
## Resources
194143

@@ -198,7 +147,14 @@ No resources.
198147

199148
| Name | Description | Type | Default | Required |
200149
|------|-------------|------|---------|:--------:|
201-
| <a name="input_aks_config"></a> [aks\_config](#input\_aks\_config) | AKS cluster configuration | <pre>object({<br/> vm_size = string<br/> disk_size_gb = number<br/> min_nodes = number<br/> max_nodes = number<br/> })</pre> | <pre>{<br/> "disk_size_gb": 100,<br/> "max_nodes": 5,<br/> "min_nodes": 1,<br/> "vm_size": "Standard_E4pds_v6"<br/>}</pre> | no |
150+
| <a name="input_system_node_pool_vm_size"></a> [system\_node\_pool\_vm\_size](#input\_system\_node\_pool\_vm\_size) | VM size for system node pool | `string` | `"Standard_E4pds_v6"` | no |
151+
| <a name="input_system_node_pool_disk_size_gb"></a> [system\_node\_pool\_disk\_size\_gb](#input\_system\_node\_pool\_disk\_size\_gb) | Disk size in GB for system node pool | `number` | `100` | no |
152+
| <a name="input_system_node_pool_min_nodes"></a> [system\_node\_pool\_min\_nodes](#input\_system\_node\_pool\_min\_nodes) | Minimum number of nodes in system node pool | `number` | `1` | no |
153+
| <a name="input_system_node_pool_max_nodes"></a> [system\_node\_pool\_max\_nodes](#input\_system\_node\_pool\_max\_nodes) | Maximum number of nodes in system node pool | `number` | `4` | no |
154+
| <a name="input_materialize_node_pool_vm_size"></a> [materialize\_node\_pool\_vm\_size](#input\_materialize\_node\_pool\_vm\_size) | VM size for Materialize node pool | `string` | `"Standard_E4pds_v6"` | no |
155+
| <a name="input_materialize_node_pool_disk_size_gb"></a> [materialize\_node\_pool\_disk\_size\_gb](#input\_materialize\_node\_pool\_disk\_size\_gb) | Disk size in GB for Materialize node pool | `number` | `100` | no |
156+
| <a name="input_materialize_node_pool_min_nodes"></a> [materialize\_node\_pool\_min\_nodes](#input\_materialize\_node\_pool\_min\_nodes) | Minimum number of nodes in Materialize node pool | `number` | `1` | no |
157+
| <a name="input_materialize_node_pool_max_nodes"></a> [materialize\_node\_pool\_max\_nodes](#input\_materialize\_node\_pool\_max\_nodes) | Maximum number of nodes in Materialize node pool | `number` | `4` | no |
202158
| <a name="input_cert_manager_chart_version"></a> [cert\_manager\_chart\_version](#input\_cert\_manager\_chart\_version) | Version of the cert-manager helm chart to install. | `string` | `"v1.17.1"` | no |
203159
| <a name="input_cert_manager_install_timeout"></a> [cert\_manager\_install\_timeout](#input\_cert\_manager\_install\_timeout) | Timeout for installing the cert-manager helm chart, in seconds. | `number` | `300` | no |
204160
| <a name="input_cert_manager_namespace"></a> [cert\_manager\_namespace](#input\_cert\_manager\_namespace) | The name of the namespace in which cert-manager is or will be installed. | `string` | `"cert-manager"` | no |
@@ -219,7 +175,6 @@ No resources.
219175
| <a name="input_orchestratord_version"></a> [orchestratord\_version](#input\_orchestratord\_version) | Version of the Materialize orchestrator to install | `string` | `null` | no |
220176
| <a name="input_prefix"></a> [prefix](#input\_prefix) | Prefix to be used for resource names | `string` | `"materialize"` | no |
221177
| <a name="input_resource_group_name"></a> [resource\_group\_name](#input\_resource\_group\_name) | The name of an existing resource group to use | `string` | n/a | yes |
222-
| <a name="input_swap_enabled"></a> [swap\_enabled](#input\_swap\_enabled) | Enable swap for Materialize. When enabled, this configures swap on a new nodepool, and adds it to the clusterd node selectors. | `bool` | `false` | no |
223178
| <a name="input_tags"></a> [tags](#input\_tags) | Tags to apply to all resources | `map(string)` | `{}` | no |
224179
| <a name="input_use_local_chart"></a> [use\_local\_chart](#input\_use\_local\_chart) | Whether to use a local chart instead of one from a repository | `bool` | `false` | no |
225180
| <a name="input_use_self_signed_cluster_issuer"></a> [use\_self\_signed\_cluster\_issuer](#input\_use\_self\_signed\_cluster\_issuer) | Whether to install and use a self-signed ClusterIssuer for TLS. To work around limitations in Terraform, this will be treated as `false` if no materialize instances are defined. | `bool` | `true` | no |
@@ -269,6 +224,26 @@ More advanced TLS support using user-provided CAs or per-Materialize `Issuer`s a
269224

270225
## Upgrade Notes
271226

227+
#### v0.7.0
228+
229+
This is an intermediate version to handle some changes that must be applied in stages.
230+
It is recommended to upgrade to v0.8.x after upgrading to this version.
231+
232+
Breaking changes:
233+
* Swap is enabled by default.
234+
* Support for lgalloc, our legacy spill to disk mechanism, is deprecated, and will be removed in the next version.
235+
* We now always use two node pools, one for system workloads and one for Materialize workloads.
236+
* Variables for configuring these node pools have been renamed, so they may be configured separately.
237+
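As a rough illustration of the variable rename (this sketch is not part of the commit's diff), a module call that previously set `aks_config` might be rewritten with the per-pool inputs listed in the Inputs table. The values shown simply repeat the documented defaults; the choice to keep both pools identical is an assumption for illustration only.

```hcl
# Arguments inside your existing `module "materialize"` block.

# Before (v0.6.x): one object sized every node pool.
# aks_config = {
#   vm_size      = "Standard_E4pds_v6"
#   disk_size_gb = 100
#   min_nodes    = 1
#   max_nodes    = 5
# }

# After (v0.7.0): the system pool and the Materialize pool are sized separately.
system_node_pool_vm_size      = "Standard_E4pds_v6"
system_node_pool_disk_size_gb = 100
system_node_pool_min_nodes    = 1
system_node_pool_max_nodes    = 4

materialize_node_pool_vm_size      = "Standard_E4pds_v6"
materialize_node_pool_disk_size_gb = 100
materialize_node_pool_min_nodes    = 1
materialize_node_pool_max_nodes    = 4

# `swap_enabled` no longer appears in the Inputs table, so a call that
# still passes it would presumably need that argument removed.
```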
238+
To avoid downtime when upgrading to future versions, you must perform a rollout at this version.
239+
1. Ensure your `environmentd_version` is at least `v26.0.0`.
240+
2. Update your `request_rollout` (and `force_rollout` if already at the correct `environmentd_version`).
241+
3. Run `terraform apply`.
242+
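The rollout steps above might look like the following in a `materialize_instances` entry. This is a sketch under assumptions: only `environmentd_version`, `request_rollout`, and `force_rollout` come from the notes above; the list-of-objects shape, the `name` field, and the UUID-style values are hypothetical placeholders, so check the `materialize_instances` variable documentation for the actual schema.

```hcl
materialize_instances = [
  {
    name = "demo" # hypothetical instance name

    # Step 1: run at least v26.0.0.
    environmentd_version = "v26.0.0"

    # Step 2: change request_rollout to a new value to trigger a rollout.
    # The value format here is an assumption.
    request_rollout = "00000000-0000-0000-0000-000000000002"

    # Only needed if environmentd_version was already correct; setting it
    # to the same value as request_rollout is an assumption.
    force_rollout = "00000000-0000-0000-0000-000000000002"

    # ...other fields unchanged
  }
]
```

Running `terraform apply` afterwards (step 3) applies the change.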
243+
You must upgrade to at least v0.6.x before upgrading to v0.7.0 of this terraform code.
244+
245+
It is strongly recommended to have enabled swap on v0.6.x before upgrading to v0.7.0 or higher.
246+
272247
#### v0.6.1
273248

274249
We recommend upgrading to at least v0.5.10 before upgrading to v0.6.x of this terraform code.

docs/footer.md

Lines changed: 20 additions & 0 deletions
@@ -27,6 +27,26 @@ More advanced TLS support using user-provided CAs or per-Materialize `Issuer`s a
2727

2828
## Upgrade Notes
2929

30+
#### v0.7.0
31+
32+
This is an intermediate version to handle some changes that must be applied in stages.
33+
It is recommended to upgrade to v0.8.x after upgrading to this version.
34+
35+
Breaking changes:
36+
* Swap is enabled by default.
37+
* Support for lgalloc, our legacy spill to disk mechanism, is deprecated, and will be removed in the next version.
38+
* We now always use two node pools, one for system workloads and one for Materialize workloads.
39+
* Variables for configuring these node pools have been renamed, so they may be configured separately.
40+
41+
To avoid downtime when upgrading to future versions, you must perform a rollout at this version.
42+
1. Ensure your `environmentd_version` is at least `v26.0.0`.
43+
2. Update your `request_rollout` (and `force_rollout` if already at the correct `environmentd_version`).
44+
3. Run `terraform apply`.
45+
46+
You must upgrade to at least v0.6.x before upgrading to v0.7.0 of this terraform code.
47+
48+
It is strongly recommended to have enabled swap on v0.6.x before upgrading to v0.7.0 or higher.
49+
3050
#### v0.6.1
3151

3252
We recommend upgrading to at least v0.5.10 before upgrading to v0.6.x of this terraform code.

docs/header.md

Lines changed: 1 addition & 52 deletions
@@ -90,58 +90,7 @@ This module requires an existing Azure Resource Group. You can either:
9090
resource_group_name = "your-existing-rg"
9191
```
9292

93-
## Disk Support for Materialize on Azure
94-
95-
This module supports configuring disks for Materialize on Azure using **local NVMe SSDs** available in specific VM families, along with **OpenEBS** and LVM for volume management.
96-
97-
### Recommended Azure VM Types with Local NVMe Disks
98-
99-
Materialize benefits from fast ephemeral storage and recommends a **minimum 2:1 disk-to-RAM ratio**. The [Epdsv6-series](https://learn.microsoft.com/en-us/azure/virtual-machines/sizes/memory-optimized/epdsv6-series?tabs=sizebasic#sizes-in-series) virtual machines offer a balanced combination of **high memory, local NVMe storage**.
100-
101-
#### Epdsv6-series
102-
103-
| VM Size | vCPUs | Memory | Ephemeral Disk | Disk-to-RAM Ratio |
104-
| -------------------- | ----- | ------- | -------------- | ----------------- |
105-
| `Standard_E2pds_v6` | 2 | 16 GiB | 75 GiB | ~4.7:1 |
106-
| `Standard_E4pds_v6` | 4 | 32 GiB | 150 GiB | ~4.7:1 |
107-
| `Standard_E8pds_v6` | 8 | 64 GiB | 300 GiB | ~4.7:1 |
108-
| `Standard_E16pds_v6` | 16 | 128 GiB | 600 GiB | ~4.7:1 |
109-
| `Standard_E32pds_v6` | 32 | 256 GiB | 1,200 GiB | ~4.7:1 |
110-
111-
> [!NOTE]
112-
> These VM types provide **ephemeral local NVMe SSD disks**. Data is lost when the VM is stopped or deleted, so they should only be used for **temporary or performance-critical data** managed by Materialize.
113-
114-
### Enabling Disk Support on Azure
115-
116-
When `enable_disk_support` is set to `true`, the module:
117-
118-
1. Uses a bootstrap container to identify and configure available NVMe disks
119-
1. Sets up **OpenEBS** with `lvm-localpv` to manage the ephemeral disks
120-
1. Creates a StorageClass for Materialize
121-
122-
Example configuration:
123-
124-
```hcl
125-
enable_disk_support = true
126-
127-
aks_config = {
128-
node_count = 2
129-
vm_size = "Standard_E4pds_v6"
130-
os_disk_size_gb = 100
131-
min_nodes = 2
132-
max_nodes = 4
133-
}
134-
135-
disk_support_config = {
136-
install_openebs = true
137-
run_disk_setup_script = true
138-
create_storage_class = true
139-
140-
openebs_version = "4.3.3"
141-
openebs_namespace = "openebs"
142-
storage_class_name = "openebs-lvm-instance-store-ext4"
143-
}
144-
```
93+
### Advanced Configuration
14594

14695
## `materialize_instances` variable
14796

examples/simple/main.tf

Lines changed: 9 additions & 7 deletions
@@ -91,7 +91,15 @@ module "materialize" {
9191

9292
materialize_instances = var.materialize_instances
9393

94-
swap_enabled = var.swap_enabled
94+
system_node_pool_vm_size = "Standard_E4pds_v6"
95+
system_node_pool_disk_size_gb = 100
96+
system_node_pool_min_nodes = 1
97+
system_node_pool_max_nodes = 2
98+
99+
materialize_node_pool_vm_size = "Standard_E4pds_v6"
100+
materialize_node_pool_disk_size_gb = 100
101+
materialize_node_pool_min_nodes = 1
102+
materialize_node_pool_max_nodes = 2
95103

96104
database_config = {
97105
sku_name = "GP_Standard_D2s_v3"
@@ -190,12 +198,6 @@ variable "use_self_signed_cluster_issuer" {
190198
default = true
191199
}
192200

193-
variable "swap_enabled" {
194-
description = "Enable swap for Materialize. When enabled, this configures swap on a new nodepool, and adds it to the clusterd node selectors."
195-
type = bool
196-
default = false
197-
}
198-
199201
# Output the Materialize instance details
200202
output "aks_cluster" {
201203
description = "AKS cluster details"

main.tf

Lines changed: 18 additions & 12 deletions
@@ -4,6 +4,7 @@ locals {
44
module = "materialize"
55
})
66

7+
# TODO we can't delete this until we're certain no one is using it
78
# Disk support configuration
89
disk_config = {
910
install_openebs = var.enable_disk_support ? lookup(var.disk_support_config, "install_openebs", true) : false
@@ -47,10 +48,10 @@ module "aks" {
4748
subnet_id = module.networking.aks_subnet_id
4849
service_cidr = var.network_config.service_cidr
4950

50-
vm_size = var.aks_config.vm_size
51-
disk_size_gb = var.aks_config.disk_size_gb
52-
min_nodes = var.aks_config.min_nodes
53-
max_nodes = var.aks_config.max_nodes
51+
vm_size = var.system_node_pool_vm_size
52+
disk_size_gb = var.system_node_pool_disk_size_gb
53+
min_nodes = var.system_node_pool_min_nodes
54+
max_nodes = var.system_node_pool_max_nodes
5455

5556
# Disk support configuration
5657
enable_disk_setup = local.disk_config.run_disk_setup_script
@@ -62,22 +63,21 @@ module "aks" {
6263
tags = local.common_labels
6364
}
6465

65-
module "swap_nodepool" {
66-
count = var.swap_enabled ? 1 : 0
66+
module "materialize_nodepool" {
6767
source = "./modules/nodepool"
6868
depends_on = [module.aks]
6969

7070
prefix = "${var.prefix}-mz-swap"
7171
cluster_id = module.aks.cluster_id
7272
subnet_id = module.networking.aks_subnet_id
7373

74-
vm_size = var.aks_config.vm_size
75-
disk_size_gb = var.aks_config.disk_size_gb
74+
vm_size = var.materialize_node_pool_vm_size
75+
disk_size_gb = var.materialize_node_pool_disk_size_gb
7676

7777
autoscaling_config = {
7878
enabled = true
79-
min_nodes = var.aks_config.min_nodes
80-
max_nodes = var.aks_config.max_nodes
79+
min_nodes = var.materialize_node_pool_min_nodes
80+
max_nodes = var.materialize_node_pool_max_nodes
8181
}
8282

8383
swap_enabled = true
@@ -87,6 +87,11 @@ module "swap_nodepool" {
8787
tags = local.common_labels
8888
}
8989

90+
moved {
91+
from = module.swap_nodepool[0]
92+
to = module.materialize_nodepool
93+
}
94+
9095
module "database" {
9196
source = "./modules/database"
9297

@@ -148,7 +153,7 @@ locals {
148153
region = var.location
149154
}
150155
clusters = {
151-
swap_enabled = var.swap_enabled
156+
swap_enabled = true
152157
}
153158
}
154159
observability = {
@@ -184,6 +189,7 @@ locals {
184189
}
185190
}
186191
} : {}
192+
# TODO we can't delete this until we're certain no one is using it
187193
storage = var.enable_disk_support ? {
188194
storageClass = {
189195
create = local.disk_config.create_storage_class
@@ -252,7 +258,7 @@ module "operator" {
252258

253259
depends_on = [
254260
module.aks,
255-
module.swap_nodepool,
261+
module.materialize_nodepool,
256262
module.database,
257263
module.storage,
258264
module.certificates,
