create guide/recommendation for custom cluster sizes #32444


Open

jubrad wants to merge 3 commits into self-managed-docs/v25.1 from wip-custom-cluster-size-recomendations

Conversation

@jubrad jubrad (Contributor) commented May 9, 2025

Motivation

Currently, self-managed users are more or less flying blind when it comes to cluster sizes. We should offer some recommendations/guidance here. This is probably not the way to do it, but it's a start: it largely matches the defaults we've provided, with some explanation added. This should probably get some product review.

*This is a very rough draft, just trying to get the ball rolling on this.

Tips for reviewer

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

@jubrad jubrad requested a review from a team as a code owner May 9, 2025 00:21
@jubrad jubrad requested a review from kay-kim May 9, 2025 16:30
@jubrad jubrad force-pushed the wip-custom-cluster-size-recomendations branch from 80938d7 to 94a9bf0 on May 9, 2025 19:59
{{% self-managed/materialize-cluster-sizes %}}

## Custom Cluster Sizes
Contributor:

Moved the content into the appendix to lower the prominence since we don't want to encourage people to override the defaults.

```yaml
memory_limit: <string> # e.g., "46575MiB"
```

{{< yaml-table data="best_practices/sizing_recommendation" >}}
Contributor:

We also have some random blurbs scattered across the docs (listing them here; see the sketch after this list):

  • If spill-to-disk is not enabled: 1:8 ratio of vCPU to GiB memory

  • If spill-to-disk is enabled (recommended): 1:16 ratio of vCPU to GiB local instance storage

  • 2:1 disk-to-RAM ratio with spill-to-disk enabled

Once we settle on what's what, I'll rework these into unified statements.
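
To make the ratios concrete, here's the arithmetic for a hypothetical 6-vCPU size with spill-to-disk enabled (illustrative numbers only; note that composing the first and third blurbs yields exactly the 1:16 vCPU-to-disk ratio of the second):

```yaml
# 1:8 vCPU-to-memory → 6 vCPU × 8 GiB  = 48 GiB memory
# 2:1 disk-to-RAM    → 48 GiB × 2      = 96 GiB local disk
# composed           → 6 vCPU × 16 GiB = 96 GiB disk (the 1:16 blurb)
cpu_limit: 6
memory_limit: "49152MiB" # 48 GiB, following the 1:8 ratio
```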

Contributor Author (jubrad):

> If spill-to-disk is not enabled: 1:8 ratio of vCPU to GiB memory
> If spill-to-disk is enabled (Recommended): 1:16 ratio of vCPU to GiB local instance storage

Where is this coming from? I'm pretty sure we always want 1:8.

> 2:1 disk-to-RAM ratio with spill-to-disk enabled

Sounds right.


Contributor Author (jubrad):

Oh yeah, that's right. I've never seen a cores-to-disk ratio before, but this makes sense to me.

@@ -7,8 +7,40 @@ menu:
weight: 900
---

## Default Cluster Sizes

{{</ tip >}}

```yaml
operator:
```
Contributor:

And ... I'm guessing that if using Terraform, people would set these via https://github.com/MaterializeInc/terraform-aws-materialize?tab=readme-ov-file#input_helm_values ?

Contributor Author (jubrad):

Yup!
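
For anyone following along, a minimal sketch of what that override might look like as Helm-style values (the nesting under `operator:` is elided where I'm unsure of the exact path, and the size entry is hypothetical):

```yaml
operator:
  # (intermediate nesting elided; mirror the chart's default values.yaml)
  sizes:
    my-custom-size: # hypothetical size name
      workers: 2
      scale: 1
      cpu_limit: 2
      memory_limit: "16384MiB" # 16 GiB, keeping the 1:8 ratio
```

With the AWS Terraform module, the same map would be passed through its helm_values input.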

Contributor Author (jubrad):

I think we hide this from our documentation (which is probably good), but if you pull down any of the sample values or defaults, it'll be there for people to change.

Contributor:

Something to consider for the future: I'm familiar with adding blurbs along the lines of "If you need guidance on ..., offer ....". We're not there yet, but custom cluster sizing might be one of those cases.

@jubrad jubrad (Contributor Author) left a comment

Meaningless review since I "authored" this PR, but I approve!

```yaml
sizes:
  <size>:
    workers: <int>
    scale: <int>
```
Member:

Do we want to let users set scale to anything but 1? Larger scales aren't much tested and might come with caveats (network bandwidth requirements) that aren't well documented.

Contributor:

👍 Will update to have `scale: 1 # Generally, should be set to 1.`

We do show a scale value of 2 in our default settings for 6400cc, however. Hope that's okay ... since it is what it is set to. https://preview.materialize.com/materialize/32444/self-managed/v25.1/sql/appendix-cluster-sizes/#default-cluster-sizes

```yaml
    scale: <int>
    cpu_exclusive: <bool>
    cpu_limit: <float> # e.g., 6
    credits_per_hour: <string> # e.g., "0.0"
```
Member:

`credits_per_hour` could be optional? That's a different change, but we should document that it's just a number for accounting purposes.

Contributor Author (jubrad):

Yeah, probably worth back-porting a "0.00" default. I can do this.
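
Putting the fields from the hunks above together, a complete size entry might then look like this (all values illustrative; assumes the back-ported "0.00" default lands):

```yaml
sizes:
  my-custom-size: # hypothetical size name
    workers: 6               # one worker per vCPU (assumption)
    scale: 1                 # generally, should be set to 1
    cpu_exclusive: true
    cpu_limit: 6
    memory_limit: "49152MiB" # 48 GiB, 1:8 vCPU-to-memory
    credits_per_hour: "0.00" # accounting only; proposed default
```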

```yaml
Recommendation: |
  Prefer whole number values to enable CPU affinity. Kubernetes only allows
  CPU Affinity for pods taking a whole number of cores (not hyperthreads).
```
Member:

I think the hyperthread mention is incorrect here. Kubernetes doesn't really distinguish between cores and their hyperthreads.
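
For context on the affinity point: the mechanism here is Kubernetes' static CPU manager policy, which grants exclusive CPUs only to containers in Guaranteed-QoS pods that request an integer number of CPUs (the kubelet must run with --cpu-manager-policy=static). A minimal sketch, with placeholder names:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cpu-affinity-example # hypothetical
spec:
  containers:
    - name: clusterd
      image: example.com/clusterd:latest # placeholder image
      resources:
        # requests == limits with an integer cpu value → Guaranteed QoS,
        # which makes the container eligible for exclusive CPUs under
        # the static CPU manager policy
        requests:
          cpu: "6"
          memory: "49152Mi"
        limits:
          cpu: "6"
          memory: "49152Mi"
```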
