
@ronaldngounou ronaldngounou (Member) commented Oct 11, 2025

Following etcd performance improvements, the storage size limit has been re-evaluated to 100GB instead of 8GB:
https://www.cncf.io/blog/2019/05/09/performance-optimization-of-etcd-in-web-scale-data-scenario/

Contributes to issue #588
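
For readers relating the documented figure to an actual setting: the storage size limit discussed in these docs corresponds to etcd's backend quota, set with the `--quota-backend-bytes` flag. Below is a minimal sketch of raising the quota on an embedded etcd server in Go; it assumes the `embed.Config` field `QuotaBackendBytes` and the default local listen addresses, and should be checked against the etcd release you run.

```go
package main

import (
	"log"
	"time"

	"go.etcd.io/etcd/server/v3/embed"
)

func main() {
	cfg := embed.NewConfig()
	cfg.Dir = "default.etcd"
	// Assumption: QuotaBackendBytes is the embed.Config field backing
	// --quota-backend-bytes; leaving it at 0 keeps the compiled-in default.
	cfg.QuotaBackendBytes = 16 * 1024 * 1024 * 1024 // 16GB, purely for illustration

	e, err := embed.StartEtcd(cfg)
	if err != nil {
		log.Fatal(err)
	}
	defer e.Close()

	select {
	case <-e.Server.ReadyNotify():
		log.Println("embedded etcd is ready with a raised backend quota")
	case <-time.After(30 * time.Second):
		log.Println("etcd took too long to start")
	}
}
```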

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ronaldngounou
Once this PR has been reviewed and has the lgtm label, please assign ivanvc for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ronaldngounou ronaldngounou force-pushed the issue588-lift_etcd_GB_limit branch 2 times, most recently from ba37b27 to c93d626 on October 11, 2025 09:40
@ronaldngounou ronaldngounou force-pushed the issue588-lift_etcd_GB_limit branch from c93d626 to 49407a6 on October 11, 2025 18:36
@ronaldngounou (Member Author) commented:

Lint issues fixed:

content/en/docs/v3.4/faq.md:29:291 MD059/descriptive-link-text Link text should be descriptive [Context: "[here]"] (https://github.com/DavidAnson/markdownlint/blob/main/doc/md059.md)

@jberkus jberkus (Contributor) commented Oct 18, 2025

If you're doing this refactoring, I'd like to make it clear to users that the 100GB is a recommended maximum size, and not a hard limit. This would mean different text in a couple of places. I don't know what the actual hard limit is; probably need to look at the boltDB code.
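
To make the distinction concrete: a recommended maximum only produces a warning while the configured value is still honored, whereas a hard limit rejects the configuration outright. The sketch below is purely illustrative, not the actual etcd or bbolt source, and its constant and function names are invented:

```go
package main

import "log"

// Hypothetical number and name, for illustration only: a "recommended
// maximum" quota that etcd-like software would warn about but still honor.
const suggestedMaxQuotaBytes int64 = 100 * 1024 * 1024 * 1024 // 100GB

// warnIfQuotaExceedsSuggestedMax models a warn-and-continue check, as opposed
// to a hard limit, which would reject the configuration or refuse to start.
func warnIfQuotaExceedsSuggestedMax(quotaBytes int64) {
	if quotaBytes > suggestedMaxQuotaBytes {
		log.Printf("configured quota %d bytes exceeds suggested maximum %d bytes; continuing anyway",
			quotaBytes, suggestedMaxQuotaBytes)
	}
}

func main() {
	// A value above the suggested maximum only produces a warning.
	warnIfQuotaExceedsSuggestedMax(200 * 1024 * 1024 * 1024)
}
```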

@ronaldngounou (Member Author) commented:

Could you please suggest a wording that we should use in the meantime?

## Memory

- etcd has a relatively small memory footprint but its performance still depends on having enough memory. An etcd server will aggressively cache key-value data and spends most of the rest of its memory tracking watchers. Typically 8GB is enough. For heavy deployments with thousands of watchers and millions of keys, allocate 16GB to 64GB memory accordingly.
+ etcd has a relatively small memory footprint but its performance still depends on having enough memory. An etcd server will aggressively cache key-value data and spends most of the rest of its memory tracking watchers. Typically 100GB is enough. For heavy deployments with thousands of watchers and millions of keys, allocate 16GB to 64GB memory accordingly.
Contributor:

Suggested change:
- etcd has a relatively small memory footprint but its performance still depends on having enough memory. An etcd server will aggressively cache key-value data and spends most of the rest of its memory tracking watchers. Typically 100GB is enough. For heavy deployments with thousands of watchers and millions of keys, allocate 16GB to 64GB memory accordingly.
+ etcd has a relatively small memory footprint but its performance still depends on having enough memory. An etcd server will aggressively cache key-value data and spends most of the rest of its memory tracking watchers. Typically 8GB is enough. For heavy deployments with thousands of watchers and millions of keys, allocate 16GB to 64GB memory accordingly. 100GB is a suggested maximum size for normal environments and etcd warns at startup if the configured value exceeds it.

@wendy-ha18 wendy-ha18 (Contributor) commented Nov 18, 2025:

Within the context of this doc, the wording "etcd has a relatively small memory ...... Typically 8GB is enough.... 100GB is a suggested maximum size for normal environments and etcd warns at startup if the configured value exceeds it" makes more sense to me.

Contributor:

Do we actually have a warning at 100GB? I don't have a machine I can test that on.

Member Author:

Reverted this change

## Memory

- etcd has a relatively small memory footprint but its performance still depends on having enough memory. An etcd server will aggressively cache key-value data and spends most of the rest of its memory tracking watchers. Typically 8GB is enough. For heavy deployments with thousands of watchers and millions of keys, allocate 16GB to 64GB memory accordingly.
+ etcd has a relatively small memory footprint but its performance still depends on having enough memory. An etcd server will aggressively cache key-value data and spends most of the rest of its memory tracking watchers. Typically 100GB is enough. For heavy deployments with thousands of watchers and millions of keys, allocate 16GB to 64GB memory accordingly.
Contributor:

Let's make this a limit, not a recommendation:

Suggested change:
- etcd has a relatively small memory footprint but its performance still depends on having enough memory. An etcd server will aggressively cache key-value data and spends most of the rest of its memory tracking watchers. Typically 100GB is enough. For heavy deployments with thousands of watchers and millions of keys, allocate 16GB to 64GB memory accordingly.
+ etcd has a relatively small memory footprint but its performance still depends on having enough memory. An etcd server will aggressively cache key-value data and spends most of the rest of its memory tracking watchers. Typically 8GB is enough. For heavy deployments with thousands of watchers and millions of keys, allocate 16GB to 64GB memory accordingly, up to a recommended maximum of 100GB.

Member Author:

Addressed

@jberkus jberkus (Contributor) commented Nov 19, 2025

For content/en/blog/2023/how_to_debug_large_db_size_issue.md, let's take it out of this PR and open a separate effort to convert the blog post into an Operations doc.

@ToSuperGod commented:

May I ask whether data compression and fragmentation affect the cluster after etcd has stored 50GB of data? And how long do large-scale insert/query operations take after those operations complete?

@ronaldngounou ronaldngounou force-pushed the issue588-lift_etcd_GB_limit branch from 49407a6 to 8f3c651 on December 5, 2025 05:44
@ronaldngounou (Member Author) commented:

@ToSuperGod

May I ask whether data compression and fragmentation affect the cluster after etcd has stored 50GB of data? And how long do large-scale insert/query operations take after those operations complete?

When etcd stores large amounts of data (like 50GB, which is quite large for etcd), several things happen:
The main issue is fragmentation in the backend BoltDB database. As etcd performs updates and deletes, it creates "holes" in the database file. This fragmentation affects both disk usage and read performance since the database file becomes much larger than the actual data it contains.
etcd does not actually compress data at the storage layer (BoltDB stores pages uncompressed); the bigger concern is the growth of the database file itself. With heavy fragmentation, you might see a 50GB logical database consuming significantly more disk space, and reads having to traverse more pages.
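
One way to gauge this fragmentation is to compare the physical file size with the logically in-use size reported by the maintenance API. A minimal sketch with the Go client, assuming an etcd v3.4+ member reachable at localhost:2379 (older releases do not report `DbSizeInUse`):

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"localhost:2379"}, // assumption: local single member
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// Status reports the on-disk size (DbSize) and, on v3.4+, the size of the
	// data actually in use (DbSizeInUse); a large gap suggests fragmentation.
	st, err := cli.Status(ctx, "localhost:2379")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("db size: %d bytes, in use: %d bytes\n", st.DbSize, st.DbSizeInUse)
}
```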

Cluster Impact:
Performance degradation occurs in several ways: slower reads from the apiserver (affecting all kubectl commands and controller reconciliation loops), increased latency for watch operations (which controllers depend on), and potential leader election issues if followers fall too far behind during compaction.

The most critical impact is on write performance - if etcd is struggling, the entire Kubernetes control plane slows down because every resource change goes through etcd.

Compaction and Defragmentation:
After compaction (which removes old revisions) and defragmentation (which rebuilds the database file), you should see significant improvement. Defragmentation is particularly important - it rewrites the database to eliminate holes.
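
A rough sketch of those two steps with the Go client, assuming a single member at localhost:2379; in a real cluster you would defragment members one at a time, since defragmentation blocks the member while the file is rewritten:

```go
package main

import (
	"context"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"localhost:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	ctx, cancel := context.WithTimeout(context.Background(), time.Minute)
	defer cancel()

	// Read any key just to learn the current store revision from the header.
	resp, err := cli.Get(ctx, "sample-key")
	if err != nil {
		log.Fatal(err)
	}

	// Compact away revisions older than the current one.
	if _, err := cli.Compact(ctx, resp.Header.Revision); err != nil {
		log.Fatal(err)
	}

	// Defragmentation rewrites the backend database file to reclaim the
	// space freed by compaction; it is issued per endpoint.
	if _, err := cli.Defragment(ctx, "localhost:2379"); err != nil {
		log.Fatal(err)
	}
	log.Println("compaction and defragmentation completed")
}
```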

Timing for Large Operations:
For insertion/query timing after these operations, it really depends on your specific setup, but generally:

  • After proper compaction and defrag, query latency should drop significantly (potentially 50-90% improvement if fragmentation was severe)

  • Large-scale insertions should also improve, but etcd has hard limits - it's designed for storing configuration/state, not as a general-purpose database

  • etcd has historically recommended keeping the database under 8GB in practice (the figure this PR raises to 100GB in the docs)

If you're consistently storing 50GB in etcd, that's a red flag - you might need to rethink what you're storing there. Consider if you're inadvertently storing large ConfigMaps/Secrets or have resource leaks.

@ronaldngounou ronaldngounou force-pushed the issue588-lift_etcd_GB_limit branch from 8f3c651 to 1faefe1 on December 5, 2025 06:02