Recommended storage backend for Milvus etcd on Kubernetes: Local NVMe RAID0 vs network NVMe persistent disk #49953
Unanswered
karantyagi-reltio
asked this question in
Q&A and General discussion
Environment
Milvus version: v2.6.12
Deployment mode (standalone or cluster): cluster
We are evaluating the best storage option for Milvus etcd when running Milvus on Kubernetes across cloud providers.
Our current setup:
We are comparing two approaches:
Local NVMe SSD + RAID0 + local-static-volume-provisioner
Network-attached persistent NVMe disk
Question:
For Milvus etcd, which storage backend is recommended in production?
Specifically:
Additionally, if we use Local NVMe SSD with RAID0 for etcd storage and two Kubernetes nodes fail simultaneously, would the Milvus cluster still be recoverable?
Our concern is that Local NVMe SSD is node-local and ephemeral. In the event of node failure or node replacement, the underlying Local NVMe RAID0 data would be permanently lost. Since etcd stores critical Milvus metadata, we would like to understand:
We would appreciate clarification on the recommended production architecture and on what recovery to expect in these failure scenarios.
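As a rough sketch of the quorum arithmetic behind the two-node failure question: etcd uses Raft, so a cluster of N members needs a majority (N // 2 + 1) of them healthy to keep serving writes. The member counts below (3 and 5) are assumptions for illustration, not our confirmed replica count, and the sketch assumes each etcd member runs on its own Kubernetes node, so that a failed node takes exactly one member with it. The function names are hypothetical helpers, not part of any etcd or Milvus API.

```python
def quorum(members: int) -> int:
    """Smallest majority of a Raft cluster with `members` voting members."""
    return members // 2 + 1


def failures_tolerated(members: int) -> int:
    """Number of members that can be lost while a majority still remains."""
    return members - quorum(members)


def keeps_quorum(members: int, failed: int) -> bool:
    """True if the surviving members still form a majority."""
    return members - failed >= quorum(members)


if __name__ == "__main__":
    # Assumed cluster sizes; replace with the actual etcd replica count.
    for n in (3, 5):
        print(
            f"{n} members: quorum={quorum(n)}, "
            f"tolerates={failures_tolerated(n)} failure(s), "
            f"survives 2 simultaneous failures={keeps_quorum(n, 2)}"
        )
```

Under these assumptions, a 3-member etcd cluster loses quorum when two nodes fail at once, while a 5-member cluster does not. With node-local disks, the failed members' data is also gone, so recovery after quorum loss would generally depend on having an etcd snapshot or other backup taken beforehand.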
Also, if we migrate existing PVC data to a new PVC backed by Local NVMe SSD, is there any risk of data inconsistency or data loss during or after the migration?
Specifically:
We would also like to understand the recommended migration and recovery strategy before moving existing Milvus workloads to Local NVMe-backed storage.