Description
Search before asking
- I had searched in the issues and found no similar issues.
Version
Using Kubernetes with Docker image: datafuselabs/databend-meta:v1.2.680-p3
What's Wrong?
When databend-meta starts, it ignores the address values in the config file and loads them from the raft log instead.
Here is the startup log for a node. Most importantly, you can see that we want these nodes to join in the clone3 namespace.
Databend Metasrv
Version: v1.2.680-p3-4c4896dc57-simd(1.85.0-nightly-2025-01-14T09:48:25.167808264Z)
Working DataVersion: V004(2024-11-11: WAL based raft-log)
Raft Feature set:
Server Provide: { append:v0, install_snapshot:v1, install_snapshot:v3, vote:v0 }
Client Require: { append:v0, install_snapshot:v3, vote:v0 }
Disk Data: V004(2024-11-11: WAL based raft-log); Upgrading: None
Dir: /data/databend-meta/raft
Log File: enabled=false, level=INFO, dir=/data/databend-meta/log, format=json, limit=48, prefix_filter=
Stderr: enabled=true, level=WARN, format=text
Raft Id: 2; Cluster: databend
Dir: /data/databend-meta/raft
Status: join ["plaid-databend-meta-0.plaid-databend-meta.pw-clone3.svc.cluster.local:28004", "plaid-databend-meta-1.plaid-databend-meta.pw-clone3.svc.cluster.local:28004", "plaid-databend-meta-2.plaid-databend-meta.pw-clone3.svc.cluster.local:28004"]
HTTP API listen at: 0.0.0.0:28002
gRPC API listen at: 0.0.0.0:9191 advertise: plaid-databend-meta-2.plaid-databend-meta.pw-clone3.svc.cluster.local:9191
Raft API listen at: 0.0.0.0:28004 advertise: plaid-databend-meta-2.plaid-databend-meta.pw-clone3.svc.cluster.local:28004
Upgrade ondisk data if out of date: V004
Find and clean previous unfinished upgrading
Upgrade ondisk data finished: V004
But look at what happens if you check with databend-metactl status:
root@plaid-databend-meta-0:/# ./databend-metactl status
BinaryVersion: v1.2.680-p3-4c4896dc57-simd(1.85.0-nightly-2025-01-14T09:48:25.167808264Z)
DataVersion: V004
RaftLogSize: 5261946
RaftLog:
- CacheItems: 4604
- CacheUsedSize: 5760356
- WALTotalSize: 5261946
- WALOpenChunkSize: 17553
- WALOffset: 5261946
- WALClosedChunkCount: 3
- WALClosedChunkTotalSize: 5244393
- WALClosedChunkSizes:
- ChunkId(00_000_000_000_000_000_000): 5243001
- ChunkId(00_000_000_000_005_243_001): 196
- ChunkId(00_000_000_000_005_243_197): 1196
SnapshotKeyCount: 60742
Node: id=0 raft=plaid-databend-meta-0.plaid-databend-meta.pw-clone.svc.cluster.local:28004
State: Leader
Leader: id=0 raft=plaid-databend-meta-0.plaid-databend-meta.pw-clone.svc.cluster.local:28004 grpc=plaid-databend-meta-0.plaid-databend-meta.pw-clone.svc.cluster.local:9191
CurrentTerm: 83091
LastSeq: 563264
LastLogIndex: 438193
LastApplied: T83091-N0.438173
SnapshotLastLogID: T83089-N0.437685
Purged: T83086-N1.433589
Replication:
- [0] T83091-N0.438193 *
Voters:
- id=0 raft=plaid-databend-meta-0.plaid-databend-meta.pw-clone.svc.cluster.local:28004 grpc=plaid-databend-meta-0.plaid-databend-meta.pw-clone.svc.cluster.local:9191
- id=1 raft=plaid-databend-meta-1.plaid-databend-meta.pw-clone.svc.cluster.local:28004 grpc=plaid-databend-meta-1.plaid-databend-meta.pw-clone.svc.cluster.local:9191
- id=2 raft=plaid-databend-meta-2.plaid-databend-meta.pw-clone.svc.cluster.local:28004 grpc=plaid-databend-meta-2.plaid-databend-meta.pw-clone.svc.cluster.local:9191
You can see it took the connection strings from the raft log and replaced all the connection info, which breaks all nodes.
How to Reproduce?
Take a databend-meta backup from one cluster/namespace, or one set of servers.
Restore that backup to another cluster/namespace or another set of servers.
Are you willing to submit PR?
- Yes I am willing to submit a PR!
inviscid commented on Jan 16, 2025
This appears to be a regression of some kind since this used to work correctly in the past.
drmingdrmer commented on Jan 17, 2025
Yes, this is a designed feature. The raft advertisement address is stored in the raft log and cannot be changed unless you execute a membership change command. Therefore, regardless of what the raft address is in the config file, it will be ignored after the cluster is initialized.
If you want to change the raft address, you will need to remove the node and add a new one with the updated address. However, this behavior is different for the gRPC address, which is updated every time the corresponding node is restarted.
If there is an issue with your scenario, could you clarify which field in the config file is being ignored or replaced?
drmingdrmer commented on Jan 17, 2025
The advertised address has been strictly prohibited from changing since the very first day. The address must not be changed under any circumstances: any alteration to the address can trigger a split-brain issue.
In contrast, the gRPC address can be modified during a system restart. This ability to change the gRPC address during restart was added as a feature about a year ago.
kkapper commented on Jan 17, 2025
@drmingdrmer What would happen if you needed to migrate your databend meta cluster to another datacenter? For disaster recovery or changing hosting providers.
You would need to set all of the IP addresses in the new architecture to be the same as they were in the original?
kkapper commented on Jan 17, 2025
This is especially problematic on Kubernetes, because we have the ability to move things across namespaces quite easily, which changes addresses, and in general practice we can never guarantee we will hold the same IP address for long.
kkapper commented on Jan 17, 2025
The V003 convention seems to behave in the way I would expect, where the raft log only retained the entry for the node by index, and mapped the value when the log was replayed:
["header",{"DataHeader":{"key":"header","value":{"version":"V003","upgrading":null}}}] ["raft_state",{"RaftStateKV":{"key":"Id","value":{"NodeId":0}}}] ["raft_state",{"RaftStateKV":{"key":"HardState","value":{"HardState":{"leader_id":{"term":155936,"node_id":2},"committed":true}}}}] ["raft_state",{"RaftStateKV":{"key":"Committed","value":{"Committed":{"leader_id":{"term":155936,"node_id":2},"index":163988}}}}]
["raft_log",{"Logs":{"key":61440,"value":{"log_id":{"leader_id":{"term":712,"node_id":1},"index":61440},"payload":{"Normal":{"txid":null,"time_ms":1720357064373,"cmd":{"UpsertKV":{"key":"__fd_clusters_v2/plaid/default/databend_query/72BYsTnAMhgr4njReCgTX4","seq":{"GE":1},"value":"AsIs","value_meta":{"expire_at":null,"ttl":{"millis":60000}}}}}}}}}]
kkapper commented on Jan 17, 2025
Storing any connection data via IP is going to break Kubernetes implementations for sure.
Is it possible that we could get a "log by index" or "log by ip" toggle?
drmingdrmer commented on Jan 17, 2025
The standard process is to back up data from the running databend-meta cluster with:
and then restore it to a new cluster in another DC with the updated peer addresses specified:
See: https://docs.databend.com/guides/deploy/deploy/production/metasrv-backup-restore#import-data-as-a-new-databend-meta-cluster
drmingdrmer commented on Jan 17, 2025
You do not need to hold the same IP, but the hostname for raft-advertise-address should be kept consistent during migration.
drmingdrmer commented on Jan 17, 2025
There is no behavior change in any version. What is the unexpected behavior you have encountered?
drmingdrmer commented on Jan 17, 2025
What do you mean by “log by IP”? What kind of information are you trying to obtain using the IP address?
kkapper commented on Jan 20, 2025
This does not work when importing into a new environment.
The address specified for --initial-cluster will be replaced with what came from the backup being imported.
kkapper commented on Jan 20, 2025
Here is an example of a backup and restore with different addresses:
Here is the import into a brand new databend cluster.
When running ./databend-metactl status right after the import finishes.
kkapper commented on Jan 20, 2025
You can see the voting addresses are different from the member addresses of the current cluster.
kkapper commented on Jan 20, 2025
Specifically in the backup being restored, you can see the endpoints are included in the backup set:
My proposal is that a backup should not have any concern for the hostnames of the source cluster.
For example, a Redis AOF can be replayed on any new cluster.
kkapper commented on Jan 20, 2025
@drmingdrmer I should probably rephrase:
This issue should read: Databend Meta Backups Use The Source Cluster Hostnames Instead Of The Destination.
What I'd like to see exactly:
Databend backups only include the name of the node in each line, which would enable a backup to be transferred from any source cluster, to any destination.
drmingdrmer commented on Jan 21, 2025
That is impossible. The hostname is part of the raft log; removing a portion of the raft log just results in data inconsistency.
The raft-advertise-address should be updated when the --initial-cluster argument is specified. In order to find out what's going wrong, can you re-export the data from the restored databend-meta service? For example:
databend-metactl --export --raft-dir .databend/new_meta1
after shutting down the restored databend-meta service, and grep for AddNode. There should be several lines of raft-log that add the new cluster configuration to override the existing ones.
Also make sure the config file you were using to start databend-meta contains the correct new cluster addresses.
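The grep check can be simulated end to end on a tiny synthetic export file. The two JSON lines below are fabricated stand-ins (the real AddNode record shape may differ); they exist purely to show that filtering for AddNode isolates the membership-changing entries from ordinary key-value writes:

```shell
#!/bin/sh
# Write a two-line fake export, then filter for membership-changing entries.
cat > exported.txt <<'EOF'
["raft_log",{"Logs":{"key":1,"value":{"payload":{"Normal":{"cmd":{"AddNode":{"node_id":1}}}}}}}]
["raft_log",{"Logs":{"key":2,"value":{"payload":{"Normal":{"cmd":{"UpsertKV":{"key":"x"}}}}}}}]
EOF
grep AddNode exported.txt   # only the membership entry should match
rm exported.txt
```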