
bug: Databend Meta Log Data Overwrites Config Values #17296

Open

Description

@kkapper

Search before asking

  • I had searched in the issues and found no similar issues.

Version

Using Kubernetes with Docker image: datafuselabs/databend-meta:v1.2.680-p3

What's Wrong?

When databend-meta starts, it ignores the config values and loads them from the raft logs instead.

Here is the startup log for a node. Most importantly, you can see that we want these nodes to join the clone3 namespace.

Databend Metasrv

Version: v1.2.680-p3-4c4896dc57-simd(1.85.0-nightly-2025-01-14T09:48:25.167808264Z)
Working DataVersion: V004(2024-11-11: WAL based raft-log)

Raft Feature set:
Server Provide: { append:v0, install_snapshot:v1, install_snapshot:v3, vote:v0 }
Client Require: { append:v0, install_snapshot:v3, vote:v0 }

Disk Data: V004(2024-11-11: WAL based raft-log); Upgrading: None
Dir: /data/databend-meta/raft

Log File: enabled=false, level=INFO, dir=/data/databend-meta/log, format=json, limit=48, prefix_filter=
Stderr: enabled=true, level=WARN, format=text
Raft Id: 2; Cluster: databend
Dir: /data/databend-meta/raft
Status: join ["plaid-databend-meta-0.plaid-databend-meta.pw-clone3.svc.cluster.local:28004", "plaid-databend-meta-1.plaid-databend-meta.pw-clone3.svc.cluster.local:28004", "plaid-databend-meta-2.plaid-databend-meta.pw-clone3.svc.cluster.local:28004"]

HTTP API listen at: 0.0.0.0:28002
gRPC API listen at: 0.0.0.0:9191 advertise: plaid-databend-meta-2.plaid-databend-meta.pw-clone3.svc.cluster.local:9191
Raft API listen at: 0.0.0.0:28004 advertise: plaid-databend-meta-2.plaid-databend-meta.pw-clone3.svc.cluster.local:28004
Upgrade ondisk data if out of date: V004
Find and clean previous unfinished upgrading
Upgrade ondisk data finished: V004

But look at what happens if you check with databend-metactl status:

root@plaid-databend-meta-0:/# ./databend-metactl status
BinaryVersion: v1.2.680-p3-4c4896dc57-simd(1.85.0-nightly-2025-01-14T09:48:25.167808264Z)
DataVersion: V004
RaftLogSize: 5261946
RaftLog:
  - CacheItems: 4604
  - CacheUsedSize: 5760356
  - WALTotalSize: 5261946
  - WALOpenChunkSize: 17553
  - WALOffset: 5261946
  - WALClosedChunkCount: 3
  - WALClosedChunkTotalSize: 5244393
  - WALClosedChunkSizes:
    - ChunkId(00_000_000_000_000_000_000): 5243001
    - ChunkId(00_000_000_000_005_243_001): 196
    - ChunkId(00_000_000_000_005_243_197): 1196
SnapshotKeyCount: 60742
Node: id=0 raft=plaid-databend-meta-0.plaid-databend-meta.pw-clone.svc.cluster.local:28004
State: Leader
Leader: id=0 raft=plaid-databend-meta-0.plaid-databend-meta.pw-clone.svc.cluster.local:28004 grpc=plaid-databend-meta-0.plaid-databend-meta.pw-clone.svc.cluster.local:9191
CurrentTerm: 83091
LastSeq: 563264
LastLogIndex: 438193
LastApplied: T83091-N0.438173
SnapshotLastLogID: T83089-N0.437685
Purged: T83086-N1.433589
Replication:
  - [0] T83091-N0.438193 *
Voters:
  - id=0 raft=plaid-databend-meta-0.plaid-databend-meta.pw-clone.svc.cluster.local:28004 grpc=plaid-databend-meta-0.plaid-databend-meta.pw-clone.svc.cluster.local:9191
  - id=1 raft=plaid-databend-meta-1.plaid-databend-meta.pw-clone.svc.cluster.local:28004 grpc=plaid-databend-meta-1.plaid-databend-meta.pw-clone.svc.cluster.local:9191
  - id=2 raft=plaid-databend-meta-2.plaid-databend-meta.pw-clone.svc.cluster.local:28004 grpc=plaid-databend-meta-2.plaid-databend-meta.pw-clone.svc.cluster.local:9191

You can see it took the connection strings from the logs and replaced all of the connection info, which breaks all nodes.

How to Reproduce?

Take a databend-meta backup from one cluster/namespace, or one set of servers.

Restore that backup to another cluster/namespace, or to another set of servers.
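
A minimal sketch of those two steps, assuming a three-node cluster and the export/import commands that appear later in this thread (the hostnames, paths, and the NEW-NAMESPACE placeholder are illustrative only):

# on the source cluster/namespace: take a backup
databend-metactl export --grpc-api-address "127.0.0.1:9191" --db meta.db

# on destination node 0: restore with the new peer addresses (repeat with --id=1 and --id=2 on the other nodes)
databend-metactl import --raft-dir /data/databend-meta/raft --db meta.db \
    --id=0 \
    --initial-cluster 0=plaid-databend-meta-0.plaid-databend-meta.NEW-NAMESPACE.svc.cluster.local:28004 \
    --initial-cluster 1=plaid-databend-meta-1.plaid-databend-meta.NEW-NAMESPACE.svc.cluster.local:28004 \
    --initial-cluster 2=plaid-databend-meta-2.plaid-databend-meta.NEW-NAMESPACE.svc.cluster.local:28004

# then inspect what the restored node reports
./databend-metactl status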

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Activity

C-bug (Category: something isn't working) label added on Jan 15, 2025

inviscid commented on Jan 16, 2025

This appears to be a regression of some kind since this used to work correctly in the past.

drmingdrmer (Member) commented on Jan 17, 2025

Yes, this is by design. The raft advertise address is stored in the raft log and cannot be changed unless you execute a membership change command. Therefore, regardless of what the raft address is in the config file, it will be ignored after the cluster is initialized.

If you want to change the raft address, you will need to remove the node and add a new one with the updated address. However, this behavior is different for the gRPC address, which is updated every time the corresponding node is restarted.
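
That remove-and-re-add flow looks roughly like this (a sketch only, with placeholder values; check databend-meta --help for the exact flags in your version):

# on any running member: remove the node whose raft address must change
databend-meta --leave-via <running-member-raft-addr> --leave-id <node-id>

# then start a fresh databend-meta whose config carries the new raft advertise address,
# and join it to the cluster through an existing member's raft address
databend-meta --config-file ./new-node.toml --join <running-member-raft-addr>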

If there is an issue with your scenario, could you clarify which field in the config file is being ignored or replaced?

drmingdrmer (Member) commented on Jan 17, 2025

The advertise address has been strictly prohibited from changing since the very first day. The address must not be changed under any circumstances: any alteration to it can trigger a split-brain issue.

In contrast, the gRPC address can be modified during a restart. This ability to change the gRPC address during restart was added as a feature about a year ago.

kkapper (Author) commented on Jan 17, 2025

@drmingdrmer What would happen if you needed to migrate your databend-meta cluster to another datacenter, for disaster recovery or for changing hosting providers?

Would you need to set all of the IP addresses in the new architecture to be the same as they were in the original?

kkapper (Author) commented on Jan 17, 2025

This is especially problematic on Kubernetes, because we have the ability to move things across namespaces quite easily, which changes addresses, and in general practice we can never guarantee we will hold the same IP address for long.

kkapper (Author) commented on Jan 17, 2025

The V003 convention seems to behave in the way I would expect, where the raft log only retained the entry for the node by index, and mapped the value when the log was replayed:

["header",{"DataHeader":{"key":"header","value":{"version":"V003","upgrading":null}}}] ["raft_state",{"RaftStateKV":{"key":"Id","value":{"NodeId":0}}}] ["raft_state",{"RaftStateKV":{"key":"HardState","value":{"HardState":{"leader_id":{"term":155936,"node_id":2},"committed":true}}}}] ["raft_state",{"RaftStateKV":{"key":"Committed","value":{"Committed":{"leader_id":{"term":155936,"node_id":2},"index":163988}}}}]

["raft_log",{"Logs":{"key":61440,"value":{"log_id":{"leader_id":{"term":712,"node_id":1},"index":61440},"payload":{"Normal":{"txid":null,"time_ms":1720357064373,"cmd":{"UpsertKV":{"key":"__fd_clusters_v2/plaid/default/databend_query/72BYsTnAMhgr4njReCgTX4","seq":{"GE":1},"value":"AsIs","value_meta":{"expire_at":null,"ttl":{"millis":60000}}}}}}}}}]

kkapper (Author) commented on Jan 17, 2025

Storing any connection data via IP is going to break Kubernetes implementations for sure.

Is it possible that we could get a log by index or log by ip toggle?

drmingdrmer (Member) commented on Jan 17, 2025

@drmingdrmer What would happen if you needed to migrate your databend-meta cluster to another datacenter, for disaster recovery or for changing hosting providers?

Would you need to set all of the IP addresses in the new architecture to be the same as they were in the original?

The standard process is to back up data from the running databend-meta cluster with:

databend-metactl export --grpc-api-address "127.0.0.1:9191" --db meta.db

and then restore to a new cluster in another DC with the updated peer addresses specified:

databend-metactl import --raft-dir ./.databend/new_meta1 --db meta.db \
    --id=1 \
    --initial-cluster 1=localhost:29103 \
    --initial-cluster 2=localhost:29203 \
    --initial-cluster 3=localhost:29303
databend-metactl import --raft-dir ./.databend/new_meta2 --db meta.db \
    --id=2 \
    --initial-cluster 1=localhost:29103 \
    --initial-cluster 2=localhost:29203 \
    --initial-cluster 3=localhost:29303
databend-metactl import --raft-dir ./.databend/new_meta3 --db meta.db \
    --id=3 \
    --initial-cluster 1=localhost:29103 \
    --initial-cluster 2=localhost:29203 \
    --initial-cluster 3=localhost:29303

See: https://docs.databend.com/guides/deploy/deploy/production/metasrv-backup-restore#import-data-as-a-new-databend-meta-cluster
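
After the import, a quick sanity check (a sketch; run it on one of the restored nodes) is to confirm that the Voters reported by the status command carry the addresses passed via --initial-cluster:

# every entry under Voters should show the new DC's raft/grpc addresses
./databend-metactl status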

drmingdrmer (Member) commented on Jan 17, 2025

This is especially problematic on Kubernetes, because we have the ability to move things across namespaces quite easily, which changes addresses, and in general practice we can never guarantee we will hold the same IP address for long.

You do not need to hold the same IP, but the hostname for raft-advertise-address should be kept consistent during the migration.
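
For example, on Kubernetes the raft-advertise-address can be the stable DNS name provided by the StatefulSet's headless service rather than a pod IP. An illustrative excerpt of one node's config, written here as a shell heredoc (only the address-related keys are shown; verify the key names against the sample config shipped with your version):

cat > databend-meta.toml <<'EOF'
[raft_config]
id                  = 2
raft_listen_host    = "0.0.0.0"
raft_api_port       = 28004
# stable headless-service DNS name, not a pod IP
raft_advertise_host = "plaid-databend-meta-2.plaid-databend-meta.pw-clone3.svc.cluster.local"
EOF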

drmingdrmer (Member) commented on Jan 17, 2025

The V003 convention seems to behave in the way I would expect, where the raft log only retained the entry for the node by index, and mapped the value when the log was replayed:

["header",{"DataHeader":{"key":"header","value":{"version":"V003","upgrading":null}}}] ["raft_state",{"RaftStateKV":{"key":"Id","value":{"NodeId":0}}}] ["raft_state",{"RaftStateKV":{"key":"HardState","value":{"HardState":{"leader_id":{"term":155936,"node_id":2},"committed":true}}}}] ["raft_state",{"RaftStateKV":{"key":"Committed","value":{"Committed":{"leader_id":{"term":155936,"node_id":2},"index":163988}}}}]

["raft_log",{"Logs":{"key":61440,"value":{"log_id":{"leader_id":{"term":712,"node_id":1},"index":61440},"payload":{"Normal":{"txid":null,"time_ms":1720357064373,"cmd":{"UpsertKV":{"key":"__fd_clusters_v2/plaid/default/databend_query/72BYsTnAMhgr4njReCgTX4","seq":{"GE":1},"value":"AsIs","value_meta":{"expire_at":null,"ttl":{"millis":60000}}}}}}}}}]

There is no behavior change in any version. What is the unexpected behavior you have encountered?

drmingdrmer (Member) commented on Jan 17, 2025

Storing any connection data via IP is going to break Kubernetes implementations for sure.

Is it possible that we could get a log by index or log by ip toggle?

What do you mean by “log by IP”? What kind of information are you trying to obtain using the IP address?

kkapper (Author) commented on Jan 20, 2025

@drmingdrmer What would happen if you needed to migrate your databend-meta cluster to another datacenter, for disaster recovery or for changing hosting providers?
Would you need to set all of the IP addresses in the new architecture to be the same as they were in the original?

The standard process is to back up data from the running databend-meta cluster with:

databend-metactl export --grpc-api-address "127.0.0.1:9191" --db meta.db

and then restore to a new cluster in another DC with the updated peer addresses specified:

databend-metactl import --raft-dir ./.databend/new_meta1 --db meta.db \
    --id=1 \
    --initial-cluster 1=localhost:29103 \
    --initial-cluster 2=localhost:29203 \
    --initial-cluster 3=localhost:29303
databend-metactl import --raft-dir ./.databend/new_meta2 --db meta.db \
    --id=2 \
    --initial-cluster 1=localhost:29103 \
    --initial-cluster 2=localhost:29203 \
    --initial-cluster 3=localhost:29303
databend-metactl import --raft-dir ./.databend/new_meta3 --db meta.db \
    --id=3 \
    --initial-cluster 1=localhost:29103 \
    --initial-cluster 2=localhost:29203 \
    --initial-cluster 3=localhost:29303

See: https://docs.databend.com/guides/deploy/deploy/production/metasrv-backup-restore#import-data-as-a-new-databend-meta-cluster

This does not work when importing into a new environment.

The addresses specified with --initial-cluster are replaced by what came from the backup being imported.

kkapper (Author) commented on Jan 20, 2025

Here is an example of a backup and restore with different addresses:

initialize leader node

Import:
Into Meta Dir: '/data/databend-meta/raft'
Initialize Cluster with Id: 0, cluster: {
Peer: 0=plaid-databend-meta-0:28004
Peer: 1=plaid-databend-meta-1:28004
Peer: 2=plaid-databend-meta-2:28004
}
Initialize Cluster: id=0, ["0=plaid-databend-meta-0:28004", "1=plaid-databend-meta-1:28004", "2=plaid-databend-meta-2:28004"]
peer:0=plaid-databend-meta-0:28004
new cluster node:id=0 raft=plaid-databend-meta-0:28004 grpc=
peer:1=plaid-databend-meta-1:28004
new cluster node:id=1 raft=plaid-databend-meta-1:28004 grpc=
peer:2=plaid-databend-meta-2:28004
new cluster node:id=2 raft=plaid-databend-meta-2:28004 grpc=

^ Here is the import into a brand new databend cluster.

Here is the output of running ./databend-metactl status right after the import finishes:

root@plaid-databend-meta-0:/# ./databend-metactl status
BinaryVersion: v1.2.680-p3-4c4896dc57-simd(1.85.0-nightly-2025-01-14T09:48:25.167808264Z)
DataVersion: V004
RaftLogSize: 78544686
RaftLog:
  - CacheItems: 102578
  - CacheUsedSize: 74142898
  - WALTotalSize: 78544686
  - WALOpenChunkSize: 741
  - WALOffset: 78544686
  - WALClosedChunkCount: 4
  - WALClosedChunkTotalSize: 78543945
  - WALClosedChunkSizes:
    - ChunkId(00_000_000_000_000_000_000): 77957284
    - ChunkId(00_000_000_000_077_957_284): 585442
    - ChunkId(00_000_000_000_078_542_726): 188
    - ChunkId(00_000_000_000_078_542_914): 1031
SnapshotKeyCount: 262486
Node: id=0 raft=plaid-databend-meta-0.plaid-databend-meta.pw-superkellendev.svc.cluster.local:28004
State: Candidate
CurrentTerm: 43495
LastSeq: 2239773
LastLogIndex: 1895021
LastApplied: T43488-N0.1895016
SnapshotLastLogID: T43484-N2.1894843
Purged: T41914-N1.1792443
Voters:
  - id=0 raft=plaid-databend-meta-0.plaid-databend-meta.pw-superkellendev.svc.cluster.local:28004 grpc=plaid-databend-meta-0.plaid-databend-meta.pw-superkellendev.svc.cluster.local:9191
  - id=1 raft=plaid-databend-meta-1.plaid-databend-meta.pw-superkellendev.svc.cluster.local:28004 grpc=plaid-databend-meta-1.plaid-databend-meta.pw-superkellendev.svc.cluster.local:9191
  - id=2 raft=plaid-databend-meta-2.plaid-databend-meta.pw-superkellendev.svc.cluster.local:28004 grpc=plaid-databend-meta-2.plaid-databend-meta.pw-superkellendev.svc.cluster.local:9191

kkapper (Author) commented on Jan 20, 2025

You can see the voting addresses are different from the member addresses of the current cluster.

kkapper (Author) commented on Jan 20, 2025

The V003 convention seems to behave in the way I would expect, where the raft log only retained the entry for the node by index, and mapped the value when the log was replayed:
["header",{"DataHeader":{"key":"header","value":{"version":"V003","upgrading":null}}}] ["raft_state",{"RaftStateKV":{"key":"Id","value":{"NodeId":0}}}] ["raft_state",{"RaftStateKV":{"key":"HardState","value":{"HardState":{"leader_id":{"term":155936,"node_id":2},"committed":true}}}}] ["raft_state",{"RaftStateKV":{"key":"Committed","value":{"Committed":{"leader_id":{"term":155936,"node_id":2},"index":163988}}}}]
["raft_log",{"Logs":{"key":61440,"value":{"log_id":{"leader_id":{"term":712,"node_id":1},"index":61440},"payload":{"Normal":{"txid":null,"time_ms":1720357064373,"cmd":{"UpsertKV":{"key":"__fd_clusters_v2/plaid/default/databend_query/72BYsTnAMhgr4njReCgTX4","seq":{"GE":1},"value":"AsIs","value_meta":{"expire_at":null,"ttl":{"millis":60000}}}}}}}}}]

There is no behavior change in any version. What is the unexpected behavior you have encountered?

The entirety of the example below shows the unexpected behavior:

Here is an example of a backup and restore with different addresses:

initialize leader node

Import:
Into Meta Dir: '/data/databend-meta/raft'
Initialize Cluster with Id: 0, cluster: {
Peer: 0=plaid-databend-meta-0:28004
Peer: 1=plaid-databend-meta-1:28004
Peer: 2=plaid-databend-meta-2:28004
}
Initialize Cluster: id=0, ["0=plaid-databend-meta-0:28004", "1=plaid-databend-meta-1:28004", "2=plaid-databend-meta-2:28004"]
peer:0=plaid-databend-meta-0:28004
new cluster node:id=0 raft=plaid-databend-meta-0:28004 grpc=
peer:1=plaid-databend-meta-1:28004
new cluster node:id=1 raft=plaid-databend-meta-1:28004 grpc=
peer:2=plaid-databend-meta-2:28004
new cluster node:id=2 raft=plaid-databend-meta-2:28004 grpc=

^ Here is the import into a brand new databend cluster.

Here is the output of running ./databend-metactl status right after the import finishes:

root@plaid-databend-meta-0:/# ./databend-metactl status
BinaryVersion: v1.2.680-p3-4c4896dc57-simd(1.85.0-nightly-2025-01-14T09:48:25.167808264Z)
DataVersion: V004
RaftLogSize: 78544686
RaftLog:
  - CacheItems: 102578
  - CacheUsedSize: 74142898
  - WALTotalSize: 78544686
  - WALOpenChunkSize: 741
  - WALOffset: 78544686
  - WALClosedChunkCount: 4
  - WALClosedChunkTotalSize: 78543945
  - WALClosedChunkSizes:
    - ChunkId(00_000_000_000_000_000_000): 77957284
    - ChunkId(00_000_000_000_077_957_284): 585442
    - ChunkId(00_000_000_000_078_542_726): 188
    - ChunkId(00_000_000_000_078_542_914): 1031
SnapshotKeyCount: 262486
Node: id=0 raft=plaid-databend-meta-0.plaid-databend-meta.pw-superkellendev.svc.cluster.local:28004
State: Candidate
CurrentTerm: 43495
LastSeq: 2239773
LastLogIndex: 1895021
LastApplied: T43488-N0.1895016
SnapshotLastLogID: T43484-N2.1894843
Purged: T41914-N1.1792443
Voters:
  - id=0 raft=plaid-databend-meta-0.plaid-databend-meta.pw-superkellendev.svc.cluster.local:28004 grpc=plaid-databend-meta-0.plaid-databend-meta.pw-superkellendev.svc.cluster.local:9191
  - id=1 raft=plaid-databend-meta-1.plaid-databend-meta.pw-superkellendev.svc.cluster.local:28004 grpc=plaid-databend-meta-1.plaid-databend-meta.pw-superkellendev.svc.cluster.local:9191
  - id=2 raft=plaid-databend-meta-2.plaid-databend-meta.pw-superkellendev.svc.cluster.local:28004 grpc=plaid-databend-meta-2.plaid-databend-meta.pw-superkellendev.svc.cluster.local:9191

Specifically, in the backup being restored, you can see the source cluster's endpoints are included in the backup set:

["raft_log",{"LogEntry":{"log_id":{"leader_id":{"term":42013,"node_id":2},"index":1844602},"payload":{"Normal":{"txid":null,"cmd":{"AddNode":{"node_id":0,"node":{"name":"0","endpoint":{"addr":"plaid-databend-meta-0.plaid-databend-meta.pw-superkellendev.svc.cluster.local","port":28004},"grpc_api_advertise_address":null},"overriding":true}}}}}}]
["raft_log",{"LogEntry":{"log_id":{"leader_id":{"term":42013,"node_id":2},"index":1844603},"payload":{"Normal":{"txid":null,"cmd":{"AddNode":{"node_id":1,"node":{"name":"1","endpoint":{"addr":"plaid-databend-meta-1.plaid-databend-meta.pw-superkellendev.svc.cluster.local","port":28004},"grpc_api_advertise_address":null},"overriding":true}}}}}}]
["raft_log",{"LogEntry":{"log_id":{"leader_id":{"term":42013,"node_id":2},"index":1844604},"payload":{"Normal":{"txid":null,"cmd":{"AddNode":{"node_id":2,"node":{"name":"2","endpoint":{"addr":"plaid-databend-meta-2.plaid-databend-meta.pw-superkellendev.svc.cluster.local","port":28004},"grpc_api_advertise_address":null},"overriding":true}}}}}}]

My proposal is that a backup should not have any concern for the hostnames of the source cluster.

For example, a Redis AOF can be replayed on any new cluster.

kkapper (Author) commented on Jan 20, 2025

@drmingdrmer I should probably rephrase:

This issue should read: Databend Meta Backups Use The Source Cluster Hostnames Instead Of The Destination.

What I'd like to see exactly:

Databend backups would only include the name of the node in each line, which would enable a backup to be transferred from any source cluster to any destination.

drmingdrmer (Member) commented on Jan 21, 2025

My proposal is that a backup should not have any concern for the hostnames of the source cluster.

That is impossible. The hostname is part of the raft log; removing a portion of the raft log would just result in data inconsistency.

The raft-advertise-address should be updated when the --initial-cluster argument is specified. To find out what's going wrong, can you re-export the data from a restored databend-meta service (for example, databend-metactl --export --raft-dir .databend/new_meta1, after shutting down the restored databend-meta service) and grep for AddNode?

There should be several raft-log lines that add the new cluster configuration, overriding the existing one, such as:

["raft_log",{"LogEntry":{"log_id":{"leader_id":{"term":1,"node_id":1},"index":14},"payload":{"Normal":{"txid":null,"cmd":{"AddNode":{"node_id":4,"node":{"name":"4","endpoint":{"addr":"localhost","port":29103},"grpc_api_advertise_address":null},"overriding":true}}}}}}]
["raft_log",{"LogEntry":{"log_id":{"leader_id":{"term":1,"node_id":1},"index":15},"payload":{"Normal":{"txid":null,"cmd":{"AddNode":{"node_id":5,"node":{"name":"5","endpoint":{"addr":"localhost","port":29203},"grpc_api_advertise_address":null},"overriding":true}}}}}}]
["raft_log",{"LogEntry":{"log_id":{"leader_id":{"term":1,"node_id":1},"index":16},"payload":{"Normal":{"txid":null,"cmd":{"AddNode":{"node_id":6,"node":{"name":"6","endpoint":{"addr":"localhost","port":29303},"grpc_api_advertise_address":null},"overriding":true}}}}}}]

Also make sure the config file you use to start databend-meta contains the correct new cluster addresses.
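
Concretely, the check could look something like this (the raft dir path is taken from the startup log above; adjust it to your deployment):

# with the restored databend-meta stopped
databend-metactl --export --raft-dir /data/databend-meta/raft | grep AddNode | tail -n 3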
