Skip to content

Conversation

@jtanza
Copy link

@jtanza jtanza commented Nov 6, 2025

Right now, the CDC service is scanning the entire cdc_state table on each tserver when updating checkpoint info. This causes a very large number of RocksDB ops when there are a significant amount of tservers in the cluster.

To address this, these changes:

  • Add CDCStateTable::FetchEntriesForTablet(tablet_id, selector) to do hash-keyed reads (binds tablet_id) with paging.
  • Updates CDCServiceImpl::PopulateTabletCheckPointInfo to:
    • Build a set of locally tracked tablets (tablet_checkpoints_), and for each, call FetchEntriesForTablet instead of table-wide scan.
    • Read slot rows once via FetchEntriesForTablet(kCDCSDKSlotEntryTabletId, ...) and compute per-namespace min record_id_commit_time.
  • Update CDCServiceImpl::PopulateCDCSDKTabletCheckPointInfo (single-tablet path) to use FetchEntriesForTablet(input_tablet_id, ...) instead of scanning all rows.
  • Keep paging but increased per-page limit (1024) since reads are now partition-targeted.

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@netlify
Copy link

netlify bot commented Nov 6, 2025

Deploy Preview for infallible-bardeen-164bc9 ready!

Built without sensitive environment variables

Name Link
🔨 Latest commit 27189c1
🔍 Latest deploy log https://app.netlify.com/projects/infallible-bardeen-164bc9/deploys/690cee3ce78fa4000884f0b4
😎 Deploy Preview https://deploy-preview-29268--infallible-bardeen-164bc9.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@hari90
Copy link
Contributor

hari90 commented Nov 8, 2025

Thank you for submitting the diff.
Two major concerns. The way you get local tablets does not seem right, and change in the behavior might very likely break xCluster so we should double check the logic here.
Getting the entries one tablet at a time is extremely inefficient and will cause a RPC overload, and more strain on the cdc_state table and make it worse, not better.

@hari90
Copy link
Contributor

hari90 commented Nov 8, 2025

@suranjan @hulien22 please review this diff

@hari90 hari90 requested a review from suranjan November 8, 2025 02:03

Result<std::vector<CDCStateTableEntry>> CDCStateTable::FetchEntriesForTablet(
const TabletId& tablet_id, CDCStateTableEntrySelector&& field_filter) {
DCHECK(!tablet_id.empty());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
DCHECK(!tablet_id.empty());
RSTATUS_DCHECK(!tablet_id.empty(), IllegalState, "Invalid tablet id provided");

LOG_WITH_FUNC(WARNING) << "Failed operation: " << error->failed_op().ToString()
<< ", status: " << error->status();
}
RETURN_NOT_OK(flush_status.status);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
RETURN_NOT_OK(flush_status.status);
return flush_status.status;

Comment on lines +784 to +785
auto row_block = ql::RowsResult(read_op.get()).GetRowBlock();
RETURN_NOT_OK(row_block);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
auto row_block = ql::RowsResult(read_op.get()).GetRowBlock();
RETURN_NOT_OK(row_block);
auto row_block = VERIFY_RESULT(ql::RowsResult(read_op.get()).GetRowBlock());

// Bind hash key to limit results to the specific tablet_id.
QLAddStringHashValue(req_read, tablet_id);
req_read->set_return_paging_state(true);
req_read->set_limit(1024);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1024 should be a gFlag so that we can configure it at runtime.

if (!read_op->response().has_paging_state()) {
break;
}
*req_read->mutable_paging_state() = read_op->response().paging_state();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
*req_read->mutable_paging_state() = read_op->response().paging_state();
*req_read->mutable_paging_state() = std::move(read_op->response().paging_state());


std::vector<CDCStateTableEntry> results;

const auto read_op = cdc_table->NewReadOp();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const auto read_op = cdc_table->NewReadOp();


// Apply and page until done.
while (true) {
session->Apply(read_op);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
session->Apply(read_op);
auto read_op = cdc_table->NewReadOp();
session->Apply(read_op);

Comment on lines +3052 to +3056
RSTATUS_DCHECK(
entry.record_id_commit_time.has_value(), InternalError,
Format(
"The slot entry for the stream $0 did not have a value for record_id_commit_time",
stream_id),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
RSTATUS_DCHECK(
entry.record_id_commit_time.has_value(), InternalError,
Format(
"The slot entry for the stream $0 did not have a value for record_id_commit_time",
stream_id),
if (!entry.record_id_commit_time.has_value() {
LOG(DFATAL) << "The slot entry for the stream " << stream_id << " did not have a value for record_id_commit_time";
continue;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We dont want one corrupted stream to impact the rest of the sytem

// Get the minimum record_id_commit_time for each namespace by looking at all the slot entries.
// Build the set of tablet_ids that this tserver is actively tracking for CDC.
std::unordered_set<TabletId> local_tablet_ids;
for (const auto& it : impl_->TabletCheckpointsCopy()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Who populates TabletCheckpointsCopy?

I am not sure if this works with xCluster which expects the entry to exist before the GetChanges request comes in. cc @hulien22

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants