Skip to content

Conversation

@Li0k
Copy link
Contributor

@Li0k Li0k commented Jan 30, 2026

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

As title

Checklist

  • I have written necessary rustdoc comments.
  • I have added necessary unit tests and integration tests.
  • I have added test labels as necessary.
  • I have added fuzzing tests or opened an issue to track them.
  • My PR contains breaking changes.
  • My PR changes performance-critical code, so I will run (micro) benchmarks and present the results.
  • I have checked the Release Timeline and Currently Supported Versions to determine which release branches I need to cherry-pick this PR into.

Documentation

  • My PR needs documentation updates.
Release note

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR optimizes compaction task ID allocation by pre-allocating a range of IDs per loop iteration instead of fetching them individually. This reduces the number of SQL transactions needed when generating multiple compaction tasks.

Changes:

  • Added next_compaction_task_id_interval function to allocate multiple task IDs at once
  • Modified compaction task selection loop to pre-allocate IDs and assert bounds
  • Updated iceberg compaction manager to use the new helper function

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
src/meta/src/hummock/manager/sequence.rs Added new function to allocate multiple compaction task IDs in a single call
src/meta/src/hummock/manager/compaction/mod.rs Pre-allocates task ID range before loop and replaces per-iteration allocation with local increment
src/meta/src/manager/iceberg_compaction.rs Refactored to use the existing helper function instead of direct sequence call


// StoredIdGenerator already implements ids pre-allocation by ID_PREALLOCATE_INTERVAL.
let task_id = next_compaction_task_id(&self.env).await?;
assert!(next_task_id < task_id_end);
Copy link

Copilot AI Jan 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assertion should check next_task_id < task_id_end BEFORE using task_id, but the assignment happens after the assert. If the assert fails, we'd still have incremented past the end. Consider moving the assignment of task_id after the assertion, or restructure to check the condition before both the assignment and increment.

Copilot uses AI. Check for mistakes.
Comment on lines 380 to 384
// Pre-allocate task ids for this loop to avoid per-group SQL transactions.
let task_id_capacity = compaction_groups.len() as u32;
let mut next_task_id =
next_compaction_task_id_interval(&self.env, task_id_capacity).await?;
let task_id_end = next_task_id + u64::from(task_id_capacity);
Copy link

Copilot AI Jan 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pre-allocating exactly compaction_groups.len() task IDs may be insufficient. The loop can exit early if pick_tasks.len() >= max_select_count or if a group selector returns None, meaning not all allocated IDs will be used. However, if the loop continues past allocated capacity (e.g., if groups are processed multiple times or the logic changes), the assertion at line 410 will panic. Consider either: (1) allocating min(compaction_groups.len(), max_select_count) IDs, or (2) adding logic to handle exhaustion of pre-allocated IDs gracefully rather than panicking.

Copilot uses AI. Check for mistakes.
@Li0k
Copy link
Contributor Author

Li0k commented Jan 30, 2026

@zwang28 @hzxa21 Do you remember why each group fetches only one normal-task per pull?

Preallocate compaction task ID range per loop to avoid per-group SQL
transactions. Capacity is min(group_count, max_select_count) since each
group consumes at most one task ID (trivial tasks share the same ID with
normal task). Fallback to single ID allocation if pre-allocated range
is exhausted.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants