-
Notifications
You must be signed in to change notification settings - Fork 113
[controller] Skip truncating parent VT based on ConcurrentPushDetectionStrategy in the DeferredVersionSwapService #2355
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…onStrategy in the DeferredVersionSwapService
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds conditional logic to skip truncating parent version topics (VT) during deferred version swap operations based on the ConcurrentPushDetectionStrategy. The change prevents unnecessary truncation of zombie parent VTs when the strategy is set to PARENT_VERSION_STATUS_ONLY, which doesn't require topic writes.
Key changes:
- Modified
DeferredVersionSwapService.updateStore()to check if topic write is needed before attempting truncation - Added test verifications to ensure truncation is skipped when appropriate strategy is configured
- Updated test setup to configure
ConcurrentPushDetectionStrategyin multiple test scenarios
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
services/venice-controller/src/main/java/com/linkedin/venice/controller/DeferredVersionSwapService.java |
Added logic to conditionally skip parent VT truncation based on ConcurrentPushDetectionStrategy.isTopicWriteNeeded() |
services/venice-controller/src/test/java/com/linkedin/venice/controller/TestDeferredVersionSwapService.java |
Added test setup to configure PARENT_VERSION_STATUS_ONLY strategy and verify truncation is not called |
services/venice-controller/src/test/java/com/linkedin/venice/controller/TestDeferredVersionSwapServiceWithSequentialRollout.java |
Added strategy configuration and verification for two test scenarios to ensure truncation behavior matches strategy |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // skip truncating if the topic was not created based on ConcurrentPushDetectionStrategy | ||
| if (strategy.isTopicWriteNeeded() && !veniceParentHelixAdmin.isTopicTruncated(kafkaTopicName)) { | ||
| LOGGER.info("Truncating parent VT for {}", kafkaTopicName); | ||
| veniceParentHelixAdmin.truncateKafkaTopic(Version.composeKafkaTopic(storeName, targetVersionNum)); |
Copilot
AI
Dec 18, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Kafka topic name is being recomputed using Version.composeKafkaTopic(storeName, targetVersionNum) even though it was already computed on line 1228 and stored in the kafkaTopicName variable. Consider using the kafkaTopicName variable instead to avoid redundant computation and improve code maintainability.
| veniceParentHelixAdmin.truncateKafkaTopic(Version.composeKafkaTopic(storeName, targetVersionNum)); | |
| veniceParentHelixAdmin.truncateKafkaTopic(kafkaTopicName); |
| if (!veniceParentHelixAdmin.isTopicTruncated(kafkaTopicName)) { | ||
| ConcurrentPushDetectionStrategy strategy = | ||
| veniceControllerMultiClusterConfig.getControllerConfig(clusterName).getConcurrentPushDetectionStrategy(); | ||
| // skip truncating if the topic was not created based on ConcurrentPushDetectionStrategy |
Copilot
AI
Dec 18, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment could be more precise. It currently says "skip truncating if the topic was not created" but the logic checks if isTopicWriteNeeded() returns true. Consider revising to: "skip truncating if the topic was not created (when strategy doesn't require topic writes)" to better explain the relationship between the strategy and topic creation.
| // skip truncating if the topic was not created based on ConcurrentPushDetectionStrategy | |
| // Skip truncating if the topic was not created, i.e., when the configured strategy does not require topic writes. |
| doReturn(ConcurrentPushDetectionStrategy.TOPIC_BASED_ONLY).when(clusterConfig).getConcurrentPushDetectionStrategy(); | ||
| doReturn(false).when(admin).isTopicTruncated(anyString()); |
Copilot
AI
Dec 18, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test sets ConcurrentPushDetectionStrategy to TOPIC_BASED_ONLY (line 603) and mocks isTopicTruncated to return false (line 604), which according to the code logic in DeferredVersionSwapService.java line 1232 would trigger truncation (since isTopicWriteNeeded() returns true for TOPIC_BASED_ONLY and the topic is not truncated). However, line 616 expects that truncateKafkaTopic is never called. This test may not be properly validating the intended behavior. Consider either changing the strategy to PARENT_VERSION_STATUS_ONLY if the intent is to test that truncation doesn't happen, or adjusting the expectation to verify that truncation does happen with TOPIC_BASED_ONLY strategy.
| // Verify error recording was called due to the failure | ||
| verify(store, atLeastOnce()).updateVersionStatus(2, VersionStatus.PARTIALLY_ONLINE); | ||
| verify(admin, never()).rollForwardToFutureVersion(clusterName, storeName, region3); | ||
| verify(admin, never()).truncateKafkaTopic(anyString(), anyInt()); |
Copilot
AI
Dec 18, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The verify call is checking for the wrong method signature. The actual code in DeferredVersionSwapService.java line 1234 calls truncateKafkaTopic(String topicName) with one parameter, but this test is verifying the two-parameter overload truncateKafkaTopic(anyString(), anyInt()). This means the test would not detect if the one-parameter version is called. Change this to verify(admin, never()).truncateKafkaTopic(anyString()); to match the actual method being called.
| verify(admin, never()).truncateKafkaTopic(anyString(), anyInt()); | |
| verify(admin, never()).truncateKafkaTopic(anyString()); |
| TestUtils.waitForNonDeterministicAssertion(5, TimeUnit.SECONDS, () -> { | ||
| // Verify that updateStore was called to mark parent version as ONLINE | ||
| verify(store, atLeastOnce()).updateVersionStatus(versionTwo, VersionStatus.ONLINE); | ||
| verify(admin, never()).truncateKafkaTopic(anyString(), anyInt()); |
Copilot
AI
Dec 18, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The verify call is checking for the wrong method signature. The actual code in DeferredVersionSwapService.java line 1234 calls truncateKafkaTopic(String topicName) with one parameter, but this test is verifying the two-parameter overload truncateKafkaTopic(anyString(), anyInt()). This means the test would not detect if the one-parameter version is called. Change this to verify(admin, never()).truncateKafkaTopic(anyString()); to match the actual method being called.
| verify(admin, never()).truncateKafkaTopic(anyString(), anyInt()); | |
| verify(admin, never()).truncateKafkaTopic(anyString()); |
Problem Statement
[controller] Skip truncating parent VT based on ConcurrentPushDetectionStrategy in the DeferredVersionSwapService.
We still try to truncate zombie parent VT during defer swap, this skip helps remove the unnecessary truncation.
Related PR: #2305
Solution
Code changes
Concurrency-Specific Checks
Both reviewer and PR author to verify
synchronized,RWLock) are used where needed.ConcurrentHashMap,CopyOnWriteArrayList).How was this PR tested?
Does this PR introduce any user-facing or breaking changes?