Implement a fingerprinting mechanism to track compaction states in a more efficient manner #18844
base: master
… storage configurable
| { | ||
| DefaultObjectMapper baseMapper = new DefaultObjectMapper(); | ||
| baseMapper.configure(SerializationFeature.ORDER_MAP_ENTRIES_BY_KEYS, true); | ||
| baseMapper.configure(MapperFeature.SORT_PROPERTIES_ALPHABETICALLY, true); |
Check notice
Code scanning / CodeQL
Deprecated method or constructor invocation Note
ObjectMapper.configure
...ervice/src/main/java/org/apache/druid/indexing/compact/CompactionConfigBasedJobTemplate.java
server/src/main/java/org/apache/druid/server/coordinator/duty/CompactSegments.java
server/src/main/java/org/apache/druid/server/compaction/CompactionStatus.java
server/src/main/java/org/apache/druid/segment/metadata/CompactionStateManager.java
Outdated
| @LifecycleStop | ||
| public void stop() | ||
| { | ||
| fingerprintCache.invalidateAll(); |
does this cache object need any other lifecycle cleanup?
server/src/main/java/org/apache/druid/segment/metadata/CompactionStateManager.java
What happens if the operator has table creation disabled and has not created the table before upgrading?
server/src/main/java/org/apache/druid/segment/metadata/CompactionStateManager.java
Outdated
kfaraz left a comment
Thanks for the feature, @capistrant !
I have started going through the PR, leaving a partial review here.
I am yet to go through several changes, such as the ones made in CompactionStatus, DatasourceCompactibleSegmentIterator, etc.
| * <p> | ||
| * Useful for simulations and unit tests where database persistence is not needed. | ||
| */ | ||
| public class HeapMemoryCompactionStateManager extends CompactionStateManager |
Might be cleaner to let CompactionStateManager be an interface, and let both the heap-based and the concrete class implement it.
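A rough sketch of what that split could look like, assuming states are reduced to plain strings. `StateManager`, `HeapStateManager`, and `describe` are hypothetical names used only for illustration; they are not the types in this PR, and the real contract would likely carry more methods.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: hypothetical names, not the classes added by the PR.
public class StateManagerInterfaceSketch
{
  /** Hypothetical contract shared by a database-backed and a heap-based implementation. */
  public interface StateManager
  {
    void persist(String fingerprint, String state);

    Optional<String> find(String fingerprint);
  }

  /** Heap-only implementation, suitable for simulations and tests. */
  public static class HeapStateManager implements StateManager
  {
    private final Map<String, String> states = new ConcurrentHashMap<>();

    @Override
    public void persist(String fingerprint, String state)
    {
      // Keep the first state seen for a fingerprint; later duplicates are ignored.
      states.putIfAbsent(fingerprint, state);
    }

    @Override
    public Optional<String> find(String fingerprint)
    {
      return Optional.ofNullable(states.get(fingerprint));
    }
  }

  /** A caller that depends only on the interface, so either implementation can be passed in. */
  public static String describe(StateManager manager, String fingerprint)
  {
    return manager.find(fingerprint).orElse("unknown");
  }

  public static void main(String[] args)
  {
    StateManager manager = new HeapStateManager();
    manager.persist("fp-1", "state-1");
    System.out.println(describe(manager, "fp-1"));
    System.out.println(describe(manager, "fp-2"));
  }
}
```

With this shape, callers such as a simulator would accept the interface and would not need to subclass the database-backed class.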
| * In-memory implementation of {@link CompactionStateManager} that stores | ||
| * compaction state fingerprints in heap memory without requiring a database. | ||
| * <p> | ||
| * Useful for simulations and unit tests where database persistence is not needed. |
If this is used only in tests, we should probably put it in the test source root src/test/java.
That is where I originally put it, but then I tried to use it in a simulation class which is in the app code, not test. Let me review this though, maybe I am mistaken on how it is all working with the simulations
Oh, I see. Are you referring to CoordinatorSimulationBuilder or some other class?
No, CompactionRunSimulator: https://github.com/apache/druid/pull/18844/files#diff-b8a4fdf52e09ff26fa6f5610c021d196b9fa99673b83051de794ed07257be13b ... It creates a CompactSegments instance, which as of now requires a CompactionStateManager. But I guess if we go the route of not supporting fingerprinting in Coordinator-duty-led compaction, this may not be a problem and it can be moved to the test space.
| |`druid.manager.compactionState.cacheSize`|The maximum number of compaction state fingerprints to cache in memory on the coordinator and overlord. Compaction state fingerprints are used to track the compaction configuration applied to segments. Consider increasing this value if you have a large number of datasources with compaction configurations.|`100`| | ||
| |`druid.manager.compactionState.prewarmSize`|The number of most recently used compaction state fingerprints to load into cache on Coordinator startup. This pre-warms the cache to improve performance immediately after startup.|`100`| |
Both Coordinator and Overlord (with segment metadata caching enabled) already keep all used segments in memory, including the respective (interned) CompactionState objects as well.
I don't think the number of distinct CompactionState objects that we keep in memory will increase after this patch.
Do we still need to worry about the cache size of these objects?
Does a cache miss trigger a fetch from metadata store?
| { | ||
|
|
||
| /** | ||
| * Lazy initialization holder for deterministic ObjectMapper. |
I wonder if we shouldn't just inject this mapper annotated with @Sorted or @Deterministic as a lazy singleton. It may be injected into CompactionStateManager and fingerprints will always be created by that class rather than using a static utility method.
| if (segmentIterator.hasNext()) { | ||
| // If we are going to create compaction jobs for this compaction state, we need to persist the fingerprint -> state | ||
| // mapping so compacted segments from these jobs can reference a valid compaction state. | ||
| params.getCompactionStateManager().persistCompactionState( |
The templates should only perform lightweight (i.e. non-IO) read-only operations as createCompactionJobs may be called frequently.
We should not do any persistence here.
Instead, the params can hold some mapping where we can add this compaction state and perform persistence on-demand (perhaps in the CompactionJobQueue).
thank you for the guidance. Will work on how to get this out of hot path
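One way the suggestion above could be sketched, assuming fingerprints and states are plain strings: the template only records the mapping in memory, and a later step flushes it to the store once. `PendingStates`, `RecordingStore`, and `flushTo` are hypothetical names for illustration and do not correspond to anything in the PR or in `CompactionJobQueue`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: hypothetical names, not the PR's API.
public class DeferredPersistenceSketch
{
  /** Stand-in for the metadata store; records each write so the example can count them. */
  public static class RecordingStore
  {
    public final List<String> writes = new ArrayList<>();

    public void persist(String fingerprint, String state)
    {
      writes.add(fingerprint);
    }
  }

  /** Collects fingerprint -> state mappings in memory; adding performs no IO. */
  public static class PendingStates
  {
    private final Map<String, String> pending = new LinkedHashMap<>();

    public void add(String fingerprint, String state)
    {
      pending.putIfAbsent(fingerprint, state);
    }

    /** Writes every pending mapping once, clears the buffer, and returns the number written. */
    public int flushTo(RecordingStore store)
    {
      int count = pending.size();
      pending.forEach(store::persist);
      pending.clear();
      return count;
    }
  }

  public static void main(String[] args)
  {
    RecordingStore store = new RecordingStore();
    PendingStates pending = new PendingStates();

    // Simulates a template being invoked repeatedly for the same compaction state.
    for (int i = 0; i < 3; i++) {
      pending.add("fp-1", "state-1");
    }
    System.out.println("writes before flush: " + store.writes.size());

    // Simulates the on-demand persistence step, for example when jobs are submitted.
    pending.flushTo(store);
    System.out.println("writes after flush: " + store.writes.size());
  }
}
```

Repeated calls for the same fingerprint collapse into a single write, and no write happens until the flush step runs.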
| } | ||
| } | ||
|
|
||
| private static Function<Set<DataSegment>, Set<DataSegment>> addCompactionStateFingerprintToSegments(String compactionStateFingerprint) |
Let's re-use the static function from AbstractTask itself?
sure! I didn't know if it was bad form to reach into that class from MSQ. But I like having just one impl
I think it is fine to use AbstractTask in the MSQ code. Alternatively, you can put the method in IndexTaskUtils too.
| Tasks.DEFAULT_STORE_COMPACTION_STATE | ||
| ); | ||
|
|
||
| String compactionStateFingerprint = querySpec.getContext() |
| String compactionStateFingerprint = querySpec.getContext() | |
| final String compactionStateFingerprint = querySpec.getContext() |
| pre-compute | ||
| pre-computed | ||
| pre-computing | ||
| pre-dates |
predates need not be hyphenated.
sometimes my inability to spell, compounded by my inability to google how to spell, is embarrassing. this is one of those times. will fix
| * </p> | ||
| */ | ||
| @ManageLifecycle | ||
| public class CompactionStateManager |
I don't feel that pre-warming the cache is really necessary. The fingerprint needs to be retrieved only when running the CompactionJobQueue on Overlord or CompactSegments on Coordinator.
- Let's always keep all the compaction states in memory. We are already keeping all the used segments in memory. The distinct `CompactionState` objects will be fairly small in number and size.
- The states can be cached in `HeapMemorySegmentMetadataCache` which already serves as a cache for used segments, pending segments and schemas.
- If possible, let's support the fingerprint flow only with compaction supervisors and not the Coordinator-based `CompactSegments` duty. That would simplify the new flow and be another motivation for users to migrate to using compaction supervisors.
> If possible, let's support the fingerprint flow only with compaction supervisors and not the Coordinator-based CompactSegments duty. That would simplify the new flow and be another motivation for users to migrate to using compaction supervisors.
would we want to deprecate CompactSegments compaction on the coordinator in this case? so we aren't forever supporting compaction without fingerprints + compaction with fingerprints?
Yes, the plan was to deprecate CompactSegments once compaction supervisors took off. I don't fully recall if compaction supervisors are already marked GA or not. They would also have to be made the default, if we want to start deprecation of CompactSegments.
But I feel all of this should be out of scope for the current PR.
If supporting the fingerprint logic in CompactSegments is not additional work and does not complicate the flow, we can leave it as is.
My only concern is that there should be just one service that is responsible for persisting new fingerprints. I would prefer that to be the Overlord, so that it always has a consistent cache state. So we either just don't support fingerprints on the Coordinator or we handle persistence by calling an Overlord API.
(I am yet to go through the whole PR to identify all the call sites that may persist a compaction state. I have only found the one in CompactionConfigBasedJobTemplate so far.)
Description
Compaction State Fingerprinting
Instead of storing `CompactionState` as the `lastCompactionState` field in every compacted segment, generate a fingerprint for a `CompactionState` and attach that to compacted segments. Add new centralized storage for `CompactionState` where individual states can be looked up by the aforementioned fingerprint. Since it is common for many segments in a data source to share a single `CompactionState`, this greatly reduces the metadata storage overhead for storing compaction states.
Metadata Store Changes
`druid_segments`
Add new column `compaction_state_fingerprint` that stores the fingerprint representation of the segment's current compaction state. It can be `null` if no compaction has taken place.
`druid_compactionStates`
New metadata table that stores the full `CompactionState` associated with a fingerprint. Segments can look up their full compaction state here by using the `compaction_state_fingerprint` that they are associated with.
`CompactionStateManager`
The `CompactionStateManager` is responsible for managing the persistence and lifecycle of compaction states on the Coordinator. It stores unique compaction configurations (identified by fingerprints) in the metadata database and maintains a cache to optimize lookups. The manager tracks which compaction states are actively referenced by segments, marking unreferenced states as unused and periodically cleaning up old unused states. This fingerprinting approach allows Druid to efficiently store and retrieve compaction metadata without duplicating identical compaction configurations across multiple segments, while the cache layer minimizes database queries for frequently accessed compaction states.
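To make the deduplication concrete, here is a minimal sketch, assuming a compaction state can be flattened to string key/value pairs. `FingerprintSketch`, `canonicalize`, `persist`, and `lookup` are hypothetical names for illustration and are not the classes in this PR. The SHA-256 hash and the key-sorted serialization are also assumptions standing in for whatever the real implementation uses, and the `key=value;` encoding is a toy that does not escape delimiters.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: hypothetical names, not the classes added by the PR.
public class FingerprintSketch
{
  // fingerprint -> canonical state; stands in for the new metadata table.
  private final Map<String, String> statesByFingerprint = new HashMap<>();

  /** Serializes entries in sorted key order so logically equal states produce identical text. */
  public static String canonicalize(Map<String, String> state)
  {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, String> e : new TreeMap<>(state).entrySet()) {
      sb.append(e.getKey()).append('=').append(e.getValue()).append(';');
    }
    return sb.toString();
  }

  /** Hex-encoded SHA-256 of the canonical form (the hash choice is an assumption). */
  public static String fingerprint(Map<String, String> state)
  {
    try {
      MessageDigest digest = MessageDigest.getInstance("SHA-256");
      byte[] hash = digest.digest(canonicalize(state).getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder();
      for (byte b : hash) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    }
    catch (NoSuchAlgorithmException e) {
      // SHA-256 must be supported by every Java platform implementation.
      throw new IllegalStateException(e);
    }
  }

  /** Stores the state once per fingerprint and returns the fingerprint to attach to segments. */
  public String persist(Map<String, String> state)
  {
    String fp = fingerprint(state);
    statesByFingerprint.putIfAbsent(fp, canonicalize(state));
    return fp;
  }

  public String lookup(String fp)
  {
    return statesByFingerprint.get(fp);
  }

  public int storedStateCount()
  {
    return statesByFingerprint.size();
  }

  public static void main(String[] args)
  {
    Map<String, String> a = new LinkedHashMap<>();
    a.put("partitionsSpec", "dynamic");
    a.put("granularity", "DAY");
    Map<String, String> b = new LinkedHashMap<>();
    b.put("granularity", "DAY");
    b.put("partitionsSpec", "dynamic");

    FingerprintSketch store = new FingerprintSketch();
    // Same content in a different key order yields the same fingerprint and one stored row.
    System.out.println(store.persist(a).equals(store.persist(b)));
    System.out.println(store.storedStateCount());
  }
}
```

Each segment would then carry only the short fingerprint string, and the full state is resolved through `lookup` when needed.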
`OnHeapCompactionStateManager`
Meant to serve as a mechanism for testing and simulations where metadata persistence may not be available or needed.
Legacy `lastCompactionState` Roadmap
This PR implements no automatic transition to fingerprints for segments that were compacted earlier and store `CompactionState` in their `lastCompactionState` field. Instead, this PR continues to support `lastCompactionState` in compaction decision making for segments compacted before fingerprinting. This means that legacy segments will not have to be re-compacted simply because they are not fingerprinted, as long as they have the proper `CompactionState` as specified by the compaction configuration for the data source in question.
This PR also continues to write both the new fingerprint and the legacy `lastCompactionState` by default. This allows the normal rolling upgrade order as well as Druid version rollback without unneeded re-compaction. An operator can disable writing `lastCompactionState` by updating the cluster compaction config after the Druid upgrade completes. Eventually, the Druid code base will cease writing `lastCompactionState` at all and instead force using fingerprinting going forward. I think this should be done in the Druid version following the first version that this new feature is seen in. Even at this point, `lastCompactionState` will need to continue to be supported for already written segments, unless we want to devise an automated migration plan that can run in the background of a cluster to get all compacted segments migrated to fingerprinting.
Release note
coming soon
Upgrade Note
Metadata store changes are required for this upgrade. If you already have `druid.metadata.storage.connector.createTables` set to `true`, no action is needed. If you have this feature disabled, you will need to alter the `segments` table and create the `compactionStates` table. Postgres DDL is provided below as a guide. You will have to adapt the syntax to your metadata store backend as well as use proper table naming depending on your configured table prefix and database.
Key changed/added classes in this PR
- `CompactionStatus`
- `CompactionConfigBasedJobTemplate`
- `CompactionState`
- `SQLMetadataConnector`
- `CompactionStateManager`
- `CompactSegments`
- `KillUnreferencedCompactionState`
This PR has: