@capistrant capistrant commented Dec 15, 2025

Description

Compaction State Fingerprinting

Instead of storing the full CompactionState in the lastCompactionState field of every compacted segment, generate a fingerprint for each CompactionState and attach that fingerprint to compacted segments. Add new centralized storage for CompactionState, where individual states can be looked up by fingerprint. Since many segments in a data source commonly share a single CompactionState, this greatly reduces the metadata storage overhead of storing compaction states.
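To illustrate the idea, here is a minimal sketch of deterministic fingerprinting using only the JDK. It assumes a compaction state can be flattened to string key/value pairs and that SHA-256 is the hash. Both are assumptions for illustration, and `FingerprintSketch` is a hypothetical name. The PR itself serializes with a key-sorted ObjectMapper, and this page does not show which hash it uses.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not the PR's implementation.
final class FingerprintSketch
{
  // Serializes entries in sorted key order so that logically equal states
  // always produce identical bytes, regardless of insertion order.
  // Assumes keys and values do not contain '=' or ';'.
  static String canonicalize(Map<String, String> state)
  {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, String> e : new TreeMap<>(state).entrySet()) {
      sb.append(e.getKey()).append('=').append(e.getValue()).append(';');
    }
    return sb.toString();
  }

  // Hashes the canonical form and returns it as a lowercase hex string.
  static String fingerprint(Map<String, String> state)
  {
    try {
      MessageDigest digest = MessageDigest.getInstance("SHA-256");
      byte[] hash = digest.digest(canonicalize(state).getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder();
      for (byte b : hash) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    }
    catch (NoSuchAlgorithmException e) {
      // SHA-256 is required to be present on every Java platform.
      throw new IllegalStateException(e);
    }
  }
}
```

Because the serialization is order-independent, two segments compacted with the same configuration end up with the same fingerprint and can share one stored state.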

Metadata Store Changes
druid_segments

Add a new column, compaction_state_fingerprint, that stores the fingerprint of the segment's current compaction state. It is null if the segment has never been compacted.

druid_compactionStates

New metadata table that stores the full CompactionState associated with a fingerprint. Segments can look up their full compaction state here by using the compaction_state_fingerprint that they are associated with.

CompactionStateManager

The CompactionStateManager is responsible for managing the persistence and lifecycle of compaction states on the Coordinator. It stores unique compaction configurations (identified by fingerprints) in the metadata database and maintains a cache to optimize lookups. The manager tracks which compaction states are actively referenced by segments, marking unreferenced states as unused and periodically cleaning up old unused states. This fingerprinting approach allows Druid to efficiently store and retrieve compaction metadata without duplicating identical compaction configurations across multiple segments, while the cache layer minimizes database queries for frequently accessed compaction states.
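The lifecycle described above (persist, mark unreferenced states as unused, delete old unused states) can be sketched in memory as follows. This is an illustration under stated assumptions, not the PR's class: `CompactionStateStoreSketch` and its method names are hypothetical, payloads are plain strings, and timestamps are passed in as epoch milliseconds to keep the example deterministic.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory illustration of the state lifecycle; not the PR's code.
final class CompactionStateStoreSketch
{
  private static final class Entry
  {
    final String payload;
    boolean used = true;
    long usedStatusLastUpdatedMillis;

    Entry(String payload, long nowMillis)
    {
      this.payload = payload;
      this.usedStatusLastUpdatedMillis = nowMillis;
    }
  }

  private final Map<String, Entry> states = new HashMap<>();

  // Stores a state once per fingerprint; re-persisting an unused state revives it.
  void persist(String fingerprint, String payload, long nowMillis)
  {
    Entry existing = states.get(fingerprint);
    if (existing == null) {
      states.put(fingerprint, new Entry(payload, nowMillis));
    } else if (!existing.used) {
      existing.used = true;
      existing.usedStatusLastUpdatedMillis = nowMillis;
    }
  }

  // Returns the stored payload, or null if the fingerprint is unknown.
  String lookup(String fingerprint)
  {
    Entry entry = states.get(fingerprint);
    return entry == null ? null : entry.payload;
  }

  // Marks used states that no segment references as unused; returns how many were marked.
  int markUnreferencedAsUnused(Set<String> referencedFingerprints, long nowMillis)
  {
    int marked = 0;
    for (Map.Entry<String, Entry> e : states.entrySet()) {
      if (e.getValue().used && !referencedFingerprints.contains(e.getKey())) {
        e.getValue().used = false;
        e.getValue().usedStatusLastUpdatedMillis = nowMillis;
        marked++;
      }
    }
    return marked;
  }

  // Deletes unused states whose status last changed before the cutoff; returns how many were deleted.
  int deleteUnusedOlderThan(long cutoffMillis)
  {
    int deleted = 0;
    Iterator<Entry> it = states.values().iterator();
    while (it.hasNext()) {
      Entry entry = it.next();
      if (!entry.used && entry.usedStatusLastUpdatedMillis < cutoffMillis) {
        it.remove();
        deleted++;
      }
    }
    return deleted;
  }
}
```

The two-step cleanup (mark, then delete after a retention period) mirrors the `used` and `used_status_last_updated` columns, so a state that becomes referenced again shortly after being marked is not lost.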

OnHeapCompactionStateManager

Serves as a mechanism for testing and simulations where metadata persistence may not be available or needed.

Legacy lastCompactionState Roadmap

This PR does not automatically transition to fingerprints those segments that were already compacted and store their CompactionState in the lastCompactionState field. Instead, it continues to support lastCompactionState in compaction decision making for segments compacted before fingerprinting. Legacy segments therefore do not have to be re-compacted simply because they are not fingerprinted, as long as their CompactionState matches the one specified by the compaction configuration for the data source in question.
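A rough sketch of that decision order, assuming states and fingerprints can be compared as strings: `CompactionCheckSketch` and `needsCompaction` are hypothetical names, and the real check in CompactionStatus may compare individual parts of the state rather than whole values.

```java
// Hypothetical illustration of "fingerprint first, legacy state as fallback"; not the PR's code.
final class CompactionCheckSketch
{
  static boolean needsCompaction(
      String segmentFingerprint,          // null for segments written before fingerprinting
      String segmentLastCompactionState,  // null if the legacy field was not written
      String targetFingerprint,
      String targetState
  )
  {
    if (segmentFingerprint != null) {
      // Fingerprinted segment: a matching fingerprint means the configuration is unchanged.
      return !segmentFingerprint.equals(targetFingerprint);
    }
    if (segmentLastCompactionState != null) {
      // Legacy segment: fall back to comparing the full stored state.
      return !segmentLastCompactionState.equals(targetState);
    }
    // Neither is present, so the segment has never been compacted.
    return true;
  }
}
```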

By default, this PR writes both the new fingerprint and the legacy lastCompactionState. This allows the normal rolling upgrade order, as well as Druid version rollback, without unneeded re-compaction. After the Druid upgrade completes, an operator can disable writing lastCompactionState by updating the cluster compaction config. Eventually, the Druid code base will stop writing lastCompactionState altogether and require fingerprinting going forward. I think this should be done in the Druid version following the first version that includes this feature. Even at that point, lastCompactionState will need to remain supported for already written segments, unless we want to devise an automated migration that runs in the background of a cluster and migrates all compacted segments to fingerprinting.

Release note

coming soon

Upgrade Note

Metadata store changes are required for this upgrade. If you already have druid.metadata.storage.connector.createTables set to true, no action is needed. If you have this feature disabled, you will need to alter the segments table and create the compactionStates table yourself. Postgres DDL is provided below as a guide. You will have to adapt the syntax to your metadata store backend and use the proper table names for your configured table prefix and database.

-- create the compaction states lookup table and associated indices
CREATE TABLE druid_compactionStates (
    id BIGSERIAL NOT NULL,
    created_date VARCHAR(255) NOT NULL,
    datasource VARCHAR(255) NOT NULL,
    fingerprint VARCHAR(255) NOT NULL,
    payload BYTEA NOT NULL,
    used BOOLEAN NOT NULL,
    used_status_last_updated VARCHAR(255) NOT NULL,
    PRIMARY KEY (id),
    UNIQUE (fingerprint)
);

CREATE INDEX idx_druid_compactionStates_fingerprint ON druid_compactionStates(fingerprint);
CREATE INDEX idx_druid_compactionStates_used ON druid_compactionStates(used, used_status_last_updated);

-- modify druid_segments table to have a column for storing compaction state fingerprints
ALTER TABLE druid_segments ADD COLUMN compaction_state_fingerprint VARCHAR(255);

Key changed/added classes in this PR
  • CompactionStatus
  • CompactionConfigBasedJobTemplate
  • CompactionState
  • SQLMetadataConnector
  • CompactionStateManager
  • CompactSegments
  • KillUnreferencedCompactionState

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

{
DefaultObjectMapper baseMapper = new DefaultObjectMapper();
baseMapper.configure(SerializationFeature.ORDER_MAP_ENTRIES_BY_KEYS, true);
baseMapper.configure(MapperFeature.SORT_PROPERTIES_ALPHABETICALLY, true);

Check notice (Code scanning / CodeQL): Deprecated method or constructor invocation

Invoking ObjectMapper.configure should be avoided because it has been deprecated.

@LifecycleStop
public void stop()
{
fingerprintCache.invalidateAll();

Contributor Author

does this cache object need any other lifecycle cleanup?

Contributor Author

what about if the operator has create tables disabled and does not properly create the table before upgrading?

Contributor

@kfaraz kfaraz left a comment

Thanks for the feature, @capistrant !

I have started going through the PR, leaving a partial review here.
I am yet to go through several changes, such as the ones made in CompactionStatus, DatasourceCompactibleSegmentIterator, etc.

* <p>
* Useful for simulations and unit tests where database persistence is not needed.
*/
public class HeapMemoryCompactionStateManager extends CompactionStateManager

Contributor

Might be cleaner to let CompactionStateManager be an interface, and let both the heap-based and the concrete class implement it.

* In-memory implementation of {@link CompactionStateManager} that stores
* compaction state fingerprints in heap memory without requiring a database.
* <p>
* Useful for simulations and unit tests where database persistence is not needed.

Contributor

If this is used only in tests, we should probably put it in the test source root src/test/java.

Contributor Author

That is where I originally put it, but then I tried to use it in a simulation class which is in the app code, not test. Let me review this though, maybe I am mistaken on how it is all working with the simulations

Contributor

Oh, I see. Are you referring to CoordinatorSimulationBuilder or some other class?

Contributor Author

No, CompactionRunSimulator: https://github.com/apache/druid/pull/18844/files#diff-b8a4fdf52e09ff26fa6f5610c021d196b9fa99673b83051de794ed07257be13b ... It creates a CompactSegments instance, which as of now requires a CompactionStateManager. But I guess if we go the route of not supporting fingerprinting in the Coordinator-duty-led compaction, this may not be a problem and it can be moved to the test space.

Comment on lines +814 to +815
|`druid.manager.compactionState.cacheSize`|The maximum number of compaction state fingerprints to cache in memory on the coordinator and overlord. Compaction state fingerprints are used to track the compaction configuration applied to segments. Consider increasing this value if you have a large number of datasources with compaction configurations.|`100`|
|`druid.manager.compactionState.prewarmSize`|The number of most recently used compaction state fingerprints to load into cache on Coordinator startup. This pre-warms the cache to improve performance immediately after startup.|`100`|

Contributor

Both Coordinator and Overlord (with segment metadata caching enabled) already keep all used segments in memory, including the respective (interned) CompactionState objects as well.
I don't think the number of distinct CompactionState objects that we keep in memory will increase after this patch.

Do we still need to worry about the cache size of these objects?
Does a cache miss trigger a fetch from metadata store?

{

/**
* Lazy initialization holder for deterministic ObjectMapper.

Contributor

I wonder if we shouldn't just inject this mapper annotated with @Sorted or @Deterministic as a lazy singleton. It may be injected into CompactionStateManager and fingerprints will always be created by that class rather than using a static utility method.

if (segmentIterator.hasNext()) {
// If we are going to create compaction jobs for this compaction state, we need to persist the fingerprint -> state
// mapping so compacted segments from these jobs can reference a valid compaction state.
params.getCompactionStateManager().persistCompactionState(

Contributor

The templates should only perform lightweight (i.e. non-IO) read-only operations as createCompactionJobs may be called frequently.
We should not do any persistence here.
Instead, the params can hold some mapping where we can add this compaction state and perform persistence on-demand (perhaps in the CompactionJobQueue).
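One way to read this suggestion, sketched with hypothetical names (`PendingCompactionStatesSketch`, `register`, `flush`): the template only records the fingerprint-to-state mapping in memory, and whichever component owns IO persists the pending entries later. The persister callback stands in for whatever the real persistence call would be.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical illustration of deferring persistence out of the hot path; not the PR's code.
final class PendingCompactionStatesSketch
{
  private final Map<String, String> pending = new LinkedHashMap<>();

  // Cheap and in-memory only, so it is safe to call from a frequently invoked template.
  void register(String fingerprint, String statePayload)
  {
    pending.putIfAbsent(fingerprint, statePayload);
  }

  // Called by the component that is allowed to do IO. Writes each pending
  // state through the supplied persister, clears the buffer, and returns the count written.
  int flush(BiConsumer<String, String> persister)
  {
    int written = 0;
    for (Map.Entry<String, String> e : pending.entrySet()) {
      persister.accept(e.getKey(), e.getValue());
      written++;
    }
    pending.clear();
    return written;
  }
}
```

Registering the same fingerprint repeatedly costs one map lookup, so repeated template invocations do not translate into repeated metadata store writes.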

Contributor Author

thank you for the guidance. Will work on how to get this out of hot path

}
}

private static Function<Set<DataSegment>, Set<DataSegment>> addCompactionStateFingerprintToSegments(String compactionStateFingerprint)

Contributor

Let's re-use the static function from AbstractTask itself?

Contributor Author

sure! I didn't know if it was bad form to reach into that class from MSQ. But I like having just one impl

Contributor

I think it is fine to use AbstractTask in the MSQ code. Alternatively, you can put the method in IndexTaskUtils too.
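For readers unfamiliar with the shape of the helper under discussion, here is a self-contained sketch of a single shared function that both code paths could call. `SegmentSketch` is a hypothetical stand-in for DataSegment, and treating a null fingerprint as a no-op is an assumption rather than something shown in this thread.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Function;

// Hypothetical stand-in for DataSegment, reduced to the two fields the example needs.
final class SegmentSketch
{
  final String id;
  final String compactionStateFingerprint;

  SegmentSketch(String id, String compactionStateFingerprint)
  {
    this.id = id;
    this.compactionStateFingerprint = compactionStateFingerprint;
  }

  // Returns a copy carrying the given fingerprint; the original is left untouched.
  SegmentSketch withCompactionStateFingerprint(String fingerprint)
  {
    return new SegmentSketch(id, fingerprint);
  }
}

// Hypothetical shared utility; not the PR's code.
final class FingerprintAnnotatorSketch
{
  static Function<Set<SegmentSketch>, Set<SegmentSketch>> addCompactionStateFingerprintToSegments(
      String compactionStateFingerprint
  )
  {
    if (compactionStateFingerprint == null) {
      // Assumption: with no fingerprint, segments pass through unchanged.
      return Function.identity();
    }
    return segments -> {
      Set<SegmentSketch> annotated = new LinkedHashSet<>();
      for (SegmentSketch segment : segments) {
        annotated.add(segment.withCompactionStateFingerprint(compactionStateFingerprint));
      }
      return annotated;
    };
  }
}
```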

Tasks.DEFAULT_STORE_COMPACTION_STATE
);

String compactionStateFingerprint = querySpec.getContext()

Contributor

Suggested change
- String compactionStateFingerprint = querySpec.getContext()
+ final String compactionStateFingerprint = querySpec.getContext()

pre-compute
pre-computed
pre-computing
pre-dates

Contributor

predates need not be hyphenated.

Contributor Author

sometimes my inability to spell, compounded by my inability to google how to spell, is embarrassing. this is one of those times. will fix

* </p>
*/
@ManageLifecycle
public class CompactionStateManager

Contributor

I don't feel that pre-warming the cache is really necessary. The fingerprint needs to be retrieved only when running the CompactionJobQueue on Overlord or CompactSegments on Coordinator.

  1. Let's always keep all the compaction states in memory. We are already keeping all the used segments in memory. The distinct CompactionState objects will be fairly small in number and size.
  2. The states can be cached in HeapMemorySegmentMetadataCache which already serves as a cache for used segments, pending segments and schemas.
  3. If possible, let's support the fingerprint flow only with compaction supervisors and not the Coordinator-based CompactSegments duty. That would simplify the new flow and be another motivation for users to migrate to using compaction supervisors.

Contributor Author

> If possible, let's support the fingerprint flow only with compaction supervisors and not the Coordinator-based CompactSegments duty. That would simplify the new flow and be another motivation for users to migrate to using compaction supervisors.

would we want to deprecate CompactSegments compaction on the coordinator in this case? so we aren't forever supporting compaction without fingerprints + compaction with fingerprints?

Contributor

Yes, the plan was to deprecate CompactSegments once compaction supervisors took off. I don't fully recall whether compaction supervisors are already marked GA or not. They would also have to be made the default if we want to start deprecating CompactSegments.

But I feel all of this should be out of scope for the current PR.

If supporting the fingerprint logic in CompactSegments is not additional work and does not complicate the flow, we can leave it as is.

My only concern is that there should be just one service that is responsible for persisting new fingerprints. I would prefer that to be the Overlord, so that it always has a consistent cache state. So we either just don't support fingerprints on the Coordinator or we handle persistence by calling an Overlord API.

(I am yet to go through the whole PR to identify all the call sites that may persist a compaction state. I have only found the one in CompactionConfigBasedJobTemplate so far.)
