Upstream: cee06f0b1ef6d5c8806a94a387c7b5aa5da2b302#752
* Update StringUtils.replace() after fix in JDK9
* Upgrade optimized string replace algorithm
* Update methods by re-using declared StringUtils#replace method
* Replace hard-coded UTF-8 encodings with StandardCharsets
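As an illustration of the last bullet, here is a minimal sketch of swapping a hard-coded `"UTF-8"` charset name for `StandardCharsets.UTF_8`. The class and method names (`CharsetExample`, `toUtf8Legacy`, `toUtf8`, `fromUtf8`) are hypothetical and do not come from the patch; only the JDK calls are real.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not taken from the patch.
final class CharsetExample
{
  // Before: the charset is looked up by name at runtime, and the call
  // declares a checked exception that can never happen for UTF-8.
  static byte[] toUtf8Legacy(String s)
  {
    try {
      return s.getBytes("UTF-8");
    }
    catch (UnsupportedEncodingException e) {
      throw new IllegalStateException(e);
    }
  }

  // After: the constant is resolved at compile time, so no checked
  // exception and no risk of a typo in the charset name.
  static byte[] toUtf8(String s)
  {
    return s.getBytes(StandardCharsets.UTF_8);
  }

  static String fromUtf8(byte[] bytes)
  {
    return new String(bytes, StandardCharsets.UTF_8);
  }
}
```

Both variants produce the same bytes; the second simply drops the string lookup and the dead `catch` block.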
…tionals; cleanup framework init (apache#17829)

* cleans up `SqlTestFramework` initialization to leave the `OverrideModule` empty - so that tests could more easily take over parts
* remove the `QueryComponentSupplier#createEngine` factory method - instead uses a `Class<SqlEngine>` and use the `injector` to initialize it
* enables the usage of `!disabled <supplier> <message>` - to mark cases which are not yet supported with a specific configuration for some reason
* fixes that `datasets` was not respecting the `rollup` specification of the ingest
* enables to use `MultiComponentSupplier` backed tests - these will turn into matrix tests over multiple componentsuppliers - enabling running the same testcase in different scenarios
Description
-----------
apache#17653 introduces a cache for segment metadata on the Overlord. This patch is a follow up to that to make the cache more robust, performant and debug-friendly.

Changes
---------
- Do not cache unused segments
  This significantly reduces sync time in cases where the cluster has a lot of unused segments. Unused segments are needed only during segment allocation to ensure that a duplicate ID is not allocated. This is a rare DB query which is supported by sufficient indexes and thus need not be cached at the moment.
- Update cache directly when segments are marked as unused to avoid race conditions with DB sync.
- Fix NPE when using segment metadata cache with concurrent locks.
- Atomically update segment IDs and pending segments in a `HeapMemoryDatasourceSegmentCache` using methods `syncSegmentIds()` and `syncPendingSegments()` rather than updating one by one. This ensures that the locks are held for a shorter period and the update made to the cache is atomic.

Main updated classes
----------------------
- `IndexerMetadataStorageCoordinator`
- `OverlordDataSourcesResource`
- `HeapMemorySegmentMetadataCache`
- `HeapMemoryDatasourceSegmentCache`

Cleaner cache sync
--------------------
In every sync, the following steps are performed for each datasource:
- Retrieve ALL used segment IDs from metadata store
- Atomically update segment IDs in cache and determine list of segment IDs which need to be refreshed.
- Fetch payloads of segments that need to be refreshed
- Atomically update fetched payloads into the cache
- Fetch ALL pending segments
- Atomically update pending segments into the cache
- Clean up empty intervals from datasource caches
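The first four sync steps can be sketched roughly as follows. This is a simplified illustration, not the actual `HeapMemoryDatasourceSegmentCache`: the class `SketchDatasourceCache`, its fields, and the method signatures are assumptions made for the example, and segment payloads are reduced to a last-updated timestamp. The point is that the whole ID set is reconciled under a single write lock instead of one lock acquisition per segment.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical, simplified stand-in for a per-datasource cache.
final class SketchDatasourceCache
{
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  // segment ID -> last updated time (millis) of the cached payload
  private final Map<String, Long> cachedUpdatedTime = new HashMap<>();

  // Reconciles the cache with ALL used segment IDs read from the metadata
  // store in one atomic step, and returns the IDs whose payloads are
  // missing or stale and therefore need to be fetched.
  Set<String> syncSegmentIds(Map<String, Long> persistedUpdatedTime)
  {
    lock.writeLock().lock();
    try {
      // Drop cached segments that are no longer used in the metadata store
      cachedUpdatedTime.keySet().retainAll(persistedUpdatedTime.keySet());

      Set<String> needsRefresh = new HashSet<>();
      for (Map.Entry<String, Long> entry : persistedUpdatedTime.entrySet()) {
        Long cached = cachedUpdatedTime.get(entry.getKey());
        if (cached == null || cached < entry.getValue()) {
          needsRefresh.add(entry.getKey());
        }
      }
      return needsRefresh;
    }
    finally {
      lock.writeLock().unlock();
    }
  }

  // Applies all fetched payloads (here: just their timestamps) atomically.
  void updatePayloads(Map<String, Long> refreshed)
  {
    lock.writeLock().lock();
    try {
      cachedUpdatedTime.putAll(refreshed);
    }
    finally {
      lock.writeLock().unlock();
    }
  }

  Set<String> cachedIds()
  {
    lock.readLock().lock();
    try {
      return new HashSet<>(cachedUpdatedTime.keySet());
    }
    finally {
      lock.readLock().unlock();
    }
  }
}
```

A sync pass under these assumptions would call `syncSegmentIds(...)` with the IDs read from the metadata store, fetch payloads only for the returned IDs, and then hand them to `updatePayloads(...)`. Pending segments would presumably follow the same pattern.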
Prior to this patch, an offset specified on a groupBy that itself has an inner groupBy would lead to an error like "Cannot push down offsets". This happened because of a violated assumption: the processing logic assumes that offsets have been pushed into limits (so limit pushdown optimizations can safely be used). This patch adjusts processing to incorporate offsets into limits during processing of subqueries. Later on, in post-processing, offsets are applied as written.
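The idea can be sketched in a few lines. This is an illustration of the general technique only, not Druid's query processing code: `OffsetLimitSketch`, `effectiveLimit`, and `process` are hypothetical names, and a sorted in-memory list stands in for the subquery results.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: fold the offset into the limit for processing,
// then apply the offset as written during post-processing.
final class OffsetLimitSketch
{
  // The limit that is safe to push down must cover the skipped rows too.
  // The sum is computed as a long and clamped to avoid int overflow.
  static int effectiveLimit(int offset, int limit)
  {
    return (int) Math.min((long) offset + limit, Integer.MAX_VALUE);
  }

  static <T> List<T> process(List<T> sortedRows, int offset, int limit)
  {
    // Processing: only a limit is applied, so limit pushdown stays valid
    List<T> limited = sortedRows.stream()
                                .limit(effectiveLimit(offset, limit))
                                .collect(Collectors.toList());

    // Post-processing: the original offset and limit are applied as written
    return limited.stream()
                  .skip(offset)
                  .limit(limit)
                  .collect(Collectors.toList());
  }
}
```

With `offset = 3` and `limit = 4`, processing keeps the first 7 rows and post-processing returns rows 4 through 7.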
 }

-default Set<String> getTableNames()
+default Set<String> getTableNames(BrokerSegmentMetadataCache segmentMetadataCache)
Check notice
Code scanning / CodeQL
Useless parameter Note
Copilot Autofix
AI about 1 year ago
To fix the problem, we need to remove the unused segmentMetadataCache parameter from the getTableNames method. This involves updating the method signature to exclude the parameter and ensuring that any calls to this method are updated accordingly. Since we are only provided with the interface definition, we will update the method signature in the interface.
@@ -62,3 +62,3 @@

-default Set<String> getTableNames(BrokerSegmentMetadataCache segmentMetadataCache)
+default Set<String> getTableNames()
 {
}

@SuppressWarnings("unused")
protected void validateFrameworkConfig(SqlTestFrameworkConfig cfg)
Check notice
Code scanning / CodeQL
Useless parameter Note test
Copilot Autofix
AI about 1 year ago
To fix the problem, we should remove the unused parameter cfg from the validateFrameworkConfig method. This will simplify the method signature and eliminate the unnecessary parameter, making the code cleaner and easier to maintain.
- Remove the parameter `cfg` from the `validateFrameworkConfig` method.
- Update any calls to `validateFrameworkConfig` to remove the argument being passed.
@@ -197,3 +197,3 @@
 SqlTestFrameworkConfig cfg = SqlTestFrameworkConfig.fromURL(parts[1]);
-validateFrameworkConfig(cfg);
+validateFrameworkConfig();
 if (MultiComponentSupplier.class.isAssignableFrom(cfg.componentSupplier)) {
@@ -208,3 +208,3 @@
 @SuppressWarnings("unused")
-protected void validateFrameworkConfig(SqlTestFrameworkConfig cfg)
+protected void validateFrameworkConfig()
 {