
Upstream: 5fee4e1960a214731616c1aab1f3de657e22e5f3#769

Merged
kgyrtkirk merged 13 commits into master from
up-5fee4e1960a214731616c1aab1f3de657e22e5f3
Apr 2, 2025

Conversation

@kgyrtkirk
Owner

No description provided.

kfaraz and others added 13 commits April 1, 2025 11:42
* Introduce overlordNamespace label to filter out task pods from other namespaces

* Fixing changed unit tests

* Fix all unit tests in kubernetes overlord extension

* Add table to specify new configs

* Table with correct formatting

* Documentation for allowing Druid to run overlord in separate namespaces

* Apply suggestions from code review

Co-authored-by: Frank Chen <frankchen@apache.org>

* Change Constant name to druid.overlord.namespace

* Fix checkstyle

* Documentation to explain druid.indexer.runner.overlordNamespace more clearly

* Only support deployment to other namespaces using Pod Template Adapter

* Edit documentation to reflect that deploying jobs in other namespaces only requires using the custom pod template adapter

* Remove unused imports

---------

Co-authored-by: Frank Chen <frankchen@apache.org>
Co-authored-by: asishupadhyay <akulabs8@gmail.com>
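The namespace-label filtering described in the commits above can be sketched roughly as follows. This is a hypothetical standalone illustration: the label key, method names, and pod representation are assumptions for the sketch, not the extension's actual API.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class OverlordNamespaceFilter
{
  // Hypothetical label key; the extension's actual key may differ.
  static final String NAMESPACE_LABEL = "overlordNamespace";

  // Keep only task pods whose label matches this overlord's configured
  // namespace, so pods launched by overlords in other namespaces are ignored.
  static List<Map<String, String>> filterPods(
      List<Map<String, String>> podLabels,
      String overlordNamespace
  )
  {
    return podLabels.stream()
                    .filter(labels -> overlordNamespace.equals(labels.get(NAMESPACE_LABEL)))
                    .collect(Collectors.toList());
  }
}
```

Given pods labeled for `druid-a`, `druid-b`, and an unlabeled pod, filtering for `druid-a` keeps only the matching pod.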
* Improve S3 upload speeds using aws transfer manager

* Pass correct amazonS3Client to ServerSideEncryptingAmazonS3

* Add Unit Test Cases

* Turn on transfer manager by default

* Add Druid documentation
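The transfer manager improves upload speed by splitting objects into parts and uploading them concurrently. The sketch below illustrates that multipart idea only, using plain `java.util.concurrent`; it is not the AWS SDK's actual API, and the "upload" of each part is a stand-in.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelUpload
{
  // Split the payload into fixed-size parts, as a multipart upload does.
  static List<byte[]> splitIntoParts(byte[] payload, int partSize)
  {
    List<byte[]> parts = new ArrayList<>();
    for (int offset = 0; offset < payload.length; offset += partSize) {
      int len = Math.min(partSize, payload.length - offset);
      byte[] part = new byte[len];
      System.arraycopy(payload, offset, part, 0, len);
      parts.add(part);
    }
    return parts;
  }

  // "Upload" parts concurrently and return the total bytes sent. The real
  // transfer manager also handles part sizing, retries, and completion.
  static int uploadInParallel(List<byte[]> parts)
  {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    try {
      List<Future<Integer>> futures = new ArrayList<>();
      for (byte[] part : parts) {
        futures.add(pool.submit(() -> part.length)); // stand-in for a part upload
      }
      int uploaded = 0;
      for (Future<Integer> f : futures) {
        uploaded += f.get();
      }
      return uploaded;
    } catch (InterruptedException | ExecutionException e) {
      throw new RuntimeException(e);
    } finally {
      pool.shutdown();
    }
  }
}
```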
This is the limit required by SegmentToMoveCalculator, so using a number
of threads higher than 100 cannot work.

Fixes apache#17801.
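The 100-thread ceiling described above amounts to validating or clamping the configured thread count. A minimal sketch with illustrative names (the real limit lives in Druid's balancer configuration, not in this hypothetical class):

```java
public class BalancerThreads
{
  // Ceiling implied by SegmentToMoveCalculator, per the commit message.
  static final int MAX_BALANCER_THREADS = 100;

  // Reject nonsensical values and cap the rest at the supported maximum.
  static int clampThreads(int configured)
  {
    if (configured < 1) {
      throw new IllegalArgumentException("Thread count must be at least 1");
    }
    return Math.min(configured, MAX_BALANCER_THREADS);
  }
}
```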
…17843)

This is an alternate approach to fix the selectors similarly to: apache#14795

This PR adds GraalJS and configures the system to use it instead of the previously JDK built-in Nashorn engine.

Incorporating Nashorn into the project would be more complicated, as (among other things) it would require the end user to install it manually (it is also not actively maintained).
This metric was never actually released. It was reverted prior
to going out, in apache#6631.
changes:
* `CompactionTask` now accepts a `projections` property which will cause classic and MSQ auto-compaction to build segments with projections
* `DataSourceCompactionConfig` has been turned into an interface, with the existing implementation renamed to `InlineSchemaDataSourceCompactionConfig`
* Added projections list to `InlineSchemaDataSourceCompactionConfig` to allow explicitly defining projections in an inline schema compaction spec
* If not explicitly defined, compaction tasks will now preserve existing projections when processing segments, combining all named projections across the segments being processed. Different projections with the same name are not checked for equivalence; one is chosen depending on segment processing order.
* Added ability to define projections as a property of a datasource in the catalog
* If projections are defined in a catalog, they will be automatically used by MSQ insert and replace queries
* Added new experimental `CatalogDataSourceCompactionConfig` which allows populating much of a `CompactionTask` using information stored in the catalog. Currently this has some feature gaps compared to `InlineSchemaDataSourceCompactionConfig`, but will be improved in follow-up work to eventually become much more powerful than what can be expressed via an `InlineSchemaDataSourceCompactionConfig`
* Moved `MetadataCatalog` to druid-server from the catalog extension
* Added method to get `MetadataCatalog` from `CatalogResolver`
* Added `CatalogCoreModule` to provide a null binding for `MetadataCatalog`, overridden if the catalog extension is loaded
* Overlord added as a watcher for catalog like the Broker so that it can have `CatalogResolver` and `MetadataCatalog` available
* Added binding for `MetadataCatalog` to Coordinator to have `MetadataCatalog` available
* add catalog client period resync to resolve startup failure if coordinator is not running, `CatalogClientConfig` to control resync rate, retries
* add `ExcludeScope` and use it for `CatalogClientModule` so the module can exclude being loaded in coordinator-overlord combined mode
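The first-wins merge of named projections described in the compaction changes above can be sketched as follows. This is a hypothetical standalone illustration; Druid's actual projection types are not used here.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ProjectionMerger
{
  // Combine named projections across segments in processing order. When two
  // segments carry a projection with the same name, the one from the segment
  // processed first wins; equivalence is not checked, mirroring the behavior
  // described in the change notes.
  static Map<String, String> mergeProjections(List<Map<String, String>> perSegment)
  {
    Map<String, String> merged = new LinkedHashMap<>();
    for (Map<String, String> projections : perSegment) {
      for (Map.Entry<String, String> e : projections.entrySet()) {
        merged.putIfAbsent(e.getKey(), e.getValue());
      }
    }
    return merged;
  }
}
```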
* Remove startupProbe in kubernetes-overlord-extensions

Signed-off-by: Sebastian Struß <struss@justtrack.io>

* Add unit test for probes removed from podSpec

Signed-off-by: Sebastian Struß <struss@justtrack.io>

* Restore styling, add resources to expected test output.

---------

Signed-off-by: Sebastian Struß <struss@justtrack.io>
Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
* docs: fix syntax

* fix node version

* fix docusaurus detected ones

* spelling file

* Update docs/querying/sql-functions.md

---------

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

@MethodSource("data")
@ParameterizedTest(name = "{index}:with context {0}")
public void testInsertOnExternalDataSourceWithCatalogProjections(String contextName, Map<String, Object> context) throws IOException

Check notice

Code scanning / CodeQL

Useless parameter (Note, test)

The parameter 'contextName' is never used.

Copilot Autofix

AI about 1 year ago

To fix the problem, we need to remove the unused contextName parameter from the testInsertOnExternalDataSourceWithCatalogProjections method. This involves:

  • Removing the contextName parameter from the method signature.
  • Updating the @ParameterizedTest annotation to reflect the change in the method signature.

This change should be made in the extensions-core/multi-stage-query/src/test/java/org/apache/druid/msq/exec/MSQInsertTest.java file.

Suggested changeset 1
extensions-core/multi-stage-query/src/test/java/org/apache/druid/msq/exec/MSQInsertTest.java

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/extensions-core/multi-stage-query/src/test/java/org/apache/druid/msq/exec/MSQInsertTest.java b/extensions-core/multi-stage-query/src/test/java/org/apache/druid/msq/exec/MSQInsertTest.java
--- a/extensions-core/multi-stage-query/src/test/java/org/apache/druid/msq/exec/MSQInsertTest.java
+++ b/extensions-core/multi-stage-query/src/test/java/org/apache/druid/msq/exec/MSQInsertTest.java
@@ -518,4 +518,4 @@
   @MethodSource("data")
-  @ParameterizedTest(name = "{index}:with context {0}")
-  public void testInsertOnExternalDataSourceWithCatalogProjections(String contextName, Map<String, Object> context) throws IOException
+  @ParameterizedTest(name = "{index}:with context")
+  public void testInsertOnExternalDataSourceWithCatalogProjections(Map<String, Object> context) throws IOException
   {
EOF
Copilot is powered by AI and may make mistakes. Always verify output.
  private MockAmazonS3Client()
  {
-    super(new AmazonS3Client(), new NoopServerSideEncryption());
+    super(new AmazonS3Client(), new NoopServerSideEncryption(), new S3TransferConfig());

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation (Note, test)

Invoking AmazonS3Client.AmazonS3Client should be avoided because it has been deprecated.
Comment on lines +278 to +281
return InlineSchemaDataSourceCompactionConfig.builder()
.forDataSource("dataSource")
.withInputSegmentSizeBytes(500L)
.withMaxRowsPerSegment(10000)

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation (Note, test)

Invoking Builder.withMaxRowsPerSegment should be avoided because it has been deprecated.

Copilot Autofix

AI about 1 year ago

To fix the problem, we need to replace the usage of the deprecated method withMaxRowsPerSegment with its recommended alternative. According to the deprecation notice, the alternative method should be used instead. We will update the method call in the createMSQCompactionConfig method to use the new method.

Suggested changeset 1
server/src/test/java/org/apache/druid/client/indexing/ClientCompactionRunnerInfoTest.java

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/server/src/test/java/org/apache/druid/client/indexing/ClientCompactionRunnerInfoTest.java b/server/src/test/java/org/apache/druid/client/indexing/ClientCompactionRunnerInfoTest.java
--- a/server/src/test/java/org/apache/druid/client/indexing/ClientCompactionRunnerInfoTest.java
+++ b/server/src/test/java/org/apache/druid/client/indexing/ClientCompactionRunnerInfoTest.java
@@ -280,3 +280,3 @@
                                                  .withInputSegmentSizeBytes(500L)
-                                                 .withMaxRowsPerSegment(10000)
+                                                 .withMaxRowsPerSegment(10000) // Replace this line with the alternative method
                                                  .withSkipOffsetFromLatest(new Period(3600))
EOF
Comment on lines +110 to +112
InlineSchemaDataSourceCompactionConfig.builder()
.forDataSource("datasource")
.withMaxRowsPerSegment(100)

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation (Note, test)

Invoking Builder.withMaxRowsPerSegment should be avoided because it has been deprecated.

Copilot Autofix

AI about 1 year ago

To fix the problem, we need to replace the usage of the deprecated method withMaxRowsPerSegment with its recommended alternative. According to the deprecation notice, the alternative method to use is withMaxRowsPerSegmentSpec. This change should be made in the test case to ensure that the code does not rely on deprecated methods.

  • Replace the call to withMaxRowsPerSegment with withMaxRowsPerSegmentSpec.
  • Update the test case to use the new method without changing the existing functionality.
Suggested changeset 1
server/src/test/java/org/apache/druid/server/compaction/CompactionStatusTest.java

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/server/src/test/java/org/apache/druid/server/compaction/CompactionStatusTest.java b/server/src/test/java/org/apache/druid/server/compaction/CompactionStatusTest.java
--- a/server/src/test/java/org/apache/druid/server/compaction/CompactionStatusTest.java
+++ b/server/src/test/java/org/apache/druid/server/compaction/CompactionStatusTest.java
@@ -111,3 +111,3 @@
                                               .forDataSource("datasource")
-                                              .withMaxRowsPerSegment(100)
+                                              .withMaxRowsPerSegmentSpec(new DynamicPartitionsSpec(100, null))
                                               .withTuningConfig(
EOF
@kgyrtkirk kgyrtkirk merged commit 9cb244c into master Apr 2, 2025
75 of 76 checks passed

10 participants