
Upstream: 4f607fb4bae1ddfa5fc1f9c819759cbd45fe0ae2#768

Merged
kgyrtkirk merged 12 commits into master from up-4f607fb4bae1ddfa5fc1f9c819759cbd45fe0ae2 on Apr 2, 2025

Conversation

@kgyrtkirk
Owner

No description provided.

kfaraz and others added 12 commits April 1, 2025 11:42
* Introduce overlordNamespace label to filter out task pods from other namespaces

* Fixing changed unit tests

* Fix all unit tests in kubernetes overlord extension

* Add table to specify new configs

* Table with correct formatting

* Documentation for allowing Druid to run overlord in separate namespaces

* Apply suggestions from code review

Co-authored-by: Frank Chen <frankchen@apache.org>

* Change Constant name to druid.overlord.namespace

* Fix checkstyle

* Documentation to explain druid.indexer.runner.overlordNamespace more clearly

* Only support deployment to other namespaces using Pod Template Adapter

* Edit documentation to reflect only requiring using custom pod template adapter for deploying jobs in other namespaces

* Remove unused imports

---------

Co-authored-by: Frank Chen <frankchen@apache.org>
Co-authored-by: asishupadhyay <akulabs8@gmail.com>
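From the commit messages above, the new setting is `druid.indexer.runner.overlordNamespace`. A minimal sketch of how it might appear in the Overlord's runtime.properties; the namespace value here is hypothetical, and the exact semantics are defined by the Druid documentation added in this PR:

```properties
# Hypothetical example: label task pods with the namespace of the Overlord
# that owns them, so task pods from other namespaces are filtered out.
druid.indexer.runner.overlordNamespace=druid-control
```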
* Improve S3 upload speeds using aws transfer manager

* Pass correct amazonS3Client to ServerSideEncryptingAmazonS3

* Add Unit Test Cases

* Turn on transfer manager by default

* Add Druid documentation
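The transfer-manager change above improves throughput by splitting a large upload into parts sent in parallel rather than issuing one sequential PUT. A minimal, self-contained sketch of that idea follows (plain JDK only; `uploadPart` is a hypothetical stand-in for the real S3 multipart call, not the AWS SDK API):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class MultipartUploadSketch {
    // Hypothetical stand-in for the network call that uploads one part to S3.
    static void uploadPart(int partNumber) {
        // in the real SDK this would be an UploadPartRequest
    }

    // Upload all parts concurrently on a small thread pool and return how
    // many completed; this is the parallelism the transfer manager provides.
    public static int uploadAll(int parts) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        AtomicInteger done = new AtomicInteger();
        for (int i = 1; i <= parts; i++) {
            final int part = i;
            pool.execute(() -> {
                uploadPart(part);
                done.incrementAndGet();
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return done.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("parts uploaded: " + uploadAll(8));
    }
}
```

In the AWS SDK itself this role is played by the TransferManager, which handles part sizing, retries, and completion tracking; the sketch only illustrates why concurrent parts beat a single stream.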
This is the limit required by SegmentToMoveCalculator, so using a number
of threads higher than 100 cannot work.

Fixes apache#17801.
…17843)

This is an alternate approach to fix the selectors, similarly to apache#14795

This PR adds GraalJS and configures the system to use it instead of the Nashorn engine that was previously built into the JDK.

Incorporating Nashorn into the project would be more complicated, as it would (among other things) require the end user to install it manually (it is also not actively maintained).
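Engine selection of this kind typically goes through the standard JSR-223 lookup. A small sketch of asking for GraalJS by name with a Nashorn fallback; on a plain modern JDK neither engine is on the classpath, so both lookups return null:

```java
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;

public class EngineLookup {
    // Returns the name of the available JavaScript engine, or "unavailable".
    // "graal.js" resolves only when the GraalJS jars are on the classpath;
    // "nashorn" was bundled with the JDK only up to JDK 14.
    public static String jsEngineName() {
        ScriptEngineManager mgr = new ScriptEngineManager();
        ScriptEngine engine = mgr.getEngineByName("graal.js");
        if (engine == null) {
            engine = mgr.getEngineByName("nashorn");
        }
        return engine == null ? "unavailable" : engine.getFactory().getEngineName();
    }

    public static void main(String[] args) {
        System.out.println("JS engine: " + jsEngineName());
    }
}
```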
This metric was never actually released. It was reverted prior
to going out, in apache#6631.
changes:
* `CompactionTask` now accepts a `projections` property which will cause classic and MSQ auto-compaction to build segments with projections
* `DataSourceCompactionConfig` has been turned into an interface, with the existing implementation renamed to `InlineSchemaDataSourceCompactionConfig`
* Added projections list to `InlineSchemaDataSourceCompactionConfig` to allow explicitly defining projections in an inline schema compaction spec
* if not explicitly defined, compaction tasks will now preserve existing projections when processing segments, combining all named projections across the segments being processed - different projections with the same name are not checked for equivalence, rather one will be chosen dependent on segment processing order.
* Added ability to define projections as a property of a datasource in the catalog
* If projections are defined in a catalog, they will be automatically used by MSQ insert and replace queries
* Added new experimental `CatalogDataSourceCompactionConfig` which allows populating much of a `CompactionTask` using information stored in the catalog. Currently this has some feature gaps compared to `InlineSchemaDataSourceCompactionConfig`, but will be improved in follow-up work to eventually become much more powerful than what can be expressed via a `InlineSchemaDataSourceCompactionConfig`
* Moved `MetadataCatalog` to druid-server from the catalog extension
* Added method to get `MetadataCatalog` from `CatalogResolver`
* Added `CatalogCoreModule` to provide a null binding for `MetadataCatalog`, overridden if the catalog extension is loaded
* Overlord added as a watcher for catalog like the Broker so that it can have `CatalogResolver` and `MetadataCatalog` available
* Added binding for `MetadataCatalog` to Coordinator to have `MetadataCatalog` available
* add catalog client period resync to resolve startup failure if coordinator is not running, `CatalogClientConfig` to control resync rate, retries
* add `ExcludeScope` and use it for `CatalogClientModule` so the module can exclude being loaded in coordinator-overlord combined mode
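A hypothetical sketch of what an inline-schema compaction spec with a projection might look like, based on the changelist above. Apart from the `projections` property named in the changelist, the field names and projection shape here are assumptions for illustration, not the confirmed schema:

```json
{
  "dataSource": "wikipedia",
  "skipOffsetFromLatest": "PT1H",
  "projections": [
    {
      "type": "aggregate",
      "name": "daily_totals",
      "groupingColumns": ["country"],
      "aggregators": [
        { "type": "longSum", "name": "added", "fieldName": "added" }
      ]
    }
  ]
}
```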
* Remove startupProbe in kubernetes-overlord-extensions

Signed-off-by: Sebastian Struß <struss@justtrack.io>

* Add unit test for probes removed from podSpec

Signed-off-by: Sebastian Struß <struss@justtrack.io>

* Restore styling, add resources to expected test output.

---------

Signed-off-by: Sebastian Struß <struss@justtrack.io>
Co-authored-by: Gian Merlino <gianmerlino@gmail.com>

@MethodSource("data")
@ParameterizedTest(name = "{index}:with context {0}")
public void testInsertOnExternalDataSourceWithCatalogProjections(String contextName, Map<String, Object> context) throws IOException

Code scanning / CodeQL notice (test): Useless parameter

The parameter 'contextName' is never used.

Copilot Autofix (AI, about 1 year ago):

To fix the problem, we need to remove the unused contextName parameter from the testInsertOnExternalDataSourceWithCatalogProjections method. This involves:

  • Removing the contextName parameter from the method signature.
  • Updating the @ParameterizedTest annotation to reflect the change in the method signature.

This change should be made in the extensions-core/multi-stage-query/src/test/java/org/apache/druid/msq/exec/MSQInsertTest.java file.

Suggested changeset 1
extensions-core/multi-stage-query/src/test/java/org/apache/druid/msq/exec/MSQInsertTest.java

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/extensions-core/multi-stage-query/src/test/java/org/apache/druid/msq/exec/MSQInsertTest.java b/extensions-core/multi-stage-query/src/test/java/org/apache/druid/msq/exec/MSQInsertTest.java
--- a/extensions-core/multi-stage-query/src/test/java/org/apache/druid/msq/exec/MSQInsertTest.java
+++ b/extensions-core/multi-stage-query/src/test/java/org/apache/druid/msq/exec/MSQInsertTest.java
@@ -518,4 +518,4 @@
   @MethodSource("data")
-  @ParameterizedTest(name = "{index}:with context {0}")
-  public void testInsertOnExternalDataSourceWithCatalogProjections(String contextName, Map<String, Object> context) throws IOException
+  @ParameterizedTest(name = "{index}:with context")
+  public void testInsertOnExternalDataSourceWithCatalogProjections(Map<String, Object> context) throws IOException
   {
EOF
private MockAmazonS3Client()
{
-  super(new AmazonS3Client(), new NoopServerSideEncryption());
+  super(new AmazonS3Client(), new NoopServerSideEncryption(), new S3TransferConfig());

Code scanning / CodeQL notice (test): Deprecated method or constructor invocation

Invoking AmazonS3Client.AmazonS3Client should be avoided because it has been deprecated.

Copilot Autofix (AI, about 1 year ago):

To fix the problem, we need to replace the deprecated AmazonS3Client constructor with the recommended alternative. According to the AWS SDK documentation, the AmazonS3ClientBuilder should be used to create an instance of AmazonS3Client. This change will ensure that the code is using the latest and supported method for creating an AmazonS3Client instance.

  1. Replace the new AmazonS3Client() call with AmazonS3ClientBuilder.standard().build().
  2. Ensure that the necessary import for AmazonS3ClientBuilder is added.
Suggested changeset 1
extensions-core/s3-extensions/src/test/java/org/apache/druid/storage/s3/S3DataSegmentMoverTest.java

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/extensions-core/s3-extensions/src/test/java/org/apache/druid/storage/s3/S3DataSegmentMoverTest.java b/extensions-core/s3-extensions/src/test/java/org/apache/druid/storage/s3/S3DataSegmentMoverTest.java
--- a/extensions-core/s3-extensions/src/test/java/org/apache/druid/storage/s3/S3DataSegmentMoverTest.java
+++ b/extensions-core/s3-extensions/src/test/java/org/apache/druid/storage/s3/S3DataSegmentMoverTest.java
@@ -21,3 +21,3 @@
 
-import com.amazonaws.services.s3.AmazonS3Client;
+import com.amazonaws.services.s3.AmazonS3ClientBuilder;
 import com.amazonaws.services.s3.model.AccessControlList;
@@ -203,3 +203,3 @@
     {
-      super(new AmazonS3Client(), new NoopServerSideEncryption(), new S3TransferConfig());
+      super(AmazonS3ClientBuilder.standard().build(), new NoopServerSideEncryption(), new S3TransferConfig());
     }
EOF
Comment on lines +278 to +281
return InlineSchemaDataSourceCompactionConfig.builder()
.forDataSource("dataSource")
.withInputSegmentSizeBytes(500L)
.withMaxRowsPerSegment(10000)

Code scanning / CodeQL notice (test): Deprecated method or constructor invocation

Invoking Builder.withMaxRowsPerSegment should be avoided because it has been deprecated.

Copilot Autofix (AI, about 1 year ago):

To fix the problem, we need to replace the deprecated method withMaxRowsPerSegment with its recommended alternative. According to the deprecation notice, the alternative method should be used instead. We will update the method call in the createMSQCompactionConfig method to use the new method.

Suggested changeset 1
server/src/test/java/org/apache/druid/client/indexing/ClientCompactionRunnerInfoTest.java

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/server/src/test/java/org/apache/druid/client/indexing/ClientCompactionRunnerInfoTest.java b/server/src/test/java/org/apache/druid/client/indexing/ClientCompactionRunnerInfoTest.java
--- a/server/src/test/java/org/apache/druid/client/indexing/ClientCompactionRunnerInfoTest.java
+++ b/server/src/test/java/org/apache/druid/client/indexing/ClientCompactionRunnerInfoTest.java
@@ -280,3 +280,3 @@
                                                  .withInputSegmentSizeBytes(500L)
-                                                 .withMaxRowsPerSegment(10000)
+                                                 .withMaxRowsPerSegmentSpec(10000)
                                                  .withSkipOffsetFromLatest(new Period(3600))
EOF
Comment on lines +110 to +112
InlineSchemaDataSourceCompactionConfig.builder()
.forDataSource("datasource")
.withMaxRowsPerSegment(100)

Code scanning / CodeQL notice (test): Deprecated method or constructor invocation

Invoking Builder.withMaxRowsPerSegment should be avoided because it has been deprecated.
kgyrtkirk merged commit 4f607fb into master on Apr 2, 2025
75 of 76 checks passed

10 participants