
HADOOP-19282. STSClientFactory: do not use URIBuilder #7483

Open · wants to merge 1 commit into trunk

Conversation

@LDVSOFT commented Mar 7, 2025

Description of PR

URIBuilder came from the AWS SDK for Java v2, specifically from the Apache HTTP Client that is shaded into the SDK bundle. That is a problem for users who would rather not pull in the whole AWS SDK bundle: only about three modules are actually needed (s3, s3-transfer & sts), but depending on the shaded class can break against unshaded dependency versions. Since a plain java.net.URI constructor achieves the same result here, I switched to it as the preferred option.
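For context, here is a minimal standalone sketch of the idea; the class and method names are mine for illustration, not the actual STSClientFactory code:

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical sketch: build an STS endpoint URI with java.net.URI
// directly, instead of the URIBuilder class from the HTTP client
// shaded into the AWS SDK bundle.
public class StsEndpointUri {

    // scheme + host only; path and fragment stay null
    static URI stsEndpoint(String host) {
        try {
            return new URI("https", host, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Invalid STS host: " + host, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(stsEndpoint("sts.eu-west-1.amazonaws.com"));
        // prints https://sts.eu-west-1.amazonaws.com
    }
}
```

The four-argument URI constructor performs the same percent-encoding of its components that URIBuilder would, so nothing is lost for a simple scheme-plus-host endpoint.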

How was this patch tested?

I've run the test suite against an eu-west-1 bucket, without scaling/load tests since the change shouldn't affect those. To be exact, with something like this:

auth-keys.xml
<configuration>
<property>
    <name>test.fs.s3a.name</name>
    <value>s3a://hadoop-test-‹edited›</value>
</property>

<property>
    <name>test.fs.s3a.encryption.enabled</name>
    <value>false</value>
    <description>Don't wanna</description>
</property>

<property>
    <name>test.fs.s3a.create.acl.enabled</name>
    <value>false</value>
    <description>disabled on server</description>
</property>

<property>
    <name>fs.s3a.endpoint.region</name>
    <value>eu-west-1</value>
</property>

<property>
    <name>fs.s3a.assumed.role.sts.endpoint.region</name>
    <value>eu-west-1</value>
</property>

<property>
    <name>test.sts.endpoint</name>
    <description>Specific endpoint to use for STS requests.</description>
    <value>sts.eu-west-1.amazonaws.com</value>
</property>

<property>
    <name>fs.s3a.assumed.role.sts.endpoint</name>
    <value>${test.sts.endpoint}</value>
</property>

<property>
    <name>fs.contract.test.fs.s3a</name>
    <value>${test.fs.s3a.name}</value>
</property>

<property>
    <!-- Runs under aws-vault --no-session -->
    <name>fs.s3a.aws.credentials.provider</name>
    <value>software.amazon.awssdk.auth.credentials.EnvironmentVariableCredentialsProvider</value>
</property>

<property>
    <!-- Runs under aws-vault --no-session -->
    <name>fs.s3a.assumed.role.credentials.provider</name>
    <value>software.amazon.awssdk.auth.credentials.EnvironmentVariableCredentialsProvider</value>
</property>

<property>
    <name>fs.s3a.assumed.role.arn</name>
    <value>arn:aws:iam::‹edited›:role/hadoop_test_role_‹edited›</value>
</property>

<!-- is there a typo in the docs? -->
<property>
    <name>fs.s3a.delegation.token.endpoint</name>
    <value>${fs.s3a.assumed.role.sts.endpoint}</value>
</property>
</configuration>

Almost all tests pass:

  • I wasn't able to make ITestDelegatedMRJob work. It probably cleans out the environment somewhere, so my environment-provided AWS credentials didn't work. It also looks parameterized, and I can't tell from the Surefire/Failsafe reports which parameter causes the problem.
  • ITestRoleDelegationInFilesystem/ITestSessionDelegationInFilesystem fail a bit in missmatch2, but I'm really unfamiliar with credentials delegation. Probably the environment variables get lost on their way?
  • Sometimes ITestS3APrefetchingInputStream fails with size 0.
  • To be honest, those also fail for me on trunk!

Given that the other tests pass and the small scope of the change, I think this is fine and the failures come from a misconfiguration in my test setup. If you know how to fix the setup, I can rerun with other options.

Also, I found this bug while repackaging Spark for a local K8S deployment; with this fix, the STS configuration options work even when I replace the AWS SDK bundle with only the required SDK modules.
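For readers trying the same repackaging, the swap described above looks roughly like this in Maven terms. This is a hedged sketch: the artifact ids are my reading of the SDK's published modules, and the version property is a placeholder.

```xml
<!-- Sketch: replacing the all-in-one aws-sdk-bundle with only the
     modules the description mentions. Version is a placeholder. -->
<dependency>
  <groupId>software.amazon.awssdk</groupId>
  <artifactId>s3</artifactId>
  <version>${aws.sdk.version}</version>
</dependency>
<dependency>
  <groupId>software.amazon.awssdk</groupId>
  <artifactId>s3-transfer-manager</artifactId>
  <version>${aws.sdk.version}</version>
</dependency>
<dependency>
  <groupId>software.amazon.awssdk</groupId>
  <artifactId>sts</artifactId>
  <version>${aws.sdk.version}</version>
</dependency>
```

Without this patch, the unshaded modules lack the shaded URIBuilder class, which is why the STS options failed before the change.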

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

Sign-off

I give a license to the Apache Software Foundation to use this code, as required under §5 of the Apache License.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 24s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
-1 ❌ mvninstall 0m 22s /branch-mvninstall-root.txt root in trunk failed.
-1 ❌ compile 0m 22s /branch-compile-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04.txt hadoop-aws in trunk failed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04.
-1 ❌ compile 0m 22s /branch-compile-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06.txt hadoop-aws in trunk failed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06.
-0 ⚠️ checkstyle 0m 20s /buildtool-branch-checkstyle-hadoop-tools_hadoop-aws.txt The patch fails to run checkstyle in hadoop-aws
-1 ❌ mvnsite 0m 21s /branch-mvnsite-hadoop-tools_hadoop-aws.txt hadoop-aws in trunk failed.
-1 ❌ javadoc 0m 11s /branch-javadoc-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04.txt hadoop-aws in trunk failed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04.
-1 ❌ javadoc 0m 23s /branch-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06.txt hadoop-aws in trunk failed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06.
-1 ❌ spotbugs 0m 22s /branch-spotbugs-hadoop-tools_hadoop-aws.txt hadoop-aws in trunk failed.
+1 💚 shadedclient 2m 22s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
-1 ❌ mvninstall 1m 46s /patch-mvninstall-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
+1 💚 compile 1m 9s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javac 1m 9s the patch passed
+1 💚 compile 0m 16s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 javac 0m 16s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 18s /buildtool-patch-checkstyle-hadoop-tools_hadoop-aws.txt The patch fails to run checkstyle in hadoop-aws
-1 ❌ mvnsite 0m 20s /patch-mvnsite-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
-1 ❌ javadoc 0m 22s /patch-javadoc-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04.txt hadoop-aws in the patch failed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04.
-1 ❌ javadoc 0m 15s /patch-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06.txt hadoop-aws in the patch failed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06.
+1 💚 spotbugs 0m 46s the patch passed
-1 ❌ shadedclient 2m 5s patch has errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 0m 21s /patch-unit-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
+0 🆗 asflicense 0m 23s ASF License check generated no output?
14m 13s
Subsystem Report/Notes
Docker ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7483/1/artifact/out/Dockerfile
GITHUB PR #7483
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 9be25db600bb 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / ec866e0
Default Java Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7483/1/testReport/
Max. process+thread count 88 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7483/1/console
versions git=2.25.1 maven=3.6.3
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@LDVSOFT (author) commented Mar 7, 2025

Hm. Looks like the CI run wasn't very good: the only error I found in the linked logs is Maven running out of threads/memory. Do I need to do anything?
