MAPREDUCE-7500. Support optimistic file renames in the commit protocol #7425

Open: robreeves wants to merge 8 commits into base: trunk



@robreeves robreeves commented Feb 21, 2025

Description of PR

This PR adds an opt-in feature that commits files optimistically: it assumes there is no conflicting file or directory at the destination, which avoids a FileSystem.getFileStatus RPC. The default behavior is unchanged. To use the feature, set mapreduce.fileoutputcommitter.optimistic.file.commit.enabled=true.

This is useful for cases like Spark, where no destination conflict is expected and the FileSystem.getFileStatus RPC is wasted time. When I profiled the commit of a Spark job before this enhancement, this call accounted for 50% of the commit time (HDFS with intermittent latency in our environment).
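To illustrate the difference between the two commit styles, here is a minimal sketch. It is not the code in this PR: java.nio.file stands in for Hadoop's FileSystem API, the class and method names are hypothetical, and throwing on a conflict is a choice made for this sketch only.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Local-filesystem sketch; java.nio stands in for Hadoop's FileSystem API. */
final class OptimisticRenameSketch {

  /** Default style: probe the destination, clear a conflict, then rename. Returns the number of filesystem calls. */
  static int checkThenRename(Path src, Path dst) throws IOException {
    int calls = 1;                 // the existence probe (getFileStatus analogue)
    if (Files.exists(dst)) {
      Files.delete(dst);           // remove the conflicting file
      calls++;
    }
    Files.move(src, dst);          // the rename itself
    return calls + 1;
  }

  /** Optimistic style: rename without probing and surface a conflict as an error. */
  static int optimisticRename(Path src, Path dst) throws IOException {
    try {
      Files.move(src, dst);        // no probe; fails if dst already exists
      return 1;
    } catch (FileAlreadyExistsException e) {
      // Assumption for this sketch: a conflict is reported, not silently resolved.
      throw new IOException("Destination already exists: " + dst, e);
    }
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("commit-sketch");
    Path a = Files.createFile(dir.resolve("task-a"));
    Path b = Files.createFile(dir.resolve("task-b"));
    System.out.println("check-then-rename calls: " + checkThenRename(a, dir.resolve("part-0")));
    System.out.println("optimistic calls: " + optimisticRename(b, dir.resolve("part-1")));
  }
}
```

From Spark, Hadoop options are normally passed with the spark.hadoop. prefix, so the flag proposed here would presumably be set as spark.hadoop.mapreduce.fileoutputcommitter.optimistic.file.commit.enabled=true.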

How was this patch tested?

Correctness
I modified all of the FileOutputCommitter tests to run both with and without this configuration: the test class is now parameterized over the default configuration and the configuration with this change enabled. There may also be an opportunity to move the v1/v2 algorithm tests into the parameterized test, but I opted to leave that refactor for later to minimize unnecessary changes.

[INFO] Running org.apache.hadoop.mapreduce.lib.output.TestFileOutputCommitter
[INFO] Tests run: 44, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.946 s - in org.apache.hadoop.mapreduce.lib.output.TestFileOutputCommitter

Performance
I tested the performance of the changes using Spark writing to HDFS for partitioned and non-partitioned datasets. The summary of the improvement is:

  • For the non-partitioned commit, the average commit time decreased from 16.6min to 4.8min (71% improvement).
  • For the partitioned commit, the average commit time decreased from 4.3min to 1.5min (65% improvement).
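As a quick check of the quoted percentages, the relative reduction in average commit time can be computed from the figures above alone:

```latex
\frac{16.6 - 4.8}{16.6} = \frac{11.8}{16.6} \approx 0.711 \quad (\text{non-partitioned}),
\qquad
\frac{4.3 - 1.5}{4.3} = \frac{2.8}{4.3} \approx 0.651 \quad (\text{partitioned})
```

Expressed as speedups, these correspond to roughly 3.5x and 2.9x respectively.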


Non-partitioned test Spark script:

val fileCount = 5000
val path = "/path/temp_data_no_part"

spark.range(0, fileCount, 1, fileCount).write
  .mode(SaveMode.Overwrite)
  .option("path", path)
  .save()

Partitioned test Spark script:

val fileCount = 1000
val partitionCount = 5
val path = "/path/temp_data_part"

spark
  .range(0, fileCount, 1, fileCount)
  .withColumn("part", $"id" % lit(partitionCount))
  .write
  .mode(SaveMode.Overwrite)
  .option("path", path)
  .partitionBy("part")
  .save()

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 22s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 💚 mvninstall 23m 37s trunk passed
+1 💚 compile 0m 24s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 compile 0m 19s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 checkstyle 0m 26s trunk passed
+1 💚 mvnsite 0m 29s trunk passed
+1 💚 javadoc 0m 23s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 0m 18s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 spotbugs 0m 53s trunk passed
+1 💚 shadedclient 20m 1s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 21s the patch passed
+1 💚 compile 0m 19s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javac 0m 19s the patch passed
+1 💚 compile 0m 16s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 javac 0m 16s the patch passed
-1 ❌ blanks 0m 0s /blanks-eol.txt The patch has 4 line(s) that end in blanks. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
-0 ⚠️ checkstyle 0m 16s /results-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.txt hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core: The patch generated 3 new + 15 unchanged - 0 fixed = 18 total (was 15)
+1 💚 mvnsite 0m 21s the patch passed
+1 💚 javadoc 0m 15s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 0m 13s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 spotbugs 0m 50s the patch passed
+1 💚 shadedclient 19m 47s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 6m 47s hadoop-mapreduce-client-core in the patch passed.
+1 💚 asflicense 0m 21s The patch does not generate ASF License warnings.
77m 22s
Subsystem Report/Notes
Docker ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7425/1/artifact/out/Dockerfile
GITHUB PR #7425
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 24488530f9c6 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 7a76f03
Default Java Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7425/1/testReport/
Max. process+thread count 1212 (vs. ulimit of 5500)
modules C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7425/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 20s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 24m 17s trunk passed
+1 💚 compile 0m 23s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 compile 0m 21s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 checkstyle 0m 23s trunk passed
+1 💚 mvnsite 0m 25s trunk passed
+1 💚 javadoc 0m 19s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 0m 14s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 spotbugs 0m 55s trunk passed
+1 💚 shadedclient 20m 59s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 18s the patch passed
+1 💚 compile 0m 18s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javac 0m 18s the patch passed
+1 💚 compile 0m 17s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 javac 0m 17s the patch passed
-1 ❌ blanks 0m 0s /blanks-eol.txt The patch has 3 line(s) that end in blanks. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
-0 ⚠️ checkstyle 0m 15s /results-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.txt hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core: The patch generated 14 new + 26 unchanged - 0 fixed = 40 total (was 26)
+1 💚 mvnsite 0m 18s the patch passed
+1 💚 javadoc 0m 12s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 0m 11s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 spotbugs 0m 54s the patch passed
+1 💚 shadedclient 21m 1s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 6m 50s hadoop-mapreduce-client-core in the patch passed.
+1 💚 asflicense 0m 22s The patch does not generate ASF License warnings.
79m 51s
Subsystem Report/Notes
Docker ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7425/2/artifact/out/Dockerfile
GITHUB PR #7425
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux e792c9cdafad 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 2ffc045
Default Java Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7425/2/testReport/
Max. process+thread count 1620 (vs. ulimit of 5500)
modules C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7425/2/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 22s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 23m 59s trunk passed
+1 💚 compile 0m 24s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 compile 0m 19s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 checkstyle 0m 23s trunk passed
+1 💚 mvnsite 0m 29s trunk passed
+1 💚 javadoc 0m 18s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 0m 20s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 spotbugs 0m 53s trunk passed
+1 💚 shadedclient 21m 20s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 18s the patch passed
+1 💚 compile 0m 20s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javac 0m 20s the patch passed
+1 💚 compile 0m 17s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 javac 0m 17s the patch passed
-1 ❌ blanks 0m 0s /blanks-eol.txt The patch has 3 line(s) that end in blanks. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
-0 ⚠️ checkstyle 0m 18s /results-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.txt hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core: The patch generated 14 new + 26 unchanged - 0 fixed = 40 total (was 26)
+1 💚 mvnsite 0m 21s the patch passed
+1 💚 javadoc 0m 13s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 0m 13s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 spotbugs 0m 51s the patch passed
+1 💚 shadedclient 21m 16s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 6m 41s hadoop-mapreduce-client-core in the patch passed.
+1 💚 asflicense 0m 24s The patch does not generate ASF License warnings.
80m 39s
Subsystem Report/Notes
Docker ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7425/3/artifact/out/Dockerfile
GITHUB PR #7425
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 80d5bf7d32df 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 2ffc045
Default Java Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7425/3/testReport/
Max. process+thread count 1557 (vs. ulimit of 5500)
modules C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7425/3/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@robreeves robreeves marked this pull request as ready for review February 21, 2025 22:21
@robreeves
Author

robreeves commented Feb 21, 2025

@steveloughran @tasanuma can you take a look please?

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 20s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 23m 38s trunk passed
+1 💚 compile 0m 25s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 compile 0m 21s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 checkstyle 0m 25s trunk passed
+1 💚 mvnsite 0m 27s trunk passed
+1 💚 javadoc 0m 21s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 0m 19s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 spotbugs 0m 53s trunk passed
+1 💚 shadedclient 20m 59s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 19s the patch passed
+1 💚 compile 0m 18s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javac 0m 18s the patch passed
+1 💚 compile 0m 16s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 javac 0m 16s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 15s the patch passed
+1 💚 mvnsite 0m 20s the patch passed
+1 💚 javadoc 0m 12s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 0m 12s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 spotbugs 0m 50s the patch passed
+1 💚 shadedclient 21m 25s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 6m 28s hadoop-mapreduce-client-core in the patch passed.
+1 💚 asflicense 0m 21s The patch does not generate ASF License warnings.
79m 19s
Subsystem Report/Notes
Docker ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7425/4/artifact/out/Dockerfile
GITHUB PR #7425
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux faf013b7ddcd 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 8cab9f6
Default Java Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7425/4/testReport/
Max. process+thread count 1573 (vs. ulimit of 5500)
modules C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7425/4/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 20s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 24m 58s trunk passed
+1 💚 compile 0m 21s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 compile 0m 21s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 checkstyle 0m 22s trunk passed
+1 💚 mvnsite 0m 23s trunk passed
+1 💚 javadoc 0m 20s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 0m 16s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 spotbugs 0m 50s trunk passed
+1 💚 shadedclient 21m 44s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 19s the patch passed
+1 💚 compile 0m 20s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javac 0m 20s the patch passed
+1 💚 compile 0m 15s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 javac 0m 15s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 16s the patch passed
+1 💚 mvnsite 0m 21s the patch passed
+1 💚 javadoc 0m 12s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 0m 14s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 spotbugs 0m 52s the patch passed
+1 💚 shadedclient 20m 5s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 6m 47s hadoop-mapreduce-client-core in the patch passed.
+1 💚 asflicense 0m 25s The patch does not generate ASF License warnings.
80m 27s
Subsystem Report/Notes
Docker ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7425/5/artifact/out/Dockerfile
GITHUB PR #7425
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux b381cb483b1f 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 00daa94
Default Java Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7425/5/testReport/
Max. process+thread count 1557 (vs. ulimit of 5500)
modules C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7425/5/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@steveloughran
Contributor

I don't want to go anywhere near that code as it is (a) critical and (b) an incredibly complicated co-recursive mix of two algorithms where you have to step through with a debugger to work out WTF is going wrong.

It isn't suited to cloud storage and even with HDFS, it hits limits due to lack of parallelisation.

So sorry, no, I don't want to touch this. There's just too much risk.

At the same time, if we can speed up the manifest committer, there's appeal there. Glancing at the RenameFilesStage, it already remembers whether a dir had to be created, and if so it knows there's nothing at the far end. Otherwise it does that probe + delete.

An optimistic commit there may have benefits, especially with Azure, where the HEAD probe will double the IO load of any rename, and job commit can put a lot of strain on IO quotas.

Can you take a look there?

I'm going to recommend

  • start with base manifest committer and your normal workload
  • set up a dir mapreduce.manifest.committer.summary.report.directory. This will save the iostatistics summary of the job for viewing, including summaries of number and duration of delete calls.
  • see if you could make RenameFilesStage.commitOneFile more optimistic. Ultimately this'd have to be made optional, but for an experiment it'd be good to see what gains you get
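The "remember which dirs were created, skip the probe for those" idea can be sketched as below. This is a hypothetical illustration, not the actual RenameFilesStage code: the class, its methods and the probe counter are invented for the example, and java.nio.file stands in for Hadoop's FileSystem API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch; java.nio stands in for Hadoop's FileSystem API. */
final class CreatedDirAwareCommit {
  private final Set<Path> createdDirs = new HashSet<>();
  private int probes = 0;

  /** Create the destination directory if needed and remember whether this job created it. */
  void prepareDir(Path dir) throws IOException {
    if (!Files.isDirectory(dir)) {
      Files.createDirectories(dir);
      createdDirs.add(dir);          // freshly created, so nothing can be inside it yet
    }
  }

  /** Rename one file; probe and delete only when the parent directory already existed. */
  void commitOneFile(Path src, Path dst) throws IOException {
    if (!createdDirs.contains(dst.getParent())) {
      probes++;                      // counts the probe + delete step
      Files.deleteIfExists(dst);
    }
    Files.move(src, dst);
  }

  int probeCount() {
    return probes;
  }

  public static void main(String[] args) throws IOException {
    Path root = Files.createTempDirectory("manifest-sketch");
    Path fresh = root.resolve("fresh");
    Path existing = Files.createDirectory(root.resolve("existing"));
    CreatedDirAwareCommit commit = new CreatedDirAwareCommit();
    commit.prepareDir(fresh);
    commit.prepareDir(existing);
    commit.commitOneFile(Files.createFile(root.resolve("s1")), fresh.resolve("part-0"));
    commit.commitOneFile(Files.createFile(root.resolve("s2")), existing.resolve("part-0"));
    System.out.println("probes issued: " + commit.probeCount());
  }
}
```

An optimistic variant would also skip the probe for pre-existing directories, which is where any extra saving would have to come from.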

Test on HDFS: it works well there and is more performant than the older committer, due to the parallel renames.

A before/after test on abfs would be interesting too. ABFS is a special pain point here as it does have problems with rename under load; if that load can be reduced, then that's good. But if the parent dirs are actually created, such as when committing into an empty directory tree, I wouldn't expect any change at all.

@xinglin
Contributor

xinglin commented Feb 27, 2025

Hi @robreeves,

Can we use the OVERWRITE option with rename to achieve the same effect? Then we don't have to check for existence before the rename: just pass OVERWRITE to rename and it will overwrite the destination if it already exists.

* If OVERWRITE option is passed as an argument, rename overwrites the dst if
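The option referred to here is Options.Rename.OVERWRITE, as accepted by FileContext.rename. As a rough local analogue only, the sketch below uses java.nio.file's REPLACE_EXISTING to show a rename that needs no prior existence check. The class and method names are hypothetical, and the two APIs differ in details (for example, how directories are handled) that this sketch does not capture.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Local analogue of rename-with-overwrite; not Hadoop's FileContext API. */
final class OverwriteRenameSketch {

  /** Rename src to dst, replacing dst if it exists; no existence probe is issued first. */
  static void renameOverwriting(Path src, Path dst) throws IOException {
    Files.move(src, dst, StandardCopyOption.REPLACE_EXISTING);
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("overwrite-sketch");
    Path src = Files.write(dir.resolve("task-output"), "new".getBytes(StandardCharsets.UTF_8));
    Path dst = Files.write(dir.resolve("part-0"), "old".getBytes(StandardCharsets.UTF_8));
    renameOverwriting(src, dst);
    // The destination now holds the task output and the source is gone.
    System.out.println(new String(Files.readAllBytes(dst), StandardCharsets.UTF_8));
  }
}
```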

@steveloughran
Contributor

file context's rename/3 is its ow

@steveloughran
Contributor

(sorry, accidentally closed it while trying to cancel my comment)

  • I'm not going to accept any changes to the core committer as it is too risky to change
  • happy to review changes to ManifestOutputCommitter
  • It doesn't use the FileContext APIs, it uses FileSystem, with a special integration extension for Abfs where file renames can be given the etag of the source file; this delivers resilience on rename failures caused by transient overload/recovery of the abfs store.
