
HADOOP-18893. Make Trash Policy pluggable for different FileSystems #6061

Open
wants to merge 1 commit into base: trunk
Conversation

@mehakmeet mehakmeet commented Sep 13, 2023

Description of PR

Follow up from #4729

How was this patch tested?

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@mehakmeet

Putting up an initial PR for Trash policy separation. Need to verify this in a cluster still, but have added a small test for S3A and ABFS. HDFS has some good tests for Trash too.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 29s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+1 💚 @author 0m 1s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 14m 44s Maven dependency ordering for branch
+1 💚 mvninstall 20m 0s trunk passed
+1 💚 compile 10m 40s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 compile 9m 25s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 checkstyle 2m 22s trunk passed
+1 💚 mvnsite 2m 39s trunk passed
+1 💚 javadoc 2m 10s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 2m 0s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 3m 32s trunk passed
+1 💚 shadedclient 23m 22s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 23s Maven dependency ordering for patch
+1 💚 mvninstall 1m 26s the patch passed
+1 💚 compile 10m 36s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javac 10m 36s the patch passed
+1 💚 compile 9m 34s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 javac 9m 34s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 2m 18s /results-checkstyle-root.txt root: The patch generated 5 new + 73 unchanged - 5 fixed = 78 total (was 78)
+1 💚 mvnsite 2m 37s the patch passed
+1 💚 javadoc 2m 6s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 2m 0s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
-1 ❌ spotbugs 1m 45s /new-spotbugs-hadoop-common-project_hadoop-common.html hadoop-common-project/hadoop-common generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)
+1 💚 shadedclient 21m 30s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 16m 29s /patch-unit-hadoop-common-project_hadoop-common.txt hadoop-common in the patch passed.
+1 💚 unit 2m 39s hadoop-aws in the patch passed.
+1 💚 unit 2m 6s hadoop-azure in the patch passed.
+1 💚 asflicense 0m 49s The patch does not generate ASF License warnings.
175m 18s
Reason Tests
SpotBugs module:hadoop-common-project/hadoop-common
Dead store to dir in org.apache.hadoop.fs.TrashPolicyDefault.deleteCheckpoint(Path) at TrashPolicyDefault.java:[line 426]
Failed junit tests hadoop.conf.TestCommonConfigurationFields
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6061/1/artifact/out/Dockerfile
GITHUB PR #6061
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux 51904d531945 4.15.0-212-generic #223-Ubuntu SMP Tue May 23 13:09:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 754a375
Default Java Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6061/1/testReport/
Max. process+thread count 1276 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws hadoop-tools/hadoop-azure U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6061/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

* @param fs the file system to be used
* @return an instance of TrashPolicy
*/
@SuppressWarnings("ClassReferencesSubclass")
public static TrashPolicy getInstance(Configuration conf, FileSystem fs) {

does the conf come from the filesystem? if so, it'd let us do per-bucket stuff. if not, well, it's possibly too late

@@ -193,7 +193,7 @@ public boolean moveToTrash(Path path) throws IOException {
// move to current trash
fs.rename(path, trashPath,

should check the return value here. "false" isn't that useful, but it does mean the rename failed
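The pattern the reviewer is asking for might look like this — a minimal sketch using `java.io.File`'s boolean-returning `renameTo` as a stand-in for Hadoop's `FileSystem.rename`; the helper name is hypothetical:

```java
import java.io.File;
import java.io.IOException;

public class RenameCheck {
    // Surface a failed rename instead of silently dropping the boolean.
    static void moveOrFail(File src, File dst) throws IOException {
        if (!src.renameTo(dst)) {
            throw new IOException("Failed to rename " + src + " to " + dst);
        }
    }
}
```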

@@ -347,31 +354,42 @@ private void createCheckpoint(Path trashRoot, Date date) throws IOException {
while (true) {
try {
fs.rename(current, checkpoint, Rename.NONE);

again, handle failure here

@@ -347,31 +354,42 @@ private void createCheckpoint(Path trashRoot, Date date) throws IOException {
while (true) {
try {
fs.rename(current, checkpoint, Rename.NONE);
- LOG.info("Created trash checkpoint: " + checkpoint.toUri().getPath());
+ LOG.info("Created trash checkpoint: {}", checkpoint.toUri().getPath());
break;
} catch (FileAlreadyExistsException e) {
if (++attempt > 1000) {

let's make this a constant
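The suggested change, sketched standalone (the constant name is hypothetical):

```java
public class CheckpointRetry {
    // Name the retry bound rather than burying the literal 1000 in the loop.
    static final int MAX_CHECKPOINT_ATTEMPTS = 1000;

    // Returns how many attempts an operation took before succeeding,
    // giving up once the bound is exceeded.
    static int attemptsNeeded(java.util.function.IntPredicate succeeds) {
        int attempt = 0;
        while (true) {
            if (succeeds.test(++attempt)) {
                return attempt;
            }
            if (attempt > MAX_CHECKPOINT_ATTEMPTS) {
                throw new IllegalStateException(
                    "Gave up after " + MAX_CHECKPOINT_ATTEMPTS + " attempts");
            }
        }
    }
}
```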

} catch (FileNotFoundException fnfe) {
-   return;
+   return 0;

note that the listStatus calls may not raise the FNFE until the next/hasNext call, so catch it in the iterator below too
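The hazard generalizes beyond Hadoop's listing API; here is a stdlib sketch (java.nio standing in for `listStatus`) where the directory can vanish either when the listing is opened or mid-iteration, so both paths are guarded:

```java
import java.io.IOException;
import java.nio.file.DirectoryIteratorException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class SafeListing {
    // List a directory, treating "it disappeared underneath us" as empty.
    static List<Path> listOrEmpty(Path dir) throws IOException {
        List<Path> out = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {      // iteration itself can fail too
                out.add(p);
            }
        } catch (NoSuchFileException | DirectoryIteratorException e) {
            return new ArrayList<>();    // nothing left to clean up
        }
        return out;
    }
}
```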

Configuration conf = new Configuration();
Trash trash = new Trash(getFileSystem(), conf);
assertEquals("Mismatch in Trash Policy set by the config",
trash.getTrashPolicy().getClass(), EmptyTrashPolicy.class);

prefer assertJ
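The AssertJ form of that assertion might read as follows (assuming AssertJ is on the test classpath, as it is elsewhere in Hadoop's test modules):

```java
import static org.assertj.core.api.Assertions.assertThat;

assertThat(trash.getTrashPolicy())
    .describedAs("Trash policy set by the config")
    .isInstanceOf(EmptyTrashPolicy.class);
```

This also fixes the argument-order confusion that `assertEquals(message, expected, actual)` invites.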

@Test
public void testTrashSetToEmptyTrashPolicy() throws IOException {
Configuration conf = new Configuration();
Trash trash = new Trash(getFileSystem(), conf);

use conf from filesystem

@@ -3450,23 +3454,29 @@ public Path getTrashRoot(Path path) {
public Collection<FileStatus> getTrashRoots(boolean allUsers) {
Path userHome = new Path(getHomeDirectory().toUri().getPath());
List<FileStatus> ret = new ArrayList<>();
// an operation to look up a path status and add it to the return list

nice bit of work

if (getFileSystem().delete(path, true)) {
LOG.info("Deleted trash checkpoint: {}", path);
} else {
LOG.warn("Couldn't delete checkpoint: {}. Ignoring.", path);

add a check to see if path exists. if it doesn't exist any more, just log at info rather than warn
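The suggested refinement, sketched with `java.io.File` standing in for the Hadoop FileSystem API (the method and the returned messages are illustrative; in the patch they would be `LOG.info`/`LOG.warn` calls):

```java
import java.io.File;

public class CheckpointCleanup {
    // Distinguish "already gone" (benign, info-level) from a real
    // failure to delete (warn-level).
    static String deleteCheckpoint(File path) {
        if (path.delete()) {
            return "Deleted trash checkpoint: " + path;      // info
        } else if (!path.exists()) {
            return "Checkpoint already deleted: " + path;    // info, not warn
        } else {
            return "Couldn't delete checkpoint: " + path + ". Ignoring.";  // warn
        }
    }
}
```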


/**
* parse the name of a checkpoint to extract its timestamp.
* Uses the Hadoop 0.23 checkpoint as well as the older version (!).

could be time to retire the older-version code if it is overly complex
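The two-format parse the javadoc describes can be sketched with plain `SimpleDateFormat` — the patterns here, with and without a seconds field, are an assumption about the checkpoint naming, not taken from the patch:

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class CheckpointName {
    // Try the current checkpoint format first, then fall back to the
    // older, seconds-free format.
    static Date parseCheckpoint(String name) throws ParseException {
        try {
            return new SimpleDateFormat("yyMMddHHmmss").parse(name);
        } catch (ParseException e) {
            return new SimpleDateFormat("yyMMddHHmm").parse(name);
        }
    }
}
```

Supporting both formats forever is exactly the complexity the comment above is questioning.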
