@ravinarayansingh (Contributor) commented Oct 9, 2025

Summary

NIFI-15077

  • Introduced MOVE_CONFLICT_RESOLUTION property to handle filename collisions during file move
  • Implemented logic for conflict resolution strategies (REPLACE, RENAME, IGNORE, etc.)
  • Refactored filename normalization and path handling utilities
  • Updated FetchFTP and FetchSFTP processors for enhanced conflict resolution support
  • Added corresponding unit tests and utility methods for unique filename generation in conflicts

Tracking

Please complete the following tracking steps prior to pull request creation.

Issue Tracking

Pull Request Tracking

  • Pull Request title starts with Apache NiFi Jira issue number, such as NIFI-00000
  • Pull Request commit message starts with Apache NiFi Jira issue number, such as NIFI-00000

Pull Request Formatting

  • Pull Request based on current revision of the main branch
  • Pull Request refers to a feature branch with one commit containing changes

Verification

Please indicate the verification steps performed prior to pull request creation.

Build

  • Build completed using ./mvnw clean install -P contrib-check
    • JDK 21
    • JDK 25

Licensing

  • New dependencies are compatible with the Apache License 2.0 according to the License Policy
  • New dependencies are documented in applicable LICENSE and NOTICE files

Documentation

  • Documentation formatting appears as expected in rendered files

…ons in FetchFileTransfer

- Introduced MOVE_CONFLICT_RESOLUTION property to handle filename collisions during file move
- Implemented logic for conflict resolution strategies (REPLACE, RENAME, IGNORE, etc.)
- Refactored filename normalization and path handling utilities
- Updated FetchFTP and FetchSFTP processors for enhanced conflict resolution support
- Added corresponding unit tests and utility methods for unique filename generation in conflicts

@exceptionfactory (Contributor) left a comment

Thanks for proposing this improvement @ravinarayansingh. The concept makes sense, and the basic approach looks good. I noted some recommendations on a few implementation and logging details.

ravinarayansingh and others added 2 commits October 22, 2025 21:13
…/src/main/java/org/apache/nifi/processor/util/file/transfer/FetchFileTransfer.java

Co-authored-by: David Handermann <[email protected]>
…lictUtil

- Replaced incrementing prefix-based approach with UUID prefix for unique filename generation
- Simplified conflict resolution logic in FetchFileTransfer during file move operations
- Improved logging consistency and updated descriptions for MOVE_DESTINATION_DIR property
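
For illustration, a minimal sketch of the UUID-prefix approach named in the commit message above. The class name, method name, and the "." separator are assumptions made for this example and are not taken from the pull request.

```java
import java.util.Objects;
import java.util.UUID;

// Hypothetical helper for illustration; not the code from the pull request.
final class UniqueFilenameSketch {

    private UniqueFilenameSketch() {
    }

    // Prefixes the filename with a random UUID so that a collision with an
    // existing file in the destination directory is extremely unlikely.
    // The "." separator is an assumption made for this sketch.
    static String withUuidPrefix(final String filename) {
        Objects.requireNonNull(filename, "filename");
        return UUID.randomUUID() + "." + filename;
    }
}
```

Compared with an incrementing prefix, a random prefix avoids repeated existence checks against the remote directory, at the cost of less readable filenames.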

@ravinarayansingh (Contributor Author) commented

> Thanks for proposing this improvement @ravinarayansingh. The concept makes sense, and the basic approach looks good. I noted some recommendations on a few implementation and logging details.

Thanks for the review, @exceptionfactory.
I’ve implemented the suggested changes. Please take another look when you have a chance.

- Adjusted indentation for property description for better readability

@exceptionfactory (Contributor) left a comment

Thanks for the updates @ravinarayansingh. One previous comment on description formatting remains, and I recommended some adjustments to the logging messages. After that, this should be ready to go.

ravinarayansingh and others added 4 commits October 25, 2025 13:57
…/src/main/java/org/apache/nifi/processor/util/file/transfer/FetchFileTransfer.java

Co-authored-by: David Handermann <[email protected]>
…/src/main/java/org/apache/nifi/processor/util/file/transfer/FetchFileTransfer.java

Co-authored-by: David Handermann <[email protected]>
…/src/main/java/org/apache/nifi/processor/util/file/transfer/FetchFileTransfer.java

Co-authored-by: David Handermann <[email protected]>
…/src/main/java/org/apache/nifi/processor/util/file/transfer/FetchFileTransfer.java

Co-authored-by: David Handermann <[email protected]>

@exceptionfactory (Contributor) left a comment

Thanks for the incremental adjustments @ravinarayansingh. It looks like just accepting the proposed changes resulted in compilation issues, and a few comments are still unaddressed. If you can work through the remaining items and update the pull request, that would be helpful.

- Moved unique filename generation method from FileTransferConflictUtil to FetchFileTransfer
- Simplified conflict resolution logic by removing unused utility class

@exceptionfactory (Contributor) left a comment

Thanks for the updates @ravinarayansingh. This looks better, but handling IGNORE, REJECT, and FAIL with the same case does not seem correct. The IGNORE and NONE options should not log a warning as previously mentioned, and the other two options require routing to a different location.

@ravinarayansingh (Contributor Author) commented

> Thanks for the updates @ravinarayansingh. This looks better, but handling IGNORE, REJECT, and FAIL with the same case does not seem correct. The IGNORE and NONE options should not log a warning as previously mentioned, and the other two options require routing to a different location.

Thanks for the feedback, @exceptionfactory.
Just to confirm, for the IGNORE and NONE conflict resolution strategies, I’ll keep the log level as INFO, like this:

```java
case FileTransfer.CONFLICT_RESOLUTION_IGNORE:
case FileTransfer.CONFLICT_RESOLUTION_NONE:
    getLogger().info("Configured to {} on move conflict for {}. Original remote file will be left in place.", strategy, flowFile);
    return;
```

For the other two options — REJECT and FAIL — should these also log a warning, or should they additionally route the FlowFile to a different relationship (for example, failure or reject)?

@exceptionfactory (Contributor) commented

> > Thanks for the updates @ravinarayansingh. This looks better, but handling IGNORE, REJECT, and FAIL with the same case does not seem correct. The IGNORE and NONE options should not log a warning as previously mentioned, and the other two options require routing to a different location.
>
> Thanks for the feedback, @exceptionfactory. Just to confirm, for the IGNORE and NONE conflict resolution strategies, I’ll keep the log level as INFO, like this:
>
> ```java
> case FileTransfer.CONFLICT_RESOLUTION_IGNORE:
> case FileTransfer.CONFLICT_RESOLUTION_NONE:
>     getLogger().info("Configured to {} on move conflict for {}. Original remote file will be left in place.", strategy, flowFile);
>     return;
> ```

Yes, for IGNORE and NONE, an INFO log looks good.

> For the other two options — REJECT and FAIL — should these also log a warning, or should they additionally route the FlowFile to a different relationship (for example, failure or reject)?

Those options should log a warning, and route to the appropriate relationship, based on the description for each value.

…TE and adjusted pre-commit handling for MOVE

- Centralized MOVE and DELETE completion handling with pre-commit routing for MOVE conflicts and post-commit actions for DELETE.
- Introduced detailed failure reasons for enhanced debugging.
- Added REL_REJECT and REL_FAILURE relationships for comprehensive failure management.
- Updated unit tests to reflect the new process flow and relationships.

@ravinarayansingh (Contributor Author) commented

Hi @exceptionfactory,

I have refactored the FetchFileTransfer processor to improve the pre-commit and post-commit handling logic. The change centralizes the MOVE and DELETE completion strategies, introduces detailed failure reasons for better debugging, and adds new REL_REJECT and REL_FAILURE relationships for more robust error management. The corresponding unit tests are updated to align with the revised flow.

Please have a look.

@exceptionfactory (Contributor) left a comment

Thanks for the updates @ravinarayansingh.

It looks like the changes resulted in some unit test failures:

TestFTP.testFetchFtpUnicodeFileName:350 expected: <1> but was: <0>
TestFTP.testFetchFtp:257 » IndexOutOfBounds Index 0 out of bounds for length 0

I should have considered this in my previous reply, but the changes surface the fact that the FetchFTP and FetchSFTP Processors do not have failure and reject relationships. They have more specific types of failure relationships (not found, permission denied, and comms failure), but those do not immediately align with the other failure conditions.

Introducing new relationships creates potential migration problems if not handled programmatically, such as auto-terminating or migrating. That could be done, but on balance, adding new relationships seems to introduce additional complexity for a conditional scenario, which is not ideal.

Taking another look at the options, I think the reject and fail completion strategies should be removed and not supported in this Processor. Instead, limiting support to ignore or rename would avoid introducing new relationships, and still provide more flexibility.

How does that sound?

@ravinarayansingh (Contributor Author) commented

> Thanks for the updates @ravinarayansingh.
>
> It looks like the changes resulted in some unit test failures:
>
> TestFTP.testFetchFtpUnicodeFileName:350 expected: <1> but was: <0>
> TestFTP.testFetchFtp:257 » IndexOutOfBounds Index 0 out of bounds for length 0
>
> I should have considered this in my previous reply, but the changes surface the fact that the FetchFTP and FetchSFTP Processors do not have failure and reject relationships. They have more specific types of failure relationships (not found, permission denied, and comms failure), but those do not immediately align with the other failure conditions.
>
> Introducing new relationships creates potential migration problems if not handled programmatically, such as auto-terminating or migrating. That could be done, but on balance, adding new relationships seems to introduce additional complexity for a conditional scenario, which is not ideal.
>
> Taking another look at the options, I think the reject and fail completion strategies should be removed and not supported in this Processor. Instead, limiting support to ignore or rename would avoid introducing new relationships, and still provide more flexibility.
>
> How does that sound?

Hi @exceptionfactory,
That makes perfect sense, and I agree with you. Limiting the completion strategies to ignore and rename sounds like the right approach. It keeps the design simpler and avoids unnecessary migration or compatibility issues.
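
As a rough illustration of what limiting the options to ignore and rename could look like, the sketch below decides the destination filename for a move. The enum, class, and method names are hypothetical, the set of existing names stands in for a remote directory lookup, and the UUID-based rename is an assumption; none of this is the code from the pull request.

```java
import java.util.Optional;
import java.util.Set;
import java.util.UUID;

// Hypothetical names throughout; only the IGNORE and RENAME options come from the discussion.
enum MoveConflictStrategy { IGNORE, RENAME }

final class MoveConflictSketch {

    private MoveConflictSketch() {
    }

    // Returns the filename to use in the destination directory, or an empty
    // Optional when the remote file should be left in place.
    static Optional<String> resolveTargetName(final String filename,
                                              final Set<String> existingNames,
                                              final MoveConflictStrategy strategy) {
        final Optional<String> target;
        if (!existingNames.contains(filename)) {
            // No conflict: keep the original filename.
            target = Optional.of(filename);
        } else if (strategy == MoveConflictStrategy.RENAME) {
            // Conflict with RENAME: choose a new name (UUID prefix is an assumption).
            target = Optional.of(UUID.randomUUID() + "." + filename);
        } else {
            // Conflict with IGNORE: skip the move.
            target = Optional.empty();
        }
        return target;
    }
}
```

Because both outcomes leave the FlowFile on its existing path, a decision of this shape needs no additional relationships.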

- Removed redundant conflict resolution strategies (REJECT, FAIL) and relationships (REL_REJECT, REL_FAILURE).
- Refactored MOVE conflict handling to streamline conditions and improve clarity.
- Introduced `getSimpleFilename` utility for filename parsing.
- Updated logging to prioritize successful processing while handling edge cases gracefully.

@ravinarayansingh (Contributor Author) commented

Hi @exceptionfactory

I’ve updated the code with the following changes to simplify the FetchFileTransfer completion strategy logic:

  • Removed redundant conflict resolution strategies (REJECT, FAIL) and relationships (REL_REJECT, REL_FAILURE).
  • Refactored MOVE conflict handling to streamline conditions and improve clarity.
  • Introduced getSimpleFilename utility for filename parsing.
  • Updated logging to prioritize successful processing while handling edge cases gracefully.

Please have a look.
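
For readers unfamiliar with the filename parsing step, here is one possible shape for a helper like getSimpleFilename. The method name comes from the list above, but the body is an assumption: it treats "/" as the only path separator and returns everything after the last one.

```java
import java.util.Objects;

// Illustrative sketch only; the actual implementation in the pull request may differ.
final class SimpleFilenameSketch {

    private SimpleFilenameSketch() {
    }

    // Returns the text after the last '/' in a remote path.
    // A path without '/' is returned unchanged; a path ending in '/' yields "".
    static String getSimpleFilename(final String remotePath) {
        Objects.requireNonNull(remotePath, "remotePath");
        final int lastSeparator = remotePath.lastIndexOf('/');
        return remotePath.substring(lastSeparator + 1);
    }
}
```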

@exceptionfactory (Contributor) left a comment

Thanks for the redirected approach @ravinarayansingh. The general scope now looks good in terms of output relationships. The new methods, however, present a number of implementation concerns and need some refactoring to avoid deep levels of nesting and numerous return statements.

- Centralized MOVE pre-commit and DELETE post-commit handling.
- Modularized MOVE conflict resolution with `resolveMoveConflict` and `handleMovePreCommit`.
- Enhanced logging and failure routing for improved debugging.

@exceptionfactory (Contributor) left a comment

Thanks for the updates @ravinarayansingh.

Unfortunately this change appears to be running into some fundamental limitations of the current design of FetchFileTransfer. Pushing down transfer relationship handling to separate methods, introducing short-circuit returns, and passing around large numbers of method arguments highlight some of the challenges.

It is noteworthy that the Completion Strategy property description indicates that if the strategy fails, a warning will be logged. With that in mind, it seems like the change should be much more localized, avoiding FlowFile transfer and other operations in short-circuit methods.

- Moved DELETE and MOVE strategies to pre-commit for centralized handling.
- Introduced detailed failure reason constants for all failure scenarios.
- Updated flow file routing to use REL_PERMISSION_DENIED and REL_COMMS_FAILURE based on a failure type.
- Enhanced unit tests to validate updated behavior and new failure conditions.

@ravinarayansingh (Contributor Author) commented

> Thanks for the updates @ravinarayansingh.
>
> Unfortunately this change appears to be running into some fundamental limitations of the current design of FetchFileTransfer. Pushing down transfer relationship handling to separate methods, introducing short-circuit returns, and passing around large numbers of method arguments highlight some of the challenges.
>
> It is noteworthy that the Completion Strategy property description indicates that if the strategy fails, a warning will be logged. With that in mind, it seems like the change should be much more localized, avoiding FlowFile transfer and other operations in short-circuit methods.

Thanks for the feedback @exceptionfactory

I’ve refactored the implementation to make the handling more localized and aligned with the intended design of FetchFileTransfer. Specifically:

  • Moved DELETE and MOVE strategies to a centralized pre-commit phase for consistent completion handling.
  • Introduced detailed failure reason constants to clearly identify each failure scenario.
  • Updated FlowFile routing to use REL_PERMISSION_DENIED and REL_COMMS_FAILURE based on the specific failure type, improving diagnostic clarity.
  • Enhanced unit tests to cover the updated completion strategy logic and verify the new failure conditions.

This refactoring keeps the behavior consistent with the Completion Strategy description while minimizing side effects and improving maintainability.
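
The routing by failure type described above might be sketched as follows. The failure reason constants and the mapping are hypothetical, since the conversation does not list them; only the REL_PERMISSION_DENIED and REL_COMMS_FAILURE names appear in the comment, and they are modeled here as a plain enum so the example does not depend on NiFi classes.

```java
// Hypothetical failure reasons; the constants introduced in the pull request are not listed in this conversation.
enum FailureReason { PERMISSION_DENIED, IO_ERROR }

// Stand-in for the NiFi relationships named in the comment above.
enum Route { REL_PERMISSION_DENIED, REL_COMMS_FAILURE }

final class FailureRoutingSketch {

    private FailureRoutingSketch() {
    }

    // Maps a failure reason to a route. Sending every non-permission failure
    // to REL_COMMS_FAILURE is an assumption made for this sketch.
    static Route routeFor(final FailureReason reason) {
        return reason == FailureReason.PERMISSION_DENIED
                ? Route.REL_PERMISSION_DENIED
                : Route.REL_COMMS_FAILURE;
    }
}
```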


2 participants