-
Notifications
You must be signed in to change notification settings - Fork 1.5k
GH-3279: Introduce StorageProvider to de-couple Hadoop vs NIO constructs #3280
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
ArnavBalyan
commented
Aug 25, 2025
- Add a minimal StorageProvider abstraction and selector that routes to hadoop vs non-hadoop classes.
- Make Hadoop I/O resolve FileSystem per path to correctly hit the right connector.
- This isolates local I/O from Hadoop today and sets up a clean interface to pull the correct concrete implementation at runtime.
* Opens the given path for reading. | ||
* | ||
* @param path fully-qualified file path (implementation specific semantics) | ||
* @return an InputStream that must be closed by the caller |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The JavaDoc should specify what specific IOException
subclasses might be thrown (e.g., FileNotFoundException,
AccessDeniedException) to help implementers and users handle
errors appropriately.
* @param path fully-qualified file path | ||
* @param overwrite whether an existing file should be replaced | ||
* @return an OutputStream that must be closed by the caller | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The interface should clarify stream ownership and closing
responsibilities. Consider returning AutoCloseable wrappers or
documenting that callers must use try-with-resources.
* @param path fully-qualified file path (implementation specific semantics) | ||
* @return an InputStream that must be closed by the caller | ||
*/ | ||
InputStream openForRead(String path) throws IOException; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the future if we want to use this abstraction to read parquet files, the code requires to create a SeekableInputStream
Can be more useful to use SeekableInputStream that already extends InputStream?