ArnavBalyan
Member

  • Add a minimal StorageProvider abstraction and a selector that routes each path to either the Hadoop or the non-Hadoop implementation.
  • Make Hadoop I/O resolve the FileSystem per path so that each path reaches the correct connector.
  • This isolates local I/O from Hadoop today and sets up a clean interface for choosing the correct concrete implementation at runtime.
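
For discussion, here is a minimal sketch of the routing idea described above; it is not the code in this PR. Only `StorageProvider` and `openForRead` come from the diff. `openForWrite`, `LocalStorageProvider`, `StorageProviderSelector`, and `schemeOf` are hypothetical names, and the Hadoop-backed provider is left as a constructor argument rather than implemented (in Hadoop itself, per-path resolution is typically done with `Path.getFileSystem(Configuration)`).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Locale;

// Assumed shape of the abstraction; openForWrite is a hypothetical method name.
interface StorageProvider {
  InputStream openForRead(String path) throws IOException;

  OutputStream openForWrite(String path, boolean overwrite) throws IOException;
}

// Local implementation built only on java.nio, with no Hadoop classes involved.
final class LocalStorageProvider implements StorageProvider {
  @Override
  public InputStream openForRead(String path) throws IOException {
    return Files.newInputStream(toLocalPath(path));
  }

  @Override
  public OutputStream openForWrite(String path, boolean overwrite) throws IOException {
    Path p = toLocalPath(path);
    // CREATE_NEW fails with FileAlreadyExistsException when the file already exists.
    return overwrite
        ? Files.newOutputStream(p, StandardOpenOption.CREATE,
            StandardOpenOption.TRUNCATE_EXISTING, StandardOpenOption.WRITE)
        : Files.newOutputStream(p, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
  }

  private static Path toLocalPath(String path) {
    return path.startsWith("file:") ? Paths.get(URI.create(path)) : Paths.get(path);
  }
}

// Hypothetical selector: paths with no scheme or a file: scheme stay local,
// everything else goes to the Hadoop-backed provider supplied by the caller.
final class StorageProviderSelector {
  private final StorageProvider local;
  private final StorageProvider hadoop;

  StorageProviderSelector(StorageProvider local, StorageProvider hadoop) {
    this.local = local;
    this.hadoop = hadoop;
  }

  StorageProvider select(String path) {
    return "file".equals(schemeOf(path)) ? local : hadoop;
  }

  // Treats "scheme:/..." as a scheme; single-letter prefixes are assumed to be drive letters.
  static String schemeOf(String path) {
    int i = path.indexOf(":/");
    return i > 1 ? path.substring(0, i).toLowerCase(Locale.ROOT) : "file";
  }
}
```

With this shape, callers only ever see `StorageProvider`, so the local path never has to load Hadoop classes.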

* Opens the given path for reading.
*
* @param path fully-qualified file path (implementation specific semantics)
* @return an InputStream that must be closed by the caller
Contributor

The Javadoc should state which IOException subclasses may be
thrown (e.g., FileNotFoundException, AccessDeniedException) so that
implementers and callers can handle errors appropriately.
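
A sketch of what that could look like, under stated assumptions: the `@throws` wording is only a suggestion, and `DocumentedReader`, `LocalDocumentedReader`, and `ReadOutcome` are hypothetical names used for illustration. Note that `Files.newInputStream` reports a missing file as `NoSuchFileException`, so a java.nio-based implementation would need to translate it if the contract promises `FileNotFoundException`.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.AccessDeniedException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Paths;

interface DocumentedReader {
  /**
   * Opens the given path for reading.
   *
   * @param path fully-qualified file path (implementation specific semantics)
   * @return an InputStream that must be closed by the caller
   * @throws FileNotFoundException if no file exists at {@code path}
   * @throws AccessDeniedException if the caller is not permitted to read {@code path}
   * @throws IOException for any other I/O failure
   */
  InputStream openForRead(String path) throws IOException;
}

// Hypothetical local implementation that honours the documented exception types.
final class LocalDocumentedReader implements DocumentedReader {
  @Override
  public InputStream openForRead(String path) throws IOException {
    try {
      return Files.newInputStream(Paths.get(path));
    } catch (NoSuchFileException e) {
      // Translate the java.nio exception into the type promised by the Javadoc.
      FileNotFoundException fnf = new FileNotFoundException(path);
      fnf.initCause(e);
      throw fnf;
    }
  }
}

// Shows how a caller can branch on the documented subclasses.
final class ReadOutcome {
  static String classify(DocumentedReader reader, String path) {
    try (InputStream in = reader.openForRead(path)) {
      return "ok";
    } catch (FileNotFoundException e) {
      return "missing";
    } catch (AccessDeniedException e) {
      return "denied";
    } catch (IOException e) {
      return "io-error";
    }
  }
}
```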

* @param path fully-qualified file path
* @param overwrite whether an existing file should be replaced
* @return an OutputStream that must be closed by the caller
*/
Contributor

The interface should clarify stream ownership and closing
responsibilities. Consider returning AutoCloseable wrappers or
documenting that callers must use try-with-resources.
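
To illustrate the caller-closes contract, a small sketch follows. `InputStream` and `OutputStream` already implement `Closeable` (and therefore `AutoCloseable`), so try-with-resources works on the returned streams without a wrapper. `CloseTrackingStream` and `OwnershipDemo` are hypothetical helpers that exist only to make the close observable.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stream that records whether close() was called.
final class CloseTrackingStream extends ByteArrayInputStream {
  private boolean closed;

  CloseTrackingStream(byte[] data) {
    super(data);
  }

  @Override
  public void close() throws IOException {
    closed = true;
    super.close();
  }

  boolean isClosed() {
    return closed;
  }
}

final class OwnershipDemo {
  // The caller owns the stream it was handed and closes it via try-with-resources,
  // even if read() throws part-way through.
  static int countBytes(InputStream in) throws IOException {
    try (InputStream s = in) {
      int n = 0;
      while (s.read() != -1) {
        n++;
      }
      return n;
    }
  }
}
```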

* @param path fully-qualified file path (implementation specific semantics)
* @return an InputStream that must be closed by the caller
*/
InputStream openForRead(String path) throws IOException;
Contributor

If we later want to use this abstraction to read Parquet files, the calling code will need to create a SeekableInputStream.
Would it be more useful to return SeekableInputStream, which already extends InputStream?
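
To make the point concrete, here is a dependency-free sketch of why random access matters: the Parquet footer sits at the end of the file, so a reader has to seek before it reads. `SeekableStream` is a hypothetical stand-in exposing only `getPos()` and `seek(long)`, which are a subset of what `org.apache.parquet.io.SeekableInputStream` provides. `ByteArraySeekableStream` and `TailReader` are likewise illustrative names.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a seekable stream; not the Parquet class itself.
abstract class SeekableStream extends InputStream {
  abstract long getPos() throws IOException;

  abstract void seek(long newPos) throws IOException;
}

// In-memory implementation, used here only to exercise the contract.
final class ByteArraySeekableStream extends SeekableStream {
  private final byte[] data;
  private int pos;

  ByteArraySeekableStream(byte[] data) {
    this.data = data.clone();
  }

  @Override
  public int read() {
    return pos < data.length ? (data[pos++] & 0xFF) : -1;
  }

  @Override
  long getPos() {
    return pos;
  }

  @Override
  void seek(long newPos) throws IOException {
    if (newPos < 0 || newPos > data.length) {
      throw new EOFException("seek out of range: " + newPos);
    }
    pos = (int) newPos;
  }
}

final class TailReader {
  // Reads the last n bytes of a stream of known length, which a plain InputStream
  // could only do by consuming everything before them.
  static String readTail(SeekableStream in, long length, int n) throws IOException {
    in.seek(length - n);
    byte[] buf = new byte[n];
    int off = 0;
    while (off < n) {
      int r = in.read(buf, off, n - off);
      if (r < 0) {
        throw new EOFException("stream ended before " + n + " bytes were read");
      }
      off += r;
    }
    return new String(buf, StandardCharsets.US_ASCII);
  }
}
```

If `openForRead` returned a seekable type, existing callers that only need an `InputStream` would be unaffected, since the seekable type is still an `InputStream`.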

ArnavBalyan marked this pull request as draft September 2, 2025 09:04