Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#6561] improvement(core): Cache Hadoop Filesystem instance on Gravitino server to improve the performance #6619

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -65,7 +65,7 @@ public FileSystem getFileSystem(Path path, Map<String, String> config) throws IO

Configuration configuration = FileSystemUtils.createConfiguration(hadoopConfMap);

return AliyunOSSFileSystem.newInstance(path.toUri(), configuration);
return AliyunOSSFileSystem.get(path.toUri(), configuration);
}

@Override
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -73,7 +73,7 @@ public FileSystem getFileSystem(Path path, Map<String, String> config) throws IO
checkAndSetCredentialProvider(hadoopConfMap);

Configuration configuration = FileSystemUtils.createConfiguration(hadoopConfMap);
return S3AFileSystem.newInstance(path.toUri(), configuration);
return S3AFileSystem.get(path.toUri(), configuration);
}

@Override
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -72,7 +72,7 @@ public FileSystem getFileSystem(@Nonnull Path path, @Nonnull Map<String, String>
}

Configuration configuration = FileSystemUtils.createConfiguration(hadoopConfMap);
return FileSystem.newInstance(path.toUri(), configuration);
return FileSystem.get(path.toUri(), configuration);
}

@Override
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -48,7 +48,7 @@ public FileSystem getFileSystem(Path path, Map<String, String> config) throws IO
Map<String, String> hadoopConfMap =
FileSystemUtils.toHadoopConfigMap(config, GRAVITINO_KEY_TO_GCS_HADOOP_KEY);
Configuration configuration = FileSystemUtils.createConfiguration(hadoopConfMap);
return FileSystem.newInstance(path.toUri(), configuration);
return FileSystem.get(path.toUri(), configuration);
}

@Override
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -65,6 +65,23 @@ public interface FileSystemProvider {
FileSystem getFileSystem(@Nonnull Path path, @Nonnull Map<String, String> config)
throws IOException;

/**
* Get the FileSystem instance according to the configuration map and file path.
*
* @param config The configuration for the FileSystem instance.
* @param path The path to the file system.
* @param disableCache Whether to cache the FileSystem instance.
* @return The FileSystem instance.
* @throws IOException If the FileSystem instance cannot be created.
*/
default FileSystem getFileSystem(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this method only be used by the Gravitino server?

@Nonnull Path path, @Nonnull Map<String, String> config, boolean disableCache)
throws IOException {
// disable cache
config.put(String.format("fs.%s.impl.disable.cache", scheme()), String.valueOf(disableCache));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the config already contains key fs.%s.impl.disable.cache? will we cover it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will overwrite. I think we should prioritize the disableCache parameter.

return getFileSystem(path, config);
}

/**
* Scheme of this FileSystem provider. The value is 'file' for LocalFileSystem, 'hdfs' for HDFS,
* etc.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -33,7 +33,7 @@ public class HDFSFileSystemProvider implements FileSystemProvider {
public FileSystem getFileSystem(@Nonnull Path path, @Nonnull Map<String, String> config)
throws IOException {
Configuration configuration = FileSystemUtils.createConfiguration(GRAVITINO_BYPASS, config);
return FileSystem.newInstance(path.toUri(), configuration);
return FileSystem.get(path.toUri(), configuration);
}

@Override
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -33,7 +33,7 @@ public class LocalFileSystemProvider implements FileSystemProvider {
public FileSystem getFileSystem(Path path, Map<String, String> config) throws IOException {
Configuration configuration =
FileSystemUtils.createConfiguration(BUILTIN_HDFS_FS_PROVIDER, config);
return FileSystem.newInstance(path.toUri(), configuration);
return FileSystem.get(path.toUri(), configuration);
}

@Override
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -319,7 +319,7 @@ private FilesetContextPair getFilesetContext(Path virtualPath, FilesetDataOperat

totalProperty.putAll(getCredentialProperties(provider, catalog, identifier));

return provider.getFileSystem(filePath, totalProperty);
return provider.getFileSystem(filePath, totalProperty, true);
} catch (IOException ioe) {
throw new GravitinoRuntimeException(
ioe,
Expand Down