Description
Is your feature request related to a problem? Please describe.
S3 objects are most commonly referenced by URI, in the form s3://<bucket_name>/<object_key>. The ROS3 virtual file driver instead uses S3 object URLs of the form https://s3.<aws_region>.amazonaws.com/<bucket_name>/<object_key>. Constructing an object's URL requires extracting the bucket name and object key from its URI. The AWS region comes from the usual sources (config file or environment variable) unless specified by the user application. Bucket names are up to 63 bytes long, while object keys can be up to 1024 bytes long.
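The URL assembly step described above can be sketched in C as follows. This is a minimal illustration, not ROS3 code: the function names `s3_region_from_env` and `s3_build_object_url` are hypothetical, the region lookup only checks the `AWS_REGION` and `AWS_DEFAULT_REGION` environment variables (config file parsing is left out), and the object key is assumed to need no percent-encoding.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: pick the AWS region from the environment.
 * Falls back to the caller-supplied value when neither variable is set.
 * Reading the AWS config file is intentionally omitted from this sketch. */
const char *s3_region_from_env(const char *fallback) {
    const char *region = getenv("AWS_REGION");
    if (region == NULL || *region == '\0')
        region = getenv("AWS_DEFAULT_REGION");
    if (region == NULL || *region == '\0')
        region = fallback;
    return region;
}

/* Hypothetical helper: assemble
 *   https://s3.<aws_region>.amazonaws.com/<bucket_name>/<object_key>
 * into url. Returns 0 on success, -1 if an argument is missing or the
 * buffer is too small. Assumes the key needs no percent-encoding. */
int s3_build_object_url(const char *region, const char *bucket,
                        const char *key, char *url, size_t url_size) {
    if (region == NULL || bucket == NULL || key == NULL || url == NULL)
        return -1;
    int n = snprintf(url, url_size, "https://s3.%s.amazonaws.com/%s/%s",
                     region, bucket, key);
    if (n < 0 || (size_t)n >= url_size)
        return -1; /* encoding error or truncated output */
    return 0;
}
```

A caller would pass the bucket name and object key extracted from the URI, plus a region from `s3_region_from_env("us-east-1")` or from the application itself. Checking the `snprintf` return value keeps an over-long key from silently producing a truncated URL.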
Describe the solution you'd like
Be able to use S3 object URIs when working with HDF5 files in S3 cloud stores. Example:
h5ls -r --vfd=ros3 s3://my-bucket/a/b/c/file.h5
Describe alternatives you've considered
Write code to parse the S3 object URI and assemble the object's URL prior to invoking libhdf5 or its tools. This is what h5py currently does.
Additional context
Below is how ChatGPT would extract bucket and key names.
#include <stdio.h>
#include <string.h>

// Parse an S3 URI of the form s3://<bucket_name>/<object_key>.
// Returns 0 on success, -1 if the URI is malformed or a buffer is too small.
int parseS3URI(const char *uri, char *bucket_name, size_t bucket_size,
               char *object_key, size_t key_size) {
    // Check if the URI starts with "s3://"
    if (strncmp(uri, "s3://", 5) != 0) {
        fprintf(stderr, "Invalid S3 URI format. It should start with 's3://'\n");
        return -1;
    }

    // Skip "s3://"
    const char *path_start = uri + 5;

    // Find the '/' delimiter to separate bucket name and object key
    const char *slash_pos = strchr(path_start, '/');
    if (slash_pos == NULL) {
        fprintf(stderr, "Invalid S3 URI format. Missing '/' after bucket name\n");
        return -1;
    }

    // Calculate lengths
    size_t bucket_len = (size_t)(slash_pos - path_start);
    size_t key_len = strlen(slash_pos + 1);

    // Reject empty components and anything that would overflow the buffers
    if (bucket_len == 0 || key_len == 0 ||
        bucket_len >= bucket_size || key_len >= key_size) {
        fprintf(stderr, "Invalid S3 URI. Bucket name or object key is empty or too long\n");
        return -1;
    }

    // Copy bucket name
    memcpy(bucket_name, path_start, bucket_len);
    bucket_name[bucket_len] = '\0';

    // Copy object key, including its terminating '\0'
    memcpy(object_key, slash_pos + 1, key_len + 1);
    return 0;
}

int main(void) {
    // Example S3 URI
    const char *s3_uri = "s3://my-test-bucket/my-folder/my-file.h5";

    // Buffers sized for the S3 limits (63 and 1024 bytes) plus the terminator
    char bucket_name[64];
    char object_key[1025];

    // Parse the S3 URI and stop if it is malformed
    if (parseS3URI(s3_uri, bucket_name, sizeof bucket_name,
                   object_key, sizeof object_key) != 0) {
        return 1;
    }

    // Print the parsed components
    printf("Bucket Name: %s\n", bucket_name);
    printf("Object Key: %s\n", object_key);
    return 0;
}