Skip to content

Add documentation and s3proxy testing for ROS3 VFD #5560

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: develop
Choose a base branch
from

Conversation

jhendersonHDF
Copy link
Collaborator

@jhendersonHDF jhendersonHDF commented Jun 1, 2025

Adds testing of ROS3 VFD with s3proxy and docker

Adds workflow to build ROS3 VFD and optionally build aws-c-s3 library from source or use package
managers where possible

Adds new H5Pset_fapl_ros3_endpoint()/H5Pget_fapl_ros3_endpoint() API functions to set/get an
alternative endpoint URL to use when opening files with the ROS3 VFD

Cleans up warnings in tools and tests related to ROS3 VFD structure size

Fixes #5109

Adds workflow to build ROS3 VFD and optionally build aws-c-s3 library
from source or use package managers

Adds testing of ROS3 VFD with s3proxy and docker

Adds new H5Pset_fapl_ros3_endpoint()/H5Pget_fapl_ros3_endpoint() API
functions to set/get an alternative endpoint URL to use when opening
files with the ROS3 VFD

Cleans up warnings in tools and tests related to ROS3 VFD structure size

Co-authored-by: Larry Knox <[email protected]>
Co-authored-by: Allen Byrne <[email protected]>
@jhendersonHDF jhendersonHDF added Priority - 1. High These are important issues that should be resolved in the next release Component - C Library Core C library issues (usually in the src directory) Component - Tools Command-line tools like h5dump, includes high-level tools Component - Documentation Doxygen, markdown, etc. Component - Testing Code in test or testpar directories, GitHub workflows labels Jun 1, 2025
@jhendersonHDF jhendersonHDF added the Component - Build CMake, Autotools label Jun 1, 2025
@github-project-automation github-project-automation bot moved this to To be triaged in HDF5 - TRIAGE & TRACK Jun 1, 2025
@jhendersonHDF
Copy link
Collaborator Author

jhendersonHDF commented Jun 1, 2025

This PR is the followup to #5548 and replaces and largely copies the remaining features needed from #5397, but with several changes and additions made to work with the aws-c-s3 library. The ROS3 is built on the 3 major platforms for PRs, but only tested on Ubuntu as testing currently relies on docker and the aws cli, which are preinstalled on the Ubuntu runners. It should be fairly straightforward to get testing running on Windows and Mac as well, but this is a good start. For PRs, aws-c-s3 is installed from vcpkg on Windows and homebrew on Mac, but Ubuntu doesn't have a package yet, so it gets built from source. The VFD is also built and tested on Ubuntu with a source build of aws-c-s3 in the VFD daily build action.

It's fairly large due to all the changes that had to be made to work well with #5397, but if need be I can try to break it down into more pieces. The large line change count is just due to the inclusion of the large text file for testing, which I hope we can remove in the future.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ajelenak @lrknox @byrnHDF Can you review and let me know if there's anything else that would be good to know here?

@@ -669,6 +681,14 @@ H5FD__s3comms_s3r_open(const char *url, const H5FD_ros3_fapl_t *fa, const char *
aws_error_str(aws_last_error()));
cond_var_init = true;

/* Check if path-style requests should be forced */
pathstyle_env = getenv(HDF5_ROS3_VFD_FORCE_PATH_STYLE);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed for s3proxy testing, as I couldn't get virtual-hosted-style requests to work correctly with it, but path-style requests work just fine.

@@ -1041,6 +1041,50 @@
* additional variable settings associated with the Split driver, there is no H5Pget_fapl_split
* function.
*
* \subsubsection subsubsec_file_alternate_drivers_ros3 The ROS3 Driver
* The ROS3 (read-only S3) driver is used to enable read-only access to HDF5 files which are stored
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a user's guide entry for the VFD, but as the entries for the other VFDs were brief I didn't add too many details here. Most information about the VFD is in its header file currently. But I think we should revise these entries into separate pages or something similar so they can serve as the main location for information about a particular VFD. That or maybe free-form doxygen like this should be put into each VFD's header file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The H5?module.h files were convenient (initially) for User Guide entries as they wouldn't be confused/interfere with reference Manuals entries and some UG entries were extensive. Anything should be able to be reworked/relocated as necessary.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now these text files have been included as the ros3 and s3comms tests use them. But the ROS3 VFD can't actually read from text files normally, it can only do so indirectly through H5FDopen() and H5FDread(), which is what those tests are testing. We should revise those tests in the future to do more meaningful testing.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes in this file mostly just cleaned up stack size warnings from the H5FD_ros3_fapl_ext_t structure being too large.

},
"", /* Session/security token */
};
static H5FD_ros3_fapl_ext_t *ros3_fa_g = NULL;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, I revised h5dump, h5ls and h5stat to allocate these structures to fix stack size warnings.

@@ -891,18 +872,6 @@ parse_command_line(int argc, const char *const *argv)
vfd_info_g.type = VFD_BY_NAME;
vfd_info_g.u.name = H5_optarg;
use_custom_vfd_g = true;

#ifdef H5_HAVE_ROS3_VFD
if (0 == strcmp(vfd_info_g.u.name, drivernames[ROS3_VFD_IDX]))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic like this was moved down in h5dump, h5ls and h5stat to handle overlap between options like -f ros3 and --vfd-name ros3 where the former would set the pointer to the VFD info and the latter wouldn't.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are they really different options?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In terms of the tools code they're pretty much all exactly the same now among tools, with the "old" options being replaced by the --vfd-name option. The latter was kind of meant for vfd plug-ins, but it evolved to cover builtin library vfds as well, so they really should work the same at this point.

@@ -2802,7 +2808,7 @@ main(int argc, char *argv[])
data_g = true;
}
else if (!strcmp(argv[argno], "--enable-error-stack")) {
enable_error_stack = 1;
enable_error_stack = 2;
Copy link
Collaborator Author

@jhendersonHDF jhendersonHDF Jun 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@byrnHDF Since h5ls doesn't accept an argument for --enable-error-stack, I just changed the default here to 2 so that file open errors aren't hidden since they're often the most interesting errors. At this point, I kind of think 2 should be the default everywhere so that --enable-error-stack works and you don't have to specify --enable-error-stack=2.

description: "Install aws-c-s3 from a package manager ('package') or build from source ('source')"
required: true
type: string
aws_c_s3_tag:
Copy link
Contributor

@byrnHDF byrnHDF Jun 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should add a default value to optional input

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually checking for empty input and letting the checkout action use the default branch if so. Probably a bit more future proof than specifying a default (though let's hope the name of the default branch doesn't change in GH again).

find_program (AWS_CLI_EXECUTABLE aws)
if (NOT AWS_CLI_EXECUTABLE)
set (HDF5_ENABLE_ROS3_VFD_DOCKER_PROXY OFF CACHE BOOL "Use docker for ROS3 VFD S3proxy testing" FORCE)
message (WARNING "AWS cli is required for ROS3 VFD S3proxy testing, but could not be found.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a FATAL_ERROR if it is required?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just did the same as the existing logic for the docker option which just turns the option back off. We could make both fatal errors instead though if we want. I just added this because I noticed the test logic expects the aws cli commands to be available as well, but wasn't checking for them. For me it caused a failure with no obvious reason why.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original may have been wrong, but the more important decision is what is correct? If the user asked for it and it can't be found, will the build/test still give the intended result or should we stop and force the user to fix it. Given the time for some tests to complete (especially on Windows) I would prefer to know immediately and not waste time.

@@ -573,6 +587,18 @@ New Features

Tools:
------
- Added AWS endpoint command option to allow specifying an alternate endpoint
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this mention the ENV variables?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I can add a line for that.

use_virtual_style = false;

/*
* If an alternate endpoint URL was specified, use a path-style
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've noticed this comment in several places - why is this a problem, because I thought the aws library would deal with this? Is it a hdf design decision?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is about what a server will accept, not what the aws library can do since it doesnt care too much about that and cant necessarily tell ahead of time what's supported. The only request format I could get to work between the library and s3proxy was the path-style format, which is why this is needed. Virtual-hosted-style requests seem somewhat specific to s3 (though I haven't checked other cloud providers) and since the initial use case for the alternate endpoint is for testing with s3proxy, this seemed reasonable to do. https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html eventually support for path-style requests will be removed from S3, but that's a future problem. I expect we'll still have to maintain support for path-style requests into the future for compatibility.

manager for the name of this package if it's available. HDF5 installs
this library in the following ways in CI testing:

Linux (Ubuntu):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On fedora 42 I had to install the "devel" package as well

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I need to update this line since Ubuntu didn't have a package yet. I don't know how much we want to get into the details for each Linux distribution since it could vary widely.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least for linux there always seems to be a devel package and it is needed for CMake to configure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

H5FDros3_s3comms.c:2202:74: warning: unused parameter ‘aws_params’ [-Wunused-parameter]
2202 | H5FD__s3comms_format_user_agent_header(const H5FD__s3comms_aws_params_t *aws_params,

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix. I turned on the HDF5_ENABLE_WARNINGS_AS_ERRORS=ON to try and catch new warnings in the ros3 code, but there appear to be several warning types it doesnt treat as errors, which we should revisit.


if (*value)
if (NULL == (*value = strdup(*value)))
HGOTO_ERROR(H5E_RESOURCE, H5E_CANTALLOC, FAIL, "can't copy string property token");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the failure message refer to endpoint instead of property token?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I think this implementation was copy pasted from the previous implementation for tokens. Will tidy up the error messages.

* populated. If
* Only secret_key is allowed to be empty (the empty string, "").
* If all three (or four) strings are empty (""), the default fapl will be
* default. Both aws_region and secret_id values must be both empty or both
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Both" at the beginning of the sentence seems redundant; not terribly important, though.

@nbagha1 nbagha1 moved this from To be triaged to Scheduled/On-Deck in HDF5 - TRIAGE & TRACK Jun 9, 2025
@nbagha1 nbagha1 added this to the Release 2.0.0 milestone Jun 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component - Build CMake, Autotools Component - C Library Core C library issues (usually in the src directory) Component - Documentation Doxygen, markdown, etc. Component - Testing Code in test or testpar directories, GitHub workflows Component - Tools Command-line tools like h5dump, includes high-level tools Priority - 1. High These are important issues that should be resolved in the next release
Projects
Status: Scheduled/On-Deck
Development

Successfully merging this pull request may close these issues.

Expand ros3 VFD testing
5 participants