
HL: Fix prefix-match bugs in H5TB field lookup and H5DS/H5IM CLASS checks (#5633)#6371

Merged
brtnfld merged 1 commit into HDFGroup:develop from brtnfld:5633 on May 22, 2026

Conversation

@brtnfld

@brtnfld brtnfld commented Apr 16, 2026

Collaborator

Fixes #5633 and related prefix-match bugs across the HL library.

H5TB

H5TB_find_field() compared only strlen(field) bytes of a tokenized user field name, so a user-supplied name like "PressureExtra" would match a real column "Pressure". Changed to strcmp against the NUL-terminated token parsed out of the comma-separated list. Applies to both H5TBread_fields_name() and H5TBwrite_fields_name(). The write path also silently accepted input with zero matching fields; it now returns an error, matching the read path's existing behavior.
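To make the difference concrete, here is a minimal sketch of the two comparison strategies, written against plain C strings rather than the HL sources. The helper names list_has_field_prefix and list_has_field_exact and the 256-byte token buffer are assumptions made for illustration; the real H5TB_find_field() may be structured differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOKEN_BUF_LEN 256 /* assumed upper bound for one token, including the terminator */

/* Flawed variant: compares only strlen(field) bytes of each token, so a
 * token that merely starts with `field` is accepted. */
static bool list_has_field_prefix(const char *field, const char *field_list)
{
    size_t      n;
    const char *start = field_list;

    assert(field != NULL && field_list != NULL);
    n = strlen(field);

    while (start != NULL) {
        if (strncmp(start, field, n) == 0)
            return true;
        start = strchr(start, ',');
        if (start != NULL)
            start++; /* step past the comma */
    }
    return false;
}

/* Exact variant: copy each token out of the comma-separated list,
 * terminate it, and require strcmp() equality. */
static bool list_has_field_exact(const char *field, const char *field_list)
{
    const char *start = field_list;

    assert(field != NULL && field_list != NULL);

    while (start != NULL) {
        const char *comma = strchr(start, ',');
        size_t      len   = comma ? (size_t)(comma - start) : strlen(start);
        char        token[TOKEN_BUF_LEN];

        if (len < sizeof(token)) {
            memcpy(token, start, len);
            token[len] = '\0'; /* terminate the parsed token */
            if (strcmp(token, field) == 0)
                return true;
        }
        start = comma ? comma + 1 : NULL;
    }
    return false;
}
```

With the column name "Pressure" and the request list "PressureExtra", the first helper reports a match and the second does not, which is the behavior change this PR describes.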

H5DS

H5DSis_scale() and H5DS_is_reserved() used strncmp(buf, CLASS, MIN(strlen(CLASS), strlen(buf))), which matched any value whose prefix equaled the expected class name. Replaced with strcmp. Also:

- Fixed a latent double-free in H5DSis_scale() on the no-match path, which was previously unreachable because the buggy compare always matched.
- Added a variable-length string guard: a VLEN CLASS attribute passes the H5T_STR_NULLTERM check, but H5Aread into a char* buffer would write an hvl_t, so it is now explicitly rejected.
- Allocate one extra byte and explicitly NUL-terminate after H5Aread as defense-in-depth.
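The prefix problem and the extra-byte termination can be sketched without the HDF5 library. The code below is an illustration only: class_matches_prefix and class_matches_exact are hypothetical names, and the raw byte buffer stands in for whatever H5Aread would return for a fixed-length CLASS attribute.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

static size_t min_size(size_t a, size_t b)
{
    return a < b ? a : b;
}

/* Flawed variant: the compare length shrinks to the shorter string, so a
 * value that is a prefix of `expected`, or that extends it, is accepted. */
static bool class_matches_prefix(const char *buf, const char *expected)
{
    return strncmp(buf, expected, min_size(strlen(expected), strlen(buf))) == 0;
}

/* Exact variant: copy the raw attribute bytes into a buffer with one extra
 * byte, force termination, then require strcmp() equality. */
static bool class_matches_exact(const void *raw, size_t raw_size, const char *expected)
{
    char *buf;
    bool  match;

    assert(raw != NULL && expected != NULL);

    buf = malloc(raw_size + 1);
    if (buf == NULL)
        return false;

    memcpy(buf, raw, raw_size);
    buf[raw_size] = '\0'; /* do not rely on the stored bytes being terminated */

    match = (strcmp(buf, expected) == 0);
    free(buf);
    return match;
}
```

A 16-byte value holding "DIMENSION_S" followed by padding passes the first check against "DIMENSION_SCALE" but fails the second, as does a longer value such as "DIMENSION_SCALE_EXTRA".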

H5IM

Same class of bug in H5IMis_image() (IMAGE class) and H5IMis_palette() (PALETTE class). Additionally:

- Initialized did/aid/atid/attr_data at declaration so the out: error handler is safe from any failure point.
- Added VLEN string guard (same rationale as H5DS).
- Fixed pre-existing resource leaks in the out: handler: aid, atid, and attr_data were not closed/freed on mid-function failures.
- Allocate one extra byte and explicitly NUL-terminate after H5Aread.
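The initialize-at-declaration and single out: handler pattern from the list above can be illustrated with stand-in resources. This is a sketch under stated assumptions, not the H5IM code: fake_open, fake_close, check_class, and the fail_at parameter are all hypothetical, and integer handles replace real hid_t values so the example runs without HDF5.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for HDF5 handles: a handle is a non-negative int,
 * and -1 means "not open". live_handles counts handles currently open. */
static int live_handles = 0;

static int fake_open(int should_fail)
{
    if (should_fail)
        return -1;
    live_handles++;
    return 1;
}

static void fake_close(int id)
{
    assert(id >= 0);
    live_handles--;
}

/* Returns 0 on success and -1 on failure. fail_at (1..4) forces the
 * corresponding acquisition step to fail; 0 means no forced failure. */
static int check_class(int fail_at)
{
    /* Everything the out: handler touches is initialized at declaration. */
    int   did = -1, aid = -1, atid = -1;
    char *attr_data = NULL;
    int   ret = -1;

    if ((did = fake_open(fail_at == 1)) < 0)
        goto out;
    if ((aid = fake_open(fail_at == 2)) < 0)
        goto out;
    if ((atid = fake_open(fail_at == 3)) < 0)
        goto out;
    if (fail_at == 4 || (attr_data = malloc(16 + 1)) == NULL)
        goto out;

    ret = 0;

out:
    /* Single exit path: release only what was actually acquired. */
    free(attr_data); /* free(NULL) is a no-op */
    if (atid >= 0)
        fake_close(atid);
    if (aid >= 0)
        fake_close(aid);
    if (did >= 0)
        fake_close(did);
    return ret;
}
```

Because every handle starts at a known invalid value, jumping to out: from any step releases only what was opened, and the live handle count returns to zero whichever step fails.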

Regression tests

- hl/test/test_table.c: read and write tests that request "PressureExtra" against a table with a real "Pressure" column and expect failure.
- hl/test/test_image.c: test_class_prefix covers CLASS values "IMAGE_EXTRA", "I", "PALETTE_EXTRA", and "PAL"; all must return 0.
- hl/test/test_ds.c: test_is_scale_class_prefix writes a 16-byte CLASS attribute holding "DIMENSION_S" (null-padded), the exact path H5DSis_scale() walks for the scale class length, and asserts 0. test_is_reserved_class_prefix exercises H5DS_is_reserved() indirectly via H5DSattach_scale() on a dataset tagged CLASS="IMAGE_EXTRA", which must not be treated as reserved.

All new tests were verified to fail against the unpatched sources and pass with the fix.

@github-project-automation github-project-automation Bot moved this to To be triaged in HDF5 - TRIAGE & TRACK Apr 16, 2026
@brtnfld brtnfld marked this pull request as draft April 16, 2026 22:38
@brtnfld brtnfld marked this pull request as ready for review April 16, 2026 23:35
@nbagha1 nbagha1 added the HDFG-internal Internally coded for use by the HDF Group label Apr 17, 2026
@nbagha1 nbagha1 moved this from To be triaged to On-Deck in HDF5 - TRIAGE & TRACK Apr 17, 2026
@nbagha1 nbagha1 added this to the HDF5 2.2.0 milestone Apr 17, 2026
@bmribler

Collaborator

General cosmetic comment: I think the use of 0 vs. SUCCEED and -1 vs. FAIL should be more consistent.

Comment thread hl/src/H5DS.c Outdated
Comment thread hl/src/H5TB.c
@github-project-automation github-project-automation Bot moved this from On-Deck to In progress in HDF5 - TRIAGE & TRACK Apr 21, 2026
brtnfld added a commit to brtnfld/hdf5 that referenced this pull request Apr 21, 2026
- Replace "NUL termination" with "null termination" in comments
  (H5DS.c lines 2291, 2501; H5IM.c line 1029) to match codebase
  convention (H5T_STR_NULLTERM)
- Add clarifying comment in H5TBget_field_info else-branch explaining
  that name_len+1 <= HLTB_MAX_FIELD_LEN and that callers must provide
  buffers of at least HLTB_MAX_FIELD_LEN bytes (documented in H5TBpublic.h)
@bmribler bmribler self-requested a review April 22, 2026 20:35
bmribler previously approved these changes Apr 22, 2026
lrknox previously approved these changes Apr 23, 2026
Comment thread hl/src/H5IM.c Outdated
/* check to make sure string is null-terminated */
if (H5T_STR_NULLTERM != H5Tget_strpad(atid))
    goto out;
/* Reject VLEN strings; H5Aread into a char* buffer would write an hvl_t.

Collaborator

Does the specification for these attributes disallow variable-length strings? What will be read is char **, not hvl_t.

@brtnfld brtnfld May 19, 2026

Collaborator Author

Corrected the comment: the read would write a char * pointer into the buffer, not an hvl_t. Fixed the other occurrences as well.

Collaborator

My main concern is whether or not variable-length strings should actually be rejected. The specifications don't seem to directly say that they should. PALETTE_CLASS and IMAGE_CLASS don't appear in the current specification and there is a line of text that says "Optionally, String valued attributes may be stored in a String longer than the minimum, in which case it must be zero terminated or null padded."

Collaborator Author

Changed to accept VL strings.

brtnfld added a commit to brtnfld/hdf5 that referenced this pull request May 19, 2026
VL strings use char* in memory, not hvl_t (which is only for VL
sequences). Correct the comment per jhendersonHDF review on PR HDFGroup#6371.
@brtnfld brtnfld dismissed stale reviews from lrknox and bmribler via 89faa76 May 19, 2026 22:20
brtnfld added a commit to brtnfld/hdf5 that referenced this pull request May 19, 2026
VL strings use char* in memory, not hvl_t (which is only for VL
sequences). Correct two comments in H5DS.c and the CHANGELOG entry
per jhendersonHDF review on PR HDFGroup#6371.
Comment thread hl/test/test_table.c Outdated
Comment thread hl/test/test_table.c
@brtnfld brtnfld force-pushed the 5633 branch 5 times, most recently from 33b8862 to 5f8c79f Compare May 20, 2026 20:56
Comment thread hl/test/test_table.c Outdated
char *field_253 = (char *)malloc(254);
char *field_254 = (char *)malloc(255);
char *field_255 = (char *)malloc(256);
char *field_1000 = (char *)malloc(1001);

Collaborator

These values and those used below should probably be calculated in terms of HLTB_MAX_FIELD_LEN so that the test doesn't need updating if that value changes.

@brtnfld brtnfld May 21, 2026

Collaborator Author

Updated, along with similar issues.
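One way to follow the suggestion in this thread is to derive every boundary length from the limit rather than hard-coding 254, 255, 256, and 1001. The sketch below is an assumption about how such a test helper could look, not the code that was merged: make_field_name is a hypothetical name, and the fallback definition of HLTB_MAX_FIELD_LEN only stands in for the value that H5TBpublic.h provides.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#ifndef HLTB_MAX_FIELD_LEN
#define HLTB_MAX_FIELD_LEN 255 /* stand-in for the public constant */
#endif

/* Hypothetical test helper: returns a heap string of exactly `len`
 * characters plus the terminator, or NULL if allocation fails. */
static char *make_field_name(size_t len)
{
    char *name;

    assert(len < SIZE_MAX); /* len + 1 must not wrap */

    name = malloc(len + 1);
    if (name == NULL)
        return NULL;

    memset(name, 'a', len);
    name[len] = '\0';
    return name;
}
```

A test would then request lengths such as HLTB_MAX_FIELD_LEN - 2, HLTB_MAX_FIELD_LEN - 1, HLTB_MAX_FIELD_LEN, and a multiple of the limit for the far-over-limit case, and free each string after use, so a change to the constant needs no test edits.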

jhendersonHDF previously approved these changes May 20, 2026

@jhendersonHDF jhendersonHDF left a comment

Collaborator

Mostly just a minor comment about maintainability, but looks good either way.

…ecks

- H5TB: strcmp replaces strncmp in H5TBfind_field so that field names
  that are a prefix of a requested name (or vice-versa) are no longer
  matched. HLTB_MAX_FIELD_LEN (255) is now public in H5TBpublic.h and
  exposed to Fortran as HLTB_MAX_FIELD_LEN_F in H5TBff.F90.
  H5TBget_field_info documents the buffer-size requirement and truncation
  behaviour. H5TBget_field_info guards against overflow on long names.

- H5IM: H5IMis_image and H5IMis_palette refactored into a shared helper
  (H5IM__class_attr_equals). The helper now reads both fixed-length and
  variable-length CLASS string attributes, using H5Treclaim for VL memory.
  strcmp replaces strncmp for exact-match semantics.

- H5DS: H5DSis_scale and H5DS_is_reserved both support variable-length
  CLASS string attributes via H5Aget_space/H5Aread/H5Treclaim. The
  fixed-length path retains the 16-byte size guard. strcmp is used
  throughout for exact comparison.

- Tests: new test functions test_is_scale_class_prefix,
  test_is_reserved_class_prefix, and test_class_prefix cover fixed-length
  prefix/exact/wrong-value cases and variable-length string cases.
  test_table.c adds write and read field-name prefix rejection cases and
  boundary-length truncation verification. All malloc calls are NULL-checked.

- CHANGELOG updated with a summary of all fixes.
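The H5TB entry in this commit message mentions a buffer-size requirement and truncation for long field names. As a rough illustration of that kind of guard, the sketch below copies a name into a caller buffer of HLTB_MAX_FIELD_LEN bytes and truncates instead of overflowing. The helper copy_field_name and the choice to truncate at HLTB_MAX_FIELD_LEN - 1 characters are assumptions; the exact behavior of H5TBget_field_info is defined by the library sources and H5TBpublic.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifndef HLTB_MAX_FIELD_LEN
#define HLTB_MAX_FIELD_LEN 255 /* stand-in for the public constant */
#endif

/* Hypothetical helper: copies `name` into `dst`, which the caller guarantees
 * holds at least HLTB_MAX_FIELD_LEN bytes. Longer names are truncated so
 * that name_len + 1 <= HLTB_MAX_FIELD_LEN always holds. Returns the number
 * of characters copied, excluding the terminator. */
static size_t copy_field_name(char *dst, const char *name)
{
    size_t name_len;

    assert(dst != NULL && name != NULL);

    name_len = strlen(name);
    if (name_len + 1 > HLTB_MAX_FIELD_LEN)
        name_len = HLTB_MAX_FIELD_LEN - 1; /* truncate instead of overflowing */

    memcpy(dst, name, name_len);
    dst[name_len] = '\0';
    return name_len;
}
```

The design choice here is to make the limit a property of the copy routine, so no input length can write past the documented buffer size.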
@brtnfld brtnfld merged commit 853451a into HDFGroup:develop May 22, 2026
129 checks passed
@github-project-automation github-project-automation Bot moved this from In progress to Done in HDF5 - TRIAGE & TRACK May 22, 2026
hyoklee pushed a commit to hyoklee/hdf5 that referenced this pull request May 29, 2026
…ecks (HDFGroup#6371)


Labels

HDFG-internal Internally coded for use by the HDF Group

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

H5TBwrite_fields_name and H5TBread_fields_name can select the wrong field if one name is a prefix of another

6 participants