HL: Fix prefix-match bugs in H5TB field lookup and H5DS/H5IM CLASS checks (#5633)#6371
Conversation
|
General cosmetic comment: I think the use of 0 vs. SUCCEED and -1 vs. FAIL should be more consistent. |
- Replace "NUL termination" with "null termination" in comments (H5DS.c lines 2291, 2501; H5IM.c line 1029) to match codebase convention (H5T_STR_NULLTERM) - Add clarifying comment in H5TBget_field_info else-branch explaining that name_len+1 <= HLTB_MAX_FIELD_LEN and that callers must provide buffers of at least HLTB_MAX_FIELD_LEN bytes (documented in H5TBpublic.h)
| /* check to make sure string is null-terminated */ | ||
| if (H5T_STR_NULLTERM != H5Tget_strpad(atid)) | ||
| goto out; | ||
| /* Reject VLEN strings; H5Aread into a char* buffer would write an hvl_t. |
There was a problem hiding this comment.
Does the specification for these attributes disallow variable-length strings? What will be read is char **, not hvl_t.
There was a problem hiding this comment.
Corrected comment, char * pointer overwrite rather than hvl_t. Fixed occurrences elsewhere as well.
There was a problem hiding this comment.
My main concern is whether or not variable-length strings should actually be rejected. The specifications don't seem to directly say that they should. PALETTE_CLASS and IMAGE_CLASS don't appear in the current specification and there is a line of text that says "Optionally, String valued attributes may be stored in a String longer than the minimum, in which case it must be zero terminated or null padded."
There was a problem hiding this comment.
changed to accept VL strings
VL strings use char* in memory, not hvl_t (which is only for VL sequences). Correct the comment per jhendersonHDF review on PR HDFGroup#6371.
VL strings use char* in memory, not hvl_t (which is only for VL sequences). Correct two comments in H5DS.c and the CHANGELOG entry per jhendersonHDF review on PR HDFGroup#6371.
- Replace "NUL termination" with "null termination" in comments (H5DS.c lines 2291, 2501; H5IM.c line 1029) to match codebase convention (H5T_STR_NULLTERM) - Add clarifying comment in H5TBget_field_info else-branch explaining that name_len+1 <= HLTB_MAX_FIELD_LEN and that callers must provide buffers of at least HLTB_MAX_FIELD_LEN bytes (documented in H5TBpublic.h)
VL strings use char* in memory, not hvl_t (which is only for VL sequences). Correct the comment per jhendersonHDF review on PR HDFGroup#6371.
VL strings use char* in memory, not hvl_t (which is only for VL sequences). Correct two comments in H5DS.c and the CHANGELOG entry per jhendersonHDF review on PR HDFGroup#6371.
33b8862 to
5f8c79f
Compare
| char *field_253 = (char *)malloc(254); | ||
| char *field_254 = (char *)malloc(255); | ||
| char *field_255 = (char *)malloc(256); | ||
| char *field_1000 = (char *)malloc(1001); |
There was a problem hiding this comment.
These values and those used below should probably be calculated in terms of HLTB_MAX_FIELD_LEN so that the test doesn't need updating if that value changes.
There was a problem hiding this comment.
updated, along with similar issues.
jhendersonHDF
left a comment
There was a problem hiding this comment.
Mostly just a minor comment about maintainability but looks good either way
6b2694c to
e329d29
Compare
…ecks - H5TB: strcmp replaces strncmp in H5TBfind_field so that field names that are a prefix of a requested name (or vice-versa) are no longer matched. HLTB_MAX_FIELD_LEN (255) is now public in H5TBpublic.h and exposed to Fortran as HLTB_MAX_FIELD_LEN_F in H5TBff.F90. H5TBget_field_info documents the buffer-size requirement and truncation behaviour. H5TBget_field_info guards against overflow on long names. - H5IM: H5IMis_image and H5IMis_palette refactored into a shared helper (H5IM__class_attr_equals). The helper now reads both fixed-length and variable-length CLASS string attributes, using H5Treclaim for VL memory. strcmp replaces strncmp for exact-match semantics. - H5DS: H5DSis_scale and H5DS_is_reserved both support variable-length CLASS string attributes via H5Aget_space/H5Aread/H5Treclaim. The fixed-length path retains the 16-byte size guard. strcmp is used throughout for exact comparison. - Tests: new test functions test_is_scale_class_prefix, test_is_reserved_class_prefix, and test_class_prefix cover fixed-length prefix/exact/wrong-value cases and variable-length string cases. test_table.c adds write and read field-name prefix rejection cases and boundary-length truncation verification. All malloc calls are NULL-checked. - CHANGELOG updated with a summary of all fixes.
…ecks (HDFGroup#6371) - H5TB: strcmp replaces strncmp in H5TBfind_field so that field names that are a prefix of a requested name (or vice-versa) are no longer matched. HLTB_MAX_FIELD_LEN (255) is now public in H5TBpublic.h and exposed to Fortran as HLTB_MAX_FIELD_LEN_F in H5TBff.F90. H5TBget_field_info documents the buffer-size requirement and truncation behaviour. H5TBget_field_info guards against overflow on long names. - H5IM: H5IMis_image and H5IMis_palette refactored into a shared helper (H5IM__class_attr_equals). The helper now reads both fixed-length and variable-length CLASS string attributes, using H5Treclaim for VL memory. strcmp replaces strncmp for exact-match semantics. - H5DS: H5DSis_scale and H5DS_is_reserved both support variable-length CLASS string attributes via H5Aget_space/H5Aread/H5Treclaim. The fixed-length path retains the 16-byte size guard. strcmp is used throughout for exact comparison. - Tests: new test functions test_is_scale_class_prefix, test_is_reserved_class_prefix, and test_class_prefix cover fixed-length prefix/exact/wrong-value cases and variable-length string cases. test_table.c adds write and read field-name prefix rejection cases and boundary-length truncation verification. All malloc calls are NULL-checked. - CHANGELOG updated with a summary of all fixes.
Fixes #5633 and related prefix-match bugs across the HL library.
H5TB
H5TB_find_field() compared only strlen(field) bytes of a tokenized user field name, so a user-supplied name like "PressureExtra" would match a real column "Pressure". Changed to strcmp against the NUL-terminated token parsed out of the comma-separated list. Applies to both H5TBread_fields_name() and H5TBwrite_fields_name(). The write path also silently accepted input with zero matching fields; it now returns an error, matching the read path's existing behavior.
H5DS
H5DSis_scale() and H5DS_is_reserved() used strncmp(buf, CLASS, MIN(strlen(CLASS), strlen(buf))), which matched any value whose prefix equaled the expected class name. Replaced with strcmp. Also:
Fixed a latent double-free in H5DSis_scale() on the no-match path, which was previously unreachable because the buggy compare always matched.
Added a variable-length string guard: a VLEN CLASS attribute passes the H5T_STR_NULLTERM check, but H5Aread into a char* buffer would write an hvl_t, so it is now explicitly rejected.
Allocate one extra byte and explicitly NUL-terminate after H5Aread as defense-in-depth.
H5IM
Same class of bug in H5IMis_image() (IMAGE class) and H5IMis_palette() (PALETTE class). Additionally:
Initialized did/aid/atid/attr_data at declaration so the out: error handler is safe from any failure point.
Added VLEN string guard (same rationale as H5DS).
Fixed pre-existing resource leaks in the out: handler: aid, atid, and attr_data were not closed/freed on mid-function failures.
Allocate one extra byte and explicitly NUL-terminate after H5Aread.
Regression tests
hl/test/test_table.c: read and write tests that request "PressureExtra" against a table with a real "Pressure" column and expect failure.
hl/test/test_image.c: test_class_prefix covers CLASS values "IMAGE_EXTRA", "I", "PALETTE_EXTRA", and "PAL" — all must return 0.
hl/test/test_ds.c: test_is_scale_class_prefix writes a 16-byte CLASS attribute holding "DIMENSION_S" (null-padded), the exact path H5DSis_scale() walks for the scale class length, and asserts 0. test_is_reserved_class_prefix exercises H5DS_is_reserved() indirectly via H5DSattach_scale() on a dataset tagged CLASS="IMAGE_EXTRA", which must not be treated as reserved.
All new tests were verified to fail against the unpatched sources and pass with the fix.