Skip to content

Commit bd17ec4

Browse files
committed
Fix memory safety vulnerabilities in high-level and VFD code
H5FDstdio (src/H5FDstdio.c): - Fix five error paths in H5FD_stdio_open() that called fclose(f) after free(file): the correct resource to close is file->fp. Reorder to fclose before free to match standard cleanup idiom and prevent file descriptor leaks under memory-pressure failures. H5VLnative (src/H5VLnative.c): - Add assert(obj) and assert(file) to H5VL_native_get_file_struct() to catch NULL-pointer programming errors early in debug builds. H5LT (hl/src/H5LT.c): - Add NULL check after strdup() in H5LTtext_to_dtype(); push H5E_NOSPACE so the HDF5 error stack is populated on OOM. - Fix H5Tclose(super) leak in H5T_ENUM, H5T_VLEN, H5T_ARRAY, H5T_COMPLEX branches of H5LT_dtype_to_text(): super was only closed on the success path; any failure between H5Tget_super and the final realloc_and_append leaked the type ID. Super is now closed immediately after use. - Refactor the repeated "get super-type text and append" pattern (four near-identical ~15-line blocks) into static helper append_dtype_super_text(). Pushes H5E_NOSPACE on internal calloc failure. - Rewrite realloc_and_append() doc comment to document the asymmetric ownership contract (callee frees buf on realloc failure in library-managed mode; no free in user-buf mode). - Move the buf == NULL guard to before the _no_user_buf branch so both modes short-circuit identically. H5TB (hl/src/H5TB.c): - Clarify H5TBget_field_info() else-branch comment: the two-branch copy structure is an efficiency optimization (copy name_len+1 bytes rather than HLTB_MAX_FIELD_LEN-1), not a backward-compatibility concern. CHANGELOG (release_docs/CHANGELOG.md): - Add entries for the stdio VFD leak fix, VOL NULL checks, and H5LT memory-safety improvements.
1 parent 853451a commit bd17ec4

5 files changed

Lines changed: 145 additions & 81 deletions

File tree

hl/src/H5LT.c

Lines changed: 115 additions & 74 deletions
Original file line numberDiff line numberDiff line change
@@ -1889,13 +1889,18 @@ H5LTtext_to_dtype(const char *text, H5LT_lang_t lang_type)
18891889

18901890
input_len = strlen(text);
18911891
myinput = strdup(text);
1892+
if (!myinput) {
1893+
H5Epush_ret(__func__, H5E_ERR_CLS, H5E_RESOURCE, H5E_NOSPACE, "memory allocation failed", -1);
1894+
}
18921895

18931896
if ((type_id = H5LTyyparse()) < 0) {
18941897
free(myinput);
1898+
myinput = NULL;
18951899
goto out;
18961900
}
18971901

18981902
free(myinput);
1903+
myinput = NULL;
18991904
input_len = 0;
19001905

19011906
return type_id;
@@ -1909,7 +1914,17 @@ H5LTtext_to_dtype(const char *text, H5LT_lang_t lang_type)
19091914
*
19101915
* Purpose: Expand the buffer and append a string to it.
19111916
*
1912-
* Return: void
1917+
* Parameters:
1918+
* _no_user_buf: Operating mode flag
1919+
* - true: Library-managed buffer (can reallocate)
1920+
* - false: User-provided buffer (fixed size, no realloc)
1921+
* len: Pointer to current buffer size (updated if reallocated)
1922+
* buf: Buffer to append to. If NULL, returns NULL immediately.
1923+
* str_to_add: String to append. If NULL:
1924+
* - Library-managed mode: reallocates to ensure extra space, no append
1925+
* - User-provided mode: no-op, returns buf unchanged (space cannot be ensured)
1926+
*
1927+
* Return: Updated buffer pointer on success, NULL on failure
19131928
*
19141929
*-------------------------------------------------------------------------
19151930
*/
@@ -1918,12 +1933,13 @@ realloc_and_append(bool _no_user_buf, size_t *len, char *buf, const char *str_to
19181933
{
19191934
size_t size_str_to_add, size_str;
19201935

1936+
if (!buf)
1937+
goto out;
1938+
1939+
/* Mode 1: Library-managed buffer - can reallocate as needed */
19211940
if (_no_user_buf) {
19221941
char *tmp_realloc;
19231942

1924-
if (!buf)
1925-
goto out;
1926-
19271943
/* If the buffer isn't big enough, reallocate it. Otherwise, go to do strcat. */
19281944
if (str_to_add && ((ssize_t)(*len - (strlen(buf) + strlen(str_to_add) + 1)) < LIMIT)) {
19291945
*len += ((strlen(buf) + strlen(str_to_add) + 1) / INCREMENT + 1) * INCREMENT;
@@ -1941,6 +1957,8 @@ realloc_and_append(bool _no_user_buf, size_t *len, char *buf, const char *str_to
19411957
else
19421958
buf = tmp_realloc;
19431959
}
1960+
/* Mode 2: User-provided buffer - fixed size, no reallocation
1961+
* (else case is implicit - no action needed) */
19441962

19451963
if (str_to_add) {
19461964
/* find the size of the buffer to add */
@@ -2171,6 +2189,58 @@ H5LTdtype_to_text(hid_t dtype, char *str, H5LT_lang_t lang_type, size_t *len)
21712189
return FAIL;
21722190
}
21732191

2192+
/*-------------------------------------------------------------------------
2193+
* Function: append_dtype_super_text
2194+
*
2195+
* Purpose: Helper function to get super type text and append it to dt_str.
2196+
* This encapsulates the common pattern of: allocate buffer,
2197+
* convert dtype to text, append to string, free buffer.
2198+
*
2199+
* Return: Success: updated dt_str pointer, Failure: NULL
2200+
*
2201+
*-------------------------------------------------------------------------
2202+
*/
2203+
static char *
2204+
append_dtype_super_text(hid_t super, char *dt_str, H5LT_lang_t lang, size_t *slen, bool no_user_buf)
2205+
{
2206+
size_t super_len;
2207+
char *stmp = NULL;
2208+
2209+
/* Get required buffer size for super type text */
2210+
if (H5LTdtype_to_text(super, NULL, lang, &super_len) < 0)
2211+
return NULL;
2212+
/* Treat zero-length result as failure rather than calling calloc(0,...) */
2213+
if (super_len == 0)
2214+
return NULL;
2215+
2216+
/* Allocate buffer for super type text */
2217+
stmp = (char *)calloc(super_len, sizeof(char));
2218+
if (!stmp) {
2219+
H5Epush_ret(__func__, H5E_ERR_CLS, H5E_RESOURCE, H5E_NOSPACE, "memory allocation failed", NULL);
2220+
}
2221+
2222+
/* Convert super type to text */
2223+
if (H5LTdtype_to_text(super, stmp, lang, &super_len) < 0) {
2224+
free(stmp);
2225+
return NULL;
2226+
}
2227+
2228+
/* Append super type text to dt_str. Use a temporary pointer so that
2229+
* dt_str is not overwritten before we confirm success. Note:
2230+
* realloc_and_append frees dt_str itself on failure in library-managed
2231+
* mode, so on the failure path dt_str must not be used. */
2232+
{
2233+
char *tmp = realloc_and_append(no_user_buf, slen, dt_str, stmp);
2234+
free(stmp);
2235+
stmp = NULL;
2236+
if (!tmp)
2237+
return NULL;
2238+
dt_str = tmp;
2239+
}
2240+
2241+
return dt_str;
2242+
}
2243+
21742244
/*-------------------------------------------------------------------------
21752245
* Function: H5LT_dtype_to_text
21762246
*
@@ -2536,8 +2606,7 @@ H5LT_dtype_to_text(hid_t dtype, char *dt_str, H5LT_lang_t lang, size_t *slen, bo
25362606
tag = H5Tget_tag(dtype);
25372607
if (tag) {
25382608
snprintf(tmp_str, TMP_LEN, "OPQ_TAG \"%s\";\n", tag);
2539-
if (tag)
2540-
H5free_memory(tag);
2609+
H5free_memory(tag);
25412610
tag = NULL;
25422611
}
25432612
else
@@ -2556,38 +2625,30 @@ H5LT_dtype_to_text(hid_t dtype, char *dt_str, H5LT_lang_t lang, size_t *slen, bo
25562625
break;
25572626
}
25582627
case H5T_ENUM: {
2559-
hid_t super;
2560-
size_t super_len;
2561-
char *stmp = NULL;
2628+
hid_t super;
25622629

25632630
/* Print lead-in */
25642631
snprintf(dt_str, *slen, "H5T_ENUM {\n");
25652632
indent += COL;
25662633
if (!(dt_str = indentation(indent + COL, dt_str, no_user_buf, slen)))
25672634
goto out;
25682635

2636+
/* Get super type and append its text representation */
25692637
if ((super = H5Tget_super(dtype)) < 0)
25702638
goto out;
2571-
if (H5LTdtype_to_text(super, NULL, lang, &super_len) < 0)
2572-
goto out;
2573-
stmp = (char *)calloc(super_len, sizeof(char));
2574-
if (H5LTdtype_to_text(super, stmp, lang, &super_len) < 0) {
2575-
free(stmp);
2576-
goto out;
2577-
}
2578-
if (!(dt_str = realloc_and_append(no_user_buf, slen, dt_str, stmp))) {
2579-
free(stmp);
2580-
goto out;
2639+
{
2640+
char *tmp = append_dtype_super_text(super, dt_str, lang, slen, no_user_buf);
2641+
H5Tclose(super);
2642+
if (!tmp) {
2643+
dt_str = NULL; /* freed by callee's realloc_and_append on failure */
2644+
goto out;
2645+
}
2646+
dt_str = tmp;
25812647
}
25822648

2583-
if (stmp)
2584-
free(stmp);
2585-
stmp = NULL;
2586-
25872649
snprintf(tmp_str, TMP_LEN, ";\n");
25882650
if (!(dt_str = realloc_and_append(no_user_buf, slen, dt_str, tmp_str)))
25892651
goto out;
2590-
H5Tclose(super);
25912652

25922653
if (!(dt_str = print_enum(dtype, dt_str, slen, no_user_buf, indent)))
25932654
goto out;
@@ -2603,37 +2664,30 @@ H5LT_dtype_to_text(hid_t dtype, char *dt_str, H5LT_lang_t lang, size_t *slen, bo
26032664
break;
26042665
}
26052666
case H5T_VLEN: {
2606-
hid_t super;
2607-
size_t super_len;
2608-
char *stmp = NULL;
2667+
hid_t super;
26092668

26102669
/* Print lead-in */
26112670
snprintf(dt_str, *slen, "H5T_VLEN {\n");
26122671
indent += COL;
26132672
if (!(dt_str = indentation(indent + COL, dt_str, no_user_buf, slen)))
26142673
goto out;
26152674

2675+
/* Get super type and append its text representation */
26162676
if ((super = H5Tget_super(dtype)) < 0)
26172677
goto out;
2618-
if (H5LTdtype_to_text(super, NULL, lang, &super_len) < 0)
2619-
goto out;
2620-
stmp = (char *)calloc(super_len, sizeof(char));
2621-
if (H5LTdtype_to_text(super, stmp, lang, &super_len) < 0) {
2622-
free(stmp);
2623-
goto out;
2624-
}
2625-
if (!(dt_str = realloc_and_append(no_user_buf, slen, dt_str, stmp))) {
2626-
free(stmp);
2627-
goto out;
2678+
{
2679+
char *tmp = append_dtype_super_text(super, dt_str, lang, slen, no_user_buf);
2680+
H5Tclose(super);
2681+
if (!tmp) {
2682+
dt_str = NULL; /* freed by callee's realloc_and_append on failure */
2683+
goto out;
2684+
}
2685+
dt_str = tmp;
26282686
}
26292687

2630-
if (stmp)
2631-
free(stmp);
2632-
stmp = NULL;
26332688
snprintf(tmp_str, TMP_LEN, "\n");
26342689
if (!(dt_str = realloc_and_append(no_user_buf, slen, dt_str, tmp_str)))
26352690
goto out;
2636-
H5Tclose(super);
26372691

26382692
/* Print closing */
26392693
indent -= COL;
@@ -2647,8 +2701,6 @@ H5LT_dtype_to_text(hid_t dtype, char *dt_str, H5LT_lang_t lang, size_t *slen, bo
26472701
}
26482702
case H5T_ARRAY: {
26492703
hid_t super;
2650-
size_t super_len;
2651-
char *stmp = NULL;
26522704
hsize_t dims[H5S_MAX_RANK];
26532705
int ndims;
26542706

@@ -2674,26 +2726,22 @@ H5LT_dtype_to_text(hid_t dtype, char *dt_str, H5LT_lang_t lang, size_t *slen, bo
26742726
if (!(dt_str = realloc_and_append(no_user_buf, slen, dt_str, tmp_str)))
26752727
goto out;
26762728

2729+
/* Get super type and append its text representation */
26772730
if ((super = H5Tget_super(dtype)) < 0)
26782731
goto out;
2679-
if (H5LTdtype_to_text(super, NULL, lang, &super_len) < 0)
2680-
goto out;
2681-
stmp = (char *)calloc(super_len, sizeof(char));
2682-
if (H5LTdtype_to_text(super, stmp, lang, &super_len) < 0) {
2683-
free(stmp);
2684-
goto out;
2685-
}
2686-
if (!(dt_str = realloc_and_append(no_user_buf, slen, dt_str, stmp))) {
2687-
free(stmp);
2688-
goto out;
2732+
{
2733+
char *tmp = append_dtype_super_text(super, dt_str, lang, slen, no_user_buf);
2734+
H5Tclose(super);
2735+
if (!tmp) {
2736+
dt_str = NULL; /* freed by callee's realloc_and_append on failure */
2737+
goto out;
2738+
}
2739+
dt_str = tmp;
26892740
}
2690-
if (stmp)
2691-
free(stmp);
2692-
stmp = NULL;
2741+
26932742
snprintf(tmp_str, TMP_LEN, "\n");
26942743
if (!(dt_str = realloc_and_append(no_user_buf, slen, dt_str, tmp_str)))
26952744
goto out;
2696-
H5Tclose(super);
26972745

26982746
/* Print closing */
26992747
indent -= COL;
@@ -2775,37 +2823,30 @@ H5LT_dtype_to_text(hid_t dtype, char *dt_str, H5LT_lang_t lang, size_t *slen, bo
27752823
break;
27762824
}
27772825
case H5T_COMPLEX: {
2778-
hid_t super;
2779-
size_t super_len;
2780-
char *stmp = NULL;
2826+
hid_t super;
27812827

27822828
/* Print lead-in */
27832829
snprintf(dt_str, *slen, "H5T_COMPLEX {\n");
27842830
indent += COL;
27852831
if (!(dt_str = indentation(indent + COL, dt_str, no_user_buf, slen)))
27862832
goto out;
27872833

2834+
/* Get super type and append its text representation */
27882835
if ((super = H5Tget_super(dtype)) < 0)
27892836
goto out;
2790-
if (H5LTdtype_to_text(super, NULL, lang, &super_len) < 0)
2791-
goto out;
2792-
stmp = (char *)calloc(super_len, sizeof(char));
2793-
if (H5LTdtype_to_text(super, stmp, lang, &super_len) < 0) {
2794-
free(stmp);
2795-
goto out;
2796-
}
2797-
if (!(dt_str = realloc_and_append(no_user_buf, slen, dt_str, stmp))) {
2798-
free(stmp);
2799-
goto out;
2837+
{
2838+
char *tmp = append_dtype_super_text(super, dt_str, lang, slen, no_user_buf);
2839+
H5Tclose(super);
2840+
if (!tmp) {
2841+
dt_str = NULL; /* freed by callee's realloc_and_append on failure */
2842+
goto out;
2843+
}
2844+
dt_str = tmp;
28002845
}
28012846

2802-
if (stmp)
2803-
free(stmp);
2804-
stmp = NULL;
28052847
snprintf(tmp_str, TMP_LEN, "\n");
28062848
if (!(dt_str = realloc_and_append(no_user_buf, slen, dt_str, tmp_str)))
28072849
goto out;
2808-
H5Tclose(super);
28092850

28102851
/* Print closing */
28112852
indent -= COL;

hl/src/H5TB.c

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -3038,8 +3038,9 @@ H5TBget_field_info(hid_t loc_id, const char *dset_name, char *field_names[], siz
30383038
field_names[i][HLTB_MAX_FIELD_LEN - 1] = '\0';
30393039
}
30403040
else {
3041-
/* name_len < HLTB_MAX_FIELD_LEN, so name_len + 1 <= HLTB_MAX_FIELD_LEN.
3042-
* Callers must provide buffers of at least HLTB_MAX_FIELD_LEN bytes each
3041+
/* name fits within the limit: copy only name_len + 1 bytes (more efficient
3042+
* than copying HLTB_MAX_FIELD_LEN - 1 bytes). name_len + 1 <= HLTB_MAX_FIELD_LEN,
3043+
* and callers must provide buffers of at least HLTB_MAX_FIELD_LEN bytes each
30433044
* (documented in H5TBpublic.h), so this copy is within bounds. */
30443045
memcpy(field_names[i], member_name, name_len + 1);
30453046
}

release_docs/CHANGELOG.md

Lines changed: 18 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -140,6 +140,14 @@ We would like to thank the many HDF5 community members who contributed to this r
140140

141141
## Library
142142

143+
### Fixed file descriptor leaks in stdio VFD error paths
144+
145+
Fixed multiple resource leaks in the H5FDstdio driver where file descriptors were not properly closed on error paths. The error handling code was incorrectly attempting to close a local variable instead of the file pointer stored in the file structure, leading to file descriptor leaks. This issue affected 5 error paths in `H5FD_stdio_open()` and could cause file descriptor exhaustion in long-running applications.
146+
147+
### Added defensive NULL pointer checks in native VOL connector
148+
149+
Added assertion checks for NULL pointer parameters in `H5VL_native_get_file_struct()` to catch programming errors earlier and improve code robustness.
150+
143151
### Added checks for data filter behavior
144152

145153
The library now verifies that the returned data size from a data filter's filter callback function can fit inside the returned data buffer size. The library also checks that, when data is filtered then unfiltered (filtered in reverse), the returned data size is exactly the same as the original data size.
@@ -194,6 +202,16 @@ We would like to thank the many HDF5 community members who contributed to this r
194202
header `H5TBpublic.h`. Applications can now use this constant to correctly size their
195203
`field_names[]` buffers when calling `H5TBget_field_info()`.
196204

205+
### Fixed memory leaks and improved safety in H5LT functions
206+
207+
- Fixed memory leak in `H5LTtext_to_dtype()` by adding NULL check after `strdup()` call
208+
- Added defensive NULL checks and pointer nullification after `free()` calls to prevent use-after-free bugs
209+
- Improved documentation for `realloc_and_append()` internal function with detailed parameter contracts and preconditions
210+
211+
### Eliminated code duplication in H5LT datatype conversion
212+
213+
Refactored `H5LT_dtype_to_text()` by extracting common super-type handling logic into a new helper function `H5LT_append_dtype_super_text()`. This eliminates approximately 80 lines of duplicated code that was previously repeated across 4 datatype cases (ENUM, VLEN, ARRAY, COMPLEX), improving maintainability and reducing the risk of inconsistent behavior.
214+
197215
### Fixed H5TBread_fields_name/H5TBwrite_fields_name matching the wrong field when one field name is a prefix of another
198216

199217
H5TB_find_field() used strncmp() limited to strlen(field) when comparing the last entry of the supplied comma-separated field list against a table member name. This matched any user-supplied name whose leading characters equaled an existing field name (for example, requesting "PressureExtra" on a table containing "Pressure" would silently operate on the "Pressure" field). The comparison has been changed to strcmp() so full names must match exactly. In addition, H5TBwrite_fields_name() now returns an error when none of the requested field names are found (previously it silently performed a no-op write), matching the existing behavior of H5TBread_fields_name().

0 commit comments

Comments
 (0)