Commit f3eaed0

fix: variable-length string attribute reading (Issue #14)
Root causes fixed:
- V1/V2 attribute message 8-byte alignment (H5O_ALIGN_OLD macro)
- IsVariableString() now checks ClassBitField instead of Properties
- VLen string format: 4-byte length prefix + Global Heap reference

Changes:
- internal/core/attribute.go: Added DatatypeVarLen case to ReadValue()
- internal/core/datatype.go: Fixed IsVariableString() per HDF5 spec III.A.2.4.d
- Updated all documentation examples to use attr.ReadValue()

Files created by h5py now work correctly.

Closes #14
1 parent 86f05e0 commit f3eaed0
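
The 8-byte alignment named in the commit message can be illustrated with a small sketch. The helper name `alignUp8` is hypothetical and is not taken from `internal/core/attribute.go`; it only shows the round-up-to-a-multiple-of-8 arithmetic that the commit attributes to `H5O_ALIGN_OLD`, assuming the padding applies to each of the name, datatype, and dataspace sizes.

```go
package main

import "fmt"

// alignUp8 rounds n up to the next multiple of 8.
// Hypothetical helper for illustration; the library's own code may differ.
func alignUp8(n uint64) uint64 {
	// Adding 7 and clearing the low three bits rounds up to a multiple of 8.
	return (n + 7) &^ 7
}

func main() {
	// Example field sizes as they might appear in a V1/V2 attribute message.
	for _, size := range []uint64{0, 1, 5, 8, 9, 16} {
		fmt.Printf("declared size %2d -> bytes consumed %2d\n", size, alignUp8(size))
	}
}
```

A parser that advances by the declared size instead of the padded size would start reading the next field too early, which is consistent with the misread attributes described in this commit.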

11 files changed: 381 additions, 59 deletions

CHANGELOG.md

Lines changed: 57 additions & 0 deletions
@@ -7,6 +7,63 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ---

+## [v0.13.4] - 2025-01-29
+
+### 🐛 Bug Fixes
+
+#### Fixed: Variable-Length String Attribute Reading (Issue #14)
+
+Users reported that attributes could not be read from HDF5 files created by h5py.
+The issue manifested in two ways:
+1. Integer attributes on root groups were not found
+2. Variable-length string attributes returned "unsupported datatype class 9"
+
+**Root Causes**:
+
+1. **V1/V2 Attribute Alignment**: Name, datatype, and dataspace fields in attribute messages
+   must be padded to 8-byte boundaries (per `H5O_ALIGN_OLD` macro in C library), but we
+   were using exact sizes.
+
+2. **IsVariableString() Detection**: The function was checking `Properties[0] & 0x0F == DatatypeString`,
+   but per HDF5 Format Specification III.A.2.4.d, variable-length string type is indicated by
+   `ClassBitField & 0x0F == 1` (where 1 = String, 0 = Sequence).
+
+3. **VLen String Data Format**: Variable-length strings include a 4-byte length prefix before
+   the Global Heap reference, making the total size `4 + offsetSize + 4` bytes, not `offsetSize + 4`.
+
+**Fixed**:
+- `internal/core/attribute.go`: Added 8-byte alignment for V1/V2 attribute parsing
+- `internal/core/datatype.go`: Fixed `IsVariableString()` to check `ClassBitField` correctly
+- `internal/core/attribute.go`: Added `DatatypeVarLen` case to `ReadValue()` with proper vlen format
+- `internal/core/attribute.go`: Added `readVariableLengthString()` helper for Global Heap access
+
+**Result**:
+- Integer attributes on root groups now read correctly
+- Variable-length string attributes on datasets now read correctly
+- Files created by h5py work without issues
+
+**Test file**: Python script from Issue #14 creates file with:
+- Root group: `File Attribute = 123456` (integer)
+- Dataset: `Dataset Attribute 1 = "Test Attribute 1"` (vlen string)
+- Dataset: `Dataset Attribute 2 = "Test Attribute 2"` (vlen string)
+
+All attributes now read successfully.
+
+### 📊 Test Suite Results
+
+**Official HDF5 Test Suite Results**:
+- Pass rate: **100%** (378/378 valid files) - maintained
+- All existing tests pass
+- New unit tests added for variable-length string attributes
+
+**Files Changed**:
+- `internal/core/attribute.go` - VLen string support, 8-byte alignment fix
+- `internal/core/attribute_test.go` - New unit tests for vlen strings
+- `internal/core/datatype.go` - Fixed IsVariableString() logic
+- `internal/core/datatype_helpers_test.go` - Updated test cases
+
+---
+
 ## [v0.13.3] - 2025-01-28

 ### 🐛 Bug Fixes
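
The `IsVariableString()` fix described in the CHANGELOG entry above can be sketched as follows. This is an illustration under stated assumptions, not the code from `internal/core/datatype.go`: the `datatype` struct, its field layout, and the constant names are hypothetical stand-ins. Only the check `ClassBitField & 0x0F == 1` and the class number 9 come from the commit text.

```go
package main

import "fmt"

// Hypothetical stand-ins for the library's types; names are illustrative only.
const (
	classString = 3 // HDF5 datatype class 3: fixed-length string
	classVarLen = 9 // HDF5 datatype class 9: variable-length

	vlenKindSequence = 0 // low 4 bits of the class bit field: sequence
	vlenKindString   = 1 // low 4 bits of the class bit field: string
)

type datatype struct {
	Class         uint8
	ClassBitField uint32
}

// isVariableString reports whether dt is a variable-length string,
// using the class bit field rather than the properties bytes.
func isVariableString(dt datatype) bool {
	return dt.Class == classVarLen && dt.ClassBitField&0x0F == vlenKindString
}

func main() {
	fmt.Println(isVariableString(datatype{Class: classVarLen, ClassBitField: vlenKindString}))
	fmt.Println(isVariableString(datatype{Class: classVarLen, ClassBitField: vlenKindSequence}))
	fmt.Println(isVariableString(datatype{Class: classString, ClassBitField: vlenKindString}))
}
```

Masking with `0x0F` keeps only the low four bits, so values stored in the higher bits of the bit field do not affect the result.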

README.md

Lines changed: 3 additions & 3 deletions
@@ -12,7 +12,7 @@
 [![Stars](https://img.shields.io/github/stars/scigolib/hdf5?style=flat-square&logo=github)](https://github.com/scigolib/hdf5/stargazers)
 [![Discussions](https://img.shields.io/github/discussions/scigolib/hdf5?style=flat-square&logo=github&label=discussions)](https://github.com/scigolib/hdf5/discussions)

-A modern, pure Go library for reading and writing HDF5 files without CGo dependencies. **v0.13.0: HDF5 2.0.0 compatibility with security hardening, AI/ML datatypes, and 86.1% code coverage!**
+A modern, pure Go library for reading and writing HDF5 files without CGo dependencies. **v0.13.4: 100% HDF5 test suite pass rate, full attribute support including variable-length strings!**

 ---

@@ -393,8 +393,8 @@ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file
 ---

 **Status**: Stable - HDF5 2.0.0 compatible with security hardening
-**Version**: v0.13.0 (4 CVEs fixed, AI/ML datatypes, 86.1% coverage, 0 lint issues)
-**Last Updated**: 2025-11-13
+**Version**: v0.13.4 (100% HDF5 test suite pass rate, attribute reading fix, 86.1% coverage)
+**Last Updated**: 2025-01-29

 ---

ROADMAP.md

Lines changed: 10 additions & 3 deletions
@@ -3,7 +3,7 @@
 > **Strategic Advantage**: We have official HDF5 C library as reference implementation!
 > **Approach**: Port proven algorithms, not invent from scratch - Senior Go Developer mindset

-**Last Updated**: 2025-01-28 | **Current Version**: v0.13.3 | **Strategy**: HDF5 2.0.0 compatible → security hardened → v1.0.0 LTS | **Milestone**: v0.13.3 RELEASED! (2025-01-28 compatibility improvements) → v1.0.0 LTS (Q3 2026)
+**Last Updated**: 2025-01-29 | **Current Version**: v0.13.4 | **Strategy**: HDF5 2.0.0 compatible → security hardened → v1.0.0 LTS | **Milestone**: v0.13.4 RELEASED! (2025-01-29 attribute reading fix) → v1.0.0 LTS (Q3 2026)

 ---

@@ -93,6 +93,13 @@ v1.0.0 LTS → Long-term support release (Q3 2026)
 - Official HDF5 test suite: **100% pass rate** (378/378 valid files)
 - Added flux.h5 to test suite, professional error testing for corrupt files

+**v0.13.4** = Attribute Reading Fix ✅ RELEASED (2025-01-29)
+- Fixed Issue #14: Variable-length string attributes not readable
+- Fixed V1/V2 attribute message 8-byte alignment (H5O_ALIGN_OLD macro)
+- Fixed IsVariableString() detection (ClassBitField, not Properties)
+- Fixed vlen string data format (4-byte length prefix + Global Heap reference)
+- Files created by h5py now work correctly
+
 **v0.13.x** = Stable Maintenance Phase (current)
 - Monitoring for bug reports from production use
 - Performance optimizations when identified
@@ -112,7 +119,7 @@ v1.0.0 LTS → Long-term support release (Q3 2026)

 ---

-## 📊 Current Status (v0.13.3)
+## 📊 Current Status (v0.13.4)

 **Phase**: 🛡️ Stable Maintenance (monitoring, community support)
 **HDF5 2.0.0 Format Spec v4.0**: Complete! 🎉
@@ -397,5 +404,5 @@ v1.0.0 LTS → Long-term support release (Q3 2026)
 ---

 *Version 5.2 (Updated 2025-01-27)*
-*Current: v0.13.3 (STABLE) | Phase: Maintenance | Next: v0.14.0+ (community-driven) | Target: v1.0.0 LTS (Q3 2026)*
+*Current: v0.13.4 (STABLE) | Phase: Maintenance | Next: v0.14.0+ (community-driven) | Target: v1.0.0 LTS (Q3 2026)*
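
The vlen string data format listed for v0.13.4 (a 4-byte length prefix followed by a Global Heap reference, `4 + offsetSize + 4` bytes in total) can be decoded roughly as in the sketch below. It assumes little-endian fields and that the reference consists of a heap collection address of `offsetSize` bytes followed by a 4-byte object index. The type `vlenRef` and the function `parseVlenRef` are hypothetical names, not the library's `readVariableLengthString()` helper.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// vlenRef is a hypothetical view of one variable-length element as stored on disk.
type vlenRef struct {
	Length      uint32 // 4-byte length prefix
	HeapAddress uint64 // address of the global heap collection (offsetSize bytes)
	ObjectIndex uint32 // 4-byte index of the object inside that collection
}

// parseVlenRef decodes 4 + offsetSize + 4 bytes from data.
// Only offset sizes of 4 and 8 are handled in this sketch.
func parseVlenRef(data []byte, offsetSize int) (vlenRef, error) {
	if offsetSize != 4 && offsetSize != 8 {
		return vlenRef{}, errors.New("offsetSize must be 4 or 8 in this sketch")
	}
	need := 4 + offsetSize + 4
	if len(data) < need {
		return vlenRef{}, fmt.Errorf("need %d bytes, got %d", need, len(data))
	}

	var ref vlenRef
	ref.Length = binary.LittleEndian.Uint32(data[0:4])
	if offsetSize == 8 {
		ref.HeapAddress = binary.LittleEndian.Uint64(data[4:12])
	} else {
		ref.HeapAddress = uint64(binary.LittleEndian.Uint32(data[4:8]))
	}
	ref.ObjectIndex = binary.LittleEndian.Uint32(data[4+offsetSize : need])
	return ref, nil
}

func main() {
	// Made-up 16-byte element: length 16, heap address 0x1000, object index 1.
	raw := []byte{
		16, 0, 0, 0,
		0x00, 0x10, 0, 0, 0, 0, 0, 0,
		1, 0, 0, 0,
	}
	ref, err := parseVlenRef(raw, 8)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("length=%d heap=0x%x index=%d\n", ref.Length, ref.HeapAddress, ref.ObjectIndex)
}
```

With 8-byte offsets the element is 16 bytes long, which matches the "size 16" in the error message quoted in the troubleshooting guide. Reading the string bytes themselves would then require looking up the object in the global heap, which is outside the scope of this sketch.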

docs/guides/DATATYPES.md

Lines changed: 6 additions & 1 deletion
@@ -421,7 +421,12 @@ if err == nil {

 ```go
 for _, attr := range attrs {
-    switch v := attr.Value.(type) {
+    value, err := attr.ReadValue()
+    if err != nil {
+        fmt.Printf("error reading %s: %v\n", attr.Name, err)
+        continue
+    }
+    switch v := value.(type) {
     case int32:
         fmt.Printf("int32: %d\n", v)
     case int64:

docs/guides/FAQ.md

Lines changed: 8 additions & 2 deletions
@@ -163,7 +163,7 @@ See [ROADMAP.md](../../ROADMAP.md) for future plans.

 ### Can I read attributes?

-**Yes!** Full attribute reading support:
+**Yes!** Full attribute reading support including variable-length strings:

 ```go
 // Group attributes
@@ -174,13 +174,19 @@ attrs, err := dataset.Attributes()

 // Access attribute values
 for _, attr := range attrs {
-    fmt.Printf("%s = %v (type: %T)\n", attr.Name, attr.Value, attr.Value)
+    value, err := attr.ReadValue()
+    if err != nil {
+        log.Printf("Error reading %s: %v", attr.Name, err)
+        continue
+    }
+    fmt.Printf("%s = %v (type: %T)\n", attr.Name, value, value)
 }
 ```

 **Supported**:
 - ✅ Compact attributes (in object header)
 - ✅ Dense attributes (fractal heap direct blocks)
+- ✅ All datatypes including variable-length strings (v0.13.4+)

 **Note**: Dense attributes (8+ attributes) fully supported via B-tree v2 and fractal heap.

docs/guides/READING_DATA.md

Lines changed: 28 additions & 7 deletions
@@ -285,7 +285,9 @@ Dataset: temperature

 ## 🏷️ Reading Attributes

-Attributes are metadata attached to groups and datasets.
+Attributes are metadata attached to groups and datasets. Attributes can contain
+any HDF5 datatype including integers, floats, fixed-length strings, and
+variable-length strings.

 ### Reading Group Attributes

@@ -301,8 +303,13 @@ if err != nil {

 fmt.Printf("Group '/' has %d attributes:\n", len(attrs))
 for _, attr := range attrs {
+    value, err := attr.ReadValue()
+    if err != nil {
+        fmt.Printf(" - %s: ERROR: %v\n", attr.Name, err)
+        continue
+    }
     fmt.Printf(" - %s: %v (type: %s)\n",
-        attr.Name, attr.Value, attr.Datatype)
+        attr.Name, value, attr.Datatype)
 }
 ```

@@ -320,7 +327,12 @@ file.Walk(func(path string, obj hdf5.Object) {
         if len(attrs) > 0 {
             fmt.Printf("\nDataset: %s\n", path)
             for _, attr := range attrs {
-                fmt.Printf(" @%s = %v\n", attr.Name, attr.Value)
+                value, err := attr.ReadValue()
+                if err != nil {
+                    fmt.Printf(" @%s = ERROR: %v\n", attr.Name, err)
+                    continue
+                }
+                fmt.Printf(" @%s = %v\n", attr.Name, value)
             }
         }
     }
@@ -329,14 +341,21 @@ file.Walk(func(path string, obj hdf5.Object) {

 ### Attribute Types

-Attributes support the same datatypes as datasets:
+Attributes support the same datatypes as datasets, including variable-length strings (v0.13.4+):

 ```go
 for _, attr := range attrs {
     fmt.Printf("Attribute: %s\n", attr.Name)

+    // Read the value first
+    value, err := attr.ReadValue()
+    if err != nil {
+        fmt.Printf(" Error: %v\n", err)
+        continue
+    }
+
     // Value is interface{}, type depends on HDF5 datatype
-    switch v := attr.Value.(type) {
+    switch v := value.(type) {
     case int32:
         fmt.Printf(" Type: int32, Value: %d\n", v)

@@ -788,7 +807,8 @@ func main() {

     fmt.Println("=== File Metadata ===")
     for _, attr := range attrs {
-        fmt.Printf("%s: %v\n", attr.Name, attr.Value)
+        value, _ := attr.ReadValue()
+        fmt.Printf("%s: %v\n", attr.Name, value)
     }

     // Extract dataset metadata
@@ -801,7 +821,8 @@ func main() {
         attrs, err := ds.Attributes()
         if err == nil {
             for _, attr := range attrs {
-                fmt.Printf(" @%s = %v\n", attr.Name, attr.Value)
+                value, _ := attr.ReadValue()
+                fmt.Printf(" @%s = %v\n", attr.Name, value)
             }
         }

docs/guides/TROUBLESHOOTING.md

Lines changed: 23 additions & 10 deletions
@@ -179,15 +179,21 @@ for i, child := range root.Children() {

 **Error Message**:
 ```
-Error: unsupported datatype: H5T_ARRAY
 Error: unsupported datatype class: 10
+Error: unsupported datatype class 9 or size 16
 ```

-**Cause**: Dataset uses a datatype not yet implemented.
+**Cause**: Dataset or attribute uses a datatype not yet implemented.

-**Supported Types**: int32, int64, float32, float64, strings, compounds
+**Supported Types (v0.13.4+)**:
+- Integer types (int8-64, uint8-64)
+- Floating point (float32, float64, bfloat16, FP8)
+- Strings (fixed-length and variable-length)
+- Compounds (nested structures)
+- Arrays, Enums, References, Opaque

-**Unsupported**: arrays, enums, references, opaque, time
+**Note**: If you see "unsupported datatype class 9", upgrade to v0.13.4+ which adds
+variable-length string support for attributes.

 **Solution**:

@@ -375,21 +381,28 @@ panic: interface conversion: interface {} is float64, not int32

 **Solution**:

-Always use safe type assertion:
+Always read the value first, then use safe type assertion:

 ```go
+// Read the attribute value
+value, err := attr.ReadValue()
+if err != nil {
+    log.Printf("Error reading attribute: %v", err)
+    return
+}
+
 // Bad: Direct assertion (can panic)
-value := attr.Value.(int32)
+intValue := value.(int32)

 // Good: Safe assertion with check
-if value, ok := attr.Value.(int32); ok {
-    fmt.Printf("int32: %d\n", value)
+if intValue, ok := value.(int32); ok {
+    fmt.Printf("int32: %d\n", intValue)
 } else {
-    fmt.Printf("Not int32, actual type: %T\n", attr.Value)
+    fmt.Printf("Not int32, actual type: %T\n", value)
 }

 // Best: Use type switch
-switch v := attr.Value.(type) {
+switch v := value.(type) {
 case int32:
     fmt.Printf("int32: %d\n", v)
 case int64:
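
The type-switch pattern recommended in the troubleshooting hunk above is cut off by the diff context, so here is a self-contained sketch of the same idea. It does not call the library: `describe` is a hypothetical helper that takes an `interface{}` such as the one `attr.ReadValue()` returns, and the assumption that variable-length string attributes arrive as Go `string` values is not confirmed by this commit.

```go
package main

import "fmt"

// describe formats a dynamically typed attribute value without risking a
// panic from a failed type assertion. Hypothetical helper for illustration.
func describe(value interface{}) string {
	switch v := value.(type) {
	case int32:
		return fmt.Sprintf("int32: %d", v)
	case int64:
		return fmt.Sprintf("int64: %d", v)
	case float64:
		return fmt.Sprintf("float64: %g", v)
	case string:
		// Assumed representation of string attributes in this sketch.
		return fmt.Sprintf("string: %q", v)
	default:
		// Unknown types are reported rather than asserted.
		return fmt.Sprintf("unhandled type %T", v)
	}
}

func main() {
	values := []interface{}{int32(123456), "Test Attribute 1", uint8(1)}
	for _, v := range values {
		fmt.Println(describe(v))
	}
}
```

The `default` branch is what distinguishes this from a direct assertion: an unexpected type produces a readable message instead of a runtime panic.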
