57 changes: 57 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,63 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

---

## [v0.13.4] - 2025-01-29

### 🐛 Bug Fixes

#### Fixed: Variable-Length String Attribute Reading (Issue #14)

Users reported that attributes could not be read from HDF5 files created by h5py.
The issue manifested in two ways:
1. Integer attributes on root groups were not found
2. Variable-length string attributes returned "unsupported datatype class 9"

**Root Causes**:

1. **V1/V2 Attribute Alignment**: Name, datatype, and dataspace fields in attribute messages
must be padded to 8-byte boundaries (per the `H5O_ALIGN_OLD` macro in the C library), but the
parser was advancing by their exact, unpadded sizes.

2. **IsVariableString() Detection**: The function was checking `Properties[0] & 0x0F == DatatypeString`,
but per HDF5 Format Specification III.A.2.4.d, variable-length string type is indicated by
`ClassBitField & 0x0F == 1` (where 1 = String, 0 = Sequence).

3. **VLen String Data Format**: Each variable-length string element stores a 4-byte length prefix before
the Global Heap reference, making the element size `4 + offsetSize + 4` bytes, not `offsetSize + 4`.
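
The first two root causes can be illustrated with a minimal, standalone sketch. It is not the code from `internal/core`: the helper names `padTo8` and `isVLenString` and the example field sizes are hypothetical, chosen only to show the rounding and the bit-field test described above.

```go
package main

import "fmt"

// padTo8 rounds n up to the next multiple of 8, which is the effect the
// changelog attributes to H5O_ALIGN_OLD. (Hypothetical helper name.)
func padTo8(n int) int {
	return (n + 7) &^ 7
}

// isVLenString reports whether the class bit field of a variable-length
// datatype marks it as a string (low 4 bits == 1) rather than a
// sequence (low 4 bits == 0). (Hypothetical helper name.)
func isVLenString(classBitField uint32) bool {
	return classBitField&0x0F == 1
}

func main() {
	// Illustrative field sizes for one attribute message.
	nameSize, datatypeSize, dataspaceSize := 15, 8, 12

	// Each field starts where the previous padded field ended.
	datatypeOffset := padTo8(nameSize)
	dataspaceOffset := datatypeOffset + padTo8(datatypeSize)
	dataOffset := dataspaceOffset + padTo8(dataspaceSize)
	fmt.Println(datatypeOffset, dataspaceOffset, dataOffset)

	// Higher bits of the class bit field carry other flags, so only the
	// low 4 bits are compared.
	fmt.Println(isVLenString(0x101), isVLenString(0x100))
}
```

With exact sizes the datatype would be read from offset 15 instead of 16, which is the kind of misread the alignment fix addresses.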

**Fixed**:
- `internal/core/attribute.go`: Added 8-byte alignment for V1/V2 attribute parsing
- `internal/core/datatype.go`: Fixed `IsVariableString()` to check `ClassBitField` correctly
- `internal/core/attribute.go`: Added `DatatypeVarLen` case to `ReadValue()` with proper vlen format
- `internal/core/attribute.go`: Added `readVariableLengthString()` helper for Global Heap access
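
As a rough illustration of the vlen format handled by the fix, the sketch below decodes one `4 + offsetSize + 4` byte element into its three parts. The type `vlenRef` and the function `parseVLenRef` are hypothetical names, not the library's API, and little-endian byte order is assumed. Resolving the reference against the Global Heap is left out.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// vlenRef holds the fields of one on-disk variable-length element.
// (Hypothetical type, for illustration only.)
type vlenRef struct {
	Length      uint32 // 4-byte length prefix
	HeapAddr    uint64 // Global Heap collection address (offsetSize bytes)
	ObjectIndex uint32 // 4-byte index of the object within the collection
}

// parseVLenRef decodes one element of 4 + offsetSize + 4 bytes.
// Little-endian encoding is an assumption of this sketch.
func parseVLenRef(buf []byte, offsetSize int) (vlenRef, error) {
	if offsetSize < 1 || offsetSize > 8 {
		return vlenRef{}, errors.New("offsetSize must be between 1 and 8")
	}
	need := 4 + offsetSize + 4
	if len(buf) < need {
		return vlenRef{}, fmt.Errorf("short buffer: need %d bytes, have %d", need, len(buf))
	}

	var ref vlenRef
	ref.Length = binary.LittleEndian.Uint32(buf[0:4])
	// Assemble the address byte by byte so any width up to 8 works.
	for i := 0; i < offsetSize; i++ {
		ref.HeapAddr |= uint64(buf[4+i]) << (8 * uint(i))
	}
	ref.ObjectIndex = binary.LittleEndian.Uint32(buf[4+offsetSize : need])
	return ref, nil
}

func main() {
	buf := []byte{
		0x10, 0, 0, 0, // length = 16
		0x00, 0x10, 0, 0, 0, 0, 0, 0, // heap address = 0x1000
		0x01, 0, 0, 0, // object index = 1
	}
	ref, err := parseVLenRef(buf, 8)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("length=%d heap=0x%x index=%d\n", ref.Length, ref.HeapAddr, ref.ObjectIndex)
}
```

Reading only `offsetSize + 4` bytes, as before the fix, would treat the length prefix as the start of the heap address.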

**Result**:
- Integer attributes on root groups now read correctly
- Variable-length string attributes on datasets now read correctly
- Files created by h5py work without issues

**Test file**: The Python script from Issue #14 creates a file with:
- Root group: `File Attribute = 123456` (integer)
- Dataset: `Dataset Attribute 1 = "Test Attribute 1"` (vlen string)
- Dataset: `Dataset Attribute 2 = "Test Attribute 2"` (vlen string)

All attributes now read successfully.

### 📊 Test Suite Results

**Official HDF5 Test Suite Results**:
- Pass rate: **100%** (378/378 valid files) - maintained
- All existing tests pass
- New unit tests added for variable-length string attributes

**Files Changed**:
- `internal/core/attribute.go` - VLen string support, 8-byte alignment fix
- `internal/core/attribute_test.go` - New unit tests for vlen strings
- `internal/core/datatype.go` - Fixed IsVariableString() logic
- `internal/core/datatype_helpers_test.go` - Updated test cases

---

## [v0.13.3] - 2025-01-28

### 🐛 Bug Fixes
6 changes: 3 additions & 3 deletions README.md
@@ -12,7 +12,7 @@
[![Stars](https://img.shields.io/github/stars/scigolib/hdf5?style=flat-square&logo=github)](https://github.com/scigolib/hdf5/stargazers)
[![Discussions](https://img.shields.io/github/discussions/scigolib/hdf5?style=flat-square&logo=github&label=discussions)](https://github.com/scigolib/hdf5/discussions)

A modern, pure Go library for reading and writing HDF5 files without CGo dependencies. **v0.13.0: HDF5 2.0.0 compatibility with security hardening, AI/ML datatypes, and 86.1% code coverage!**
A modern, pure Go library for reading and writing HDF5 files without CGo dependencies. **v0.13.4: 100% HDF5 test suite pass rate, full attribute support including variable-length strings!**

---

@@ -393,8 +393,8 @@ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file
---

**Status**: Stable - HDF5 2.0.0 compatible with security hardening
**Version**: v0.13.0 (4 CVEs fixed, AI/ML datatypes, 86.1% coverage, 0 lint issues)
**Last Updated**: 2025-11-13
**Version**: v0.13.4 (100% HDF5 test suite pass rate, attribute reading fix, 86.1% coverage)
**Last Updated**: 2025-01-29

---

13 changes: 10 additions & 3 deletions ROADMAP.md
@@ -3,7 +3,7 @@
> **Strategic Advantage**: We have official HDF5 C library as reference implementation!
> **Approach**: Port proven algorithms, not invent from scratch - Senior Go Developer mindset

**Last Updated**: 2025-01-28 | **Current Version**: v0.13.3 | **Strategy**: HDF5 2.0.0 compatible → security hardened → v1.0.0 LTS | **Milestone**: v0.13.3 RELEASED! (2025-01-28 compatibility improvements) → v1.0.0 LTS (Q3 2026)
**Last Updated**: 2025-01-29 | **Current Version**: v0.13.4 | **Strategy**: HDF5 2.0.0 compatible → security hardened → v1.0.0 LTS | **Milestone**: v0.13.4 RELEASED! (2025-01-29 attribute reading fix) → v1.0.0 LTS (Q3 2026)

---

@@ -93,6 +93,13 @@ v1.0.0 LTS → Long-term support release (Q3 2026)
- Official HDF5 test suite: **100% pass rate** (378/378 valid files)
- Added flux.h5 to test suite, professional error testing for corrupt files

**v0.13.4** = Attribute Reading Fix ✅ RELEASED (2025-01-29)
- Fixed Issue #14: Variable-length string attributes not readable
- Fixed V1/V2 attribute message 8-byte alignment (H5O_ALIGN_OLD macro)
- Fixed IsVariableString() detection (ClassBitField, not Properties)
- Fixed vlen string data format (4-byte length prefix + Global Heap reference)
- Files created by h5py now work correctly

**v0.13.x** = Stable Maintenance Phase (current)
- Monitoring for bug reports from production use
- Performance optimizations when identified
@@ -112,7 +119,7 @@ v1.0.0 LTS → Long-term support release (Q3 2026)

---

## 📊 Current Status (v0.13.3)
## 📊 Current Status (v0.13.4)

**Phase**: 🛡️ Stable Maintenance (monitoring, community support)
**HDF5 2.0.0 Format Spec v4.0**: Complete! 🎉
@@ -397,5 +404,5 @@ v1.0.0 LTS → Long-term support release (Q3 2026)
---

*Version 5.2 (Updated 2025-01-27)*
*Current: v0.13.3 (STABLE) | Phase: Maintenance | Next: v0.14.0+ (community-driven) | Target: v1.0.0 LTS (Q3 2026)*
*Current: v0.13.4 (STABLE) | Phase: Maintenance | Next: v0.14.0+ (community-driven) | Target: v1.0.0 LTS (Q3 2026)*

7 changes: 6 additions & 1 deletion docs/guides/DATATYPES.md
@@ -421,7 +421,12 @@ if err == nil {

```go
for _, attr := range attrs {
switch v := attr.Value.(type) {
value, err := attr.ReadValue()
if err != nil {
fmt.Printf("error reading %s: %v\n", attr.Name, err)
continue
}
switch v := value.(type) {
case int32:
fmt.Printf("int32: %d\n", v)
case int64:
10 changes: 8 additions & 2 deletions docs/guides/FAQ.md
@@ -163,7 +163,7 @@ See [ROADMAP.md](../../ROADMAP.md) for future plans.

### Can I read attributes?

**Yes!** Full attribute reading support:
**Yes!** Full attribute reading support including variable-length strings:

```go
// Group attributes
@@ -174,13 +174,19 @@ attrs, err := dataset.Attributes()

// Access attribute values
for _, attr := range attrs {
fmt.Printf("%s = %v (type: %T)\n", attr.Name, attr.Value, attr.Value)
value, err := attr.ReadValue()
if err != nil {
log.Printf("Error reading %s: %v", attr.Name, err)
continue
}
fmt.Printf("%s = %v (type: %T)\n", attr.Name, value, value)
}
```

**Supported**:
- ✅ Compact attributes (in object header)
- ✅ Dense attributes (fractal heap direct blocks)
- ✅ All datatypes including variable-length strings (v0.13.4+)

**Note**: Dense attributes (8+ attributes) fully supported via B-tree v2 and fractal heap.
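
Because attribute values arrive as `interface{}`, callers typically dispatch on the concrete type. The following is a small standalone sketch of that pattern. It does not import the library: `describe` is a hypothetical helper, and the set of cases is illustrative rather than a list of the types the library returns.

```go
package main

import "fmt"

// describe formats a decoded attribute value by its concrete Go type.
// (Hypothetical helper; the cases shown are examples only.)
func describe(name string, value interface{}) string {
	switch v := value.(type) {
	case int32:
		return fmt.Sprintf("%s = %d (int32)", name, v)
	case int64:
		return fmt.Sprintf("%s = %d (int64)", name, v)
	case float64:
		return fmt.Sprintf("%s = %g (float64)", name, v)
	case string:
		return fmt.Sprintf("%s = %q (string)", name, v)
	default:
		// Fall back to generic formatting instead of panicking.
		return fmt.Sprintf("%s = %v (unhandled type %T)", name, v, v)
	}
}

func main() {
	fmt.Println(describe("File Attribute", int64(123456)))
	fmt.Println(describe("Dataset Attribute 1", "Test Attribute 1"))
	fmt.Println(describe("flags", []uint8{1, 2}))
}
```

The `default` branch keeps unexpected types visible in the output rather than silently dropping them.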

35 changes: 28 additions & 7 deletions docs/guides/READING_DATA.md
@@ -285,7 +285,9 @@ Dataset: temperature

## 🏷️ Reading Attributes

Attributes are metadata attached to groups and datasets.
Attributes are metadata attached to groups and datasets. Attributes can contain
any HDF5 datatype including integers, floats, fixed-length strings, and
variable-length strings.

### Reading Group Attributes

@@ -301,8 +303,13 @@ if err != nil {

fmt.Printf("Group '/' has %d attributes:\n", len(attrs))
for _, attr := range attrs {
value, err := attr.ReadValue()
if err != nil {
fmt.Printf(" - %s: ERROR: %v\n", attr.Name, err)
continue
}
fmt.Printf(" - %s: %v (type: %s)\n",
attr.Name, attr.Value, attr.Datatype)
attr.Name, value, attr.Datatype)
}
```

@@ -320,7 +327,12 @@ file.Walk(func(path string, obj hdf5.Object) {
if len(attrs) > 0 {
fmt.Printf("\nDataset: %s\n", path)
for _, attr := range attrs {
fmt.Printf(" @%s = %v\n", attr.Name, attr.Value)
value, err := attr.ReadValue()
if err != nil {
fmt.Printf(" @%s = ERROR: %v\n", attr.Name, err)
continue
}
fmt.Printf(" @%s = %v\n", attr.Name, value)
}
}
}
@@ -329,14 +341,21 @@ file.Walk(func(path string, obj hdf5.Object) {

### Attribute Types

Attributes support the same datatypes as datasets:
Attributes support the same datatypes as datasets, including variable-length strings (v0.13.4+):

```go
for _, attr := range attrs {
fmt.Printf("Attribute: %s\n", attr.Name)

// Read the value first
value, err := attr.ReadValue()
if err != nil {
fmt.Printf(" Error: %v\n", err)
continue
}

// Value is interface{}, type depends on HDF5 datatype
switch v := attr.Value.(type) {
switch v := value.(type) {
case int32:
fmt.Printf(" Type: int32, Value: %d\n", v)

@@ -788,7 +807,8 @@ func main() {

fmt.Println("=== File Metadata ===")
for _, attr := range attrs {
fmt.Printf("%s: %v\n", attr.Name, attr.Value)
value, _ := attr.ReadValue()
fmt.Printf("%s: %v\n", attr.Name, value)
}

// Extract dataset metadata
@@ -801,7 +821,8 @@ func main() {
attrs, err := ds.Attributes()
if err == nil {
for _, attr := range attrs {
fmt.Printf(" @%s = %v\n", attr.Name, attr.Value)
value, _ := attr.ReadValue()
fmt.Printf(" @%s = %v\n", attr.Name, value)
}
}

33 changes: 23 additions & 10 deletions docs/guides/TROUBLESHOOTING.md
@@ -179,15 +179,21 @@ for i, child := range root.Children() {

**Error Message**:
```
Error: unsupported datatype: H5T_ARRAY
Error: unsupported datatype class: 10
Error: unsupported datatype class 9 or size 16
```

**Cause**: Dataset uses a datatype not yet implemented.
**Cause**: Dataset or attribute uses a datatype not yet implemented.

**Supported Types**: int32, int64, float32, float64, strings, compounds
**Supported Types (v0.13.4+)**:
- Integer types (int8-64, uint8-64)
- Floating point (float32, float64, bfloat16, FP8)
- Strings (fixed-length and variable-length)
- Compounds (nested structures)
- Arrays, Enums, References, Opaque

**Unsupported**: arrays, enums, references, opaque, time
**Note**: If you see "unsupported datatype class 9", upgrade to v0.13.4+ which adds
variable-length string support for attributes.

**Solution**:

@@ -375,21 +381,28 @@ panic: interface conversion: interface {} is float64, not int32

**Solution**:

Always use safe type assertion:
Always read the value first, then use safe type assertion:

```go
// Read the attribute value
value, err := attr.ReadValue()
if err != nil {
log.Printf("Error reading attribute: %v", err)
return
}

// Bad: Direct assertion (can panic)
value := attr.Value.(int32)
intValue := value.(int32)

// Good: Safe assertion with check
if value, ok := attr.Value.(int32); ok {
fmt.Printf("int32: %d\n", value)
if intValue, ok := value.(int32); ok {
fmt.Printf("int32: %d\n", intValue)
} else {
fmt.Printf("Not int32, actual type: %T\n", attr.Value)
fmt.Printf("Not int32, actual type: %T\n", value)
}

// Best: Use type switch
switch v := attr.Value.(type) {
switch v := value.(type) {
case int32:
fmt.Printf("int32: %d\n", v)
case int64: