fix: variable-length string attribute reading (Issue #14) #15

Merged: kolkov merged 1 commit into main from fix/issue-14-vlen-string-attributes on Jan 29, 2026

kolkov commented Jan 29, 2026

Summary

Fixes #14 - Users couldn't read attributes from HDF5 files created by h5py.

Root causes identified and fixed:

  1. V1/V2 Attribute Alignment: Name, datatype, and dataspace fields must be padded to 8-byte boundaries (per H5O_ALIGN_OLD macro in C library)

  2. IsVariableString() Detection: The old check tested Properties[0] & 0x0F == DatatypeString, but per HDF5 Format Spec III.A.2.4.d a variable-length string is indicated by ClassBitField & 0x0F == 1

  3. VLen String Data Format: Each variable-length string element starts with a 4-byte length, followed by the Global Heap reference (a heap collection address of offsetSize bytes plus a 4-byte object index), for a total of 4 + offsetSize + 4 bytes
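
As a rough illustration of root causes 1 and 2, the Go sketch below shows the 8-byte rounding and the class bit field check. The names alignOld, Datatype, classVarLen and isVariableString are hypothetical and are not taken from internal/core. The sketch also assumes that the check requires datatype class 9 (variable-length) in addition to the bit field test, which this description does not state.

```go
package main

import "fmt"

// alignOld rounds n up to the next multiple of 8, the same rounding
// the H5O_ALIGN_OLD macro performs. Hypothetical helper name.
func alignOld(n int) int {
	return (n + 7) &^ 7
}

// Datatype is a hypothetical, trimmed-down stand-in for a parsed
// datatype message; only the fields needed for the check are modeled.
type Datatype struct {
	Class         uint8  // datatype class from the message header
	ClassBitField uint32 // 24-bit class bit field
}

// classVarLen is the variable-length datatype class in the HDF5 format.
// Assumed here to correspond to DatatypeVarLen.
const classVarLen = 9

// isVariableString reports whether dt describes a variable-length string:
// class 9 whose type bits (low nibble of the bit field) equal 1.
func isVariableString(dt Datatype) bool {
	return dt.Class == classVarLen && dt.ClassBitField&0x0F == 1
}

func main() {
	// A 5-byte name occupies 8 bytes on disk, a 17-byte one occupies 24.
	fmt.Println(alignOld(5), alignOld(8), alignOld(17)) // 8 8 24

	fmt.Println(isVariableString(Datatype{Class: classVarLen, ClassBitField: 0x01})) // true
	fmt.Println(isVariableString(Datatype{Class: classVarLen, ClassBitField: 0x00})) // false
}
```

When walking an attribute message, a reader would advance its offset by alignOld(size) rather than size after each of the name, datatype, and dataspace fields.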

Changes:

  • internal/core/attribute.go - Added DatatypeVarLen case to ReadValue(), fixed 8-byte alignment
  • internal/core/datatype.go - Fixed IsVariableString() to check ClassBitField correctly
  • internal/core/attribute_test.go - Added unit tests for vlen string attributes
  • Documentation updated with correct attr.ReadValue() API
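
For illustration, the element layout that the new DatatypeVarLen case has to handle (a 4-byte length, then a Global Heap reference) could be decoded roughly as follows. This is a minimal sketch, not the code from attribute.go: vlenRef and parseVLenRef are hypothetical names, little-endian encoding is assumed, and only offset sizes of 4 and 8 bytes are handled. Resolving the reference to the actual string bytes in the Global Heap is left out.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// vlenRef is a hypothetical view of one variable-length element as stored
// in the attribute data: a length followed by a Global Heap reference.
type vlenRef struct {
	Length      uint32 // 4-byte length prefix
	HeapAddress uint64 // Global Heap collection address (offsetSize bytes)
	ObjectIndex uint32 // 4-byte object index within the collection
}

// parseVLenRef decodes the 4 + offsetSize + 4 bytes at the start of data.
func parseVLenRef(data []byte, offsetSize int) (vlenRef, error) {
	if offsetSize != 4 && offsetSize != 8 {
		return vlenRef{}, errors.New("unsupported offset size")
	}
	need := 4 + offsetSize + 4
	if len(data) < need {
		return vlenRef{}, fmt.Errorf("vlen element too short: have %d bytes, need %d", len(data), need)
	}

	var ref vlenRef
	ref.Length = binary.LittleEndian.Uint32(data[0:4])
	if offsetSize == 8 {
		ref.HeapAddress = binary.LittleEndian.Uint64(data[4:12])
	} else {
		ref.HeapAddress = uint64(binary.LittleEndian.Uint32(data[4:8]))
	}
	ref.ObjectIndex = binary.LittleEndian.Uint32(data[4+offsetSize : need])
	return ref, nil
}

func main() {
	raw := []byte{
		0x10, 0x00, 0x00, 0x00, // length = 16
		0x00, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // heap address = 0x1000
		0x02, 0x00, 0x00, 0x00, // object index = 2
	}
	ref, err := parseVLenRef(raw, 8)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("length=%d heap=0x%x index=%d\n", ref.Length, ref.HeapAddress, ref.ObjectIndex)
}
```

Returning an error for short input instead of slicing blindly keeps a truncated or malformed attribute from causing a panic.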

Test Results

Group '/' has 1 attributes:
  - File Attribute = 123456

Dataset: Test Dataset
Dataset 'Test Dataset' has 2 attributes:
  - Dataset Attribute 1 = Test Attribute 1
  - Dataset Attribute 2 = Test Attribute 2

  • All tests pass ✅
  • Linter: 0 issues ✅
  • Official HDF5 Suite: 100% (378/378) ✅

Test plan

  • Unit tests for IsVariableString() detection
  • Unit tests for vlen attribute ReadValue() error cases
  • Integration test with h5py-created file
  • Official HDF5 test suite (100% pass rate maintained)
  • Linter passes (0 issues)

Root causes fixed:
- V1/V2 attribute message 8-byte alignment (H5O_ALIGN_OLD macro)
- IsVariableString() now checks ClassBitField instead of Properties
- VLen string format: 4-byte length prefix + Global Heap reference

Changes:
- internal/core/attribute.go: Added DatatypeVarLen case to ReadValue()
- internal/core/datatype.go: Fixed IsVariableString() per HDF5 spec III.A.2.4.d
- Updated all documentation examples to use attr.ReadValue()

Files created by h5py now work correctly.

Closes #14
codecov bot commented Jan 29, 2026

Codecov Report

❌ Patch coverage is 41.17647% with 40 lines in your changes missing coverage. Please review.

Files with missing lines | Patch % | Lines
internal/core/attribute.go | 39.39% | 40 Missing ⚠️

kolkov merged commit 9063e5d into main Jan 29, 2026
11 checks passed
kolkov deleted the fix/issue-14-vlen-string-attributes branch January 29, 2026 21:22
Linked issue that this pull request may close: Can't read attributes of groups and datasets