Apply fixed values (value) on dataset specs to generated Data subclasses; warn at spec-author time #1466

@rly

Background

DatasetSpec supports a value field declaring a fixed value for the dataset. For non-data fields (nested AttributeSpecs, etc.), CustomClassGenerator.process_field_spec already handles value (src/hdmf/build/classgenerator.py:239-249, 327-329, 368-369) by marking the field non-settable, removing it from the generated __init__'s docval, and injecting the value post-init via self.fields[f] = field_spec.value.
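
To make the existing behavior for non-data fields concrete, here is a minimal, self-contained sketch of the pattern described above. It does not use hdmf: `FieldSpec` and `make_class` are hypothetical stand-ins, and the real logic in `CustomClassGenerator` is more involved. It only illustrates the three steps (field not settable, dropped from the constructor arguments, value injected post-init).

```python
class FieldSpec:
    """Hypothetical stand-in for a spec entry; only `name` and `value` matter here."""

    def __init__(self, name, value=None):
        self.name = name
        self.value = value


def make_class(name, field_specs):
    """Build a class whose __init__ accepts only fields without a fixed value."""
    fixed = {s.name: s.value for s in field_specs if s.value is not None}
    settable = [s.name for s in field_specs if s.value is None]

    def __init__(self, **kwargs):
        # Fixed-value fields are not part of the accepted arguments.
        unexpected = set(kwargs) - set(settable)
        if unexpected:
            raise TypeError(f"unexpected arguments: {sorted(unexpected)}")
        self.fields = {k: kwargs.get(k) for k in settable}
        # Post-init injection, analogous to self.fields[f] = field_spec.value.
        for f, v in fixed.items():
            self.fields[f] = v

    return type(name, (), {"__init__": __init__})


Example = make_class("Example", [FieldSpec("unit", value="volts"), FieldSpec("rate")])
obj = Example(rate=30.0)
print(obj.fields)
```

Passing `unit=...` to `Example` raises `TypeError` in this sketch, which mirrors the field being removed from the generated `__init__`'s docval.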

That machinery doesn't apply to the dataset's own data. get_attr_names (src/hdmf/build/objectmapper.py:567-579) only collects spec.attributes and (for GroupSpec) nested groups/datasets/links — it never includes data. So data is never in not_inherited_fields and process_field_spec is never called for it. The dataset's data arg flows in solely from get_docval(parent_cls.__init__) (classgenerator.py:68), and the only spec-driven hook on it is _update_data_docval_arg (added in #1459).

_update_data_docval_arg now handles default_value for data. It does not yet handle value.

Why low priority

This only surfaces if an extension author defines a new dataset type with a hard-coded value for its own data — i.e., a generated Data subclass that is effectively a singleton constant. That's an unusual pattern with limited practical use. Worth supporting eventually for completeness, but no known consumer needs it today.

Why not warn at class generation?

#1459 originally warned at class generation when a dataset spec had value set. That warning was removed because it fires at extension import time on end-user machines, where only the extension author can act on it. See PR feedback.

What's needed

Two pieces, separable.

1. Apply fixed values on generated subclasses of Data

All the wiring belongs in the _update_data_docval_arg / set_init flow rather than process_field_spec:

  • In _update_data_docval_arg, when spec.value is not None: drop the data arg from docval_args so users can't pass it.
  • In set_init, inject the fixed value during __init__, either by calling super().__init__(data=spec.value) or by writing to the underlying private slot post-init. This requires threading spec into set_init: today set_init takes not_inherited_fields and name, so spec would either be a new argument or be read from the closure that already captures not_inherited_fields.
  • Mark the inherited data property as non-settable. The existing __fields__ settable=False mechanism only governs __fields__ entries the generator owns; data is a hand-written property on Data. May need a small extension to that mechanism, or a generated __setattr__ guard.
  • Decide and document semantics: a generated Data subclass with fixed data is effectively a singleton constant. Reasonable, but worth being explicit.
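
The bullets above could look roughly like the following sketch. It is not hdmf code: `Data` here is a simplified stand-in with an assumed private `_data` slot, and `make_fixed_data_class` is a hypothetical helper name. It shows one possible combination of the options listed (forwarding the value through the parent `__init__` and guarding assignment with a generated `__setattr__`), not a decided design.

```python
class Data:
    """Simplified stand-in for hdmf's Data; the real class differs."""

    def __init__(self, name, data):
        self.name = name
        self._data = data  # assumed private slot, for illustration only

    @property
    def data(self):
        return self._data

    @data.setter
    def data(self, value):
        self._data = value


def make_fixed_data_class(cls_name, fixed_value, base=Data):
    """Generate a subclass of `base` whose data is always `fixed_value`."""

    def __init__(self, name):
        # `data` is deliberately absent from the signature, so callers
        # cannot pass it; the fixed value is forwarded to the parent.
        base.__init__(self, name=name, data=fixed_value)

    def __setattr__(self, key, value):
        # Generated guard: block reassignment through the inherited property.
        if key == "data":
            raise AttributeError(f"'data' is fixed by the spec for {cls_name}")
        object.__setattr__(self, key, value)

    return type(cls_name, (base,), {"__init__": __init__, "__setattr__": __setattr__})


SchemaVersion = make_fixed_data_class("SchemaVersion", "1.0")
v = SchemaVersion(name="version")
print(v.data)
```

Every instance of `SchemaVersion` carries the same data, which is the "singleton constant" semantics the last bullet asks to document. The guard does not stop writes to the private slot itself; whether that matters is part of the semantics to decide.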

2. Surface unhandled spec features to spec authors, not end users

The right time to flag features the loader doesn't apply (today: fixed value on dataset types; potentially others later) is when authoring/exporting the spec, not at extension import time:

  • Add validation hooks in the spec-export path (NamespaceCatalog, NamespaceBuilder, or wherever specs are serialized to YAML) that flag features that are known to be no-ops on read.
  • Or expose a developer-facing "check my spec" entry point that authors can run during extension development / CI.
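
As a rough illustration of either option, the check below walks a spec expressed as plain dictionaries (the shape it would have after loading the YAML) and reports dataset types that set a fixed value. The function name `find_fixed_value_datasets` and the choice to flag only datasets carrying `data_type_def` are assumptions made for this sketch; no such entry point exists in hdmf today.

```python
def find_fixed_value_datasets(spec, path=""):
    """Return paths of dataset specs that define a new type and set `value`.

    `spec` is a dict with optional "datasets" and "groups" lists, as in a
    loaded schema file. Untyped nested datasets are skipped on the assumption
    that their fixed values are already applied as non-data fields.
    """
    findings = []
    for i, ds in enumerate(spec.get("datasets", [])):
        label = ds.get("data_type_def") or ds.get("name") or f"datasets[{i}]"
        if "data_type_def" in ds and ds.get("value") is not None:
            findings.append(f"{path}/{label}")
    for i, grp in enumerate(spec.get("groups", [])):
        label = grp.get("data_type_def") or grp.get("name") or f"groups[{i}]"
        # Recurse so that dataset types defined inside groups are also checked.
        findings.extend(find_fixed_value_datasets(grp, f"{path}/{label}"))
    return findings


schema = {
    "datasets": [
        {"data_type_def": "SchemaVersion", "data_type_inc": "Data",
         "dtype": "text", "value": "1.0", "doc": "A constant dataset type."},
    ],
    "groups": [
        {"data_type_def": "Recording", "doc": "A group type.",
         "datasets": [{"name": "unit", "dtype": "text", "value": "volts",
                       "doc": "Untyped nested dataset with a fixed value."}]},
    ],
}
for hit in find_fixed_value_datasets(schema):
    print(f"warning: fixed value on dataset type at {hit} is not applied on read")
```

A check like this could run from an export hook or from CI, so the warning reaches the spec author rather than firing at extension import time.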

(2) is useful even after (1) lands, because there will inevitably be other quietly-ignored spec features in the future.

Suggested next steps (not blocking)

  1. Implement (1) above and remove the TODO comment in _update_data_docval_arg.
  2. Independently, design (2): a spec-author-facing lint that catches not-yet-supported spec features and runs at spec authoring/export time rather than class generation time.

Labels

  • category: enhancement (improvements of code or code behavior)
  • priority: low (alternative solution already working and/or relevant to only specific user(s))
  • topic: maintenance (issues related to tech debt / code maintainability)
