Background
DatasetSpec supports a value field declaring a fixed value for the dataset. For non-data fields (nested AttributeSpecs, etc.), CustomClassGenerator.process_field_spec already handles value (src/hdmf/build/classgenerator.py:239-249, 327-329, 368-369) by marking the field non-settable, removing it from the generated __init__'s docval, and injecting the value post-init via self.fields[f] = field_spec.value.
That machinery doesn't apply to the dataset's own data. get_attr_names (src/hdmf/build/objectmapper.py:567-579) only collects spec.attributes and (for GroupSpec) nested groups/datasets/links — it never includes data. So data is never in not_inherited_fields and process_field_spec is never called for it. The dataset's data arg flows in solely from get_docval(parent_cls.__init__) (classgenerator.py:68), and the only spec-driven hook on it is _update_data_docval_arg (added in #1459).
_update_data_docval_arg now handles default_value for data. It does not yet handle value.
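To make the gap concrete, here is a minimal sketch of the name collection described above. It uses plain dicts rather than hdmf spec objects, and collect_field_names is a hypothetical, simplified stand-in for get_attr_names, not the actual implementation.

```python
def collect_field_names(spec: dict) -> set:
    """Hypothetical, simplified stand-in for the field-name collection.

    Gathers names from the spec's attributes and from any nested
    groups/datasets/links. The dataset's own data is never added.
    """
    names = {attr["name"] for attr in spec.get("attributes", [])}
    for key in ("groups", "datasets", "links"):
        names.update(sub["name"] for sub in spec.get(key, []) if "name" in sub)
    return names


# A dataset type with a fixed value for its own data and one fixed attribute.
dataset_spec = {
    "data_type_def": "FixedLabel",
    "dtype": "text",
    "value": "constant",
    "attributes": [{"name": "unit", "dtype": "text", "value": "n/a"}],
}

field_names = collect_field_names(dataset_spec)
```

Only the attribute name is collected, so any per-field processing keyed on these names sees unit but never data.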
Why low priority
This only surfaces if an extension author defines a new dataset type with a hard-coded value for its own data — i.e., a generated Data subclass that is effectively a singleton constant. That's an unusual pattern with limited practical use. Worth supporting eventually for completeness, but no known consumer needs it today.
Why not warn at class generation?
#1459 originally warned at class generation when a dataset spec had value set. That warning was removed because it fires at extension import time on end-user machines, where only the extension author can act on it. See PR feedback.
What's needed
Two pieces, separable.
1. Apply fixed values on generated subclasses of Data
All the wiring belongs in the _update_data_docval_arg / set_init flow rather than process_field_spec:
- In _update_data_docval_arg, when spec.value is not None: drop the data arg from docval_args so users can't pass it.
- In set_init, inject the fixed value during __init__, either by calling super().__init__(data=spec.value) or by writing to the underlying private slot post-init. Note this requires threading spec into set_init (today set_init takes not_inherited_fields and name; spec would be a new arg, or accessed via the closure that already captures not_inherited_fields).
- Mark the inherited data property as non-settable. The existing __fields__ settable=False mechanism only governs __fields__ entries the generator owns; data is a hand-written property on Data. May need a small extension to that mechanism, or a generated __setattr__ guard.
- Decide and document semantics: a generated Data subclass with fixed data is effectively a singleton constant. Reasonable, but worth being explicit.
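The steps above could look roughly like the following sketch. It is self-contained and does not import hdmf: Data here is a minimal stand-in for the real base class, and make_fixed_data_class is a hypothetical helper, not an existing generator hook. It assumes the base __init__ writes to a private slot rather than going through the property setter.

```python
class Data:
    """Minimal stand-in for a hand-written base class with a settable data property."""

    def __init__(self, name, data):
        self.name = name
        self._data = data  # written to the private slot, not via the setter

    @property
    def data(self):
        return self._data

    @data.setter
    def data(self, value):
        self._data = value


def make_fixed_data_class(type_name, fixed_value, base=Data):
    """Hypothetical helper: build a subclass whose data is pinned to fixed_value."""

    def __init__(self, name):
        # 'data' is not an accepted argument; the fixed value is injected here.
        base.__init__(self, name=name, data=fixed_value)

    def _get_data(self):
        # Reuse the base getter so reads behave exactly as on the parent class.
        return base.data.fget(self)

    # A property with no setter makes assignment raise AttributeError.
    namespace = {"__init__": __init__, "data": property(_get_data)}
    return type(type_name, (base,), namespace)


FixedLabel = make_fixed_data_class("FixedLabel", "constant")
label = FixedLabel(name="label")
```

Overriding the property on the generated subclass is one alternative to a generated __setattr__ guard; it leaves the parent class untouched, but it only works if the parent never assigns through self.data during construction.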
2. Surface unhandled spec features to spec authors, not end users
The right time to flag features the loader doesn't apply (today: fixed value on dataset types; potentially others later) is when authoring/exporting the spec, not at extension import time:
- Add validation hooks in the spec-export path (NamespaceCatalog, NamespaceBuilder, or wherever specs are serialized to YAML) that flag features that are known to be no-ops on read.
- Or expose a developer-facing "check my spec" entry point that authors can run during extension development / CI.
(2) is useful even after (1) lands, because there will inevitably be other quietly-ignored spec features in the future.
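As a rough illustration of what such an author-facing check might do, the sketch below walks a spec that has already been loaded into plain dicts (as from YAML) and reports dataset types that set a fixed value. The function name and the dict layout are assumptions for illustration; no such entry point exists today.

```python
def find_fixed_value_dataset_types(spec: dict) -> list:
    """Hypothetical lint: list dataset types that declare a fixed 'value'.

    Only dataset specs that define a new type (data_type_def) are reported,
    since that is the case described above as not applied on read.
    """
    findings = []
    for ds in spec.get("datasets", []):
        if ds.get("data_type_def") and ds.get("value") is not None:
            findings.append(ds["data_type_def"])
    # Group specs may nest further dataset definitions, so recurse into them.
    for grp in spec.get("groups", []):
        findings.extend(find_fixed_value_dataset_types(grp))
    return findings


spec_file = {
    "datasets": [
        {"data_type_def": "FixedLabel", "dtype": "text", "value": "constant"},
        {"data_type_def": "FreeLabel", "dtype": "text"},
    ],
    "groups": [
        {
            "data_type_def": "Container1",
            "datasets": [{"name": "version", "dtype": "text", "value": "1.0"}],
        },
    ],
}

findings = find_fixed_value_dataset_types(spec_file)
```

A CI job could fail or warn when the returned list is non-empty, which keeps the message in front of the extension author rather than the end user.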
Suggested next steps (not blocking)
- Implement (1) above and remove the TODO comment in _update_data_docval_arg.
- Independently, design (2): a spec-author-facing lint that catches not-yet-supported spec features and runs at spec authoring/export time rather than class generation time.
Context
_update_data_docval_arg and default_value support; defers value.