RFC: Caching field values #134

Open
@AdamHillier

Feature description and motivation

At the moment, field values on component instances behave much like instance attributes of generic Python class instances. One value exists per instance, and if it is mutable then an access after modification will return the same, modified value, e.g.:

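# Imports assume the zookeeper package that this issue belongs to.
from typing import List

from zookeeper import Field, component

@component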
class A:
    foo: List[int] = Field(lambda: [1, 2, 3])

class B:
    def __init__(self):
        self.foo = [1, 2, 3]

a = A()
b = B()

assert a.foo == b.foo == [1, 2, 3]

a.foo.append(4)
b.foo.append(4)

assert a.foo == b.foo == [1, 2, 3, 4]

We could change this behaviour so that field values instead behave like @property values, i.e. the value is not 'cached' on the instance but is re-generated on every access. For the motivation behind this alternative behaviour, see the discussion at larq/zoo#148 (comment).
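For comparison, here is a minimal sketch of the @property-like behaviour in plain Python, without any component machinery. The class name `C` is hypothetical and exists only for this illustration.

```python
from typing import List


class C:
    @property
    def foo(self) -> List[int]:
        # A new list is built on every access; nothing is stored on the instance.
        return [1, 2, 3]


c = C()
c.foo.append(4)  # mutates a temporary list that is discarded straight away

assert c.foo == [1, 2, 3]   # the modification is not visible on the next access
assert c.foo is not c.foo   # each access returns a distinct object
```

Under this behaviour, the final assertion in the example above (`a.foo == [1, 2, 3, 4]`) would no longer hold for the component `A`.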

Current implementation

For a full explanation of how components access field values, see the docstring of the _wrap_getattribute method in component.py:

"""
The logic for this overridden `__getattribute__` is as follows:
During component instantiation, any values passed to `__init__` are stored
in a dict on the instance `__component_instantiated_field_values__`. This
means that a priori the `__dict__` of a component instance is empty (of
non-dunder attributes).
Field values can come from three sources, in descending order of priority:
1) A value that was passed into `configure` (e.g. via the CLI), which is
stored in the `__component_configured_field_values__` dict on the
component instance or some parent component instance.
2) A value that was passed in at instantiation, which is stored in the
`__component_instantiated_field_values__` dict on the current component
instance (but not any parent instance).
3) A default value obtained from the `get_default` factory method of a
field defined on the component class of the current instance if it has
one, or otherwise from the factory of the field on the component class
of the nearest parent component instance with a field of the same name,
et cetera.
Once we find a field value from one of these three sources, we set the value
on the instance `__dict__` (i.e. we 'cache' it).
This means that if we find a value in the instance `__dict__` we can
immediately return it without worrying about checking the three cases above
in order. It also means that each look-up other than the first will incur no
substantial time penalty.
"""

New implementation

It would be straightforward to implement @property-esque behaviour for default values which are passed into fields, as mutable default values are already generated from lambdas, and there's no issue with immutable default values being cached.

However, it would be much more difficult to implement for values passed in through the CLI. Consider the configuration CLI argument foo=[1, 2, 3]. We receive this as a string, and parse it into a Python value (in this case a list) to be used as the value for the field foo. If we wanted to return a new instance of this list on each access of foo, we would either need to be able to deep-clone generic mutable objects, or we would have to hold on to the configuration value as a string, and re-parse it into a Python value each time.
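The two options can be sketched as follows. This is only an illustration under assumptions: `ast.literal_eval` stands in for whatever parser is actually used for CLI values, and the function names are hypothetical.

```python
import ast
import copy

raw = "[1, 2, 3]"  # the value part of the CLI argument foo=[1, 2, 3]

# Option 1: parse once, then deep-copy the parsed value on every access.
parsed = ast.literal_eval(raw)


def get_foo_by_copy():
    return copy.deepcopy(parsed)


# Option 2: hold on to the string and re-parse it on every access.
def get_foo_by_reparse():
    return ast.literal_eval(raw)


for get_foo in (get_foo_by_copy, get_foo_by_reparse):
    value = get_foo()
    value.append(4)                 # mutate the returned list
    assert get_foo() == [1, 2, 3]   # the next access is unaffected
```

Option 1 depends on every value being deep-copyable, which `copy.deepcopy` cannot guarantee for arbitrary objects. Option 2 pays the parsing cost on each access and requires keeping the raw string around.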

It's an open question whether we are happy for default values and CLI-overridden values to behave differently.
