@@ -142,16 +142,199 @@ Female 40 60 30
142142Female 60 100 27
143143====== ========= ======= ======
144144
145+ Constructing Lookup Tables from a Component
146+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
147+
148+ Components can register lookup tables to be built by specifying
149+ a ``data_sources`` block in their ``configuration_defaults`` property.
150+ As a basic example, ``DiseaseModel`` in ``vivarium_public_health`` has the following
151+ ``data_sources`` configuration:
152+ .. code-block:: python
153+
154+     @property
155+     def configuration_defaults(self) -> dict[str, Any]:
156+         return {
157+             f"{self.name}": {
158+                 "data_sources": {
159+                     "cause_specific_mortality_rate": self.load_cause_specific_mortality_rate,
160+                 },
161+             },
162+         }
163+
164+ which specifies a single lookup table named
165+ ``cause_specific_mortality_rate`` whose data is provided by the component's
166+ ``load_cause_specific_mortality_rate`` method.
167+
168+ Each entry in
169+ ``data_sources`` maps a table name to a data source from one of several supported types
170+ (see `Data Source Types`_). Barring edge cases (see
171+ `Limitations and When to Override`_), one should specify all of a component's
172+ lookup tables via ``data_sources``, instead of accessing the builder's lookup
173+ interface directly.
174+
175+ When a component configures ``data_sources``, the base
176+ :class:`Component <vivarium.component.Component>` class automatically builds
177+ the lookup tables before the component's ``setup()`` method is called. The
178+ resulting tables are stored in the component's ``lookup_tables`` dictionary,
179+ keyed by the name specified in ``data_sources``.
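
The ordering described above (tables are built first, then ``setup()`` runs) can be illustrated with a toy sketch. Everything here is hypothetical: ``ToyComponent``, ``run_setup``, and ``_build_scalar_table`` are illustrative names, not part of vivarium, and only scalar data sources are handled.

```python
from typing import Any, Callable


class ToyComponent:
    """Hypothetical stand-in for a component base class; not vivarium's API."""

    # Hypothetical mapping of table name -> data source (scalars only here).
    data_sources: dict[str, Any] = {}

    def __init__(self) -> None:
        self.lookup_tables: dict[str, Callable[[list[int]], list[Any]]] = {}
        self.seen_in_setup: list[str] = []

    def run_setup(self) -> None:
        # Build every configured table first ...
        for name, source in self.data_sources.items():
            self.lookup_tables[name] = self._build_scalar_table(source)
        # ... so that setup() can already rely on them.
        self.setup()

    @staticmethod
    def _build_scalar_table(value: Any) -> Callable[[list[int]], list[Any]]:
        # A scalar "table" returns the same value for every index entry.
        return lambda index: [value for _ in index]

    def setup(self) -> None:
        # Record which tables were available when setup() ran.
        self.seen_in_setup = sorted(self.lookup_tables)


class ToyDisease(ToyComponent):
    data_sources = {"cause_specific_mortality_rate": 0.02}


component = ToyDisease()
component.run_setup()
rates = component.lookup_tables["cause_specific_mortality_rate"]([0, 1, 2])
```

In this sketch the table is keyed by the same name used in ``data_sources``, mirroring how the text describes the ``lookup_tables`` dictionary.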
180+
181+ This approach separates the *what* (which tables to build and where to get data) from the
182+ *how* (the mechanics of table construction), making components easier to
183+ write and configure. It also allows users to override data sources in model specification files
184+ without modifying component code. Following the example above, a model specification could adjust the
185+ ``cause_specific_mortality_rate`` data source to point to different data or a scalar value:
186+
187+ .. code-block:: yaml
188+
189+     configuration:
190+         disease_model:
191+             data_sources:
192+                 cause_specific_mortality_rate: 0.02
193+
194+ Data Source Types
195+ ^^^^^^^^^^^^^^^^^
196+
197+ Each entry in ``data_sources`` maps a table name to a data source. The
198+ following data source types are supported:
199+
200+ **Artifact key (string without** ``::`` **):**
201+     A string path to data in the artifact, e.g.,
202+     ``"cause.all_causes.cause_specific_mortality_rate"``. The data is loaded
203+     via ``builder.data.load()``.
204+
205+ **Callable:**
206+     Any callable (function, lambda, or bound method) that accepts a ``builder``
207+     argument and returns the data.
208+
209+ **Scalar value:**
210+     A numeric value (``int``, ``float``), ``datetime``, or ``timedelta`` that
211+     will be broadcast over the population index when the table is called.
212+
213+ **Method reference (string with** ``self::`` **):**
214+     A string of the form ``"self::method_name"`` that references a method on
215+     the component itself. The method should accept a ``builder`` argument and
216+     return the data. This is primarily for use in the `model specification YAML
217+     files <model_specification_concept>`_ where direct method references are not
218+     possible.
219+
220+ **External function reference (string with** ``module.path::`` **):**
221+     A string of the form ``"module.path::function_name"`` that references a
222+     function in another module. The function should accept a ``builder``
223+     argument and return the data. This is primarily for use in the
224+     `model specification YAML files <model_specification_concept>`_ where direct
225+     function references are not possible.
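
As a rough illustration of how the five kinds of data source listed above can be told apart, here is a minimal sketch. The function ``classify_data_source`` is hypothetical and is not vivarium's implementation; it only mirrors the rules stated in this section.

```python
from datetime import datetime, timedelta
from typing import Any


def classify_data_source(source: Any) -> str:
    """Label a data source according to the categories described above.

    Hypothetical helper for illustration only.
    """
    if callable(source):
        return "callable"
    if isinstance(source, str):
        if "::" not in source:
            # No "::" separator, so the string is read as an artifact key.
            return "artifact key"
        prefix, _, _name = source.partition("::")
        if prefix == "self":
            return "method reference"
        return "external function reference"
    if isinstance(source, (int, float, datetime, timedelta)):
        return "scalar"
    raise TypeError(f"Unsupported data source: {source!r}")


labels = [
    classify_data_source("cause.all_causes.cause_specific_mortality_rate"),
    classify_data_source(lambda builder: 0.0),
    classify_data_source(0.02),
    classify_data_source("self::load_unmodeled_csmr"),
    classify_data_source("my_module.data::load_unmodeled_csmr"),
]
```

The callable check comes first so that functions and bound methods are never mistaken for another type; strings are then split on ``::`` to separate artifact keys from the two reference forms.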
226+
227+
228+
229+ Column Detection
230+ ^^^^^^^^^^^^^^^^
231+
232+ When building a lookup table from a :class:`pandas.DataFrame` using ``data_sources``,
233+ the component automatically determines key columns, parameter columns, and value columns
234+ based on the data structure:
235+
236+ - **Value columns** default to ``["value"]`` (configurable via the artifact
237+   interface).
238+ - **Parameter columns** are detected by finding columns ending in ``_start``
239+   that have corresponding ``_end`` columns (e.g., ``age_start``/``age_end``).
240+ - **Key columns** are all remaining columns that are neither value columns
241+   nor parameter bin edge columns.
242+
243+ See the `Construction Parameters`_ section for definitions of these
244+ column types.
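
The detection rules above can be sketched on a plain list of column names. This is an illustrative approximation, not the library's code: ``detect_columns`` and the example column names are assumptions made for this sketch.

```python
def detect_columns(
    columns: list[str], value_columns: tuple[str, ...] = ("value",)
) -> tuple[list[str], list[str], list[str]]:
    """Split column names into key, parameter, and value columns.

    Hypothetical helper that follows the rules described above.
    """
    value_cols = [col for col in columns if col in value_columns]

    parameter_cols: list[str] = []
    bin_edge_cols: set[str] = set()
    for col in columns:
        if not col.endswith("_start"):
            continue
        base = col[: -len("_start")]
        # A "<base>_start" column counts only if "<base>_end" also exists.
        if f"{base}_end" in columns:
            parameter_cols.append(base)
            bin_edge_cols.update({f"{base}_start", f"{base}_end"})

    # Everything that is neither a value column nor a bin edge is a key column.
    key_cols = [
        col for col in columns if col not in value_cols and col not in bin_edge_cols
    ]
    return key_cols, parameter_cols, value_cols


keys, parameters, values = detect_columns(
    ["sex", "age_start", "age_end", "year_start", "year_end", "value"]
)
# keys == ["sex"], parameters == ["age", "year"], values == ["value"]
```

Note that a lone ``_start`` column without a matching ``_end`` column falls through to the key columns in this sketch.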
245+
246+ Example: Writing a Component with Data Sources
247+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
248+
249+ A more complete example is reproduced from the ``Mortality`` component in ``vivarium_public_health``:
250+
251+ .. code-block:: python
252+
253+     from vivarium import Component
254+
255+     class Mortality(Component):
256+
257+         @property
258+         def configuration_defaults(self) -> dict[str, Any]:
259+             return {
260+                 "mortality": {
261+                     "data_sources": {
262+                         # Artifact key - loaded via builder.data.load()
263+                         "all_cause_mortality_rate": "cause.all_causes.cause_specific_mortality_rate",
264+                         # Bound method (callable) - called as self.load_unmodeled_csmr(builder)
265+                         "unmodeled_cause_specific_mortality_rate": self.load_unmodeled_csmr,
266+                         # Another artifact key
267+                         "life_expectancy": "population.theoretical_minimum_risk_life_expectancy",
268+                     },
269+                     "unmodeled_causes": [],
270+                 },
271+             }
272+
273+ Example: Configuring Data Sources as a User
274+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
275+
276+ Users can override the default data sources in a model specification YAML
277+ file. This allows changing where data comes from without modifying component
278+ code:
279+
280+ .. code-block:: yaml
281+
282+     configuration:
283+         mortality:
284+             data_sources:
285+                 # Override with a scalar value instead of artifact data
286+                 all_cause_mortality_rate: 0.01
287+                 # Or point to a module function
288+                 unmodeled_cause_specific_mortality_rate: "my_module.data::load_unmodeled_csmr"
289+                 # Or point to different artifact data
290+                 life_expectancy: "alternative.life_expectancy.data"
291+
292+ Limitations and When to Override
293+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
294+
295+ The automatic ``data_sources`` mechanism works well for straightforward cases,
296+ but some scenarios require overriding the ``build_all_lookup_tables()`` method:
297+
298+ **Non-standard value columns:**
299+     The component defaults to ``["value"]`` as the value column name. If your
300+     data has differently named value columns or multiple value columns, you
301+     must call ``build_lookup_table()`` directly with explicit
302+     ``value_columns``.
303+
304+ **Complex data transformations:**
305+     When data requires transformation before building tables (e.g., pivoting,
306+     computing derived parameters, combining multiple data sources), override
307+     ``build_all_lookup_tables()`` to perform the transformation first.
308+
309+ **Delegation to sub-components:**
310+     When lookup tables should be built by sub-components rather than the
311+     parent component, override ``build_all_lookup_tables()`` to skip the
312+     default behavior.
313+
314+ Examples of these patterns can be found in ``vivarium_public_health``:
315+
316+ - ``RateTransition`` and ``DiseaseState`` in ``vivarium_public_health.disease``
317+   demonstrate the basic ``data_sources`` pattern with various data source types.
318+ - ``Risk`` in ``vivarium_public_health.risks`` overrides ``build_all_lookup_tables()``
319+   to delegate table construction to its exposure distribution sub-component.
320+
321+ Using the Lookup Interface Directly
322+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
323+
324+ For cases not covered by ``data_sources``, or when working in an interactive
325+ context, you can build lookup tables directly using the builder's lookup
326+ interface.
327+
145328Example Usage
146329~~~~~~~~~~~~~
147330
148331The following is an example of creating and calling a lookup table in an
149- :ref:`interactive setting <interactive_tutorial>` using the data above. The
150- interface and process are the same when integrating a lookup table into a
151- :term:`component <Component>`, which is primarily how they are used. Assuming
152- you have a valid simulation object named ``sim`` and the data from the above
153- table in a :class:`pandas.DataFrame` named ``data``, you can construct a
154- lookup table in the following way, using the interface from the builder.
332+ :ref:`interactive setting <interactive_tutorial>` using the data from
333+ `Construction Parameters`_ above. The interface and process are the same when
334+ integrating a lookup table into a :term:`component <Component>`, which is primarily
335+ how they are used. Assuming you have a valid simulation object named ``sim`` and
336+ the data from the above table in a :class:`pandas.DataFrame` named ``data``, you
337+ can construct a lookup table in the following way, using the interface from the builder.
155338
156339.. code-block:: python
157340