One-liner
When a dimensioned variable with Jinja-templated presentation.grapher_config (e.g. un_wpp.population referencing << age >>/<< sex >>) is combined via arithmetic with another variable, or consumed by a downstream step that produces a non-dimensional output, its context-bound metadata can survive into the result — and then explode at grapher render time with UndefinedError: 'age' is undefined.
Context
Variable metadata today is designed to be propagated through pandas arithmetic and YAML overlays, which is the right default for origins, sources, licenses — those are legitimately inherited. But presentation (and especially presentation.grapher_config) is a context-bound field: its Jinja templates only make sense in the exact dimensional context where they were authored. Propagating them silently into a different context is a footgun.
Two mechanisms contribute:
- Runtime combine in lib/catalog/owid/catalog/core/indicators.py (combine_indicators_metadata, ~L881–993):
  - origins/sources/licenses → combined (unique union) ✅ sensible.
  - display and presentation → kept only if all operands are identical, else None (L808–824, L827–844).
  - Division (/) keeps only the first operand's metadata (L625–630).
  - Net effect: if one operand has rich presentation and the other has None, they're not identical, so presentation is dropped. But if operands happen to share it (or one path funnels un_wpp into the result), the Jinja survives with no sanity check against the output's dimensions.
- YAML overlay merge in lib/catalog/owid/catalog/core/yaml_metadata.py (_merge_variable_metadata, L148–180):
  - merge_fields=["presentation", "grapher_config"] (L152) → deep-merge semantics.
  - A downstream *.meta.yml that overrides presentation.title_public does not clear sibling presentation.grapher_config.subtitle inherited at runtime. The author has to know to explicitly write subtitle: "" to clear it.
Together these make it very easy to ship a grapher variable whose subtitle Jinja can't resolve in the new context.
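To make the interaction concrete, here is a minimal, stdlib-only sketch of the two mechanisms. It is a simplified model, not the actual code in indicators.py or yaml_metadata.py: combine_presentation and deep_merge are hypothetical stand-ins for the "identical or None" rule and the deep-merge overlay described above, and the subtitle is an abbreviated template.

```python
from __future__ import annotations

from copy import deepcopy
from typing import Optional


def combine_presentation(operands: list[Optional[dict]]) -> Optional[dict]:
    """Simplified 'identical or None' rule: keep the field only if all operands agree."""
    first = operands[0]
    return deepcopy(first) if all(op == first for op in operands) else None


def deep_merge(base: dict, overlay: dict) -> dict:
    """Recursively merge overlay into base; keys absent from overlay are inherited."""
    merged = deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


inherited = {
    "title_public": "Population",
    "grapher_config": {
        "subtitle": "<%- if age == '0' %>Children under 1 year old.<%- endif %>"
    },
}

# Identical operands: the templated subtitle survives the combine untouched.
shared = combine_presentation([inherited, deepcopy(inherited)])
# One operand without presentation: the whole field is dropped.
dropped = combine_presentation([inherited, None])

# An overlay that only sets title_public leaves the inherited subtitle in place.
overlaid = deep_merge(shared, {"title_public": "Number of stunted children"})
# Clearing it requires an explicit empty string.
cleared = deep_merge(shared, {"grapher_config": {"subtitle": ""}})
```

In this model, neither step ever compares the template against the dimensions of the output, which is the gap described above.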
Example
un_wpp.population defines a Jinja subtitle tied to age/sex:
# etl/steps/data/garden/un/2024-07-12/un_wpp.meta.yml
presentation:
  grapher_config:
    subtitle: |-
      <%- if age == '0' %>
      <%- if sex == 'all' %>Children under 1 year old.<%- elif sex == 'female' %>Girls...<%- endif %>
      <%- elif age == '18+' %>...<%- endif %> {definitions.global.projections}
malnutrition/2024-12-16/malnutrition.py multiplies a WDI rate by un_wpp.population:
tb_under_five = tb_population[(tb_population["age"] == "0-4") & ...]
tb = pr.merge(tb, tb_under_five, on=["country", "year"])
for col in COLUMNS:
    tb[COLUMNS[col]] = ((tb[col] / 100) * tb["population"]).round(0).astype("Int64")
tb = tb.drop(columns=[..., "sex", "age", "variant"])
tb = tb.format(["country", "year"], short_name="malnutrition")
malnutrition.meta.yml overrides title_public but not grapher_config.subtitle. At grapher time:
jinja2.exceptions.UndefinedError: 'age' is undefined
ValueError: Error expanding Jinja in metadata for column 'number_of_stunted_children' with dim values: {}.
The subtitle in the failing metadata is byte-for-byte the template from un_wpp.population — inherited all the way through despite the output having no dimensions.
A similar fix pattern already exists at etl/steps/data/garden/demography/2024-07-12/un_wpp_historical.meta.yml (commit 482b09c), where authors manually set subtitle: "" / note: "" to neutralize inherited Jinja. That pattern is the workaround; it is also evidence that the default is wrong.
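The failure and the empty-string workaround can be reproduced in isolation with plain jinja2. This is a sketch under stated assumptions, not the ETL's rendering code. The Environment is configured with the `<% ... %>` / `<< ... >>` delimiters used in this issue, StrictUndefined stands in for whatever strictness the real renderer applies, and SUBTITLE is an abbreviated stand-in for the un_wpp template.

```python
from jinja2 import Environment, StrictUndefined
from jinja2.exceptions import UndefinedError

# Assumption: delimiters chosen to match the <% ... %> / << ... >> syntax in this issue.
env = Environment(
    block_start_string="<%",
    block_end_string="%>",
    variable_start_string="<<",
    variable_end_string=">>",
    undefined=StrictUndefined,  # referencing a missing name raises instead of rendering ""
)

# Abbreviated stand-in for the subtitle template in un_wpp.meta.yml.
SUBTITLE = (
    "<%- if age == '0' %>Children under 1 year old."
    "<%- elif age == '18+' %>Adults.<%- endif %>"
)


def render(template: str, dim_values: dict) -> str:
    """Render a metadata template against one combination of dimension values."""
    return env.from_string(template).render(**dim_values)


# In the authoring context the dimension is available and the template resolves.
in_context = render(SUBTITLE, {"age": "0"})

# In a table with no dimensions the same template cannot be evaluated.
try:
    render(SUBTITLE, {})
    error_message = None
except UndefinedError as exc:
    error_message = str(exc)

# The workaround: an empty string contains no template, so it renders anywhere.
neutralized = render("", {})
```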
Solution space (for discussion)
A few directions, roughly ordered by blast radius:
1. Don't propagate presentation across arithmetic by default. Treat it like title/description_short (which already reset) rather than origins. Motivation: presentation describes how a specific indicator should be presented; it's not a property of the data. Downstream authors who want it must set it explicitly. This is my preferred option, since it matches what users expect.
2. Strip Jinja-templated presentation.grapher_config fields when dimensions change. At combine time, detect Jinja syntax (<%, <<) in string fields and drop them if the operand context differs. More surgical but more magic.
3. Validate Jinja renders at garden save time, not at grapher time. Attempt to render every templated metadata field against the current table's dimensions; fail fast with a clear error pointing at the authoring step rather than 3 steps downstream. Doesn't fix the propagation, but makes the footgun cheap to debug.
4. Change YAML overlay semantics for presentation: if a downstream .meta.yml specifies anything under presentation, treat it as a full replace rather than deep merge (opt-in deep-merge via an explicit key?). Riskier: lots of existing YAML relies on partial overrides of presentation.topic_tags etc.
5. Make "dimensioned Jinja" a first-class concept. Mark grapher_config.subtitle fields that reference dimensions with metadata (e.g. _dims_required: [age, sex]), and drop them automatically when those dims are not in the output table's index. Solves it cleanly but requires tagging.
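As a rough illustration of the second direction (stripping templated fields when dimensions change), the check could look something like the sketch below. Everything here is hypothetical: strip_templated, is_templated and JINJA_MARKERS are illustrative names, and the marker list assumes the `<%` / `<<` delimiters used elsewhere in this issue.

```python
from __future__ import annotations

from typing import Any

# Assumption: opening delimiters of the Jinja dialect used in this issue.
JINJA_MARKERS = ("<%", "<<")


def is_templated(value: Any) -> bool:
    """True for strings that contain an opening Jinja delimiter."""
    return isinstance(value, str) and any(marker in value for marker in JINJA_MARKERS)


def strip_templated(config: dict, source_dims: set[str], output_dims: set[str]) -> dict:
    """Drop templated string fields when the output dimensions differ from the source's."""
    if source_dims == output_dims:
        return dict(config)  # same context: nothing to strip
    cleaned: dict = {}
    for key, value in config.items():
        if isinstance(value, dict):
            cleaned[key] = strip_templated(value, source_dims, output_dims)
        elif not is_templated(value):
            cleaned[key] = value
    return cleaned


presentation = {
    "title_public": "Population",
    "grapher_config": {
        "subtitle": "<%- if age == '0' %>Children under 1 year old.<%- endif %>",
        "note": "Static note.",
    },
}

same_context = strip_templated(presentation, {"age", "sex"}, {"age", "sex"})
new_context = strip_templated(presentation, {"age", "sex"}, set())
```

Static fields such as the note survive either way; only strings that look templated are dropped, which is what makes this option "more magic" than a blanket reset.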
I lean toward (1) + (3): stop propagating presentation across arithmetic (it's almost never the right thing), and add a save-time Jinja lint so the failure mode moves from "grapher step 500 lines deep" to "garden step tells you which field is broken."
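For the lint half of that combination, one possible shape is a static check rather than a full render: parse each templated field and compare the names it references against the table's dimensions. The sketch below uses jinja2's meta.find_undeclared_variables; lint_templates is a hypothetical helper, the delimiters are assumed to match the ones in this issue, and the real check would also need to allow any non-dimension variables the ETL injects at render time.

```python
from __future__ import annotations

from jinja2 import Environment, meta

# Assumption: same delimiters as the templates shown in this issue.
env = Environment(
    block_start_string="<%",
    block_end_string="%>",
    variable_start_string="<<",
    variable_end_string=">>",
)


def lint_templates(fields: dict[str, str], dims: set[str]) -> dict[str, set[str]]:
    """Map each field name to the template variables that are not table dimensions."""
    problems: dict[str, set[str]] = {}
    for name, template in fields.items():
        referenced = meta.find_undeclared_variables(env.parse(template))
        missing = referenced - dims
        if missing:
            problems[name] = missing
    return problems


fields = {
    "subtitle": "<%- if age == '0' %><%- if sex == 'all' %>Children.<%- endif %><%- endif %>",
    "note": "Static note.",
}

# Table with no dimensions (the malnutrition case): the subtitle is flagged.
flagged = lint_templates(fields, dims=set())
# Table that still has age and sex: nothing to report.
clean = lint_templates(fields, dims={"age", "sex"})
```

A static check reports every referenced name regardless of which branch would execute, so it is stricter than rendering once per dimension combination, and it can name the offending field and variables in the error.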
Happy to prototype (1) behind a feature flag in combine_indicators_metadata if there's appetite.
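A flag-gated version could be as small as the sketch below. It is a simplified model, not the real combine_indicators_metadata: the environment variable name RESET_PRESENTATION_ON_COMBINE and the combine_presentation helper are both hypothetical, and the fallback branch mimics the current "identical or None" rule.

```python
from __future__ import annotations

import os
from copy import deepcopy
from typing import Optional

# Hypothetical flag name, for illustration only.
FLAG = "RESET_PRESENTATION_ON_COMBINE"


def combine_presentation(
    operands: list[Optional[dict]], reset_on_combine: Optional[bool] = None
) -> Optional[dict]:
    """Combine presentation across operands, optionally resetting it outright."""
    if reset_on_combine is None:
        reset_on_combine = os.environ.get(FLAG, "0") == "1"
    if reset_on_combine:
        return None  # proposed default: arithmetic never inherits presentation
    first = operands[0]
    # Current behaviour (simplified): keep only if all operands are identical.
    return deepcopy(first) if all(op == first for op in operands) else None


presentation = {"grapher_config": {"subtitle": "<%- if age == '0' %>...<%- endif %>"}}

legacy = combine_presentation([presentation, presentation], reset_on_combine=False)
proposed = combine_presentation([presentation, presentation], reset_on_combine=True)
```

Keeping the old rule behind the flag would make it possible to diff metadata across existing steps before flipping the default.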