feat(external_data_files): load data from other YAML files #1880

Merged
yajo merged 1 commit into master from external-data on Feb 16, 2025

@yajo yajo commented Dec 4, 2024

When composing templates, you often need to load answers from other templates that are commonly combined with yours, or any other kind of external data.

@moduon MT-8282

@yajo yajo marked this pull request as ready for review December 10, 2024 12:18

yajo commented Dec 10, 2024

@copier-org/maintainers this one was a bit hard to get right, but I think it's OK now.

The whole point of this PR is to open a new avenue for template composability. A template can now read information from answers written for other templates. Both templates can stay independent, but when combined they can offer some extra features.

For example, some default values in one template can depend on the answers given to another template, and the other template's answers can be accessed in the current template.
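
To make the idea concrete, here is a minimal Python sketch of the behavior described above: a template looks up a value in another template's answers file and uses it as a default, falling back to its own default when that file is absent. This is not Copier's implementation. The helpers `load_external` and `default_for` and the file name `.parent-answers.json` are hypothetical, and the file is written as JSON (which is also valid YAML) so that only the standard library is needed.

```python
import json
import tempfile
from pathlib import Path
from typing import Any


def load_external(path: Path) -> dict[str, Any]:
    """Return the data stored in `path`, or an empty dict if it is missing."""
    # Assumption: a missing file is not an error, so the template
    # can still be used on its own.
    if not path.is_file():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def default_for(key: str, external: dict[str, Any], fallback: Any) -> Any:
    """Prefer the other template's answer; otherwise use our own default."""
    return external.get(key, fallback)


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file name for the other template's answers.
    parent_file = Path(tmp) / ".parent-answers.json"

    # No parent answers yet: the template uses its own default.
    standalone = default_for("python_version", load_external(parent_file), "3.12")

    # Once the parent template has written its answers, they take precedence.
    parent_file.write_text(json.dumps({"python_version": "3.10"}), encoding="utf-8")
    combined = default_for("python_version", load_external(parent_file), "3.12")

print(standalone, combined)
```

The two templates stay independent: the lookup only adds behavior when the other template's answers happen to be present.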

codecov bot commented Dec 10, 2024

Codecov Report

Attention: Patch coverage is 95.71429% with 3 lines in your changes missing coverage. Please review.

Project coverage is 98.05%. Comparing base (ee5be56) to head (bf49d88).
Report is 129 commits behind head on master.

Files with missing lines | Patch % | Lines
copier/user_data.py      | 81.81%  | 2 Missing ⚠️
copier/main.py           | 96.77%  | 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1880      +/-   ##
==========================================
- Coverage   98.05%   98.05%   -0.01%     
==========================================
  Files          53       53              
  Lines        5511     5552      +41     
==========================================
+ Hits         5404     5444      +40     
- Misses        107      108       +1     
Flag Coverage Δ
unittests 98.05% <95.71%> (-0.01%) ⬇️

@yajo yajo requested a review from sisp December 10, 2024 12:36

@sisp sisp left a comment

Before reviewing the implementation in detail, would you mind discussing the design first?

Even prior to this PR, multiple Copier templates could be applied to the same subproject. But each template was applied completely independently, and when some templates had overlapping questions that should be answered the same for consistency, then it was the user's responsibility to keep the answers in sync. The way I understand this PR, it aims to improve the UX in this regard by allowing a child template to load external data (e.g., answers from an answers file from a previously applied parent template) and, e.g., use them as default values for questions in the child template, facilitating the synchronization of semantically identical questions across a parent and child template.

That said, I have some doubts about the current design:

  • A child template makes assumptions about the answers structure of a parent template. Although there is a graceful fallback when, e.g., the parent template's answers file is not found, the child template's external data loading is tailored to the parent template. This limits template composition with enhanced UX to composing the child template with a specific parent template.
  • Enhanced UX for template composition relies on the order in which templates are applied. A child template that reuses answers from a parent template must be applied after the parent template has been applied.
  • Although answers from a parent template may be used as default values for questions in a child template, the questions are asked nevertheless and may be answered differently, which may lead to inconsistencies.

I have been thinking about a similar feature myself but with a different design, inspired by how, e.g., React components are composed.

Instead of coupling the child template with the parent template, how about adding a way for a parent template to configure child templates to be applied? The new parent template may have a copier.yml file like this:

_templates:
  # A Copier template for generating a Python project using
  # Poetry as package manager
  https://git.example.com/copier-python.git:
    _ref: v1.2.3 # optional but highly recommended
    project_name: "{{ project_name }}"
    python_versions: "{{ python_versions }}"

  # A Copier template for generating GitHub Actions configs for
  # a Python project with support for several package managers
  # including Poetry
  https://git.example.com/copier-python-github-actions.git:
    _ref: v2.3.4 # optional but highly recommended
    python_versions: "{{ python_versions }}"
    package_manager: poetry # hardcoded because `copier-python` uses Poetry

# This Copier template must define all questions needed for itself
# and all child templates.

project_name:
  type: str
  help: The name of the Python project

python_versions:
  type: str
  help: The supported Python versions
  choices: &python_versions
    - "3.9"
    - "3.10"
    - "3.11"
    - "3.12"
    - "3.13"
  default: *python_versions
  multiselect: true

The answers of this template are recorded in its answers file, and any answers files that are rendered by the child templates are omitted because they aren't needed.
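
A rough Python sketch of how a parent template could pass its answers down under this proposal. Nothing here exists in Copier: the `_templates` layout is the hypothetical one from the example above, `render` and `child_data` are made-up helpers, and the `{{ name }}` substitution is a tiny stand-in for Jinja rendering.

```python
import re
from typing import Any, Iterator, Optional

# Matches a simple "{{ name }}" placeholder; a stand-in for Jinja.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(value: Any, answers: dict[str, Any]) -> Any:
    """Resolve placeholders in `value` against the parent's answers."""
    if not isinstance(value, str):
        return value
    whole = PLACEHOLDER.fullmatch(value.strip())
    if whole:
        # A lone placeholder keeps the answer's original type (e.g. a list).
        return answers[whole.group(1)]
    return PLACEHOLDER.sub(lambda m: str(answers[m.group(1)]), value)


def child_data(
    templates: dict[str, dict[str, Any]], answers: dict[str, Any]
) -> Iterator[tuple[str, Optional[str], dict[str, Any]]]:
    """Yield (url, ref, data) for each child template, in declaration order."""
    for url, config in templates.items():
        ref = config.get("_ref")  # optional pinned version
        # Keys starting with "_" are settings, not data for the child.
        data = {
            key: render(value, answers)
            for key, value in config.items()
            if not key.startswith("_")
        }
        yield url, ref, data


templates = {
    "https://git.example.com/copier-python.git": {
        "_ref": "v1.2.3",
        "project_name": "{{ project_name }}",
        "python_versions": "{{ python_versions }}",
    },
    "https://git.example.com/copier-python-github-actions.git": {
        "_ref": "v2.3.4",
        "python_versions": "{{ python_versions }}",
        "package_manager": "poetry",  # hardcoded, as in the example above
    },
}
answers = {"project_name": "demo", "python_versions": ["3.12", "3.13"]}

plan = list(child_data(templates, answers))
for url, ref, data in plan:
    print(url, ref, data)
```

In this sketch each child receives plain data and never sees the parent's questionnaire, which is what would keep the child templates decoupled from one another.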

I see a few advantages of this design:

  • There is hardly any coupling between child templates, as the parent template defines the exposed questionnaire, passes the answers down to the child templates as needed, and orchestrates their application.
  • There is no order to remember in which templates must be applied because the parent template orchestrates the application of the child templates.
  • There might even be a way to handle multi-template merge conflicts (during both "copy" and "update" operations) via some kind of N-way merge (just dreaming, I have no proof of concept).
  • There could even be a hierarchy of child templates. It's important to note that this hierarchy is different than what we have discussed in #934.

But I also see a few disadvantages of this design:

  • Composing templates requires one additional template that acts as the root/parent template to orchestrate the child templates' applications. So, ad-hoc template composition is not supported – at least I don't see how it would be possible.
  • Some questions from the child templates need to be (approximately) duplicated in the root/parent template. While this sounds redundant and cumbersome, it's the same situation that, e.g., React component developers are facing. I think there's no way around it.

WDYT, @yajo?

yajo commented Dec 12, 2024 via email

@sisp sisp left a comment

Thanks for the clarifications, this was quite helpful. Indeed, template composition as shown is only one example of the feature; it's actually more general – and lightweight, which is nice.

I've left a few remarks related to documentation and license poisoning.

> Now, thinking of your proposal, if you really need that, can't you do it already? Just by using a clever combination of git submodules, !include tags, and symlinks in a parent template that includes 2 child ones... my intuition is that it'll work out-of-the-box right now.

Almost, I believe. Mapping answers in the parent template to questions in a child template seems to be missing though. When using !include, the questionnaire of a child template is simply merged with the parent template, which is not the same as passing down answers in the parent template as data to the child template. Also, !include doesn't offer encapsulation of child templates, e.g. templates with different Jinja settings wouldn't be properly supported.

But as you said, the feature of this PR does not block or prevent my suggested design from becoming available in the future. 👍

hparfr commented Dec 17, 2024

It's not really clear to me why I should use this instead of the --data-file argument.

@hparfr hparfr left a comment

The setting is _external_data_files, but the key used in the template is different: "{{ _ext.parent1.name }}".
Maybe use the same term to make it clearer, e.g. "{{ _external_data_file.parent1.name }}".

yajo commented Dec 30, 2024

All comments addressed. Could you please review again?

@yajo yajo requested a review from sisp December 30, 2024 10:20

@sisp sisp left a comment

LGTM 👍, just a minor remark regarding type hints.

@yajo yajo left a comment

fix

pawamoy commented Jan 18, 2025

Since you removed DEFAULT_DATA, it will also fix #1929 🙂

@pawamoy pawamoy left a comment

LGTM. I simply second @hparfr here:

> It's not really clear to me why I should use this instead of the --data-file argument

A small note could be added in each to make the distinction. My own, quick wording:

The CLI argument --data-file is not the same thing as the template configuration external_data: the former lets you provide a dictionary of answers matching a template questionnaire as a user, while the latter lets you enhance the Jinja rendering context of your template as a template writer.

Something like that.
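
One way to picture that distinction, as a Python sketch rather than Copier's actual code: answers supplied by the user become top-level variables, while external data declared by the template writer is only exposed to rendering under its own namespace. The function `build_context`, the `_external_data` key layout, the `parent_tpl` name, and the sample values are all assumptions made for illustration.

```python
from typing import Any


def build_context(
    data_file_answers: dict[str, Any],
    external_data: dict[str, dict[str, Any]],
) -> dict[str, Any]:
    """Combine user-supplied answers with template-declared external data."""
    # Answers given by the user (e.g. through a data file) become
    # top-level variables.
    context: dict[str, Any] = dict(data_file_answers)
    # External data stays under its own namespace, so it cannot be
    # mistaken for, or overwrite, an answer.
    context["_external_data"] = external_data
    return context


context = build_context(
    data_file_answers={"project_name": "demo"},
    external_data={"parent_tpl": {"python_version": "3.12"}},
)

print(context["project_name"])
print(context["_external_data"]["parent_tpl"]["python_version"])
```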

yajo commented Feb 15, 2025

Rebased, fixed conflicts, fixed typing errors and suggestions... Please @sisp update your review to see if we can merge.

I'd like to have this included in the next release.

@yajo yajo requested a review from sisp February 15, 2025 19:20

@sisp sisp left a comment

Perfect, thanks, @yajo! 👌

@yajo yajo enabled auto-merge (rebase) February 16, 2025 07:38
@yajo yajo merged commit 25bc06d into master Feb 16, 2025
20 of 22 checks passed
@yajo yajo deleted the external-data branch February 16, 2025 08:00
@henryiii
Contributor

Passing in --data is broken now, as it's no longer available in the context (see #1977). Hopefully just an oversight?

Development

Successfully merging this pull request may close these issues.

Cannot update project if source copier.yml contains {{ _copier_conf|to_json }}

5 participants