feat(external_data_files): load data from other YAML files #1880
|
@copier-org/maintainers this one was a bit hard to get right, but I think it's OK now. The whole point of this PR is to open a new avenue for template composability. A template can now read information from answers written for other templates. Both templates can still stay independent, but when combined they gain some extra features. For example, default values in one template can depend on the answers to another template, and the other template's answers can be accessed in the current template.
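As a rough illustration of the use case described above, a child template's copier.yml might look like the sketch below. The `_external_data_files` and `_ext` names are the ones used in this thread; the mapping shape (short name to file path), the `parent1` key, the file name, and the `name` answer are assumptions made for the example, not confirmed syntax.

```yaml
# copier.yml of the child template (hypothetical sketch)

# Assumed shape: a short name mapped to the path of a YAML file
# inside the subproject. `parent1` and the file name are illustrative.
_external_data_files:
  parent1: .copier-answers.parent1.yml

project_name:
  type: str
  help: The name of the project
  # Reuse the parent's answer as a default. `d` is Jinja's alias for
  # the `default` filter, used here in case the value is missing.
  default: "{{ _ext.parent1.name | d('') }}"
```

The question is still asked, so the user can override the suggested default; the child template also remains usable when the parent's file does not exist.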
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1880      +/-   ##
==========================================
- Coverage   98.05%   98.05%   -0.01%
==========================================
  Files          53       53
  Lines        5511     5552      +41
==========================================
+ Hits         5404     5444      +40
- Misses        107      108       +1
|
Before reviewing the implementation in detail, would you mind discussing the design first?
Even prior to this PR, multiple Copier templates could be applied to the same subproject. But each template was applied completely independently, and when some templates had overlapping questions that should be answered the same for consistency, then it was the user's responsibility to keep the answers in sync. The way I understand this PR, it aims to improve the UX in this regard by allowing a child template to load external data (e.g., answers from an answers file from a previously applied parent template) and, e.g., use them as default values for questions in the child template, facilitating the synchronization of semantically identical questions across a parent and child template.
That said, I have some doubts about the current design:
- A child template makes assumptions about the answers structure of a parent template. Although there is a graceful fallback when, e.g., the parent template's answers file is not found, the child template's external data loading is tailored to the parent template. This limits template composition with enhanced UX to composing the child template with a specific parent template.
- Enhanced UX for template composition relies on the order in which templates are applied. A child template that reuses answers from a parent template must be applied after the parent template has been applied.
- Although answers from a parent template may be used as default values for questions in a child template, the questions are asked nevertheless and may be answered differently, which may lead to inconsistencies.
I have been thinking about a similar feature myself but with a different design, inspired by how, e.g., React components are composed.
Instead of coupling the child template with the parent template, how about adding a way for a parent template to configure child templates to be applied? The new parent template may have a copier.yml file like this:
_templates:
  # A Copier template for generating a Python project using
  # Poetry as package manager
  https://git.example.com/copier-python.git:
    _ref: v1.2.3 # optional but highly recommended
    project_name: "{{ project_name }}"
    python_versions: "{{ python_versions }}"
  # A Copier template for generating GitHub Actions configs for
  # a Python project with support for several package managers
  # including Poetry
  https://git.example.com/copier-python-github-actions.git:
    _ref: v2.3.4 # optional but highly recommended
    python_versions: "{{ python_versions }}"
    package_manager: poetry # hardcoded because `copier-python` uses Poetry

# This Copier template must define all questions needed for itself
# and all child templates.
project_name:
  type: str
  help: The name of the Python project

python_versions:
  type: str
  help: The supported Python versions
  choices: &python_versions
    - "3.9"
    - "3.10"
    - "3.11"
    - "3.12"
    - "3.13"
  default: *python_versions
  multiselect: true

The answers of this template are recorded in its answers file, and any answers files that are rendered by the child templates are omitted because they aren't needed.
I see a few advantages of this design:
- There is hardly any coupling between child templates, as the parent template defines the exposed questionnaire, passes the answers down to the child templates as needed, and orchestrates their application.
- There is no order to remember in which templates must be applied because the parent template orchestrates the application of the child templates.
- There might even be a way to handle multi-template merge conflicts (during both "copy" and "update" operations) via some kind of N-way merge (just dreaming, I have no proof of concept).
- There could even be a hierarchy of child templates. It's important to note that this hierarchy is different than what we have discussed in #934.
But I also see a few disadvantages of this design:
- Composing templates requires one additional template that acts as the root/parent template to orchestrate the child templates' applications. So, ad-hoc template composition is not supported – at least I don't see how it would be possible.
- Some questions from the child templates need to be (approximately) duplicated in the root/parent template. While this sounds redundant and cumbersome, it's the same situation that, e.g., React component developers are facing. I think there's no way around it.
WDYT, @yajo?
|
That other design you propose is nice, but I think this design is simpler in some respects:
- It's just a small addition that allows Copier to include external data.
- This data can be answers or any other kind of data generated by non-Copier systems.
- If that external data happens to be answers for another template, you can reuse those answers.
- Your template already has full access to that new external data, so you don't really *need* to use it as a default in your questionnaire. That's just one example use. You can just write {{ _ext.some.thing | d("nothing") }} directly in your template.
- It allows updating both templates at a different pace.
- It's done already :D
- It allows both templates to stay "independent".
- If we ever implement the other approach, this won't hurt it.
Now, thinking about your proposal: if you really need that, can't you do it already? Just by using a clever combination of git submodules, !include tags, and symlinks in a parent template that includes 2 child ones... my intuition is that it'll work out of the box right now.
Finally, before reviewing the implementation, go review the test. It will help you understand the interface better, I guess.
Thanks for your comments, as always :)
|
Thanks for the clarifications, this was quite helpful. Indeed, template composition as shown is only one example of the feature; it's actually more general – and lightweight, which is nice.
I've left a few remarks related to documentation and license poisoning.
Now, thinking of your proposal, if you really need that, can't you do it
already? Just by using a clever combination of git submodules, !include
tags, and symlinks in a parent template that includes 2 child ones... my
intuition is that it'll work out-of-the-box right now.
Almost, I believe. Mapping answers in the parent template to questions in a child template seems to be missing though. When using !include, the questionnaire of a child template is simply merged with the parent template, which is not the same as passing down answers in the parent template as data to the child template. Also, !include doesn't offer encapsulation of child templates, e.g. templates with different Jinja settings wouldn't be properly supported.
But as you said, the feature of this PR does not block or prevent my suggested design from becoming available in the future. 👍
|
It's not really clear to me why I should use this instead of the --data-file argument.
hparfr
left a comment
The setting is _external_data_files, but the key used in the template is different: "{{ _ext.parent1.name }}".
Maybe use the same term in both places to be clearer, like "{{ _external_data_file.parent1.name }}".
|
All attended. Could you please review again? |
|
Since you removed |
LGTM. I simply second @hparfr here:
It's not really clear to me why I should use this instead of the --data-file argument.
A small note could be added in each to make the distinction. My own, quick wording:
The CLI argument --data-file is not the same thing as the template configuration external_data: the former lets you provide a dictionary of answers matching a template questionnaire as a user, while the latter lets you enhance the Jinja rendering context of your template as a template writer.
Something like that.
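To make that distinction concrete, here is a hedged sketch of the two sides, shown as two separate YAML files in one block. The `--data-file` flag and the `_external_data_files` and `_ext` names come from this thread; the file names, the `parent1` key, the example answers, and the mapping shape of the template setting are assumptions for illustration only.

```yaml
# File 1: answers.yml, written by a template *user* and passed on the CLI:
#   copier copy --data-file answers.yml <template> <destination>
# Each key answers a question defined in the template's questionnaire.
project_name: demo
python_versions:
  - "3.12"
---
# File 2: copier.yml, written by the template *author*.
# Assumed shape: a short name mapped to the path of a YAML file
# in the subproject.
_external_data_files:
  parent1: .parent1-answers.yml
# The loaded data is not an answer. It only becomes available to Jinja,
# for example as {{ _ext.parent1.name }} in a template file.
```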
When composing templates, it's often needed to be able to load answers from other templates that you know are usually combined with yours. Or any other kind of external data. @moduon MT-8282
|
Rebased, fixed conflicts, fixed typing errors and suggestions... Please @sisp update your review to see if we can merge. I'd like to have this included in the next release. |
|
Passing in |