feat: symlink overridden paths to canonical default locations on startup #147
gipert wants to merge 2 commits into
When paths.<key> in dataflow-config.yaml is set to an external location (e.g. reusing tier_dsp from another production), link_external_paths() now creates a relative symlink at the default generated/ location so the production tree maintains a consistent layout. Mirrors the same strategy already in place in legend-simflows.
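For readers unfamiliar with the approach, here is a minimal sketch of how a relative symlink at the default location could be created. This is not the code in this PR: the helper name link_to_default and its arguments are hypothetical, and the real link_external_paths() may handle existing paths, logging, and config parsing differently.

```python
import os
from pathlib import Path


def link_to_default(default_path, external_path) -> bool:
    """Create a relative symlink at default_path pointing to external_path.

    Hypothetical helper for illustration only.
    Returns True if a symlink was created, False if nothing was done.
    """
    default_path = Path(default_path)
    external_path = Path(external_path).resolve()

    # Nothing to do when the configured path already is (or already
    # resolves to) the default location.
    if default_path.resolve() == external_path:
        return False

    # Never clobber real data or an unrelated link at the default location.
    if default_path.exists() or default_path.is_symlink():
        return False

    default_path.parent.mkdir(parents=True, exist_ok=True)
    # Express the target relative to the directory that will hold the link.
    target = os.path.relpath(external_path, start=default_path.parent.resolve())
    default_path.symlink_to(target, target_is_directory=True)
    return True
```

Storing the target as a relative path means the link does not embed the absolute location of the production tree, only the offset between the two directories.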
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #147      +/-   ##
==========================================
- Coverage   64.70%   61.60%   -3.10%
==========================================
  Files           7        7
  Lines         629      672      +43
==========================================
+ Hits          407      414       +7
- Misses        222      258      +36
@ggmarshall i noticed that when par dirs are taken from outside, the dataflow still creates a par dir with a validity file inside in the current cycle. why? is this needed?
Looks like it is written but not used. Can add a check if in prod cycle and not write in that case.
i can then see how to skip this? do you know where to look?
main snakefile
i think the last commit is ugly, but i think there is no other way
Can we not merge into a loop over ["dsp", "hit", "psp", "pht"] so it's a bit cleaner?
When par_<tier> is overridden to an external location, writing the validity file there would pollute or fail on a shared/read-only directory. Guard each write with a path comparison so it only runs for tiers whose par path is within the current production tree.
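The guard described in this commit message could look roughly like the sketch below, written as a single loop over the tiers as suggested in the review. The function names is_within and tiers_to_write and the par_paths mapping are hypothetical; the actual Snakefile logic may differ.

```python
from pathlib import Path


def is_within(path, root) -> bool:
    """Return True if path resolves to root itself or to a location inside it."""
    path, root = Path(path).resolve(), Path(root).resolve()
    return path == root or root in path.parents


def tiers_to_write(par_paths: dict, prod_root) -> list:
    """Select the tiers whose par directory lives inside the production tree.

    par_paths is an assumed mapping of tier name -> configured par path.
    """
    return [
        tier
        for tier in ("dsp", "hit", "psp", "pht")
        if tier in par_paths and is_within(par_paths[tier], prod_root)
    ]
```

Because both paths are resolved before comparison, a default location that is itself a symlink to an external directory is also treated as external, so the validity file would be skipped for it.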
4330835 to a2f490a
Summary
- Adds link_external_paths() to legenddataflow/methods/paths.py, called in onstart via paths.link_external_paths(config, workflow.basedir, logger=logger)
- When paths.<key> in dataflow-config.yaml points outside the current production tree, a relative symlink is created at the canonical default location (e.g. generated/tier/dsp -> /other/prod/generated/tier/dsp) so the generated/ layout stays consistent regardless of overrides
- Mirrors the strategy already in place in legend-simflows (legendsimflow.utils.link_external_paths)
- Covers tier (tier_*, tier_raw_blind) and par (par_*) paths, plus plt and metadata; parent dirs tier and par are excluded to avoid parent/child symlink conflicts
- Skips writing the validity file when a par_<tier> path is redirected outside the current production tree, to avoid writing into shared/read-only external directories
Test plan
- Override a tier path (e.g. tier_dsp) to an external directory: symlink appears at generated/tier/dsp
- Override par_pht: validity file is not written to the external par directory
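As an illustration of the key selection described in the summary (tier_* and par_* keys plus plt and metadata, with the parent tier and par keys excluded), a filter along these lines would do. The function name linkable_keys is hypothetical, and the shape of the paths mapping is an assumption rather than the PR's actual data structure.

```python
def linkable_keys(paths: dict) -> list:
    """Return the path keys that should get a symlink at their default location."""
    # Parent directories: linking them would conflict with links for their children.
    excluded = {"tier", "par"}
    selected = []
    for key in paths:
        if key in excluded:
            continue
        if key.startswith(("tier_", "par_")) or key in {"plt", "metadata"}:
            selected.append(key)
    return selected
```

Excluding the parents matters because a symlink at generated/tier would make generated/tier/dsp resolve inside the external tree, so a separate link for tier_dsp could no longer be created at its default location.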