docs/BUTLER.md: 32 additions & 0 deletions
@@ -50,3 +50,35 @@ The `PGPASSFILE` value is constructed using the `config.htcondor.remote_user_hom
> [!NOTE]
> Secrets for second-party Butlers may also be provided via an environment variable. Setting `LSST_DB_AUTH_CREDENTIALS` to the JSON string representation of a `db-auth.yaml` file removes any dependency on files presumed to exist in the submission environment.
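
As a minimal sketch of the environment-variable approach, the snippet below serializes a credentials list to JSON and stores it in `LSST_DB_AUTH_CREDENTIALS`. The entry layout (`url`, `username`, `password`) is an assumption about the structure of a `db-auth.yaml` file, and the host, database, and credential values are placeholders.

```python
import json
import os

# Hypothetical credentials mirroring the assumed layout of a db-auth.yaml file:
# a list of entries, each with a connection URL and login details.
db_auth = [
    {
        "url": "postgresql://db.example.org:5432/butler_registry",
        "username": "cm_service",
        "password": "not-a-real-password",
    }
]

# Store the JSON string in the variable named in the note above.
os.environ["LSST_DB_AUTH_CREDENTIALS"] = json.dumps(db_auth)

# Any process started from this environment can decode the same structure.
restored = json.loads(os.environ["LSST_DB_AUTH_CREDENTIALS"])
```

Because the credentials travel with the environment, no credentials file needs to exist on the submission host in this sketch.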
## Butler Collection Management

CM Service creates *tagged* and *chained* Butler collections at three stages of a Campaign: preflight, stepwise processing, and postflight.
### Preflight

During Campaign preflight, three Butler collection operations are performed before any steps, groups, or jobs are created or executed.
1. A *tagged* collection is made from the Campaign's `collection.campaign_source` setting, constrained by the Campaign's `data.data_query` setting.
1. A *chained* collection is made from the Campaign's `collection.campaign_ancillary_inputs` setting.
1. A *chained* collection is made from the previous two collections.

The final chained collection is used as an *input* collection for all subsequent Campaign step operations; that is, it is identified as part of the `payload.inCollection` in any BPS workflow file generated by the Campaign.
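
The three preflight operations can be sketched with a toy, in-memory model. This is an illustration only and does not use the Butler API: the `Collection` class, the `preflight` function, the collection naming scheme, the nested `config` layout, and the assumption that `campaign_ancillary_inputs` is a list are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Collection:
    """Toy stand-in for a Butler collection; this is not the Butler API."""

    name: str
    kind: str  # "tagged" or "chained"
    sources: tuple[str, ...]  # collections this one is built from
    data_query: str | None = None  # only meaningful for the tagged collection


def preflight(campaign_name: str, config: dict) -> list[Collection]:
    """Return the three collections made during preflight, in order."""
    # 1. Tagged collection: the campaign source, constrained by the data query.
    tagged = Collection(
        name=f"{campaign_name}/input/source",  # hypothetical naming scheme
        kind="tagged",
        sources=(config["collection"]["campaign_source"],),
        data_query=config["data"]["data_query"],
    )
    # 2. Chained collection over the ancillary inputs (assumed to be a list).
    ancillary = Collection(
        name=f"{campaign_name}/input/ancillary",
        kind="chained",
        sources=tuple(config["collection"]["campaign_ancillary_inputs"]),
    )
    # 3. Chained collection over the previous two: the Campaign input collection.
    campaign_input = Collection(
        name=f"{campaign_name}/input",
        kind="chained",
        sources=(tagged.name, ancillary.name),
    )
    return [tagged, ancillary, campaign_input]


# Placeholder settings; real Campaign values will differ.
config = {
    "collection": {
        "campaign_source": "LSSTCam/raw/all",
        "campaign_ancillary_inputs": ["LSSTCam/calib", "refcats", "skymaps"],
    },
    "data": {"data_query": "instrument = 'LSSTCam'"},
}
tagged, ancillary, campaign_input = preflight("my_campaign", config)
```

In this model, `campaign_input.name` plays the role of the collection that later appears in `payload.inCollection`.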
### Stepwise Processing

During Campaign stepwise processing, each Step in the Campaign performs the following Butler collection operations:
1. A step-specific *chained* collection is made from the Campaign input collection and applied to the `payload.inCollection` parameter.
1. A step-group-specific *run* collection is made as a side effect of executing the step, named as indicated by the Group's `payload.outputRun` BPS Workflow parameter.
1. A step-specific *chained* collection is made from the set of *run* collections generated by the step-groups.
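
The per-Step operations above can be sketched as follows. The function name, the dictionary layout, and the collection naming scheme are hypothetical; only the relationships between the collections follow the list above.

```python
def step_collections(campaign: str, step: str, groups: list[str]) -> dict:
    """Model the collections associated with one Step (hypothetical names)."""
    # 1. Step-specific chained input built from the Campaign input collection;
    #    its name is what would be applied to payload.inCollection.
    step_input = {
        "name": f"{campaign}/{step}/input",
        "type": "chained",
        "children": [f"{campaign}/input"],
    }
    # 2. One run collection per Group, named by that Group's payload.outputRun.
    #    In practice these appear as a side effect of executing the workflow.
    runs = [
        {"name": f"{campaign}/{step}/{group}/run", "type": "run"}
        for group in groups
    ]
    # 3. Step-specific chained output over every Group's run collection.
    step_output = {
        "name": f"{campaign}/{step}/output",
        "type": "chained",
        "children": [run["name"] for run in runs],
    }
    return {"input": step_input, "runs": runs, "output": step_output}


result = step_collections("my_campaign", "step1", ["group0", "group1"])
```

Here `result["output"]["children"]` lists one run collection per Group, which is the set the step-specific output chain is built from.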
> [!NOTE]
> During stepwise processing, the BPS Workflow `payload.dataQuery` is populated from the Step's `child_config.base_query` parameter and modified by any Group splitting algorithm applied to the Step; it is not affected by the Campaign-level `data.data_query`.

> [!NOTE]
> Presumably, if the *tagged* Campaign input collection was constrained by a meaningful data query, that query does not need to be repeated when Butler collections are considered stepwise, and only the result of the group-splitting algorithms is necessary. However, an out-of-band observer of a CM-generated BPS workflow file cannot understand the nature of the input collection, or its interaction with the data query, without cross-referencing. Insofar as it improves understandability, the workflow file should therefore be as comprehensively detailed as possible, even if doing so is redundant.
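
To illustrate the query composition described in the notes above, the sketch below builds one `payload.dataQuery` string per Group from a Step's base query. The splitting strategy (fixed-size chunks of a single dimension, joined with `AND`) and the function name are assumptions made for illustration; the actual Group splitting algorithms may work differently. The Campaign-level `data.data_query` is deliberately absent from the inputs.

```python
def group_data_queries(
    base_query: str, field: str, values: list[int], group_size: int
) -> list[str]:
    """Build one data query per Group by splitting `values` into chunks."""
    queries = []
    for start in range(0, len(values), group_size):
        chunk = values[start : start + group_size]
        # Hypothetical split clause restricting this Group to its chunk.
        clause = f"{field} IN ({', '.join(str(v) for v in chunk)})"
        # Combine with the Step's base query when one is set.
        queries.append(f"({base_query}) AND {clause}" if base_query else clause)
    return queries


queries = group_data_queries("instrument = 'LSSTCam'", "visit", [1, 2, 3], 2)
```

Each resulting string corresponds to one Group's `payload.dataQuery` in this model.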
### Postflight
During Campaign postflight, Butler collection operations are used to further chain together Campaign elements, eventually resulting in a single *chained* collection for the entire Campaign.
1. Each step-specific *chained* collection is itself chained to a Campaign *chained* "output" collection.
1. The final *chained* collection, named according to the Campaign's `collection.out` parameter, includes the Campaign "output" collection, the Campaign "input" collection, and the Campaign "resource_usage" collection.
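
The postflight chaining can be sketched as a mapping from each chained collection to its children. The function name and the intermediate collection names are hypothetical, and the child order in the final chain simply follows the order given in the list above.

```python
def postflight(campaign: str, steps: list[str], out_name: str) -> dict[str, list[str]]:
    """Map each postflight chained collection to its child collections."""
    campaign_output = f"{campaign}/output"  # hypothetical naming scheme
    return {
        # 1. Every step-specific output chain joins the Campaign output chain.
        campaign_output: [f"{campaign}/{step}/output" for step in steps],
        # 2. The final chain, named by collection.out, wraps the Campaign
        #    output, input, and resource_usage collections.
        out_name: [
            campaign_output,
            f"{campaign}/input",
            f"{campaign}/resource_usage",
        ],
    }


chains = postflight("my_campaign", ["step1", "step2"], "my_campaign/final")
```

In this model, `chains["my_campaign/final"]` is the single chained collection that represents the entire Campaign.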