CDR: continuous input data removal for active workflows #11409

Description

@amaltaro

Impact of the new feature
WM in general

Is your feature request related to a problem? Please describe.
This is a long-wished-for development, initially reported and briefly discussed in #8134.

This ticket is meant to evaluate and document the current input data placement mechanism and all its dependencies. Once the input data placement logic is fully understood and documented, we can start investigating how to minimize the amount of input data pinned on disk by rules. That means removing input blocks while the workflow(s) are still active, as well as releasing a workflow to start processing while its input data is still being transferred to disk.

All of this needs to take into account the projected needs for HL-LHC, to be investigated in #11408.

This information will also be required for the Computing Conceptual Design Report (CDR).

Describe the solution you'd like
A thorough analysis of the current input data placement logic for primary, parent, and pileup data.
This should be followed by a description of a future model that minimizes disk utilization for input data, covering the different workflow types and the systems that would have to be created and/or refactored.

No development is expected to be delivered from this issue, only complete documentation of the required changes and, potentially, a candidate design for such a system.

Describe alternatives you've considered
This issue can potentially spawn a few other more targeted issues to proceed with the required developments.

Additional context
Some investigation has already been performed in #11418.

Projects

Status
ToDo
