Get DAG working better for pipelines #7

@patcon

Right now, we run every pipeline ID as a separate pipeline run, each with its own top-level pipeline.

There should be a way to get every step (e.g. imputation step, reducer step, clustering step, etc.) operating as a separate sub-pipeline, in which case we can run the whole thing as one big pipeline. This will allow us to be smarter about what gets regenerated during a pipeline run.

I think this will involve fixing up the namespaces, since they'll need to be totally unique.
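One possible way to get unique namespaces, sketched in plain Python rather than against any particular pipeline framework: derive each step's namespace from its name, its parameters, and the namespace of the step it depends on. The `step_namespace` helper and the parameter names below are hypothetical, made up for illustration. Identical configurations map to the same namespace (so they can be shared), and anything that differs gets its own.

```python
import hashlib
import json


def step_namespace(step_name: str, params: dict, upstream: str = "") -> str:
    """Return a deterministic namespace for a step (hypothetical helper).

    The namespace depends on the step name, its parameters, and the
    namespace of the upstream step, so two steps only collide when they
    would compute exactly the same thing.
    """
    payload = json.dumps(
        {"step": step_name, "params": params, "upstream": upstream},
        sort_keys=True,  # key order must not change the hash
    )
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:8]
    return f"{step_name}_{digest}"


# Illustrative parameters only; the real steps may be configured differently.
imputer_ns = step_namespace("imputer", {"kind": "knn", "n_neighbors": 5})
pca_ns = step_namespace("reducer", {"kind": "pca"}, upstream=imputer_ns)
umap_ns = step_namespace("reducer", {"kind": "umap"}, upstream=imputer_ns)
```

Here both reducers hang off the same `imputer_ns`, while `pca_ns` and `umap_ns` differ because their parameters differ.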

Benefits:

  • This will speed things up for early, intensive steps like KNN imputation: we'll only have to run them once.
  • It will add further speed-ups when we start varying the clusters more.
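To illustrate the benefits above, here is a minimal sketch of a single DAG in which a shared upstream step runs once no matter how many downstream variants depend on it. It is a toy in plain Python, not the project's actual pipeline code: the `Dag` class, the node names, and the stand-in step functions are all assumptions for illustration.

```python
class Dag:
    """Toy DAG runner that caches each node's output (no cycle detection)."""

    def __init__(self):
        self.nodes = {}       # name -> (function, dependency names)
        self.cache = {}       # name -> computed output
        self.run_counts = {}  # name -> number of times the function ran

    def add(self, name, func, deps=()):
        if name in self.nodes:
            raise ValueError(f"duplicate node name: {name}")
        self.nodes[name] = (func, tuple(deps))

    def run(self, target):
        # Reuse the cached output if this node has already been computed.
        if target in self.cache:
            return self.cache[target]
        func, deps = self.nodes[target]
        inputs = [self.run(dep) for dep in deps]
        self.run_counts[target] = self.run_counts.get(target, 0) + 1
        self.cache[target] = func(*inputs)
        return self.cache[target]


def impute():
    # Stand-in for an expensive step such as KNN imputation.
    return [1.0, 2.0, 3.0, 4.0]


def reduce_a(data):
    return [x * 2 for x in data]


def reduce_b(data):
    return [x + 1 for x in data]


def make_clusterer(k):
    # Stand-in for clustering: assigns each value to one of k buckets.
    def cluster(data):
        return [int(x) % k for x in data]
    return cluster


dag = Dag()
dag.add("imputer", impute)
dag.add("reducer_a", reduce_a, deps=["imputer"])
dag.add("reducer_b", reduce_b, deps=["imputer"])

leaves = []
for reducer in ("reducer_a", "reducer_b"):
    for k in (2, 3):
        name = f"{reducer}.cluster_k{k}"
        dag.add(name, make_clusterer(k), deps=[reducer])
        leaves.append(name)

results = {leaf: dag.run(leaf) for leaf in leaves}
print(dag.run_counts["imputer"])  # the shared step ran a single time
```

Four clustering variants are produced, but the imputation stand-in and each reducer execute only once, which is the saving the single-pipeline layout is meant to provide.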

#todo
