MatPipe could have the ability to checkpoint, and resume #182

Open
@ardunn

Description

Oftentimes, parts of a pipeline work fine (e.g., featurization), but the pipeline as a whole fails because a later step throws an error. It would be nice to have a "checkpoint" option like so:

pipe = MatPipe(**some_config, checkpoint="/home/user/checkpoint_dir")

When starting anew (i.e., no checkpoints exist), MatPipe starts from scratch and saves intermediate objects and dataframes to the checkpoint dir.

When warm starting (i.e., the checkpoint dir exists), it loads the relevant data from the checkpoint dir so that it doesn't repeat work (it doesn't have to refit if already fit, doesn't have to refeaturize if already featurized, etc.).
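
To make the cold-start / warm-start behavior concrete, here is a rough sketch of the idea in plain Python. It is not MatPipe code: `run_with_checkpoints`, the `(name, function)` stage tuples, and the pickle file layout are all hypothetical, chosen only to illustrate the pattern. Each stage writes its output to the checkpoint dir after it succeeds, and a later run loads that output instead of recomputing it.

```python
import pickle
from pathlib import Path
from typing import Any, Callable, Sequence, Tuple

# Hypothetical stage type: a name plus a function that transforms the data.
Stage = Tuple[str, Callable[[Any], Any]]


def run_with_checkpoints(stages: Sequence[Stage], data: Any, checkpoint_dir: str) -> Any:
    """Run stages in order, reusing any stage output already saved on disk.

    Illustrative only: the name and file layout are assumptions, not part
    of any existing MatPipe API.
    """
    ckpt = Path(checkpoint_dir)
    ckpt.mkdir(parents=True, exist_ok=True)

    for index, (name, func) in enumerate(stages):
        path = ckpt / f"{index:02d}_{name}.pkl"

        if path.exists():
            # Warm start: this stage finished in an earlier run, so load its output.
            with path.open("rb") as f:
                data = pickle.load(f)
            continue

        # Cold start for this stage: do the work, then save the result.
        data = func(data)

        # Write to a temporary file and rename it, so that a crash during
        # the write does not leave a truncated checkpoint behind.
        tmp = path.with_name(path.name + ".tmp")
        with tmp.open("wb") as f:
            pickle.dump(data, f)
        tmp.replace(path)

    return data
```

If a later stage raises, rerunning with the same checkpoint dir skips every stage that already completed. This sketch does not detect stale checkpoints (e.g., when the input data or config changes between runs); a real implementation would probably need to store a hash of the config alongside each checkpoint, or require the user to clear the directory.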

Labels: major enhancement, v2.0 (Issues and enhancements for upcoming major release v2.0)
