How to avoid mistakenly resuming from ckpt? #631

@felipemello1

It seems that if a ckpt folder exists, training resumes from it instead of starting from the original model. If a user runs experiments sequentially and is not aware of this behavior, subsequent experiments will be silently affected.

A better design is needed to avoid this type of footgun, while still making it easy to resume from a ckpt when that is the intended behavior, without overwriting existing ckpts.

The easiest option seems to be adding a flag: "resume_from_ckpt". If the user wants to resume, then it's an active, explicit choice.
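A minimal sketch of the opt-in flag, to make the proposal concrete. Everything here is an assumption for illustration: `resolve_load_path`, `base_model_path`, and `ckpt_dir` are hypothetical names, not the project's actual API, and "a ckpt exists" is approximated as "the folder is non-empty".

```python
import warnings
from pathlib import Path


def resolve_load_path(
    base_model_path: str,
    ckpt_dir: str,
    resume_from_ckpt: bool = False,
) -> str:
    """Return the path that weights should be loaded from.

    Hypothetical helper: an existing ckpt folder is used only when the
    caller explicitly opts in with resume_from_ckpt=True.
    """
    ckpt = Path(ckpt_dir)
    # Assumption: a non-empty folder counts as "a ckpt exists".
    has_ckpt = ckpt.is_dir() and any(ckpt.iterdir())

    if resume_from_ckpt:
        if not has_ckpt:
            # Fail loudly rather than silently falling back to the base model.
            raise FileNotFoundError(
                f"resume_from_ckpt=True but no ckpt found in {ckpt_dir!r}"
            )
        return str(ckpt)

    if has_ckpt:
        # Never resume silently; tell the user the folder is being ignored.
        warnings.warn(
            f"Ignoring existing ckpt folder {ckpt_dir!r}; "
            "set resume_from_ckpt=True to resume from it."
        )
    return base_model_path
```

With this shape, the default path always starts from the original model, and a mismatch in either direction (flag set with no ckpt, or ckpt present with no flag) is surfaced instead of being resolved silently.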

Another option is to always create a new folder, e.g. "exp_{time}", but then it may silently take up disk space on the user's computer.
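A rough sketch of the per-run folder idea, with an optional cleanup step to limit the disk-usage concern. The names `make_run_dir` and `prune_old_runs`, the `exp` prefix, and the timestamp format are all assumptions for illustration, not existing project code.

```python
import shutil
from datetime import datetime
from pathlib import Path


def make_run_dir(root: str, prefix: str = "exp") -> Path:
    """Create a fresh, timestamped folder for this run's ckpts."""
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    run_dir = Path(root) / f"{prefix}_{stamp}"
    # exist_ok=False: refuse to reuse (and overwrite) an existing run folder.
    run_dir.mkdir(parents=True, exist_ok=False)
    return run_dir


def prune_old_runs(root: str, keep: int, prefix: str = "exp") -> list[Path]:
    """Delete all but the `keep` newest run folders; return the deleted ones."""
    # The timestamp format sorts lexicographically in chronological order.
    runs = sorted(p for p in Path(root).glob(f"{prefix}_*") if p.is_dir())
    stale = runs[:-keep] if keep > 0 else runs
    for path in stale:
        shutil.rmtree(path)
    return stale
```

Pruning would have to be an explicit call or a clearly documented setting; deleting old runs automatically would trade one silent behavior for another.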

TODO: research best practices
