Requirements, training times, ... trained models? #3

@tpietruszka

Description

It would be quite helpful to know the approximate memory requirements and training times for the different configs. I could not find such information in the Mogrifier paper or anywhere in this repo.

I've started training the Mogrifier with one of the provided Wikitext-2 configs. Each "turn" took 4-5 minutes on a V100 GPU, so the 1000 turns specified in the config would take ~3 days. Training filled almost all of the V100's memory, so 16GB is probably required for the provided configs? Or would 12GB be enough? (8GB was not; I tried.)
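For reference, the ~3-day figure above follows directly from the per-turn timing; a quick back-of-the-envelope check (plain arithmetic, not project code, using the midpoint of the observed 4-5 minutes per turn):

```python
def estimated_days(minutes_per_turn: float, turns: int) -> float:
    """Total wall-clock training time in days, given a per-turn cost."""
    return minutes_per_turn * turns / 60 / 24

# 1000 turns at ~4.5 min/turn on a V100, as observed above:
print(f"{estimated_days(4.5, 1000):.1f} days")  # ~3.1 days
```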

Are the figures above typical across all the datasets and provided configs, or do they differ significantly?
(Or perhaps my setup was wrong and performance should be higher?)

Finally, is there any chance of the trained models getting published?
