Resuming Training #4

@snrazavi

Hello,
I am trying to train a seq-att model for translation, but after 3 or 4 epochs the training process always stops unexpectedly (Kill). Moreover, when I try to resume training with the --model_in option, I receive an out-of-memory error, regardless of how much GPU memory I allocate with --dynet-mem. I have a GTX 980 GPU with 8 GB of graphics RAM. I should also add that the memory problem is more severe when I use other optimization methods, such as Adam.
Many thanks in advance.
