Training not converging with default settings

Hi, 
we're from the University of Wuerzburg and tried to replicate your project for German report data.
For now, we simply tried to get your code to run and train on MIMIC with the default settings provided as well as the settings provided in your paper. Of course, we made sure to have the same package versions as in the project.

However, we quickly get NaN loss after some iterations. So first, we tried to create a subsample of the dataset. For a very small dataset (~300 images), the training does converge. However, even for 1000 images the loss does not get smaller. We also tried several different learning rates and hyperparameters, but nothing helped so far.

I was hoping that you might be familiar with our problems and give us advice here.

Thanks in advance!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Training not converging with default settings #15

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Training not converging with default settings #15

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions