Blitz tutorial takes too long to run (GD instead of SGD?) #336

@gdalle

Hi there!
In the 60-minute blitz tutorial (https://fluxml.ai/tutorials/2020/09/15/deep-learning-flux.html), the part where we train a network on CIFAR10 takes longer than expected. Could it be because we actually go through every minibatch in each epoch, instead of sampling only one?
I am specifically referring to this line. Because of it, I feel like we are actually doing a non-stochastic gradient descent, which would explain the large runtime.
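To make the question concrete, here is a minimal sketch of the two schemes being compared. It is written in Python purely for illustration and is not the tutorial's code: the helper names (`make_minibatches`, `run_epoch_all`, `run_epoch_one`, `UpdateCounter`) and the dataset and batch sizes are all hypothetical, and the "update" only counts work instead of computing a gradient.

```python
import random


def make_minibatches(n_samples, batch_size):
    """Split sample indices 0..n_samples-1 into consecutive minibatches."""
    indices = list(range(n_samples))
    return [indices[i:i + batch_size] for i in range(0, n_samples, batch_size)]


def run_epoch_all(batches, step):
    """One parameter update per minibatch: every sample is visited once per epoch."""
    for batch in batches:
        step(batch)


def run_epoch_one(batches, step, rng):
    """One parameter update per epoch, on a single randomly chosen minibatch."""
    step(rng.choice(batches))


class UpdateCounter:
    """Stand-in for a gradient step: records how much work an epoch performs."""

    def __init__(self):
        self.updates = 0   # number of parameter updates
        self.samples = 0   # number of samples processed

    def __call__(self, batch):
        self.updates += 1
        self.samples += len(batch)


# Hypothetical sizes, chosen only to keep the numbers easy to read.
batches = make_minibatches(n_samples=1000, batch_size=100)

all_counter = UpdateCounter()
run_epoch_all(batches, all_counter)

one_counter = UpdateCounter()
run_epoch_one(batches, one_counter, random.Random(0))

print(all_counter.updates, all_counter.samples)  # 10 1000
print(one_counter.updates, one_counter.samples)  # 1 100
```

In this sketch each individual update sees only one minibatch in both schemes. What differs is the number of updates per epoch, which is where the extra runtime would come from.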
