Autograd of superloss #1

Open
@yuanpinz

Description

I've noticed that this superloss implementation is similar to AlanChou's unofficial implementation (https://github.com/AlanChou/Super-Loss); both use scipy to calculate lambertw. However, AlanChou's implementation notes the following:

The lambertw function should be implemented with PyTorch instead of using the scipy library as mentioned in AlanChou/Truncated-Loss#3 (comment).

There is a mistake because the additive regularization part doesn't have any gradients for Autograd.

Does this implementation solve the above problem?
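For reference, the pattern being questioned looks roughly like the sketch below (a minimal sketch, not the exact code of either repository; `tau` and `lam` stand for the SuperLoss threshold and regularization weight): sigma* is computed on detached losses through scipy, so it enters the graph as a constant.

```python
import numpy as np
import torch
from scipy.special import lambertw

def superloss_scipy(loss, tau=0.0, lam=1.0):
    # Minimal sketch of the scipy-based SuperLoss pattern discussed above;
    # `loss` is a per-sample task loss with requires_grad=True.
    beta = (loss.detach().cpu().numpy() - tau) / lam
    # Clamp the Lambert W argument at -2/e to stay on the principal branch.
    z = 0.5 * np.maximum(-2.0 / np.e, beta)
    sigma = np.exp(-lambertw(z).real)
    sigma = torch.from_numpy(sigma).to(loss.device, loss.dtype)
    # Autograd only sees `loss` here: sigma is a constant, so the additive
    # regularizer lam * log(sigma)^2 contributes no gradient, which is what
    # the quoted comment points out.
    return (loss - tau) * sigma + lam * torch.log(sigma) ** 2
```

A torch-native Lambert W along the lines AlanChou suggests could look like this rough sketch (an unrolled Newton iteration, purely illustrative), which would keep sigma inside the autograd graph:

```python
import math
import torch

def lambertw_torch(z: torch.Tensor, iters: int = 20) -> torch.Tensor:
    # Principal-branch Lambert W: solve w * exp(w) = z by Newton iterations
    # that autograd can differentiate through. Assumes z >= -1/e, which the
    # clamped SuperLoss argument satisfies; the small offset keeps the
    # iteration away from the branch point at z = -1/e.
    z = z.clamp(min=-1.0 / math.e + 1e-6)
    w = torch.log1p(z)  # initial guess, exact at z = 0
    for _ in range(iters):
        ew = torch.exp(w)
        w = w - (w * ew - z) / (ew * (w + 1.0))
    return w
```

Another option would be a custom torch.autograd.Function that keeps scipy's forward pass and supplies the analytic derivative W'(z) = W(z) / (z * (1 + W(z))) in backward; either way sigma would stop being a constant for autograd.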
