I've noticed that this superloss implementation is similar to AlanChou's unofficial implementation (https://github.com/AlanChou/Super-Loss). Both use scipy to compute lambertw. However, AlanChou's implementation states, quoted:
The lambertw function should be implemented with PyTorch instead of using the scipy library as mentioned in AlanChou/Truncated-Loss#3 (comment).
There is a mistake because the additive regularization part doesn't have any gradients for Autograd.
Does this implementation solve the above problem?
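To illustrate the concern: if lambertw is computed with scipy, autograd cannot backpropagate through it, so the confidence term contributes no gradient. Below is a minimal sketch (my own, not from either repo) of a native Lambert W via Newton's method; the identical iteration written with torch tensor operations would be tracked by autograd end to end.

```python
import math

def lambertw(x: float, iters: int = 20) -> float:
    """Principal-branch Lambert W for x >= 0, via Newton's method.

    Solves w * exp(w) = x. This is a plain-Python sketch; replacing
    math.exp/math.log1p with torch equivalents would make the same
    iteration differentiable, avoiding the scipy detour.
    """
    w = math.log1p(x)  # simple starting guess for x >= 0
    for _ in range(iters):
        ew = math.exp(w)
        # Newton step on f(w) = w * exp(w) - x
        w -= (w * ew - x) / (ew * (w + 1.0))
    return w
```

For example, `lambertw(math.e)` converges to 1, since 1 * e^1 = e.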