
Conversation

@vl-dud (Contributor) commented Nov 13, 2024

An L1 regularizer has been added. I am not quite sure about this line:

l1_loss = torch.sum(torch.stack([torch.sum(p.abs()) for p in trainable_variables]))

Perhaps regularization should not be applied to self.external_trainable_variables? Then it would have to be:

l1_loss = torch.sum(torch.stack([torch.sum(p.abs()) for p in self.net.parameters()]))

However, the current L2 regularization uses trainable_variables (i.e., both self.net.parameters() and self.external_trainable_variables).
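For illustration, here is a minimal sketch of an L1 penalty restricted to the network parameters, with external trainable variables left out; the helper name `l1_penalty` and the default weight are mine, not DeepXDE's API:

```python
import torch

def l1_penalty(parameters, weight=1e-4):
    # Sum of absolute values of the given parameters, scaled by `weight`.
    # Passing only net.parameters() keeps external trainable variables out.
    return weight * torch.sum(
        torch.stack([torch.sum(p.abs()) for p in parameters])
    )

# Stand-in network; in DeepXDE this would be the network behind `self.net`.
net = torch.nn.Linear(2, 1)
print(l1_penalty(net.parameters()))
```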

@lululxvi (Owner)

You are right that regularization should not be applied to self.external_trainable_variables. Let us fix this first in this PR.

@vl-dud (Contributor, Author) commented Nov 14, 2024

Fixed. Please note that I replaced

raise ValueError("L-BFGS optimizer doesn't support weight_decay > 0")

with

print(f"Warning: weight decay is ignored for {self.opt_name}")

And I removed the exception

raise ValueError("AdamW optimizer requires non-zero weight decay")

because AdamW will still fall back to its default weight_decay.
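For clarity, here is a hedged sketch of the resulting behavior, using a hypothetical `make_optimizer` helper rather than the PR's actual code: L-BFGS prints a warning and drops the weight decay instead of raising, and AdamW falls back to PyTorch's default `weight_decay` (0.01) when no value is given.

```python
import torch

def make_optimizer(params, opt_name, lr=1e-3, weight_decay=0):
    params = list(params)
    if opt_name == "L-BFGS":
        if weight_decay > 0:
            # torch.optim.LBFGS has no weight_decay argument, so the value is dropped.
            print(f"Warning: weight decay is ignored for {opt_name}")
        return torch.optim.LBFGS(params, lr=lr)
    if opt_name == "adamw":
        if weight_decay > 0:
            return torch.optim.AdamW(params, lr=lr, weight_decay=weight_decay)
        # No explicit value: AdamW runs with PyTorch's default weight_decay (0.01).
        return torch.optim.AdamW(params, lr=lr)
    return torch.optim.Adam(params, lr=lr, weight_decay=weight_decay)

net = torch.nn.Linear(2, 1)
opt = make_optimizer(net.parameters(), "L-BFGS", lr=1.0, weight_decay=0.01)
```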

@lululxvi (Owner)

Let us fix L2 in this PR. Use a new PR for adding L1 and L1+L2.
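As a reference for the follow-up PR, an "L1+L2" penalty would combine both terms, elastic-net style. A minimal sketch over the network parameters only (the helper name and the weights are illustrative, not DeepXDE's API):

```python
import torch

def l1_l2_penalty(parameters, l1_weight, l2_weight):
    params = list(parameters)
    l1 = torch.sum(torch.stack([p.abs().sum() for p in params]))
    l2 = torch.sum(torch.stack([p.pow(2).sum() for p in params]))
    return l1_weight * l1 + l2_weight * l2

net = torch.nn.Linear(2, 1)
print(l1_l2_penalty(net.parameters(), 1e-4, 1e-4))
```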

@vl-dud (Contributor, Author) commented Nov 17, 2024

Please take a look now.

@lululxvi (Owner)

In line 113, emphasize that regularization will not be applied to external_trainable_variables.

@vl-dud (Contributor, Author) commented Nov 24, 2024

> In line 113, emphasize that regularization will not be applied to external_trainable_variables.

Fixed.

@lululxvi merged commit 3544fdf into lululxvi:master on Dec 2, 2024
11 checks passed