Stochastic Gating #862

Open
wants to merge 1 commit into base: main

@kentslaney commented May 7, 2025

prior art

If anyone has more GPUs than ideas, I'd appreciate it if you tried this and shared feedback. It trains at small scale and its loss doesn't immediately diverge from the original's, but my hope is that it might mitigate mode collapse, which happens late in training and at scale. That said, it's a negative result so far. If I get around to trying it at scale myself, I'll update the thread.

Thoughts and discussion without results are welcome as well.

I also have a standalone implementation for anyone without a training setup.
