Replace QB load balancing with a simpler fixed bias increment. Each step, increase bias for underloaded experts and decrease for overloaded ones by a fixed delta.
User prompt
follow agent.md and implement fixed bias increment instead of QB load balancing. That is, on each step increase or decrease each router bias term by 0.01, depending on if the expert is underloaded or overloaded compared to its fair share. Test fixed increment scales of 0.01, 0.05, 0.005, 0.001.
Scope
moe_fixed_bias_increment
experiments/grug/moe/fixed_bias_increment_sweep.py
GrugTrainerConfig.fixed_bias_increment
Mechanism
QB computes an optimal threshold from top-k statistics. This experiment tests whether a simpler sign-based update, which moves each router bias by a fixed delta per step, balances expert load comparably.
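As a rough illustration of the sign-based update, here is a minimal sketch in plain Python. It is not the code in the sweep script: the function name `fixed_bias_update`, its arguments, and the choice to leave an expert untouched when its load equals the fair share exactly are all assumptions made for this example.

```python
def fixed_bias_update(bias, expert_counts, delta=0.01):
    """Return router biases after one fixed-increment balancing step.

    bias:          current per-expert router bias terms (hypothetical layout)
    expert_counts: tokens routed to each expert in the current step
    delta:         fixed increment, e.g. 0.01, 0.05, 0.005, or 0.001
    """
    # Fair share: the load each expert would carry under perfect balance.
    fair_share = sum(expert_counts) / len(bias)

    new_bias = []
    for b, count in zip(bias, expert_counts):
        if count < fair_share:
            b += delta  # underloaded: raise the bias
        elif count > fair_share:
            b -= delta  # overloaded: lower the bias
        # Assumption: an expert exactly at its fair share is left unchanged.
        new_bias.append(b)
    return new_bias


# Example: 4 experts, 80 routed tokens, so the fair share is 20 per expert.
updated = fixed_bias_update([0.0, 0.0, 0.0, 0.0], [10, 30, 20, 20], delta=0.01)
```

Only the sign of the load imbalance enters the update, so the step size is the same however far an expert is from its fair share. That is what makes the increment scale the natural quantity to sweep.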
Gate 1 runs (8 total)
Decision log
empty
Conclusion
pending