@szaman19
Contributor

  • Add a local accumulate cache to speed up Scatter-Accumulate
  • Add an option to share the cache across multiple operations, which reduces the amount of data stored for gradient calculations
  • Add unit tests and an ablation benchmark
  • Add an end-to-end benchmark
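
For reviewers unfamiliar with the idea, here is a rough NumPy sketch of one way a reusable accumulate cache can work. It is not the code in this PR. The names `AccumulateCache` and `scatter_accumulate` are hypothetical, and the sketch assumes the cache holds an index mapping that is computed once and then shared by several scatter-accumulate calls.

```python
import numpy as np


class AccumulateCache:
    """Hypothetical cache: maps each source row to a slot per unique destination.

    Built once from the destination indices, then reused by any number of
    scatter-accumulate calls that share the same indices.
    """

    def __init__(self, indices):
        indices = np.asarray(indices).reshape(-1)
        unique_dst, inverse = np.unique(indices, return_inverse=True)
        self.unique_dst = unique_dst          # sorted unique destination rows
        self.inverse = inverse.reshape(-1)    # slot in unique_dst for each source row


def scatter_accumulate(values, cache, num_rows):
    """Sum rows of `values` into `num_rows` output rows using a prebuilt cache."""
    values = np.asarray(values, dtype=np.float64)
    # Step 1: accumulate duplicates locally into one row per unique destination.
    local = np.zeros((cache.unique_dst.size, values.shape[1]))
    np.add.at(local, cache.inverse, values)
    # Step 2: write each unique destination exactly once.
    out = np.zeros((num_rows, values.shape[1]))
    out[cache.unique_dst] = local
    return out


# The same cache object serves two operations, so the index mapping is
# stored once rather than once per operation.
indices = [2, 0, 2, 1, 0]
shared_cache = AccumulateCache(indices)
a = scatter_accumulate(np.ones((5, 2)), shared_cache, num_rows=4)
b = scatter_accumulate(np.arange(10.0).reshape(5, 2), shared_cache, num_rows=4)
```

In this sketch, sharing one `AccumulateCache` means only one copy of the index mapping is held for all operations that use it, which is the kind of saving the second bullet describes for gradient calculations.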
