MoonMath.ai

MoonMath.ai builds the performance layer for Physical AI.

We are a small team of mathematicians and engineers building production-grade acceleration for the next wave of AI systems through low-level algorithms and systems engineering.

MoonLite

MoonLite is MoonMath’s family of inference-acceleration kernels designed for large generative models.

LiteAttention: Accelerates video diffusion with temporal sparse attention.
LiteFFN: Replaces standard FFN layers with a decomposed module.
BackLite: Wraps Flash Attention 3 and uses attention sparsity to speed up the backward pass via gradient approximation.
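To give a feel for why attention sparsity saves work, here is a minimal pure-Python sketch of softmax attention for a single query that skips whole tiles of keys. It is an illustration only and not MoonLite code: the function `attention_row`, the `tile_size` and `skip_tiles` parameters, and the toy data are all hypothetical, and the skipped tile is chosen by hand. How the MoonLite kernels decide which tiles to skip is not described here.

```python
import math

def attention_row(q, keys, values, tile_size=2, skip_tiles=frozenset()):
    """Softmax attention for one query, leaving out whole tiles of keys.

    skip_tiles is a hypothetical, hand-picked set of tile indices to skip.
    At least one tile must be kept.
    """
    scale = 1.0 / math.sqrt(len(q))
    scores, kept = [], []
    for start in range(0, len(keys), tile_size):
        if start // tile_size in skip_tiles:
            continue  # skipped tile: no dot products, no softmax terms
        for j in range(start, min(start + tile_size, len(keys))):
            scores.append(scale * sum(a * b for a, b in zip(q, keys[j])))
            kept.append(j)
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [
        sum(w * values[j][d] for w, j in zip(weights, kept)) / total
        for d in range(dim)
    ]

# Toy data: the second tile (keys 2 and 3) scores far below the first,
# so it contributes almost nothing to the softmax.
q = [1.0, 0.0]
keys = [[4.0, 0.0], [3.0, 1.0], [-20.0, 0.0], [-20.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0], [7.0, 7.0]]

dense = attention_row(q, keys, values)
sparse = attention_row(q, keys, values, skip_tiles={1})
max_error = max(abs(a - b) for a, b in zip(dense, sparse))
print(dense, sparse, max_error)
```

In this toy case, skipping the low-scoring tile halves the dot products while changing the output only slightly. A real kernel would make the skip decision per tile on the GPU, and the approximation error depends on how much softmax mass the skipped tiles carry.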