I’m trying to understand the selective scan operation in Mamba, but I’m having difficulty connecting the implementation with the underlying math/intuition.
Could someone please share good resources (papers, blog posts, or explanations) that clearly explain:
- what selective scan is doing mathematically
- how it relates to state space models
- how it is implemented efficiently in Mamba
Thanks