Currently, Matrix<Scalar> is rather opaque when it comes to accessing its contents.
A common access pattern for matrices is to iterate over them in either column or row order.
Being a performance framework, however, we would rather not have the lowest-level matrix iterators allocate any temporary memory, as column-wise iteration would currently require, given that ArraySlice<Scalar> does not allow for providing a stride.
(I already have a proof-of-concept for this on a local branch and will push it as a PR at some point.)
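
To make the stride problem concrete, here is a minimal sketch of what a non-allocating column view could look like. It is not the proof-of-concept mentioned above, just an illustration. `RowMajorMatrix`, `StridedColumn`, and the `step` parameter are hypothetical names, and the sketch assumes the elements are stored row-major in a flat array.

```swift
// Hypothetical stand-in for Matrix<Scalar>; assumes row-major storage in a flat array.
struct RowMajorMatrix<Scalar> {
    let rows: Int
    let columns: Int
    let grid: [Scalar]

    init(rows: Int, columns: Int, grid: [Scalar]) {
        precondition(grid.count == rows * columns, "grid size must equal rows * columns")
        self.rows = rows
        self.columns = columns
        self.grid = grid
    }

    // A row is contiguous in row-major storage, so an ArraySlice view is enough.
    func row(_ index: Int) -> ArraySlice<Scalar> {
        precondition((0..<rows).contains(index), "row index out of range")
        let start = index * columns
        return grid[start..<(start + columns)]
    }

    // A column is not contiguous: consecutive elements are `columns` apart.
    func column(_ index: Int) -> StridedColumn<Scalar> {
        precondition((0..<columns).contains(index), "column index out of range")
        return StridedColumn(grid: grid, offset: index, step: columns, count: rows)
    }
}

// Hypothetical strided view: shares the matrix's storage and walks it in steps,
// so the column's elements are never copied into a temporary array.
struct StridedColumn<Scalar>: Sequence {
    let grid: [Scalar]
    let offset: Int
    let step: Int
    let count: Int

    struct Iterator: IteratorProtocol {
        let column: StridedColumn
        var position = 0

        mutating func next() -> Scalar? {
            guard position < column.count else { return nil }
            defer { position += 1 }
            return column.grid[column.offset + position * column.step]
        }
    }

    func makeIterator() -> Iterator {
        Iterator(column: self)
    }
}

let m = RowMajorMatrix(rows: 2, columns: 3, grid: [1, 2, 3, 4, 5, 6])
print(Array(m.row(1)))     // [4, 5, 6]
print(Array(m.column(1)))  // [2, 5]
```

The `Array(...)` calls are only there to print the results. Iterating with `for x in m.column(1)` goes through the shared storage directly, since the view and its iterator are plain structs that hold a reference to the existing buffer.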