Commit 5b6f38d

Faster cpu ops (#1434)
* faster binary and cleaner copy
* use recursive template for other ops
* more cleanup
* fix from cleanup
* more clean
* fix binary
* use contiguous iterator
* add 3d
* nits
* fix
* fix?
* fix
* fix rebase
1 parent 0b4a586 commit 5b6f38d

12 files changed: +577 -1334 lines changed

mlx/backend/common/binary.h

Lines changed: 117 additions & 240 deletions
Large diffs are not rendered by default.

mlx/backend/common/binary_two.h

Lines changed: 130 additions & 447 deletions
Large diffs are not rendered by default.

mlx/backend/common/common.cpp

Lines changed: 1 addition & 2 deletions
@@ -156,8 +156,7 @@ std::pair<bool, std::vector<size_t>> Reshape::prepare_reshape(
   }
 
   // Firstly let's collapse all the contiguous dimensions of the input
-  auto [shape, _strides] = collapse_contiguous_dims(in);
-  auto& strides = _strides[0];
+  auto [shape, strides] = collapse_contiguous_dims(in);
 
   // If shapes fit exactly in the contiguous dims then no copy is necessary so
   // let's check.
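
For readers unfamiliar with the idea behind the hunk above, here is a minimal sketch of what collapsing contiguous dimensions for a single array can look like. It is not the MLX implementation: `collapse_dims_sketch` is a hypothetical name, the element types are assumptions, and edge cases such as size-1 dimensions with arbitrary strides or integer overflow are ignored. It only illustrates why a single-array helper can return one shape and one stride vector that destructure directly, as the `+` line does.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a single-array collapse helper.
// Two adjacent dimensions are merged when the outer stride equals the inner
// size times the inner stride, i.e. one step along the outer dimension lands
// exactly where a full walk of the inner dimension would end.
std::pair<std::vector<int>, std::vector<size_t>> collapse_dims_sketch(
    const std::vector<int>& shape,
    const std::vector<size_t>& strides) {
  assert(shape.size() == strides.size());
  std::vector<int> out_shape;
  std::vector<size_t> out_strides;
  for (size_t i = 0; i < shape.size(); ++i) {
    if (!out_shape.empty() &&
        out_strides.back() == static_cast<size_t>(shape[i]) * strides[i]) {
      // Contiguous with the previously emitted dimension: fold it in.
      out_shape.back() *= shape[i];
      out_strides.back() = strides[i];
    } else {
      // Not contiguous: start a new output dimension.
      out_shape.push_back(shape[i]);
      out_strides.push_back(strides[i]);
    }
  }
  return {out_shape, out_strides};
}
```

With C++17 structured bindings, a caller could write `auto [shape, strides] = collapse_dims_sketch({2, 3, 4}, {12, 4, 1});`. A row-contiguous input like this one collapses to a single dimension, while a transposed layout such as shape `{4, 3}` with strides `{1, 4}` is left unchanged.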
