High-performance Tensor Computation and Machine Learning
with Dynamic Auto-Differentiation natively in Rust
Modeled after PyTorch and NumPy, Redstone provides the following features:
- N-dimensional Arrays (NdArray) for tensor computations.
- Linear Algebra & Operations with GPU and CPU acceleration
- Dynamic Automatic Differentiation (reverse-mode autograd) for machine learning.
To install, either run cargo add redstone-ml or add this library to your Cargo.toml manually.
Internally, Redstone speeds up its operations using custom ARM NEON kernels, BLAS, and vDSP on supported architectures. This makes it blazing fast! The benchmark below is for single-threaded Apple Silicon.
This project is still in its early stage, so feature requests, bugs, and other contributions are very welcome. Please contact me if you are interested!
More detailed documentation is available here.
An NdArray is a fixed-size multidimensional array container defined by its shape
and datatype. 1D (vectors) and 2D (matrices) arrays are often of special interest
and can be used in various linear algebra computations
including dot products, matrix products, batch matrix multiplications, einsums, and more.
NdArrays can be iterated over (along configurable dimensions), reshaped, sliced, indexed,
reduced, and more.
This struct is heavily modeled after NumPy's ndarray and supports many of the same methods.
Example:
use redstone_ml::*;
let matrix_a = NdArray::new([[1, 3, 2], [-1, 0, -1]]); // shape [2, 3]
let matrix_b = NdArray::randint([3, 7], -5, 3);
let matrix_view = matrix_b.slice_along(Axis(1), 0..2); // shape [3, 2]
let matrix_c = matrix_a.matmul(matrix_view);
let result = matrix_c.sum();
There are two ways to create NdArray views, by borrowing or by consuming:
let data = NdArray::<f64>::rand([9]);
let matrix = (&data).reshape([3, 3]); // by borrowing (data remains alive after)
let data = NdArray::<f64>::rand([9]);
let matrix = data.reshape([3, 3]); // by consuming data
The consuming syntax allows us to chain operations without worrying about lifetimes:
// a reshaped and transposed random matrix
let matrix = NdArray::<f64>::rand([9]).reshape([3, 3]).T();
Operations like reshape, view, diagonal, squeeze, unsqueeze, T, transpose, and
ravel do not create new NdArrays by duplicating memory (which would be slow).
They always return NdArray views which share memory with the source NdArray.
This means that all NdArray views have a lifetime at most as long as the source NdArray.
NdArray::clone() or NdArray::flatten() can be used to duplicate the underlying NdArray instead.
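To make the idea of a memory-sharing view concrete, here is a minimal std-only sketch. It assumes a view is just a borrowed buffer plus a shape and strides. The `MatrixView` type and its methods are hypothetical illustrations, not Redstone's actual internals.

```rust
/// Hypothetical 2-D view: borrows a flat buffer and describes it with a
/// shape and strides. Illustrative only; not Redstone's real data layout.
#[derive(Debug, Clone, Copy)]
struct MatrixView<'a> {
    data: &'a [f64],
    shape: [usize; 2],
    strides: [usize; 2],
}

impl<'a> MatrixView<'a> {
    /// Interpret a flat buffer as a row-major `rows x cols` matrix.
    fn from_slice(data: &'a [f64], rows: usize, cols: usize) -> Self {
        assert_eq!(data.len(), rows * cols, "shape does not match buffer length");
        MatrixView { data, shape: [rows, cols], strides: [cols, 1] }
    }

    /// Transpose by swapping shape and strides; no element is copied.
    fn t(self) -> Self {
        MatrixView {
            data: self.data,
            shape: [self.shape[1], self.shape[0]],
            strides: [self.strides[1], self.strides[0]],
        }
    }

    /// Read element (i, j) through the strides.
    fn get(&self, i: usize, j: usize) -> f64 {
        assert!(i < self.shape[0] && j < self.shape[1], "index out of bounds");
        self.data[i * self.strides[0] + j * self.strides[1]]
    }

    /// Copy the viewed elements into a new owned buffer (row-major order).
    /// This is the only method here that duplicates memory.
    fn to_vec(&self) -> Vec<f64> {
        let mut out = Vec::with_capacity(self.shape[0] * self.shape[1]);
        for i in 0..self.shape[0] {
            for j in 0..self.shape[1] {
                out.push(self.get(i, j));
            }
        }
        out
    }
}
```

Because the view holds a `&'a [f64]`, the borrow checker prevents it from outliving the buffer it points into, which mirrors the lifetime rule described above. Only `to_vec` copies data, in the same spirit as an explicit clone or flatten.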
We currently support the core linear algebra operations including dot products, matrix-vector and matrix-matrix multiplications, batched matrix multiplications, and trace.
vector1.dot(vector2);
matrix.trace(); // also trace_along/offset_trace
matrix.diagonal(); // also diagonal_along/offset_diagonal
matrix.matmul(vector);
matrix1.matmul(matrix2);
batch_matrices1.bmm(batch_matrices2);
// generic einsums
einsum([&matrix1, &matrix2, &vector], (["ij", "kj", "i"], "ik"));
We can also perform various reductions including sum, product, min, max,
min_magnitude, and max_magnitude. Each of these is accelerated with various libraries
including vDSP, Arm64 NEON SIMD, and BLAS.
let sum = ndarray.sum();
let sum_along = ndarray.sum_along([0, -1]); // sum along first and last axes
NdArrays can be used in arithmetic operations using the usual binary operators including
addition (+), subtraction (-), multiplication (*), division (/), remainder (%),
and bitwise operations (&, |, <<, >>).
let result = &arr1 + &arr2; // non-consuming
let result = &arr1 + arr2; // consumes RHS
let result = arr1 + arr2; // consumes both
NdArrays are automatically broadcast using the exact same rules as NumPy
to perform efficient computations with different-dimensional (yet compatible) data.
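For readers unfamiliar with NumPy's broadcasting rules, the shape computation can be sketched in a few lines of std-only Rust. The function name `broadcast_shape` is a hypothetical helper written for this illustration and is not part of Redstone's API.

```rust
/// Compute the broadcast shape of two shapes following NumPy's rules:
/// shapes are aligned from the trailing dimension, missing leading
/// dimensions count as 1, and two sizes are compatible when they are
/// equal or when one of them is 1. Returns `None` if incompatible.
/// Hypothetical helper for illustration; not part of Redstone's API.
fn broadcast_shape(a: &[usize], b: &[usize]) -> Option<Vec<usize>> {
    let ndim = a.len().max(b.len());
    let mut out = vec![0usize; ndim];

    for k in 0..ndim {
        // k counts dimensions starting from the trailing one
        let da = if k < a.len() { a[a.len() - 1 - k] } else { 1 };
        let db = if k < b.len() { b[b.len() - 1 - k] } else { 1 };

        out[ndim - 1 - k] = match (da, db) {
            (x, y) if x == y => x, // equal sizes pass through
            (1, y) => y,           // size 1 stretches to match
            (x, 1) => x,
            _ => return None,      // e.g. 3 vs 4 cannot be broadcast
        };
    }
    Some(out)
}
```

For example, shapes `[2, 4, 3, 5]` and `[3, 1]` broadcast to `[2, 4, 3, 5]`, while `[2, 3]` and `[4]` are incompatible because the trailing sizes 3 and 4 differ and neither is 1.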
Slicing and indexing an NdArray always return a view. This is how we can access various
elements of vectors, columns/rows of matrices, and more.
let arr = NdArray::<f32>::rand([2, 4, 3, 5]); // 4D NdArray
let slice1 = arr.slice(s![.., 0, ..=2]); // use s! to specify a slice
let slice2 = arr.slice_along(Axis(-2), 0); // 0th element along second-to-last axis
let el = arr[[0, 3, 2, 4]];
One can also iterate over an NdArray in various ways:
for subarray in arr.iter() { /* 4x3x5 subarrays */ }
for subarray in arr.iter_along(Axis(2)) { /* 2x4x5 subarrays */ }
for el in arr.flatiter() { /* element-wise iteration */ }
The Tensor API is nearly identical to NdArray with the following differences:
- Only floating point (f32, f64) types are supported
- Operations without autograd implemented are omitted
Tensors allow us to perform dynamic automatic differentiation which is independent of control flow. This allows us to find matrix derivatives in complicated scenarios:
use redstone_ml::*;

fn main() {
    let mut a = Tensor::new([[7.5, 12.0], [5.0, 6.25]]);
    let mut b = Tensor::new([0.5, -2.0]);
    let c = Tensor::scalar(10.0);

    // track gradients for a and b only
    a.set_requires_grad(true);
    b.set_requires_grad(true);

    let matrix_2x2 = (&a / &b) * (c + 5.0);
    let result = matrix_2x2.matmul(&b);

    // backpropagate from result to every Tensor that requires a gradient
    result.backward();

    println!("{:?}", a.gradient().unwrap());
    println!("{:?}", b.gradient().unwrap());
}
Gradients are only computed for Tensors with requires_grad = true to avoid unnecessary computation.
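To illustrate what "dynamic" reverse-mode differentiation means, here is a minimal scalar sketch using only the standard library. It records operations on a tape as they execute and then sweeps the tape backwards. The `Tape` and `Node` types and the `grad_of_branchy` function are hypothetical and greatly simplified; they are not how Redstone implements autograd.

```rust
/// One recorded operation: two parent indices and the local partial
/// derivatives of this node's value with respect to each parent.
struct Node {
    parents: [usize; 2],
    partials: [f64; 2],
}

/// Hypothetical scalar gradient tape (illustration only).
struct Tape {
    nodes: Vec<Node>,
    values: Vec<f64>,
}

impl Tape {
    fn new() -> Self {
        Tape { nodes: Vec::new(), values: Vec::new() }
    }

    fn push(&mut self, value: f64, parents: [usize; 2], partials: [f64; 2]) -> usize {
        self.nodes.push(Node { parents, partials });
        self.values.push(value);
        self.nodes.len() - 1
    }

    /// A leaf has no real parents: it points at itself with zero partials.
    fn leaf(&mut self, value: f64) -> usize {
        let id = self.nodes.len();
        self.push(value, [id, id], [0.0, 0.0])
    }

    fn mul(&mut self, a: usize, b: usize) -> usize {
        let (va, vb) = (self.values[a], self.values[b]);
        self.push(va * vb, [a, b], [vb, va])
    }

    fn div(&mut self, a: usize, b: usize) -> usize {
        let (va, vb) = (self.values[a], self.values[b]);
        self.push(va / vb, [a, b], [1.0 / vb, -va / (vb * vb)])
    }

    /// Reverse sweep: propagate d(output)/d(node) from the output to the leaves.
    fn backward(&self, output: usize) -> Vec<f64> {
        let mut grads = vec![0.0; self.nodes.len()];
        grads[output] = 1.0;
        for i in (0..=output).rev() {
            let g = grads[i];
            let node = &self.nodes[i];
            for k in 0..2 {
                grads[node.parents[k]] += g * node.partials[k];
            }
        }
        grads
    }
}

/// f(x, y) = (x * y if x > y, else x / y) * x.
/// Returns (f, df/dx, df/dy). Ordinary Rust control flow decides which
/// operations end up on the tape.
fn grad_of_branchy(x: f64, y: f64) -> (f64, f64, f64) {
    let mut tape = Tape::new();
    let (a, b) = (tape.leaf(x), tape.leaf(y));
    let inner = if x > y { tape.mul(a, b) } else { tape.div(a, b) };
    let out = tape.mul(inner, a);
    let grads = tape.backward(out);
    (tape.values[out], grads[a], grads[b])
}
```

Since the tape is rebuilt on every call, each branch of the `if` yields its own derivative: for `x > y` the function is `x²y`, with gradient `(2xy, x²)`, and otherwise it is `x²/y`, with gradient `(2x/y, -x²/y²)`.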
