Ruby Grant 2020 Proposal

Contact information

Name: Udit Gulati

E-mail: [email protected]

GitHub: uditgulati

Blog: uditgulati.github.io/blog

Phone: +91 9896363996

Brief biography

I am an active SciRuby contributor. I worked on NumRuby (a re-implementation of NMatrix) during the summer of 2019 as part of the Google Summer of Code program. I am currently a Software Engineer at a start-up under SMC Global Securities Limited.

Projects

  • NumRuby
  • Ruby-Sparse

Talks

Project Title

Views support for NumRuby and Ruby-Sparse improvements.

Abstract

NumRuby is a re-implementation of the popular dense matrix library NMatrix, with a primary focus on performance and scalability. It has shown up to 100x speedups for fundamental functionality such as element-wise operations. This project aims to add views support to the NumRuby library. Views let multi-dimensional data structures exchange data without creating copies, which saves a lot of memory and execution time.

Sparse matrices are an important part of scientific computing. Many popular modern languages such as Python, Julia and R support sparse matrices (some even as part of their standard libraries), while Ruby doesn't have a good sparse library. Ruby-Sparse aims to be an efficient, feature-rich, end-user-oriented sparse matrix library in Ruby, with interfaces to popular dense matrix libraries (such as Numo-Narray and NumRuby [NMatrix]) and with linear algebra support.

What are Views?

Views are structures that point to the same underlying data without copying it: assignment and operations such as slicing and broadcasting only create view metadata such as matrix dimensions and strides. Using views instead of copying the whole data structure is especially handy when the underlying structure is large, as is often the case in NumRuby, where matrices can occupy a huge chunk of memory. Views get access to the underlying structure through a custom buffer protocol, which also helps when transferring data to and from other matrix libraries, provided they support views as well.
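
A rough sketch of the intended behaviour (the slicing API shown here is hypothetical; the exact NumRuby interface is part of this project):

a = NMatrix.new [4, 4], (1..16).to_a  # a dense matrix
b = a[0...2, 0...2]  # hypothetical: returns a view sharing a's storage
b[0, 0] = 42         # visible through `a` as well; no element data was copied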

What is Sparse storage?

Sparse arrays, or arrays that are mostly empty or filled with zeros, are common in many scientific applications. To save space we often avoid storing these arrays in traditional dense formats, and instead choose different data structures. The choice of data structure can significantly affect our storage and computational costs when working with these arrays.

Large sparse matrices often appear in scientific or engineering applications when solving partial differential equations. When storing and manipulating sparse matrices on a computer, it is beneficial and often necessary to use specialized algorithms and data structures that take advantage of the sparse structure of the matrix. Operations using standard dense-matrix structures and algorithms are slow and inefficient when applied to large sparse matrices as processing and memory are wasted on the zeroes. Sparse data is by nature more easily compressed and thus requires significantly less storage.

Some very large sparse matrices are infeasible to manipulate with standard dense-matrix algorithms at all. Sparse matrices therefore require completely different algorithms, which in turn calls for a completely separate library for them.

Sparse storage can be implemented in the following ways:

  • Compressed sparse row (CSR)
  • Compressed sparse column (CSC)
  • Coordinate list (COO)
  • Diagonal

These types are explained in the Technical Details section along with implementation details.
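
For intuition, here is a mostly-zero matrix and its triplet representation as plain Ruby arrays (COO-style; the formats themselves are detailed below):

dense = [
  [0, 0, 3],
  [4, 0, 0],
  [0, 0, 0]
]
# triplet (COO-style) storage keeps only the 2 non-zero entries
values = [3, 4]
rows   = [0, 1]
cols   = [2, 0]
# the dense form stores 9 values; the sparse form stores 2 values plus
# 4 indices, and the gap widens rapidly as size grows and density drops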

Utilities of Sparse storage

  • More performant sparse matrix operations, with improvements in both speed and memory usage.
  • Graph algorithms support.
  • Sparse linear algebra that is faster and uses less memory than its dense counterpart.

Motivation

NumRuby has made some huge improvements in speed and code simplicity compared to the previous implementation, NMatrix, but there is still scope for a few more improvements, views being one of the most significant. Views will make slicing, broadcasting and matrix assignment much smoother.

Having views implemented in NumRuby will also help in interfacing the library with other matrix libraries that support views. This will make interfacing with libraries such as Daru, Numo-Narray and Ruby-Sparse faster, since copying of the underlying matrix data is skipped and the data is instead pointed to directly through a custom buffer protocol.

While many popular languages have sparse storage support (either in the standard library or as a third-party library), Ruby doesn't have good support for it. This project aims to fill that gap with a sparse storage library for the Ruby environment that is fast, easy to use, API-rich and highly compatible with popular dense storage libraries such as Numo-Narray and NumRuby (the re-implementation of NMatrix).

Languages that support Sparse storage

1. Python

  • Python has 2 major libraries for sparse matrix support, namely scipy.sparse and pydata.sparse.
  • scipy.sparse is for 2-dimensional matrices only. It supports quite a few sparse matrix types: BSR, COO, CSC, CSR, DIA, DOK and LIL.
  • scipy.sparse has a sub-module, scipy.sparse.csgraph, for graph algorithms. It provides algorithms such as Dijkstra shortest paths, Floyd-Warshall, Bellman-Ford, BFS & DFS, minimum spanning tree, maximum bipartite matching, etc.
  • scipy.sparse has another sub-module, scipy.sparse.linalg, which provides linear algebra capabilities for sparse matrices, such as finding the inverse, determinant or norm of a matrix, matrix factorization, and solving linear systems.
  • pydata.sparse is for arbitrary-dimensional sparse arrays. It supports only 2 sparse types, namely COO and DOK.
  • pydata.sparse also supports basic arithmetic operations on sparse matrices and has useful utility functions, random, eye and ones to name a few.
  • pydata.sparse is compatible with numpy.ndarray while scipy.sparse is compatible with numpy.matrix.

2. Julia

  • Julia has support for sparse vectors and sparse matrices in the SparseArrays std-lib module.
  • In Julia, sparse matrices are stored in the Compressed Sparse Column (CSC) format.
  • There is support for sparse matrix arithmetic operations and indexing.

Codebase

https://github.com/SciRuby/numruby

https://github.com/SciRuby/ruby-sparse

Technical Details

Views Project Layout

A proposed project layout is as follows:

numruby  
└── ext  
    ├── nmatrix_buffer.c
    ├── ruby_nmatrix.h  
    ├── ruby_nmatrix.c 
    ├── slicing.c  
    └── broadcasting.c  

Views buffer structure

typedef struct bufferinfo {
  void *buf;           /* pointer to the underlying elements */
  nmatrix *obj;        /* owned reference to the parent matrix */
  nm_dtype dtype;      /* element data type */
  int readonly;        /* 1 if the view must not be written through */
  size_t ndims;        /* number of dimensions */
  size_t count;        /* total number of elements */
  size_t *shape;       /* length of each dimension */
  size_t *stride;      /* step between consecutive entries along each dimension */
} nm_buffer;

static int
nm_getbuffer(nmatrix *obj, nm_buffer *view)
{
  /* Fill the view with the matrix's metadata; no element data is copied. */
  view->obj      = obj;
  view->buf      = (void *)obj->elements;
  view->count    = obj->count;
  view->readonly = 0;
  view->dtype    = obj->dtype;
  view->ndims    = obj->ndims;
  view->shape    = obj->shape;
  view->stride   = obj->stride;

  return 0;
}

Sparse storage Project Layout

A proposed project layout is as follows:

ruby-sparse  
└── ext  
    ├── blas   
    │   └── sparse_blas.c  
    ├── elementwise  
    │   ├── arithmetic.c  
    │   ├── exponential.c  
    │   ├── hyperbolic.c  
    │   ├── numeric.c  
    │   └── trigonometric.c  
    ├── indexing  
    │   ├── index.c  
    │   ├── iterators.c  
    │   ├── slicing.c 
    │   └── broadcasting.c 
    ├── extconf.rb  
    ├── interfaces  
    │   ├── narray.c  
    │   └── nmatrix.c  
    ├── io  
    │   └── serialize.c 
    ├── mkmf.rb   
    ├── ruby_sparse.h  
    └── ruby_sparse.c  

Sparse storage Module structure

RubySparse  
	RubySparse::CSR  
	RubySparse::CSC  
	RubySparse::COO  
	RubySparse::Dia
	RubySparse::Linalg  
	RubySparse::SparseMatrix  

Classes

1. RubySparse::CSR

Class for Compressed Sparse Row matrix.

2. RubySparse::CSC

Class for Compressed Sparse Column matrix.

3. RubySparse::COO

Class for COOrdinate format sparse matrix.

4. RubySparse::Dia

Class for DIAgonal sparse matrix.

5. RubySparse::SparseMatrix

This class provides a base class for all sparse matrices. It cannot be instantiated. Most of the work is provided by its subclasses.

Functions

1. eye(m, n, dtype: 'float', format: 'DIA')

Sparse matrix with ones on diagonal. Returns a sparse (m x n) matrix where the diagonal is all ones and everything else is zeros.

2. kron(A, B, format: 'CSR')

Kronecker product of sparse matrices A and B. Returns Kronecker product in a sparse matrix format.

3. tril(A, format: 'CSR')

Return the lower triangular portion of a matrix in sparse format.

4. triu(A, format: 'CSR')

Return the upper triangular portion of a matrix in sparse format.

5. hstack(blocks, format: 'CSR', dtype: 'float')

Stack sparse matrices horizontally (column wise).

6. vstack(blocks, format: 'CSR', dtype: 'float')

Stack sparse matrices vertically (row wise).

7. dot(A, B)

Perform the dot product on two sparse matrices A and B.

8. save(file, matrix)

Save a sparse matrix to a file.

9. load(file)

Load a sparse matrix from a file. Returns a sparse matrix containing the loaded data.

Submodules

1. RubySparse::Linalg

This module consists of linear algebra capabilities for sparse matrices.

  • inv(A): Compute the inverse of a sparse matrix. Returns the inverse of A.
  • expm(A): Compute the matrix exponential using a Padé approximation. Returns the matrix exponential of A.
  • norm(A, order, axis): Norm of a sparse matrix. This function is able to return one of seven different matrix norms, depending on the value of the order parameter. Returns norm of matrix.
  • solve(A, B): Solve the sparse linear system Ax=b, where b may be a vector or a matrix. Returns x, the solution of linear equation.
  • eigs(A, k): Find k eigenvalues and eigenvectors of the square matrix A. Returns an array of k eigenvalues and an array of k eigenvectors.
  • eigsh(A, k): Find k eigenvalues and eigenvectors of the real symmetric or complex Hermitian square matrix A. Returns an array of k eigenvalues and an array of k eigenvectors.
  • svds(A, k): Compute the largest k singular values/vectors for a sparse matrix.
  • lu(A): Compute the LU decomposition of a sparse, square matrix.

Supported sparse types

CSR

The compressed sparse row (CSR) or Yale format represents a matrix M by three one-dimensional arrays that respectively contain the non-zero values, the extents of rows, and the column indices. Methods to be implemented:

  • get_dense_from_csr: This method will convert CSR storage to dense storage.
  • get_csr_from_dense: This method will convert dense storage to CSR storage.
typedef struct CSR_STRUCT
{
  sp_dtype dtype;
  size_t ndims;
  size_t count;     // count of non-zero elements
  size_t* shape;
  double* elements; // non-zero values
  size_t* ia;       // row pointers (extents of rows)
  size_t* ja;       // column index of each value
} csr_matrix;
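
As a concrete illustration, here are plain Ruby arrays mirroring the struct fields above:

# CSR form of:
#   [[5, 0, 0],
#    [0, 8, 3],
#    [0, 0, 6]]
elements = [5, 8, 3, 6]  # non-zero values, scanned row by row
ia       = [0, 1, 3, 4]  # row i occupies elements[ia[i]...ia[i + 1]]
ja       = [0, 1, 2, 2]  # column index of each stored value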

CSC

CSC is similar to CSR except that values are read first by column, a row index is stored for each value, and column pointers are stored. Methods to be implemented:

  • get_dense_from_csc: This method will convert CSC storage to dense storage.
  • get_csc_from_dense: This method will convert dense storage to CSC storage.
typedef struct CSC_STRUCT
{
  sp_dtype dtype;
  size_t ndims;
  size_t count;     // count of non-zero elements
  size_t* shape;
  double* elements; // non-zero values
  size_t* ia;       // row index of each value
  size_t* ja;       // column pointers (extents of columns)
} csc_matrix;

COO

COO stores a list of (row, column, value) tuples. Ideally, the entries are sorted first by row index and then by column index, to improve random access times. Methods to be implemented:

  • get_dense_from_coo: This method will convert COO storage to dense storage.
  • get_coo_from_dense: This method will convert dense storage to COO storage.
typedef struct COO_STRUCT
{
  sp_dtype dtype;
  size_t ndims;
  size_t count;     // count of non-zero elements
  size_t* shape;
  double* elements; // non-zero values
  size_t* ia;       // row index of each value
  size_t* ja;       // column index of each value
} coo_matrix;
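
The same example matrix used for CSR above looks like this in COO form, with entries sorted by row and then by column (plain Ruby arrays mirroring the struct fields):

# COO form of:
#   [[5, 0, 0],
#    [0, 8, 3],
#    [0, 0, 6]]
elements = [5, 8, 3, 6]
ia       = [0, 1, 1, 2]  # row index of each stored value
ja       = [0, 1, 2, 2]  # column index of each stored value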

Diagonal

Diagonal stores just the entries in the main diagonal as a one-dimensional array, so a diagonal n × n matrix requires only n entries. It is used when only the diagonal entries are non-zero. Methods to be implemented:

  • get_dense_from_dia: This method will convert Dia storage to dense storage.
  • get_dia_from_dense: This method will convert dense storage to Dia storage.
typedef struct DIA_STRUCT
{
  sp_dtype dtype;
  size_t ndims;
  size_t count;     // count of stored elements
  size_t* shape;
  double* elements; // entries of the stored diagonals
  size_t* offset;   // offsets of the stored diagonals (0 = main diagonal)
} dia_matrix;
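
For example, the diagonal matrix with 5, 8, 6 on its main diagonal reduces to the following (plain Ruby arrays mirroring the struct fields; taking the main diagonal's offset to be 0 is an assumption of this sketch):

# DIA form of:
#   [[5, 0, 0],
#    [0, 8, 0],
#    [0, 0, 6]]
elements = [5, 8, 6]
offset   = [0]  # only the main diagonal is stored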

Modules and classes initialization

Initialization code for RubySparse and its modules and classes is as follows:

VALUE RubySparse = Qnil;
VALUE SparseMatrix = Qnil;
VALUE COO = Qnil;

void Init_sparse() {
  RubySparse   =  rb_define_module("RubySparse");
  SparseMatrix =  rb_define_class_under(RubySparse, "SparseMatrix", rb_cObject);
  COO          =  rb_define_class_under(RubySparse, "COO", rb_cObject);
  
  rb_define_alloc_func(COO, coo_alloc);
  rb_define_method(COO, "initialize", coo_init, -1);
  rb_define_method(COO, "shape", coo_get_shape, 0);
  rb_define_method(COO, "elements", coo_get_elements, 0);
  rb_define_method(COO, "coords", coo_get_coords, 0);
  rb_define_method(COO, "nzcount", coo_get_count, 0);
  rb_define_method(COO, "dim", coo_get_ndims, 0);
  
  rb_define_method(COO, "+", coo_add, 1);
}

The sparse matrix initialization for COO is as follows:

VALUE coo_init(int argc, VALUE* argv, VALUE self) {
  coo_matrix* mat;
  Data_Get_Struct(self, coo_matrix, mat);

  // argv[0]: shape, argv[1]: non-zero values,
  // argv[2]: row indices, argv[3]: column indices
  if(argc >= 4) {
    mat->ndims = 2;
    mat->count = (size_t)RARRAY_LEN(argv[1]);
    mat->shape = ALLOC_N(size_t, mat->ndims);
    for(size_t index = 0; index < mat->ndims; index++) {
      mat->shape[index] = (size_t)FIX2LONG(RARRAY_AREF(argv[0], index));
    }
    mat->elements = ALLOC_N(double, mat->count);
    for(size_t index = 0; index < mat->count; index++) {
      mat->elements[index] = (double)NUM2DBL(RARRAY_AREF(argv[1], index));
    }
    mat->ia = ALLOC_N(size_t, mat->count);
    for(size_t index = 0; index < mat->count; index++) {
      mat->ia[index] = (size_t)NUM2SIZET(RARRAY_AREF(argv[2], index));
    }
    mat->ja = ALLOC_N(size_t, mat->count);
    for(size_t index = 0; index < mat->count; index++) {
      mat->ja[index] = (size_t)NUM2SIZET(RARRAY_AREF(argv[3], index));
    }
  }

  return self;
}


VALUE coo_alloc(VALUE klass) {
  coo_matrix* mat = ALLOC(coo_matrix);

  return Data_Wrap_Struct(klass, NULL, coo_free, mat);
}


void coo_free(coo_matrix* mat) {
  xfree(mat);
}

Similarly, it'll be implemented for other sparse types.
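
With the bindings above, constructing and inspecting a COO matrix from Ruby would look like this (argv[0] is the shape, argv[1] the non-zero values, argv[2] and argv[3] the row and column indices):

require 'sparse'

# 3x3 matrix with 1, 2, 3 on the main diagonal
m = RubySparse::COO.new [3, 3], [1, 2, 3], [0, 1, 2], [0, 1, 2]
m.shape    # => [3, 3]
m.nzcount  # => 3
m.dim      # => 2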

Sparse storage Dtypes

The library will be implemented only for int and float64 dtypes during the grant period. The other dtypes such as bool, complex64 and string will be implemented if time allows or will be done after the grant period.

Sparse storage Indexing

Sparse matrix indexing works differently from dense indexing, and each sparse type indexes elements in its own way, since each format stores element positions differently. For example, COO stores the row and column indices directly alongside each element, so both can be read off immediately; CSR, on the other hand, stores each element's column index but only how many elements fall in each row, so an element's row index is not directly available.
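
A minimal sketch of CSR lookup in plain Ruby (using the ia/ja/elements arrays described in the Supported sparse types section): the row pointers narrow the search to one row, whose column indices are then scanned.

# Return mat[row, col] for a CSR matrix given as three arrays:
# ia: row pointers, ja: column indices, elements: non-zero values.
def csr_get(ia, ja, elements, row, col)
  (ia[row]...ia[row + 1]).each do |k|
    return elements[k] if ja[k] == col
  end
  0  # the position holds an implicit zero
end

csr_get([0, 1, 3, 4], [0, 1, 2, 2], [5, 8, 3, 6], 1, 2)  # => 3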

Once indexing is implemented, it can be used to iterate over each element of the sparse matrix. Other kinds of iterators can also be implemented, such as the following (a usage sketch follows the list):

  • each_with_index: Iterate over each element with index of occurrence.
  • each_with_indices: Iterate over each element with row and column index.
  • each_row: Iterate over each row of sparse matrix.
  • each_column: Iterate over each column of sparse matrix.
  • each_row_with_index: Iterate over each row of sparse matrix with corresponding index.
  • each_column_with_index: Iterate over each column of sparse matrix with corresponding index.
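
Usage would look roughly like this (a sketch; the exact block arguments are still to be finalized):

m = RubySparse::COO.new [3, 3], [1, 2, 3], [0, 1, 2], [0, 1, 2]

m.each_with_indices do |value, i, j|
  puts "m[#{i}, #{j}] = #{value}"  # one line per stored (non-zero) entry
end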

Indexing will also be used to implement slicing of sparse matrices. Examples of slicing:

  • mat[1, 0...2]: Row with index 1 and columns with indices from 0 up to, but not including, 2.
  • mat[0...5, 3...4]: Rows with indices from 0 up to, but not including, 5, and columns with indices from 3 up to, but not including, 4.

Sparse matrix operations

Elementwise

The following element-wise operations would be supported.

  • Basic Arithmetic operations: +, -, *, /, >>, <<.
  • Exponential and logarithmic functions: exp, log, expm1, log1p, etc.
  • Hyperbolic functions: sinh, cosh, tanh, etc.
  • Logical operations: &&, ||, |, &, <, >, <=, >=, ==, !.
  • Numeric functions: floor, round, min, max, etc.
  • Trigonometric functions: sin, cos, tan, etc.

Elementwise addition for COO is as follows:

VALUE coo_add(VALUE self, VALUE another) {
  coo_matrix* left;
  coo_matrix* right;
  Data_Get_Struct(self, coo_matrix, left);
  Data_Get_Struct(another, coo_matrix, right);

  coo_matrix* result = ALLOC(coo_matrix);
  result->dtype = left->dtype;
  result->count = 0;
  result->ndims = left->ndims;
  result->shape = ALLOC_N(size_t, result->ndims);

  for(size_t index = 0; index < result->ndims; index++) {
    result->shape[index] = left->shape[index];
  }

  // worst case: the operands share no coordinates
  result->elements = ALLOC_N(double, left->count + right->count);
  result->ia = ALLOC_N(size_t, left->count + right->count);
  result->ja = ALLOC_N(size_t, left->count + right->count);

  // merge the two sorted coordinate lists, adding values on equal coordinates
  size_t left_index = 0, right_index = 0, result_index = 0;
  while(left_index < left->count || right_index < right->count) {
    int take_left, take_right;

    if(left_index == left->count) {        // left exhausted
      take_left = 0; take_right = 1;
    }
    else if(right_index == right->count) { // right exhausted
      take_left = 1; take_right = 0;
    }
    else if(left->ia[left_index] == right->ia[right_index]
         && left->ja[left_index] == right->ja[right_index]) {
      take_left = 1; take_right = 1;       // same coordinate in both operands
    }
    else if(left->ia[left_index] < right->ia[right_index]
         || (left->ia[left_index] == right->ia[right_index]
          && left->ja[left_index] < right->ja[right_index])) {
      take_left = 1; take_right = 0;       // left coordinate comes first
    }
    else {
      take_left = 0; take_right = 1;       // right coordinate comes first
    }

    if(take_left && take_right) {
      result->elements[result_index] = left->elements[left_index] + right->elements[right_index];
      result->ia[result_index] = left->ia[left_index];
      result->ja[result_index] = left->ja[left_index];
      left_index++, right_index++;
    }
    else if(take_left) {
      result->elements[result_index] = left->elements[left_index];
      result->ia[result_index] = left->ia[left_index];
      result->ja[result_index] = left->ja[left_index];
      left_index++;
    }
    else {
      result->elements[result_index] = right->elements[right_index];
      result->ia[result_index] = right->ia[right_index];
      result->ja[result_index] = right->ja[right_index];
      right_index++;
    }

    result_index++;
    result->count++;
  }

  return Data_Wrap_Struct(COO, NULL, coo_free, result);
}
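
Since both operands keep their coordinate lists sorted by (row, column), this is a single linear merge: it runs in O(left->count + right->count) time and visits each stored entry exactly once.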

Matrix multiplication

There are multiple sparse BLAS libraries available to support:

  • Matrix-vector multiplication
  • Matrix-matrix multiplication

I'll use the NIST Sparse BLAS library, written in FORTRAN, for these operations. Details about the library can be found at https://math.nist.gov/spblas/.

Sparse linear algebra

The sparse linear algebra sub-module needs further research on which back-end to use, as there isn't a LAPACK counterpart for sparse matrices, though I have found a few possible alternatives.

There's also a list of low-level libraries for sparse matrices at http://www.netlib.org/utk/people/JackDongarra/la-sw.html.

For the grant period, I'll focus on getting the graph sub-module completed, and I'll also research possible back-ends for the sparse linear algebra sub-module while implementing basic linear algebra functionality such as finding the matrix inverse, norm and determinant, and solving systems of linear equations.

Sparse storage Interfaces

NumRuby (Re-implementation of NMatrix)

Each of the sparse classes will have a class method called from_nmatrix() and an instance method called to_nmatrix(). from_nmatrix() converts an NMatrix into the given sparse type, and to_nmatrix() converts the given sparse type back into an NMatrix.

Following is the implementation of these methods for RubySparse::COO:

  • NMatrix to COO
VALUE coo_from_nmatrix(VALUE self, VALUE nmat) {
  VALUE r_elements = rb_funcall(nmat, rb_intern("elements"), 0);
  VALUE r_shape = rb_funcall(nmat, rb_intern("shape"), 0);

  if((size_t)RARRAY_LEN(r_shape) != 2) {
    rb_raise(rb_eStandardError, "NMatrix must be of 2 dimensions.");
  }
  coo_matrix* mat = ALLOC(coo_matrix);

  mat->ndims = 2;
  mat->count = 0;
  mat->shape = ALLOC_N(size_t, mat->ndims);
  for(size_t index = 0; index < mat->ndims; index++) {
    mat->shape[index] = (size_t)NUM2SIZET(RARRAY_AREF(r_shape, index));
  }

  // first pass: count the non-zero entries
  size_t dense_elements_count = (mat->shape[0]) * (mat->shape[1]);
  size_t sparse_elements_count = 0;
  for(size_t index = 0; index < dense_elements_count; ++index) {
    double value = (double)NUM2DBL(RARRAY_AREF(r_elements, index));
    if(fabs(value - 0.0) < 1e-6)  // treat near-zero values as zero
      continue;
    sparse_elements_count++;
  }

  mat->count = sparse_elements_count;
  mat->elements = ALLOC_N(double, mat->count);
  mat->ia = ALLOC_N(size_t, mat->count);
  mat->ja = ALLOC_N(size_t, mat->count);

  // second pass: collect the values with their row and column indices
  size_t element_index = 0, index = 0;
  for(size_t i = 0; i < mat->shape[0]; ++i) {
    for(size_t j = 0; j < mat->shape[1]; ++j) {
      double value = (double)NUM2DBL(RARRAY_AREF(r_elements, element_index++));
      if(fabs(value - 0.0) < 1e-6)
        continue;
      mat->elements[index] = value;
      mat->ia[index] = i;
      mat->ja[index] = j;
      index++;
    }
  }

  return Data_Wrap_Struct(COO, NULL, coo_free, mat);
}
  • COO to NMatrix
class RubySparse::COO

  def to_nmatrix
    ia = self.coords[0]  # row indices
    ja = self.coords[1]  # column indices

    # scatter the non-zero values into a flat dense (row-major) array
    nm_elements = Array.new(self.shape[0] * self.shape[1], 0)
    for i in (0...self.nzcount)
      nm_index = (self.shape[1] * ia[i]) + ja[i]
      nm_elements[nm_index] = self.elements[i]
    end

    NMatrix.new self.shape, nm_elements
  end
end
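
A round-trip between the two libraries would then look like this (a sketch using the constructors shown elsewhere in this proposal):

nmat = NMatrix.new [2, 2], [5.0, 0.0, 0.0, 7.0]

coo  = RubySparse::COO.from_nmatrix(nmat)  # zero entries are dropped
back = coo.to_nmatrix                      # dense layout restored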

Numo-Narray

The interfaces for Numo-Narray will be implemented similarly to those for NMatrix.

Sparse storage IO support

There will be support for serializing a sparse storage object to disk so that it can be read back later. Object serialization in Ruby is also known as object marshalling.

Serialization support will be implemented as load and save methods in the io/serialize.c file.
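
Usage would look roughly as follows (a sketch; whether load and save end up as module functions or instance methods is an open design choice):

m = RubySparse::COO.new [3, 3], [1, 2, 3], [0, 1, 2], [0, 1, 2]

RubySparse.save('diag.sparse', m)   # serialize to disk
n = RubySparse.load('diag.sparse')  # read it back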

Tests and Documentation methodology

Tests

Minitest will be used for tests, as in the structure shown below. Tests for NumRuby can be found here and for Ruby-Sparse can be found here.

The test structure will be as follows:

$LOAD_PATH.unshift File.expand_path('../../lib', __FILE__)
require 'sparse'
require 'minitest/autorun'

class COOCreationTest < Minitest::Test

  def setup
    @n = RubySparse::COO.new [3, 3], [1, 2, 3], [0, 1, 2], [0, 1, 2]
  end

  def test_ndims
    assert_equal 2, @n.dim
  end

  def test_shape
    assert_equal [3, 3], @n.shape
  end

  def test_elements
    assert_equal [1, 2, 3], @n.elements
  end

  def test_coords
    assert_equal [[0, 1, 2], [0, 1, 2]], @n.coords
  end

  def test_count
    assert_equal 3, @n.nzcount
  end

end


class COOElementWiseTest < Minitest::Test

  def setup
    @left = RubySparse::COO.new [3, 3], [1, 2, 3], [0, 1, 2], [0, 1, 2]
    @right = RubySparse::COO.new [3, 3], [3, 2, 3], [0, 1, 2], [2, 2, 2]
  end

  def test_add
    result = RubySparse::COO.new [3, 3], [1, 3, 2, 2, 6], [0, 0, 1, 1, 2], [0, 2, 1, 2, 2]
    answer = @left + @right
    assert_equal answer.dim, result.dim
    assert_equal answer.shape, result.shape
    assert_equal answer.nzcount, result.nzcount
    assert_equal answer.elements, result.elements
    assert_equal answer.coords, result.coords
  end

end

Yard will be used for documentation.

Library usage examples

These example files give a gist of the usage of the Ruby-Sparse library.

Timeline

28th Oct - 6th Nov

  • Implement the RubySparse::COO class with element-wise operations.
  • Implement indexing for RubySparse::COO.
  • Implement Numo::Narray and NMatrix interfaces for RubySparse::COO.

7th Nov - 16th Nov

  • Implement the RubySparse::CSR class with element-wise operations.
  • Implement indexing for RubySparse::CSR.
  • Implement Numo::Narray and NMatrix interfaces for RubySparse::CSR.

17th Nov - 26th Nov

  • Implement the RubySparse::CSC class with element-wise operations.
  • Implement indexing for RubySparse::CSC.
  • Implement Numo::Narray and NMatrix interfaces for RubySparse::CSC.

27th Nov - 6th Dec

  • Implement the RubySparse::DIA class with element-wise operations.
  • Implement indexing for RubySparse::DIA.
  • Implement Numo::Narray and NMatrix interfaces for RubySparse::DIA.

7th Dec - 16th Dec

  • Implement pretty print for RubySparse::COO, RubySparse::CSR, RubySparse::CSC and RubySparse::DIA.
  • Implement resize() and reshape().

17th Dec - 26th Dec

  • Add serialization support for sparse matrix classes.
  • Implement load() and save() methods.
  • Implement tests for sparse matrix element-wise operations.

27th Dec - 5th Jan

  • Add inter sparse classes conversion support.
  • Implement methods such as to_coo(), to_csr(), to_csc() and to_dia().

6th Jan - 13th Jan

  • Add tests for RubySparse::COO, RubySparse::CSR, RubySparse::CSC and RubySparse::DIA.
  • Refactor and optimize code for RubySparse::COO, RubySparse::CSR, RubySparse::CSC and RubySparse::DIA.

Mid term deliverables (14th Jan)

  • Implementation of sparse storage classes, namely, COO, CSR, CSC and DIA.
  • Implement serialization support.
  • Implement interfaces to dense matrix libraries and inter sparse type conversion support.
  • Implement pretty print for all sparse types.

15th Jan - 24th Jan

  • Implement the custom buffer protocol for NMatrix.
  • Implement matrix assignment using views.

25th Jan - 3rd Feb

  • Implement matrix slicing and broadcasting using views.
  • Benchmarks and possible bug fixes for views.

4th Feb - 13th Feb

  • Implement RubySparse::Linalg module.
  • Add matrix-vector, matrix-matrix dot product support.
  • Add sparse matrix linear equation solution support.

14th Feb - 23rd Feb

  • Implement triu() and tril() methods for sparse types classes.
  • Implement hstack() and vstack() methods.

24th Feb - 4th Mar

  • Add tests for RubySparse::Linalg.
  • Write documentation for the Ruby-Sparse API.

5th Mar - 12th Mar

  • Release ruby-sparse gem.
  • Create benchmarks for the library.
  • Create graph plots for the benchmarks.

Final deliverables (13th Mar)

  • RubySparse::Linalg module with proper tests.
  • Documentation for the code written so far.
  • Proper benchmarks for core functionalities.
  • Graph plots of benchmarks and comparison with respect to other similar libraries.

Deliverables

  • NumRuby views implementation using custom buffer protocol.
  • Implementation of slicing, broadcasting supporting views.
  • Implementation of Sparse storage library.
  • IO (serialization) support for the Sparse storage.
  • Implementation of proper interfaces to dense storage (dense array) libraries, namely Numo-Narray and NumRuby.
  • Implement a linear algebra and a graph module for sparse matrices.
  • Proper documentation and strong tests.
  • Release of the numruby and ruby-sparse gems.

Future Work

  • The sparse library can be extended to support arbitrary dimensional sparse arrays.
  • Complex linear algebra and graph algorithms can be added if required.
  • Add broadcasting support to sparse matrices.

Technical Advisor

I have asked my GSoC 2019 mentor Prasun Anand to advise me on this project. Prasun works as a software engineer at Quansight and is currently working on the PyTorch library. He received a Ruby Grant in 2017 to work on GPU-accelerated libraries for Ruby.

https://www.prasunanand.com/about/