Direct `csr` sparse ops support

**Is your feature request related to a problem? Please describe.**
I think there are performance gains to be had by not densifying minibatch inputs where possible, and instead backpropagating through the sparse matrix yielded directly by the loader. This should hold at the level of sparsity we often see (~2%), at least in RNA-seq data. In the notebook linked below, the speedup is 2X for an MLP classifier. IIUC, the same trick applies to the loss function as well as the ELBO, i.e., use the sparse matrix directly instead of densifying.
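To make the idea concrete, here is a minimal pure-Python sketch of a first-layer product `X @ W` computed straight from the CSR arrays, touching only the stored nonzeros. The function name `csr_matmul_dense` and the toy data are hypothetical and only for illustration; this is not the notebook's code or the `pytorch_sparse` API, and a real implementation would use a GPU sparse kernel.

```python
def csr_matmul_dense(indptr, indices, data, weight):
    """Compute X @ W for a CSR matrix X (indptr, indices, data) and dense W.

    `weight` is a list of rows, one per input feature. Only the stored
    nonzeros of X are visited, so the work scales with nnz, not with
    n_rows * n_features.
    """
    n_rows = len(indptr) - 1
    n_out = len(weight[0])
    out = [[0.0] * n_out for _ in range(n_rows)]
    for i in range(n_rows):
        # Nonzeros of row i live in data[indptr[i]:indptr[i + 1]].
        for k in range(indptr[i], indptr[i + 1]):
            col, value = indices[k], data[k]
            w_row = weight[col]
            for o in range(n_out):
                out[i][o] += value * w_row[o]
    return out


# Toy minibatch X = [[0, 2, 0], [1, 0, 3]] in CSR form (hypothetical data).
indptr = [0, 1, 3]
indices = [1, 0, 2]
data = [2.0, 1.0, 3.0]
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

result = csr_matmul_dense(indptr, indices, data, weight)
print(result)
```

The same access pattern is what a sparse-linear layer would exploit: zeros in the input never contribute to the output, so they are never read.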

It's possible (likely) that as that percentage grows, the benefit shrinks or disappears entirely.

See https://colab.research.google.com/drive/14cjQkQ2lO9wT7BpcfLfYCUY40GncXesh?authuser=3 for something runnable, though the timings come from Colab's dated GPU.

The implementation is based on https://github.com/rusty1s/pytorch_sparse


**Describe the solution you'd like**
I think the answer is a runtime setting, exposed as an enum with three options:

1. `auto` tries to detect sparsity and, past some cutoff, uses the sparse accelerator
2. `sparse_direct` replaces the first layer with a sparse-linear layer (i.e., one that does matmul on a sparse input) and swaps in the corresponding sparse ops at the other applicable locations
3. `densify` densifies all inputs
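
A rough sketch of how such a setting could be resolved at runtime is below. The names `SparseMode` and `resolve_mode`, and the `cutoff=0.05` default, are assumptions made for illustration only; the actual cutoff would need benchmarking, and nothing here reflects an existing API.

```python
from enum import Enum


class SparseMode(str, Enum):
    """Hypothetical runtime setting mirroring the three options above."""

    AUTO = "auto"
    SPARSE_DIRECT = "sparse_direct"
    DENSIFY = "densify"


def resolve_mode(mode, nnz, n_rows, n_cols, cutoff=0.05):
    """Turn the user-facing setting into a concrete choice for one batch.

    `cutoff` is the fraction of nonzero entries at or below which `auto`
    picks the sparse path. The default is a placeholder, not a measured value.
    """
    mode = SparseMode(mode)
    if mode is not SparseMode.AUTO:
        # Explicit settings are honored as-is.
        return mode
    total = n_rows * n_cols
    density = nnz / total if total else 1.0
    if density <= cutoff:
        return SparseMode.SPARSE_DIRECT
    return SparseMode.DENSIFY


# A 10 x 100 batch with 20 nonzeros (2% nonzero) takes the sparse path.
sparse_choice = resolve_mode("auto", nnz=20, n_rows=10, n_cols=100)
# The same shape with 500 nonzeros (50% nonzero) falls back to densifying.
dense_choice = resolve_mode("auto", nnz=500, n_rows=10, n_cols=100)
print(sparse_choice.value, dense_choice.value)
```

Keeping the decision in one function would let `auto` be tuned, or replaced by a measured heuristic, without touching the model code.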

