Description
I think the dash onsite demonstrated that the GradSim method is slow for large models. This is because, currently, PyTorch and TensorFlow don't let you compute gradients per instance within a batch, which gradient similarity requires. We can precompute and store the gradients ahead of time, but this becomes infeasible for large models. Note that partial solutions include: a) using a subset of the model weights, such as the final layer, to decrease the memory overhead, or b) reducing the dataset you're comparing against using something like ProtoSelect. Both of these are user-level interventions, though. I think our focus should be figuring out how to batch the gradient computations; one candidate direction is sketched below.
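For reference, here is a minimal sketch of both the current bottleneck (a Python loop of one backward pass per instance) and one candidate for batching it. The model, shapes, and loss below are toy placeholders, and the batched version assumes `torch.func` (`vmap` + `grad`, available from PyTorch 2.0), which may or may not be acceptable as a dependency for us; treat this as something to evaluate, not a confirmed fix:

```python
import torch
from torch.func import functional_call, grad, vmap

# Toy stand-ins for the real model/data (illustrative only).
model = torch.nn.Sequential(
    torch.nn.Linear(10, 32), torch.nn.ReLU(), torch.nn.Linear(32, 2)
)
X = torch.randn(64, 10)          # batch of training instances
Y = torch.randint(0, 2, (64,))   # labels


def loss_fn(logits, y):
    return torch.nn.functional.cross_entropy(logits, y)


# Today's bottleneck: one full backward pass per training instance.
def per_instance_grads_loop():
    grads = []
    for x, y in zip(X, Y):
        model.zero_grad()
        loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
        grads.append(torch.cat([p.grad.flatten() for p in model.parameters()]))
    return torch.stack(grads)    # (batch, n_params)


# Candidate batched version: vmap a functional per-sample gradient.
# Partial mitigation (a) above would just shrink this dict to the final
# layer, e.g. {k: v for k, v in params.items() if k.startswith("2.")}.
params = {k: v.detach() for k, v in model.named_parameters()}


def sample_loss(params, x, y):
    logits = functional_call(model, params, (x.unsqueeze(0),))
    return loss_fn(logits, y.unsqueeze(0))


# One vectorised call instead of a Python loop of backward passes.
per_sample_grads = vmap(grad(sample_loss), in_dims=(None, 0, 0))(params, X, Y)

# Flatten to (batch, n_params) and score against a test-point gradient,
# which is the cosine similarity that GradSim needs.
flat = torch.cat([g.flatten(1) for g in per_sample_grads.values()], dim=1)
x_test, y_test = torch.randn(10), torch.tensor(1)
test_grads = grad(sample_loss)(params, x_test, y_test)
test_flat = torch.cat([g.flatten() for g in test_grads.values()])
sims = torch.nn.functional.cosine_similarity(flat, test_flat.unsqueeze(0), dim=1)
```

If the vmapped version holds up on larger models, the per-instance gradients could be computed batch-by-batch at query time instead of being precomputed and stored for the whole training set, which is exactly the memory problem described above.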