Description
Torch autograd's jacobian (torch.autograd.functional.jacobian), used by LearnableCostFunction, treats the whole batch as a single input and therefore computes cross-batch gradients, which is undesirable: the result has shape (batch, out_dim, batch, in_dim), and the off-diagonal cross-batch blocks are all zeros we still pay to compute. I haven't seen an out-of-the-box solution, so we might need to combine manual backward() calls with proper use of vmap() to compute the jacobian on our own.
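
A minimal sketch of one possible direction (along the vmap lines suggested above), using torch.func from PyTorch 2.x: wrapping a per-sample function in jacrev and vmapping over the batch yields per-sample jacobians with no cross-batch terms. Here cost_fn is a hypothetical stand-in for whatever LearnableCostFunction actually differentiates, not the real implementation.

```python
import torch
from torch.func import jacrev, vmap

in_dim, out_dim, batch = 3, 2, 4
weight = torch.randn(in_dim, out_dim)

# Hypothetical stand-in for the cost function: maps one unbatched
# input of shape (in_dim,) to an output of shape (out_dim,).
def cost_fn(x):
    return torch.tanh(x @ weight)

x = torch.randn(batch, in_dim)

# torch.autograd.functional.jacobian treats the batch as one input,
# producing a (batch, out_dim, batch, in_dim) tensor whose off-diagonal
# (cross-batch) blocks are all zeros -- the behavior described above.
full_jac = torch.autograd.functional.jacobian(cost_fn, x)
print(full_jac.shape)  # torch.Size([4, 2, 4, 3])

# vmap(jacrev(...)) differentiates each sample independently, giving a
# (batch, out_dim, in_dim) per-sample jacobian with no cross terms.
per_sample_jac = vmap(jacrev(cost_fn))(x)
print(per_sample_jac.shape)  # torch.Size([4, 2, 3])

# Sanity check: per-sample jacobians match the diagonal blocks of the
# full jacobian. diagonal() moves the batch dim last, so permute it back.
diag_blocks = full_jac.diagonal(dim1=0, dim2=2).permute(2, 0, 1)
assert torch.allclose(per_sample_jac, diag_blocks)
```

If torch.func doesn't fit (e.g. the cost function has in-place ops vmap can't handle), the fallback would be the manual route: loop over output components, call backward() with one-hot grad_outputs, and stack the resulting gradients.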