
Torch backend: LSTM fails to use cuDNN when criteria are satisfied #22274

@DLumi

Description


If you create an LSTM layer with use_cudnn=True that satisfies the requirements stated in the documentation, the forward pass on that layer still fails.
Error:

ValueError: Exception encountered when calling LSTM.call().

use_cudnn=True was specified, but cuDNN is not supported for this layer configuration with this backend. Pass use_cudnn='auto' to fallback to a non-cuDNN implementation.

Arguments received by LSTM.call():
  • sequences=torch.Tensor(shape=torch.Size([1, 256, 128]), dtype=float32)
  • initial_state=None
  • mask=None
  • training=None

The cuDNN requirements as checked in code (essentially the same as the documented ones):
https://github.com/keras-team/keras/blob/0ddf96200f781f95dea0289aa87d85eb5eb3e733/keras/src/backend/torch/rnn.py#L548C1-L564C6
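For reference, here is a standalone sketch of those criteria as I understand them from the linked rnn.py. The function name and exact predicate are paraphrased for illustration, not the actual Keras internals:

```python
def lstm_cudnn_ok(
    activation="tanh",
    recurrent_activation="sigmoid",
    recurrent_dropout=0.0,
    unroll=False,
    use_bias=True,
):
    # Hypothetical standalone check mirroring the documented cuDNN
    # eligibility criteria for the torch-backend LSTM. Note that the
    # parameter defaults are the same as keras.layers.LSTM's defaults,
    # so a default-constructed layer should qualify.
    return (
        activation == "tanh"
        and recurrent_activation == "sigmoid"
        and recurrent_dropout == 0.0
        and not unroll
        and use_bias
    )

# A default LSTM meets the criteria; changing any of these should not.
assert lstm_cudnn_ok()
assert not lstm_cudnn_ok(unroll=True)
assert not lstm_cudnn_ok(recurrent_dropout=0.5)
```

So on paper the layer in the Colab below should be eligible, which is why the ValueError above looks like a regression rather than a misconfiguration.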

Colab to reproduce:
https://colab.research.google.com/drive/1OFLXgZpiUXDXTO1eebQYhXnq2HevZpoe

P.S. I explicitly passed the required arguments, even though on paper the default values should suffice.
P.P.S. This problem appears starting with keras==3.12; on 3.11.3 it does not occur. Or at least the layer is created and callable there; I have no idea whether it actually uses cuDNN (likely not, as training is VERY slow).
