We are trying to combine KAN with an RNN by applying a KAN layer at every time step of the input sequence.
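For reference, the setup looks roughly like the sketch below. The class and variable names are our own, and `nn.Linear` stands in for the KAN layer (we don't reproduce the KAN API here); the point is just the per-time-step call pattern:

```python
import torch
import torch.nn as nn

class PerStepRNN(nn.Module):
    """RNN that applies an extra layer (a KAN layer in our code,
    nn.Linear here as a placeholder) at every time step."""

    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        self.rnn_cell = nn.GRUCell(input_dim, hidden_dim)
        self.step_layer = nn.Linear(hidden_dim, hidden_dim)  # KAN layer in our actual code

    def forward(self, x):
        # x: (batch, time, input_dim)
        h = x.new_zeros(x.size(0), self.rnn_cell.hidden_size)
        outs = []
        for t in range(x.size(1)):
            h = self.rnn_cell(x[:, t], h)
            outs.append(self.step_layer(h))  # one per-step layer call per iteration
        return torch.stack(outs, dim=1)  # (batch, time, hidden_dim)

model = PerStepRNN(8, 16)
y = model(torch.randn(4, 10, 8))
print(y.shape)
```

So the per-step layer is invoked T times per forward pass, which is where we suspect the slowdown comes from.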
However, this makes training very slow, even on a GPU such as an A100: compared to standard layers like nn.Linear + ReLU, throughput drops significantly.
Any suggestions on how to speed this up, or is this usage pattern not recommended?
Thanks a lot for your great work!