[OpenVINO backend] Support for LSTM #22313
[OpenVINO backend] Support for LSTM #22313andersendsa wants to merge 16 commits intokeras-team:masterfrom
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the OpenVINO backend by introducing full support for the Long Short-Term Memory (LSTM) operation. This integration allows Keras models leveraging LSTMs to be efficiently compiled and executed using OpenVINO, addressing a critical gap in functionality. The changes involve a detailed implementation of the Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request adds support for the LSTM operation to the OpenVINO backend by implementing the backend.lstm function, leveraging the LSTMSequence operation from OpenVINO. The changes look good overall, but I have a few suggestions to improve code quality, robustness, and clarity. My feedback includes removing some dead and redundant code, improving error handling for unsupported activations, and clarifying the purpose of some changes. I've also noted a potentially out-of-scope change in the test exclusion list.
keras/src/backend/openvino/rnn.py
Outdated
| direction = "forward" | ||
| if go_backwards: | ||
| direction = "reverse" | ||
| def get_activation_name(act): | ||
| from keras.src import activations | ||
| if act == activations.tanh: return "tanh" | ||
| if act == activations.sigmoid: return "sigmoid" | ||
| if act == activations.relu: return "relu" | ||
| if act == activations.linear: return "linear" |
There was a problem hiding this comment.
The get_activation_name helper function defaults to returning "tanh" for any unrecognized activation function. This can lead to silent errors where an unsupported activation is used but the model compiles and runs with a different activation (tanh) than intended. According to the Keras API design guidelines (lines 139-143), it's better to catch user errors early and provide informative error messages.
Please consider raising a ValueError for unsupported activations. You could also expand the list of recognized activations to include those from keras.src.ops (e.g., ops.tanh) for better robustness.
def get_activation_name(act):
from keras.src import activations
from keras.src import ops
if act in (activations.tanh, ops.tanh):
return "tanh"
if act in (activations.sigmoid, ops.sigmoid):
return "sigmoid"
if act in (activations.relu, ops.relu):
return "relu"
if act in (activations.linear, ops.linear):
return "linear"
raise ValueError(f"Unsupported activation function for OpenVINO LSTM: {act}")| keras/src/trainers/trainer_test.py | ||
| keras/src/utils No newline at end of file | ||
| keras/src/utils | ||
| keras/src/layers/rnn/gru_test.py No newline at end of file |
There was a problem hiding this comment.
This change to exclude gru_test.py seems out of scope for a pull request focused on adding LSTM support. Additionally, the entire keras/src/layers/rnn directory is already excluded on line 26, which makes this new line redundant if the directory exclusion works as expected. Could you please clarify the reasoning for this change or remove it if it's not necessary?
| recurrent_kernel, | ||
| bias, | ||
| activation, | ||
| recurrent_activation, |
There was a problem hiding this comment.
The unroll parameter is included in the function signature but is not used within the implementation. According to the Keras API design guidelines (lines 18 and 53), unused parameters should be avoided to keep the API surface clean and minimize user confusion. Please consider either implementing the unrolling logic or removing this parameter.
| unroll=False, | ||
| time_major=False, |
keras/src/backend/openvino/rnn.py
Outdated
| ) | ||
|
|
||
|
|
||
| def cudnn_ok(*args, **kwargs): | ||
| return False | ||
| def cudnn_ok( | ||
| activation, | ||
| recurrent_activation, | ||
| unroll, | ||
| use_bias, | ||
| reset_after=None, | ||
| ): | ||
| if reset_after is None: | ||
| return ( | ||
| activation in (activations.tanh, ops.tanh) | ||
| and recurrent_activation in (activations.sigmoid, ops.sigmoid) | ||
| and not unroll | ||
| ) | ||
| else: | ||
| return False |
There was a problem hiding this comment.
This cudnn_ok function appears to be dead code in the context of the OpenVINO backend. It's only referenced in the LSTM and GRU layer __init__ methods within a condition that checks backend.backend() == "tensorflow". When the backend is OpenVINO, this function will not be called.
Additionally, the name cudnn_ok is misleading here. If this function is intended to be used in the future for OpenVINO, it should be renamed to something more appropriate, like is_fast_path_supported.
For now, it seems this function can be reverted to return False to avoid confusion and maintaining unused code.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #22313 +/- ##
==========================================
- Coverage 82.96% 76.54% -6.43%
==========================================
Files 596 596
Lines 66269 66355 +86
Branches 10321 10335 +14
==========================================
- Hits 54982 50790 -4192
- Misses 8667 13140 +4473
+ Partials 2620 2425 -195
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Hi @hertschuh this pr passes all the tests and it is ready for review |
| keras/src/trainers/trainer_test.py | ||
| keras/src/utils No newline at end of file | ||
| keras/src/utils | ||
| keras/src/layers/rnn/gru_test.py No newline at end of file |
There was a problem hiding this comment.
Agreed with gemini here. Adding gru_test.py when the parent directory is already excluded makes no sense. Please clean up the excluded_tests.txt diff.
There was a problem hiding this comment.
Manually overriding dir and getattr to force-import the entire library seems unnecessary. We only make changes to the files involving the backend, for a backend specific implementation. Is there a reason for this?
There was a problem hiding this comment.
Revert this whole file, it's failing the code format check.
There was a problem hiding this comment.
I'm confused by the inclusion of these changes in this PR. keras/src/initializers/ is a global utility and should remain backend-agnostic. Unless there is a specific reason these initializers need to be changed for all of Keras, I'd suggest reverting the changes to this file.
There was a problem hiding this comment.
+1
Why was this change needed?
There was a problem hiding this comment.
Revert this whole file, it's failing the code format check.
hertschuh
left a comment
There was a problem hiding this comment.
Thanks for implementing this!
keras/src/backend/openvino/rnn.py
Outdated
| def cudnn_ok( | ||
| activation, | ||
| recurrent_activation, | ||
| unroll, | ||
| use_bias, | ||
| reset_after=None, | ||
| ): |
There was a problem hiding this comment.
Just to clarify, in this context, cudnn_ok means "use the native OpenVino implementation" (we should rename it).
But is the criteria really the same as for CuDNN? That seems surprising.
There was a problem hiding this comment.
+1
Why was this change needed?
keras/src/backend/openvino/rnn.py
Outdated
| from openvino import Type | ||
|
|
||
| from keras.src import activations | ||
| from keras.src import ops |
There was a problem hiding this comment.
You cannot import ops like this here, this is what's creating the circular imports.
But they're not even used.
There was a problem hiding this comment.
Revert this whole file, it's failing the code format check.
There was a problem hiding this comment.
Revert this whole file, it's failing the code format check.
keras/src/backend/openvino/rnn.py
Outdated
| def cudnn_ok(*args , **kwargs): | ||
| return False No newline at end of file |
There was a problem hiding this comment.
Revert this change file, it's failing the code format check.
There was a problem hiding this comment.
Hey @hertschuh just wanted to double-check—do you want me to revert the whole file back to the previous commit?
There was a problem hiding this comment.
Oh, no, just the last 2 lines of this file.
keras/src/backend/openvino/rnn.py
Outdated
| direction = "reverse" | ||
|
|
||
| def get_activation_name(act): | ||
| from keras.src import activations |
There was a problem hiding this comment.
activations are already imported at the top, you don't need this import.
|
Hey @andersendsa, could you also enable the tests releted to LSTM from excluded_concrete_tests.txt ? |
|
hi @hertschuh i have reverted the file and removed the test cases the pr is ready for review |
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request adds support for the LSTM operation in the OpenVINO backend by implementing the lstm function using the ov_opset.lstm_sequence operation. This is a great addition. The changes also include modifications to the generic rnn function and enabling LSTM tests.
My review found a critical issue with the gate ordering for weights and biases in the new lstm implementation, which will lead to incorrect results. I've also identified a few other issues, including duplicated code, unreachable code, and dead code, which should be addressed to improve code quality and correctness.
| k_i, k_f, k_c, k_o = ov_opset.split( | ||
| kernel_T, ov_opset.constant(0, Type.i32), 4 | ||
| ).outputs() | ||
| W = ov_opset.concat([k_f, k_i, k_c, k_o], axis=0).output(0) |
There was a problem hiding this comment.
The concatenation order for weights W is incorrect. Keras LSTM kernels are ordered for [input, forget, cell, output] gates. The OpenVINO LSTMSequence operation expects the weights for gates in the order [input, output, forget, cell]. The current code concatenates them in the order [forget, input, cell, output]. This should be [input, output, forget, cell].
| W = ov_opset.concat([k_f, k_i, k_c, k_o], axis=0).output(0) | |
| W = ov_opset.concat([k_i, k_o, k_f, k_c], axis=0).output(0) |
| r_i, r_f, r_c, r_o = ov_opset.split( | ||
| recurrent_kernel_T, ov_opset.constant(0, Type.i32), 4 | ||
| ).outputs() | ||
| R = ov_opset.concat([r_f, r_i, r_c, r_o], axis=0).output(0) |
There was a problem hiding this comment.
Similar to the kernel weights W, the concatenation order for recurrent weights R is incorrect. It should be [input, output, forget, cell] to match the OpenVINO LSTMSequence op specification.
| R = ov_opset.concat([r_f, r_i, r_c, r_o], axis=0).output(0) | |
| R = ov_opset.concat([r_i, r_o, r_f, r_c], axis=0).output(0) |
| b_i, b_f, b_c, b_o = ov_opset.split( | ||
| bias, ov_opset.constant(0, Type.i32), 4 | ||
| ).outputs() | ||
| B = ov_opset.concat([b_f, b_i, b_c, b_o], axis=0).output(0) |
There was a problem hiding this comment.
Similar to the weights, the concatenation order for biases B is incorrect. It should be [input, output, forget, cell] to match the OpenVINO LSTMSequence op specification.
| B = ov_opset.concat([b_f, b_i, b_c, b_o], axis=0).output(0) | |
| B = ov_opset.concat([b_i, b_o, b_f, b_c], axis=0).output(0) |
| zero_const = ov_opset.constant([0], dtype=Type.i32).output(0) | ||
| final_output_list = [ | ||
| ov_opset.unsqueeze(x, zero_const).output(0) for x in final_output_list | ||
| ] |
| recurrent_activation, | ||
| return_sequences=False, | ||
| go_backwards=False, | ||
| unroll=False, |
| ): | ||
| if mask is not None: | ||
| raise NotImplementedError("lstm sequence with mask is not supported") | ||
| mask = get_ov_output(mask) |
| def get_activation_name(act): | ||
|
|
||
| if act == activations.tanh: | ||
| return "tanh" | ||
| if act == activations.sigmoid: | ||
| return "sigmoid" | ||
| if act == activations.relu: | ||
| return "relu" | ||
| if act == activations.linear: | ||
| return "linear" | ||
| if hasattr(act, "__name__"): | ||
| return act.__name__ | ||
| return "tanh" |
There was a problem hiding this comment.
This function can be made more concise and maintainable by using a dictionary to map activation functions to their string names. This is a common pattern for this type of logic and would improve readability.
For example:
ACTIVATION_MAP = {
activations.tanh: "tanh",
activations.sigmoid: "sigmoid",
activations.relu: "relu",
activations.linear: "linear",
}
def get_activation_name(act):
name = ACTIVATION_MAP.get(act)
if name:
return name
if hasattr(act, "__name__"):
return act.__name__
return "tanh"| if not return_sequences: | ||
| if time_major: | ||
| outputs = ov_opset.unsqueeze( | ||
| Ho, ov_opset.constant([0], Type.i32) | ||
| ).output(0) | ||
| else: | ||
| outputs = ov_opset.unsqueeze( | ||
| Ho, ov_opset.constant([1], Type.i32) | ||
| ).output(0) |
There was a problem hiding this comment.
|
@andersendsa Is there an email on which I could reach out to you?
…On Mon, 9 Mar 2026 at 02:45, andersendsa ***@***.***> wrote:
*andersendsa* left a comment (keras-team/keras#22313)
<#22313 (comment)>
hello @goyaladitya05 <https://github.com/goyaladitya05> i have been
contributing to for quite some time now you can look at my initial prs i
hadnt used jules bot as i have been contributing to this project
agressively i had started using it for help i will stop reliance towards it
and make sure to pull clean prs thank you
—
Reply to this email directly, view it on GitHub
<#22313 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AZTJQNI6N2Y7XQNNLWUWEEL4PXPHDAVCNFSM6AAAAACWCUJF2CVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHM2DAMRQGAZDMMZUG4>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
Details :
Implement LSTM operation for the OpenVINO backend.
Closes issue : openvinotoolkit/openvino#34403