
Fix Moonshine LiteRT export: Handle boolean masks in test runner #2555

Open
MAUK9086 wants to merge 2 commits into keras-team:master from MAUK9086:fix-issue-2493

Conversation

@MAUK9086

Description of the change

This PR fixes a bug in the LiteRT export testing infrastructure that prevented models with boolean inputs (like Moonshine) from being tested correctly.

The Issue:
The run_litert_export_test utility in test_case.py was incorrectly casting all boolean inputs to int32 before passing them to the TFLite interpreter. This caused a type mismatch error (ValueError: Cannot set tensor: Got value of type INT32 but expected type BOOL) for models that strictly require boolean masks.
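The mismatch can be illustrated without a TFLite interpreter. The sketch below is only an approximation: `strict_set_tensor` is a hypothetical stand-in for the interpreter's dtype check, not a LiteRT API, and the mask values are made up.

```python
import numpy as np


def strict_set_tensor(expected_dtype, value):
    """Hypothetical stand-in for an input slot that enforces a fixed dtype."""
    value = np.asarray(value)
    if value.dtype != np.dtype(expected_dtype):
        raise ValueError(
            f"Cannot set tensor: got {value.dtype} but expected "
            f"{np.dtype(expected_dtype)}"
        )
    return value


# A boolean padding mask of the kind the PR description refers to.
mask = np.array([[True, True, False]])

# Old behaviour described above: boolean inputs are cast to int32.
cast_mask = mask.astype("int32")

try:
    strict_set_tensor(bool, cast_mask)
    mismatch = None
except ValueError as err:
    mismatch = str(err)

# Passing the mask through unchanged satisfies the dtype check.
accepted = strict_set_tensor(bool, mask)
```

The cast itself loses no information (True becomes 1), which is presumably why it went unnoticed until a model with strictly typed boolean inputs was exported.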

The Fix:

  1. Updated test_case.py: Modified convert_for_tflite to preserve bool data types for both NumPy arrays and TensorFlow tensors, instead of forcing a cast to int32.
  2. Enabled Moonshine Test: Removed the @pytest.mark.skip from test_litert_export in moonshine_audio_to_text_test.py.
  3. Updated Test Inputs: Refactored the test to use keras.ops.convert_to_numpy and explicitly ensure masks are cast to bool, resolving the "mixing tensors and non-tensors" validation error.
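As a rough sketch of the first step, a dtype-preserving conversion might look like the following. This is not the code from test_case.py: `convert_for_tflite_sketch` is a hypothetical name, the input key `token_ids` is made up, and the int32 downcast for other integer inputs is an assumption added only to contrast with the boolean case.

```python
import numpy as np


def convert_for_tflite_sketch(x):
    """Illustrative only; the real helper lives in test_case.py."""
    if isinstance(x, dict):
        # Convert each entry of a dict of inputs independently.
        return {k: convert_for_tflite_sketch(v) for k, v in x.items()}
    arr = np.asarray(x)
    if arr.dtype == np.bool_:
        # Keep boolean masks as bool instead of casting to int32.
        return arr
    if np.issubdtype(arr.dtype, np.integer):
        # Assumption: other integer inputs are narrowed to int32.
        return arr.astype(np.int32)
    return arr


inputs = {
    "encoder_padding_mask": np.array([[True, True, False]]),
    "token_ids": np.array([[1, 2, 3]], dtype="int64"),  # made-up key
}
converted = convert_for_tflite_sketch(inputs)
```

Checking for bool before the integer branch keeps the two cases explicit, so a later change to the integer handling cannot silently affect masks.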

Reference

Fixes #2493

Colab Notebook

N/A (Bug fix for existing test infrastructure)

Checklist

  • I have added all the necessary unit tests for my change.
  • I have verified that my change does not break existing code and works with all backends (TensorFlow, JAX, and PyTorch).
  • My PR is based on the latest changes of the main branch (if unsure, rebase the code).
  • I have followed the Keras Hub Model contribution guidelines in making these changes.
  • I have followed the Keras Hub API design guidelines in making these changes.
  • I have signed the Contributor License Agreement.

@gemini-code-assist
Contributor

Summary of Changes

Hello @MAUK9086, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical bug in the LiteRT export testing framework where boolean input types were being incorrectly converted to integers, leading to test failures for models that strictly rely on boolean masks. By ensuring that boolean types are preserved during conversion for TFLite and by refining the test data preparation for models like Moonshine, this change enables accurate and robust testing of such models within the LiteRT environment.

Highlights

  • Bug Fix for LiteRT Export Testing: Resolves an issue in the LiteRT export testing infrastructure that prevented models requiring boolean inputs (like Moonshine) from being tested correctly.
  • Boolean Type Preservation: The convert_for_tflite utility in test_case.py has been updated to correctly preserve boolean data types for both NumPy arrays and TensorFlow tensors, instead of incorrectly casting them to int32.
  • Moonshine Test Re-enabled: The @pytest.mark.skip decorator has been removed from the test_litert_export in moonshine_audio_to_text_test.py, allowing the Moonshine model's LiteRT export to be tested.
  • Refactored Test Inputs: Test inputs for Moonshine's LiteRT export test are now explicitly converted to NumPy and padding masks are cast to boolean to prevent type mismatch errors and 'mixing tensors and non-tensors' validation issues.



Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request correctly fixes a bug in the LiteRT export test utility that was improperly casting boolean inputs to integers. The changes in test_case.py to preserve boolean dtypes are a good general fix, and the updates to the Moonshine test enable it to pass with the corrected infrastructure. I have one suggestion to improve the conciseness of the test data preparation logic.

Comment on lines +154 to +167
input_data = {}
for k, v in self.input_data.items():
    input_data[k] = ops.convert_to_numpy(v)

# 2. Force masks to boolean
if "encoder_padding_mask" in input_data:
    input_data["encoder_padding_mask"] = np.array(
        input_data["encoder_padding_mask"], dtype=bool
    )

if "decoder_padding_mask" in input_data:
    input_data["decoder_padding_mask"] = np.array(
        input_data["decoder_padding_mask"], dtype=bool
    )

medium

This data preparation logic can be made more concise and less repetitive by using a dictionary comprehension for the initial conversion and a loop to handle the type casting for all relevant mask keys. This improves readability and maintainability.

Suggested change
- input_data = {}
- for k, v in self.input_data.items():
-     input_data[k] = ops.convert_to_numpy(v)
- # 2. Force masks to boolean
- if "encoder_padding_mask" in input_data:
-     input_data["encoder_padding_mask"] = np.array(
-         input_data["encoder_padding_mask"], dtype=bool
-     )
- if "decoder_padding_mask" in input_data:
-     input_data["decoder_padding_mask"] = np.array(
-         input_data["decoder_padding_mask"], dtype=bool
-     )
+ input_data = {k: ops.convert_to_numpy(v) for k, v in self.input_data.items()}
+ # 2. Force masks to boolean
+ for mask_key in ("encoder_padding_mask", "decoder_padding_mask"):
+     if mask_key in input_data:
+         input_data[mask_key] = input_data[mask_key].astype(bool)
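A quick way to check that the suggested loop behaves the same as the explicit version is to run both on identical inputs. The sketch below uses made-up arrays in place of self.input_data (already converted to NumPy), and `other_input` is a hypothetical non-mask key.

```python
import numpy as np

# Made-up inputs standing in for self.input_data after NumPy conversion.
raw = {
    "encoder_padding_mask": np.array([[1, 1, 0]], dtype="int32"),
    "decoder_padding_mask": np.array([[1, 0, 0]], dtype="int32"),
    "other_input": np.array([[0.5, 0.25]], dtype="float32"),
}

# Explicit form from the PR: one block per mask key.
explicit = dict(raw)
if "encoder_padding_mask" in explicit:
    explicit["encoder_padding_mask"] = np.array(
        explicit["encoder_padding_mask"], dtype=bool
    )
if "decoder_padding_mask" in explicit:
    explicit["decoder_padding_mask"] = np.array(
        explicit["decoder_padding_mask"], dtype=bool
    )

# Suggested form: loop over the mask keys and cast with astype.
looped = dict(raw)
for mask_key in ("encoder_padding_mask", "decoder_padding_mask"):
    if mask_key in looped:
        looped[mask_key] = looped[mask_key].astype(bool)

# Both forms should agree on dtype and values for every key.
same = all(
    explicit[k].dtype == looped[k].dtype
    and np.array_equal(explicit[k], looped[k])
    for k in raw
)
```

One difference worth noting: `.astype(bool)` requires the value to already be a NumPy array, which should hold here because every entry has passed through ops.convert_to_numpy, whereas `np.array(..., dtype=bool)` also accepts plain lists.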


Development

Successfully merging this pull request may close these issues.

MoonshineAudioToText LiteRT export fails with data type mismatch for encoder_padding_mask
