adapting the script movielens_recommendations_transformers.py to be Backend-Agnostic #2039

Merged
hertschuh merged 6 commits into keras-team:master from Humbulani1234:adapt_movielens_recommendations_transformers
Feb 26, 2025

Conversation

@Humbulani1234
Contributor

This PR adapts the script movielens_recommendations_transformers.py to be backend-agnostic.

Contributor

@fchollet fchollet left a comment

Thanks for the PR!

encoded_other_features = []

# Helper function to create embeddings
def embedding_helper(input):
Contributor

Please move this to be a method on the class

Contributor Author

Moved.


# This function merely groups similar logic for include_movie_features=True,
# or given as include_movie_features=False
def movie_sequence_helper(encoded_sequence_movies):
Contributor

Same here

Contributor Author

Moved

# or given as include_movie_features=False
def movie_sequence_helper(encoded_sequence_movies):
    # Create positional embedding.
    positions = keras.ops.arange(start=0, stop=sequence_length - 1, step=1)
Contributor

Instead of using keras.ops.* everywhere, just do from keras import ops at the start -- it will shorten many lines

Contributor Author

Resolved

output_dim=movie_embedding_dims,
name="position_embedding",
)
positions = tf.range(start=0, limit=sequence_length - 1, delta=1)
Contributor

Did the old code really need extensive refactoring? As best I can tell, only this line needed to change (to become ops.arange()).

Contributor Author

@Humbulani1234 Humbulani1234 Jan 30, 2025

Changing only tf.range to ops.arange does not work. I believe this is because, in the original script, StringLookup is part of the model rather than of the tf.data.Dataset pipeline, and only TensorFlow can handle strings.

So my approach was to split the function encode_input_features into two parts: StringLookups and Embeddings. I moved the StringLookup functionality into the tf.data.Dataset processing step and created a class to handle the embeddings for each input. The requirement that the script must run whether or not one chooses to include user_features also adds some refactoring.

However, if there is an approach that requires less refactoring, I'm willing to learn and implement it. I must also confess that the refactoring felt a bit extensive, but it was required by my approach.

@Humbulani1234
Contributor Author

Humbulani1234 commented Jan 30, 2025

The PR now addresses the requested code changes, and I have replied to the review comments.

@Humbulani1234
Contributor Author

Indeed, there was extensive refactoring. I've implemented the following new approach:

  • Kept the old function encode_input_features; and
  • Removed only the StringLookup functionality from that function and moved it into the tf.data.Dataset processing.

This PR now implements these changes.
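
The approach listed above can be sketched roughly as follows. This is not the code from the PR: the vocabulary, the feature name movie_id, and the embedding size are hypothetical, and the sketch assumes TensorFlow is installed so that tf.data is available.

```python
import keras
import tensorflow as tf

# Hypothetical vocabulary and feature name, for illustration only.
movie_vocabulary = ["movie_1", "movie_2", "movie_3"]

# The string-to-index lookup lives outside the model.
movie_lookup = keras.layers.StringLookup(
    vocabulary=movie_vocabulary, mask_token=None, num_oov_indices=0
)


def encode_strings(features):
    # Runs inside the tf.data pipeline, where string tensors are supported.
    features = dict(features)
    features["movie_id"] = movie_lookup(features["movie_id"])
    return features


raw_dataset = tf.data.Dataset.from_tensor_slices(
    {"movie_id": ["movie_2", "movie_3", "movie_1"]}
)
dataset = raw_dataset.map(encode_strings).batch(3)

# The model side only ever sees integer indices.
movie_embedding = keras.layers.Embedding(
    input_dim=len(movie_vocabulary), output_dim=4
)

batch = next(iter(dataset))
ids = batch["movie_id"].numpy()
embedded = movie_embedding(ids)
print(ids.tolist(), tuple(embedded.shape))
```

Because the dataset already yields integer indices, the embedding layers in the model never receive string tensors, which is the property the backend-agnostic version relies on.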

@divyashreepathihalli
Collaborator

Thanks @Humbulani1234! LGTM, can you please generate the .ipynb and .md files?

@Humbulani1234
Contributor Author

Generated .md and .ipynb files.

@hertschuh hertschuh merged commit 98429b5 into keras-team:master Feb 26, 2025
3 checks passed