Fixes to the "English-to-Spanish Translation with a Sequence-to-Sequence Transformer" Code Example (#1997)
* bugfix: Encoder and decoder inputs were flipped.
Even after 30 epochs of training, the model never produced sensible output. These are examples:
1) Tom didn't like Mary. → [start] ha estoy qué
2) Tom called Mary and canceled their date. → [start] sola qué yo pasatiempo visto campo
When fitting the model, the following relevant warning was emitted:
```
UserWarning: The structure of `inputs` doesn't match the expected structure: ['encoder_inputs', 'decoder_inputs']. Received: the structure of inputs={'encoder_inputs': '*', 'decoder_inputs': '*'}
```
After the fix, the model outputs sentences that are close to proper Spanish:
1) That's what Tom told me. → [start] eso es lo que tom me dijo [end]
2) Does Tom like cheeseburgers? → [start] a tom le gustan las queso de queso [end]
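A minimal sketch of the corrected wiring, assuming the example builds the model from named `encoder_inputs`/`decoder_inputs` tensors and feeds it from a `tf.data` pipeline whose dict keys match those names (the `format_dataset` helper below is illustrative):

```python
from tensorflow import keras

# Named symbolic inputs. The names must match the keys yielded by the dataset;
# if they do not, Keras warns that the structure of `inputs` does not match
# the expected structure, and the source/target sequences can end up swapped.
encoder_inputs = keras.Input(shape=(None,), dtype="int64", name="encoder_inputs")
decoder_inputs = keras.Input(shape=(None,), dtype="int64", name="decoder_inputs")

# ... the example's encoder/decoder stacks would be built from these inputs ...

# The data pipeline must feed the English tokens to the encoder and the
# offset Spanish tokens to the decoder, keyed by the same input names:
def format_dataset(eng, spa):
    return (
        {
            "encoder_inputs": eng,          # source sentence -> encoder
            "decoder_inputs": spa[:, :-1],  # shifted target -> decoder
        },
        spa[:, 1:],                         # next-token targets for the loss
    )
```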
* Fix compute_mask in PositionalEmbedding
The check essentially disables the mask calculation: this layer is the first one to receive the input, so it never has a previous mask.
With this change, the mask is now passed on to the encoder.
This looks like a regression; the initial commit of the example is very similar to this change.
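A sketch of the restored behavior, assuming `PositionalEmbedding` is the layer that turns raw token IDs into embeddings and that padding uses token ID 0 (the constructor arguments mirror the example but are illustrative here):

```python
import tensorflow as tf
from tensorflow.keras import layers

class PositionalEmbedding(layers.Layer):
    """Token + position embeddings that also emit a padding mask."""

    def __init__(self, sequence_length, vocab_size, embed_dim, **kwargs):
        super().__init__(**kwargs)
        self.token_embeddings = layers.Embedding(vocab_size, embed_dim)
        self.position_embeddings = layers.Embedding(sequence_length, embed_dim)

    def call(self, inputs):
        positions = tf.range(start=0, limit=tf.shape(inputs)[-1], delta=1)
        return self.token_embeddings(inputs) + self.position_embeddings(positions)

    def compute_mask(self, inputs, mask=None):
        # This layer sees the raw token IDs first, so `mask` is always None
        # here; guarding with `if mask is None: return None` would disable
        # masking entirely. Always derive the padding mask from the token IDs.
        return tf.math.not_equal(inputs, 0)
```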
* Propagate both encoder/decoder-sequence masks to the decoder
As per https://github.com/tensorflow/tensorflow/blob/6550e4bd80223cdb8be6c3afd1f81e86a4d433c3/tensorflow/python/keras/engine/base_layer.py#L965, the inputs should be passed as a list, not as kwargs. When this is done, both masks are received as a tuple in the `mask` argument.
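A sketch of that calling convention, assuming a custom `TransformerDecoder` layer; everything apart from the list-style call and the tuple-shaped `mask` argument is illustrative:

```python
from tensorflow.keras import layers

class TransformerDecoder(layers.Layer):
    """Skeleton showing how both masks arrive when the layer is called with a list."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.supports_masking = True

    def call(self, inputs, mask=None):
        # `inputs` is [decoder embeddings, encoder outputs]; because the layer
        # is called with a single list, Keras packs the per-input padding masks
        # into a tuple instead of dropping the encoder's mask.
        decoder_sequence, encoder_outputs = inputs
        decoder_mask, encoder_mask = mask if mask is not None else (None, None)
        # ... self-attention and cross-attention would consume both masks here ...
        return decoder_sequence

# Call site: pass a single list, not keyword arguments.
# decoder_outputs = TransformerDecoder()([x, encoder_outputs])
```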
* Apply both padding masks in the attention layers and during loss computation
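A sketch of how the propagated masks might then be consumed, assuming Keras `MultiHeadAttention` layers inside the decoder and a padding-aware loss passed to `compile()`; the helper names below are illustrative, not the example's actual code:

```python
import tensorflow as tf
from tensorflow import keras

def padding_attention_mask(query_mask, key_mask):
    """Combine query/key padding masks into a (batch, T_q, T_k) boolean mask
    usable as `attention_mask` in `keras.layers.MultiHeadAttention`."""
    q = tf.cast(query_mask[:, :, tf.newaxis], tf.bool)  # (batch, T_q, 1)
    k = tf.cast(key_mask[:, tf.newaxis, :], tf.bool)    # (batch, 1, T_k)
    return tf.logical_and(q, k)

# Decoder self-attention would use padding_attention_mask(decoder_mask, decoder_mask)
# (combined with the causal mask); cross-attention would use
# padding_attention_mask(decoder_mask, encoder_mask).

def masked_sparse_categorical_crossentropy(y_true, y_pred):
    """Per-token cross-entropy that ignores padded target positions (token ID 0)."""
    loss_fn = keras.losses.SparseCategoricalCrossentropy(reduction="none")
    per_token = loss_fn(y_true, y_pred)                            # (batch, T)
    pad_mask = tf.cast(tf.not_equal(y_true, 0), per_token.dtype)
    return tf.reduce_sum(per_token * pad_mask) / tf.maximum(tf.reduce_sum(pad_mask), 1.0)

# transformer.compile(optimizer="rmsprop",
#                     loss=masked_sparse_categorical_crossentropy,
#                     metrics=["accuracy"])
```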
* Regenerate ipynb/md-files for NMT example