Questions on padding and masking with value=1000 #553
Description
The code below is from https://www.tensorflow.org/guide/keras/masking_and_padding
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
raw_inputs = [
[711, 632, 71],
[73, 8, 3215, 55, 927],
[83, 91, 1, 645, 1253, 927],
]
# By default, this will pad using 0s; it is configurable via the
# "value" parameter.
# Note that you could use "pre" padding (at the beginning) or
# "post" padding (at the end).
# We recommend using "post" padding when working with RNN layers
# (in order to be able to use the
# CuDNN implementation of the layers).
padded_inputs = tf.keras.preprocessing.sequence.pad_sequences(
    raw_inputs, padding="post", value=1000
)
print(padded_inputs)
The output is
[[ 711 632 71 1000 1000 1000]
[ 73 8 3215 55 927 1000]
[ 83 91 1 645 1253 927]]
Then, I want to create an embedding layer.
embedding = layers.Embedding(input_dim=5000, output_dim=16, mask_zero=True)
masked_output = embedding(padded_inputs)
print(masked_output._keras_mask)
With value=0, I get the correct mask:
tf.Tensor(
[[ True True True False False False]
[ True True True True True False]
[ True True True True True True]], shape=(3, 6), dtype=bool)
The problem is with value=1000, where the output is:
tf.Tensor(
[[ True True True True True True]
[ True True True True True True]
[ True True True True True True]], shape=(3, 6), dtype=bool)
That is not what I want.
So how can I make the embedding layer treat value=1000 in the padded inputs as padding, please?
Many thanks.
Kai
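For context on the all-True mask above: `mask_zero=True` makes the Embedding layer treat only index 0 as padding, so a padding value of 1000 is seen as an ordinary token. One possible workaround, sketched here with NumPy only, is to build the boolean mask by hand from the padding value. `PAD_VALUE` is a name introduced for illustration, and the sketch assumes 1000 never occurs as a real token ID.

```python
import numpy as np

PAD_VALUE = 1000  # padding value from the question (illustrative constant name)

# The padded batch exactly as printed in the question.
padded_inputs = np.array([
    [711, 632, 71, 1000, 1000, 1000],
    [73, 8, 3215, 55, 927, 1000],
    [83, 91, 1, 645, 1253, 927],
])

# True marks a real token, False marks a padding position.
mask = padded_inputs != PAD_VALUE
print(mask)
```

Such a mask could then be passed explicitly through the `mask` argument that mask-aware layers such as `layers.LSTM` accept when called, as described in the same guide. In that case the embedding layer would not need `mask_zero=True`.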
Activity
Aygle commented on Jun 3, 2022
padded_inputs = tf.keras.preprocessing.sequence.pad_sequences(
    raw_inputs, padding="post", value=0
)
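The reply above switches the padding value back to 0, which is the only value `mask_zero=True` recognizes. If the sequences have already been padded with 1000, a rough alternative is to remap that value to 0 before the embedding layer. The helper `remap_padding` below is hypothetical (not part of Keras) and assumes that neither 0 nor 1000 is used as a real token ID.

```python
import numpy as np


def remap_padding(padded, pad_value=1000):
    """Return a copy of `padded` with every `pad_value` entry replaced by 0.

    Hypothetical helper: only safe if 0 and `pad_value` are never real token IDs.
    """
    padded = np.asarray(padded)
    return np.where(padded == pad_value, 0, padded)


# The batch from the question, padded with 1000.
padded_inputs = np.array([
    [711, 632, 71, 1000, 1000, 1000],
    [73, 8, 3215, 55, 927, 1000],
    [83, 91, 1, 645, 1253, 927],
])

zero_padded = remap_padding(padded_inputs)
print(zero_padded)
```

The remapped array has zeros in the padding positions, so an embedding layer created with `mask_zero=True` should produce the mask shown for value=0 in the question.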