Description
Issue filed in Keras by @nicdumz - keras-team/keras#18973
The documentation for output_mode currently describes both of these modes as producing an int array:
"multi_hot": Outputs a single int array per batch, of either vocab_size or max_tokens size, containing 1s in all elements where the token mapped to that index exists at least once in the batch item.
"count": Like "multi_hot", but the int array contains a count of the number of times the token at that index appeared in the batch item.
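To make the documented semantics concrete, here is a minimal plain-Python sketch of what the quoted text describes for a single batch item. It is not how Keras implements the layer: the vocabulary `VOCAB` and the helpers `count_vector` and `multi_hot_vector` are hypothetical names for illustration, and the real layer's vocabulary handling (for example, out-of-vocabulary tokens) is ignored here.

```python
from collections import Counter

# Hypothetical vocabulary for illustration only; it does not reproduce
# how TextVectorization orders or reserves vocabulary indices.
VOCAB = ["foo", "bar", "baz"]


def count_vector(text, vocab=VOCAB):
    """Integer count of each vocabulary token in one whitespace-split item."""
    counts = Counter(text.split())
    return [counts[token] for token in vocab]


def multi_hot_vector(text, vocab=VOCAB):
    """1 where the token appears at least once in the item, else 0."""
    return [min(c, 1) for c in count_vector(text, vocab)]


print(count_vector("bar baz bar"))      # [0, 2, 1]
print(multi_hot_vector("bar baz bar"))  # [0, 1, 1]
```

Both helpers return Python ints, which is the dtype a reader would expect from the wording "int array".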
repro
import tensorflow as tf, tensorflow.version as tv
print(f"{tv.VERSION}, {tv.COMPILER_VERSION}, {tv.GIT_VERSION}")
v = tf.keras.layers.TextVectorization(output_mode="count")
v.adapt(["foo", "bar", "baz"])
print(v(["bar baz"]).dtype)
output
2.15.0, Ubuntu Clang 17.0.2 (++20231003073124+b2417f51dbbd-1~exp1~20231003073217.50), v2.15.0-2-g0b15fdfcb3f
<dtype: 'float32'>