Open
Description
When a word is not part of the original GloVe dataset, get_embedding currently returns an empty array:
def get_embedding(word):
    try:
        return vectors[indexes[word]]
    except KeyError:
        return []

Instead, it should return an array of zeros whose size matches the dimension of the dataset's vectors:
import numpy as np

def get_embedding(word):
    try:
        return vectors[indexes[word]]
    except KeyError:
        # Unknown word: fall back to a zero vector of the embedding size
        return np.zeros(50)

I'm using numpy here because I run this in a real Python environment for preprocessing, but a float array of size 50 filled with zeros would work just as well.
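As a rough illustration of the non-numpy option, here is a minimal sketch that uses the standard-library `array` module. The helper name `zero_embedding` is hypothetical, and reading "float array" as a typed array of single-precision floats is an assumption; the target environment may offer a different type.

```python
from array import array


def zero_embedding(size=50):
    """Return a typed array of `size` single-precision zeros (no numpy needed)."""
    # "f" is the typecode for C floats; the default of 50 matches the
    # vector size used in this issue.
    return array("f", [0.0] * size)


fallback = zero_embedding()
```

The `except KeyError` branch could then return `zero_embedding(50)` in place of `np.zeros(50)`.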
This solves issues like:
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 0 and the array at index 2 has size 50
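To make the failure and the fix concrete, here is a small self-contained sketch. The `indexes` and `vectors` tables below are toy stand-ins filled with random numbers, not the real GloVe data, and deriving the fallback size from `vectors.shape[1]` rather than hard-coding 50 is a suggestion that goes slightly beyond the change proposed above.

```python
import numpy as np

# Toy stand-ins for the real GloVe tables (hypothetical data, 50 dimensions).
EMBEDDING_DIM = 50
indexes = {"cat": 0, "dog": 1}
vectors = np.random.default_rng(0).normal(size=(len(indexes), EMBEDDING_DIM))


def get_embedding(word):
    """Return the vector for `word`, or a zero vector if it is out of vocabulary."""
    try:
        return vectors[indexes[word]]
    except KeyError:
        # Take the size from the table so the fallback always matches it.
        return np.zeros(vectors.shape[1], dtype=vectors.dtype)


sentence = ["cat", "unicorn", "dog"]  # "unicorn" is not in the toy vocabulary
# Every row now has the same length, so stacking no longer raises ValueError.
matrix = np.vstack([get_embedding(w) for w in sentence])
```

With the old `return []` fallback, the same `np.vstack` call fails because the empty row has length 0 while the other rows have length 50.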