The documentation for `itoken` is silent about the data structure that is returned. It appears to be an R6 object with a few public functions and variables, but I cannot figure out what they are.
For context, I am trying to create one-hot encoded (long-vector) word embeddings for teaching/demonstration purposes. More specifically, I want to:
- load texts and create a vocabulary
- transform words into the corresponding one-hot encoded vectors
- combine nearby words into the corresponding word embeddings (using the one-hot vectors).
In a sense, this is equivalent to working with a DTM where each document is an individual word. Since such a DTM easily gets large, I am trying to find a way to iterate over individual words.
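
To make the goal concrete, here is a minimal, language-neutral sketch of the three steps, written in plain Python rather than R. It does not use text2vec or `itoken`; every function name (`tokenize`, `build_vocabulary`, `one_hot`, `iter_context_vectors`) is hypothetical, and the whitespace tokenizer and the `window` parameter are assumptions made only for illustration. The generator yields one word at a time, so the full word-by-vocabulary matrix is never held in memory.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def tokenize(text: str) -> List[str]:
    # Assumption: naive lowercase + whitespace tokenization.
    return text.lower().split()


def build_vocabulary(texts: Iterable[str]) -> Dict[str, int]:
    # Map each distinct word to an index in order of first appearance.
    vocab: Dict[str, int] = {}
    for text in texts:
        for word in tokenize(text):
            vocab.setdefault(word, len(vocab))
    return vocab


def one_hot(index: int, size: int) -> List[int]:
    # Dense one-hot vector: all zeros except a 1 at `index`.
    vec = [0] * size
    vec[index] = 1
    return vec


def iter_context_vectors(
    texts: Iterable[str], vocab: Dict[str, int], window: int = 1
) -> Iterator[Tuple[str, List[int]]]:
    # For each word occurrence, sum the one-hot vectors of its neighbours
    # within `window` positions (the word itself is excluded).
    size = len(vocab)
    for text in texts:
        tokens = tokenize(text)
        for i, word in enumerate(tokens):
            context = [0] * size
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    context[vocab[tokens[j]]] += 1
            yield word, context


texts = ["the cat sat", "the dog sat"]
vocab = build_vocabulary(texts)
for word, context in iter_context_vectors(texts, vocab, window=1):
    print(word, one_hot(vocab[word], len(vocab)), context)
```

Each yielded pair corresponds to one row of the "every word is a document" DTM described above, paired with the summed one-hot vectors of its neighbours.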