Allow multiple tokens per feature data row #99

@de-code

This is carried over from #90 (comment)

The segmentation data uses the first two tokens of each line, so it would make sense to have an option in DeLFT to use both. Currently DeLFT only uses the first token.

Potential solution:

  • an option to specify the columns with the tokens (similar to the features)
  • concatenate the word embeddings and other token related vectors

This will probably require changes in a few places that currently expect a single token as input.
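
To make the proposal concrete, here is a minimal sketch of the idea, not DeLFT code. It assumes whitespace-separated columns and a plain dict as the embedding lookup. The names `extract_tokens`, `embed_row` and `token_indices` are hypothetical, and the sample row is made up for illustration.

```python
from typing import Dict, List, Sequence


def extract_tokens(row: str, token_indices: Sequence[int]) -> List[str]:
    """Return the columns of a data row that should be treated as tokens."""
    columns = row.split()  # assumption: columns are whitespace-separated
    return [columns[i] for i in token_indices]


def embed_row(
    row: str,
    token_indices: Sequence[int],
    embeddings: Dict[str, List[float]],
    dim: int,
) -> List[float]:
    """Concatenate the embeddings of all token columns into one vector."""
    unknown = [0.0] * dim  # assumption: zero vector for out-of-vocabulary tokens
    vector: List[float] = []
    for token in extract_tokens(row, token_indices):
        vector.extend(embeddings.get(token, unknown))
    return vector


# Toy 2-dimensional embeddings and a made-up row: two tokens, then features.
embeddings = {"Deep": [0.1, 0.2], "learning": [0.3, 0.4]}
row = "Deep learning feature1 feature2"
vec = embed_row(row, token_indices=[0, 1], embeddings=embeddings, dim=2)
print(len(vec))  # one embedding per token column, concatenated
```

With this approach the input dimension of the model grows to `dim * len(token_indices)`, which is one of the places that would need to stop assuming a single token.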

/cc @kermitt2 @lfoppiano
