Structured State Space Models #234

@bonham79

(Lowest of the low priorities)

SSMs have been making the rounds, but so far people have mostly cared about them for 'major' tasks (NMT, speech, LLMs). Since they're essentially a special kind of recurrent model, in the same family as LSTMs, and we see better performance from LSTM-style models on our type of tasks, it may be fun to implement an SSM decoder and try it out.
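For anyone who hasn't looked at these, here is a rough sketch of the linear recurrence at the core of an SSM layer, which is what makes it feel like a stripped-down RNN. Everything here (`DiagonalSSM`, `step`, `run`, the parameter values) is made up for illustration and is not code from this repo. Real SSM layers such as S4 or Mamba add discretization, learned parameterization, and parallel scans on top of this.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DiagonalSSM:
    """Toy single-channel SSM with a diagonal state matrix (hypothetical)."""

    a: List[float]  # diagonal of the state transition; keep |a_i| < 1 for stability
    b: List[float]  # input projection
    c: List[float]  # output projection

    def step(self, state: List[float], x: float) -> Tuple[List[float], float]:
        # h_t = a * h_{t-1} + b * x_t (elementwise, because A is diagonal).
        new_state = [ai * hi + bi * x for ai, hi, bi in zip(self.a, state, self.b)]
        # y_t = c . h_t
        y = sum(ci * hi for ci, hi in zip(self.c, new_state))
        return new_state, y

    def run(self, xs: List[float]) -> List[float]:
        # Sequential scan; the state has a fixed size no matter how long xs is.
        state = [0.0] * len(self.a)
        ys = []
        for x in xs:
            state, y = self.step(state, x)
            ys.append(y)
        return ys


# Feed in a unit impulse and watch the output decay.
ssm = DiagonalSSM(a=[0.5, 0.9], b=[1.0, 1.0], c=[1.0, 0.5])
outputs = ssm.run([1.0, 0.0, 0.0])
print(outputs)
```

A decoder would call something like `step` once per generated symbol and carry `state` forward, much as an LSTM decoder carries its hidden and cell states.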

Beyond theoretical interest, they're supposed to be more memory efficient than transformers (the recurrent state has a fixed size, whereas attention has to keep around every previous position), so we can probably run some wicked batch sizes if they're implemented well.
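As a back-of-the-envelope illustration of the memory argument, the sketch below counts the floats a decoder has to keep per sequence at inference time: a transformer's key/value cache versus a fixed-size SSM state. The function names and the layer and dimension values are assumptions picked for the example, not measurements from this project, and the count ignores weights and training-time activations.

```python
def kv_cache_floats(num_layers: int, seq_len: int, d_model: int) -> int:
    # Keys and values for every past position, in every layer.
    return 2 * num_layers * seq_len * d_model


def ssm_state_floats(num_layers: int, d_model: int, state_size: int) -> int:
    # One fixed-size state per channel and layer; no dependence on seq_len.
    return num_layers * d_model * state_size


# Hypothetical sizes: 4 layers, 256-dim model, 16-dim SSM state per channel.
for seq_len in (32, 256, 2048):
    kv = kv_cache_floats(num_layers=4, seq_len=seq_len, d_model=256)
    ssm = ssm_state_floats(num_layers=4, d_model=256, state_size=16)
    print(f"seq_len={seq_len}: kv_cache={kv} floats, ssm_state={ssm} floats")
```

Under these assumptions the cache grows linearly with sequence length while the SSM state stays constant, which is where the room for larger batches would come from.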
