Higher val loss when using E2D2 shared KV cache backbone vs. automodel for BD3LM baseline #74

@mariannearr

Description

No description provided.

Metadata

Assignees

Labels

bug: Something isn't working

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
