This is the end goal for the current project scope.
The goal here is to release a model with the following properties:
- Truly open-source
- 3.3B dense
- Supports all 202 NLLB languages in both directions (any-to-any, i.e. 202 × 201 = 40,602 translation directions)
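
For reference, a model with these properties should be usable much like Meta's existing dense NLLB checkpoints on Hugging Face. The sketch below uses the existing `facebook/nllb-200-3.3B` checkpoint as a stand-in for the planned release, which is an assumption about the final interface, not a description of it:

```python
# Minimal usage sketch, assuming the released checkpoint exposes the same
# interface as Meta's existing dense NLLB checkpoints on Hugging Face.
# "facebook/nllb-200-3.3B" is Meta's existing model, used here as a stand-in.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "facebook/nllb-200-3.3B"
tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

text = "Open-source machine translation for 202 languages."
inputs = tokenizer(text, return_tensors="pt")

# Force the decoder to start with the target-language token (French here);
# any of the 202 NLLB language codes can be used as source or target.
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("fra_Latn"),
    max_length=64,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```
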
Note: it will be very hard to reach a satisfactory level of quality across 202 languages with a dense checkpoint. The original work from Meta used a ~54B-parameter MoE (mixture-of-experts) model to get decent results, plus a lot of compute (~52k hours on A100-SXM-80GB GPUs).
We do have plans to scale beyond 3.3B parameters.