[Modeling] Release a 3.3B Open-NLLB checkpoint (~202 languages) #17

@gordicaleksa

Description

This is the end goal for the current project scope.

The goal here is to release a model with the following properties:

  • Truly open-source
  • 3.3B dense parameters
  • Supports all 202 NLLB languages in both directions (see the usage sketch below)
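
For context on what "both directions" means in practice: assuming the released checkpoint exposes the same Hugging Face `transformers` seq2seq interface as Meta's NLLB models, any translation direction is selected by picking source and target FLORES-200 language codes. A minimal sketch, using `facebook/nllb-200-3.3B` as a stand-in for the eventual Open-NLLB checkpoint name (which is not decided in this issue):

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Stand-in: Meta's public 3.3B NLLB checkpoint; the Open-NLLB
# checkpoint name is an open question at this point.
model_name = "facebook/nllb-200-3.3B"

# Source/target languages use FLORES-200 codes, e.g. eng_Latn, fra_Latn.
tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_name, torch_dtype=torch.float16)

inputs = tokenizer("The weather is nice today.", return_tensors="pt")
generated = model.generate(
    **inputs,
    # Force the decoder to start generating in the target language.
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("fra_Latn"),
    max_new_tokens=64,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```

Swapping `src_lang` and the forced BOS token code gives the reverse direction, which is all "both directions" requires at inference time for a single dense checkpoint.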

Note: it will be very hard to reach a satisfactory level of quality across 202 languages with a dense checkpoint. The original work from Meta used a ~54B-parameter MoE (mixture of experts) model to get decent results, plus a ton of compute (~52k GPU-hours on A100-SXM-80GB).

That said, we do have plans to scale beyond 3.3B parameters.
