Publish per-language toxicity thresholds for reproducibility #5

@fedenanni

Hi all,
Could you publish the fixed per-language toxicity score thresholds used by the ToxicityBinaryClassifierFilter in the Apertus pipeline?

The Apertus paper (Section 3.1.3) states: "We filter the 5% of documents per language with the highest predicted toxicity scores from the pretraining corpus." The ToxicityBinaryClassifierFilter class in this repo accepts a threshold parameter that corresponds to the 95th percentile pre-computed on the full FineWeb/FineWeb-2 corpus.

However, I couldn't find the actual threshold values published anywhere, which makes it hard to reproduce the filtering exactly.
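
For context, this is a minimal sketch of how I understand such thresholds would be derived, so it is clear which numbers I am asking for. It is not the Apertus implementation: the function names `per_language_thresholds` and `is_filtered` are hypothetical, the nearest-rank percentile definition is my choice, and I am assuming a document is dropped when its score is strictly above the threshold.

```python
from collections import defaultdict


def per_language_thresholds(records, percentile=95):
    """Return {language: threshold} from (language, toxicity_score) pairs.

    Uses the nearest-rank percentile: the smallest observed score such that
    at least `percentile` percent of that language's documents score at or
    below it. This definition is an assumption, not the documented method.
    """
    scores_by_lang = defaultdict(list)
    for lang, score in records:
        scores_by_lang[lang].append(score)

    thresholds = {}
    for lang, scores in scores_by_lang.items():
        scores.sort()
        # Integer ceiling of percentile * n / 100, avoiding float rounding.
        rank = max(1, -(-percentile * len(scores) // 100))
        thresholds[lang] = scores[rank - 1]
    return thresholds


def is_filtered(score, threshold):
    # Assumption: documents strictly above the threshold are removed.
    return score > threshold


# Toy data: 20 English documents with scores 0.00, 0.05, ..., 0.95.
records = [("en", i / 20) for i in range(20)]
thresholds = per_language_thresholds(records)
removed = [s for _, s in records if is_filtered(s, thresholds["en"])]
```

On the toy data, one of the 20 documents (5%) ends up above the threshold. Recomputing this on the full corpus is expensive and sensitive to the percentile definition, which is why published values would help.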

Thanks in advance and great work on this project!
