Commit d099213

Update 2024-12-05-tattle-mlcommons.mdx
1 parent e7aff10

File tree: 1 file changed (+2 −5 lines)

src/blog/2024-12-05-tattle-mlcommons.mdx

Lines changed: 2 additions & 5 deletions
@@ -14,15 +14,12 @@ We created 2000 prompts in Hindi on two hazard categories [^2]: hate and sex-rel
 These prompts were created by the expert group, which has expertise in journalism, social work, feminist advocacy, gender studies, fact-checking, political campaigning, education, psychology, and research. All of the experts were native or fluent Hindi speakers.
 
 The project took place over the course of 2 months, where we conducted online sessions with the experts organised into groups.
-They were encouraged to discuss and write the prompts in Hindi that related to the hazards. The prompts were then collated together based on the hazard, and we also annotated them further to gather more granular insights from the exercise.
+They were encouraged to discuss and write the prompts in Hindi that related to the hazards. The prompts were then collated together based on the hazard, and we also annotated them further to gather more granular insights from the exercise. Additionally, we engaged in a landscape analysis of LLM models and their coverage of Indian languages.
 
 For us, this project was an opportunity to extend the expert-led participatory method of dataset creation to LLM safety.
 
 MLCommons is now releasing the v1 Safety Benchmark dataset, AI Luminate. It is an important step in assessing the safety of LLMs.
 Our project provided interesting insights on the universality of the framework proposed in v0.5.
-We conclude our report, available [here](https://mlcommons.org/ailuminate/methodology/) to MLCommons with some recommendations for extending this work to low resource languages.
-In addition to contributing to AI Luminate, we also engaged in an extensive landscape analysis of large language models and their coverage of Indian languages.
-In the study, we looked at existing evaluation datasets and methodologies used to assess the performance of LLMs across various language tasks.
-For a set of models that support Indian languages, we also analyzed attributes such as the training data, the distribution of Indian languages within it, access, licensing, and the types of LLM.
+We conclude our report (available [here](https://mlcommons.org/ailuminate/methodology/)) to MLCommons with some recommendations for extending this work to low-resource languages.
 
 Take a look at AI Luminate [here](https://mlcommons.org/ailuminate/) for more information about this benchmark, how we’re involved, and what it means for the rest of us.
2825
