src/blog/2024-12-05-tattle-mlcommons.mdx (+2 −5: 2 additions, 5 deletions)

@@ -14,15 +14,12 @@ We created 2000 prompts in Hindi on two hazard categories [^2]: hate and sex-rel
These prompts were created by the expert group, which has expertise in journalism, social work, feminist advocacy, gender studies, fact-checking, political campaigning, education, psychology, and research. All of the experts were native or fluent Hindi speakers.
The project took place over the course of 2 months, where we conducted online sessions with the experts organised into groups.
-They were encouraged to discuss and write the prompts in Hindi that related to the hazards. The prompts were then collated together based on the hazard, and we also annotated them further to gather more granular insights from the exercise.
+They were encouraged to discuss and write prompts in Hindi related to the hazards. The prompts were then collated by hazard, and we annotated them further to gather more granular insights from the exercise. Additionally, we engaged in a landscape analysis of LLMs and their coverage of Indian languages.
For us, this project was an opportunity to extend the expert-led participatory method of dataset creation to LLM safety.
MLCommons is now releasing the v1 Safety Benchmark dataset, AI Luminate. It is an important step in assessing the safety of LLMs.
Our project provided interesting insights on the universality of the framework proposed in v0.5.
-We conclude our report, available [here](https://mlcommons.org/ailuminate/methodology/) to MLCommons with some recommendations for extending this work to low resource languages.
-In addition to contributing to AI Luminate, we also engaged in an extensive landscape analysis of large language models and their coverage of Indian languages.
-In the study, we looked at existing evaluation datasets and methodologies used to assess the performance of LLMs across various language tasks.
-For a set of models that support Indian languages, we also analyzed attributes such as the training data, the distribution of Indian languages within it, access, licensing, and the types of LLM.
+We conclude our report to MLCommons, available [here](https://mlcommons.org/ailuminate/methodology/), with some recommendations for extending this work to low-resource languages.
Take a look at AI Luminate [here](https://mlcommons.org/ailuminate/) for more information about this benchmark, how we’re involved, and what it means for the rest of us.