muhammedjailam/fenvaru


FENVARU

A Dhivehi Benchmark to Assess Dhivehi Language Support in Modern Large Language Models (LLMs)

by Mohamed Jailam
MSc Information Technology (Villa College / UWE)


About the Project

FENVARU is the first known benchmark designed to evaluate how well modern large language models (LLMs) understand and support Dhivehi, the national language of the Maldives. The project was developed as part of an MSc IT dissertation submitted to Villa College / University of the West of England (UWE).

The benchmark provides a suite of evaluation tasks and analyses that cover both the generative and the classification-based capabilities of models across four key NLP tasks:

  • Machine Translation (MT)
  • Named Entity Recognition (NER)
  • Text Classification
  • Question Answering (QA)
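To make the classification-style tasks concrete, here is a minimal sketch of how predictions for Text Classification and NER could be scored. The metrics (exact-match accuracy and micro-averaged entity F1), the function names, and the sample labels are all assumptions chosen for illustration; they are not taken from the dissertation and may differ from the scoring FENVARU actually uses.

```python
def accuracy(gold, pred):
    """Fraction of predicted labels that exactly match the gold labels."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        return 0.0
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def entity_f1(gold_entities, pred_entities):
    """Micro-averaged F1 over (text, type) entity pairs, one list per example."""
    tp = fp = fn = 0
    for gold, pred in zip(gold_entities, pred_entities):
        gold_set, pred_set = set(gold), set(pred)
        tp += len(gold_set & pred_set)   # entities found with the right type
        fp += len(pred_set - gold_set)   # predicted but not in the gold set
        fn += len(gold_set - pred_set)   # in the gold set but missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical placeholder data, not drawn from the FENVARU dataset.
gold_labels = ["sports", "politics", "sports", "business"]
pred_labels = ["sports", "politics", "business", "business"]

gold_ner = [[("Malé", "LOC"), ("Villa College", "ORG")]]
pred_ner = [[("Malé", "LOC"), ("UWE", "ORG")]]

print(accuracy(gold_labels, pred_labels))
print(entity_f1(gold_ner, pred_ner))
```

Generation-heavy tasks such as MT and QA usually need overlap-based or judged metrics instead, which this sketch does not attempt to cover.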

Download the Paper

Summary of Findings

  • Google Gemini 2.5 Pro was the top performer across most tasks, especially in MT and NER.
  • Anthropic Claude models showed promise in QA and classification tasks.
  • Open-source models like LLaMA, DeepSeek, and Mistral showed poor zero-shot Dhivehi support.
  • Structured tasks like NER and classification yielded higher scores than generation-heavy tasks like QA and MT.
  • The overall results highlight the low-resource status of Dhivehi and the need for more inclusive model training.

Future Work

You are encouraged to extend this benchmark in the following ways:

  • Expand Dataset: Increase the number and diversity of samples in each task, especially for MT and QA.
  • Fine-tune Open Models: Use the provided dataset to fine-tune or adapter-train open-source models for Dhivehi.
  • Add More Tasks: Include syntactic parsing, summarization, or speech-to-text tasks in Dhivehi.
  • Evaluate Bias and Ethics: Explore how models treat Dhivehi cultural, religious, and political context.
  • Community Contributions: Collaborate on improving Dhivehi NLP resources and open datasets.

Contact

For collaboration, feedback, or research queries, please contact:

Mohamed Jailam
📧 [email protected]


📘 Citation

If you use this benchmark or any part of this research, please cite:
