FENVARU

A Dhivehi Benchmark to Assess Dhivehi Language Support in Modern Large Language Models (LLMs)

by Mohamed Jailam
MSc Information Technology (Villa College / UWE)

About the Project

FENVARU is the first known benchmark designed to evaluate how well modern large language models (LLMs) understand and support Dhivehi, the national language of the Maldives. The project was developed as part of an MSc Information Technology dissertation submitted to Villa College / University of the West of England (UWE).

The benchmark includes a comprehensive suite of evaluation tasks and analyses, covering both generative and classification-based capabilities of models across four key NLP tasks:

Machine Translation (MT)
Named Entity Recognition (NER)
Text Classification
Question Answering (QA)
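
To make the distinction between classification-style and generative tasks concrete, here is a minimal sketch of three metrics commonly used for tasks like these: accuracy for text classification, entity-level F1 for NER, and exact match for QA. This is an illustration only. The function names, the label names, and the sample values are assumptions, and the metrics actually used by FENVARU are defined in the repository's evaluation scripts, which may differ.

```python
def accuracy(predictions: list, references: list) -> float:
    """Fraction of predicted labels that equal the reference labels."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


def entity_f1(predicted: set, gold: set) -> float:
    """Entity-level F1 over (entity_text, label) pairs."""
    if not predicted and not gold:
        return 1.0  # nothing to find and nothing predicted
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def exact_match(prediction: str, reference: str) -> bool:
    """Exact match after collapsing runs of whitespace."""
    return " ".join(prediction.split()) == " ".join(reference.split())


if __name__ == "__main__":
    # Hypothetical sample values, not taken from the FENVARU datasets.
    gold_entities = {("މާލެ", "LOC"), ("Villa College", "ORG")}
    predicted_entities = {("މާލެ", "LOC"), ("UWE", "ORG")}
    print(entity_f1(predicted_entities, gold_entities))
    print(accuracy(["sports", "politics"], ["sports", "business"]))
    print(exact_match("  މާލެ ", "މާލެ"))
```

Exact match is deliberately strict, so a generative QA answer that is correct but phrased differently scores zero. Label-based metrics such as accuracy and entity F1 do not have this problem, which is one reason scores on structured tasks and generation-heavy tasks are not directly comparable.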

Download the Paper

Summary of Findings

Google Gemini 2.5 Pro was the top performer across most tasks, especially in MT and NER.
Anthropic Claude models showed promise in QA and classification tasks.
Open-source models like LLaMA, DeepSeek, and Mistral showed poor zero-shot Dhivehi support.
Structured tasks like NER and classification yielded higher scores than generation-heavy tasks like QA and MT.
The overall results highlight the low-resource status of Dhivehi and the need for more inclusive model training.

Future Work

You are encouraged to extend this benchmark in the following ways:

Expand Dataset: Increase the number and diversity of samples in each task, especially for MT and QA.
Fine-tune Open Models: Use the provided dataset to fine-tune or adapter-train open-source models for Dhivehi.
Add More Tasks: Include syntactic parsing, summarization, or speech-to-text tasks in Dhivehi.
Evaluate Bias and Ethics: Explore how models handle Dhivehi cultural, religious, and political contexts.
Community Contributions: Collaborate on improving Dhivehi NLP resources and open datasets.

Folders and files

datasets
dhivehi_wordnet
evaluation scripts
results
README.md
