FineGrainedErrorDetection

Our project is adapted from a shared task hosted by the ninth Conference on Machine Translation (WMT). The task is listed as 2024 Quality Estimation Shared Task 2[1].

The goal of our project is to improve on a method proposed by the authors of CometKiwi for estimating the quality of machine translations without referencing the original source. For instance, given a German text whose original source was not written in German, we wish to identify the incorrectly translated segments of that text (by start and end indices) and label each segment as a minor or major translation error. In other words, we aim for highly granular quality estimates at the word level, known as "error spans".
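To make the target output concrete, here is a minimal sketch of how an error span could be represented. The `ErrorSpan` class and `extract_spans` helper are hypothetical names used for illustration only and are not taken from this repository. The sketch also assumes character offsets with an exclusive end index, which may differ from the format used in the shared task data.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ErrorSpan:
    """Hypothetical container for one annotated translation error."""
    start: int     # assumed: character offset where the error begins (inclusive)
    end: int       # assumed: character offset where the error ends (exclusive)
    severity: str  # "minor" or "major"

    def __post_init__(self):
        # Reject labels outside the two severities described above.
        if self.severity not in ("minor", "major"):
            raise ValueError(f"unknown severity: {self.severity!r}")
        # A span must cover at least one character and start at a valid offset.
        if not 0 <= self.start < self.end:
            raise ValueError("span must satisfy 0 <= start < end")


def extract_spans(translation: str, spans: List[ErrorSpan]) -> List[Tuple[str, str]]:
    """Return (flagged text, severity) pairs for each span in the translation."""
    return [(translation[s.start:s.end], s.severity) for s in spans]


# Illustrative example: the word "kleines" is flagged as a minor error.
translation = "Das ist ein kleines Haus."
spans = [ErrorSpan(start=12, end=19, severity="minor")]
labelled = extract_spans(translation, spans)
```

Storing offsets instead of the flagged text keeps each annotation tied to an exact position, so repeated words in the same sentence stay unambiguous.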

Setup

Create an environment with Python >= 3.8 and run the following command:

pip3 install -r requirements.txt

Next, log in to Hugging Face using:

huggingface-cli login

Then make sure you have access to the required model repository: open the model's Hugging Face page and accept its access terms.
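The setup steps can also be checked from Python before running anything expensive. The following is a minimal sketch, not part of this repository: `python_ok` and `hf_login_from_env` are hypothetical helper names, and the sketch assumes the access token is supplied through the `HF_TOKEN` environment variable instead of the interactive `huggingface-cli login` prompt.

```python
import os
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in the Setup section


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON) -> bool:
    """Return True if the interpreter meets the minimum (major, minor) version."""
    return tuple(version_info[:2]) >= minimum


def hf_login_from_env(env=os.environ) -> bool:
    """Log in to Hugging Face if HF_TOKEN is set; return whether login was attempted.

    Assumption: the token lives in the HF_TOKEN environment variable.
    """
    token = env.get("HF_TOKEN")
    if not token:
        return False
    # Imported lazily so the version check still works before
    # the requirements have been installed.
    from huggingface_hub import login
    login(token=token)
    return True
```

Calling `python_ok()` with no arguments checks the running interpreter. Logging in this way does not grant access to a gated model, so accepting access on the model page is still required.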

