Skip to content

[AAAI-26] Reasoning Transfer for an Extremely Low-Resource and Endangered Language: Bridging Languages Through Sample-Efficient Language Understanding

Notifications You must be signed in to change notification settings

ReML-AI/english-pivoted-cot

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Reasoning Transfer for an Extremely Low-resource and Endangered Language: Bridging Languages through Sample-Efficient Language Understanding

Khanh-Tung Tran, Barry O'Sullivan, Hoang D. Nguyen

Accepted to AAAI-26

Overview

This repository contains code and resources for the paper Reasoning Alignment for an Extremely Low-resource and Endangered Language: Separating Reasoning and Language Understanding.

Structure

  • ./data: evaluation data, including our contributed dataset LC2024.
  • ./src/train: training code, including for the baseline Native-CoT Training and English-Pivoted CoT Training.
  • ./src/evaluation: evaluation code, based on the SkyThought repo.
  • ./appendix.pdf: technical appendix.

Training

  1. Install dependencies:

    cd ./src/train
    pip install -r requirements.txt
  2. Run the main script:

    bash scripts/lang_adapt/run_sft.sh # change the paths accordingly

Evaluation

  1. Install dependencies:

    cd ./src/evaluation
    pip install -r requirements.txt
  2. Run evaluation:

    cd ./src/evaluation/skythought/skythought_evals
    python eval.py --model ${YOUR_MODEL_HERE}$ --evals=aime,irish_aime,LC2024 --tp=1 --output_file=results.txt --temperatures 0.6 --n 64

For more information, refer to the original document from the SkyThought repo.

Citation

TBU

About

[AAAI-26] Reasoning Transfer for an Extremely Low-Resource and Endangered Language: Bridging Languages Through Sample-Efficient Language Understanding

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages