CWT-BabyLM-Thesis

A thesis project exploring the application of Contrastive Weight Tying (CWT) techniques to the BabyLM Challenge for sample-efficient language model pretraining.

Overview

This repository contains the implementation and research for a thesis investigating how Contrastive Weight Tying (CWT) can be applied to improve language model training efficiency in the context of the BabyLM Challenge. The project aims to develop more parameter-efficient language models through novel weight sharing and contrastive learning approaches.
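The README does not spell out the objective itself, so the following is a minimal, hypothetical PyTorch sketch of what a CWT-style loss can look like: the usual cross-entropy language-modeling head is replaced by an InfoNCE objective that scores output states against the tied input embeddings of the gold next tokens, with the other targets in the batch serving as negatives. All names here (`cwt_loss`, `temperature`, the false-negative caveat) are illustrative assumptions, not taken from this repository.

```python
import torch
import torch.nn.functional as F

def cwt_loss(hidden_states, target_ids, embedding_weight, temperature=0.07):
    """A CWT-style contrastive objective (illustrative sketch, not the repo's code).

    Instead of a softmax over the full vocabulary, each output state is
    scored against the tied *input* embeddings of the gold next tokens;
    the other targets in the batch act as in-batch negatives (InfoNCE).

    hidden_states:    (batch, seq_len, d_model) final transformer states
    target_ids:       (batch, seq_len) ids of the tokens to predict
    embedding_weight: (vocab_size, d_model) tied input embedding matrix
    """
    d_model = hidden_states.size(-1)
    h = F.normalize(hidden_states.reshape(-1, d_model), dim=-1)          # (N, d)
    pos = F.normalize(embedding_weight[target_ids.reshape(-1)], dim=-1)  # (N, d)

    # Similarity of every output state to every gold target embedding in
    # the batch; the diagonal entries are the positive pairs.
    logits = h @ pos.t() / temperature                                   # (N, N)
    labels = torch.arange(logits.size(0), device=logits.device)

    # Caveat: repeated target tokens in a batch become false negatives
    # here; a fuller implementation would mask or deduplicate them.
    return F.cross_entropy(logits, labels)
```

Under this formulation no vocabulary-sized output projection is needed at training time, which is one way weight tying of this kind can reduce parameters and compute.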

About the BabyLM Challenge

The BabyLM Challenge is a shared task focused on training sample-efficient language models on developmentally plausible corpora. The challenge aims to:

  • Train language models using human-scale data (≤100M words)
  • Develop cognitively plausible learning approaches
  • Bridge the gap between human language acquisition and machine learning
  • Democratize research into language model pretraining

Key aspects of the challenge include:

  • Strict Track: Models trained on ≤100M words
  • Strict-Small Track: Models trained on ≤10M words
  • Evaluation on diverse linguistic tasks, including BLiMP and GLUE

The headless-lm folder explains how to install the necessary dependencies and provides shell scripts for scheduling jobs on a SLURM-based HPC cluster.
