Erdos-Projects/fall-2025-developmental-norms-and-language-acquisition

LingPredict

Do Developmental Norms Predict L2 Difficulty? Modeling Duolingo Learners with WordBank Features

Primary research question:

Do words acquired later in first-language (L1) development show higher error rates and slower learning in second-language (L2) practice on Duolingo? We estimate the age at which a child learns a given word in L1 from the developmental norms in the WordBank dataset. We then compare this with the time it takes a Duolingo user to learn the same word in L2, as recorded in the Duolingo SLAM Dataset. See also the 2018 SLAM Task overview article, "Second Language Acquisition Modeling," by Settles, Brust, Gustafson, Hagiwara, and Madnani.

Data:

Duolingo SLAM Dataset:

WordBank Dataset:

Instructions:

  1. Create & activate the environment:
conda env create -f environment.yml
conda activate lingpredict   # or the name defined in environment.yml
  2. Download the SLAM dataset from Dataverse. (Its license precludes us from adding these files to the repository.) Unzip dataverse_files-2018.zip, then extract the resulting tarballs data_en_es.tar.gz, data_es_en.tar.gz, and data_fr_en.tar.gz. Move the extracted folders data_en_es, data_es_en, and data_fr_en into data/raw. The resulting file structure should look like this:
data/
├── processed/
│   ├── language_translation_table.csv
│   ├── wordbank_en_logistic_fits.csv
│   └── wordbank_es_logistic_fits.csv
└── raw/
    ├── data_en_es/
    │   ├── en_es.slam.20190204.dev
    │   ├── en_es.slam.20190204.dev.key
    │   ├── en_es.slam.20190204.test
    │   ├── en_es.slam.20190204.test.key
    │   ├── en_es.slam.20190204.train
    │   └── (Metadata files: CHANGELOG.md, CITING.md, LICENSE.md)
    ├── data_es_en/
    │   └── (Files structured similarly to data_en_es)
    ├── data_fr_en/
    │   └── (Files structured similarly to data_en_es)
    └── Wordbank/
        └── (Supplementary linguistic files)
  3. Run run_all.sh.
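
Before running the pipeline, it can help to confirm that data/raw matches the layout shown above. The following is a minimal sketch of such a check and is not part of the repository. The helper name `missing_slam_files` is hypothetical, and the sketch assumes that data_es_en and data_fr_en follow the same file-naming pattern as data_en_es. The date stamp in the file names is matched with a wildcard rather than assumed.

```python
from pathlib import Path

# Language-pair folders named in the instructions above.
PAIRS = ("en_es", "es_en", "fr_en")
# File endings listed for data_en_es; assumed to apply to the other pairs too.
SUFFIXES = ("train", "dev", "dev.key", "test", "test.key")


def missing_slam_files(raw_dir):
    """Return glob patterns for expected SLAM files that were not found."""
    raw_dir = Path(raw_dir)
    missing = []
    for pair in PAIRS:
        folder = raw_dir / f"data_{pair}"
        for suffix in SUFFIXES:
            pattern = f"{pair}.slam.*.{suffix}"
            # A missing folder counts as missing every file it should contain.
            found = folder.is_dir() and any(folder.glob(pattern))
            if not found:
                missing.append(str(folder / pattern))
    return missing


if __name__ == "__main__":
    problems = missing_slam_files("data/raw")
    if problems:
        print("Missing expected files:")
        for path in problems:
            print("  " + path)
    else:
        print("data/raw matches the expected layout.")
```

Run it from the repository root so that the relative path data/raw resolves correctly.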
