Pragmatic Modelling in Language Learning

Caregiver Question-Answer Feedback in
Child-Directed Dialogue

Maryam Bala*, Johannes Heim†, Elspeth Edelstein†, Arabella Sinclair‡

* University of Southampton
University of Aberdeen
University College London


The paper will be presented at LREC 2026, Palma de Mallorca, May 2026!

📝 Paper PDF


Abstract

In language development, children learn to form Question–Answer (QA) sequences through caregiver feedback that adapts dynamically to their evolving linguistic abilities. Using expert-annotated child-caregiver interaction, we examine four feedback types that guide children's acquisition of adult-like QA behaviour: caregiver instructions through reformulating and affirming a child's output as well as caregiver demonstrations through exemplifying and modelling adult-like behaviour. Our analysis reveals that feedback incidence, frequency and complexity progress and adapt over the course of development, akin to a tailored curriculum for pragmatic development. We release our annotated dataset, which offers a rich resource for studying pragmatic feedback and provides the first large-scale empirical evidence of adaptive, tailored caregiver feedback on QA behaviour.


Citing

Please use the following to cite this work:

@inproceedings{bala-etal-2026-pragmatic,
  title = {Pragmatic Modelling in Language Learning: Caregiver Question-Answer Feedback in Child-Directed Dialogue},
  author = {Bala, Maryam and Heim, Johannes and Edelstein, Elspeth and Sinclair, Arabella},
  booktitle = {Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)},
  month = {May},
  year = {2026},
  pages = {11461--11478},
  address = {Palma, Mallorca, Spain},
  publisher = {European Language Resources Association (ELRA)},
  editor = {Piperidis, Stelios and Bel, Núria and van den Heuvel, Henk and Ide, Nancy and Krek, Simon and Toral, Antonio},
  doi = {10.63317/4g9ggjcrgfax},
}

Contact

🔬 Find out about other work going on at The Context Lab.


Usage

Experiment Pipeline

  • Candidate Extraction: Rule-based scripts extract candidate QA feedback pairs from the raw CHILDES transcripts using speaker sequence, punctuation patterns and lexical overlap heuristics (see rule_based/).

  • Annotation: A representative sample of candidates was manually annotated by expert linguists. The resulting gold-annotated dataset of ~3,639 samples (~1,000 positive examples) is available in data/annotated_feedback/full-childes-annotation.xlsx.

  • Classification: BERT and ModernBERT binary classifiers are fine-tuned and evaluated on the annotated data using train_eval.py. Train/val/test splits per category are in data/train_data/, organised into context/ and no_context/ subfolders. LLM-as-judge baselines are evaluated in prompting.ipynb.

  • Scaling: The best-performing classifier per category is applied to the full CHILDES candidate pool using label.py, producing scaled annotations in data/automatic/.

  • Feature Extraction: Linguistic features (vocabulary overlap, Levenshtein distances, POS proportions, lexical sophistication) are computed for all four feedback categories of the gold-annotated dataset using analysis_features.py.

  • Analysis: Once all features have been extracted, the developmental analysis and figures reported in the paper are produced in analysis.ipynb, examining how feedback incidence, complexity and repetition vary across child development.

Installation

git clone https://github.com/the-context-lab/childQAfeedback.git
cd childQAfeedback

Create a conda environment and install dependencies:

conda create -n childes python=3.10
conda activate childes
pip install -r requirements.txt

For NLTK resources:

python -m nltk.downloader wordnet punkt punkt_tab

Running the Pipeline

  1. Step 1: Extract candidates from CHILDES

    # example for reformulating - repeat for modelling, exemplifying and affirming
    python rule_based/extract_reformulating.py data/childes_original/childes.csv data/raw/reformulating.csv

  2. Step 2: Train and evaluate classifiers

    Open classifiers/train_eval.py and set the following in the CONFIG block:

    CATEGORY = "reformulating"
    CONTEXT  = False

    Then run:

    python classifiers/train_eval.py

    Trained model checkpoints and evaluation results are saved to classifiers/no_context/reformulating/. Repeat steps 1 and 2 for each category (modelling, exemplifying, affirming).

  3. Step 3: Scale annotation to full CHILDES

    Once all four category models are trained, run:

    python classifiers/label.py

    This automatically applies the best-performing classifier per category to the full candidate pool and saves labelled CSVs to data/automatic/.

  4. Step 4: Compute analysis features

    python analysis/analysis_features.py

  5. Step 5: LLM-as-judge evaluation

    Run llm_judge/prompting.ipynb on a GPU runtime.

  6. Step 6: Analysis and plots

    Run analysis/analysis.ipynb to reproduce the figures and developmental analysis reported in the paper.
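Step 1 has to be repeated for each of the four categories. The sketch below only builds the four command lines so they can be inspected before running. It assumes that the other three extraction scripts follow the same `extract_<category>.py` naming and argument order as the reformulating example in Step 1; check the actual file names in rule_based/ before relying on it.

```python
# Hypothetical helper: builds the Step 1 commands for all four categories.
# The script naming pattern is an assumption taken from the Step 1 example.
CATEGORIES = ["reformulating", "modelling", "exemplifying", "affirming"]


def extraction_commands(source="data/childes_original/childes.csv",
                        out_dir="data/raw"):
    """Return one argument list per category, in the order of CATEGORIES."""
    return [
        ["python", f"rule_based/extract_{category}.py", source,
         f"{out_dir}/{category}.csv"]
        for category in CATEGORIES
    ]


if __name__ == "__main__":
    # Print the commands; pass each list to subprocess.run(cmd, check=True)
    # to execute them from the repository root.
    for cmd in extraction_commands():
        print(" ".join(cmd))
```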

Repository Structure

This repository is structured as follows.

Data

The data/ folder contains all annotated, processed and scaled data used across the pipeline.

  • The childes_original/ folder contains the full subset of the CHILDES corpus used in this work, covering children aged 12-48 months across 46 children and 15 corpora.
  • The annotated_feedback/ folder contains the expert-annotated dataset of ~1,000 positive examples across four feedback categories, stored as an Excel file with one sheet per category. It also contains the linguistic feature CSVs for each category produced by analysis_features.py.
  • The train_data/ folder contains train/val/test splits per category, organised into context/ and no_context/ subfolders.
  • The automatic/ folder contains the full-corpus scaled annotation outputs produced by label.py. It also contains utterance_count.csv, which records the total number of utterances per dialogue across the full dataset.
  • The echoes/ folder contains the extracted adult echo sequences produced by echoes.py, used for the repetition analysis.
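As a rough illustration of how scaled labels and utterance counts could be combined into a developmental incidence measure, the sketch below divides the number of positively labelled feedback items by the total number of utterances in each age bin. The function name, the input shapes and the 6-month bin width are all assumptions made for this example; the normalisation used in the paper's notebooks may differ.

```python
from collections import Counter


def incidence_by_age(feedback_ages, utterances_by_age, bin_months=6):
    """Feedback items per utterance, grouped into age bins.

    feedback_ages: child age in months for each positively labelled item.
    utterances_by_age: mapping from age in months to total utterance count.
    Both input formats are hypothetical, not the repository's CSV schema.
    """
    def bin_start(age):
        # Lower edge of the bin that contains this age.
        return (age // bin_months) * bin_months

    positives = Counter(bin_start(age) for age in feedback_ages)
    totals = Counter()
    for age, count in utterances_by_age.items():
        totals[bin_start(age)] += count

    # Skip bins with no utterances to avoid dividing by zero.
    return {b: positives[b] / totals[b] for b in sorted(totals) if totals[b] > 0}
```

Bins with utterances but no feedback are reported with an incidence of 0.0, so gaps in the curve reflect missing data and not missing feedback.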

Rule-based

The rule_based/ folder contains rule-based scripts that extract candidate QA feedback pairs from the raw CHILDES transcripts.

  • modelling.py extracts Modelling candidate pairs (Adult - Adult).
  • exemplifying.py extracts Exemplifying candidate triples (Child - Adult - Adult).
  • reformulating.py extracts Reformulating candidate pairs (Child - Adult).
  • affirming.py extracts Affirming candidate triples (Child - Child - Adult).
  • echoes.py extracts adult echo sequences for the repetition analysis.
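The speaker patterns listed above can be matched with a sliding window over a dialogue. The following is a minimal sketch of that one step only: the `CHI`/`ADU` speaker labels and the `(speaker, text)` tuple format are assumptions, and the punctuation and lexical-overlap heuristics that the actual scripts apply are not included.

```python
# Speaker patterns per category, as described in the list above.
# The "CHI" / "ADU" labels are hypothetical placeholders.
PATTERNS = {
    "modelling": ("ADU", "ADU"),
    "exemplifying": ("CHI", "ADU", "ADU"),
    "reformulating": ("CHI", "ADU"),
    "affirming": ("CHI", "CHI", "ADU"),
}


def candidate_windows(utterances, category):
    """Yield consecutive utterance windows whose speakers match the pattern.

    utterances: list of (speaker, text) tuples in dialogue order.
    """
    pattern = PATTERNS[category]
    size = len(pattern)
    for start in range(len(utterances) - size + 1):
        window = utterances[start:start + size]
        if tuple(speaker for speaker, _ in window) == pattern:
            yield window
```

Windows may overlap, so the same utterance can appear in candidates for more than one category; any filtering beyond the speaker sequence is left to later steps.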

Classifier

The classifier/ folder contains scripts for training, evaluating and applying the automatic annotation models.

  • train_eval.py fine-tunes and evaluates BERT and ModernBERT as binary classifiers for a single feedback category, using transformers.Trainer.
  • label.py applies the best-performing classifier per category to the full CHILDES candidate pool to produce scaled annotations for the developmental incidence analysis.
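Selecting the best-performing classifier per category amounts to taking the top-scoring model for each one. The sketch below shows that step in isolation. The nested-dictionary input and the function name are hypothetical, and it is an assumption that a single evaluation score per model is compared; label.py may select models differently.

```python
def best_model_per_category(results):
    """Pick the highest-scoring model name for each feedback category.

    results: {category: {model_name: score}}, e.g. a test-set metric per
    model. This input format is hypothetical, not the repository's own.
    """
    return {
        category: max(scores, key=scores.get)
        for category, scores in results.items()
        if scores  # skip categories with no evaluated models
    }
```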

LLM Judge

The llm_judge/ folder contains the prompting experiments comparing LLM-as-judge performance against the fine-tuned classifiers.

  • prompting.ipynb runs zero-shot and few-shot prompting experiments with Llama 3 8b, Gemma 2b, and Falcon 7b across all four feedback categories and evaluates the results.
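The difference between zero-shot and few-shot prompting is only whether labelled examples precede the candidate. The sketch below builds both kinds of prompt from the same function. The prompt wording, the `definition` argument and the label strings are invented for illustration and are not the prompts used in prompting.ipynb.

```python
def build_prompt(category, definition, candidate, examples=()):
    """Build a classification prompt; empty `examples` gives zero-shot.

    examples: sequence of (exchange_text, label) pairs shown before the
    candidate. All wording here is a hypothetical placeholder.
    """
    lines = [
        f"Task: decide whether the exchange below is an instance of "
        f"'{category}' feedback.",
        f"Definition: {definition}",
        "",
    ]
    for text, label in examples:
        lines += [f"Exchange: {text}", f"Answer: {label}", ""]
    # The prompt ends with an open answer slot for the model to complete.
    lines += [f"Exchange: {candidate}", "Answer:"]
    return "\n".join(lines)
```

Keeping the answer slot last means the model's first generated tokens can be read directly as the label.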

Analysis

The analysis/ folder contains the scripts that compute the linguistic features and produce the developmental analysis plots.

  • analysis_features.py computes linguistic features for all four feedback categories from the expert-annotated dataset. Features include vocabulary overlap, POS tag overlap, Levenshtein distances at the character, word and POS level, and POS proportions. Each category is processed separately. Outputs are saved as one CSV per category to data/annotated_feedback/.

  • analysis_gold.ipynb and analysis.ipynb produce all figures and plots reported in the paper for the expert-annotated and automatically labelled datasets respectively.
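To make the character-, word- and POS-level Levenshtein features concrete, the sketch below uses one edit-distance function that works on any sequence: a string for the character level, a token list for the word level, or a tag list for the POS level. The vocabulary overlap shown is a Jaccard ratio, which is an assumption; analysis_features.py may define overlap and tokenisation differently.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (strings or lists of tokens)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def vocabulary_overlap(a_tokens, b_tokens):
    """Jaccard overlap of the two token sets (an assumed definition)."""
    set_a, set_b = set(a_tokens), set(b_tokens)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0
```

For a child utterance `"what that"` and an adult reformulation `"what is that"`, calling `levenshtein` on the two strings gives the character-level distance, and calling it on `"what that".split()` and `"what is that".split()` gives the word-level distance.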

License

Creative Commons.

CHILDES data is subject to the TalkBank Terms of Use.

About

Resources for our LREC 2026 paper containing Child-directed Question-Answer Feedback annotations and analysis.
