Automated main concept generation for narrative discourse assessment in aphasia

This is the official repository for our paper, "Automated main concept generation for narrative discourse assessment in aphasia". It contains the code to reproduce the modeling experiments discussed in the paper.

An earlier version of this work was presented at the Clinical Aphasiology Conference 2025. The abstract is available in the CAC2025 directory.

Set up

Follow these instructions to set up the repository.

git clone https://github.com/gnkitaa/aphasia-narrative.git
cd aphasia-narrative

conda create -y --name aphasia python=3.9
conda activate aphasia
pip install -r requirements.txt

git clone https://github.com/openai/openai-cookbook.git

Datasets

We release BATS, a new dataset of narratives with human-annotated main concepts. The main concepts were empirically derived through extensive analysis of hundreds of story retellings from healthy participants (Kurland et al., 2021; Richardson and Dalton, 2016, 2020) and have been used to assess patients with aphasia (Kurland et al., 2024b). The dataset is provided under the data/BATS directory.
We also evaluate our method on NarraSum, an existing narrative summarization dataset (Zhao et al., 2022). Please refer to NarraSum for more details.

MC generation

To generate main concepts, run MCGenerator/generate_mcs_bats.ipynb for the BATS dataset and MCGenerator/generate_mcs_narrasum.ipynb for the NarraSum dataset.

The prompts used for MC generation are provided in the MCGenerator/Prompts directory.

Semantic deduplication

To cluster main concepts that are similar in meaning, run MCGenerator/clustering_bats.ipynb for the BATS dataset and MCGenerator/clustering_narrasum.ipynb for the NarraSum dataset.
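
For readers who want a feel for what deduplication by meaning involves before opening the notebooks, here is a minimal sketch. It is not the method implemented in the notebooks: it assumes a toy bag-of-words vector in place of a real sentence embedding, a greedy single-pass grouping, and an arbitrary similarity threshold of 0.6. The function names (bag_of_words, cosine, greedy_cluster) and the example sentences are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Toy stand-in for a sentence embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def greedy_cluster(concepts, threshold=0.6):
    """Put each concept in the first cluster whose first member is similar enough.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    clusters = []  # list of (representative_vector, members)
    for concept in concepts:
        vec = bag_of_words(concept)
        for rep_vec, members in clusters:
            if cosine(vec, rep_vec) >= threshold:
                members.append(concept)
                break
        else:
            # No existing cluster is close enough, so start a new one.
            clusters.append((vec, [concept]))
    return [members for _, members in clusters]


# Made-up main concepts, used only for illustration.
concepts = [
    "The boy kicks the ball",
    "The boy kicks the soccer ball",
    "The window breaks",
    "The glass window breaks",
]
clusters = greedy_cluster(concepts)
for members in clusters:
    print(members)
```

Word overlap misses paraphrases that share no vocabulary, so a real pipeline would replace bag_of_words with a sentence embedding model while keeping the same overall structure.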

MC evaluation

To evaluate the generated main concepts, run MCEvaluator/evaluate_bats.ipynb for the BATS dataset and MCEvaluator/evaluate_narrasum.ipynb for the NarraSum dataset. The notebooks also plot the recall versus yield tradeoff curves discussed in the paper.
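
As a rough illustration of how a recall versus yield curve can be traced, the sketch below makes two assumptions that may not match the paper: "yield" is taken to mean the number of top-ranked generated concepts that are kept, and a generated concept counts as matching a reference concept only when the normalized strings are identical. The evaluation notebooks define the actual metrics and matching procedure. The function name recall_at_yield and the example data are hypothetical.

```python
def recall_at_yield(ranked_generated, reference, k):
    """Fraction of reference concepts found among the top-k generated concepts.

    Matching here is exact string equality after lowercasing and stripping,
    which is a simplifying assumption for this sketch.
    """
    kept = {c.strip().lower() for c in ranked_generated[:k]}
    ref = {c.strip().lower() for c in reference}
    return len(kept & ref) / len(ref) if ref else 0.0


# Made-up reference and generated concepts, used only for illustration.
reference = ["boy kicks ball", "window breaks", "man picks up ball"]
ranked = ["boy kicks ball", "dog barks", "window breaks", "lamp falls"]

# One (yield, recall) point per cutoff k.
curve = [(k, recall_at_yield(ranked, reference, k)) for k in range(1, len(ranked) + 1)]
for k, recall in curve:
    print(k, round(recall, 3))
```

Under these assumptions, keeping more generated concepts can only hold or raise recall, which is what gives the curve its tradeoff shape.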
