This repository contains the code for the paper "Where Do LLMs Compose Meaning? A Layerwise Analysis of Compositional Robustness" (EACL 2026).
This project investigates how Large Language Models (LLMs) compose meaning across different layers by analysing their robustness to activation grouping at various granularities.
We introduce Constituent-Aware Pooling (CAP), a methodology grounded in compositionality, mechanistic interpretability, and information theory that intervenes in model activations by pooling token representations into linguistic constituents at various layers. The approach consolidates token-level activations into higher-level syntactic units (e.g., words, phrases) using different pooling strategies (mean, sum, max, last-token).
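The core pooling idea can be pictured with the following minimal sketch. This is an illustrative example only, not the repository's implementation; the function name, tensor shapes, and span format are assumptions made for clarity.

```python
import torch

def pool_constituents(hidden, spans, protocol="mean"):
    """Pool token activations within each constituent span.

    hidden: (seq_len, d_model) activations at one layer.
    spans:  list of (start, end) token index pairs, one per constituent.
    Returns a (num_constituents, d_model) tensor of pooled activations.
    """
    pooled = []
    for start, end in spans:
        chunk = hidden[start:end]  # tokens belonging to one constituent
        if protocol == "mean":
            pooled.append(chunk.mean(dim=0))
        elif protocol == "sum":
            pooled.append(chunk.sum(dim=0))
        elif protocol == "max":
            pooled.append(chunk.max(dim=0).values)
        elif protocol == "last_token":
            pooled.append(chunk[-1])
        else:
            raise ValueError(f"unknown protocol: {protocol}")
    return torch.stack(pooled)

# Example: pool a 6-token sequence into two 3-token "phrases".
hidden = torch.randn(6, 768)
print(pool_constituents(hidden, [(0, 3), (3, 6)]).shape)  # torch.Size([2, 768])
```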
Create and activate the conda environment:
```bash
conda env create -f environment.yml
conda activate [environment-name]
```
The main script `activation_grouping_main.py` supports various experimental configurations:
```bash
python activation_grouping_main.py \
    --model_name GPT2 \
    --task_type inverse_dictionary \
    --supervision_type original \
    --data_path path/to/data.json \
    --start_layer 4 \
    --grouping_protocol mean \
    --granularity token_to_words \
    --k 1 \
    --batch_size 16 \
    --seed 42 \
    --norm_preserve \
    --broadcast_cap
```

- `--model_name`: Model to evaluate (e.g., `GPT2`, `gemma-2b`)
- `--task_type`: Evaluation task (`inverse_dictionary`, `synonyms`, `hypernyms`, `input_reconstruction`, `exact_sequence_autoencoding`)
- `--supervision_type`: Use the `original` or `fine-tuned` model
- `--granularity`: Grouping granularity (`token_to_words`, `token_to_phrases`, `random`)
- `--grouping_protocol`: Pooling method (`mean`, `sum`, `max`, `last_token`)
- `--start_layer`: Layer from which to start applying grouping
- `--norm_preserve`: Preserve activation norms after pooling (optional)
- `--broadcast_cap`: Broadcast pooled values back to original positions (optional; see the sketch after this list)
- `--k`: Number of top-k predictions to consider
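The `--norm_preserve` and `--broadcast_cap` options can be pictured as follows. This is a hedged sketch of the behaviour described above, not the repository's code; the helper name and the choice of where to place the pooled vector when not broadcasting are assumptions.

```python
import torch

def pool_with_options(hidden, spans, norm_preserve=True, broadcast=True):
    """Mean-pool each span; optionally rescale the pooled vector to the span's
    average token norm and broadcast it back to every original position."""
    out = hidden.clone()
    for start, end in spans:
        chunk = hidden[start:end]
        pooled = chunk.mean(dim=0)
        if norm_preserve:
            # Rescale so the pooled vector keeps the average token norm of the span.
            target_norm = chunk.norm(dim=-1).mean()
            pooled = pooled * (target_norm / (pooled.norm() + 1e-8))
        if broadcast:
            out[start:end] = pooled  # same pooled value at every position in the span
        else:
            out[start] = pooled      # illustrative: keep it only at the first position
    return out

hidden = torch.randn(6, 768)
grouped = pool_with_options(hidden, [(0, 3), (3, 6)])
print(grouped.shape)  # torch.Size([6, 768]) — sequence length unchanged when broadcasting
```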
```
.
├── activation_grouping_main.py   # Main evaluation script
├── src/
│   ├── metrics.py                # Metric calculation utilities
│   └── utils.py                  # Helper functions
└── results/                      # Output logs and metrics
```
Results are saved in two formats:
- Log files: detailed execution logs in `results/{model_name}/{task_type}/`
- JSON files: structured metrics and predictions in `results/{model_name}/{task_type}/{granularity}_logs/`
Each result includes:
- Clean model predictions (no intervention)
- Grouped predictions (with activation pooling)
- Metrics comparing both conditions
- Compression ratios and layer information
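A minimal way to inspect these JSON outputs is sketched below. The path is illustrative (substitute your own model, task, and granularity), and the exact keys depend on the run, so the snippet only lists them rather than assuming a schema.

```python
import json
from pathlib import Path

# Illustrative path for one run; adjust model_name, task_type, and granularity.
logs_dir = Path("results/GPT2/inverse_dictionary/token_to_words_logs")

for path in sorted(logs_dir.glob("*.json")):
    with path.open() as f:
        record = json.load(f)
    # Print the top-level keys to see which predictions and metrics are stored.
    print(path.name, list(record))
```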
The code supports analyzing:
- Component-wise effects: MLP, attention (Q/K/V/Z, scores, patterns), residual stream
- Layer-wise progression: Start interventions from different layers
- Granularity levels: Word-level, phrase-level, or random grouping
- Pooling strategies: Mean, sum, max, or last-token aggregation
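A layer-wise sweep, for instance, can be scripted around the CLI shown above. The sketch below simply re-runs the documented command for different `--start_layer` values; the range of layers is an assumption (12 layers corresponds to GPT-2 small) and should be adapted to the model.

```python
import subprocess

# Sweep the layer from which grouping starts, keeping the other settings fixed.
for start_layer in range(0, 12, 2):  # assumes a 12-layer model such as GPT-2 small
    subprocess.run(
        [
            "python", "activation_grouping_main.py",
            "--model_name", "GPT2",
            "--task_type", "inverse_dictionary",
            "--supervision_type", "original",
            "--data_path", "path/to/data.json",
            "--start_layer", str(start_layer),
            "--grouping_protocol", "mean",
            "--granularity", "token_to_words",
            "--k", "1",
            "--batch_size", "16",
            "--seed", "42",
        ],
        check=True,
    )
```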
If you use this code in your research, please cite:
```bibtex
@inproceedings{aljaafari2025cap,
  title={Where Do LLMs Compose Meaning? A Layerwise Analysis of Compositional Robustness},
  author={Aljaafari, Nura and Carvalho, Danilo S and Freitas, Andr{\'e}},
  booktitle={Proceedings of the 2026 Conference of the European Chapter of the Association for Computational Linguistics (EACL)},
  year={2026}
}
```

This project is licensed under the GPLv3 License - see the LICENSE file for details.
For questions or issues regarding this code, or for paper-related inquiries, please:
- Open an issue in this repository
- Contact: nuraaljaafari@gmail.com