This repository contains the implementation of our paper "HYDRA: A Multi-Head Encoder-only Architecture for Hierarchical Text Classification".
HYDRA is a simple yet effective multi-head encoder-only architecture for hierarchical text classification that treats each level in the hierarchy as a separate classification task with its own label space. Through parameter sharing and level-specific parameterization, HYDRA enables flat models to incorporate hierarchical awareness without architectural complexity.
Our approach demonstrates that complex components like graph encoders, label semantics, and autoregressive decoders are often unnecessary for achieving state-of-the-art performance in hierarchical text classification.
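To illustrate the core idea, here is a minimal PyTorch sketch of a multi-head classifier with one linear head per hierarchy level on top of a shared encoder representation. The class name, hidden size, and per-level label counts are illustrative assumptions, not taken from the HYDRA code base:

```python
import torch
import torch.nn as nn

class MultiHeadHTC(nn.Module):
    """Sketch: one classification head per hierarchy level, all heads
    sharing the same encoder representation (hypothetical names/sizes)."""

    def __init__(self, hidden_size: int, level_sizes: list[int]):
        super().__init__()
        # One linear head per level of the label hierarchy; parameters
        # below the heads (the encoder) would be shared across levels.
        self.heads = nn.ModuleList(
            nn.Linear(hidden_size, n_labels) for n_labels in level_sizes
        )

    def forward(self, pooled: torch.Tensor) -> list[torch.Tensor]:
        # `pooled` stands in for the shared encoder output, e.g. the
        # [CLS] vector of an encoder-only transformer.
        return [head(pooled) for head in self.heads]

# Example: a 3-level hierarchy with 4, 55, and 43 labels per level.
model = MultiHeadHTC(hidden_size=768, level_sizes=[4, 55, 43])
logits = model(torch.randn(2, 768))  # batch of 2 pooled vectors
```

Each head is trained against its own level's label space, so a flat encoder gains level awareness without graph encoders or decoders.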
- `datasets/`: scripts and instructions for the four benchmark datasets (NYT, RCV1-V2, BGC, WOS)
- `modeling/`: implementation of the HYDRA model architecture and its variants
- `scripts/`: shell scripts to reproduce our experiments
- `hydra_experiments.py`: main script for training and evaluation
Install the dependencies:

```bash
pip install -r requirements.txt
```

Main requirements:
- PyTorch
- Transformers (v4.51.3; for UnLlama, use v4.41.1)
- Datasets
- Prepare the datasets following the instructions in `datasets/README.md`.
- To reproduce our main results:
```bash
# Run flat baseline models
bash scripts/run_flat.sh

# Run HYDRA with local heads only
bash scripts/run_hydra.sh

# Run HYDRA with local+global heads
bash scripts/run_hydra_global.sh

# Run HYDRA with local+nested heads
bash scripts/run_hydra_nested.sh
```

Our experiments were conducted on four standard hierarchical text classification benchmarks:
- NYT (New York Times Annotated Corpus)
- RCV1-V2
- BGC (Blurb Genre Collection)
- WOS (Web of Science-46985)
All experiments were run five times with different random seeds (42, 1, 2, 3, 4) to ensure reproducibility.
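A seeding helper along the following lines makes such multi-seed runs deterministic; this is an illustrative sketch, not the repository's own code:

```python
import random
import numpy as np
import torch

def set_seed(seed: int) -> None:
    # Fix Python, NumPy, and PyTorch RNGs for a reproducible run.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # no-op when CUDA is unavailable

for seed in (42, 1, 2, 3, 4):
    set_seed(seed)
    # train_and_evaluate(seed)  # hypothetical: one full run per seed
```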
Our source code is released under the MIT License. See the LICENSE file for details.
This repository was developed with the assistance of GitHub Copilot. The authors have reviewed and edited the generated content to ensure accuracy and clarity.