This repository contains the paper and official implementation for preliminary work on "Hierarchical Transformers," undertaken by @shanai13 and @robert-lieck from 2023 to 2024.
The experiments directory contains several experimental implementations referenced in the paper. These implementations are rough, may contain mistakes, and may differ from the official implementation in ways unrelated to the experiments they were written to test. They are therefore included for reference only.