Authors: Dong Liu, Yanxuan Yu
Institutions: Yale University, Columbia University
Semantic parsing of long documents remains challenging due to quadratic growth in pairwise composition and memory requirements. We introduce Hierarchical Segment-Graph Memory (HSGM), a novel framework that decomposes an input of length N into M meaningful segments, constructs Local Semantic Graphs on each segment, and extracts compact summary nodes to form a Global Graph Memory. HSGM reduces complexity from O(N²) to O(Nk + (N/k)²), achieving 2-4× speedup with custom CUDA kernels and 60%+ memory reduction. The framework supports incremental updates for streaming documents, hierarchical query processing via top-K retrieval, and multiple semantic tasks including AMR parsing, SRL, event extraction, QA, and summarization.
- Python 3.8+
- PyTorch 1.12+
- CUDA 11.0+ (for GPU acceleration)
pip install -r requirements.txt# Download spaCy English model
python -m spacy download en_core_web_sm
# Download NLTK data
python -c "import nltk; nltk.download('punkt')"from hsgm import HSGMModel
from config import HSGMConfig
# Initialize configuration
config = HSGMConfig()
# Create HSGM model
model = HSGMModel(
config=config,
hidden_dim=768,
segment_size=256,
local_threshold=0.2,
global_threshold=0.1,
top_k_retrieval=5,
device="cuda"
)
# Process documents
documents = [
"Your long document text here...",
"Another document to process..."
]
output = model(documents)
print(f"Generated {len(output.summary_nodes)} summary nodes")
print(f"Processing time: {output.processing_time:.4f}s")# Process a query
query = "What is the main topic?"
result = model.query(query)
print(f"Query result shape: {result.shape}")# Add new documents incrementally
new_documents = ["New document content..."]
update_stats = model.incremental_update(new_documents)
print(f"Cache hit rate: {update_stats['hit_rate']:.3f}")Run the comprehensive demo and test suite to see HSGM in action:
python run_demo.pypython experiments/train.py --dataset document_amr --task classification --epochs 10document_amr: Document-level AMR parsingonto_notes_srl: Semantic Role Labelinglegal_eghr: Legal event extractionnarrative_qa: Question answeringgov_report: Text summarization
python experiments/train.py \
--dataset document_amr \
--task classification \
--batch_size 8 \
--learning_rate 3e-5python experiments/experiments.py \
--dataset document_amr \
--experiments main ablation scalability streamingInput Document → Segmentation → Local Graphs → Hierarchical Memory → Query Processing
The framework consists of four core components: a Document Segmenter that splits documents into coherent segments, a Local Semantic Graph builder that constructs graphs within each segment, a Hierarchical Memory that maintains summary nodes with global structure, and a Query Processor for hierarchical retrieval with local reasoning. HSGM leverages custom CUDA kernels for GPU acceleration including vectorized pairwise similarity computation (O(Nk)), parallel adaptive thresholding with reduction, efficient sparse edge creation in COO format, multi-head cross-segment attention with shared memory optimization, and bitonic sort for fast top-K retrieval.
from hsgm import get_hsgm_ops
ops = get_hsgm_ops(device='cuda') # Auto-detects CUDA
similarities = ops.pairwise_similarity(embeddings)- HSGM: O(Nk + (N/k)²) with k ≪ N
- Full Graph: O(N²)
- Optimal k: √N achieves O(N^(3/2))
| Model | Accuracy | F1 | Time (ms) | Memory (GB) |
|---|---|---|---|---|
| HSGM | 78.5% | 85.6% | 380 | 8.2 |
| Longformer | 76.8% | 84.5% | 700 | 6.8 |
| BigBird | 77.1% | 84.8% | 650 | 6.5 |
| Full Graph | 78.2% | 85.1% | 1200 | 12.5 |
- Speedup: 16.5× faster than baseline
- Memory: 78% reduction vs Full Graph
- Accuracy: 95%+ of baseline performance
from config import HSGMConfig
config = HSGMConfig(
hidden_dim=768,
segment_size=256,
local_threshold=0.2,
global_threshold=0.1,
top_k_retrieval=5,
learning_rate=3e-5
)Supported: Document-AMR, OntoNotes-SRL, Legal-ECHR, NarrativeQA, GovReport
Transformer: Longformer, BigBird, LongT5, Reformer
Graph: Full Graph, Sliding Window, Graph Transformer
Retrieval: BM25+T5, FiD, SGPT, RAG, REPLUG
git clone https://github.com/FastLM/HSGM.git
cd HSGM
pip install -e .
pip install -r requirements-dev.txtpython -m pytest tests/If you use HSGM in your research, please cite our paper:
@inproceedings{liu2025hsgm,
title={HSGM: Hierarchical Segment-Graph Memory for Scalable Long-Text Semantics},
author={Liu, Dong and Yu, Yanxuan},
booktitle={Proceedings of the 14th Joint Conference on Lexical and Computational Semantics (* SEM 2025)},
pages={328--337},
year={2025}
}