HSGM: Hierarchical Segment-Graph Memory for Scalable Long-Text Semantics

Authors: Dong Liu, Yanxuan Yu
Institutions: Yale University, Columbia University

Abstract

Semantic parsing of long documents remains challenging due to quadratic growth in pairwise composition and memory requirements. We introduce Hierarchical Segment-Graph Memory (HSGM), a novel framework that decomposes an input of length N into M meaningful segments, constructs Local Semantic Graphs on each segment, and extracts compact summary nodes to form a Global Graph Memory. HSGM reduces complexity from O(N²) to O(Nk + (N/k)²), achieving 2-4× speedup with custom CUDA kernels and 60%+ memory reduction. The framework supports incremental updates for streaming documents, hierarchical query processing via top-K retrieval, and multiple semantic tasks including AMR parsing, SRL, event extraction, QA, and summarization.

Installation

Prerequisites

Python 3.8+
PyTorch 1.12+
CUDA 11.0+ (for GPU acceleration)

Install Dependencies

pip install -r requirements.txt

Download Additional Models

# Download spaCy English model
python -m spacy download en_core_web_sm

# Download NLTK data
python -c "import nltk; nltk.download('punkt')"

Quick Start

Basic Usage

from hsgm import HSGMModel
from config import HSGMConfig

# Initialize configuration
config = HSGMConfig()

# Create HSGM model
model = HSGMModel(
    config=config,
    hidden_dim=768,
    segment_size=256,
    local_threshold=0.2,
    global_threshold=0.1,
    top_k_retrieval=5,
    device="cuda"
)

# Process documents
documents = [
    "Your long document text here...",
    "Another document to process..."
]

output = model(documents)
print(f"Generated {len(output.summary_nodes)} summary nodes")
print(f"Processing time: {output.processing_time:.4f}s")

Query Processing

# Process a query
query = "What is the main topic?"
result = model.query(query)
print(f"Query result shape: {result.shape}")

Incremental Updates

# Add new documents incrementally
new_documents = ["New document content..."]
update_stats = model.incremental_update(new_documents)
print(f"Cache hit rate: {update_stats['hit_rate']:.3f}")

Demo

Run the comprehensive demo and test suite to see HSGM in action:

python run_demo.py

Training

Train HSGM Model

python experiments/train.py --dataset document_amr --task classification --epochs 10

Available Datasets

document_amr: Document-level AMR parsing
onto_notes_srl: Semantic Role Labeling
legal_eghr: Legal event extraction
narrative_qa: Question answering
gov_report: Text summarization

Training Options

python experiments/train.py \
    --dataset document_amr \
    --task classification \
    --batch_size 8 \
    --learning_rate 3e-5

Experiments

Run Comprehensive Experiments

python experiments/experiments.py \
    --dataset document_amr \
    --experiments main ablation scalability streaming

Architecture

Pipeline

Input Document → Segmentation → Local Graphs → Hierarchical Memory → Query Processing

Components & Acceleration

The framework consists of four core components: a Document Segmenter that splits documents into coherent segments, a Local Semantic Graph builder that constructs graphs within each segment, a Hierarchical Memory that maintains summary nodes with global structure, and a Query Processor for hierarchical retrieval with local reasoning. HSGM leverages custom CUDA kernels for GPU acceleration including vectorized pairwise similarity computation (O(Nk)), parallel adaptive thresholding with reduction, efficient sparse edge creation in COO format, multi-head cross-segment attention with shared memory optimization, and bitonic sort for fast top-K retrieval.

from hsgm import get_hsgm_ops
ops = get_hsgm_ops(device='cuda')  # Auto-detects CUDA
similarities = ops.pairwise_similarity(embeddings)

Performance

Complexity

HSGM: O(Nk + (N/k)²) with k ≪ N
Full Graph: O(N²)
Optimal k: √N achieves O(N^(3/2))

Benchmark Results

Model	Accuracy	F1	Time (ms)	Memory (GB)
HSGM	78.5%	85.6%	380	8.2
Longformer	76.8%	84.5%	700	6.8
BigBird	77.1%	84.8%	650	6.5
Full Graph	78.2%	85.1%	1200	12.5

Scalability (20k tokens)

Speedup: 16.5× faster than baseline
Memory: 78% reduction vs Full Graph
Accuracy: 95%+ of baseline performance

Configuration

from config import HSGMConfig

config = HSGMConfig(
    hidden_dim=768,
    segment_size=256,
    local_threshold=0.2,
    global_threshold=0.1,
    top_k_retrieval=5,
    learning_rate=3e-5
)

Datasets

Supported: Document-AMR, OntoNotes-SRL, Legal-ECHR, NarrativeQA, GovReport

Baselines

Transformer: Longformer, BigBird, LongT5, Reformer
Graph: Full Graph, Sliding Window, Graph Transformer
Retrieval: BM25+T5, FiD, SGPT, RAG, REPLUG

Development

git clone https://github.com/FastLM/HSGM.git
cd HSGM
pip install -e .
pip install -r requirements-dev.txt

Running Tests

python -m pytest tests/

Citation

If you use HSGM in your research, please cite our paper:

@inproceedings{liu2025hsgm,
  title={HSGM: Hierarchical Segment-Graph Memory for Scalable Long-Text Semantics},
  author={Liu, Dong and Yu, Yanxuan},
  booktitle={Proceedings of the 14th Joint Conference on Lexical and Computational Semantics (* SEM 2025)},
  pages={328--337},
  year={2025}
}

Name		Name	Last commit message	Last commit date
Latest commit History 28 Commits
__pycache__		__pycache__
experiments		experiments
hsgm		hsgm
LICENSE		LICENSE
README.md		README.md
config.py		config.py
demo.py		demo.py
requirements.txt		requirements.txt
run_demo.py		run_demo.py
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

HSGM: Hierarchical Segment-Graph Memory for Scalable Long-Text Semantics

Abstract

Installation

Prerequisites

Install Dependencies

Download Additional Models

Quick Start

Basic Usage

Query Processing

Incremental Updates

Demo

Training

Train HSGM Model

Available Datasets

Training Options

Experiments

Run Comprehensive Experiments

Architecture

Pipeline

Components & Acceleration

Performance

Complexity

Benchmark Results

Scalability (20k tokens)

Configuration

Datasets

Baselines

Development

Running Tests

Citation

About

Uh oh!

Releases

Packages

Languages

License

FastLM/HSGM

Folders and files

Latest commit

History

Repository files navigation

HSGM: Hierarchical Segment-Graph Memory for Scalable Long-Text Semantics

Abstract

Installation

Prerequisites

Install Dependencies

Download Additional Models

Quick Start

Basic Usage

Query Processing

Incremental Updates

Demo

Training

Train HSGM Model

Available Datasets

Training Options

Experiments

Run Comprehensive Experiments

Architecture

Pipeline

Components & Acceleration

Performance

Complexity

Benchmark Results

Scalability (20k tokens)

Configuration

Datasets

Baselines

Development

Running Tests

Citation

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages