Problem: O(N²) pairwise comparisons are infeasible for real ontologies.
Task: Introduce a BlockingPipeline before matching.
What to implement:
- SBERT / BM25 top-N candidate selection
- HNSW index (FAISS / hnswlib)
- Configurable candidate size
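As a rough illustration of lexical top-N candidate selection with a configurable candidate size, here is a minimal sketch of a hand-rolled BM25 blocker using only the standard library. The class name `BM25Blocker`, the `top_n` parameter, the whitespace tokenizer, and the defaults `k1=1.5`, `b=0.75` are assumptions for illustration, not existing OntoAligner API.

```python
import math
from collections import Counter


def tokenize(label: str) -> list[str]:
    """Lowercase whitespace tokenizer (assumption: real labels may need more)."""
    return label.lower().split()


class BM25Blocker:
    """Hypothetical blocker: keeps the top_n targets per source label by BM25 score."""

    def __init__(self, targets: list[str], top_n: int = 100,
                 k1: float = 1.5, b: float = 0.75):
        if not targets:
            raise ValueError("targets must not be empty")
        self.targets = targets
        self.top_n = top_n
        self.k1, self.b = k1, b
        # Term frequencies and length of each target label.
        self.docs = [Counter(tokenize(t)) for t in targets]
        self.lengths = [sum(d.values()) for d in self.docs]
        self.avg_len = sum(self.lengths) / len(self.lengths)
        # Document frequency of each term, then a non-negative IDF variant.
        df = Counter(term for d in self.docs for term in d)
        n = len(targets)
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()}

    def score(self, query_terms: set[str], idx: int) -> float:
        doc, length = self.docs[idx], self.lengths[idx]
        norm = self.k1 * (1 - self.b + self.b * length / self.avg_len)
        return sum(
            self.idf[t] * doc[t] * (self.k1 + 1) / (doc[t] + norm)
            for t in query_terms if t in doc
        )

    def candidates(self, source: str) -> list[str]:
        """Return at most top_n target labels sharing terms with the source."""
        terms = set(tokenize(source))
        scored = [(self.score(terms, i), i) for i in range(len(self.targets))]
        # Drop zero-score targets; break score ties by original position.
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
        return [self.targets[i] for _, i in ranked[: self.top_n]]


targets = ["heart disease", "lung cancer", "heart attack", "kidney stone"]
blocker = BM25Blocker(targets, top_n=2)
print(blocker.candidates("heart disease"))
```

This sketch still scores every target per query, but the scan is cheap arithmetic rather than a model call. An embedding-based variant (SBERT vectors in an HNSW index) would replace the linear scan with approximate nearest-neighbor search while keeping the same `top_n` knob.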
Pipeline:
blocking → matching → reranking
The goal of blocking is to avoid comparing everything to everything: first reduce the search space, then run the alignment on what remains.
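One possible shape for the three-stage pipeline is sketched below. The stage signatures, the `candidate_size` field, and the toy stages (token-overlap blocker, `difflib` string matcher, sort-by-score reranker) are all assumptions chosen to keep the example self-contained; a real implementation would plug in SBERT / BM25 blocking and a stronger matcher.

```python
import difflib
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical stage signatures, not existing OntoAligner types.
Blocker = Callable[[str, list[str], int], list[str]]
Matcher = Callable[[str, str], float]
Scored = list[tuple[str, float]]
Reranker = Callable[[str, Scored], Scored]


@dataclass
class BlockingPipeline:
    blocker: Blocker
    matcher: Matcher
    reranker: Reranker
    candidate_size: int = 100  # configurable number of candidates per source

    def align(self, sources: list[str],
              targets: list[str]) -> dict[str, Optional[tuple[str, float]]]:
        alignment = {}
        for source in sources:
            # Blocking: cheap filter down to at most candidate_size targets.
            candidates = self.blocker(source, targets, self.candidate_size)
            # Matching: the expensive scorer runs only on the candidates.
            scored = [(t, self.matcher(source, t)) for t in candidates]
            # Reranking: final ordering; the first entry is the chosen match.
            ranked = self.reranker(source, scored)
            alignment[source] = ranked[0] if ranked else None
        return alignment


def token_overlap_blocker(source: str, targets: list[str], n: int) -> list[str]:
    """Toy blocker: rank targets by number of shared lowercase tokens."""
    src = set(source.lower().split())
    scored = [(len(src & set(t.lower().split())), t) for t in targets]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
    return [t for _, t in ranked[:n]]


def string_matcher(source: str, target: str) -> float:
    """Toy matcher: character-level similarity in [0, 1]."""
    return difflib.SequenceMatcher(None, source.lower(), target.lower()).ratio()


def sort_reranker(source: str, scored: Scored) -> Scored:
    """Toy reranker: order candidates by matcher score, best first."""
    return sorted(scored, key=lambda pair: -pair[1])


pipeline = BlockingPipeline(token_overlap_blocker, string_matcher,
                            sort_reranker, candidate_size=2)
result = pipeline.align(["heart attack", "unknown thing"],
                        ["heart attack", "heart disease", "lung cancer"])
print(result)
```

Keeping each stage a plain callable means the blocker can be swapped (BM25, SBERT + HNSW) without touching the matcher or reranker.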
Naive approach:
    for each source concept:
        compare with every target concept
OntoAligner approach:
Step 1: Blocking (cheap filtering)
    reduce 10,000 targets → top 100 candidates
Step 2: Matching (smarter model)
    run alignment only on those 100
Step 3: Reranking (final refinement)
    pick the best match from those 100
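To make the saving concrete, the short calculation below counts expensive matcher calls with and without blocking. The 10,000 targets and 100 candidates come from the steps above; the figure of 10,000 source concepts is an assumption added for illustration, as is the helper name `matcher_calls`.

```python
from typing import Optional


def matcher_calls(n_sources: int, n_targets: int,
                  candidate_size: Optional[int] = None) -> int:
    """Number of expensive matcher calls, with or without blocking."""
    # Without blocking every target is compared; with blocking at most
    # candidate_size targets per source reach the matcher.
    per_source = n_targets if candidate_size is None else min(candidate_size, n_targets)
    return n_sources * per_source


naive = matcher_calls(10_000, 10_000)                        # all pairs
blocked = matcher_calls(10_000, 10_000, candidate_size=100)  # top 100 only
print(naive, blocked, naive // blocked)
```

Under these assumptions, blocking cuts matcher calls from 100,000,000 to 1,000,000, a 100x reduction. The blocking step itself still has a cost, which is why it needs to be cheap (BM25 or an approximate nearest-neighbor index).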