Preselection / Blocking (new step) #93

Description

@HamedBabaei

Problem: O(N²) pairwise comparisons are infeasible for real ontologies.

Task: Introduce a BlockingPipeline before matching.

What to implement

  • SBERT / BM25 top-N candidate selection
  • HNSW index (FAISS / hnswlib)
  • Configurable candidate size
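As a rough illustration of the BM25 option with a configurable candidate size, here is a minimal pure-Python sketch. The names `BM25Blocker`, `tokenize`, and `top_n` are hypothetical and not part of OntoAligner; the `k1` and `b` defaults are common BM25 settings, assumed here rather than taken from this issue. A real implementation would more likely use a dedicated BM25 library or an SBERT encoder with an HNSW index.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: whitespace tokenization of lowercased concept labels.
    return text.lower().split()


class BM25Blocker:
    """Hypothetical blocker: ranks target labels against a source label with BM25."""

    def __init__(self, target_labels, k1=1.5, b=0.75):
        self.labels = list(target_labels)
        self.docs = [tokenize(t) for t in self.labels]
        self.k1, self.b = k1, b
        total_len = sum(len(d) for d in self.docs)
        self.avg_len = (total_len / max(len(self.docs), 1)) or 1.0
        self.tf = [Counter(d) for d in self.docs]
        # Document frequency: number of target labels containing each term.
        df = Counter(term for d in self.docs for term in set(d))
        n = len(self.docs)
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()}

    def score(self, query_tokens, i):
        tf, length = self.tf[i], len(self.docs[i])
        norm = self.k1 * (1 - self.b + self.b * length / self.avg_len)
        return sum(
            self.idf[t] * tf[t] * (self.k1 + 1) / (tf[t] + norm)
            for t in query_tokens
            if t in tf
        )

    def top_n(self, source_label, n=100):
        # Keep only targets sharing at least one term, best score first.
        q = tokenize(source_label)
        scored = [(self.score(q, i), i) for i in range(len(self.docs))]
        scored = [(s, i) for s, i in scored if s > 0]
        scored.sort(key=lambda p: (-p[0], p[1]))
        return [self.labels[i] for _, i in scored[:n]]


targets = ["heart muscle", "cardiac valve", "lung tissue", "heart valve", "kidney"]
blocker = BM25Blocker(targets)
candidates = blocker.top_n("heart valve", n=2)
```

In this sketch the candidate size is the `n` argument of `top_n`, so it can be tuned per run without rebuilding the index.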

Pipeline:

blocking → matching → reranking

The goal of blocking is to avoid comparing every source concept with every target concept: first reduce the search space, then run the alignment on what remains.

Naive approach:

for each source concept:
    compare with every target concept
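To make the cost of the naive loop concrete, the sketch below counts how many comparisons it performs. The function `naive_match` is hypothetical, and `difflib.SequenceMatcher` stands in for whatever matcher is actually used; the point is only that the number of calls equals the number of sources times the number of targets.

```python
from difflib import SequenceMatcher


def naive_match(sources, targets):
    """Compare every source label with every target label (all-pairs)."""
    comparisons = 0
    best = {}
    for s in sources:
        best_score, best_target = -1.0, None
        for t in targets:
            # Stand-in similarity; a real matcher would be far more expensive.
            score = SequenceMatcher(None, s.lower(), t.lower()).ratio()
            comparisons += 1
            if score > best_score:
                best_score, best_target = score, t
        best[s] = best_target
    return best, comparisons


sources = ["heart valve", "renal artery"]
targets = ["cardiac valve", "heart valve", "kidney artery", "lung"]
best, comparisons = naive_match(sources, targets)
```

With 2 sources and 4 targets this makes 8 comparisons. The count grows with the product of the two ontology sizes, which is what blocking is meant to avoid.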

OntoAligner approach:

Step 1: Blocking (cheap filtering)
    reduce 10,000 targets → top 100 candidates

Step 2: Matching (smarter model)
    run alignment only on those 100

Step 3: Reranking (final refinement)
    pick best match from those 100
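The three steps above could be wired together roughly as follows. This is a sketch under stated assumptions, not the proposed `BlockingPipeline` API: the functions `block`, `match`, `rerank`, and `align` are hypothetical, blocking uses a cheap token-overlap (Jaccard) score, and `difflib.SequenceMatcher` is a placeholder for the smarter matching model.

```python
from difflib import SequenceMatcher


def block(source, targets, candidate_size):
    """Step 1: cheap filtering. Keep the targets with the highest token overlap."""
    s_tokens = set(source.lower().split())

    def overlap(target):
        t_tokens = set(target.lower().split())
        union = s_tokens | t_tokens
        return len(s_tokens & t_tokens) / len(union) if union else 0.0

    return sorted(targets, key=overlap, reverse=True)[:candidate_size]


def match(source, candidates):
    """Step 2: score only the blocked candidates (placeholder for a real model)."""
    return [
        (c, SequenceMatcher(None, source.lower(), c.lower()).ratio())
        for c in candidates
    ]


def rerank(scored):
    """Step 3: pick the best-scoring candidate, or None if there are none."""
    return max(scored, key=lambda pair: pair[1]) if scored else None


def align(sources, targets, candidate_size=100):
    """Run blocking, matching, and reranking for each source concept."""
    result = {}
    for s in sources:
        candidates = block(s, targets, candidate_size)
        result[s] = rerank(match(s, candidates))
    return result


targets = ["cardiac valve", "heart valve", "kidney artery", "lung"]
alignment = align(["heart valve"], targets, candidate_size=2)
```

Keeping each stage as a separate function means the blocking method (BM25, SBERT, HNSW) can be swapped without touching matching or reranking, and `candidate_size` remains the single knob that trades recall against cost.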

Labels: enhancement