This repository hosts the materials developed for research on parallel evolution of ontologies and ontology alignments. This research has been performed as a collaboration between the Knowledge Engineering Group (KEG) and the Ontology Engineering Group (OEG).
Evol_Align is a Python-based framework for generating, managing, and reviewing ontology alignments using Large Language Models (LLMs). The project integrates multiple LLM providers (OpenAI, Gemini, Ollama) to create SSSOM (Simple Standard for Sharing Ontology Mappings) formatted alignment sets with structured outputs.
Prerequisites:
- Python 3.8+
- pip package manager

Installation:
- Clone the repository:

```shell
git clone https://github.com/DiegoCondeHerreros/Evol_Align.git
cd Evol_Align
```

- Install required dependencies:

```shell
pip install -r requirements.txt
```

Key dependencies include:
- `openai` - OpenAI API client
- `ollama` - Ollama LLM interface
- `google-genai` - Google Gemini API client
- `pydantic` - Data validation using Python type annotations
- `rdflib` - RDF/OWL ontology handling
- Configure API credentials:

Create an `api_key.txt` file in the repository root with the following structure:

```json
[
  {
    "OpenAI": {
      "API_KEY": "your_openai_api_key",
      "Models": {
        "gpt-4": ["temperature", "top_p", "max_tokens"],
        "gpt-3.5-turbo": ["temperature", "max_tokens"]
      }
    },
    "Gemini": {
      "API_KEY": "your_gemini_api_key",
      "Models": {
        "gemini-2.0-flash": ["temperature"],
        "gemini-1.5-pro": ["temperature"]
      }
    },
    "Ollama": {
      "Models": {
        "llama2": ["temperature", "top_k", "top_p"],
        "mistral": ["temperature"]
      }
    }
  }
]
```

- Replace placeholder values with your actual API keys
- For Ollama (local LLM), no API_KEY is required; the tool will automatically pull models
- Supported parameters depend on each provider's API specifications
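As an illustration of how a file with this structure could be consumed, the sketch below reads the JSON and drops any parameter that is not listed for the chosen model. `load_config` and `filter_params` are hypothetical helper names introduced here, not functions from the repository; how `llm_interface.py` actually reads the file is not shown in this README.

```python
import json
from pathlib import Path


def load_config(path="api_key.txt"):
    """Read the credentials file; it holds a JSON list with one object."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return data[0]


def filter_params(config, family, model, params):
    """Keep only the parameters listed for the given model in the config."""
    models = config[family]["Models"]
    if model not in models:
        raise KeyError(f"{model!r} is not configured for {family!r}")
    allowed = set(models[model])
    return {name: value for name, value in params.items() if name in allowed}


# Inline stand-in for the file contents, so the sketch runs without api_key.txt.
example_config = json.loads("""
[{"OpenAI": {"API_KEY": "your_openai_api_key",
             "Models": {"gpt-3.5-turbo": ["temperature", "max_tokens"]}},
  "Ollama": {"Models": {"mistral": ["temperature"]}}}]
""")[0]

requested = {"temperature": 0.7, "top_p": 0.9, "max_tokens": 2000}
print(filter_params(example_config, "OpenAI", "gpt-3.5-turbo", requested))
```

Here `top_p` is dropped because it is not listed for `gpt-3.5-turbo` in the example configuration.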
Use the LLM interface to generate ontology alignments:

```python
from llm_interface import LLM
from structured_outputs import SSSOMAlignmentStrictCore

# Initialize LLM
llm = LLM(
    model_family="OpenAI",  # or "Gemini", "Ollama"
    model="gpt-4",
    params={"temperature": 0.7, "max_tokens": 2000},
    context=None,  # Optional: list of ontology files for context
)

# Create messages for alignment generation
messages = [
    {"role": "system", "content": "You are an ontology alignment expert."},
    {"role": "user", "content": "Generate alignments between these ontologies..."},
]

# Get structured response
response = llm.prompt(messages, SSSOMAlignmentStrictCore, context=None)
```

Review generated alignments and provide manual curation:

```shell
python alignement_review.py -a path/to/alignments.ttl
```

Process:
- Load existing SSSOM alignments in Turtle format
- Review each alignment one by one
- For each mapping, provide feedback:
  - y (Yes): Accept the alignment
  - n (No): Reject the alignment
  - r (Requires Refinement): Flag for further refinement
- Provide justification comments for your decisions
- The output file is saved to the `output/` directory with reviewer metadata
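The feedback step above can be pictured as a small validation routine. The following is a minimal sketch using only the standard library; `ReviewRecord`, `record_decision`, and the decision labels are hypothetical names chosen for illustration and are not taken from `alignement_review.py`.

```python
from dataclasses import dataclass

# Decision codes taken from the review process described above.
DECISIONS = {"y": "accepted", "n": "rejected", "r": "requires_refinement"}


@dataclass
class ReviewRecord:
    subject_id: str
    object_id: str
    decision: str
    justification: str


def record_decision(subject_id, object_id, answer, justification):
    """Validate a reviewer's answer and turn it into a review record."""
    key = answer.strip().lower()
    if key not in DECISIONS:
        raise ValueError(f"expected one of y/n/r, got {answer!r}")
    return ReviewRecord(subject_id, object_id, DECISIONS[key], justification.strip())


record = record_decision(
    "http://example.org/ontology1/Class1",
    "http://example.org/ontology2/ClassA",
    "R",
    "Labels match, but the definitions differ in scope.",
)
print(record.decision)
```

Normalizing the answer before the lookup lets the sketch accept `R` and ` r ` alike, while anything outside y/n/r is rejected rather than silently recorded.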
Define and validate alignment data using Pydantic models:

```python
from structured_outputs import (
    SSSOMAlignmentStrictCore,
    MappingRow,
    MappingPredicate,
    SemaPVJustification,
)

# Create a mapping row
mapping = MappingRow(
    subject_id="http://example.org/ontology1/Class1",
    object_id="http://example.org/ontology2/ClassA",
    predicate_id=MappingPredicate.skos_exact_match,
    mapping_justification=SemaPVJustification.lexical_matching,
    confidence=0.95,
)

# Create a complete alignment set
alignment_set = SSSOMAlignmentStrictCore(
    mapping_set_id="http://example.org/alignments/set1",
    license="http://creativecommons.org/licenses/by/4.0/",
    subject_source="http://example.org/ontology1",
    object_source="http://example.org/ontology2",
    subject_type="owl:Class",
    object_type="owl:Class",
    mappings=[mapping],
)
```

The Simple Standard for Sharing Ontology Mappings (SSSOM) is used throughout this project. For more information, visit: https://w3id.org/sssom/
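To make concrete what validating a mapping means, here is a rough standard-library stand-in for the kind of checks a Pydantic model can enforce. It is not the repository's `MappingRow`: `MappingSketch` and `ALLOWED_PREDICATES` are hypothetical names, the predicate set is only an illustrative subset, and the actual field constraints in `structured_outputs.py` are not shown in this README. The one rule assumed here, that confidence lies between 0 and 1, follows the SSSOM definition of `confidence`.

```python
from dataclasses import dataclass

# Illustrative subset only; the repository's enum may contain other values.
ALLOWED_PREDICATES = {
    "skos:exactMatch",
    "skos:closeMatch",
    "skos:broadMatch",
    "skos:narrowMatch",
}


@dataclass(frozen=True)
class MappingSketch:
    subject_id: str
    predicate_id: str
    object_id: str
    confidence: float

    def __post_init__(self):
        # Reject values outside the assumed vocabulary and range.
        if self.predicate_id not in ALLOWED_PREDICATES:
            raise ValueError(f"unknown predicate: {self.predicate_id}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must lie between 0 and 1")


ok = MappingSketch(
    "http://example.org/ontology1/Class1",
    "skos:exactMatch",
    "http://example.org/ontology2/ClassA",
    0.95,
)

try:
    MappingSketch(ok.subject_id, ok.predicate_id, ok.object_id, 1.5)
except ValueError as err:
    print(f"rejected: {err}")
```

Checking at construction time means an out-of-range value from an LLM response fails early instead of ending up in a saved alignment set.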
Ontology file formats:
- OWL format (.owl files)
- Turtle/RDF format (.ttl files)
- RDF/XML format (.rdf files)

Alignment formats:
- SSSOM Turtle format (.ttl) - RDF representation with SSSOM metadata
- JSON/Pydantic format - Structured Python objects

OpenAI:
- Requires a valid OpenAI API key
- Supports structured output via response schema
- Recommended models: gpt-4, gpt-3.5-turbo

Gemini:
- Requires a valid Google GenAI API key
- Supports file uploads for ontology context
- Recommended models: gemini-2.0-flash, gemini-1.5-pro

Ollama:
- No API key required
- Runs locally on your machine
- Automatically downloads models on first use
- Recommended models: llama2, mistral, neural-chat
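The key requirements listed above can be checked up front before any model is called. The sketch below assumes a configuration dictionary shaped like the object inside `api_key.txt`; `REQUIRES_KEY` and `usable_providers` are hypothetical names, not part of the repository.

```python
# Which providers need an API key, per the provider notes above.
REQUIRES_KEY = {"OpenAI": True, "Gemini": True, "Ollama": False}


def usable_providers(config):
    """Return the providers in `config` that have what they need to run."""
    usable = []
    for family, settings in config.items():
        if family not in REQUIRES_KEY:
            continue  # unknown provider, skip it
        key = settings.get("API_KEY", "")
        if REQUIRES_KEY[family] and not key:
            continue  # key required but missing or empty
        usable.append(family)
    return usable


config = {
    "OpenAI": {"API_KEY": "", "Models": {"gpt-4": ["temperature"]}},
    "Gemini": {"API_KEY": "your_gemini_api_key", "Models": {"gemini-2.0-flash": ["temperature"]}},
    "Ollama": {"Models": {"mistral": ["temperature"]}},
}
print(usable_providers(config))
```

This only tests that a key is present; a placeholder string still counts as present, and whether a key is actually valid can only be confirmed by the provider.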

Typical workflow:
- Prepare ontologies - Place your ontology files in `testOntologies/`
- Configure LLM - Set up API keys in `api_key.txt`
- Generate alignments - Use `llm_interface.py` to generate SSSOM mappings
- Review alignments - Use `alignement_review.py` for manual curation
- Export results - Save reviewed alignments to `output/`

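For the first step, a short helper can gather the ontology files to work with. This is a sketch under two assumptions: that the files sit directly in `testOntologies/`, and that they use the `.owl`, `.ttl`, or `.rdf` extensions listed earlier. `find_ontologies` is a hypothetical name, not a function from the repository.

```python
from pathlib import Path

# Extensions of the ontology file formats listed in this README.
ONTOLOGY_SUFFIXES = {".owl", ".ttl", ".rdf"}


def find_ontologies(directory="testOntologies"):
    """Return ontology files found directly in `directory`, sorted by path."""
    root = Path(directory)
    if not root.is_dir():
        return []  # nothing to do if the folder does not exist
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in ONTOLOGY_SUFFIXES
    )


for path in find_ontologies():
    print(path.name)
```

The resulting paths could then be offered as the optional `context` argument shown in the usage example, although the exact form that argument expects is not documented here.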
- SSSOM Standard: https://w3id.org/sssom/
- OWL Ontology Language: https://www.w3.org/OWL/
- RDF/Turtle Format: https://www.w3.org/TR/turtle/
- OAEI (Ontology Alignment Evaluation Initiative): http://oaei.ontologymatching.org/
Contributions are welcome! Please ensure code follows the existing structure and includes proper documentation.