Conversational Ontology Builder

Build a formal RDF/OWL ontology through natural conversation. Talk to the agent about your domain — it extracts concepts, relationships, and properties, grounds them in W3C standard vocabularies, and persists everything to a Turtle (.ttl) file that grows with each session.

How it works

The agent uses the Anthropic tool-use API. Rather than dumping the entire ontology into every prompt, it keeps a compact concept index in the system prompt and calls structured tools when it needs to read or write triples:

| Tool | What it does |
| --- | --- |
| check_concept | Looks up a term by label, prefLabel, altLabel (acronyms), or local name |
| add_triples | Parses and merges a Turtle fragment; warns if it looks like instance data |
| remove_triples | Removes specific triples by Turtle fragment |
| remove_subject | Removes all triples for a matched concept (by name or label) |
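
Each tool in the table corresponds to an Anthropic tool definition: a dict with `name`, `description`, and a JSON Schema `input_schema`. The sketch below shows two such definitions and a small dispatcher. The parameter names (`term`, `turtle`), the `dispatch` helper, and the stub handlers are assumptions made for illustration; they are not taken from tools.py.

```python
from typing import Any, Callable

# Hypothetical schemas in the Anthropic tool-use format (name, description, input_schema).
# The parameter names "term" and "turtle" are assumptions, not taken from tools.py.
TOOLS: list[dict[str, Any]] = [
    {
        "name": "check_concept",
        "description": "Look up a term by label, prefLabel, altLabel, or local name.",
        "input_schema": {
            "type": "object",
            "properties": {"term": {"type": "string"}},
            "required": ["term"],
        },
    },
    {
        "name": "add_triples",
        "description": "Parse and merge a Turtle fragment into the ontology.",
        "input_schema": {
            "type": "object",
            "properties": {"turtle": {"type": "string"}},
            "required": ["turtle"],
        },
    },
]


def dispatch(name: str, tool_input: dict[str, Any],
             handlers: dict[str, Callable[..., str]]) -> str:
    """Route a tool request to its handler and return text for the tool result."""
    handler = handlers.get(name)
    if handler is None:
        return f"Unknown tool: {name}"
    return handler(**tool_input)


# Stub handlers standing in for the real ontology manager.
handlers = {
    "check_concept": lambda term: f"No concept found for '{term}'",
    "add_triples": lambda turtle: f"Merged {len(turtle.splitlines())} line(s)",
}

print(dispatch("check_concept", {"term": "SC"}, handlers))
```

Returning a plain string for unknown tools, rather than raising, lets the model see the failure in the tool result and retry with a valid tool name.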

The ontology is a TBox only — abstract schema (classes, properties, hierarchies, SKOS controlled vocabularies). Named individuals, specific places, and people belong in your data layer, not here.

Setup

python -m venv venv && source venv/bin/activate
pip install -r requirements.txt

cp .env.example .env
# edit .env — add your ANTHROPIC_API_KEY

Edit config.toml to set your domain namespace and TTL output path:

[ontology]
namespace = "http://example.org/ontology#"
ttl_path  = "ontologies/domain_ontology.ttl"

[llm]
model      = "claude-sonnet-4-5-20250929"
max_tokens = 4096

Usage

# Start (or resume) a conversation
python -m ontology_builder chat

# Wipe the ontology and start fresh
python -m ontology_builder chat --fresh

# Show stats for an existing TTL file
python -m ontology_builder stats ontologies/domain_ontology.ttl

# Validate an existing TTL file
python -m ontology_builder validate ontologies/domain_ontology.ttl

# Export to another RDF format
python -m ontology_builder export ontologies/domain_ontology.ttl --format json-ld

In-session slash commands

| Command | Effect |
| --- | --- |
| /stats | Class, property, and triple counts |
| /show | Print the current TTL to the terminal |
| /save <path> | Copy the TTL to a different file |
| /reset | Clear the ontology (prompts for confirmation) |
| /quit | Exit |
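
One way to separate slash commands from ordinary chat input is sketched below. The `parse_command` helper and its rules (split on the first space, reject unknown commands) are assumptions for illustration, not the logic in cli.py.

```python
from typing import Optional

# Commands from the table above; the parsing rules are an assumption, not cli.py's logic.
COMMANDS = {"/stats", "/show", "/save", "/reset", "/quit"}


def parse_command(line: str) -> Optional[tuple[str, str]]:
    """Return (command, argument) for a slash command, or None for ordinary chat input."""
    stripped = line.strip()
    if not stripped.startswith("/"):
        return None
    command, _, argument = stripped.partition(" ")
    if command not in COMMANDS:
        raise ValueError(f"Unknown command: {command}")
    return command, argument.strip()


print(parse_command("/save backups/ontology.ttl"))
print(parse_command("What clearances do we have?"))
```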

Example conversation

You: A security clearance is how we grant access to sensitive information.
     SC means Security Clearance. Perm Sec is short for Permanent Secretary.

Agent: ✓ SecurityClearance → prov:Activity + skos:Concept
       ✓ skos:altLabel "SC" recorded — I'll find it by that name in future
       ✓ PermanentSecretary → org:Role, skos:altLabel "Perm Sec", "PS" recorded

You: Remind me what an SC is.

Agent: [calls check_concept("SC"); finds SecurityClearance via altLabel "SC"]
       SecurityClearance is defined as a prov:Activity...

Standard ontologies

Every concept is grounded in at least one of these standard vocabularies (all W3C except Dublin Core Terms, which DCMI maintains):

| Prefix | Vocabulary | Typical use |
| --- | --- | --- |
| time: | OWL-Time | Durations, intervals, retention periods |
| org: | W3C Organization | Departments, roles, units |
| prov: | PROV-O | Activities, agents, processes |
| skos: | SKOS | Controlled vocabularies, concept schemes |
| dcterms: | Dublin Core Terms | Documents, subjects, identifiers |
| owl: | OWL 2 | Class and property declarations |
| xsd: | XML Schema | Datatypes: date, duration, string, integer |

Alias and abbreviation support

check_concept searches skos:altLabel as well as primary labels and local names. The agent is instructed to proactively record known abbreviations when creating a concept, so you can refer to concepts by their shorthand in later turns and lookups will succeed automatically.

Stored example:

:PermanentSecretary a owl:Class ;
    rdfs:label "Permanent Secretary" ;
    skos:altLabel "Perm Sec" ;
    skos:altLabel "PS" .
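
The lookup order described above can be sketched without an RDF library. Here the concepts live in a plain dict rather than the RDFLib graph, prefLabel is omitted for brevity, and matching is case-insensitive; all three choices are assumptions made to keep the example self-contained.

```python
from typing import Optional

# Hypothetical in-memory index: local name -> primary label and alternative labels.
# The real tool reads these from the RDF graph; plain dicts keep the sketch self-contained.
CONCEPTS = {
    "PermanentSecretary": {
        "label": "Permanent Secretary",
        "alt_labels": ["Perm Sec", "PS"],
    },
    "SecurityClearance": {
        "label": "Security Clearance",
        "alt_labels": ["SC"],
    },
}


def check_concept(term: str) -> Optional[str]:
    """Return the local name whose name, label, or altLabel matches term, ignoring case."""
    wanted = term.strip().casefold()
    for local_name, labels in CONCEPTS.items():
        candidates = [local_name, labels["label"], *labels["alt_labels"]]
        if any(wanted == candidate.casefold() for candidate in candidates):
            return local_name
    return None


print(check_concept("perm sec"))
print(check_concept("SC"))
```

Short acronyms such as "PS" can collide across concepts, so a fuller version would probably return every match instead of the first.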

TBox / ABox boundary

The system detects and warns when a triple looks like instance data (an rdf:type assertion where the type is not a schema-level construct such as owl:Class or skos:Concept). The agent is instructed to self-correct on receiving such a warning — replacing the named individual with the class it exemplifies.

Add: :GovernmentDepartment a owl:Class
Not: :CabinetOffice a org:Organization
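
The detection rule can be illustrated on triples that are already parsed into (subject, predicate, object) tuples of IRI strings. This is a sketch under stated assumptions: the `SCHEMA_TYPES` allow-list and the `find_instance_assertions` helper are hypothetical, and the project's actual list of schema-level types may differ.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
ORG = "http://www.w3.org/ns/org#"
EX = "http://example.org/ontology#"

RDF_TYPE = RDF + "type"

# Types treated as schema-level. This allow-list is an assumption; the project's
# actual list may differ.
SCHEMA_TYPES = {
    OWL + "Class",
    OWL + "ObjectProperty",
    OWL + "DatatypeProperty",
    OWL + "AnnotationProperty",
    OWL + "Ontology",
    RDFS + "Class",
    RDFS + "Datatype",
    RDF + "Property",
    SKOS + "Concept",
    SKOS + "ConceptScheme",
}

Triple = tuple[str, str, str]


def find_instance_assertions(triples: list[Triple]) -> list[Triple]:
    """Return rdf:type triples whose object is not a schema-level type."""
    return [t for t in triples if t[1] == RDF_TYPE and t[2] not in SCHEMA_TYPES]


triples = [
    (EX + "GovernmentDepartment", RDF_TYPE, OWL + "Class"),      # schema: accepted
    (EX + "CabinetOffice", RDF_TYPE, ORG + "Organization"),      # instance: flagged
]

for subject, _, rdf_class in find_instance_assertions(triples):
    print(f"Warning: {subject} looks like instance data (typed as {rdf_class})")
```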

Tests

python -m pytest tests/ -q

102 tests, all mocked — no API calls, no cost.

Project layout

ontology_builder/
  config.py              # loads config.toml + .env
  standard_ontologies.py # W3C vocab metadata and grounding hints
  ontology_manager.py    # RDFLib graph — add, remove, query, persist
  tools.py               # Anthropic tool schemas + dispatch
  prompts.py             # system prompt builder
  agent.py               # multi-turn conversation loop
  cli.py                 # Click + Rich terminal interface
ontologies/
  domain_ontology.ttl    # grows with each session (gitignore if sensitive)
tests/
  conftest.py            # fixtures and mock API helpers
  test_ontology_manager.py
  test_abox_detection.py
  test_tools.py
  test_prompts.py
  test_agent.py
config.toml
.env                     # gitignored — put ANTHROPIC_API_KEY here