MaastrichtU-IDS/AI-Assisted-Knowledge-Graph-Metadata-Curation

AI-Assisted Knowledge Graph Metadata Curation

Through a within-subjects study, we evaluate a form-based, AI-assisted approach to knowledge graph metadata creation. This approach is implemented as a tool that presents the fields from a recently proposed KG metadata specification in a structured form. For each metadata element, it generates suggestions using LLM-based retrieval-augmented generation from user-provided documentation. These suggestions follow the required structure and value formats of the specification. Users can then review, accept, or reject the suggestions before finalizing the metadata.

This repository contains the KG documentation, the user study form, the user study questionnaire responses, the metadata submitted by participants (task outputs), the KG metadata spreadsheet, the analysis scripts, and supplementary Excel files. The source code for the tools is available at: https://github.com/MaastrichtU-IDS/knowledge-graph-metadata-form.

The tools are available at the following links:

The AI-assisted form: https://maastrichtu-ids.github.io/knowledge-graph-metadata-form/?mode=llm

Standard form: https://maastrichtu-ids.github.io/knowledge-graph-metadata-form/?mode=regular

Turtle editor: https://maastrichtu-ids.github.io/knowledge-graph-metadata-form/?mode=turtle

To reproduce the analysis results, run the notebook Scripts/User_study_Analysis.ipynb.

Tool A – Turtle Editor

This tool provides a text editor with Turtle syntax validation.
Here, you can manually create the metadata in Turtle format.

Steps:

  1. Skim the KG documentation to understand what it describes.
  2. Open the metadata schema checklist and note which elements you must include.
  3. Gather exact values, such as IRIs for themes, sources, and linked resources when available.
  4. Map each element from the schema to a simple Turtle statement, keeping the format short and consistent.
  5. Use the editor’s validation to check that your Turtle is syntactically correct.
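
As an illustration of step 4, here is a minimal Python sketch that turns a few field values into Turtle statements. It is not part of the tool. The predicate names (`dcterms:title`, `dcterms:description`, `dcat:theme`), the `dcat:Dataset` class, the `wd:Q2013` subject, and the `en` language tag are assumptions chosen for illustration; check the metadata specification for the terms it actually requires.

```python
# Prefixes used in this sketch; choosing DCAT and Dublin Core terms is an assumption.
PREFIXES = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dcterms": "http://purl.org/dc/terms/",
    "wd": "http://www.wikidata.org/entity/",
}


def literal(text, lang=None):
    """Format a Turtle string literal, escaping backslashes, quotes, and newlines."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"@{lang}' if lang else f'"{escaped}"'


def iri(value):
    """Wrap a full IRI in angle brackets."""
    return f"<{value}>"


def to_turtle(subject, statements):
    """Serialize (predicate, object) pairs about one subject as Turtle text."""
    lines = [f"@prefix {prefix}: <{ns}> ." for prefix, ns in PREFIXES.items()]
    lines.append("")
    body = " ;\n".join(f"    {pred} {obj}" for pred, obj in statements)
    lines.append(f"{subject} a dcat:Dataset ;\n{body} .")
    return "\n".join(lines)


# Values taken from the Wikidata examples in the specification excerpt.
turtle = to_turtle("wd:Q2013", [
    ("dcterms:title", literal("Wikidata Knowledge Base", lang="en")),
    ("dcterms:description", literal(
        "A free and open knowledge base that can be read and edited "
        "by humans and machines.", lang="en")),
    ("dcat:theme", iri("http://www.wikidata.org/entity/Q21198")),
])
print(turtle)
```

Keeping one predicate per line, separated by semicolons, makes it easier to compare the result against the schema checklist before pasting it into the editor for validation.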

Example screenshot of Tool A:

Turtle Editor


Tool B – Form Interface

In this tool, you can describe a KG using a web form.
You do not need to write any Turtle syntax; the form automatically converts your answers into a structured format.

Steps:

  1. Skim the KG documentation.
  2. Fill in the field values according to the metadata specification.

Example screenshot of Tool B:

form


Tool C – AI Assisted Form

The third tool uses AI assistance to help you fill in the metadata form.
You must upload the natural-language KG documentation, and the tool will generate AI suggestions for each metadata field.

Steps:

  1. Upload the KG documentation.
  2. Click on each metadata field to view the AI suggestions.
  3. Review the suggestions: add the ones you accept and reject the ones you do not.
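
The review step can be pictured as a small accept/reject loop per field. The Python sketch below is a hypothetical model of that workflow, not the tool's actual implementation. The `FieldReview` class, the `finalize` helper, and the second title suggestion are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FieldReview:
    """Review state for one metadata field (hypothetical structure)."""
    name: str
    suggestions: list[str]
    accepted: list[str] = field(default_factory=list)
    rejected: list[str] = field(default_factory=list)

    def decide(self, value: str, accept: bool) -> None:
        """Record the user's decision for one pending suggestion."""
        if value not in self.suggestions:
            raise ValueError(f"{value!r} was not suggested for {self.name}")
        self.suggestions.remove(value)
        (self.accepted if accept else self.rejected).append(value)


def finalize(reviews):
    """Keep only accepted values; fields with no accepted value are left out."""
    return {r.name: r.accepted for r in reviews if r.accepted}


# Two suggestions for the title: one is accepted, one is rejected.
title = FieldReview("Title", ["Wikidata Knowledge Base", "Wikidata dump files"])
title.decide("Wikidata Knowledge Base", accept=True)
title.decide("Wikidata dump files", accept=False)

# A field whose suggestion has not been reviewed yet stays out of the result.
theme = FieldReview("Theme / Category", ["http://www.wikidata.org/entity/Q21198"])

metadata = finalize([title, theme])
print(metadata)
```

Only values that were explicitly accepted reach the final metadata, which mirrors the instruction that nothing is added without review.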

Example screenshot of Tool C:

AI-Assisted Form

Metadata Specification

Below is a sample of five fields from the metadata specification, showing how each element is defined, its expected data type, its purpose, and an example from Wikidata.

You can access the full KG metadata specification here on Google Sheets.

Metadata Specification (Excerpt)

| Field | Value Specification | Purpose / Use | Wikidata Metadata Example |
|---|---|---|---|
| Identifier | rdfs:Literal or IRI | A unique identifier for the dataset. | wd:Q2013 |
| Title | rdf:LangString or xsd:string | The main name of the dataset or knowledge graph. | "Wikidata Knowledge Base" |
| Description | rdf:LangString or xsd:string | Provides a short explanation of the dataset content and scope. | "A free and open knowledge base that can be read and edited by humans and machines." |
| Theme / Category | IRI | Specifies the topical area or classification of the dataset. | <http://www.wikidata.org/entity/Q21198> (Ontology) |
| Distribution Information | dcat:Distribution (includes sub-elements: title, description, mediaType, downloadURL, accessURL) | Describes how and where the dataset is made available. | title: "Wikidata dump files"; mediaType: "application/gzip"; downloadURL: https://dumps.wikimedia.org/wikidatawiki/entities/ |

Note: The schema specifies which elements are required and which are optional, and defines the expected value type for each.
For instance, a title is free text, a theme is an IRI, a date follows a standard date format, and a distribution includes structured subfields like mediaType, downloadURL, and accessURL.
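
To make these value types concrete, here is a rough Python sketch of the kind of checks they imply. It is an illustration only and does not reproduce the tool's validation. It assumes that "standard date format" means ISO 8601 (`YYYY-MM-DD`) and that IRIs are absolute `http`/`https` addresses; the function names are hypothetical.

```python
from datetime import date
from urllib.parse import urlparse


def is_iri(value):
    """Rough check: an absolute http(s) IRI with a host (assumed restriction)."""
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def is_iso_date(value):
    """True if the string parses as an ISO 8601 calendar date."""
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        return False


def check_distribution(dist):
    """Return a list of problems found in a distribution's subfields."""
    problems = []
    for key in ("downloadURL", "accessURL"):
        if key in dist and not is_iri(dist[key]):
            problems.append(f"{key} is not an IRI")
    if "mediaType" in dist and "/" not in dist["mediaType"]:
        problems.append("mediaType does not look like type/subtype")
    return problems


# The Wikidata distribution example from the excerpt passes these checks.
wikidata_dump = {
    "title": "Wikidata dump files",
    "mediaType": "application/gzip",
    "downloadURL": "https://dumps.wikimedia.org/wikidatawiki/entities/",
}
print(check_distribution(wikidata_dump))
print(is_iri("http://www.wikidata.org/entity/Q21198"), is_iso_date("2024-01-15"))
```

A label such as "Ontology" fails the IRI check, which is why the theme field expects the Wikidata entity address and not its name.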

Note: The “optional” and “required” indicators on this form apply only to the final version of the tool. For the purposes of this user study, please disregard them and complete as many fields as you are able to.

The structural relationships between the elements of the specification are shown below:

Structural relationships between components of the specification.

