🧩 AutoRDF2GML

AutoRDF2GML is a framework designed to convert RDF data into graph representations suitable for graph-based machine learning (GML) methods, such as Graph Neural Networks (GNNs). By generating both content-based features from RDF datatype properties and topology-based features from RDF object properties, AutoRDF2GML enables effective integration of Semantic Web technologies with Graph Machine Learning.

🌟 Key Features

Content-Based Node Features: Automatically extract node features from RDF datatype properties.
Topology-Based Edge Features: Derive edge features from RDF object properties.
User-Friendly Interface: Modular design with automatic feature selection for simplicity and ease of use.
Graph ML Integration: Seamlessly integrates with leading frameworks like PyTorch Geometric and DGL.
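
The split between content and topology can be illustrated with a minimal sketch. It assumes a toy in-memory triple representation (the `IRI`, `Literal`, and `Triple` classes and the `split_triples` helper are hypothetical and not part of AutoRDF2GML): triples whose object is a literal correspond to datatype properties and are candidates for node features, while triples whose object is an IRI correspond to object properties and are candidates for edges.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str


@dataclass(frozen=True)
class Triple:
    subject: IRI
    predicate: IRI
    obj: Union[IRI, Literal]


def split_triples(triples: List[Triple]) -> Tuple[List[Triple], List[Triple]]:
    """Separate triples with literal objects from triples with IRI objects."""
    content, topology = [], []
    for triple in triples:
        if isinstance(triple.obj, Literal):
            content.append(triple)   # datatype property: node feature candidate
        else:
            topology.append(triple)  # object property: edge candidate
    return content, topology


# Hypothetical toy data, not taken from any AutoRDF2GML dataset.
EX = "http://example.org/"
triples = [
    Triple(IRI(EX + "paper1"), IRI(EX + "title"), Literal("Graph learning on RDF")),
    Triple(IRI(EX + "paper1"), IRI(EX + "hasAuthor"), IRI(EX + "author1")),
    Triple(IRI(EX + "author1"), IRI(EX + "name"), Literal("A. Author")),
]

content, topology = split_triples(triples)
print("content triples:", len(content))
print("topology triples:", len(topology))
```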

📥 Installation via pip

AutoRDF2GML is available via pip. To install it, run:

pip install autordf2gml

For detailed usage instructions, check https://pypi.org/project/autordf2gml/.
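
To confirm that the installation succeeded, one option is to query the installed distribution from Python. The sketch below uses only the standard library; the `installed_version` helper is a hypothetical name introduced for illustration, and the distribution name is the one used in the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


found = installed_version("autordf2gml")
if found is None:
    print("autordf2gml is not installed in this environment")
else:
    print("autordf2gml version:", found)
```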

Quick User Guide

For a step-by-step guide on using the framework, see our example and example-topologyfeatures directories.

Usage

To use AutoRDF2GML, you need (1) an RDF file and (2) a config file that describes the transformation. In the config file, define the RDF classes and properties relevant to your project. Once configured, run the AutoRDF2GML script to generate a heterogeneous graph dataset for your machine learning application.
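
As a rough illustration of what such a config file might contain, the sketch below parses an INI-style configuration with Python's standard `configparser`. All section and key names here (`INPUT`, `NODE.paper`, `EDGE.paper_author`, `rdf_file`, and so on) are hypothetical placeholders chosen for this example; the actual names are documented in config-template.ini.

```python
import configparser

# Hypothetical config content: section and key names are placeholders for
# illustration only. The real ones are documented in config-template.ini.
EXAMPLE_CONFIG = """
[INPUT]
rdf_file = data/my-graph.nt

[NODE.paper]
class = http://example.org/Paper
datatype_properties = http://example.org/title, http://example.org/year

[EDGE.paper_author]
property = http://example.org/hasAuthor
start_node = paper
end_node = author
"""


def load_config(text: str) -> configparser.ConfigParser:
    """Parse INI-formatted text into a ConfigParser object."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser


def split_list(value: str) -> list:
    """Split a comma-separated config value into a list of stripped items."""
    return [item.strip() for item in value.split(",") if item.strip()]


config = load_config(EXAMPLE_CONFIG)
node_sections = [s for s in config.sections() if s.startswith("NODE.")]
edge_sections = [s for s in config.sections() if s.startswith("EDGE.")]

print("RDF file:", config["INPUT"]["rdf_file"])
for section in node_sections:
    properties = split_list(config[section]["datatype_properties"])
    print(section, "->", config[section]["class"], properties)
for section in edge_sections:
    print(section, "->", config[section]["property"])
```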

The output can then be used for various machine learning tasks, including node classification, link prediction, and graph classification, and it can be readily integrated into common graph machine learning frameworks. For example, this script shows how the output from AutoRDF2GML can be loaded into a PyTorch Geometric (PyG) HeteroData object. The structure of the loaded HeteroData object is available as a directed graph here and as an undirected graph here.
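
The difference between the directed and the undirected variant can be sketched without any dependencies. The example below mirrors the layout that a PyG HeteroData object uses (one feature matrix per node type, one edge index per `(source type, relation, target type)` triple) with plain dictionaries, and derives an undirected variant by adding a reversed copy of every edge type. The toy graph, the `add_reverse_edges` helper, and the `rev_` prefix are assumptions made for this sketch, not output of AutoRDF2GML.

```python
from typing import Dict, List, Tuple

EdgeType = Tuple[str, str, str]
EdgeIndex = Tuple[List[int], List[int]]  # (source node ids, target node ids)


def add_reverse_edges(edge_index_dict: Dict[EdgeType, EdgeIndex]) -> Dict[EdgeType, EdgeIndex]:
    """Return a copy that also contains a reversed counterpart of every edge type."""
    undirected = dict(edge_index_dict)
    for (src, rel, dst), (sources, targets) in edge_index_dict.items():
        # Swap source and target ids and register the edge under a new relation name.
        undirected[(dst, "rev_" + rel, src)] = (list(targets), list(sources))
    return undirected


# Hypothetical toy graph: 2 papers with 2 features each, 3 authors with 1 feature each.
x_dict = {
    "paper": [[0.1, 0.2], [0.3, 0.4]],
    "author": [[1.0], [0.0], [0.5]],
}
edge_index_dict = {
    ("paper", "has_author", "author"): ([0, 0, 1], [0, 1, 2]),
}

undirected = add_reverse_edges(edge_index_dict)
for edge_type, (sources, targets) in undirected.items():
    print(edge_type, list(zip(sources, targets)))
```

With PyG installed, the same dictionaries would map onto a HeteroData object by assigning each feature matrix to the `x` attribute of its node type and each edge index to the `edge_index` attribute of its edge type.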

Feature Configuration

Content-based Node Features

Quick example for the content-based node feature transformation: see the example directory.

AutoRDF2GML with content-based node features is implemented in the Python script autordf2gml-cb.py. The configuration file template and its documentation are provided in config-template.ini. The default model for computing embeddings from natural language descriptions is SciBERT, but other Hugging Face BERT variants (e.g., bert-base) can also be used.
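
The general idea of turning datatype property values into a node feature vector can be sketched as follows. This is not the logic of autordf2gml-cb.py: the `node_features` helper is hypothetical, and `toy_text_embedding` is a deliberately simple hashing stand-in for a language model such as SciBERT, used here only so that the example runs without downloading a model. Numeric values are passed through as floats, and text values are replaced by their embedding.

```python
import hashlib
from typing import Callable, Dict, List, Union

Value = Union[int, float, str]


def toy_text_embedding(text: str, dim: int = 8) -> List[float]:
    """Stand-in for a language model: count tokens hashed into `dim` buckets."""
    vector = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).hexdigest()
        vector[int(digest, 16) % dim] += 1.0
    return vector


def node_features(
    properties: Dict[str, Value],
    embed: Callable[[str], List[float]] = toy_text_embedding,
) -> List[float]:
    """Concatenate numeric values and text embeddings in sorted property order."""
    features: List[float] = []
    for name in sorted(properties):
        value = properties[name]
        if isinstance(value, (int, float)):
            features.append(float(value))      # numeric literal: use as-is
        else:
            features.extend(embed(str(value)))  # text literal: embed
    return features


# Hypothetical datatype property values of a single node.
paper = {"title": "graph neural networks", "year": 2024}
features = node_features(paper)
print(len(features), features)
```

Passing a different `embed` callable is how a real sentence or document encoder would be plugged into this sketch.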

Topology-based Node Features

Quick example for the topology-based node feature transformation: see the example-topologyfeatures directory.

AutoRDF2GML with topology-based node features is implemented in the Python script autordf2gml-tb.py. The configuration file template and its documentation are provided in config-template.ini. The following KG embedding models can be used to compute the topology-based features: TransE, DistMult, ComplEx, and RotatE. The default parameters (hidden channel size: 128) are defined and commented in the implementation.
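
For orientation, the four models differ mainly in how they score a triple (head, relation, tail). The sketch below writes out the commonly cited scoring functions in plain Python. It is not the code used in autordf2gml-tb.py, and it makes two simplifying assumptions: TransE uses the L2 norm (the L1 norm is also common), and RotatE sums the complex moduli over all dimensions. The function names are hypothetical.

```python
import cmath
import math
from typing import List


def transe_score(h: List[float], r: List[float], t: List[float]) -> float:
    """TransE: negative L2 distance between h + r and t."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def distmult_score(h: List[float], r: List[float], t: List[float]) -> float:
    """DistMult: trilinear dot product of h, r, and t."""
    return sum(hi * ri * ti for hi, ri, ti in zip(h, r, t))


def complex_score(h: List[complex], r: List[complex], t: List[complex]) -> float:
    """ComplEx: real part of the trilinear product with the conjugated tail."""
    return sum(hi * ri * ti.conjugate() for hi, ri, ti in zip(h, r, t)).real


def rotate_score(h: List[complex], phases: List[float], t: List[complex]) -> float:
    """RotatE: negative distance after rotating h by the relation phases."""
    return -sum(abs(hi * cmath.exp(1j * p) - ti) for hi, p, ti in zip(h, phases, t))


# Toy two-dimensional embeddings; real embeddings would have, e.g., 128 dimensions.
print(transe_score([1.0, 0.0], [0.0, 1.0], [1.0, 1.0]))
print(distmult_score([1.0, 2.0], [3.0, 4.0], [5.0, 6.0]))
print(complex_score([1 + 1j], [1j], [1 - 1j]))
print(rotate_score([1 + 0j], [math.pi / 2], [1j]))
```

In all four cases a higher score means the triple is considered more plausible; the learned entity embeddings are what serve as topology-based node features.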

🤝 Contributing

We welcome contributions of any kind!

📄 License

AutoRDF2GML is available under the MIT License, making it open and accessible for both personal and commercial use.

GML Datasets

GML Dataset LPWC
  DOI: 10.5281/zenodo.10299366
  License: CC BY-SA 4.0
GML Dataset SOA-SW
  DOI: 10.5281/zenodo.10299429
  License: Creative Commons Zero (CC0)
GML Dataset AIFB
  DOI: 10.5281/zenodo.10989595
  License: CC BY 4.0
GML Dataset LinkedMDB
  DOI: 10.5281/zenodo.10989683
  License: CC BY 4.0

📞 Contact & Reference

Michael Färber, David Lamprecht, Yuni Susanti: "AutoRDF2GML: Facilitating RDF Integration in Graph Machine Learning", Proceedings of the 23rd International Semantic Web Conference (ISWC'24), Baltimore, USA.
