rdflib-endpoint: easily register custom functions and deploy SPARQL endpoints with RDFLib #1333

@vemonet

Hi RDFLib community, I developed a wrapper to easily expose an RDFLib graph publicly with custom SPARQL functions:

  • deploy a publicly available SPARQL endpoint from an RDFLib Graph using FastAPI
  • easily define and register Python functions as custom SPARQL functions (without needing to be an expert in RDFLib and SPARQL engines)

Check out the package repository: https://github.com/vemonet/rdflib-endpoint

I found an existing project, rdflib-web, but it was not usable for me:

  • Not updated since 2017
  • No documentation (no idea how to declare and start the endpoint)

Features

Here are a few features of the rdflib-endpoint package:

  • Straightforward example for production deployments: quickly deploy your Python functions with Docker
    • Get started by copying this example folder and replacing the example function with your own
    • Use the docker-compose file to quickly deploy it in production (it includes an example using the popular Docker reverse proxy nginx-proxy to deploy publicly in one command, with easy HTTPS encryption thanks to Let's Encrypt)
  • Built as a class inheriting from FastAPI
  • CORS enabled by default
  • It should be mostly SPARQL 1.1 compliant:
    • It was built by reading through the W3C SPARQL 1.1 docs and by querying the endpoint from other SPARQL endpoints (e.g. Virtuoso and GraphDB) to make sure it is recognized as a legitimate SPARQL endpoint
    • The SPARQL service description dynamically lists the functions registered by the developer
  • Bonus: the SPARQL endpoint gets an OpenAPI specification with Swagger UI, and the developer can provide example queries to guide users of the custom SPARQL functions

Example code

Example of code to register a function and start the SPARQL endpoint:

from rdflib_endpoint import SparqlEndpoint
import rdflib
from rdflib.plugins.sparql.evalutils import _eval

def custom_concat(query_results, ctx, part, eval_part):
    # Evaluate the two arguments passed to the SPARQL function as strings
    argument1 = str(_eval(part.expr.expr[0], eval_part.forget(ctx, _except=part.expr._vars)))
    argument2 = str(_eval(part.expr.expr[1], eval_part.forget(ctx, _except=part.expr._vars)))
    # The function yields two results: both concatenation orders
    evaluation = [argument1 + argument2, argument2 + argument1]
    # Bind each result to the target variable and add it to the query results
    for result in evaluation:
        query_results.append(eval_part.merge({part.var: rdflib.Literal(result)}))
    return query_results, ctx, part, eval_part

g = rdflib.graph.ConjunctiveGraph()

app = SparqlEndpoint(
    graph=g,
    # Register the functions:
    functions={
        'https://w3id.org/um/sparql-functions/custom_concat': custom_concat
    },
    # CORS enabled by default
    cors_enabled=True,
    # Metadata used for the service description and Swagger UI:
    title="SPARQL endpoint for RDFLib graph", 
    description="A SPARQL endpoint to serve machine learning models, or any other logic implemented in Python. \n[Source code](https://github.com/vemonet/rdflib-endpoint)",
    version="0.0.1",
    public_url='https://your-endpoint-url/sparql',
    # Example queries displayed in the Swagger UI to help users (markdown)
    example_query="""Example query:\n
```
PREFIX myfunctions: <https://w3id.org/um/sparql-functions/>
SELECT ?concat WHERE {
    BIND("First" AS ?first)
    BIND(myfunctions:custom_concat(?first, "last") AS ?concat)
}
```"""
)
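
For readers who want to try the registered function from a client, here is a minimal sketch of a SPARQL 1.1 Protocol GET request built with the Python standard library only. It assumes the endpoint is already running at the placeholder URL used above; `build_sparql_request` is a hypothetical helper name, not part of rdflib-endpoint.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder URL taken from the example above; replace it with your own deployment.
ENDPOINT_URL = "https://your-endpoint-url/sparql"

QUERY = """PREFIX myfunctions: <https://w3id.org/um/sparql-functions/>
SELECT ?concat WHERE {
    BIND("First" AS ?first)
    BIND(myfunctions:custom_concat(?first, "last") AS ?concat)
}"""


def build_sparql_request(endpoint_url: str, query: str) -> Request:
    """Build a GET request that passes the query in the `query` parameter
    and asks for SPARQL JSON results (hypothetical helper for illustration)."""
    url = endpoint_url + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT_URL, QUERY)

# To actually send it against a running endpoint:
# from urllib.request import urlopen
# print(urlopen(request).read().decode())
```

The request is only built here, not sent, so the snippet can be run without a live endpoint.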

What still needs to be improved

Content negotiation should properly handle all possible formats depending on the query verb. It currently works well with CSV/JSON/XML for Select queries.
For Construct queries it only works with Turtle: for some reason, when serializing the exact same graph with format='xml', RDFLib returns an empty XML file, whereas asking for Turtle returns the expected RDF.
I did not take the time to improve this part because all I am interested in is proper support for federated queries, and those issues are more related to an RDFLib serialization problem (and to be honest, the less I use XML, the happier I am!)
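
To make the idea of content negotiation by query verb concrete, here is a simplified sketch, not the package's actual implementation. It maps an `Accept` header to an RDFLib serializer name, using a different table for tabular results (Select/Ask) and graph results (Construct/Describe). The function name `negotiate` and the two tables are assumptions for illustration, quality values (`q=...`) are ignored, and it does not address the empty XML output described above.

```python
from typing import Tuple

# Media type -> RDFLib serializer name, grouped by the kind of result produced.
# SELECT/ASK return a result table; CONSTRUCT/DESCRIBE return an RDF graph.
RESULT_FORMATS = {
    "application/sparql-results+json": "json",
    "application/sparql-results+xml": "xml",
    "text/csv": "csv",
}
GRAPH_FORMATS = {
    "text/turtle": "turtle",
    "application/rdf+xml": "xml",
    "application/n-triples": "nt",
}


def negotiate(query_verb: str, accept_header: str) -> Tuple[str, str]:
    """Return (media_type, rdflib_format) for a query verb and Accept header.

    Falls back to the first entry of the relevant table when nothing matches.
    """
    verb = query_verb.upper()
    if verb in ("SELECT", "ASK"):
        table = RESULT_FORMATS
    elif verb in ("CONSTRUCT", "DESCRIBE"):
        table = GRAPH_FORMATS
    else:
        raise ValueError(f"Unsupported query verb: {query_verb}")

    # Take the first listed media type that the table supports.
    for item in accept_header.split(","):
        media_type = item.split(";")[0].strip().lower()
        if media_type in table:
            return media_type, table[media_type]

    default_type = next(iter(table))
    return default_type, table[default_type]


print(negotiate("SELECT", "text/csv"))
print(negotiate("CONSTRUCT", "application/rdf+xml, text/turtle;q=0.9"))
```

Keeping the two tables separate avoids the ambiguity of `xml`, which names a results serializer for Select queries and an RDF/XML serializer for Construct queries.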

Anyone interested?

I was wondering if anyone was interested in reusing this package in the RDFLib community? I am already using it for 2 small services:

I was also considering publishing the package as rdflib-endpoint on PyPI. Is this something the RDFLib team would be interested in? (I don't want to mess with package naming and confuse users.)
