Add tests for federated SPARQL queries between the curies mapping service and popular triplestores #53
Open: vemonet wants to merge 32 commits into biopragmatics:main from vemonet:add-federated-queries-test-with-docker
Commits
843541f Add tests to check if federated queries between the curies mapping se… (vemonet)
899d586 Update MANIFEST.in (cthoyt)
6863cc3 Merge branch 'main' into pr/53 (cthoyt)
e77a846 Update test_sparql.py (cthoyt)
67d009e Remove redundant code (cthoyt)
79dd180 Merge branch 'main' into pr/53 (cthoyt)
7a61e43 Update test_sparql.py (cthoyt)
423e9f6 Code cleanup (cthoyt)
e4443c0 Update test_sparql.py (cthoyt)
418fd74 Update test_sparql.py (cthoyt)
41dfdb7 Merge branch 'main' into pr/53 (cthoyt)
eaf2906 Update test_sparql.py (cthoyt)
5f555bf Update test_sparql.py (cthoyt)
62a1f49 try to fix a bit the URLs that have been changed without checking (vemonet)
e32c6da fix the blazegraph local URL in test (vemonet)
984f10c fix federated queries test, only test_from_virtuoso_to_mapping_servic… (vemonet)
0e709cf fix CSV parsing, which fixes all tests (vemonet)
85f8652 Use the same query for test from the mapping service to external trip… (vemonet)
432fc06 improve how triples are defined in init script (vemonet)
477fe35 Add externally configurable tests (cthoyt)
5e5e1e9 Add second generic test (cthoyt)
78fb063 Better configure queries (cthoyt)
3561a57 Update src/curies/mapping_service/utils.py (cthoyt)
58e6021 Cleanup code (cthoyt)
18cb2d7 pass flake8 (cthoyt)
f2b9f74 add federated queries tests for fuseki (vemonet)
dac165c merge (vemonet)
efbfa00 Remove non-generic tests (cthoyt)
3657376 Update test_sparql.py (cthoyt)
7e6d339 Make tests generic and not rely on docker bioregistry (cthoyt)
7b66804 Switch to cases (cthoyt)
4dd192c Merge branch 'main' into add-federated-queries-test-with-docker (cthoyt)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,37 @@ | ||
| version: "3" | ||
| services: | ||
|
|
||
| mapping-service: | ||
| build: | ||
| context: . | ||
| dockerfile: tests/resources/Dockerfile | ||
| ports: | ||
| - 8888:8888 | ||
| volumes: | ||
| - ./src:/app/src | ||
| - ./tests:/app/tests | ||
|
|
||
| blazegraph: | ||
| image: metaphacts/blazegraph-basic:2.2.0-20160908.003514-6-jetty9.4.44-jre8-45dbfff | ||
| ports: | ||
| - 8889:8080 | ||
|
|
||
| virtuoso: | ||
| image: openlink/virtuoso-opensource-7:latest | ||
| ports: | ||
| - 8890:8890 | ||
| environment: | ||
| - DBA_PASSWORD=dba | ||
| - SPARQL_UPDATE=true | ||
| - VIRT_Database_ErrorLogLevel=7 # 7 is maximum logs | ||
| - VIRT_HTTPServer_HTTPLogFile=/http.log | ||
| # https://docs.openlinksw.com/virtuoso/loggingandrecording/ | ||
|
|
||
| fuseki: | ||
| image: stain/jena-fuseki:4.0.0 | ||
| ports: | ||
| - 8891:3030 | ||
| environment: | ||
| - ADMIN_PASSWORD=dba # Admin user: admin | ||
| # - FUSEKI_DATASET_1=mapping # Not working with 4.0.0 | ||
| # - JVM_ARGS=-Xmx2g | ||
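The compose file publishes each container port on a distinct host port (for example `8889:8080` for Blazegraph), so every service ends up with two URLs: one reachable from the host that runs the tests and one reachable from other containers on the compose network. The following is a minimal Python sketch of that mapping. The names `Endpoints`, `split_port_mapping`, and `endpoint_urls` are hypothetical helpers for illustration and are not part of this PR.

```python
from typing import NamedTuple, Tuple


class Endpoints(NamedTuple):
    """Pair of URLs for one service (illustrative, not part of the PR)."""

    local: str  # reachable from the host running the tests
    docker: str  # reachable from other containers on the compose network


def split_port_mapping(mapping: str) -> Tuple[int, int]:
    """Split a compose ``HOST:CONTAINER`` port mapping into two integers."""
    host_port, container_port = mapping.split(":")
    return int(host_port), int(container_port)


def endpoint_urls(service: str, mapping: str, path: str) -> Endpoints:
    """Build the host-side and network-side URLs for a compose service."""
    host_port, container_port = split_port_mapping(mapping)
    return Endpoints(
        local=f"http://localhost:{host_port}{path}",
        docker=f"http://{service}:{container_port}{path}",
    )


# Port mappings and paths copied from the compose file and test constants
blazegraph = endpoint_urls("blazegraph", "8889:8080", "/blazegraph/namespace/kb/sparql")
fuseki = endpoint_urls("fuseki", "8891:3030", "/mapping")
print(blazegraph.local)
print(blazegraph.docker)
```

Inside the compose network only the container port matters, so the host port can be changed freely without touching the Docker-internal URLs.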
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,139 @@ | ||
| """Test cases.""" | ||
|
|
||
| import itertools as itt | ||
| import unittest | ||
| from textwrap import dedent | ||
| from typing import Collection, NamedTuple, Set, Tuple | ||
|
|
||
| from curies.mapping_service.utils import ( | ||
| get_sparql_record_so_tuples, | ||
| get_sparql_records, | ||
| sparql_service_available, | ||
| ) | ||
|
|
||
| # NOTE: SERVICE clauses are resolved inside the Docker network, so federated queries need the Docker-internal URLs | ||
| LOCAL_MAPPING_SERVICE = "http://localhost:8888/sparql" | ||
| LOCAL_BLAZEGRAPH = "http://localhost:8889/blazegraph/namespace/kb/sparql" | ||
| LOCAL_VIRTUOSO = "http://localhost:8890/sparql" | ||
| LOCAL_FUSEKI = "http://localhost:8891/mapping" | ||
|
|
||
| DOCKER_MAPPING_SERVICE = "http://mapping-service:8888/sparql" | ||
| DOCKER_BLAZEGRAPH = "http://blazegraph:8080/blazegraph/namespace/kb/sparql" | ||
| DOCKER_VIRTUOSO = "http://virtuoso:8890/sparql" | ||
| DOCKER_FUSEKI = "http://fuseki:3030/mapping" | ||
|
|
||
| #: Some triplestores are a bit picky on the mime types to use, e.g. blazegraph | ||
| #: SELECT query fails when asking for application/xml, so we need to use a subset | ||
| #: of content types for the federated tests | ||
| TEST_CONTENT_TYPES = { | ||
| "application/json", | ||
| "application/sparql-results+xml", | ||
| "text/csv", | ||
| } | ||
|
|
||
|
|
||
| class TripleStoreConfiguation(NamedTuple): | ||
| """A tuple with information for each triplestore.""" | ||
|
|
||
| local_endpoint: str | ||
| docker_endpoint: str | ||
| mimetypes: Collection[str] | ||
| direct_query_fmts: Collection[str] | ||
| service_query_fmts: Collection[str] | ||
|
|
||
|
|
||
| def get_pairs(endpoint: str, sparql: str, accept: str) -> Set[Tuple[str, str]]: | ||
| """Get a response from a given SPARQL query.""" | ||
| records = get_sparql_records(endpoint=endpoint, sparql=sparql, accept=accept) | ||
| return get_sparql_record_so_tuples(records) | ||
|
|
||
|
|
||
| SPARQL_TO_MAPPING_SERVICE_VALUES = """\ | ||
| PREFIX owl: <http://www.w3.org/2002/07/owl#> | ||
| SELECT DISTINCT ?s ?o WHERE {{ | ||
| SERVICE <{0}> {{ | ||
| VALUES ?s {{ <http://purl.obolibrary.org/obo/CHEBI_24867> <http://purl.obolibrary.org/obo/CHEBI_24868> }} . | ||
| ?s owl:sameAs ?o . | ||
| }} | ||
| }} | ||
| """.rstrip() | ||
|
|
||
| SPARQL_TO_MAPPING_SERVICE_SIMPLE = """\ | ||
| PREFIX owl: <http://www.w3.org/2002/07/owl#> | ||
| SELECT DISTINCT ?s ?o WHERE {{ | ||
| SERVICE <{0}> {{ | ||
| <http://purl.obolibrary.org/obo/CHEBI_24867> owl:sameAs ?o . | ||
| ?s owl:sameAs ?o . | ||
| }} | ||
| }} | ||
| """.rstrip() | ||
|
|
||
| SPARQL_FROM_MAPPING_SERVICE_SIMPLE = """\ | ||
| PREFIX owl: <http://www.w3.org/2002/07/owl#> | ||
| SELECT ?s ?o WHERE {{ | ||
| <http://purl.obolibrary.org/obo/CHEBI_24867> owl:sameAs ?s . | ||
| SERVICE <{0}> {{ | ||
| ?s a ?o . | ||
| }} | ||
| }} | ||
| """.rstrip() | ||
|
|
||
| configurations = { | ||
| "blazegraph": TripleStoreConfiguation( | ||
| local_endpoint=LOCAL_BLAZEGRAPH, | ||
| docker_endpoint=DOCKER_BLAZEGRAPH, | ||
| mimetypes=TEST_CONTENT_TYPES, | ||
| direct_query_fmts=[SPARQL_TO_MAPPING_SERVICE_SIMPLE, SPARQL_TO_MAPPING_SERVICE_VALUES], | ||
| service_query_fmts=[SPARQL_FROM_MAPPING_SERVICE_SIMPLE], | ||
| ), | ||
| "virtuoso": TripleStoreConfiguation( | ||
| local_endpoint=LOCAL_VIRTUOSO, | ||
| docker_endpoint=DOCKER_VIRTUOSO, | ||
| mimetypes=TEST_CONTENT_TYPES, # todo generalize? | ||
| # TODO: Virtuoso fails to resolve VALUES in federated query | ||
| direct_query_fmts=[SPARQL_TO_MAPPING_SERVICE_SIMPLE], | ||
| service_query_fmts=[SPARQL_FROM_MAPPING_SERVICE_SIMPLE], | ||
| ), | ||
| "fuseki": TripleStoreConfiguation( | ||
| local_endpoint=LOCAL_FUSEKI, | ||
| docker_endpoint=DOCKER_FUSEKI, | ||
| mimetypes=TEST_CONTENT_TYPES, | ||
| direct_query_fmts=[SPARQL_TO_MAPPING_SERVICE_SIMPLE, SPARQL_TO_MAPPING_SERVICE_VALUES], | ||
| service_query_fmts=[SPARQL_FROM_MAPPING_SERVICE_SIMPLE], | ||
| ), | ||
| } | ||
|
|
||
|
|
||
| class FederationMixin(unittest.TestCase): | ||
| """Tests federated SPARQL queries.""" | ||
|
|
||
| #: The URL for the mapping service | ||
| mapping_service: str | ||
|
|
||
| def assert_endpoint(self, endpoint: str, query: str, *, accept: str): | ||
| """Assert the endpoint returns favorable results.""" | ||
| records = get_pairs(endpoint, query, accept=accept) | ||
| self.assertIn( | ||
| ("http://purl.obolibrary.org/obo/CHEBI_24867", "https://bioregistry.io/chebi:24867"), | ||
| records, | ||
| ) | ||
|
|
||
| def test_from_triplestore(self): | ||
| """Test federated queries from various triple stores to the CURIEs service.""" | ||
| for name, config in configurations.items(): | ||
| self.assertTrue(sparql_service_available(config.local_endpoint)) | ||
| for mimetype, sparql_fmt in itt.product(config.mimetypes, config.direct_query_fmts): | ||
| sparql = dedent(sparql_fmt.format(self.mapping_service).rstrip()) | ||
| with self.subTest(name=name, mimetype=mimetype, sparql=sparql): | ||
| self.assert_endpoint(config.local_endpoint, sparql, accept=mimetype) | ||
|
|
||
| def test_to_triplestore(self): | ||
| """Test a federated query from the CURIEs service to various triple stores.""" | ||
| for name, config in configurations.items(): | ||
| self.assertTrue(sparql_service_available(config.local_endpoint)) | ||
| for mimetype, sparql_fmt in itt.product(config.mimetypes, config.service_query_fmts): | ||
| sparql = dedent(sparql_fmt.format(config.docker_endpoint).rstrip()) | ||
| with self.subTest(name=name, mimetype=mimetype, sparql=sparql): | ||
| records = get_pairs(self.mapping_service, sparql, accept=mimetype) | ||
| self.assertGreater(len(records), 0) | ||
| # TODO add assert_endpoint here? |
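The query templates above double every SPARQL brace so that `str.format` leaves them as literal braces and substitutes only the `{0}` placeholder with the `SERVICE` endpoint. The tests then pair each formatted query with each content type via `itertools.product`. A small self-contained sketch of that pattern follows; `build_cases` is a hypothetical helper and the two-element content type list is shortened for illustration.

```python
import itertools as itt

# Same shape as the templates in the diff: doubled braces survive str.format
# as literal SPARQL braces, while {0} is replaced by the SERVICE endpoint.
QUERY_TEMPLATE = """\
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT DISTINCT ?s ?o WHERE {{
  SERVICE <{0}> {{
    <http://purl.obolibrary.org/obo/CHEBI_24867> owl:sameAs ?o .
    ?s owl:sameAs ?o .
  }}
}}
""".rstrip()

MIMETYPES = ["application/json", "text/csv"]


def build_cases(endpoint, mimetypes, templates):
    """Return every (mimetype, query) combination for one service endpoint."""
    return [
        (mimetype, template.format(endpoint))
        for mimetype, template in itt.product(mimetypes, templates)
    ]


cases = build_cases("http://mapping-service:8888/sparql", MIMETYPES, [QUERY_TEMPLATE])
for mimetype, query in cases:
    print(mimetype)
    print(query)
```

Each pair corresponds to one `subTest` in the mixin, so a failure report names the triplestore, the content type, and the exact query that was sent.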
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,11 @@ | ||
| FROM python:3.10 | ||
|
|
||
| # Dockerfile used to spawn a mapping service SPARQL endpoint for testing | ||
|
|
||
| WORKDIR /app | ||
|
|
||
| ADD . . | ||
|
|
||
| RUN pip install -e ".[fastapi,rdflib,bioregistry]" | ||
|
|
||
| CMD [ "bioregistry", "web", "--port", "8888", "--host", "0.0.0.0" ] |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,18 @@ | ||
| ## Script to initialize the triplestores started with docker | ||
| # Run it from the root of the repo: ./tests/resources/init_triplestores.sh | ||
|
|
||
| TRIPLES=" | ||
| <https://identifiers.org/CHEBI:24867> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://w3id.org/biolink/vocab/ChemicalEntity> . | ||
| <https://identifiers.org/CHEBI:24868> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://w3id.org/biolink/vocab/ChemicalEntity> . | ||
| " | ||
|
|
||
| echo " 🪄 Load triples to Virtuoso and enable federated queries" | ||
| docker compose exec virtuoso isql -U dba -P dba exec='GRANT "SPARQL_SELECT_FED" TO "SPARQL";' | ||
| docker compose exec virtuoso isql -U dba -P dba exec="SPARQL INSERT IN <https://identifiers.org/CHEBI> { $TRIPLES };" | ||
|
|
||
| echo " ⚡️ Load triples to Blazegraph" | ||
| docker compose exec blazegraph curl -X POST http://localhost:8080/blazegraph/namespace/kb/sparql -d "update=insert data { $TRIPLES }" | ||
|
|
||
| echo " ☕️ Load triples to Fuseki" | ||
| docker compose exec fuseki curl -X POST -u admin:dba -H 'Content-Type: application/x-www-form-urlencoded; charset=UTF-8' 'http://localhost:3030/$/datasets' -d "dbName=mapping&dbType=tdb2" | ||
| docker compose exec fuseki curl -X POST http://localhost:3030/mapping -d "update=insert data { $TRIPLES }" |
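The Blazegraph and Fuseki steps in the script both send a form-encoded `update=` parameter holding an `insert data { ... }` statement. The sketch below shows roughly the same request built in Python with the standard library. It only constructs the request and does not send it; `build_update_request` is a hypothetical helper, and unlike `curl -d` it percent-encodes the form value.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Same two triples as the TRIPLES variable in the init script
TRIPLES = """
<https://identifiers.org/CHEBI:24867> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://w3id.org/biolink/vocab/ChemicalEntity> .
<https://identifiers.org/CHEBI:24868> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://w3id.org/biolink/vocab/ChemicalEntity> .
"""


def build_update_request(endpoint: str, triples: str) -> Request:
    """Build (but do not send) a form-encoded SPARQL UPDATE request."""
    body = urlencode({"update": f"INSERT DATA {{ {triples} }}"}).encode("utf-8")
    return Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


request = build_update_request(
    "http://localhost:8889/blazegraph/namespace/kb/sparql", TRIPLES
)
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request)
```

The URL here is the host-side Blazegraph address from the test constants; the script itself runs `curl` inside each container and therefore uses the container ports.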
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,61 @@ | ||
| """Tests federated SPARQL queries between the curies mapping service and popular triplestores.""" | ||
|
|
||
| import time | ||
| from multiprocessing import Process | ||
| from typing import ClassVar | ||
|
|
||
| import uvicorn | ||
|
|
||
| from curies import Converter | ||
| from curies.mapping_service import get_fastapi_mapping_app | ||
| from curies.mapping_service.utils import sparql_service_available | ||
| from tests import cases | ||
| from tests.test_mapping_service import PREFIX_MAP | ||
|
|
||
|
|
||
| class TestDockerFederation(cases.FederationMixin): | ||
| """Tests federated SPARQL queries between the curies mapping service and Blazegraph/Virtuoso/Fuseki triplestores.
|
|
||
| Run and init the required triplestores locally: | ||
| 1. docker compose up | ||
| 2. ./tests/resources/init_triplestores.sh | ||
| """ | ||
|
|
||
| def setUp(self) -> None: | ||
| """Set up the test case.""" | ||
| self.mapping_service = cases.LOCAL_MAPPING_SERVICE | ||
|
|
||
| if not sparql_service_available(self.mapping_service): | ||
| self.skipTest(f"Mapping service is not available: {self.mapping_service}") | ||
|
|
||
|
|
||
| def _get_app(): | ||
| converter = Converter.from_priority_prefix_map(PREFIX_MAP) | ||
| app = get_fastapi_mapping_app(converter) | ||
| return app | ||
|
|
||
|
|
||
| class TestLocalFederation(cases.FederationMixin): | ||
| """Tests federated SPARQL queries.""" | ||
|
|
||
| host: ClassVar[str] = "localhost" | ||
| port: ClassVar[int] = 8000 | ||
| mapping_service_process: Process | ||
|
|
||
| def setUp(self): | ||
| """Set up the test case.""" | ||
| # Start the curies mapping service SPARQL endpoint | ||
| self.mapping_service_process = Process( | ||
| target=uvicorn.run, | ||
| # uvicorn.run accepts a zero-argument callable that returns an app | ||
| args=(_get_app,), | ||
| kwargs={"host": self.host, "port": self.port, "log_level": "info"}, | ||
| daemon=True, | ||
| ) | ||
| self.mapping_service_process.start() | ||
| time.sleep(5) | ||
|
|
||
| self.mapping_service = f"http://{self.host}:{self.port}/sparql" | ||
|
|
||
| if not sparql_service_available(self.mapping_service): | ||
| self.skipTest(f"Mapping service is not available: {self.mapping_service}") |
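`TestLocalFederation.setUp` waits a fixed `time.sleep(5)` after starting the server process. One possible alternative, sketched here under the assumption that a successful TCP connection is a good enough readiness signal, is to poll the port and return as soon as it accepts connections. `wait_for_port` is a hypothetical helper, not part of the PR or of `curies`, and it checks only the socket, not whether the SPARQL endpoint answers queries.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 5.0, interval: float = 0.1) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: wait briefly and retry
            time.sleep(interval)
    return False


# Demonstration against a throwaway listener on an OS-assigned port
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    open_port = listener.getsockname()[1]
    reachable = wait_for_port("127.0.0.1", open_port, timeout=2.0)

# After the listener is closed, nothing accepts connections on that port
unreachable = wait_for_port("127.0.0.1", open_port, timeout=0.5)
print(reachable, unreachable)
```

In the test case this could replace the fixed sleep, with the existing `sparql_service_available` check still deciding whether to skip.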
Is there a reason we can't just use each service's default port when exposing it outside of Docker?