Summary
As of elixir-europe/biovalidator#87, Biovalidator (in my fork) now accepts JSON Schema Draft 2020-12. I was looking forward to integrating the Beacon v2 model JSON Schemas when I ran into a broken reference:
https://raw.githubusercontent.com/ga4gh-beacon/beacon-v2/main/models/json/beacon-v2-default-model/common/beaconCommonComponents.json
I'm writing all these details here hoping to get some insight from you as the Beacon maintainers, because I'm pulling my hair out trying to find the broken reference.
Details
In a nutshell, Biovalidator fetches references ($ref) over the network and compiles them into the schema. It does this iteratively until every reference is compiled, and only then validates the data.
When the schema I feed to Biovalidator references any Beacon v2 JSON Schema, Biovalidator starts compiling the (mostly relative) internal references in your model and hits a broken one. It requests beaconCommonComponents.json under models/json/beacon-v2-default-model/common/, where the file doesn't exist, instead of under framework/json/common/, where I did find it in main.
My guess is that one of the JSON Schemas in the model's common/ directory contains a relative reference to beaconCommonComponents.json that gets resolved against that file's own location (models/json/beacon-v2-default-model/common/), producing the broken URL above. Either that, or Biovalidator is resolving references incorrectly, which would surprise me a lot but is a possibility.
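To make the process concrete, here is a minimal Python sketch of that kind of iterative resolution. It is not Biovalidator's actual implementation: the function names (`iter_refs`, `crawl`, `fetch_json`) are hypothetical, and it ignores `$id`, which in JSON Schema can change the base URI a relative `$ref` resolves against. What it adds over a plain compile step is that it remembers which document referenced each URL that failed to load.

```python
import json
import urllib.request
from collections import deque
from urllib.parse import urldefrag, urljoin


def iter_refs(node):
    """Yield every "$ref" string found anywhere inside a parsed JSON value."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "$ref" and isinstance(value, str):
                yield value
            else:
                yield from iter_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_refs(item)


def fetch_json(url):
    """Download and parse one JSON document (raises on HTTP errors such as 404)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def crawl(root_url, fetch=fetch_json):
    """Breadth-first walk over all documents reachable through "$ref".

    Returns (fetched, broken): `fetched` maps URL -> parsed document, and
    `broken` maps each URL that failed to load -> the URL of the first
    document found referencing it.
    """
    fetched, broken = {}, {}
    queue = deque([(root_url, None)])
    while queue:
        url, referrer = queue.popleft()
        if url in fetched or url in broken:
            continue
        try:
            fetched[url] = fetch(url)
        except Exception:
            broken[url] = referrer
            continue
        for ref in iter_refs(fetched[url]):
            # Resolve relative to the referencing document and drop "#fragment".
            target, _fragment = urldefrag(urljoin(url, ref))
            queue.append((target, url))
    return fetched, broken
```

Calling `crawl` with the defaultSchema.json URL from the reproduction steps should, under these assumptions, leave the failing URL as a key in `broken` and the file that referenced it as the value.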
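The resolution rule behind this guess can be checked with Python's standard `urljoin`, which follows the usual relative-URL rules. The two `$ref` values below are hypothetical (I don't know which form the offending file actually uses), and commonDefinitions.json is used as the base only because the logs show it being fetched from the model's common/ directory.

```python
from urllib.parse import urljoin

MODEL_COMMON = (
    "https://raw.githubusercontent.com/ga4gh-beacon/beacon-v2/main/"
    "models/json/beacon-v2-default-model/common/"
)
# Any schema living in the model's common/ directory would behave the same.
base = MODEL_COMMON + "commonDefinitions.json"

# Hypothetical same-directory reference: resolves inside models/.../common/.
same_dir = urljoin(base, "./beaconCommonComponents.json")

# Hypothetical reference that climbs four levels up to the repository root
# and then descends into framework/json/common/.
framework = urljoin(
    base, "../../../../framework/json/common/beaconCommonComponents.json"
)

print(same_dir)
print(framework)
```

The first result is exactly the URL that returns 404 in the logs, and the second is the location where the file exists in main.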
Here is what I tried that didn't work:
- I increased the verbosity of the Biovalidator logs (a lot) to figure out which JSON file contains the broken reference. Unfortunately, I still couldn't pinpoint it.
- I forked the Beacon v2 model and redacted chunks of it to identify which file has the broken reference, but it seems to be anything that references the common JSON Schemas, which in turn reference a bunch of other stuff. Whether or not a schema is used for a specific validation, if there's a $ref, it gets compiled. And all objects reference common in some way.
- I checked all beaconCommonComponents.json references, relative or absolute, within the repository, and at a glance I still couldn't pinpoint which one resolves against common instead of framework.
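The manual check in the last item could also be scripted against a local checkout. The following is a rough sketch under stated assumptions: `find_refs` and `iter_refs` are hypothetical helpers, the checkout path and base URL are supplied by the caller, and `$id` is ignored even though it can change the base URI a `$ref` resolves against.

```python
import json
from pathlib import Path
from urllib.parse import urljoin


def iter_refs(node):
    """Yield every "$ref" string found anywhere inside a parsed JSON value."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "$ref" and isinstance(value, str):
                yield value
            else:
                yield from iter_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_refs(item)


def find_refs(repo_root, needle, base_url):
    """List (file, raw $ref, resolved URL) for each $ref that mentions `needle`.

    `base_url` is the URL the repository root is served from and must end
    with "/" so that file paths are appended rather than replacing a segment.
    """
    root = Path(repo_root)
    hits = []
    for path in sorted(root.rglob("*.json")):
        try:
            document = json.loads(path.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError):
            continue  # skip files that are not valid JSON
        relative = path.relative_to(root).as_posix()
        file_url = urljoin(base_url, relative)
        for ref in iter_refs(document):
            if needle in ref:
                # Resolve the reference against the file that contains it.
                hits.append((relative, ref, urljoin(file_url, ref)))
    return hits
```

Running `find_refs` on a clone with `needle="beaconCommonComponents.json"` and `base_url="https://raw.githubusercontent.com/ga4gh-beacon/beacon-v2/main/"`, then keeping only the hits whose resolved URL lacks `/framework/`, should narrow the search to the files whose reference lands in the model's common/ directory.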
Do you recall any outdated reference to ../common/beaconCommonComponents.json, perhaps in an old commit or somewhere else that might still be picked up? When you validate Beacon metadata against the JSON Schemas, do you compile all the $ref targets prior to validation? If so, does your schema compiler not complain about any broken reference?
How to reproduce
- Install Biovalidator from my fork and start the server:
git clone git@github.com:M-casado/biovalidator.git
cd biovalidator
git checkout dev
npm install
node src/biovalidator
- Open a different terminal.
- Create a test JSON file to validate:
cat > beacon_test.json <<'EOF'
{
  "schema": {
    "$id": "Beacon test",
    "$ref": "https://raw.githubusercontent.com/ga4gh-beacon/beacon-v2/main/models/json/beacon-v2-default-model/cohorts/defaultSchema.json"
  },
  "data": {}
}
EOF
- Trigger validation:
curl --data @beacon_test.json -H "Content-Type: application/json" -X POST "http://localhost:3020/validate" | jq
Logs
Validation log:
$ curl --data @beacon_test.json -H "Content-Type: application/json" -X POST "http://localhost:3020/validate" | jq
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 427 100 229 100 198 66299 57324 --:--:-- --:--:-- --:--:-- 138k
{
"error": "Failed to compile schema: {\"error\":\"Failed to resolve $ref: https://raw.githubusercontent.com/ga4gh-beacon/beacon-v2/main/models/json/beacon-v2-default-model/common/beaconCommonComponents.json, status: undefined\"}"
}
Biovalidator local server logs:
$ node src/biovalidator
2025-03-17T11:59:19.133Z [info] Custom keywords successfully added. Number of custom keywords: 5
2025-03-17T11:59:19.182Z [info] ---------------------------------------------
2025-03-17T11:59:19.183Z [info] ------------ ELIXIR biovalidator ------------
2025-03-17T11:59:19.183Z [info] ---------------------------------------------
2025-03-17T11:59:19.184Z [info] Started server on port 3020 with base URL /
2025-03-17T11:59:19.184Z [info] Server available at http://localhost:3020/
2025-03-17T11:59:19.184Z [info] PID file is available at .../biovalidator/server.pid
2025-03-17T11:59:19.184Z [info] Writing logs to: .../biovalidator/logs/
2025-03-17T11:59:22.477Z [info] New validation request: Malformed data. Please provide both 'schema' and 'data' in request body.
2025-03-17T11:59:30.879Z [info] New validation request: Malformed data. Please provide both 'schema' and 'data' in request body.
2025-03-17T11:59:52.660Z [info] Saving compiled schema in cache, $id: Beacon test
2025-03-17T11:59:52.870Z [info] Returning referenced schema from network : https://raw.githubusercontent.com/ga4gh-beacon/beacon-v2/main/models/json/beacon-v2-default-model/cohorts/defaultSchema.json
2025-03-17T11:59:52.871Z [debug] Skipping meta-schema fetch: https://json-schema.org/draft/2020-12/schema
2025-03-17T11:59:53.148Z [info] Returning referenced schema from network : https://raw.githubusercontent.com/ga4gh-beacon/beacon-v2/main/framework/json/common/ontologyTerm.json
2025-03-17T11:59:53.465Z [info] Returning referenced schema from network : https://raw.githubusercontent.com/ga4gh-beacon/beacon-v2/main/framework/json/common/beaconCommonComponents.json
unknown format "datetime" ignored in schema at path "#/properties/eventDate"
unknown format "datetime" ignored in schema at path "#/properties/eventDate"
unknown format "datetime" ignored in schema at path "#/properties/eventTimeline/properties/end"
unknown format "datetime" ignored in schema at path "#/properties/eventTimeline/properties/end"
unknown format "datetime" ignored in schema at path "#/properties/eventTimeline/properties/start"
unknown format "datetime" ignored in schema at path "#/properties/eventTimeline/properties/start"
2025-03-17T11:59:53.799Z [info] Returning referenced schema from network : https://raw.githubusercontent.com/ga4gh-beacon/beacon-v2/main/models/json/beacon-v2-default-model/common/ageRange.json
2025-03-17T11:59:54.115Z [info] Returning referenced schema from network : https://raw.githubusercontent.com/ga4gh-beacon/beacon-v2/main/models/json/beacon-v2-default-model/common/age.json
2025-03-17T11:59:54.433Z [info] Returning referenced schema from network : https://raw.githubusercontent.com/ga4gh-beacon/beacon-v2/main/models/json/beacon-v2-default-model/common/disease.json
2025-03-17T11:59:54.720Z [info] Returning referenced schema from network : https://raw.githubusercontent.com/ga4gh-beacon/beacon-v2/main/models/json/beacon-v2-default-model/common/timeElement.json
2025-03-17T11:59:55.051Z [info] Returning referenced schema from network : https://raw.githubusercontent.com/ga4gh-beacon/beacon-v2/main/models/json/beacon-v2-default-model/common/gestationalAge.json
2025-03-17T11:59:55.337Z [info] Returning referenced schema from network : https://raw.githubusercontent.com/ga4gh-beacon/beacon-v2/main/models/json/beacon-v2-default-model/common/commonDefinitions.json
2025-03-17T11:59:55.651Z [info] Returning referenced schema from network : https://raw.githubusercontent.com/ga4gh-beacon/beacon-v2/main/models/json/beacon-v2-default-model/common/timeInterval.json
2025-03-17T11:59:55.962Z [error] Failed to retrieve referenced schema: https://raw.githubusercontent.com/ga4gh-beacon/beacon-v2/main/models/json/beacon-v2-default-model/common/beaconCommonComponents.json, {"message":"Request failed with status code 404","name":"AxiosError","stack":"AxiosError: Request failed with status code 404\n at settle (.../biovalidator/node_modules/axios/dist/node/axios.cjs:2031:12)\n at IncomingMessage.handleStreamEnd (.../biovalidator/node_modules/axios/dist/node/axios.cjs:3148:11)\n at IncomingMessage.emit (node:events:529:35)\n at endReadableNT (node:internal/streams/readable:1400:12)\n at process.processTicksAndRejections (node:internal/process/task_queues:82:21)\n at Axios.request (.../biovalidator/node_modules/axios/dist/node/axios.cjs:4258:41)\n at process.processTicksAndRejections (node:internal/process/task_queues:95:5)","config":{"transitional":{"silentJSONParsing":true,"forcedJSONParsing":true,"clarifyTimeoutError":false},"adapter":["xhr","http","fetch"],"transformRequest":[null],"transformResponse":[null],"timeout":0,"xsrfCookieName":"XSRF-TOKEN","xsrfHeaderName":"X-XSRF-TOKEN","maxContentLength":-1,"maxBodyLength":-1,"env":{},"headers":{"Accept":"application/json, text/plain, */*","User-Agent":"axios/1.8.3","Accept-Encoding":"gzip, compress, deflate, br"},"method":"get","url":"https://raw.githubusercontent.com/ga4gh-beacon/beacon-v2/main/models/json/beacon-v2-default-model/common/beaconCommonComponents.json","responseType":"json","allowAbsoluteUrls":true},"code":"ERR_BAD_REQUEST","status":404}
2025-03-17T11:59:55.964Z [error] Failed to compile schema: {"error":"Failed to resolve $ref: https://raw.githubusercontent.com/ga4gh-beacon/beacon-v2/main/models/json/beacon-v2-default-model/common/beaconCommonComponents.json, status: undefined"}
2025-03-17T11:59:55.965Z [error] An error occurred while running the validation: {"error":"Failed to compile schema: {\"error\":\"Failed to resolve $ref: https://raw.githubusercontent.com/ga4gh-beacon/beacon-v2/main/models/json/beacon-v2-default-model/common/beaconCommonComponents.json, status: undefined\"}"}
2025-03-17T11:59:55.968Z [error] New validation request: Server failed to process data: {"error":"Failed to compile schema: {\"error\":\"Failed to resolve $ref: https://raw.githubusercontent.com/ga4gh-beacon/beacon-v2/main/models/json/beacon-v2-default-model/common/beaconCommonComponents.json, status: undefined\"}"}