Reification and JSON LD
Sometimes we need to add context around a relationship. An extra level of indirection often helps.
Take for example the statement:
<blade_runner> <stars> <harrison_ford> .
This is an unambiguous statement. However, the same can't be said for the statement:
<hamlet> <stars> <laurence_olivier> .
After all, there may be only one Laurence Olivier, but there have been many productions of Hamlet, both on stage and on film. One solution would be to add more facts about the relationship. Perhaps we could describe the whole statement as a new subject with its own set of predicates, something like:
<larry_in_hamlet> <subject> <hamlet> .
<larry_in_hamlet> <predicate> <stars> .
<larry_in_hamlet> <object> <laurence_olivier> .
<larry_in_hamlet> <occurs> "1948" .
<larry_in_hamlet> <media> "movie" .
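To make the cost of this form concrete, here is a minimal JavaScript sketch that reads the reified statement back. The array-of-triples representation and the `describeStatement` helper are assumptions made for illustration; they are not part of any JSON-LD library.

```javascript
// Each triple is [subject, predicate, object]; the names mirror the statements above.
const triples = [
  ["larry_in_hamlet", "subject", "hamlet"],
  ["larry_in_hamlet", "predicate", "stars"],
  ["larry_in_hamlet", "object", "laurence_olivier"],
  ["larry_in_hamlet", "occurs", "1948"],
  ["larry_in_hamlet", "media", "movie"],
];

// Hypothetical helper: collect every predicate/object pair recorded
// about a single statement node into one plain object.
function describeStatement(allTriples, statementId) {
  const description = {};
  for (const [s, p, o] of allTriples) {
    if (s === statementId) {
      description[p] = o;
    }
  }
  return description;
}

const statement = describeStatement(triples, "larry_in_hamlet");
console.log(statement.subject, statement.predicate, statement.object);
console.log(statement.media, statement.occurs);
```

The original fact only reappears after the three separate triples (subject, predicate, object) have been gathered and stitched back together, which is the extra work the consumer takes on with this model.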
However, that doesn't sit particularly well with the entity-focused nature of JSON (and therefore JSON-LD). Another solution would be to adjust our model slightly and introduce the notion of a Production, so we might now say:
<hamlet> <production> <hamlet_movie_1948> .
<hamlet_movie_1948> <stars> <laurence_olivier> .
This approach is a much better fit for JSON, and it is arguably a more accurate model; at least it is closer to the way we might talk about the matter. The programmer working directly with the JSON now just has to deal with:
{
  "@id": "http://entertainment/hamlet",
  "production": {
    "@id": "http://entertainment/hamlet_movie_1948",
    "stars": {
      "@id": "http://entertainment/laurence_olivier"
    }
  }
}
Working with this in JavaScript the programmer would most likely ignore the "@id" and just use the dot syntax of the language to access the data.
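As a small sketch of that access pattern, the document above can be treated as an ordinary JavaScript object. The variable names are chosen for illustration only.

```javascript
// The JSON document from above, parsed into a plain object.
const hamlet = JSON.parse(`{
  "@id": "http://entertainment/hamlet",
  "production": {
    "@id": "http://entertainment/hamlet_movie_1948",
    "stars": {
      "@id": "http://entertainment/laurence_olivier"
    }
  }
}`);

// Dot syntax walks the nested entities without looking at any identifier.
const star = hamlet.production.stars;

// "@id" is not a valid identifier for dot syntax, so if the identifier
// is needed after all it has to be read with bracket syntax.
console.log(star["@id"]);
```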
In reality we would probably add a JSON-LD context that declares both "production" and "stars" as containers, for example with "@container": "@set". Once that is declared, their values are always rendered in the compacted JSON as arrays, even when there is only a single value. The JavaScript programmer can then consistently use the language's array access syntax, the [] operator, without first checking whether a property holds one value or many. Since a play can have many productions and a production many stars, declaring the containers is a good thing to do in this case.
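A sketch of what this might look like follows. The property IRIs in the context (`http://entertainment/production` and `http://entertainment/stars`) are assumptions made for illustration, and the `compacted` object is written out by hand as the shape a JSON-LD processor would be expected to produce when compacting against that context; no processor is invoked here.

```javascript
// Assumed context: both terms are declared as set containers.
// The property IRIs are hypothetical.
const context = {
  production: { "@id": "http://entertainment/production", "@container": "@set" },
  stars: { "@id": "http://entertainment/stars", "@container": "@set" },
};

// Hand-written compacted form: single values still appear inside arrays.
const compacted = {
  "@context": context,
  "@id": "http://entertainment/hamlet",
  production: [
    {
      "@id": "http://entertainment/hamlet_movie_1948",
      stars: [{ "@id": "http://entertainment/laurence_olivier" }],
    },
  ],
};

// Array syntax works whether there is one production or many.
const firstStarId = compacted.production[0].stars[0]["@id"];

// Collect the identifiers of all stars across all productions.
const allStarIds = compacted.production.flatMap((production) =>
  production.stars.map((person) => person["@id"])
);

console.log(firstStarId, allStarIds.length);
```

Because the shape is the same for one value or many, the consuming code needs no `Array.isArray` checks.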
So the approach we took here was really just a matter of adding an extra level of indirection. This approach is actually used in the NuGet metadata repository, which is built from the ground up with JSON-LD. The core structure of the service is an append-only database, basically a transaction log. New metadata documents are simply added to this ordered structure, and so are edits to existing metadata. The client, however, is only interested in the latest revision of the metadata, the one that contains the latest edits. After all, it naturally wants to display the current description, title, tags and so on.
The solution was to introduce another artifact that represents the current, latest revision of the metadata and have it point at the particular revision in the append-only structure. The client is forced to step one extra level into the structure, but because that step is always the same this turns out not to be a concern. The append-only structure is referred to as the Catalog, the individual revisions as Catalog Entries, and the entity the client accesses as a Registration. The result looks like this:
First the catalog entries:
{
  "@id": "http://nuget.org/catalog/ef.2015.10.31",
  "description": "The first attempt at describing this.",
  "lastUpdate": "2015/10/31"
}
and much later in the structure:
{
  "@id": "http://nuget.org/catalog/ef.2015.12.25",
  "description": "A much better attempt at describing this.",
  "lastUpdate": "2015/12/25"
}
And then the client always binds to:
{
  "@id": "http://nuget.org/registration/ef",
  "catalogEntry": {
    "@id": "http://nuget.org/catalog/ef.2015.12.25",
    "description": "A much better attempt at describing this."
  }
}
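The relationship between the Catalog and a Registration can be sketched in a few lines of JavaScript. This is an in-memory illustration only, not how the NuGet service is implemented; the `appendRevision` function and the `catalog` and `registrations` variables are hypothetical names.

```javascript
// Append-only log of revisions; entries are never modified or removed.
const catalog = [];

// One registration per package, always pointing at the latest catalog entry.
const registrations = new Map();

// Hypothetical helper: record a new revision and re-point the registration.
function appendRevision(packageId, entry) {
  const frozenEntry = Object.freeze({ ...entry });
  catalog.push(frozenEntry);
  registrations.set(packageId, {
    "@id": `http://nuget.org/registration/${packageId}`,
    catalogEntry: frozenEntry,
  });
}

appendRevision("ef", {
  "@id": "http://nuget.org/catalog/ef.2015.10.31",
  description: "The first attempt at describing this.",
  lastUpdate: "2015/10/31",
});
appendRevision("ef", {
  "@id": "http://nuget.org/catalog/ef.2015.12.25",
  description: "A much better attempt at describing this.",
  lastUpdate: "2015/12/25",
});

// The client always takes the same single extra step to reach current metadata.
const current = registrations.get("ef").catalogEntry;
console.log(catalog.length, current.description);
```

Both revisions stay in the catalog, while the client only ever follows `catalogEntry` from the registration.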
Whether a notion of time was added to the "description" property or whether we just have a direct model of revisions is a matter of interpretation. The point is only that the extra nested entity sits more comfortably in the JSON.