This is an easy-to-follow workflow for creating key-value-based summaries of input RDF ontologies using the Gilgamesh Summarizer. For more information on the tool, visit: gilgamesh-summarizer
For each key attribute in the JSON, exactly one file path must be provided.
| Attributes | Info | Value Type | Required |
|---|---|---|---|
| data_file | Any valid RDF format | list | ✔ |
| ontology_file | Any valid RDF format | list | ✔ |
    {
      "inputs": {
        "data_file": [
          "XXXXXXXX-bucket/temp1.csv"
        ],
        "ontology_file": [
          "XXXXXXXX-bucket/intermediate.json"
        ]
      }
    }
- data_file: The input knowledge graph data
- ontology_file: The input knowledge graph's RDF schema
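The rule that each input key must carry exactly one file path can be enforced before the tool is invoked. Below is a minimal Python sketch; the `build_inputs` helper and the bucket paths are illustrative, not part of the Gilgamesh Summarizer API:

```python
import json

def build_inputs(data_file: str, ontology_file: str) -> dict:
    """Assemble the 'inputs' section of the request payload."""
    inputs = {
        "data_file": [data_file],          # the input knowledge graph data
        "ontology_file": [ontology_file],  # the graph's RDF schema
    }
    # Each key attribute must hold exactly one file path.
    for key, paths in inputs.items():
        if len(paths) != 1:
            raise ValueError(f"{key}: exactly one file path required, got {len(paths)}")
    return {"inputs": inputs}

payload = build_inputs("XXXXXXXX-bucket/temp1.csv", "XXXXXXXX-bucket/intermediate.json")
print(json.dumps(payload, indent=2))
```
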
In addition to the inputs, the following parameters can be provided.
| Attributes | Info | Value Type | Required |
|---|---|---|---|
| prune_topk_nodes | Remove the nodes with the most edges from clusters. Increasing it reduces execution time but also reduces accuracy | integer | |
| max_cluster_size | Set the maximum cluster size. Increasing it reduces execution time but also reduces accuracy | integer | |
An example of parameters for ontology summarization:
    {
      "inputs": {
        "data_file": [
          "XXXXXXXX-bucket/temp1.csv"
        ],
        "ontology_file": [
          "XXXXXXXX-bucket/intermediate.json"
        ]
      },
      "outputs": {
        "mappings_file": "/path/to/write/the/file.json"
      },
      "parameters": {
        "prune_topk_nodes": 100,
        "max_cluster_size": 500
      }
    }
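The full request above can also be assembled programmatically. The sketch below is a hedged example: `build_request` is a hypothetical helper, and the positive-integer check merely reflects the values used in the example (100 and 500):

```python
import json

# Parameter defaults taken from the example request above.
DEFAULT_PARAMETERS = {"prune_topk_nodes": 100, "max_cluster_size": 500}

def build_request(inputs: dict, mappings_file: str, **overrides) -> str:
    """Merge parameter overrides over the example defaults and serialize to JSON."""
    parameters = {**DEFAULT_PARAMETERS, **overrides}
    for name, value in parameters.items():
        if not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be a positive integer, got {value!r}")
    request = {
        "inputs": inputs,
        "outputs": {"mappings_file": mappings_file},
        "parameters": parameters,
    }
    return json.dumps(request, indent=2)

req = build_request(
    {"data_file": ["XXXXXXXX-bucket/temp1.csv"],
     "ontology_file": ["XXXXXXXX-bucket/intermediate.json"]},
    "/path/to/write/the/file.json",
    max_cluster_size=250,  # smaller cluster cap: slower run, higher accuracy
)
```

Note the trade-off documented in the table above: raising either parameter shortens execution time at the cost of accuracy.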
The output provides the file where the mappings generated by the ontology summarization process are stored, alongside some basic metrics such as the number of key-value pairs collected and the number of affected predicates in the ontology. E.g.:
    {
      "message": "Tool Executed Successfully",
      "outputs": {
        "mappings": "/path/to/write/the/file.json"
      },
      "metrics": {
        "numKVPairs": 3,
        "affectedPredicates": 10
      },
      "status": "success"
    }
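A caller can consume this response by checking the status field before reading the mappings path and the metrics. A small Python sketch; the field names follow the example response above, and `parse_result` is an illustrative helper rather than part of the tool:

```python
import json

def parse_result(raw: str) -> dict:
    """Extract the mappings path and metrics from a successful response."""
    result = json.loads(raw)
    if result.get("status") != "success":
        raise RuntimeError(f"summarization failed: {result.get('message')}")
    return {
        "mappings": result["outputs"]["mappings"],
        "num_kv_pairs": result["metrics"]["numKVPairs"],
        "affected_predicates": result["metrics"]["affectedPredicates"],
    }

# Example response, mirroring the document above.
raw = json.dumps({
    "message": "Tool Executed Successfully",
    "outputs": {"mappings": "/path/to/write/the/file.json"},
    "metrics": {"numKVPairs": 3, "affectedPredicates": 10},
    "status": "success",
})
summary = parse_result(raw)
```
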