Extract data from the API

ProgramX-NPledger edited this page Nov 3, 2024 · 17 revisions

Exporting content programmatically requires the Taggloo API and either the dataExporter or the administrator role. If you have neither of these roles, you will receive a 401 response.

Exporting

It is possible to extract data from a number of endpoints, according to the data type required.

  • Data is extracted in pages. Request the first page with a limit on the number of items returned; the response reports how many items there are in total and, if more are available, a link to the next page using the same page size.
  • All responses have a links collection, which contains sufficient information to move around the Taggloo graph. The rel of a link can be:
    • self refers to the request itself, should it be performed again.
    • nextpage refers to the next page, if available and using the same criteria and request parameters as the current page.
    • previouspage refers to the previous page, if available and using the same criteria and request parameters as the current page.
  • Each request contains its results in the results array.
  • Results which are paged return three properties:
    • fromIndex is the index, within the wider collection, of the first item returned in this page.
    • totalItemsCount is the total number of available items, which is useful when calculating how many pages will be required. (Request successive pages, increasing offsetIndex by pageSize each time, until offsetIndex >= totalItemsCount.)
    • pageSize is the number of items per page (note: not the number of items returned in this page, so this value may be greater than the number of items in results).
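
The paging arithmetic described above can be sketched as follows. This is a minimal illustration rather than part of the Taggloo API: the function names page_offsets and page_count are hypothetical, and indexes are assumed to be zero-based, as the sample responses (fromIndex of 0 for the first page) suggest.

```python
import math
from typing import Iterator


def page_offsets(total_items_count: int, page_size: int) -> Iterator[int]:
    """Yield the offsetIndex of every page needed to cover the collection."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    offset = 0
    # Indexes are zero-based, so stop as soon as the offset reaches the total.
    while offset < total_items_count:
        yield offset
        offset += page_size


def page_count(total_items_count: int, page_size: int) -> int:
    """Number of requests needed to read every item."""
    return math.ceil(total_items_count / page_size)


print(list(page_offsets(25, 10)))  # [0, 10, 20]
print(page_count(25, 10))          # 3
```

The final page may hold fewer items than pageSize, which is why the loop is driven by totalItemsCount rather than by the number of items returned.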

Languages

The Languages configured in the Taggloo database may be exported using the languages endpoint.

Get multiple Languages

Retrieve Languages in Taggloo using the following endpoint:

GET https://taggloo.im/api/v4/languages[?[ietfLanguageCode=en-GB&][offsetIndex=0&][pageSize=10]]

Where:

  • ietfLanguageCode (optional) if specified, limits output to the specified language. eg. IETF Language Tag en-GB
  • offsetIndex (optional) if specified, starts output at a particular index. eg. 0 to start from the beginning. If not specified, 0 is presumed.
  • pageSize (optional) returns up to the number of items requested, if available. This is particularly useful when combined with offsetIndex to provide a paging capability. eg. 10 items. If not specified, the maximum items per page is returned as defined by Defaults.MaxItems.
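
Because every parameter is optional, a client might build the query string only from the values it actually has. The sketch below assumes nothing beyond the endpoint and parameter names given above; the helper name languages_url is hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://taggloo.im/api/v4"


def languages_url(ietf_language_code: Optional[str] = None,
                  offset_index: Optional[int] = None,
                  page_size: Optional[int] = None) -> str:
    """Build a languages request URL, omitting parameters that were not supplied."""
    params = {
        "ietfLanguageCode": ietf_language_code,
        "offsetIndex": offset_index,
        "pageSize": page_size,
    }
    # Leaving a parameter out lets the server apply its documented default.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}/languages" + (f"?{query}" if query else "")


print(languages_url())
print(languages_url("en-GB", 0, 10))
```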

This returns a 200 OK result and a model such as:

{
    "results": [
        {
            "ietfLanguageCode": "en-GB",
            "name": "British English",
            "links": [
                {
                    "rel": "self",
                    "href": "https://taggloo.im/api/v4/languages/en-GB",
                    "action": "get",
                    "types": [
                        "application/json"
                    ]
                }
            ]
        },
        {
            "ietfLanguageCode": "gv-GV",
            "name": "Manx Gaelic",
            "links": [
                {
                    "rel": "self",
                    "href": "https://taggloo.im/api/v4/languages/gv-GV",
                    "action": "get",
                    "types": [
                        "application/json"
                    ]
                }
            ]
        }
    ],
    "fromIndex": 0,
    "totalItemsCount": 2,
    "pageSize": 10,
    "links": [
        {
            "rel": "self",
            "href": "https://taggloo.im/api/v4/languages?offsetIndex=0&pageSize=10",
            "action": "get",
            "types": [
                "application/json"
            ]
        }
    ]
}
  • Use no parameters to return the first page of results with no filter.
  • There should only ever be two Languages in a Taggloo graph, therefore paging isn't useful - though it is possible.

Dictionaries

The Dictionaries configured in the Taggloo database may be exported using the dictionaries endpoint.

Get multiple Dictionaries

Retrieve Dictionaries in Taggloo using the following endpoint:

GET https://taggloo.im/api/v4/dictionaries[?[id=1&][ietfLanguageCode=en-GB&][offsetIndex=0&][pageSize=10]]

Where:

  • id (optional) if specified, limits output to the single requested Dictionary. eg. 1 for Dictionary ID 1.
  • ietfLanguageCode (optional) if specified, limits output to the specified language. eg. IETF Language Tag en-GB
  • offsetIndex (optional) if specified, starts output at a particular index. eg. 0 to start from the beginning. If not specified, 0 is presumed.
  • pageSize (optional) returns up to the number of items requested, if available. This is particularly useful when combined with offsetIndex to provide a paging capability. eg. 10 items. If not specified, the maximum items per page is returned as defined by Defaults.MaxItems.

This returns a 200 OK result and a model such as:

{
    "results": [
        {
            "id": 1,
            "name": "British English Dictionary",
            "description": "Dictionary for British English words",
            "sourceUrl": "https://taggloo.im",
            "ietfLanguageTag": "en-GB",
            "createdByUserName": "administrator",
            "createdAt": "2024-03-20T21:56:48.0725642",
            "createdOn": "46.31.203.19",
            "contentTypeKey": "Word",
            "contentTypeFriendlyName": "Word",
            "controller": "words",
            "links": [
                {
                    "rel": "self",
                    "href": "https://taggloo.im/api/v4/dictionaries/1",
                    "action": "get",
                    "types": [
                        "application/json"
                    ]
                },
                {
                    "rel": "language",
                    "href": "https://taggloo.im/api/v4/languages/en-GB",
                    "action": "get",
                    "types": [
                        "application/json"
                    ]
                },
                {
                    "rel": "firstwords",
                    "href": "https://taggloo.im/api/v4/words?dictionaryId=1",
                    "action": "get",
                    "types": [
                        "application/json"
                    ]
                }
            ]
        }
    ],
    "fromIndex": 0,
    "totalItemsCount": 1,
    "pageSize": 10,
    "links": [
        {
            "rel": "self",
            "href": "https://taggloo.im/api/v4/dictionaries?ietfLanguageTag=en-GB&offsetIndex=0&pageSize=10",
            "action": "get",
            "types": [
                "application/json"
            ]
        }
    ]
}
  • Use no parameters to return the first page of results with no filter.
  • Within each Dictionary result the links collection has items with rel:
    • firstwords to get the first page of Words in the Dictionary.
    • language to get the Language of the Dictionary
  • Within each Dictionary result, it is possible to use the contentTypeKey to affect the behaviour of a consuming application based on the type of content within the Dictionary. The controller property gives a partial endpoint.
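
Moving from a Dictionary to its Words or Language amounts to looking up a link by its rel. A minimal sketch, using a trimmed copy of the sample Dictionary above; the helper name find_link is hypothetical.

```python
from typing import Optional


def find_link(resource: dict, rel: str) -> Optional[str]:
    """Return the href of the first link with the given rel, or None if absent."""
    for link in resource.get("links", []):
        if link.get("rel") == rel:
            return link.get("href")
    return None


# Trimmed copy of the Dictionary result from the sample response.
dictionary = {
    "id": 1,
    "contentTypeKey": "Word",
    "controller": "words",
    "links": [
        {"rel": "self", "href": "https://taggloo.im/api/v4/dictionaries/1"},
        {"rel": "language", "href": "https://taggloo.im/api/v4/languages/en-GB"},
        {"rel": "firstwords", "href": "https://taggloo.im/api/v4/words?dictionaryId=1"},
    ],
}

print(find_link(dictionary, "firstwords"))
print(find_link(dictionary, "language"))
```

Returning None for a missing rel lets the caller treat optional links (such as a page with no successor) without special-case error handling.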

Words

The Words configured in the Taggloo database may be exported using the words endpoint.

Get multiple Words

Retrieve Words in Taggloo using the following endpoint:

GET https://taggloo.im/api/v4/words[?[word=hello&][dictionaryId=1&][externalId=A123&][offsetIndex=0&][pageSize=10]]

Where:

  • word (optional) if specified, limits output to the requested word, eg. hello
  • dictionaryId (optional) if specified, limits output to the requested Dictionary. eg. 1
  • externalId (optional) if specified, limits output to words with the specified external identifier. An external identifier may be applied to each Word to provide a link with an external data source. eg, A123
  • offsetIndex (optional) if specified, starts output at a particular index. eg. 0 to start from the beginning. If not specified, 0 is presumed.
  • pageSize (optional) returns up to the number of items requested, if available. This is particularly useful when combined with offsetIndex to provide a paging capability. eg. 10 items. If not specified, the maximum items per page is returned as defined by Defaults.MaxItems.

This returns a 200 OK result and a model such as:

{
    "results": [
        // other results
        {
            "id": 9,
            "word": "a black sheep",
            "links": [
                {
                    "rel": "self",
                    "href": "https://beta.taggloo.im/api/v4/words/9",
                    "action": "get",
                    "types": [
                        "application/json"
                    ]
                },
                {
                    "rel": "dictionary",
                    "href": "https://beta.taggloo.im/api/v4/dictionaries/1",
                    "action": "get",
                    "types": [
                        "application/json"
                    ]
                },
                {
                    "rel": "translation",
                    "href": "http://beta.taggloo.im/api/v4/words/403379",
                    "action": "get",
                    "types": [
                        "application/json"
                    ]
                },
                {
                    "rel": "wordinphrases",
                    "href": "http://localhost:5067/api/v4/wordinphrases?wordid=434571",
                    "action": "get",
                    "types": [
                        "application/json"
                    ]
                }
            ],
            "createdByUserName": "Taggloo",
            "createdAt": "2012-06-29T10:01:06.177",
            "createdOn": "46.31.203.19",
            "dictionaryId": 1,
            "externalId": "Taggloo2-Word-284812",
            "ietfLanguageTag": "en-GB"
        },
        {
            "id": 10,
            "word": "a boat trip",
            "links": [
                {
                    "rel": "self",
                    "href": "https://beta.taggloo.im/api/v4/words/10",
                    "action": "get",
                    "types": [
                        "application/json"
                    ]
                },
                {
                    "rel": "dictionary",
                    "href": "https://beta.taggloo.im/api/v4/dictionaries/1",
                    "action": "get",
                    "types": [
                        "application/json"
                    ]
                }
            ],
            "createdByUserName": "Taggloo",
            "createdAt": "2012-06-29T10:01:06.177",
            "createdOn": "46.31.203.19",
            "dictionaryId": 1,
            "externalId": "Taggloo2-Word-284813",
            "ietfLanguageTag": "en-GB"
        }
    ],
    "fromIndex": 0,
    "totalItemsCount": 104131,
    "pageSize": 10,
    "links": [
        {
            "rel": "self",
            "href": "https://beta.taggloo.im/api/v4/words?word=&offsetIndex=0&pageSize=10",
            "action": "get",
            "types": [
                "application/json"
            ]
        }
    ]
}
  • Use no parameters to return the first page of results with no filter.
  • Within each Word result the links collection has items with rel:
    • dictionary to get the Dictionary of the Word.
    • translation to get a translation of the Word. Following the Word to its Dictionary will reveal the Word language.
    • wordinphrases to get Phrases that contain the Word.
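
With over a hundred thousand Words, an export has to walk every page. One way to do this is to follow the nextpage link until it is no longer present. The sketch below is an assumption-laden illustration: the sample responses on this page show only a self link, so it assumes the nextpage link appears in the top-level links collection, and it takes the HTTP call as a fetch argument because this page does not describe how requests are authenticated. The names iter_results and FAKE_PAGES are hypothetical.

```python
from typing import Callable, Iterator, Optional


def iter_results(first_url: str, fetch: Callable[[str], dict]) -> Iterator[dict]:
    """Yield every item by following 'nextpage' links until none remains."""
    url: Optional[str] = first_url
    while url is not None:
        page = fetch(url)
        yield from page.get("results", [])
        # Assumption: the page-level links collection carries the nextpage link.
        url = next(
            (link["href"] for link in page.get("links", [])
             if link.get("rel") == "nextpage"),
            None,
        )


# Stand-in for real HTTP responses, so the sketch runs offline.
FAKE_PAGES = {
    "page1": {
        "results": [{"id": 9, "word": "a black sheep"}],
        "links": [{"rel": "self", "href": "page1"},
                  {"rel": "nextpage", "href": "page2"}],
    },
    "page2": {
        "results": [{"id": 10, "word": "a boat trip"}],
        "links": [{"rel": "self", "href": "page2"}],
    },
}

words = [item["word"] for item in iter_results("page1", FAKE_PAGES.__getitem__)]
print(words)  # ['a black sheep', 'a boat trip']
```

In a real client, fetch would perform the GET request and decode the JSON body; passing it in keeps the paging logic independent of the HTTP library and credentials used.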

Phrases

The Phrases configured in the Taggloo database may be exported using the phrases endpoint.

Get multiple Phrases

Retrieve Phrases in Taggloo using the following endpoint:

GET https://taggloo.im/api/v4/phrases[?[phrase=hello%20world&][dictionaryId=1&][externalId=A123&][offsetIndex=0&][pageSize=10]]

Where:

  • phrase (optional) if specified, limits output to the requested phrase, eg. hello world
  • dictionaryId (optional) if specified, limits output to the requested Dictionary. eg. 1
  • externalId (optional) if specified, limits output to phrases with the specified external identifier. An external identifier may be applied to each Phrase to provide a link with an external data source. eg, A123
  • offsetIndex (optional) if specified, starts output at a particular index. eg. 0 to start from the beginning. If not specified, 0 is presumed.
  • pageSize (optional) returns up to the number of items requested, if available. This is particularly useful when combined with offsetIndex to provide a paging capability. eg. 10 items. If not specified, the maximum items per page is returned as defined by Defaults.MaxItems.

This returns a 200 OK result and a model such as:

{
    "results": [
        // other results
        {
            "id": 9,
            "phrase": "a black sheep",
            "links": [
                {
                    "rel": "self",
                    "href": "https://beta.taggloo.im/api/v4/phrases/9",
                    "action": "get",
                    "types": [
                        "application/json"
                    ]
                },
                {
                    "rel": "dictionary",
                    "href": "https://beta.taggloo.im/api/v4/dictionaries/1",
                    "action": "get",
                    "types": [
                        "application/json"
                    ]
                }
            ],
            "createdByUserName": "Taggloo",
            "createdAt": "2012-06-29T10:01:06.177",
            "createdOn": "46.31.203.19",
            "dictionaryId": 1,
            "externalId": "Taggloo2-Word-284812",
            "ietfLanguageTag": "en-GB"
        },
        {
            "id": 10,
            "phrase": "a boat trip",
            "links": [
                {
                    "rel": "self",
                    "href": "https://beta.taggloo.im/api/v4/phrases/10",
                    "action": "get",
                    "types": [
                        "application/json"
                    ]
                },
                {
                    "rel": "dictionary",
                    "href": "https://beta.taggloo.im/api/v4/dictionaries/1",
                    "action": "get",
                    "types": [
                        "application/json"
                    ]
                },
                {
                    "rel": "translation",
                    "href": "https://beta.taggloo.im/api/v4/phrases/38099",
                    "action": "get",
                    "types": [
                        "application/json"
                    ]
                },
                {
                    "rel": "wordinphrases",
                    "href": "http://localhost:5067/api/v4/wordinphrases?phraseId=434571",
                    "action": "get",
                    "types": [
                        "application/json"
                    ]
                }
            ],
            "createdByUserName": "Taggloo",
            "createdAt": "2012-06-29T10:01:06.177",
            "createdOn": "46.31.203.19",
            "dictionaryId": 1,
            "externalId": "Taggloo2-Word-284813",
            "ietfLanguageTag": "en-GB"
        }
    ],
    "fromIndex": 0,
    "totalItemsCount": 104131,
    "pageSize": 10,
    "links": [
        {
            "rel": "self",
            "href": "https://beta.taggloo.im/api/v4/phrases?phrase=&offsetIndex=0&pageSize=10",
            "action": "get",
            "types": [
                "application/json"
            ]
        }
    ]
}
  • Use no parameters to return the first page of results with no filter.
  • Within each Phrase result the links collection has items with rel:
    • dictionary to get the Dictionary of the Phrase.
    • translation to get a translation of the Phrase. Following the Phrase to its Dictionary will reveal the Phrase language.
    • wordinphrases to get Words contained within the Phrase.
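
Phrases usually contain spaces, which must be percent-encoded in the query string (hello%20world in the endpoint above). A small sketch of building such a request URL with the Python standard library; the helper name phrases_url is hypothetical.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://taggloo.im/api/v4"


def phrases_url(phrase: str, dictionary_id: int, page_size: int = 10) -> str:
    """Build a phrases request URL for the first page of matches."""
    params = {
        "phrase": phrase,
        "dictionaryId": dictionary_id,
        "offsetIndex": 0,
        "pageSize": page_size,
    }
    # quote_via=quote encodes spaces as %20; the urlencode default would emit '+'.
    return f"{BASE_URL}/phrases?" + urlencode(params, quote_via=quote)


print(phrases_url("hello world", 1))
```

Non-ASCII characters in a phrase are percent-encoded as UTF-8 by the same call.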

Words in Phrases

Taggloo contains an index of relationships between Words and Phrases. The wordsinphrases endpoint can be used to query this index and allow reconstitution of Phrases.

Get multiple Words in Phrases

Retrieve Words In Phrases in Taggloo using the following endpoint:

GET https://taggloo.im/api/v4/wordsinphrases[?[wordId=1&][phraseId=1&][offsetIndex=0&][pageSize=10]]

Where:

  • wordId (optional) if specified, limits output to the requested Word. eg. 1
  • phraseId (optional) if specified, limits output to the requested Phrase. eg. 1
  • offsetIndex (optional) if specified, starts output at a particular index. eg. 0 to start from the beginning. If not specified, 0 is presumed.
  • pageSize (optional) returns up to the number of items requested, if available. This is particularly useful when combined with offsetIndex to provide a paging capability. eg. 10 items. If not specified, the maximum items per page is returned as defined by Defaults.MaxItems.

This returns a 200 OK result and a model such as:

{
    "results": [
        {
            "id": 25171,
            "inPhraseId": 1,
            "wordId": 434571,
            "ordinal": 1,
            "thePhrase": "A bad reaper never got a good sickle",
            "theWord": "A",
            "createdByUserName": "DELLXPS-15\\progr",
            "createdOn": "DELLXPS-15",
            "createdAt": "2024-04-24T21:20:49.2117986",
            "links": [
                {
                    "rel": "self",
                    "href": "http://localhost:5067/api/v4/phrases/25171",
                    "action": "get",
                    "types": [
                        "application/json"
                    ]
                },
                {
                    "rel": "word",
                    "href": "http://localhost:5067/api/v4/words/434571",
                    "action": "get",
                    "types": [
                        "application/json"
                    ]
                },
                {
                    "rel": "phrase",
                    "href": "http://localhost:5067/api/v4/phrases/1",
                    "action": "get",
                    "types": [
                        "application/json"
                    ]
                }
            ]
        },
        {
            "id": 25172,
            "inPhraseId": 1,
            "wordId": 434572,
            "ordinal": 2,
            "thePhrase": "A bad reaper never got a good sickle",
            "theWord": "bad",
            "createdByUserName": "DELLXPS-15\\progr",
            "createdOn": "DELLXPS-15",
            "createdAt": "2024-04-24T21:20:49.3930895",
            "links": [
                {
                    "rel": "self",
                    "href": "http://localhost:5067/api/v4/phrases/25172",
                    "action": "get",
                    "types": [
                        "application/json"
                    ]
                },
                {
                    "rel": "word",
                    "href": "http://localhost:5067/api/v4/words/434572",
                    "action": "get",
                    "types": [
                        "application/json"
                    ]
                },
                {
                    "rel": "phrase",
                    "href": "http://localhost:5067/api/v4/phrases/1",
                    "action": "get",
                    "types": [
                        "application/json"
                    ]
                }
            ]
        },
        // other words in phrase
    ],
    "fromIndex": 0,
    "totalItemsCount": 8,
    "pageSize": 10,
    "links": [
        {
            "rel": "self",
            "href": "http://localhost:5067/api/v4/wordInPhrases?phraseId=1&offsetIndex=0&",
            "action": "get",
            "types": [
                "application/json"
            ]
        }
    ],
    "deltaMs": 34.1276
}

Results are returned sorted by Phrase ID and then Ordinal.

  • Use no parameters to return the first page of results with no filter.
  • Within each Words in Phrases result the links collection has items with rel:
    • word to get the Word.
    • phrase to get the Phrase.
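
Reconstituting a Phrase from this index means grouping the results by inPhraseId and ordering them by ordinal. The sketch below uses only property names from the sample response; the function name rebuild_phrases is hypothetical, and joining words with a single space is an assumption, since the index does not show how punctuation or spacing is stored.

```python
from collections import defaultdict
from typing import Dict, List


def rebuild_phrases(results: List[dict]) -> Dict[int, str]:
    """Group rows by inPhraseId and join theWord values in ordinal order."""
    by_phrase: Dict[int, List[dict]] = defaultdict(list)
    for row in results:
        by_phrase[row["inPhraseId"]].append(row)
    return {
        phrase_id: " ".join(
            r["theWord"] for r in sorted(rows, key=lambda r: r["ordinal"])
        )
        for phrase_id, rows in by_phrase.items()
    }


# Two rows from the sample response, deliberately out of order.
rows = [
    {"inPhraseId": 1, "ordinal": 2, "theWord": "bad"},
    {"inPhraseId": 1, "ordinal": 1, "theWord": "A"},
]
print(rebuild_phrases(rows))  # {1: 'A bad'}
```

Sorting explicitly by ordinal keeps the result correct even when rows from several pages are combined out of order.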

Community Content Items

Community Content Items represent content that has been discovered, retrieved and indexed from the internet into the Taggloo corpus.

Get multiple Community Content Items

Retrieve Community Content Items in Taggloo using the following endpoint:

GET https://taggloo.im/api/v4/communitycontentitems[?[containingText=hello%20world&][dictionaryId=1&][hash=A123&][hashAlgorithm=MD5&][ietfLanguageTag=en-GB&][offsetIndex=0&][pageSize=10]]

Where:

  • containingText (optional) if specified, limits output to items containing the given text, eg. hello world
  • dictionaryId (optional) if specified, limits output to the requested Dictionary. eg. 1
  • hash (optional) if specified, limits output to the requested hash. This is useful for identifying if content has already been imported.
  • hashAlgorithm (optional) most often used with hash parameter, limits output of hashed content to content hashed using specified algorithm.
  • ietfLanguageTag (optional) if specified, limits output to the specified language. eg. IETF Language Tag en-GB
  • offsetIndex (optional) if specified, starts output at a particular index. eg. 0 to start from the beginning. If not specified, 0 is presumed.
  • pageSize (optional) returns up to the number of items requested, if available. This is particularly useful when combined with offsetIndex to provide a paging capability. eg. 10 items. If not specified, the maximum items per page is returned as defined by Defaults.MaxItems.

This returns a 200 OK result and a model such as:

{
    "results": [
        // other results
        {
            "id": 9,
            "hash": "123456789abcdef",
            "hashAlgorithm": "MD5",
            "title": "Euro 2012 - Yn  Pholynn neu-nhee Yn Phobblaght Check nane #manx #gaelg",
            "authorName": "greinneyder (Greinneyder )",
            "authorUrl": "http://twitter.com/greinneyder",
            "imageUrl": "http://a0.twimg.com/profile_images/1576790492/mish_normal.jpg",
            "synopsisText": "Euro 2012 - Yn  Pholynn neu-nhee Yn Phobblaght Check nane <a href=\"http://search.twitter.com/search?q=%23manx\" title=\"#manx\" class=\" \">#manx</a> <a href=\"http://search.twitter.com/search?q=%23gaelg\" title=\"#gaelg\" class=\" \">#gaelg</a>",
            "ietfLanguageTag": "gv-GV",
            "originalSynopsisHtml": "Euro 2012 - Yn  Pholynn neu-nhee Yn Phobblaght Check nane <a href=\"http://search.twitter.com/search?q=%23manx\" title=\"#manx\" class=\" \">#manx</a> <em><a href=\"http://search.twitter.com/search?q=%23gaelg\" title=\"#gaelg\" class=\" \">#gaelg</a></em>",
            "dictionaryId": 1,
            "isTruncated": false,
            "publishedAt": "2012-06-16 21:57:36.0000000",
            "communityContentCollectionId": 1,
            "retrievedAt": "2012-06-16 21:57:36.0000000",
            "links": [
                {
                    "rel": "self",
                    "href": "https://beta.taggloo.im/api/v4/phrases/9",
                    "action": "get",
                    "types": [
                        "application/json"
                    ]
                },
                {
                    "rel": "dictionary",
                    "href": "https://beta.taggloo.im/api/v4/dictionaries/1",
                    "action": "get",
                    "types": [
                        "application/json"
                    ]
                },
                {
                    "rel": "collection",
                    "href": "https://beta.taggloo.im/api/v4/communitycontentcollections/1",
                    "action": "get",
                    "types": [
                        "application/json"
                    ]
                }
            ],
            "createdByUserName": "Taggloo",
            "createdAt": "2012-06-29T10:01:06.177",
            "createdOn": "46.31.203.19"
        }
    ],
    "fromIndex": 0,
    "totalItemsCount": 104131,
    "pageSize": 10,
    "links": [
        {
            "rel": "self",
            "href": "https://beta.taggloo.im/api/v4/communitycontentitems?containingText=&offsetIndex=0&pageSize=10",
            "action": "get",
            "types": [
                "application/json"
            ]
        }
    ]
}
  • Use no parameters to return the first page of results with no filter.
  • Within each Community Content Item result the links collection has items with rel:
    • dictionary to get the Dictionary of the Community Content Item.
    • collection to get the collection the item belongs to.
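
The hash and hashAlgorithm parameters can be used to check whether an item has already been imported before adding it again. The sketch below is illustrative only: this page does not state which text is hashed or how the digest is encoded, so hashing the UTF-8 text and sending a lowercase hexadecimal MD5 digest are assumptions, and the function names are hypothetical.

```python
import hashlib
from urllib.parse import urlencode

BASE_URL = "https://taggloo.im/api/v4"


def content_hash(text: str) -> str:
    # Assumption: hex MD5 over the UTF-8 text; the real hashing input is not documented here.
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def lookup_url(text: str) -> str:
    """Build a request that returns at most one item with a matching hash."""
    query = urlencode({
        "hash": content_hash(text),
        "hashAlgorithm": "MD5",
        "pageSize": 1,
    })
    return f"{BASE_URL}/communitycontentitems?{query}"


def already_imported(response: dict) -> bool:
    """A matching item exists when the reported total is greater than zero."""
    return response.get("totalItemsCount", 0) > 0


print(lookup_url("hello world"))
```

Requesting a pageSize of 1 keeps the response small, since only totalItemsCount is needed to answer the question.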
