Description
From Markdown:
the Wayback Machine API was used (<cite id=\"8bwrd\"><a href=\"#zotero%7C16470964%2F5LJVV378\">(<i>Wayback Machine APIs | Internet Archive</i>, n.d.)</a></cite>). The KB Web Collection was explored based on the Solrwayback instance available at the KB Datalab. The datasets were produced by applying similar practices for data filtering, cleaning, and processing. The resulting export file contains metadata for each capture, including the original URL, timestamp, status code, MIME type, content digest, and some other fields. Datasets were shaped by including only URLs filtered by status 200. The necessary condition was to receive successfully processed data during crawling, which is indicated in the metadata with status 200. The other materials were not taken
From citation metadata:
"16470964/5LJVV378": {
  "URL": "https://archive.org/help/wayback_api.php",
  "accessed": {
    "date-parts": [
      [2026, 1, 19]
    ]
  },
  "id": "16470964/5LJVV378",
  "system_id": "zotero|16470964/5LJVV378",
  "title": "Wayback Machine APIs | Internet Archive",
  "type": "webpage"
},
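To help narrow down why a citation does not render, here is a minimal sketch that checks whether each `href` anchor in a Markdown cell resolves to a `system_id` in the citation metadata. The function names `cited_system_ids` and `unresolved` are hypothetical, and the sketch assumes the anchor is the URL-encoded `system_id` (as in the excerpt above, where `#zotero%7C16470964%2F5LJVV378` decodes to `zotero|16470964/5LJVV378`). It is not the notebook renderer's actual logic.

```python
import re
from urllib.parse import unquote


def cited_system_ids(markdown: str) -> list[str]:
    """Return the URL-decoded anchors of all in-text citation links.

    Accepts both plain quotes and backslash-escaped quotes, since the
    Markdown may be copied from raw notebook JSON.
    """
    pattern = r'href=\\?"#([^"\\]+)\\?"'
    return [unquote(anchor) for anchor in re.findall(pattern, markdown)]


def unresolved(markdown: str, metadata: dict) -> list[str]:
    """List cited ids that have no matching `system_id` in the metadata.

    `metadata` is assumed to map item ids to entries shaped like the
    citation metadata shown above.
    """
    known = {entry.get("system_id") for entry in metadata.values()}
    return [sid for sid in cited_system_ids(markdown) if sid not in known]
```

An empty result from `unresolved` for this cell would suggest the Markdown and the metadata agree, so the rendering problem lies elsewhere (for example, in the preview itself).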
@eliselavy
References were added. https://github.com/jdh-observer/5dfgdpiURy8S/tree/add-references
Some were not rendered in the notebook preview.
Originally posted by @Nouhabens in #240






