Data Migration

The code for data migration lives in the migration/ directory.

At the moment, no data originates in our MongoDB database. All data is imported from a large number of Google Sheets using the Sheets REST API. The migration process for annotated texts consists of the following steps.

1. Retrieve the contents of the Annotated Texts Index so that each row can eventually be processed into an AnnotatedDoc.
2. For each row of the index, retrieve the contents of the "Annotation Sheet" column cell. The annotation sheet must consist of one or more pages named exactly "Page 1", "Page 2", and so on, plus two other pages named "Metadata" and "References".
3. Ingest the "Metadata" page into a DocumentMetadata structure. Correlate each "Source Document Image" cell with the Page X sheet of the same index.
4. Fill in the segments field of the AnnotatedDoc by converting each set of annotation rows into an AnnotatedForm and concatenating the results from every Page X sheet in order.
5. Push the assembled AnnotatedDoc to the database by running the updateDocument mutation on the GraphQL server.
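
The page-naming check and the assembly of segments described in the steps above can be sketched roughly as follows. This is a minimal illustration in Python, not the code in migration/. The field names on the data classes, the helper functions `page_numbers`, `split_forms`, and `assemble_doc`, and the rule that a blank row separates one set of annotation rows from the next are all assumptions made for the example. Fetching from the Sheets REST API and running the updateDocument mutation are left out.

```python
import re
from dataclasses import dataclass, field

# Sheet names must match "Page N" exactly; "Metadata" and "References" are required.
PAGE_NAME = re.compile(r"^Page (\d+)$")
REQUIRED_PAGES = {"Metadata", "References"}


@dataclass
class AnnotatedForm:
    rows: list  # one set of annotation rows (hypothetical shape)


@dataclass
class DocumentMetadata:
    title: str
    page_images: dict  # page number -> "Source Document Image" cell (hypothetical)


@dataclass
class AnnotatedDoc:
    meta: DocumentMetadata
    segments: list = field(default_factory=list)


def page_numbers(sheet_names):
    """Validate the sheet names and return the page numbers in numeric order."""
    names = list(sheet_names)
    missing = REQUIRED_PAGES - set(names)
    if missing:
        raise ValueError(f"missing required pages: {sorted(missing)}")
    pages = sorted(int(m.group(1)) for n in names if (m := PAGE_NAME.match(n)))
    if not pages:
        raise ValueError("expected at least one sheet named 'Page N'")
    return pages


def split_forms(rows):
    """Group rows into AnnotatedForms, assuming a blank row ends each set."""
    forms, current = [], []
    for row in rows:
        if any(cell.strip() for cell in row):
            current.append(row)
        elif current:
            forms.append(AnnotatedForm(current))
            current = []
    if current:
        forms.append(AnnotatedForm(current))
    return forms


def assemble_doc(title, images, sheets):
    """Build an AnnotatedDoc from a mapping of sheet name to rows."""
    pages = page_numbers(sheets.keys())
    # Correlate the Nth image cell with the sheet named "Page N" (1-based).
    page_images = {n: images[n - 1] for n in pages if n <= len(images)}
    doc = AnnotatedDoc(DocumentMetadata(title, page_images))
    for n in pages:
        doc.segments.extend(split_forms(sheets[f"Page {n}"]))
    return doc
```

Sorting the page numbers numerically rather than sorting the sheet names as strings keeps "Page 10" after "Page 2" when the segments are concatenated.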
