Data Migration
jae-mess edited this page Jul 22, 2025 · 5 revisions
The code for data migration lives in the migration/ directory.
At the moment, no data originates in our MongoDB database. All data is imported from a large number of Google Sheets using the Sheets REST API. The migration process for annotated texts consists of the following steps.
- Retrieve the contents of the Annotated Texts Index to eventually process each row into an `AnnotatedDoc`.
- For each row of the index, retrieve the contents of the "Annotation Sheet" column cell. The annotation sheet must consist of one or more pages named exactly like so: `Page 1`, `Page 2`, etc., and two other pages called `Metadata` and `References`.
- Ingest the "Metadata" page into a `DocumentMetadata` structure. Correlate each "Source Document Image" cell with the `Page X` sheet of the same index.
- Fill in the `segments` field of the `AnnotatedDoc` with the `Page X` sheet contents concatenated together, by converting each set of annotation rows into an `AnnotatedForm`.
- Push the assembled `AnnotatedDoc` to the database by running the `updateDocument` mutation on the GraphQL server.
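The page-naming rule and the concatenation step can be illustrated with a minimal sketch. This is not the code in `migration/`; it is a Python illustration under stated assumptions. The helper names `ordered_page_tabs` and `assemble_doc`, the `title` and `page_images` fields, and the requirement that page numbers run contiguously from 1 are all hypothetical. Rows are also left as raw lists here rather than being converted into `AnnotatedForm` values, and no network calls are made.

```python
import re
from dataclasses import dataclass, field

# Tab names the steps above require (besides the numbered pages).
REQUIRED_TABS = {"Metadata", "References"}
# Pages must be named exactly "Page 1", "Page 2", and so on.
PAGE_NAME = re.compile(r"Page (\d+)")


@dataclass
class AnnotatedDoc:
    """Hypothetical stand-in; only `segments` is named in the text above."""
    title: str
    page_images: list
    segments: list = field(default_factory=list)


def ordered_page_tabs(tab_names):
    """Validate an annotation sheet's tab names and return page tabs in order."""
    missing = REQUIRED_TABS - set(tab_names)
    if missing:
        raise ValueError(f"missing required tabs: {sorted(missing)}")
    numbered = []
    for name in tab_names:
        match = PAGE_NAME.fullmatch(name)
        if match:
            numbered.append((int(match.group(1)), name))
    numbered.sort()
    numbers = [n for n, _ in numbered]
    # Assumption: pages are numbered 1..N without gaps.
    if not numbers or numbers != list(range(1, len(numbers) + 1)):
        raise ValueError("expected tabs named Page 1, Page 2, ... with no gaps")
    return [name for _, name in numbered]


def assemble_doc(title, tabs, image_cells):
    """Build a document from tab contents.

    tabs: dict mapping tab name to a list of rows.
    image_cells: "Source Document Image" cells, one per page, in page order.
    """
    pages = ordered_page_tabs(list(tabs))
    if len(image_cells) != len(pages):
        raise ValueError("each page needs exactly one source image cell")
    segments = []
    for name in pages:
        # Concatenate the rows of every page, in page order.
        segments.extend(tabs[name])
    return AnnotatedDoc(title=title, page_images=list(image_cells), segments=segments)
```

Validating the tab names before reading any rows means a misnamed page (for example `page 1` or `Page1`) fails early with a clear message instead of silently dropping content from `segments`.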