
Language Model

jae-mess edited this page Jun 19, 2025 · 3 revisions

Overview

This page gives a broad overview of DAILP's language annotation and analysis processes as practiced after 2024.

Community-Based Design

All archival material is selected for translation by community members. Once documents are identified, our translation team provides free translations and linguistic information at multiple levels of detail.

Levels of Analysis

Edited collections are sets of documents that community members have identified for a particular theme, scope, and audience. Read more about our language annotation process at this level here.

Documents are individual primary sources of archival material, in a wide variety of media, that are organized into edited collections. Read more about document-level language analysis here.

Each individual word within a document is translated and annotated with multiple layers of information. Read more about our word-level analysis here.
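The three levels described above nest inside one another: a collection holds documents, and a document holds annotated words. The following is a minimal sketch of that nesting, not DAILP's actual schema. Every class name, field name, and layer key here (`EditedCollection`, `Document`, `Word`, `layers`, `"translation"`) is a hypothetical chosen for illustration, and the values are placeholders rather than real archival data.

```python
from dataclasses import dataclass, field


@dataclass
class Word:
    """Word level: one word from a document (hypothetical structure)."""
    source_text: str
    # Annotation layers keyed by layer name. The keys are illustrative
    # assumptions, not DAILP's real layer names.
    layers: dict[str, str] = field(default_factory=dict)


@dataclass
class Document:
    """Document level: one primary source of archival material."""
    title: str
    words: list[Word] = field(default_factory=list)


@dataclass
class EditedCollection:
    """Collection level: documents grouped by theme, scope, and audience."""
    theme: str
    documents: list[Document] = field(default_factory=list)

    def word_count(self) -> int:
        # Total number of annotated words across all documents.
        return sum(len(doc.words) for doc in self.documents)


# Placeholder data only; no real collection, document, or word is implied.
example = EditedCollection(
    theme="placeholder theme",
    documents=[
        Document(
            title="placeholder document",
            words=[
                Word("word-1", {"translation": "gloss-1"}),
                Word("word-2"),
            ],
        )
    ],
)
```

Keeping the annotation layers in a dictionary on each word is one possible design choice: it lets a word carry any number of layers without changing the document or collection structures.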

Language Audio

Prior to 2024, language audio was recorded in Zoom. After 2024, adding audio is supported by the DAILP translation interface, which lets users upload their own pronunciations of words and phrases directly and provide additional commentary on dialect and pragmatic information.
