Milestones

  • No due date
    4/5 issues closed
  • The eICR Augmentation specifications, developed by APHL, detail the changes and tags that must be incorporated into an ‘augmented’ eICR for the TTC project, as well as for other projects that intend to change an existing eICR. The specifications describe 4 main changes:
    1. New Document Id, extension, version, effectiveTime, assigningAuthority, and setId tags, all with new values, to show that this is a ‘new’ iteration of the eICR. All other existing Document Id tags are kept to retain history, including the ‘parentDocument’ tag pointing to the original eICR.
    2. A new Author tag/section in the header indicating which application (‘text-to-code’ in our case) performed the augmentation of the eICR. All existing Author tags/sections are kept to maintain history.
    3. A new Author tag/section within the specific ‘section’.‘entry’ area where the change to the eICR will occur (i.e., Lab Resulting section, entry.Observation).
    4. A Translation tag, under the ‘code’ element that contained the error, containing the newly mapped LOINC code from TTC.

    No due date
    13/15 issues closed
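
The fourth change above (the Translation tag) can be sketched roughly as follows. This is only an illustration using Python's standard `xml.etree.ElementTree`, not the project's actual implementation: the helper name `add_loinc_translation`, the sample observation fragment, and the chosen attributes are assumptions. The CDA namespace and the LOINC code system OID are the standard values.

```python
import xml.etree.ElementTree as ET

CDA_NS = "urn:hl7-org:v3"            # standard HL7 CDA namespace
LOINC_OID = "2.16.840.1.113883.6.1"  # standard OID for the LOINC code system

# Serialize CDA elements without an "ns0:" prefix.
ET.register_namespace("", CDA_NS)


def add_loinc_translation(code_el, loinc_code, display_name):
    """Append a <translation> child carrying the mapped LOINC code.

    Hypothetical helper: the attribute set is an assumption, not taken
    from the APHL specification.
    """
    translation = ET.SubElement(code_el, f"{{{CDA_NS}}}translation")
    translation.set("code", loinc_code)
    translation.set("codeSystem", LOINC_OID)
    translation.set("codeSystemName", "LOINC")
    translation.set("displayName", display_name)
    return translation


# Hypothetical fragment of a lab result observation with an uncoded test name.
SAMPLE = f"""
<observation xmlns="{CDA_NS}" classCode="OBS" moodCode="EVN">
  <code nullFlavor="OTH"><originalText>Glucose, serum</originalText></code>
</observation>
"""

observation = ET.fromstring(SAMPLE)
code_el = observation.find(f"{{{CDA_NS}}}code")
add_loinc_translation(code_el, "2345-7", "Glucose [Mass/volume] in Serum or Plasma")
augmented = ET.tostring(observation, encoding="unicode")
```

The original `code` element and its `originalText` are left untouched, which matches the intent of keeping existing content for history and only adding the new mapping alongside it.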
  • See Eng Sync [notes](https://docs.google.com/document/d/1OMgW3Xw7I7azv2IiUjGVtJKPl9rIlce8ApKS-hHLhNU/edit?tab=t.0#heading=h.z4u29epgcfum) where we sketched out the requirements for the API.

    No due date
    5/5 issues closed
  • We need to create a simple frontend experience where users can input strings and get back standardized LOINC codes for lab test names. See example mockup: https://drive.google.com/file/d/17czDxU1DsDpVCaYM8PcbgXerihYCGqU7/view?usp=drive_link

    No due date
    5/5 issues closed
  • We need to create functions that allow the TTC Lambda function to receive information about a TTC event, read and write various pieces of information (the manifest and eICRs, to and from an S3 bucket), query the vector DB, and push a success event to the APHL event bridge.

    No due date
    23/33 issues closed
  • This phase builds a second classifier using K-Nearest Neighbors with cosine similarity over document embeddings. It generalizes scoring and timing utilities for broader model support and wraps both Naive Bayes and kNN models into a unified interface for easier training, evaluation, and deployment.

    No due date
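
As a rough illustration of the kNN idea described above, here is a minimal standard-library sketch of nearest-neighbor classification with cosine similarity. The function names, the 2-D toy embeddings, and the class labels are all hypothetical; real document embeddings would have many more dimensions and would come from the embedding step, not be written by hand.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def knn_predict(query, embeddings, labels, k=3):
    """Return the majority label among the k most similar embeddings."""
    scored = sorted(
        zip(embeddings, labels),
        key=lambda pair: cosine_similarity(query, pair[0]),
        reverse=True,
    )
    top_labels = [label for _, label in scored[:k]]
    return Counter(top_labels).most_common(1)[0][0]


# Toy 2-D "embeddings" with placeholder class labels (assumptions).
embeddings = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
labels = ["glucose", "glucose", "hemoglobin", "hemoglobin"]

prediction = knn_predict([0.8, 0.3], embeddings, labels, k=3)
```

A brute-force scan like this is fine for small reference sets; a larger label space would likely call for a vector index instead.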
  • This milestone shifts to using semantic document embeddings as an alternative to BoW. It involves creating synthetic lab name data, using LOINC as a source for semantic context, and constructing vocabulary embeddings from LOINC metadata. Embedding persistence functions are also developed.

    No due date
    2/2 issues closed
  • This phase introduces the first classification approach using a multiclass Naive Bayes model trained on BoW features. Tasks include model training, evaluation, and validation via k-fold cross-validation. Functions for performance metrics and model persistence are also implemented.

    No due date
    0/6 issues closed
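
To make the Naive Bayes phase above more concrete, the sketch below trains a multinomial Naive Bayes classifier with add-one smoothing on token counts and scores it with a simple k-fold split. It uses only the standard library; the function names, the toy lab names, and the `GLU`/`HGB` labels are hypothetical, and the project may well use an existing ML library instead.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """docs is a list of token lists; returns per-class count tables."""
    class_counts = Counter(labels)
    token_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        token_counts[label].update(tokens)
        vocab.update(tokens)
    return {"class_counts": class_counts, "token_counts": token_counts, "vocab": vocab}


def predict(model, tokens):
    """Pick the class with the highest log prior plus log likelihood."""
    total_docs = sum(model["class_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_docs in model["class_counts"].items():
        counts = model["token_counts"][label]
        total_tokens = sum(counts.values())
        score = math.log(n_docs / total_docs)
        for tok in tokens:
            if tok not in model["vocab"]:
                continue  # ignore tokens never seen in training
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            score += math.log((counts[tok] + 1) / (total_tokens + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def k_fold_accuracy(docs, labels, k=3):
    """Mean accuracy over k interleaved folds (no shuffling, for brevity)."""
    accuracies = []
    for fold in range(k):
        test_idx = set(range(fold, len(docs), k))
        train_docs = [d for i, d in enumerate(docs) if i not in test_idx]
        train_labels = [l for i, l in enumerate(labels) if i not in test_idx]
        model = train_naive_bayes(train_docs, train_labels)
        correct = sum(predict(model, docs[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / k


# Toy tokenized lab names with placeholder labels (assumptions).
docs = [
    ["glucose", "serum"], ["hemoglobin", "blood"], ["glucose", "plasma"],
    ["hemoglobin", "a1c"], ["glucose", "fasting"], ["hemoglobin", "total"],
]
labels = ["GLU", "HGB", "GLU", "HGB", "GLU", "HGB"]

model = train_naive_bayes(docs, labels)
mean_accuracy = k_fold_accuracy(docs, labels, k=3)
```

The toy data is trivially separable, so the cross-validation score here says nothing about performance on real eCR lab names.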
  • This milestone establishes the core preprocessing pipeline to clean and normalize input text, preparing it for modeling. It includes standard NLP tasks such as tokenization, lemmatization, and whitespace/punctuation normalization. It also introduces the initial feature engineering approach using Bag-of-Words (BoW) and supports persistence of vocabulary representations.

    No due date
    4/7 issues closed
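
A minimal sketch of the normalization and Bag-of-Words steps described above, using only the standard library. The function names and sample strings are hypothetical. Lemmatization is deliberately left out, since it needs an NLP library that this milestone list does not name.

```python
import re
from collections import Counter


def normalize(text):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"[^\w\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text):
    """Split normalized text on whitespace."""
    return normalize(text).split()


def build_vocabulary(texts):
    """Map each distinct token to a stable column index (sorted order)."""
    tokens = sorted({tok for text in texts for tok in tokenize(text)})
    return {tok: idx for idx, tok in enumerate(tokens)}


def bow_vector(text, vocab):
    """Count vector over the vocabulary; unknown tokens are dropped."""
    counts = Counter(tokenize(text))
    vec = [0] * len(vocab)
    for tok, n in counts.items():
        if tok in vocab:
            vec[vocab[tok]] = n
    return vec


# Hypothetical lab test names.
vocab = build_vocabulary(["Glucose, serum", "Glucose (fasting)"])
vector = bow_vector("GLUCOSE glucose serum", vocab)
```

Because `vocab` is a plain dict of token to index, it could be persisted with something as simple as `json.dump`, which is one possible reading of "persistence of vocabulary representations".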
  • All things related to handling the eICR around TTC.

    No due date
    16/16 issues closed
  • This phase focuses on setting the foundation for modeling lab result data by identifying the appropriate NLP/ML tools and gaining familiarity with the structure and variability of real-world eCR data. Tasks include comparing available libraries, reviewing sample eCRs, and generating synthetic lab data for development and testing.

    No due date
    5/5 issues closed