- No due date•4/5 issues closed
The eICR Augmentation specifications, developed by APHL, detail the changes and tags that need to be incorporated into an ‘augmented’ eICR for the TTC project, as well as for other projects that intend to make changes to an existing eICR. The specifications describe 4 main changes:
1. New Document Id, extension, version, effectiveTime, assigningAuthority, and setId tags, all with new values, to show that this is a ‘new’ iteration of the eICR. All other existing Document Id tags are kept to retain history, including the ‘parentDocument’ tag pointing to the original eICR.
2. A new Author tag/section in the header indicating which application (‘text-to-code’ in our case) performed the augmentation of the eICR. All existing Author tags/sections are kept to maintain history.
3. A new Author tag/section within the specific ‘section’.‘entry’ area where the change to the eICR occurs (i.e., Lab Resulting section - entry.Observation).
4. A Translation tag, under the ‘code’ element that contained the error, containing the newly mapped LOINC code from TTC.
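As a rough illustration of the fourth change, here is a minimal Python sketch that appends a translation element under an existing ‘code’ element while leaving the original code untouched. The function name `add_loinc_translation`, the sample observation, the local code `LOCAL-123`, and the choice of attributes on the translation are assumptions made for illustration; the APHL specifications define the actual required structure.

```python
import xml.etree.ElementTree as ET

NS = "urn:hl7-org:v3"                # HL7 v3 / CDA namespace
LOINC_OID = "2.16.840.1.113883.6.1"  # OID of the LOINC code system

# Serialize with the CDA namespace as the default namespace.
ET.register_namespace("", NS)


def add_loinc_translation(code_el, loinc_code, display_name):
    """Append a <translation> child carrying the mapped LOINC code.

    Hypothetical helper: the attribute set used here is an assumption,
    not taken from the APHL specifications.
    """
    translation = ET.SubElement(code_el, f"{{{NS}}}translation")
    translation.set("code", loinc_code)
    translation.set("codeSystem", LOINC_OID)
    translation.set("codeSystemName", "LOINC")
    translation.set("displayName", display_name)
    return translation


# Illustrative fragment only; a real eICR entry carries far more detail.
SAMPLE = """<observation xmlns="urn:hl7-org:v3" classCode="OBS" moodCode="EVN">
  <code code="LOCAL-123" displayName="Hgb A1c"/>
</observation>"""

observation = ET.fromstring(SAMPLE)
code_el = observation.find(f"{{{NS}}}code")
add_loinc_translation(code_el, "4548-4", "Hemoglobin A1c/Hemoglobin.total in Blood")

augmented_xml = ET.tostring(observation, encoding="unicode")
print(augmented_xml)
```

The original ‘code’ attributes are preserved, so the history of what was received stays visible next to the mapped LOINC code.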
No due date•13/15 issues closed
See Eng Sync [notes](https://docs.google.com/document/d/1OMgW3Xw7I7azv2IiUjGVtJKPl9rIlce8ApKS-hHLhNU/edit?tab=t.0#heading=h.z4u29epgcfum) where we sketched out the requirements for the API.
No due date•5/5 issues closed
We need to create a simple frontend experience where users can input strings and get back standardized LOINC codes for lab test names. See example mockup: https://drive.google.com/file/d/17czDxU1DsDpVCaYM8PcbgXerihYCGqU7/view?usp=drive_link
No due date•5/5 issues closed
We need to create functions that allow the TTC lambda function to receive information about a TTC event, read and write the manifest and eICRs to/from an S3 bucket, query the vector DB, and push a success event to the APHL event bridge.
No due date•23/33 issues closed
This phase builds a second classifier using K-Nearest Neighbors with cosine similarity over document embeddings. It generalizes scoring and timing utilities for broader model support and wraps both Naive Bayes and kNN models into a unified interface for easier training, evaluation, and deployment.
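To make the kNN idea concrete, here is a minimal pure-Python sketch of nearest-neighbor classification with cosine similarity and a majority vote. The function names, the toy 3-dimensional vectors, and the labels are all hypothetical; the project's actual embeddings, libraries, and interface are not described here.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_predict(query, examples, k=3):
    """Return the majority label among the k examples most similar to the query.

    `examples` is a list of (embedding, label) pairs.
    """
    ranked = sorted(
        examples,
        key=lambda ex: cosine_similarity(query, ex[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy embeddings standing in for real document embeddings (assumption).
examples = [
    ([1.0, 0.0, 0.0], "glucose"),
    ([0.9, 0.1, 0.0], "glucose"),
    ([0.0, 1.0, 0.0], "hemoglobin"),
    ([0.0, 0.9, 0.1], "hemoglobin"),
]

print(knn_predict([0.8, 0.2, 0.0], examples, k=3))
```

In practice a vectorized library or a vector DB would replace the full sort, but the scoring and voting logic stays the same.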
No due date
This milestone shifts to using semantic document embeddings as an alternative to BoW. It involves creating synthetic lab name data, using LOINC as a source for semantic context, and constructing vocabulary embeddings from LOINC metadata. Embedding persistence functions are also developed.
No due date•2/2 issues closed
This phase introduces the first classification approach using a multiclass Naive Bayes model trained on BoW features. Tasks include model training, evaluation, and validation via k-fold cross-validation. Functions for performance metrics and model persistence are also implemented.
No due date•0/6 issues closed
This milestone establishes the core preprocessing pipeline to clean and normalize input text, preparing it for modeling. It includes standard NLP tasks such as tokenization, lemmatization, and whitespace/punctuation normalization. It also introduces the initial feature engineering approach using Bag-of-Words (BoW) and supports persistence of vocabulary representations.
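A minimal sketch of what such a pipeline could look like, using only the Python standard library. All function names are hypothetical, and lemmatization is left out because it needs an NLP library; the milestone's actual implementation may differ.

```python
import re
from collections import Counter


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = text.lower()
    text = re.sub(r"[^\w\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text):
    """Split normalized text on whitespace."""
    return normalize(text).split()


def build_vocabulary(documents):
    """Map each distinct token to a column index, in sorted token order."""
    tokens = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}


def bow_vector(text, vocab):
    """Count occurrences of each vocabulary token; unknown tokens are ignored."""
    counts = Counter(tokenize(text))
    # vocab preserves insertion order, which matches the column indices.
    return [counts.get(tok, 0) for tok in vocab]


# Illustrative lab names only (assumption).
docs = ["Glucose, serum", "Glucose (urine)"]
vocab = build_vocabulary(docs)
print(vocab)
print(bow_vector("GLUCOSE glucose urine", vocab))
```

Because the vocabulary is a plain dict of strings to integers, it could be persisted with something as simple as `json.dump`.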
No due date•4/7 issues closed
All things related to handling the eICR around TTC.
No due date•16/16 issues closed
This phase focuses on setting the foundation for modeling lab result data by identifying the appropriate NLP/ML tools and gaining familiarity with the structure and variability of real-world eCR data. Tasks include comparing available libraries, reviewing sample eCRs, and generating synthetic lab data for development and testing.
No due date•5/5 issues closed