Repo documentation cleanup #389

Merged: bamader merged 9 commits into main from repo-documentation-cleanup on Mar 26, 2026

@bamader (Collaborator) commented Mar 25, 2026

Description

This PR addresses a number of outstanding documentation and repo cleanup issues related to how we generate synthetic data. It includes the following changes:

  • Splits the "LOINC Enhancement" functionality, created during our first attempt at making training data, into its own module so that it is no longer entangled with the rest of the first-attempt code in augmentation.py
  • Archives all of the code associated with our first attempt at training data, clearly marking it as no longer in use while preserving it for functionality audits
  • Adjusts the structure of our unit tests to reflect this split and archive: the LOINC enhancement tests move to their own file, and all relevant imports now point to the archived first attempt
  • Deletes the "Model Tuning" package directory, moving the only file worth keeping (tsdae.py) into data-curation, where it fits better anyway
  • Updates the data-curation README to reflect the reorganization and to provide more commentary on the training data scripts overall

Related Issues

Closes #314

Additional Notes

Because of file path and import adjustments, and because some functions were split out into their own files, this PR touches a lot of files. However, none of the functions, tests, or other code is actually new; everything has just been moved around as part of the repo cleanup.

@codecov-commenter commented Mar 25, 2026

Codecov Report

❌ Patch coverage is 99.09091% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 93.58%. Comparing base (b119829) to head (e7e21fc).
⚠️ Report is 9 commits behind head on main.

Files with missing lines                                  Patch %   Lines
...ta-curation/src/data_curation/loinc_enhancement.py     98.87%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #389      +/-   ##
==========================================
+ Coverage   93.15%   93.58%   +0.43%     
==========================================
  Files          39       42       +3     
  Lines        2088     2137      +49     
==========================================
+ Hits         1945     2000      +55     
+ Misses        143      137       -6     
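
For anyone wanting to sanity-check the figures in the diff above, here is a minimal sketch that recomputes the percentages from the Hits and Lines rows. The helper name `coverage_pct` is hypothetical, and truncating (rather than rounding) to two decimals is an assumption that happens to reproduce the reported numbers; it is not taken from Codecov's documentation.

```python
import math


def coverage_pct(hits: int, lines: int) -> float:
    """Coverage as a percentage, truncated to two decimals (assumed behavior)."""
    return math.floor(hits / lines * 10000) / 100


# Values copied from the Hits and Lines rows of the coverage diff.
base = coverage_pct(1945, 2088)   # main
head = coverage_pct(2000, 2137)   # this PR
delta = round(head - base, 2)

# Misses should equal Lines minus Hits on both sides.
base_misses = 2088 - 1945
head_misses = 2137 - 2000

print(f"base={base}% head={head}% delta=+{delta}%")
print(f"misses: {base_misses} -> {head_misses}")
```

Under that truncation assumption the recomputed values line up with the Coverage and Misses rows of the table.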

@bamader bamader marked this pull request as ready for review March 26, 2026 20:45

@nickclyde (Member) left a comment

Nice cleanup! This is a good restructure.

@BradySkylight (Collaborator) left a comment

This is great! Makes me happy to see this all cleaned up.

@bamader bamader merged commit 5ade64f into main Mar 26, 2026
5 checks passed
@bamader bamader deleted the repo-documentation-cleanup branch March 26, 2026 21:36

Development

Successfully merging this pull request may close these issues.

Documentation within the TTC repo

4 participants