-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: set up new github workflow for only ontologies that have changed #264
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #264 +/- ##
==========================================
- Coverage 89.27% 88.56% -0.71%
==========================================
Files 19 19
Lines 1343 1356 +13
Branches 121 122 +1
==========================================
+ Hits 1199 1201 +2
- Misses 124 135 +11
Partials 20 20
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
c9d0c5b
to
7caa371
Compare
3772954
to
70d1dac
Compare
a38c197
to
4ac3a1b
Compare
on: | ||
push: | ||
paths: | ||
- "**/ontology-assets/ontology_info.json" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we know what happens if we update both ontology_info and all_ontology_generator.py? do we need to add some sort of "if ontology_info is updated but not all_ontology_generator" conditional logic here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if both the files are updated, then both workflows will be executed. it's not ideal, but i spent 1-2 hours trying to prevent that from happening and i couldn't figure out how to do it. i'm also not super well versed in how to write GHA workflows with conditional logic, so it certainly could be possible with more investigative effort.
what i was thinking for now is that if you do enter this edge case, the developer writing the PR can just cancel the workflow that's not needed, or just allow both to run. allowing both jobs to run shouldn't cause any issues in the expected generated ontology assets, it would just waste some time and some GHA resources. if that ends up being too annoying, we can patch this edge case as a follow up.
since trying to handle this edge case took up about half the total time i spent on this, i figured it's still better to get this change out first so we see an immediate benefit to lower runtime for ~most PR's generating ontology assets--especially since this was already annoying me this week while i was working on adding in the CL terms into EFO for perturbations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good, thanks for looking into it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice, thank you! left a note about one github workflow edge case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change looks good, optionally mind adding a TODO or jira ticket or something to document that we're aware of the double update edge case?
🤖 I have created a release *beep* *boop* --- <details><summary>python-api: 1.6.0-alpha</summary> ## [1.6.0-alpha](python-api-v1.5.0-alpha...python-api-v1.6.0-alpha) (2025-03-05) ### Features * pinned ontologies for schema 5.3.0 ([#275](#275)) ([cd6a15e](cd6a15e)) * set up new github workflow for only ontologies that have changed ([#264](#264)) ([45d0fce](45d0fce)) ### Misc * remove 5.3.1-alpha ([#273](#273)) ([a930332](a930332)) </details> <details><summary>ontology-assets: 1.5.0-alpha</summary> ## [1.5.0-alpha](ontology-assets-v1.4.0-alpha...ontology-assets-v1.5.0-alpha) (2025-03-05) ### Features * pinned ontologies for schema 5.3.0 ([#275](#275)) ([cd6a15e](cd6a15e)) * set up new github workflow for only ontologies that have changed ([#264](#264)) ([45d0fce](45d0fce)) ### Misc * remove 5.3.1-alpha ([#273](#273)) ([a930332](a930332)) </details> --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Evan Molinelli <[email protected]>
Reason for Change
NCBITaxon
, it takes about two hours for thegenerate_all_ontology.yml
job to wrap up, which is.. quite long if we just want to do simple things like bump versions of smaller ontology graphsnote: i did try to put all of this into one workflow, but i couldn't figure out how to check if only
ontology_info.json
was updated and the other files were not updated. there potentially is a way for this to be done so that if we updateontology_info.json
andall_ontology_generator.py
, then we don't have to run both workflows. that being said, i do think that there are some benefits to having it be a separate workflow entirely. it's immediately clear to the person writing the PR which version of the script we're executing since we would start the "Updates to Diff'd Ontology Files" job instead of the "Updates to Ontology Files" jobChanges
generate_diff_ontology
so that we only download / process ontologies that have changed since the last run by executing the script with a--diff
flag if the only file we've changed isontology_info.json
latest_ontology_info.json
ontology_info_schema.json
,all_ontology_schema.json
, orall_ontology_generator.py
, then we re-generate all ontology files without the--diff
flag by executing the oldgenerate_all_ontology
workflowTesting steps
latest_ontology_info.json
correctlylatest_ontology_info.json
to verify that we're only picking up diffed changesontology_info.json
, and only thegenerate_diff_ontology
workflow is executed. theontology-assets/latest_ontology_info.json
file was also automatically generated and added by the workflow