[GEN-1704] Filter out germline variants from sv files #583
thomasyu888
left a comment
🔥 LGTM. will defer to @danlu1 for final review.
danlu1
left a comment
LGTM!
@rxu17 just a follow-up question: since you found that "SV_STATUS is in the testing pipeline's consortium release sv file while SV_Status is found in the production pipeline's consortium release sv file", do we have a plan to update the testing pipeline so that it aligns with prod?
I think this needs a separate ticket to figure out why it's happening and then resolve it from there (I'm not sure of the cause). I'd consider it low priority given that we have no known current issues. The overarching fix would be a data validation framework, since we're not currently enforcing any specific data standards or checks on the release files.



Purpose: This PR will be a hotfix to filter out germline variants from the sv file at the consortium release step (and subsequently the public release step since it just copies over the consortium release sv file) before we can release the germline variant validation rule.
Changes: Changes are isolated to the `store_sv_files` function in `database_to_staging.py`. Had to make it case-insensitive because I've found that `SV_STATUS` is in the testing pipeline's consortium release sv file while `SV_Status` is found in the production pipeline's consortium release sv file.
Testing: Followed the standard validation of new features guide. The germline variants are filtered out.
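For readers who want to see the idea in isolation, here is a minimal sketch of a case-insensitive germline filter. It is not the code in this PR: the helper name `filter_germline_rows`, the `GERMLINE` status value, the tab-delimited layout, and the sample rows are all assumptions made for illustration, and it uses only the standard library rather than whatever `store_sv_files` does internally.

```python
import csv
import io


def filter_germline_rows(rows, status_column="SV_Status", germline_value="GERMLINE"):
    """Drop rows whose status column (header matched case-insensitively) is germline.

    Hypothetical helper for illustration; not part of database_to_staging.py.
    """
    rows = list(rows)
    if not rows:
        return rows
    # Find the real header name regardless of case, e.g. SV_STATUS or SV_Status.
    matches = [col for col in rows[0] if col.lower() == status_column.lower()]
    if not matches:
        # No status column present: nothing to filter on, return rows unchanged.
        return rows
    actual = matches[0]
    return [
        row for row in rows
        if (row[actual] or "").strip().upper() != germline_value
    ]


# Made-up sample data mimicking the two header spellings (assumed tab-delimited).
testing_sv = "SAMPLE_ID\tSV_STATUS\nS1\tSOMATIC\nS2\tGERMLINE\n"
production_sv = "SAMPLE_ID\tSV_Status\nS3\tGERMLINE\nS4\tSOMATIC\n"

for text in (testing_sv, production_sv):
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    kept = filter_germline_rows(reader)
    print([row["SAMPLE_ID"] for row in kept])
```

With the sample data above, the loop prints `['S1']` and then `['S4']`, so the same filter handles both header spellings. Resolving the header name once, instead of renaming the column, leaves the original casing of the file untouched.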