I'm translating the info_CLNHGVS field in the ClinVar GRCh38 build into VRS IDs. Many HGVS expressions that include [int] (the HGVS repeat-sequence notation) are failing. Is there a transformation I should perform on the source HGVS data before passing it to the allele translator?
Error translating NC_000001.11:g.930090TTCCTCTCCTCCTGCCCCACC[2]: NC_000001.11:g.930090TTCCTCTCCTCCTGCCCCACC[2]: char 42: expected the character '='
Error translating NC_000001.11:g.930139CCT[1]: NC_000001.11:g.930139CCT[1]: char 24: expected the character '='
Error translating NC_000001.11:g.930212AAG[1]: NC_000001.11:g.930212AAG[1]: char 24: expected the character '='
Using ga4gh.vrs==2.1.3:
from ga4gh.vrs.dataproxy import create_dataproxy
from ga4gh.vrs.extras.translator import AlleleTranslator
import os
os.environ["UTA_DB_URL"] = "postgresql://anonymous:[email protected]:5432/uta/uta_20241220"
seqrepo_rest_service_url = "seqrepo+https://services.genomicmedlab.org/seqrepo"
dataproxy = create_dataproxy(uri=seqrepo_rest_service_url)
translator = AlleleTranslator(dataproxy)
hgvs = [
    "NC_000001.11:g.930090TTCCTCTCCTCCTGCCCCACC[2]",
    "NC_000001.11:g.930139CCT[1]",
    "NC_000001.11:g.930212AAG[1]",
]
for h in hgvs:
    translated = translator.translate_from(h, "hgvs")
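While waiting for an answer, one possible stopgap is to separate the repeat-notation expressions from the rest before calling the translator, so they can be set aside or handled differently. The following is only a sketch under assumptions: the regex covers just the simple genomic form seen in the errors above (accession, g. position, repeat unit, [count]), the names REPEAT_RE, parse_repeat and partition_hgvs are hypothetical helpers and not part of ga4gh.vrs, and the last entry in the example list is an invented substitution added only to show the non-repeat path. It does not turn the repeats into something the translator accepts.

```python
import re

# Assumed pattern: <accession>:g.<start><repeat unit>[<count>]
# Only the simple genomic repeat form from the errors above is covered.
REPEAT_RE = re.compile(
    r"^(?P<accession>[^:]+):g\.(?P<start>\d+)(?P<unit>[ACGT]+)\[(?P<count>\d+)\]$"
)


def parse_repeat(expression):
    """Return the parts of a repeat-notation expression, or None if it is not one."""
    match = REPEAT_RE.match(expression)
    if match is None:
        return None
    return {
        "accession": match.group("accession"),
        "start": int(match.group("start")),
        "unit": match.group("unit"),
        "count": int(match.group("count")),
    }


def partition_hgvs(expressions):
    """Split expressions into (parsed repeats, everything else)."""
    repeats, others = [], []
    for expression in expressions:
        parsed = parse_repeat(expression)
        if parsed is None:
            others.append(expression)
        else:
            repeats.append(parsed)
    return repeats, others


hgvs = [
    "NC_000001.11:g.930090TTCCTCTCCTCCTGCCCCACC[2]",
    "NC_000001.11:g.930139CCT[1]",
    "NC_000001.11:g.930212AAG[1]",
    "NC_000001.11:g.930165G>A",  # hypothetical non-repeat expression
]
repeats, others = partition_hgvs(hgvs)
```

Only the expressions in others would then be passed to translator.translate_from(h, "hgvs"); the entries in repeats keep their parsed fields for whatever handling turns out to be appropriate.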