Commit 704c334

Add FAM193B and TBC1D7 (#352)
## Description

Adding OPDM_FAM193B and OPDM_TBC1D7

Fixes: #338 and #339

## Major Changes

- OPDM_FAM193B
- OPDM_TBC1D7

## Checklist

- [x] All changes are well summarized
- [x] Check all tests pass
- [x] Check that the website preview looks good
- [x] Update the STRchive version in `CITATION.cff`, format X.Y.Z. If any major changes, increment Y. If only minor changes, increment Z. If the breaking change (rare), increment X.
- [x] Ask someone to review this PR

---------

Co-authored-by: Harriet Dashnow <h.dashnow@gmail.com>
Co-authored-by: hdashnow <3794821+hdashnow@users.noreply.github.com>
Co-authored-by: Macayla-weiner <205837004+Macayla-weiner@users.noreply.github.com>
1 parent 128ae29 commit 704c334

32 files changed

Lines changed: 577 additions & 40 deletions

CITATION.cff

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
 title: STRchive
-version: 2.19.0
-date-released: "2026-05-05"
+version: 2.20.0
+date-released: "2026-05-08"
 url: https://github.com/dashnowlab/STRchive
 authors:
 - family-names: Dashnow
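
The version rule from the checklist in the commit message (X.Y.Z; breaking changes increment X, major changes increment Y, minor changes increment Z) can be sketched as a small helper. This is only an illustration, not part of the STRchive repository: the function name `bump_version` and the change labels are hypothetical, and resetting the lower components to zero is an assumption that the checklist does not state, although it matches the 2.19.0 to 2.20.0 change in this diff.

```python
def bump_version(version: str, change: str) -> str:
    """Return the next X.Y.Z version for a 'breaking', 'major', or 'minor' change.

    Hypothetical helper: the name and the change labels are illustrative only.
    """
    x, y, z = (int(part) for part in version.split("."))
    if change == "breaking":
        # Assumption: lower components reset to zero.
        return f"{x + 1}.0.0"
    if change == "major":
        # Assumption: Z resets to zero when Y is incremented.
        return f"{x}.{y + 1}.0"
    if change == "minor":
        return f"{x}.{y}.{z + 1}"
    raise ValueError(f"unknown change type: {change!r}")


# Adding new loci is treated as a major change in this commit.
print(bump_version("2.19.0", "major"))
```

Under these assumptions, the call above yields the version recorded in the updated `CITATION.cff`.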

data/STRchive-citations.json

Lines changed: 222 additions & 0 deletions
@@ -162518,6 +162518,228 @@
 "language": "en",
 "note": "This CSL Item was generated by Manubot v0.6.1 from its persistent identifier (standard_id).\nstandard_id: pubmed:42003611"
 },
+{
+"id": "pmid:39868092",
+"manubot_success": true,
+"link": "https://www.ncbi.nlm.nih.gov/pubmed/39868092",
+"title": "Detailed tandem repeat allele profiling in 1,027 long-read genomes reveals genome-wide patterns of pathogenicity.",
+"type": "article-journal",
+"doi": "10.1101/2025.01.06.631535",
+"authors": [
+["Matt C", "Danzi"],
+["Isaac R L", "Xu"],
+["Sarah", "Fazal"],
+["Egor", "Dolzhenko"],
+["David", "Pellerin"],
+["Ben", "Weisburd"],
+["Chloe", "Reuter"],
+["Jacinda", "Sampson"],
+["Chiara", "Folland"],
+["Matthew", "Wheeler"],
+["Anne", "O'Donnell-Luria"],
+["Stefan", "Wuchty"],
+["Gianina", "Ravenscroft"],
+["Michael A", "Eberle"],
+["Stephan", "Zuchner"]
+],
+"publisher": "bioRxiv : the preprint server for biology",
+"issn": "2692-8205",
+"date": "2025-01-20",
+"abstract": "Tandem repeats are a highly polymorphic class of genomic variation that play causal roles in rare diseases but are notoriously difficult to sequence using short-read techniques",
+"language": "en",
+"note": "This CSL Item was generated by Manubot v0.6.1 from its persistent identifier (standard_id).\nstandard_id: pubmed:39868092"
+},
162552+
{
162553+
"id": "pmid:38585781",
162554+
"manubot_success": true,
162555+
"link": "https://www.ncbi.nlm.nih.gov/pubmed/38585781",
162556+
"title": "Integration of transcriptomics and long-read genomics prioritizes structural variants in rare disease.",
162557+
"type": "article-journal",
162558+
"doi": "10.1101/2024.03.22.24304565",
162559+
"authors": [
162560+
["Tanner D", "Jensen"],
162561+
["Bohan", "Ni"],
162562+
["Chloe M", "Reuter"],
162563+
["John E", "Gorzynski"],
162564+
["Sarah", "Fazal"],
162565+
["Devon", "Bonner"],
162566+
["Rachel A", "Ungar"],
162567+
["Pag\u00e9 C", "Goddard"],
162568+
["Archana", "Raja"],
162569+
["Euan A", "Ashley"],
162570+
["Jonathan A", "Bernstein"],
162571+
["Stephan", "Zuchner"],
162572+
["Michael D", "Greicius"],
162573+
["Stephen B", "Montgomery"],
162574+
["Michael C", "Schatz"],
162575+
["Matthew T", "Wheeler"],
162576+
["Alexis", "Battle"]
162577+
],
162578+
"publisher": "medRxiv : the preprint server for health sciences",
162579+
"issn": "",
162580+
"date": "2024-03-26",
162581+
"abstract": "Rare structural variants (SVs) - insertions, deletions, and complex rearrangements - can cause Mendelian disease, yet they remain difficult to accurately detect and interpret. We sequenced and analyzed Oxford Nanopore long-read genomes of 68 individuals from the Undiagnosed Disease Network (UDN) with no previously identified diagnostic mutations from short-read sequencing. Using our optimized SV detection pipelines and 571 control long-read genomes, we detected 716 long-read rare (MAF < 0.01) SV alleles per genome on average, achieving a 2.4x increase from short-reads. To characterize the functional effects of rare SVs, we assessed their relationship with gene expression from blood or fibroblasts from the same individuals, and found that rare SVs overlapping enhancers were enriched (LOR = 0.46) near expression outliers. We also evaluated tandem repeat expansions (TREs) and found 14 rare TREs per genome; notably these TREs were also enriched near overexpression outliers. To prioritize candidate functional SVs, we developed Watershed-SV, a probabilistic model that integrates expression data with SV-specific genomic annotations, which significantly outperforms baseline models that don't incorporate expression data. Watershed-SV identified a median of eight high-confidence functional SVs per UDN genome. Notably, this included compound heterozygous deletions in",
162582+
"language": "en",
162583+
"note": "This CSL Item was generated by Manubot v0.6.1 from its persistent identifier (standard_id).\nstandard_id: pubmed:38585781"
162584+
},
162585+
{
162586+
"id": "pmid:38297326",
162587+
"manubot_success": true,
162588+
"link": "https://www.ncbi.nlm.nih.gov/pubmed/38297326",
162589+
"title": "RExPRT: a machine learning tool to predict pathogenicity of tandem repeat loci.",
162590+
"type": "article-journal",
162591+
"doi": "10.1186/s13059-024-03171-4",
162592+
"authors": [
162593+
["Sarah", "Fazal"],
162594+
["Matt C", "Danzi"],
162595+
["Isaac", "Xu"],
162596+
["Shilpa Nadimpalli", "Kobren"],
162597+
["Shamil", "Sunyaev"],
162598+
["Chloe", "Reuter"],
162599+
["Shruti", "Marwaha"],
162600+
["Matthew", "Wheeler"],
162601+
["Egor", "Dolzhenko"],
162602+
["Francesca", "Lucas"],
162603+
["Stefan", "Wuchty"],
162604+
["Mustafa", "Tekin"],
162605+
["Stephan", "Z\u00fcchner"],
162606+
["Vanessa", "Aguiar-Pulido"]
162607+
],
162608+
"publisher": "Genome biology",
162609+
"issn": "1474-760X",
162610+
"date": "2024-01-31",
162611+
"abstract": "Expansions of tandem repeats (TRs) cause approximately 60 monogenic diseases. We expect that the discovery of additional pathogenic repeat expansions will narrow the diagnostic gap in many diseases. A growing number of TR expansions are being identified, and interpreting them is a challenge. We present RExPRT (Repeat EXpansion Pathogenicity pRediction Tool), a machine learning tool for distinguishing pathogenic from benign TR expansions. Our results demonstrate that an ensemble approach classifies TRs with an average precision of 93% and recall of 83%. RExPRT's high precision will be valuable in large-scale discovery studies, which require prioritization of candidate loci for follow-up studies.",
162612+
"language": "en",
162613+
"note": "This CSL Item was generated by Manubot v0.6.1 from its persistent identifier (standard_id).\nstandard_id: pubmed:38297326"
162614+
},
162615+
{
162616+
"id": "pmid:40357124",
162617+
"manubot_success": true,
162618+
"link": "https://www.ncbi.nlm.nih.gov/pubmed/40357124",
162619+
"title": "Long-read sequencing for diagnosis of genetic myopathies.",
162620+
"type": "article-journal",
162621+
"doi": "10.1136/bmjno-2024-000990",
162622+
"authors": [
162623+
["Dennis", "Yeow"],
162624+
["Laura Ivete", "Rudaks"],
162625+
["Ryan", "Davis"],
162626+
["Karl", "Ng"],
162627+
["Roula", "Ghaoui"],
162628+
["Pak Leng", "Cheong"],
162629+
["Gianina", "Ravenscroft"],
162630+
["Marina", "Kennerson"],
162631+
["Ira", "Deveson"],
162632+
["Kishore Raj", "Kumar"]
162633+
],
162634+
"publisher": "BMJ neurology open",
162635+
"issn": "2632-6140",
162636+
"date": "2025-05-11",
162637+
"abstract": "Genetic myopathies are caused by pathogenic variants in >300 genes across the nuclear and mitochondrial genomes. Although short-read next-generation sequencing (NGS) has revolutionised the diagnosis of genetic disorders, large and/or complex genetic variants, which are over-represented in the genetic myopathies, are not well characterised using this approach. Long-read sequencing (LRS) is a newer genetic testing technology that overcomes many of the limitations of NGS. In particular, LRS provides improved detection of challenging variant types, including short tandem repeat (STR) expansions, copy number variants and structural variants, as well as improved variant phasing and concurrent assessment of epigenetic changes, including DNA methylation. The ability to concurrently detect multiple STR expansions is particularly relevant given the growing number of recently described genetic myopathies associated with STR expansions. LRS will also aid in the identification of new myopathy genes and molecular mechanisms. However, use of LRS technology is currently limited by high cost, low accessibility, the need for specialised DNA extraction procedures, limited availability of LRS bioinformatic tools and pipelines, and the relative lack of healthy control LRS variant databases. Once these barriers are addressed, the implementation of LRS into clinical diagnostic pipelines will undoubtedly streamline the diagnostic algorithm and increase the diagnostic rate for genetic myopathies. In this review, we discuss the utility and critical impact of LRS in this field.",
162638+
"language": "en",
162639+
"note": "This CSL Item was generated by Manubot v0.6.1 from its persistent identifier (standard_id).\nstandard_id: pubmed:40357124"
162640+
},
162641+
{
162642+
"id": "pmid:42094143",
162643+
"manubot_success": true,
162644+
"link": "https://www.ncbi.nlm.nih.gov/pubmed/42094143",
162645+
"title": "Genome-wide detection and clinical prioritization of tandem repeat outliers using long-read sequencing.",
162646+
"type": "article-journal",
162647+
"doi": "10.64898/2026.04.30.26352103",
162648+
"authors": [
162649+
["Sophia B", "Gibson"],
162650+
["Nikhita", "Damaraju"],
162651+
["J Gus", "Gustafson"],
162652+
["Elsa V", "Balton"],
162653+
["Sirisak", "Chanprasert"],
162654+
["Ian A", "Glass"],
162655+
["Martha", "Horike-Pyne"],
162656+
["Runjun D", "Kumar"],
162657+
["Kathleen A", "Leppig"],
162658+
["Chris", "Lundberg"],
162659+
["Jane", "Ranchalis"],
162660+
["Elisabeth A", "Rosenthal"],
162661+
["Andrew K", "Solomon"],
162662+
["Andrew B", "Stergachis"],
162663+
["Mark", "Wener"],
162664+
["Gail P", "Jarvik"],
162665+
["Elizabeth E", "Blue"],
162666+
["Katrina M", "Dipple"],
162667+
["Harriet", "Dashnow"],
162668+
["Lea M", "Starita"],
162669+
["Danny E", "Miller"]
162670+
],
162671+
"publisher": "medRxiv : the preprint server for health sciences",
162672+
"issn": "",
162673+
"date": "2026-05-01",
162674+
"abstract": "Tandem repeat expansions (TREs) cause over 60 known neurological, neuromuscular, and developmental disorders. Detecting these expansions genome-wide is challenging due to their size, sequence complexity (including interruptions), and population variation. While long-read sequencing is an emerging technology that can fully resolve many TREs, no methods have been described for genome-wide identification and prioritization of candidate pathogenic TREs with this technology.",
162675+
"language": "en",
162676+
"note": "This CSL Item was generated by Manubot v0.6.1 from its persistent identifier (standard_id).\nstandard_id: pubmed:42094143"
162677+
},
162678+
{
162679+
"id": "pmid:42095061",
162680+
"manubot_success": true,
162681+
"link": "https://www.ncbi.nlm.nih.gov/pubmed/42095061",
162682+
"title": "Systematic proteomics reveals plasma NEFL as a robust predictor and pathological associate in",
162683+
"type": "article-journal",
162684+
"doi": "10.3389/fnagi.2026.1792887",
162685+
"authors": [
162686+
["Zhen", "Hu"],
162687+
["Jing-Jin", "Wan"],
162688+
["Qin-Qin", "Yan"],
162689+
["Yu", "Fan"],
162690+
["Jun", "Liu"]
162691+
],
162692+
"publisher": "Frontiers in aging neuroscience",
162693+
"issn": "1663-4365",
162694+
"date": "2026-04-21",
162695+
"abstract": "The",
162696+
"language": "en",
162697+
"note": "This CSL Item was generated by Manubot v0.6.1 from its persistent identifier (standard_id).\nstandard_id: pubmed:42095061"
162698+
},
162699+
{
162700+
"id": "pmid:42090775",
162701+
"manubot_success": true,
162702+
"link": "https://www.ncbi.nlm.nih.gov/pubmed/42090775",
162703+
"title": "Challenges in the diagnosis of spinocerebellar ATAXIA 27B.",
162704+
"type": "article-journal",
162705+
"doi": "10.1016/j.jns.2026.125969",
162706+
"authors": [
162707+
["N\u00faria Caballol", "Pons"],
162708+
["Alejandro Peral", "Quir\u00f3s"],
162709+
["Anna", "Planas-Ballv\u00e9"],
162710+
["Paula Lombardo", "Del Toro"],
162711+
["Imma Hernan", "Sendra"],
162712+
["Asunci\u00f3n \u00c1vila", "Rivera"]
162713+
],
162714+
"publisher": "Journal of the neurological sciences",
162715+
"issn": "1878-5883",
162716+
"date": "2026-04-30",
162717+
"abstract": "To characterize the clinical, radiologic, and genetic spectrum of patients with (GAA) repeat expansions in FGF14 gene and to analyze diagnostic challenges associated with spinocerebellar ataxia type 27B (SCA27B).",
162718+
"language": "en",
162719+
"note": "This CSL Item was generated by Manubot v0.6.1 from its persistent identifier (standard_id).\nstandard_id: pubmed:42090775"
162720+
},
162721+
{
162722+
"id": "pmid:42091595",
162723+
"manubot_success": true,
162724+
"link": "https://www.ncbi.nlm.nih.gov/pubmed/42091595",
162725+
"title": "Double strand breaks drive toxicity in a Huntington's disease mouse model with or without somatic expansion.",
162726+
"type": "article-journal",
162727+
"doi": "10.1038/s41467-026-72382-z",
162728+
"authors": [
162729+
["Aris A", "Polyzos"],
162730+
["Ana", "Cheong"],
162731+
["Jung Hyun", "Yoo"],
162732+
["Lana", "Blagec"],
162733+
["Zachary D", "Nagel"],
162734+
["Cynthia T", "McMurray"]
162735+
],
162736+
"publisher": "Nature communications",
162737+
"issn": "2041-1723",
162738+
"date": "2026-05-06",
162739+
"abstract": "Genome-wide association studies (GWAS) have provided strong evidence that modifiers of CAG tract length have a crucial influence on Huntington disease onset, but somatic expansion alone may not be sufficient to drive neuronal death. Here, we report that DSBs drive neuropathology in male HdhQ(150/150) mice, regardless of somatic expansion of the inherited disease allele. DSBs and somatic expansion occur simultaneously in the HD brain, but the two types of DNA damage drive disease by distinct mechanisms. The site-specific increases in CAG tract length are driven by active mismatch repair (MMR), while DSBs occur genome-wide and are driven by mutant huntingtin-mediated suppression of nonhomologous joining of DNA broken ends. DSBs and transcriptional dysfunction occur in animals that cannot somatically expand their inherited allele. Conversely, suppression of DSBs is sufficient to reverse neuropathology even when somatic expansion is active. We propose that CAG expansion and DSBs promote downstream neuronal pathology as separable drivers. The disease-length CAG tract leads to early inhibition of DSBR and accumulating DSBs over time ultimately kill neurons.",
162740+
"language": "en",
162741+
"note": "This CSL Item was generated by Manubot v0.6.1 from its persistent identifier (standard_id).\nstandard_id: pubmed:42091595"
162742+
},
 {
 "id": "omim:309548",
 "manubot_success": false,
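
Each PubMed entry added above repeats its PMID in three places: the `id` field, the end of the `link` URL, and the `standard_id` line inside `note`. A reviewer could spot-check that these agree with a sketch like the following. It is an assumption-laden illustration rather than an existing STRchive check: `check_entry` is a hypothetical name, and non-PubMed ids such as `omim:309548` are simply skipped.

```python
import re


def check_entry(entry: dict) -> list[str]:
    """Return a list of consistency problems for one citation entry.

    Hypothetical helper; only entries whose id looks like "pmid:<digits>" are checked.
    """
    problems = []
    match = re.fullmatch(r"pmid:(\d+)", entry.get("id", ""))
    if not match:
        return problems  # e.g. "omim:..." ids are out of scope for this sketch
    pmid = match.group(1)
    if not entry.get("link", "").endswith(f"/pubmed/{pmid}"):
        problems.append("link does not end with the PMID")
    if f"standard_id: pubmed:{pmid}" not in entry.get("note", ""):
        problems.append("note standard_id does not match the PMID")
    return problems


# One of the entries added in this commit, reduced to the checked fields.
example = {
    "id": "pmid:38297326",
    "link": "https://www.ncbi.nlm.nih.gov/pubmed/38297326",
    "note": "This CSL Item was generated by Manubot v0.6.1 from its persistent "
            "identifier (standard_id).\nstandard_id: pubmed:38297326",
}
print(check_entry(example))
```

To run it over the whole file, one could load `data/STRchive-citations.json` with `json.load` and call `check_entry` on each item, assuming the file's top level is a list of such objects (the diff shows only a fragment, so that layout is not confirmed here).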
