Problem: The COVID (SARS-CoV-2) and RSV automation pipelines use different column names and location formatting in their timeline.tsv files, causing downstream integration issues and unnecessary maintenance burden.
Note: Influenza does not have a timeline.tsv because it does not run the Variant Abundance calculation with Lollipop, so this output is not generated.
Column Name Differences
| Field | COVID Pipeline | RSV Pipeline |
|---|---|---|
| Sample identifier | `sample` | `submissionId` |
| Primer protocol | `proto` | `primerProtocol` |
| Reference genome | (missing) | `reference` |
Example Files
COVID: /cluster/project/pangolin/processes/sars_cov_2/lollipop/variants/timeline.tsv
sample batch reads proto location_code date location
A1_05_2025_11_05 20251128_2511665243 250 v532_pooled 5 2025-11-05 Lugano (TI)
A2_15_2025_11_06 20251128_2511665243 250 v532_pooled 15 2025-11-06 Basel (BS)

RSV: /cluster/project/pangolin/processes/rsv/RSVA/working/timeline.tsv
submissionId batch reads reference primerProtocol location_code date location
A1_05_2025_11_05 20251128_2511665243 250 v532_pooled Eawag-2024-v532_pooled 05 2025-11-05 Lugano
A2_15_2025_11_06 20251128_2511665243 250 v532_pooled Eawag-2024-v532_pooled 15 2025-11-06 Basel

Location Formatting Differences
COVID: Includes canton codes and preserves UTF-8 characters
Lugano (TI), Basel (BS), Zürich (ZH) (preserves umlaut)
RSV: City names only, strips special characters
Lugano, Basel, Zurich (umlaut removed: ü → u)
Impact
- Data integration: Tools consuming both pipelines must handle two different schemas
- Location matching: The Zürich → Zurich transformation breaks joins on location names
- Maintenance burden: Schema changes require updates across multiple codebases
- Error-prone: Easy to accidentally use wrong column names across pipelines
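
Until the schemas are aligned, a consumer that joins on location could compare normalized keys instead of raw strings. The following is a minimal sketch, not code from either pipeline: `location_key` and `_CANTON_SUFFIX` are hypothetical names, and it assumes the canton code always appears as a trailing two-letter suffix in parentheses, as in the examples above.

```python
import re
import unicodedata

# Hypothetical pattern: a trailing canton code such as " (TI)" or " (ZH)".
_CANTON_SUFFIX = re.compile(r"\s*\([A-Z]{2}\)$")


def location_key(location: str) -> str:
    """Return a comparison key for a location name.

    Removes a trailing canton suffix, folds diacritics (ü -> u),
    and lowercases, so both pipeline formats map to the same key.
    """
    name = _CANTON_SUFFIX.sub("", location.strip())
    # NFKD splits "ü" into "u" plus a combining diaeresis, which is then dropped.
    decomposed = unicodedata.normalize("NFKD", name)
    folded = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return folded.casefold()
```

With this key, `location_key("Zürich (ZH)")` and `location_key("Zurich")` compare equal. The key is meant only for matching; the original string should still be kept for display.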
Recommendation
Align schemas between pipelines. Choose one format as canonical and update the other to match.
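
As an illustration of what the alignment could look like, the sketch below renames the RSV header to the COVID column names listed in the table above. It assumes, purely for the example, that the COVID names are chosen as canonical (the issue leaves that choice open) and that the files are tab-separated. `RSV_TO_COVID` and `convert_rsv_timeline` are hypothetical names; the extra `reference` column is passed through unchanged.

```python
import csv
import io

# Assumed mapping, taken from the column comparison in this issue.
RSV_TO_COVID = {"submissionId": "sample", "primerProtocol": "proto"}


def convert_rsv_timeline(rsv_tsv: str) -> str:
    """Rewrite the header of an RSV timeline.tsv using COVID column names.

    Data rows are copied unchanged; only the header row is renamed.
    """
    reader = csv.reader(io.StringIO(rsv_tsv), delimiter="\t")
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")

    header = next(reader)
    writer.writerow([RSV_TO_COVID.get(col, col) for col in header])
    writer.writerows(reader)
    return out.getvalue()
```

A header-only rename like this does not address the location formatting or the missing `reference` column on the COVID side; those would need to be settled as part of choosing the canonical format.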