Releases: cbg-ethz/sr2silo
v1.8.0 - Robust Reference Handling
What's New
Highlights
- New CLI flags for explicit reference paths (--nuc-ref, --aa-ref)
- Environment variable support (NUC_REF, AA_REF)
- XDG-compliant caching for LAPIS-fetched references (~/.cache/sr2silo/references/)
- Major documentation overhaul with new logo and improved structure
Features
- Add --nuc-ref and --aa-ref CLI flags for explicit reference file paths
- Support NUC_REF and AA_REF environment variables for reference configuration
- Use ~/.cache/sr2silo/references/ for LAPIS-fetched references (XDG-compliant)
- Move timeline_columns.yml inside package using importlib.resources
- Add NUC_REF/AA_REF support to Snakemake workflow
- Workflow conda environment now includes sr2silo>=1.8.0
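As an illustration of the XDG-compliant cache location, here is a minimal sketch of how such a directory is typically derived. The helper name reference_cache_dir is hypothetical, and honouring XDG_CACHE_HOME is an assumption based on the XDG Base Directory convention, not a description of sr2silo's actual code.

```python
import os
from pathlib import Path


def reference_cache_dir() -> Path:
    """Return a cache directory for fetched references (hypothetical helper)."""
    # XDG convention: use $XDG_CACHE_HOME when it is set and non-empty,
    # otherwise fall back to ~/.cache.
    base = os.environ.get("XDG_CACHE_HOME") or str(Path.home() / ".cache")
    return Path(base) / "sr2silo" / "references"
```

With XDG_CACHE_HOME unset, this resolves to ~/.cache/sr2silo/references/, matching the path listed above.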
Reference Resolution Priority
- CLI flags (--nuc-ref, --aa-ref)
- Environment variables (NUC_REF, AA_REF)
- LAPIS auto-fetch with caching
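The priority order above can be sketched as a simple fallback chain. The function resolve_reference and its fetch_from_lapis callback are hypothetical names used only for illustration; the real implementation may differ.

```python
import os
from pathlib import Path
from typing import Callable, Optional


def resolve_reference(
    cli_path: Optional[str],
    env_var: str,
    fetch_from_lapis: Callable[[], Path],
) -> Path:
    """Pick a reference path: CLI flag, then environment variable, then LAPIS."""
    if cli_path:  # 1. explicit --nuc-ref / --aa-ref value
        return Path(cli_path)
    env_path = os.environ.get(env_var)
    if env_path:  # 2. NUC_REF / AA_REF
        return Path(env_path)
    # 3. auto-fetch; caching is assumed to happen inside the callback
    return fetch_from_lapis()
```

For example, resolve_reference(None, "NUC_REF", fetch) returns the NUC_REF path when that variable is set and only calls fetch otherwise.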
Bug Fixes
- Fix full URL path for multi-virus reference directories
- Fix non-user specific path handling
- Handle Snakemake exit codes for smarter job resubmission
Documentation
- Overhaul MkDocs documentation with new logo and styling
- Streamline README
- Add new API documentation pages (process, schema, vpipe)
Breaking Changes
- Genomic references moved from resources/references/ to examples/references/
Full Changelog: v1.7.0...v1.8.0
v1.7.0 - Multi-Virus Support & Reference Filtering
Highlights
This release brings multi-virus support to sr2silo, starting with RSV-A alongside the existing SARS-CoV-2 pipeline. It also introduces critical reference filtering for accurate variant calling in mixed-reference BAM files.
New Features
Multi-Virus Framework
- RSV-A Support - Full organism-specific configuration for RSV-A processing
- Multi-virus deployment framework - Scalable architecture for adding new pathogens with automated Loculus submission
Reference Filtering for BAM Files
- --reference-accession flag for the process-from-vpipe command: filter reads by reference sequence when processing BAM files containing alignments to multiple similar references (e.g., RSV-A/RSV-B)
- Logging of filtering statistics showing kept/filtered read counts and percentages
- Raises ZeroFilteredReadsError when filtering results in zero reads
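The filtering behaviour described above can be sketched without any BAM library by treating each read as a (read_name, reference_name) pair. Everything here is illustrative: filter_by_reference is a hypothetical function, the exception class is a local stand-in for the one named in these notes, and the accessions "refA" and "refB" are placeholders.

```python
from typing import Iterable, List, Tuple


class ZeroFilteredReadsError(Exception):
    """Local stand-in for the error named in the release notes."""


def filter_by_reference(
    reads: Iterable[Tuple[str, str]], accession: str
) -> List[Tuple[str, str]]:
    """Keep (read_name, reference_name) pairs aligned to `accession`."""
    all_reads = list(reads)
    kept = [read for read in all_reads if read[1] == accession]
    total = len(all_reads)
    pct = 100.0 * len(kept) / total if total else 0.0
    # Report kept/filtered counts, mirroring the logging described above.
    print(f"kept {len(kept)}/{total} reads ({pct:.1f}%), filtered {total - len(kept)}")
    if not kept:
        raise ZeroFilteredReadsError(f"no reads aligned to {accession}")
    return kept
```

Raising on an empty result makes a mistyped accession fail loudly instead of silently producing an empty output file.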
Auto-Release for Loculus Submissions
- --auto-release / -r flag for submit-to-loculus: automatically approve and release sequences after submission
- --release-delay option (default 180s) to wait for backend processing
- Environment variable support via AUTO_RELEASE=true
Improvements
- Updated project status badge from "concept" to "active" - we're production-ready!
- Cleanup of legacy sbatch scripts and LOG_DIR paths
Dependency Updates
- ruff 0.14.8 → 0.14.11
- pyright 1.1.407 → 1.1.408
- mkdocs-material 9.7.0 → 9.7.1
- actions/cache v4 → v5
- Various pip dependency bumps
Full Changelog
Features:
- feat: add --reference-accession filter for BAM files (#424)
- feat: add --auto-release option to submit-to-loculus command (#419)
- feat: multi-virus deployment framework with RSV-A and automated loculus submission
- feat: add RSV-A support with organism-specific configuration
Maintenance:
- chore: bump version to 1.7.0
- chore: cleanup legacy sbatch and fix LOG_DIR paths
- Update README status badge from concept to active
Full Changelog: v1.6.1...v1.7.0
v1.6.1
Performance
- Diamond DB Optimization: Significantly improved batch processing performance by creating the Diamond database once and reusing it, rather than recreating it for every batch of reads.
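The optimization amounts to hoisting an expensive setup step out of the per-batch loop. A minimal sketch of that pattern follows; process_batches, build_db and align are hypothetical callables and do not reflect sr2silo's actual function names or the Diamond command line.

```python
from typing import Callable, Iterable, List


def process_batches(
    batches: Iterable[List[str]],
    build_db: Callable[[], str],
    align: Callable[[str, List[str]], int],
) -> List[int]:
    """Build the database once, then reuse it for every batch."""
    db_path = build_db()  # expensive step, done once instead of per batch
    return [align(db_path, batch) for batch in batches]
```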
Maintenance
- Updated project dependencies and CI/CD workflows.
Full Changelog: v1.6.0...v1.6.1
Release 1.6.0
Most importantly:
- Fixed a bug in amino acid sequences with deletions
- Rename "Submission_URL" to "Backend_URL" to align with Loculus documentation
- refactor: migrate metadata fields to camelCase for Loculus compatibility
What's Changed
- pip(deps): bump pyyaml from 6.0.2 to 6.0.3 by @dependabot[bot] in #352
- pip(deps-dev): bump mkdocs-material from 9.6.20 to 9.6.21 by @dependabot[bot] in #353
- pip(deps-dev): bump pyright from 1.1.405 to 1.1.406 by @dependabot[bot] in #354
- pip(deps): bump pydantic from 2.11.9 to 2.11.10 by @dependabot[bot] in #355
- pip(deps): bump pydantic from 2.11.10 to 2.12.0 by @dependabot[bot] in #357
- fix(sam_to_seq_and_indels) – bug amino acid sequences with deletions by @gordonkoehn in #360
- Rename "Submission_URL" to "Backend_URL" to align with Loculus documentation by @Copilot in #359
- feat: trigger sr2silo by @gordonkoehn in #356
- Revert "feat: trigger sr2silo" by @gordonkoehn in #361
- feat: trigger sr2silo originally #356 by @gordonkoehn in #362
- pip(deps): bump pydantic from 2.12.0 to 2.12.3 by @dependabot[bot] in #363
- pip(deps-dev): bump mkdocs-material from 9.6.21 to 9.6.22 by @dependabot[bot] in #364
- pip(deps-dev): bump pyright from 1.1.406 to 1.1.407 by @dependabot[bot] in #370
- chore: Replace Black with ruff-format and update to ruff v0.14.2 by @Copilot in #373
- refactor: migrate metadata fields to camelCase for Loculus compatibility by @gordonkoehn in #371
- Bump version to 1.6.0 for next major release by @Copilot in #376
- fix: metadata submission by @gordonkoehn in #381
- Release 1.6.0 by @gordonkoehn in #374
Full Changelog: v1.5.0...v1.6.0
Release v1.5.0
This release introduces several significant updates to the Loculus submission workflow and related configuration, focusing on stricter environment variable handling, improved duplicate submission detection, enhanced file upload logic, and more robust metadata handling. The changes also include dependency and workflow updates, and some cleanup of environment and requirements files. We also resolve a critical bug: a +1 shift in positions of amino acid alignments.
Key changes include:
Critical Bug Fix: Amino Acid Alignment Shift
- This turned out to be a double bug.
- First, positions were treated as equal to offsets, but offset = position - 1.
- This led to a shift in amino acids and nucleotides by +1.
- Second, the shift was compensated by a 0-indexing error of the BAM in pysam.
- Corrected bam_to_fastq_handle_indels().
- Done in #350.
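The relationship between 1-based positions and 0-based offsets can be shown in a few lines. This is a generic illustration of the off-by-one, not the project's code; position_to_offset is a hypothetical helper.

```python
def position_to_offset(position: int) -> int:
    """Convert a 1-based position to a 0-based offset."""
    if position < 1:
        raise ValueError("1-based positions start at 1")
    return position - 1


reference = "ACGT"
# Position 1 is the first base, which lives at offset 0.
assert reference[position_to_offset(1)] == "A"
# Using the position directly as an offset reads one base too far (the +1 shift).
assert reference[1] == "C"
```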
Loculus Submission Workflow Improvements
- The submit method in LoculusClient now requires both a processed file and a nucleotide alignment file, uploading both via pre-signed URLs and mapping them with correct key names (siloReads and nucleotideAlignment). It also adds a resubmit_duplicate flag to control duplicate submission behavior and checks for previously released samples using a new released_samples function. [1] [2] [3] [4] [5]
- Added the released_samples and get_original_metadata utility functions to fetch and process released sample IDs from the Loculus API, handling revoked entries and supporting both JSON and NDJSON API responses.
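Supporting both JSON and NDJSON responses can be sketched as follows. The function parse_records is hypothetical, and it assumes that a JSON response is a top-level array while NDJSON carries one compact JSON document per line; the actual Loculus response handling may differ.

```python
import json
from typing import Any, List


def parse_records(body: str) -> List[Any]:
    """Parse a response body that is either a JSON array or NDJSON."""
    text = body.strip()
    if not text:
        return []
    if text.startswith("["):
        # Regular JSON: a single top-level array.
        return json.loads(text)
    # NDJSON: one JSON document per non-empty line.
    return [json.loads(line) for line in text.splitlines() if line.strip()]
```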
Metadata and Field Handling
- The create_metadata_file method now maps specific metadata fields from snake_case to camelCase and writes only the specified fields, ensuring consistent output. The parse_metadata method is updated to return metadata in snake_case, aligning with internal conventions. [1] [2] [3]
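A minimal sketch of the snake_case to camelCase mapping with field selection is shown below. The helper names and the example keys are hypothetical, and the project may use an explicit field mapping rather than this generic conversion.

```python
from typing import Any, Dict, List


def snake_to_camel(name: str) -> str:
    """Convert e.g. 'sampling_date' to 'samplingDate'."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def select_and_rename(metadata: Dict[str, Any], fields: List[str]) -> Dict[str, Any]:
    """Keep only `fields` and rename their keys to camelCase."""
    return {snake_to_camel(field): metadata[field] for field in fields}
```

Keeping snake_case internally and converting only at the output boundary confines the naming convention of the external API to one place.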
Configuration and Dependency Updates
- The .env.example file is removed, and all code for loading .env files via python-dotenv is deleted, reflecting a move to rely solely on system environment variables. Corresponding dependencies (python-dotenv) are removed from pyproject.toml and conda-recipe/meta.yaml. [1] [2] [3] [4]
- The get_organism function now enforces that the ORGANISM environment variable must be set, exiting with an error if it is missing.
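Enforcing a required environment variable usually looks like the sketch below. The helper require_env is a hypothetical, generic stand-in and is not the project's get_organism; the error message is likewise illustrative.

```python
import os
import sys


def require_env(name: str) -> str:
    """Return the value of an environment variable or exit with an error."""
    value = os.environ.get(name)
    if not value:
        # sys.exit with a string prints it to stderr and exits with status 1.
        sys.exit(f"{name} environment variable must be set")
    return value
```

Calling require_env("ORGANISM") at startup fails fast when the variable is missing, rather than falling back to a silent default.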
CI and Workflow Modernization
- GitHub Actions workflows are updated to use actions/setup-python@v6 instead of @v5 for both documentation and testing jobs. [1] [2]
Minor Cleanups
- Imports and function calls in main.py are updated to reflect the refactored submission logic. [1] [2]
Full Changelog: v1.5.0...v1.5.0
v1.4.0
🎉 Key Features
Enhanced Metadata Support for Loculus (#322)
- Added comprehensive metadata extraction from SILO input files
- Includes sample ID, batch ID, location, and sampling date parsing
- Supports both compressed (.zst) and uncompressed input files
- Significantly improves data quality for Loculus submissions
🔧 Improvements
Simplified Version Management (#323)
- Removed Git-based versioning for consistent behavior across environments
- Cleaner codebase with reduced dependencies
Workflow Reliability (#316)
- Added retry functionality to improve workflow stability
- Updated conda package distribution
📦 Dependencies
- Updated development dependencies (mkdocs-material, pyright, mkdocstrings-python)
- Updated GitHub Actions to latest versions
v1.3.0
This release upgrades sr2silo to support SILO input format version 0.8.0, implementing a new JSON schema structure that flattens metadata fields to the root level and restructures genomic segments with explicit sequence, insertions, and offset fields.
Key changes:
In v1.2.0 (pre-release):
- Migrated from nested JSON structure to flat schema with root-level metadata
- Replaced padded alignments with offset-based positioning for better efficiency
- Updated schema validation to distinguish between nucleotide and amino acid segments
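The move from padded alignments to offset-based positioning can be illustrated with a small conversion. This is a sketch under stated assumptions: padded_to_offset is a hypothetical helper, and the padding character "N" is assumed for illustration only.

```python
from typing import Tuple


def padded_to_offset(padded: str, pad_char: str = "N") -> Tuple[int, str]:
    """Split a padded alignment into (offset, unpadded sequence)."""
    without_left = padded.lstrip(pad_char)
    offset = len(padded) - len(without_left)  # number of leading pad characters
    return offset, without_left.rstrip(pad_char)
```

Storing the offset and the bare sequence avoids writing reference-length strings for every short read. Note that this naive version would also strip genuine pad-like characters at the read ends.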
In v1.3.0:
The most significant change is making the LAPIS URL optional with intelligent fallback to default SARS-CoV-2 references (NC_045512.2) when LAPIS is unavailable or not specified.
- Refactored reference genome handling to use fallback logic when LAPIS URL is not provided or fails
- Made read_id a required field in the SILO schema with proper validation and positioning
- Updated documentation and configuration files to reflect optional LAPIS URL usage
- Improved file handling in the paired-end read merger to overwrite existing output files rather than raising an error, making workflows more robust.
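The fallback logic for an optional LAPIS URL can be sketched as below. The function load_reference and its fetch callback are hypothetical; only the default accession NC_045512.2 comes from these notes, and catching a broad Exception is a simplification for the sketch.

```python
from typing import Callable, Optional

DEFAULT_REFERENCE = "NC_045512.2"  # default SARS-CoV-2 reference named above


def load_reference(lapis_url: Optional[str], fetch: Callable[[str], str]) -> str:
    """Use LAPIS when a URL is given and reachable, else the default reference."""
    if lapis_url:
        try:
            return fetch(lapis_url)
        except Exception:
            # LAPIS unavailable: fall through to the default reference.
            pass
    return DEFAULT_REFERENCE
```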
Upgrading to SILO 0.8.0 format
This release upgrades sr2silo to support SILO input format version 0.8.0, implementing a new JSON schema structure that flattens metadata fields to the root level and restructures genomic segments with explicit sequence, insertions, and offset fields.
Key changes:
- Migrated from nested JSON structure to flat schema with root-level metadata
- Replaced padded alignments with offset-based positioning for better efficiency
- Updated schema validation to distinguish between nucleotide and amino acid segments
- Leading to a major bump in version sr2silo v1.2.0
Full Changelog: v1.0.1...v1.2.0
v1.1.1
What's Changed
This minor patch:
- upgrades Pyright
- ensures the example workflow has all transitive dependencies, streamlines dependency files, and bumps smallgenomeutilities
- adds automatic environment handling via conda in Slurm submission files
- ensures unique filenames for the submission metadata file in the submit-to-loculus rule, which makes parallel execution of sr2silo more stable
- enhances the pyproject project description
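One common way to get unique filenames for parallel jobs is a random suffix, as in this sketch. The helper unique_metadata_path, the "metadata" stem and the .tsv extension are assumptions for illustration, not the names the workflow actually uses.

```python
import uuid
from pathlib import Path


def unique_metadata_path(directory: Path, stem: str = "metadata") -> Path:
    """Return a path with a random suffix so parallel jobs do not collide."""
    return directory / f"{stem}_{uuid.uuid4().hex}.tsv"
```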
Log
- v1.1.0 - reiterate interfaces & simplify metadata by @gordonkoehn in #283
- Bumping Version and minor cleanups (#289) by @gordonkoehn in #291
- pip(deps-dev): bump pyright from 1.1.402 to 1.1.403 by @dependabot[bot] in #292
- adding read pair merger to smk workflow end + bump up by @gordonkoehn in #293
- Automatic sbatch conda loader by @gordonkoehn in #295
Full Changelog: v1.0.1...v1.1.1
v1.1.0
What's Changed
This PR upgrades sr2silo to v1.1.0 by overhauling the metadata interface, centralizing sample selection via timeline.yaml, and integrating dynamic reference fetching and authentication improvements.
- Replace hardcoded SAMPLE_BATCH_IDS with dynamic filtering from timeline.yaml
- Fetch nucleotide/amino acid references from LAPIS at runtime and simplify metadata handling
- Update submission to use LoculusClient with CLI-based authentication and add SBATCH templating scripts
- v1.1.0 - reiterate interfaces & simplify metadata by @gordonkoehn in #283
Full Changelog: v1.0.1...v1.1.0