All notable changes to Cladetime are documented here. Cladetime uses Semantic Versioning.
- Automatic fallback to variant-nowcast-hub archives when Nextstrain S3 historical metadata is unavailable
- New `_get_metadata_from_hub()` function in `cladetime/util/reference.py` to retrieve metadata from variant-nowcast-hub GitHub archives
- Support for historical metadata access dating back to 2024-10-09 via variant-nowcast-hub archives
- Comprehensive test coverage for fallback mechanism with 5 new test cases in `tests/unit/util/test_reference.py`
- New `CladeTimeDataUnavailableError` exception for dates outside Nextstrain S3 data retention window
- Comprehensive negative tests verifying errors are raised for unavailable dates
- Data availability constraints section in README documenting historical data limitations
- Reference to GitHub issue #185 in error messages and documentation for tracking infrastructure changes
- BREAKING: Updated minimum `sequence_as_of` date from 2023-05-01 to 2025-09-29 due to Nextstrain's 90-day S3 retention policy
- BREAKING: CladeTime now raises `CladeTimeDataUnavailableError` for dates outside data availability windows instead of silently defaulting to current date
- BREAKING: Minimum `tree_as_of` date remains 2024-10-09 (via variant-nowcast-hub archives), but now enforced with error instead of warning
- `_get_ncov_metadata()` now accepts optional `as_of_date` parameter to enable fallback support
- `_get_ncov_metadata()` logic simplified to eliminate code duplication and improve clarity (thanks @nickreich for the review feedback)
- `Tree` class now catches `ValueError` from `_get_s3_object_url()` and triggers fallback when metadata is missing
- `CladeTime` class now handles missing S3 metadata gracefully with automatic fallback
- BREAKING: Test infrastructure updated with new `mock_s3_sequence_data()` and `patch_s3_for_tests()` fixtures to handle Nextstrain's October 2025 S3 cleanup
- All integration and unit tests now use `patch_s3_for_tests` fixture to mock S3 calls
- Updated configuration constant `nextstrain_min_seq_date` to reflect new data availability constraints
- Updated retention policy language from "approximately 7 weeks" to "90 days" for clarity and accuracy
- Simplified test infrastructure by removing complex S3 mocking in favor of testing actual behavior
- Updated integration tests to use dates within data availability window (>= 2025-09-29)
- Increased dataset staleness threshold from 60 to 90 days in integration tests
- Removed `test_cladetime_assign_clades_historical` test that relied on unavailable historical data (2024-10-30)
- Removed complex mocking from `test_cladetime_urls` and `test_cladetime_ncov_metadata` unit tests
- See GitHub issue #185 for discussion of restoring historical test coverage
- CladeTime no longer fails when accessing historical dates after Nextstrain's October 2025 cleanup of S3 metadata files
- Tests now pass consistently regardless of Nextstrain S3 historical data availability
- Proper error handling and logging when both S3 and fallback sources are unavailable
- Test assertions now match updated error message language (90 days)
- Removed unused imports from test files
These changes reflect Nextstrain's October 2025 implementation of a 90-day S3 retention policy for versioned objects. Historical data beyond this window has been permanently deleted. This limitation may change as Nextstrain's infrastructure evolves. Users requiring access to historical data should consider archiving datasets locally or using alternative data sources.
The breaking changes in this release are necessary due to external infrastructure changes beyond CladeTime's control. The date validation ensures users receive clear error messages when requesting data that is no longer available, rather than silent failures or incorrect defaults. Further adjustments may be needed to ensure full compatibility with variant-nowcast-hub workflows.
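The date validation described above can be illustrated with a minimal sketch. This is not Cladetime's actual implementation: the exception class here is a local stand-in with the same name, and `validate_as_of`, `MIN_SEQUENCE_AS_OF`, and `MIN_TREE_AS_OF` are hypothetical names introduced for illustration. Only the two minimum dates come from the entries above.

```python
from datetime import datetime, timezone


# Local stand-in for illustration; Cladetime's real exception may be defined differently.
class CladeTimeDataUnavailableError(Exception):
    """Raised when a requested date falls outside the data availability window."""


# Minimum dates as stated in this changelog (constant names are hypothetical).
MIN_SEQUENCE_AS_OF = datetime(2025, 9, 29, tzinfo=timezone.utc)
MIN_TREE_AS_OF = datetime(2024, 10, 9, tzinfo=timezone.utc)


def validate_as_of(as_of: datetime, minimum: datetime, label: str) -> datetime:
    """Return a timezone-aware as_of unchanged if data exists for it; otherwise raise.

    Raising here replaces the old behavior of silently defaulting to the current date.
    """
    if as_of < minimum:
        raise CladeTimeDataUnavailableError(
            f"{label}={as_of.date()} is earlier than the minimum available date "
            f"{minimum.date()}; see GitHub issue #185."
        )
    return as_of
```

Under these assumptions, a request for 2024-10-30 passes the `tree_as_of` check but raises for `sequence_as_of`, so callers see an explicit error rather than results computed for a different date than they asked for.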
- Cladetime now has a CHANGELOG
- Acknowledgements section in the README
- Performance improvement: use biobear as .fasta file reader for ZSTD-compressed sequence data
- `sequence_as_of` and `tree_as_of` timestamps now default to 23:59:59 UTC instead of 00:00:00 UTC
- Publish Cladetime to PyPI
- Make the Clade class public
- Contributing guidelines
- Cladetime package documentation
- Support for demo mode that uses Nextstrain's 100k sample instead of an entire SARS-CoV-2 sequence dataset
- New `CladeTime.assign_clades` method that assigns clades to SARS-CoV-2 sequences using a point-in-time reference tree
- New `nextclade_dataset_name` attribute in `CladeTime.ncov_metadata`
- Warning message when Docker is not detected during Cladetime initialization
- Package renamed to `cladetime`
- Output clade assignments as .tsv instead of .csv
- Fix UTC timezone bug when setting `CladeTime.sequence_as_of` and `CladeTime.tree_as_of`
- Cladetime CLI removed in favor of programmatic usage
- The `get_clade_list.py` functionality has moved to the `variant-nowcast-hub`
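
As an illustration of the entry above that moves the default time for `sequence_as_of` and `tree_as_of` to 23:59:59 UTC, the sketch below shows one way a date-only value could be expanded to the end of that day. The helper name `end_of_day_utc` is hypothetical, and the `YYYY-MM-DD` input format is an assumption. Cladetime's own parsing may differ.

```python
from datetime import datetime, time, timezone


def end_of_day_utc(date_str: str) -> datetime:
    """Expand a YYYY-MM-DD string to 23:59:59 UTC on that day (hypothetical helper)."""
    day = datetime.strptime(date_str, "%Y-%m-%d").date()
    # Attaching 23:59:59 means data published at any time during the day is included.
    return datetime.combine(day, time(23, 59, 59), tzinfo=timezone.utc)
```

With a midnight default, an as-of date would exclude anything released later that same day. An end-of-day default treats the date as inclusive.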