v2.2.14 is a maintenance-only release based on v2.2.13.
- Added the local
scripts/release_gate.pyrelease gate. - Added architecture audit documentation covering release tooling, CLI/config, workflow paths, acquisition, selection/evidence, taxonomy sources, reports, diagnostics, and tests.
- Added CLI compatibility and workflow dispatch characterization tests.
- Reorganized the release process and checklist into clearer three-phase preparation, verification, and publication steps.
- Clarified the maintenance-release gate and local release readiness checks.
- Continued the gradual
cli.pyparser, configuration, and dispatch refactoring without changing operator-facing workflow behavior.
- Fixed Python 3.10 compatibility for the local release gate.
- No changes to download strategy, selection policy, or evidence thresholds.
- No provider or ATCC automatic download support is introduced.
- No new real-genus coverage is claimed.
- No live downloads are required for this maintenance release.
v2.2.13 is a maintenance-only release based on v2.2.12.
- Switched the default omitted-
--outdirbehavior to workspace-based run directories. - Reorganized the maintained documentation system around current entry points, contracts, workspace policy, and archived historical notes.
- Added documentation and workspace hygiene checks, including CI coverage for documentation hygiene.
- Improved release consistency checking for version anchors and release readiness metadata.
- No changes to download strategy, selection policy, or evidence thresholds.
- No provider or ATCC automatic download support is introduced.
- No new real-genus coverage is claimed.
v2.2.12 is a maintenance-only release based on v2.2.11.
- Added release consistency checker.
- Reduced duplicated current-version wording in docs.
- Documented
handoff_index.mdas a package navigation/operator handoff artifact. - Hardened maintenance release checklist/process.
- No changes to download strategy, selection safety, or evidence thresholds.
v2.2.11 is a maintenance/refactor-only release based on v2.2.10.
- No behavior changes.
- No selection policy changes.
- No evidence threshold changes.
- No download strategy changes.
- No real download validation.
v2.2.10 is a small UX/reporting polish release based on v2.2.9 real-world validation.
next-stepavoids repeated Entrez fallback suggestions after fallback completion.- Plan-only
next-stepnow prioritizes selection review and guarded downloads. - Taxonomy enrichment summaries clarify offline scaffold and cache-only runs.
package-resultsnow writeshandoff_index.md.
- No changes to download strategy, selection safety, or evidence thresholds.
v2.2.9 improves handoff robustness and safe rerun behavior.
- Prevent accidental cross-genus reuse of an existing outdir; use
--allow-genus-changeonly when intentionally rebuilding an outdir for a different genus. - Improve zero accepted checklist guidance by pointing users to
excluded_lpsn_taxa.tsvinstead of guarded downloads. - Improve NCBI BioSample transient backend/network failure guidance with retry/cache-based next steps.
- Expand
package-results --failed-handoffto include available early acquisition, cache, and diagnostic artifacts. - Clarify plan-only run reviews when downloads were not executed.
- Normal delivery packaging still requires
manifest.tsv. --failed-handoffonly expands optional artifacts in failed-handoff mode.
- Added
package-results --failed-handoffto create review artifacts for failed runs that stop beforemanifest.tsv.
- Improved
next-stepguidance for duplicate selected assembly accession failures.
- Normal delivery packaging still requires
manifest.tsv. - Failed handoff packages are review artifacts, not completed genome delivery packages.
- Cleaned up the v2.2.x handoff path around
completion/manual_supplement_hints.tsv, making the manual supplement task queue the explicit place for curator follow-up after expanded discovery, rejected candidate review, or missing external candidate checks. - Aligned
report/summary.md,report/run_review.md,status, andnext-stepvocabulary aroundreason,recommended_action, andhandoff_pathso ordinary users can follow the same review path from reports and CLI navigation. - Recorded the Clostridium limited smoke as a small cache-based/synthetic verification of guarded planning, handoff visibility, status/report output, and packaging, not as genus completion.
- Tightened release documentation and install reproducibility checks so package
metadata,
typetreeflow.__version__, CLI--version, README, docs, and release notes agree on2.2.7. - Non-goals: no full Clostridium completion, no expanded discovery auto-selection, no provider/ATCC auto-download, and no evidence model rewrite.
- Strengthened the representative species identity guard so top-ranked exploratory fallbacks cannot cover a different checklist species.
- Rejected explicit organism/checklist species mismatches before auto-selection, keeping mismatched candidates out of balanced and representative selected outputs.
- Made duplicate selected accessions fail at the selection stage and surface a clear next-step explanation instead of allowing ambiguous downstream plans.
- Updated
report/summary.mdandreport/run_review.mdexplanations forrejected_species_mismatchandspecies_identity_mismatchrows. - Verified the Clostridium cache-based plan-only regression with no duplicate
selected accession and no erroneous
GCF_055383455.1coverage forClostridium nitritogenes.
- Unified the package metadata, Python package version, CLI
--version,doctordiagnostics, README, and release notes on version2.2.5. - Improved Entrez fallback provenance reporting so fallback-derived sequence records are easier to distinguish and audit in reports.
- Clarified 16S provenance by separating same-genome barrnap 16S evidence from Entrez fallback 16S evidence.
- Improved resume behavior, report handoff clarity, and release credibility checks for the v2.2.x workflow.
- Added v2.2.x integration release notes and an end-to-end acceptance checklist for the v2.2.2 shared acquisition/gap reporting, v2.2.3 expanded discovery audit, and v2.2.4 NCBI Taxonomy enrichment work.
- Clarified that expanded discovery and taxonomy-derived rows are audit-only:
they do not automatically modify
manifest.tsv,selection/user_selection.tsv, completion metrics, or evidence levels, and they do not promise automatic 100% coverage. - Completed release verification with 932 passing tests, CLI smoke coverage for help/doctor commands, and targeted smoke checks for report/run_review, resume, Entrez fallback provenance, and expanded discovery history.
- Fixed resume dry-run handling so
--dry-runtakes priority over real execution enable flags, and protected combined 16S assembly from primary FASTA header collisions while preserving Entrez fallback provenance fields.
- Raised TSV CSV field-size handling for release verification, status, delivery, and CLI summary readers so long NCBI Datasets CLI stderr fields do not break post-download reporting.
- Hardened manifest path normalization and resolution for repo-relative, outdir-relative, and manifest-relative genome and 16S paths.
- Fixed barrnap execution to resolve genome FASTA inputs relative to the manifest while keeping rrna plan outputs portable and relative.
- Added regression coverage from the Spirosoma post-release validation path.
- Added high-level
verify-genusorchestration for LPSN-first genus planning, guarded download execution with--auto-accept-selection --enable-downloads, and optional barrnap 16S extraction. - Added
verify-release-genusto run balanced and representative release verification directories and updateverification_matrix.tsvplus a Markdown release summary. - Added
package-resultsdelivery bundles with manifest, selection, evidence, download, report, genome, and 16S handoff files while excluding credentials, NCBI ZIP cache files, and local environment files. - Added
doctor,status, andnext-stepcommands for dependency checks, run-state inspection, and actionable workflow guidance. - Added
run_state.jsonas a stable machine-readable workflow status file. - Structured selection evidence in
manifest.tsv, preserving strict-confirmed, likely type-material, and representative-only layers across resume and report generation. - Standardized manifest genome and 16S paths as relative POSIX paths while retaining compatibility with older Windows-style manifest paths.
- Hardened dry-run, BioSample Entrez, and high-level download argument validation so conflicting combinations fail before workflow execution.
- Improved missing external-tool messages, including explicit guidance that the
required
datasetsexecutable is the NCBI Datasets CLI, not the Python package nameddatasets. - Updated README, cookbook, release checklist, and output-layout documentation so high-level workflow commands are the recommended user entry point.
- Added
manual_review_report.mdas a stable human-readable companion to the manual deposit-evidence and species-gap review outputs. - Added stable
ranking_reasonsandblocking_reasonsexplanation fields touser_selection.tsvso candidate ranking and policy blocks are auditable. - Hardened BioSample field-level deposit ID extraction and stable BioSample evidence markers for candidate rows, including deposit ID fields and type-material evidence markers.
- Added
selection/download_preflight_summary.tsvas a single-row preflight summary for selected records before download planning/execution. - Added representative-only download risk summaries so exploratory representative rows are visible before guarded downloads and report-only summaries.
- Updated the documented strict, likely type-material, and representative-only routes so strict completion remains deposit-equivalence based while balanced and representative routes remain review/exploration surfaces.
- Strengthened provider planning and external FASTA boundary documentation: provider planning remains review-only, and external FASTA registration remains curator-reviewed local evidence.
- Non-goals: this release does not add ATCC, DSMZ, JCM, NCTC, or other
provider auto-downloaders; does not loosen strict completion; and does not
count
likely_type_materialorrepresentative_onlyrows toward strict type-strain completion.
- Added the
representativeselection policy for exploratory top-ranked fallback selection while marking unconfirmed rows as representative-only. - Added selection, report, and manifest risk layering with
evidence_levelandtype_confirmation_statusvalues for strict-confirmed, likely type-material, and representative-only records. - Tightened
balancedselection to strong-evidence automatic selection only. - Hardened completion audit counting so likely type-material and representative-only rows cannot inflate strict completion metrics.
- Added BioSample enrichment UX guidance for strict and balanced acquisition when guarded Entrez lookup is appropriate.
- Stable provider automation framework release for guarded, review-only provider planning on top of the stable LPSN-first type-strain acquisition and audit workflow.
- Stabilized the provider capability registry with explicit unavailable, planning-only, metadata-only, and download-enabled capability states.
- Kept provider network behavior disabled by default.
- Stabilized ATCC planning-only behavior behind the downloader gate, so ATCC remains planning-only unless a future review explicitly enables downloader behavior.
- Added the credential-like redaction helper as a stable safety boundary for provider planning outputs.
- Documented the private provider cache policy: provider data stays out of
cache/ncbi/, and this release does not write provider cache artifacts. - Stabilized review-only provider planning as a supported planning and audit surface that does not create acquisition evidence or alter completion metrics.
- This stable release does not include an ATCC or provider downloader, provider
login, scraping, browser automation, credential storage, terms acceptance,
provider manifest/name-map writes, provider download-plan writes, provider
external_genomes.tsvwrites, or completion metric changes from provider planning rows.
- Added the provider automation framework skeleton with provider adapter interface, registry, policy boundaries, and offline provider planning notes.
- Added the provider capability registry with fail-closed statuses for unavailable, planning-only, metadata-only, and download-enabled modes.
- Kept provider network behavior disabled by default, with provider planning continuing to emit review-only rows and no network, download, credential, manifest, name-map, external registration, or NCBI download-plan actions.
- Added ATCC planning-only behavior gated by an unavailable/not-passed downloader review, including deterministic gate-failure and user-assisted handoff notes.
- Added a credential-like redaction helper for provider planning outputs.
- Added private provider cache policy boundaries that keep provider data out of
cache/ncbi/and do not write provider cache artifacts in this skeleton. - Documented that provider planning remains review-only and does not change completion metrics until curator-reviewed external registration occurs.
- This release candidate does not include an ATCC or provider downloader,
provider login, scraping, browser automation, credential storage, terms
acceptance, provider manifest/name-map writes, provider download-plan writes,
provider
external_genomes.tsvwrites, or completion metric changes from provider planning rows.
- Hardened provider handoff planning so proposal outputs remain explicit review-only rows with local FASTA path, SHA-256, terms, and manual-review prompts before any external registration handoff.
- Added provider proposal review-only guarantees: provider planning proposals
always stay
external_genome_manual_review_requiredand must be copied intoexternal_genomes.tsvbefore--register-external-genomescan validate and install them. - Expanded report-only provider proposal summaries with risk/count reporting for proposed rows, unexpected registered statuses, manual-review rows, missing local FASTA paths, and missing SHA-256 checksums.
- Polished the real external
F. mortiferumpilot workflow and templates to clarify reviewed proposal handoff, installed local FASTA evidence, checksum verification, and downstream mixed-provenance readiness. - Added a local artifact normalization design for a future offline, curator-supplied FASTA preparation layer before external registration.
- Kept tests and documentation consistent around provider proposal boundaries, local artifact non-scope, report-only behavior, and completion metric separation.
- This release does not include an ATCC or provider downloader, provider login, scraping, credential handling, provider download, provider manifest/name-map writes, provider download-plan writes, or completion metric changes from provider proposals or normalization outputs.
- Stable v1.0.0 release of the LPSN-first type-strain acquisition and audit workflow, covering the guarded CLI, stable I/O contracts, completion boundaries, provider-planning review boundary, documentation, and packaging readiness.
- This stable release does not include an ATCC or provider downloader, provider login, scraping, browser automation, credential handling, provider downloads, provider-backed manifest/name-map writes, provider-backed NCBI download-plan writes, FASTA installation from provider planning rows, or completion metric changes from provider planning rows.
- Release candidate for the v1.0.0 LPSN-first type-strain acquisition and audit workflow, scoped as release hardening for the existing guarded CLI, stable I/O contracts, completion boundaries, provider-planning review boundary, documentation, and packaging readiness.
- Prepared v1.0.0rc1 release-candidate readiness documentation by tightening release checklist gates for v1.0 scope, provider/ATCC automation non-scope, stable contract review, and clean validation artifact cleanup.
- Note: v1.0.0rc1 does not include an ATCC or provider downloader, provider login, scraping, browser automation, credential handling, provider downloads, provider-backed manifest/name-map writes, provider-backed NCBI download-plan writes, FASTA installation from provider planning rows, or completion metric changes from provider planning rows.
- Added a v0.9.0 provider adapter spike with a dry-run-only
--plan-provider-registrationCLI,provider_request.tsvparsing,provider/provider_registration_plan.tsv, andprovider/proposed_external_genomes.tsvreview outputs. - Added provider planning documentation, schema/status/output contracts, a
synthetic
examples/provider_request_minimal.tsvfixture, report-only provider planning counts, overwrite protection, and manual handoff compatibility with the existingexternal_genomes.tsvworkflow. - Note: v0.9.0 provider planning does not automate ATCC/provider login, scraping, browser automation, credential handling, downloads, FASTA installation, manifest/name-map writes, NCBI download-plan writes, or completion metric changes.
- Clarified v0.8.0 planning, external workflow, and provider-automation boundaries so ATCC/provider automation remains out of scope.
- Added real
F. mortiferumexternal pilot template documentation for local curator-provided FASTA evidence packages without committing restricted data. - Tightened completion audit and report wording so external-inclusive 17/17 readiness is not confused with NCBI Assembly strict 17/17 completion.
- Note: v0.8.0 does not automate ATCC Genome Portal or any other provider portal, and does not include provider download automation.
- Added an explicit
--write-completion-auditworkflow for local species-checklist completion auditing. - Added
source_audit/completion_audit.tsvandsource_audit/completion_summary.tsvoutputs for mixed NCBI and external registered genome evidence. - Added report consumption of an existing completion summary, with optional completion audit detail when present.
- Added a minimal mixed-provenance acceptance fixture covering NCBI, external, missing, and conflicting completion states.
- Documented the implemented
--write-completion-auditworkflow and report consumption boundary for completion audit outputs. - Added a
Fusobacterium mortiferumexternal registered genome pilot procedure for curator-run external-inclusive 17/17 evaluation without claiming 17/17 NCBI Assembly strict completion. - Note: v0.7.0 does not automate ATCC Genome Portal or any other provider portal.
- Fixed example fixture line-ending attributes so the external genome FASTA
checksum remains reproducible on Windows clones with
core.autocrlf=true.
- Added manual external genome registration for curator-provided local FASTA
files.
--register-external-genomesvalidatesexternal_genomes.tsv, writes reviewable registration results and install plans, and in non-dry-run mode copies valid FASTA files intogenomes/references/. - Added external registered genome manifest and name-map integration. Installed
external records use
external_registered_genomeprovenance, keepassembly_accessionempty, preserve external IDs in notes, and can be appended to an existing manifest with--merge-manifest. - Added explicit separation from NCBI downloads: external records do not create
NCBI Datasets download work, and download plans can mark them as
external_genome_download_not_applicableinstead of missing an accession. - Added report-only and downstream planning compatibility for external registered genome records, including provenance counts, an external registered genomes report section, barrnap/ANI/16S phylogeny dry-run planning, and a minimal synthetic external genome smoke workflow.
- Note: v0.6.0 manual external registration does not automate ATCC Genome Portal or any other provider portal, and performs no external login, scraping, purchasing, or provider download.
- Added LPSN-first strict type-strain acquisition workflow for correct-valid species.
- Added official LPSN API/cache support and checklist-compatible acquisition inputs.
- Added validly published and correct-name filtering for acquisition candidates.
- Added culture collection audit and synonym-aware discovery support.
- Added NCBI Assembly and BioSample enrichment for candidate evidence.
- Added strict, balanced, and review-only selection policies.
- Added source audit policy for genome and same-genome 16S provenance.
- Added manual curator evidence workflow and review templates.
- Added Fusobacterium strict NCBI type-strain delivery case study support.
- Documented Fusobacterium strict NCBI Assembly completion as 16/17 correct-valid LPSN species, with F. mortiferum remaining outside the current high-confidence NCBI Assembly workflow.
- Note: External ATCC Genome Portal ingestion is not implemented in v0.5.0.
- Note: External registered type genomes are not represented as NCBI
assembly_accessionvalues in v0.5.0 and need separate provenance/status fields in a future design. - Note: v0.5.0 does not claim Fusobacterium 17/17 NCBI Assembly strict completion.
- Note: External type-genome ingestion should be designed separately for a later release.
- Added the documented minimal offline checklist-guided workflow covering LPSN Child taxa TSV conversion, local discovery-cache candidate generation, selection preparation, selection dry-run planning, guarded selection-driven downloads, and resume dry-run downstream planning.
- Added Phase 19-24 acquisition workflow polish: LPSN Child taxa to checklist conversion, taxonomy/source audit reporting, selection-driven planning and downloads, local-cache and guarded NCBI assembly candidate discovery, and an offline end-to-end smoke workflow.
- Added LPSN-first acquisition design documentation.
- Added LPSN/checklist-compatible schema fields.
- Added offline LPSN adapter/cache interfaces.
- Added assembly candidate tables and ranking helpers.
- Added culture collection ID parsing.
- Added sequence source audit data model.
- Added offline user selection TSV support via
--prepare-selection,--selection-tsv, and--strains-per-species. - Added candidate/selection examples and docs.
- Note: real LPSN API, NCBI candidate discovery, and selection-driven downloads are not yet implemented.
- Added Apache-2.0 license, NOTICE, citation metadata, contribution guidance, security policy, and GitHub Actions CI.
- Added user-provided species checklist audit via
--species-checklist. - Added
taxonomy/checklist_comparison.tsv. - Added Taxonomic Audit report section.
- Added species checklist docs and example TSV.
- Clarified GTDB/LPSN taxonomic scope.
- Initial MVP release.
- GTDB metadata parsing and type-material selection.
- Guarded NCBI Datasets downloads with ZIP extraction and genome registration.
- barrnap 16S extraction.
- Entrez fallback guarded by email/API key.
- FastANI query-vs-reference execution, parsing, summary, and PNG.
- Reference-only and query-inclusive all_16S assembly.
- MAFFT/trimAl/IQ-TREE phylogeny workflow.
- report/summary.md and report-only refresh.
- Real smoke validations: Aalborgiella and Actinocorallia.