This document classifies TypeTreeFlow surfaces that became the stable contract baseline during the v2.0.0 readiness work. The stable identity is an LPSN-first type-strain acquisition and audit workflow plus a guarded provider automation framework skeleton. It is not an ATCC/provider downloader contract.
Stable in v2.0 means downstream users and tests may rely on the documented behavior, paths, schemas, status meanings, and safety boundaries.
Stable review-only means the surface is a supported planning or audit contract, but it must not be treated as acquisition automation or completion evidence unless a stable downstream workflow explicitly consumes it.
Internal means code structure or helper behavior that should not be promised as a public contract.
Out of scope means explicitly outside this stable contract baseline.
Stable in v2.0:
- `--dry-run` takes precedence over real execution flags.
- Real execution requires explicit stage flags such as `--enable-downloads`, `--enable-barrnap`, `--enable-entrez`, `--enable-fastani`, `--enable-phylo`, `--enable-ncbi-discovery`, `--enable-biosample-entrez`, and `--enable-lpsn-api`.
- Remote NCBI/Entrez modes require `--email`; `--api-key` is optional pass-through metadata for Entrez-backed operations.
- `--resume` reuses existing manifest state; `--force` permits rebuilding or overwriting protected outputs; the two modes are mutually exclusive.
- `--report-only` reads existing files and refreshes `report/summary.md` without running workflow stages or generating completion audits.
- LPSN/checklist, candidate discovery, selection, source audit, NCBI download, manual external registration, completion audit, and report-only commands are documented workflow surfaces.
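The flag rules above can be illustrated with a minimal `argparse` sketch. Only the flag names come from this contract; the program name, the `resolve` helper, the returned dictionary, and the choice to require `--email` whenever `--enable-entrez` is given are assumptions made for illustration, not TypeTreeFlow's actual parser.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: flag names are from the contract, everything else is assumed.
    parser = argparse.ArgumentParser(prog="typetreeflow-sketch")
    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("--enable-downloads", action="store_true")
    parser.add_argument("--enable-entrez", action="store_true")
    parser.add_argument("--email")
    parser.add_argument("--api-key")
    # --resume and --force are mutually exclusive modes.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--resume", action="store_true")
    mode.add_argument("--force", action="store_true")
    return parser


def resolve(argv):
    """Return which guarded stages would really run for the given arguments."""
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumption: the Entrez stage needs --email even when combined with --dry-run.
    if args.enable_entrez and not args.email:
        parser.error("--email is required for remote NCBI/Entrez modes")
    # --dry-run takes precedence over every real execution flag.
    return {
        "downloads": args.enable_downloads and not args.dry_run,
        "entrez": args.enable_entrez and not args.dry_run,
    }
```

In this sketch, `resolve(["--dry-run", "--enable-downloads"])` reports that no stage runs for real, and `resolve(["--resume", "--force"])` exits with an argparse usage error.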
Stable review-only:
- `--plan-provider-registration` reads curator-authored provider requests and writes review outputs only. It is dry-run-only even when `--dry-run` is not supplied.
- `--enable-synonym-discovery` expands candidate discovery for review and does not make synonym conclusions.
Internal:
- Parser implementation details, dispatch functions, helper module names, and fake-runner wiring.
Out of scope:
- ATCC/provider downloader flags.
- Provider login, scraping, browser automation, credential handling, terms acceptance, purchase flow, or provider artifact download flags.
Stable in v2.0:
- `species_checklist.tsv` and `excluded_lpsn_taxa.tsv` as checklist and excluded-row review surfaces.
- `taxonomy/checklist_comparison.tsv` for checklist-vs-manifest audit.
- `candidates/assembly_candidates.tsv`, `candidates/assembly_candidate_diagnostics.tsv`, and `candidates/discovery_records.tsv` for NCBI Assembly candidate evidence.
- `cache/ncbi/biosample_records.tsv` for optional BioSample enrichment input or cache output.
- `selection/strain_candidates.tsv` and `selection/user_selection.tsv` for offline strain review and selected NCBI assembly accessions.
- `source_audit/culture_collection_audit.tsv`, `source_audit/sequence_source_audit.tsv`, `source_audit/completion_audit.tsv`, and `source_audit/completion_summary.tsv`.
- `external_genomes.tsv`, `external_genome_registration_results.tsv`, `external_genome_install_plan.tsv`, and `external_genome_install_results.tsv` for manual local external FASTA registration.
- `manifest.tsv` and `name_map.tsv` as the durable recorded run state.
- `cache/ncbi/download_plan.tsv` and `cache/ncbi/download_results.tsv` for NCBI Datasets planning and execution results.
- `rrna`, `ani`, and `phylo` plan/result TSVs documented in `docs/schemas.md`.
Stable review-only:
- `provider_request.tsv`, `provider/provider_registration_plan.tsv`, and `provider/proposed_external_genomes.tsv`. These are not manifest records, not NCBI download work, and not completion evidence.
Internal:
- Intermediate data classes and in-memory row objects.
- Any local output not documented in `docs/output_layout.md` or `docs/schemas.md`.
Out of scope:
- Provider credential tables.
- Provider artifact cache schemas.
- Direct provider-specific manifest fields.
Stable in v2.0:
- Manifest, download, genome extraction, rRNA, ANI, phylogeny, taxonomy, selection, source audit, external registration, external install, completion audit, and report statuses documented in `docs/statuses.md`.
- `external_genome_registered` as the successful external registration status and as the external registered manifest status after successful or skipped-existing install results are converted by the CLI.
- Completion statuses `complete_ncbi`, `complete_external_registered`, `missing_genome`, and `conflict`.
Stable review-only:
- Provider planning statuses: `provider_plan_ready_for_review`, `provider_plan_manual_review_required`, `provider_plan_missing_required_field`, `provider_plan_terms_review_required`, `provider_plan_credentials_not_supported`, and `provider_plan_download_not_supported`.
- These statuses describe planning only. They do not represent provider access, download, FASTA validation, installed genomes, manifest writes, or completion.
Internal:
- Exception text, debug logs, temporary messages, and test-only statuses not documented as emitted workflow values.
Out of scope:
- Statuses that would imply provider login, provider download, credential use, or automated provider artifact installation.
Stable in v2.0:
- Canonical run layout under `--outdir` as documented in `docs/output_layout.md`.
- `manifest.tsv` and `name_map.tsv` as synchronized durable state.
- `cache/ncbi/` as NCBI-specific cache and download-plan space only.
- `genomes/references/<normalized_id>.fna` as installed local reference genome paths.
- `source_audit/` as the stable audit table directory.
- `selection/`, `candidates/`, `taxonomy/`, `rrna/`, `ani/`, `phylo/`, and `report/` as documented stage directories.
- External registered genome rows may be installed under `genomes/references/`, but they keep empty `assembly_accession` values and do not create NCBI download work.
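As a small illustration of the layout above, the installed reference genome path can be derived from the run directory and a normalized ID. The helper name `reference_genome_path` is hypothetical, and the sketch assumes `normalized_id` is already a safe file-name component; only the `genomes/references/<normalized_id>.fna` pattern comes from this contract.

```python
from pathlib import PurePosixPath


def reference_genome_path(outdir: str, normalized_id: str) -> PurePosixPath:
    """Build <outdir>/genomes/references/<normalized_id>.fna (hypothetical helper)."""
    # Assumption: a normalized ID never contains a path separator.
    if not normalized_id or "/" in normalized_id:
        raise ValueError(f"unexpected normalized_id: {normalized_id!r}")
    return PurePosixPath(outdir) / "genomes" / "references" / f"{normalized_id}.fna"
```

The same path shape applies to NCBI-backed and external registered rows; what differs for external rows is the empty `assembly_accession`, not the install location.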
Stable review-only:
- `provider/provider_registration_plan.tsv`
- `provider/proposed_external_genomes.tsv`

These provider files may be summarized in reports when already present, but they do not trigger planning, downloads, installs, completion audit writes, or manifest changes during report-only mode.
Internal:
- Tool-specific temporary directories, noncanonical local run products, and ad hoc scratch files.
Out of scope:
- Provider artifact cache directories.
- Credential stores or browser-profile state inside run outputs.
Stable in v2.0:
- `report/summary.md` is generated from recorded manifest state and existing output files.
- Reports may summarize status distribution, genome and 16S readiness, output file existence, problem records, external registered genomes, completion summaries, provider planning counts, and ANI summaries.
- Reports do not run downloads, Entrez, barrnap, FastANI, MAFFT, trimAl, IQ-TREE, provider planning, external registration, or completion audit generation.
- ANI `95%` threshold reporting is advisory only.
- Reports do not make taxonomic species conclusions.
Stable review-only:
- Provider planning report sections are summaries of existing review files. They do not imply provider acquisition, installed FASTA files, manifest changes, or completion evidence.
Internal:
- Markdown formatting details not asserted by tests or documented as user contract.
Out of scope:
- Report sections claiming provider downloads or automated provider acquisition.
- Automated species assignment conclusions.
Stable in v2.0:
- Dry-run-first workflow behavior.
- Explicit opt-in for guarded real actions.
- Local-cache modes that avoid network calls.
- No secrets in repository files, run TSVs, manifests, reports, logs, or test fixtures.
- External registered genome IDs must not be used as NCBI `assembly_accession` values.
- `cache/ncbi/download_plan.tsv` is for NCBI Datasets work only.
- External registered genome completion remains separate from NCBI Assembly strict completion.
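The accession boundary above could be enforced with a guard like the following sketch. It assumes manifest rows are dictionaries with a `status` column (the column name is an assumption) next to the documented `assembly_accession` field, and it relies on the public NCBI Assembly accession shape: `GCA_` or `GCF_`, nine digits, then a version. The function name `download_plan_rows` is hypothetical.

```python
import re

# NCBI Assembly accessions look like GCF_000005845.2 or GCA_000005845.2.
NCBI_ASSEMBLY_RE = re.compile(r"GC[AF]_\d{9}\.\d+")


def download_plan_rows(manifest_rows):
    """Return only rows that may enter NCBI Datasets download planning."""
    plan = []
    for row in manifest_rows:
        accession = (row.get("assembly_accession") or "").strip()
        # Assumed column name: "status" holds the manifest status.
        if row.get("status") == "external_genome_registered":
            if accession:
                raise ValueError(
                    "external registered row must keep an empty assembly_accession"
                )
            continue  # external rows never create NCBI download work
        if not accession:
            continue
        if not NCBI_ASSEMBLY_RE.fullmatch(accession):
            raise ValueError(f"not an NCBI Assembly accession: {accession!r}")
        plan.append(row)
    return plan
```

Failing loudly on a non-NCBI identifier, rather than skipping it, is a design choice of this sketch: it keeps provider-native or external IDs from silently reaching a download plan.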
Stable review-only:
- Provider planning can record that credentials or downloads are unsupported, but it must not request, store, or use credentials.
Internal:
- How the code checks executables, builds commands, or injects subprocess runners.
Out of scope:
- Automated provider access, login, scraping, browser automation, terms click-through, purchase flows, or artifact downloads.
NCBI Assembly strict completion is stable and NCBI-only. It counts accepted strict type-strain rows backed by valid NCBI Assembly accessions. External registered genomes must not change this metric.
External-inclusive completion is stable as a separately labeled local readiness metric. It may count accepted external registered genomes only after local FASTA validation, manifest registration, and completion audit.
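The two metrics described above can be kept apart with a sketch like this one. It assumes the external-inclusive metric is the strict count plus rows with status `complete_external_registered`; the function name and the result keys are hypothetical, and only the four completion status values come from this contract.

```python
from collections import Counter


def completion_metrics(completion_statuses):
    """Count strict and external-inclusive completion as separate metrics."""
    counts = Counter(completion_statuses)
    # Strict completion is NCBI-only; external rows never change it.
    strict = counts["complete_ncbi"]
    # Assumption: external-inclusive = strict + accepted external registered rows.
    external_inclusive = strict + counts["complete_external_registered"]
    return {
        "ncbi_strict": strict,
        "external_inclusive": external_inclusive,
        "total": sum(counts.values()),
    }
```

Because provider planning statuses are not completion statuses, any such value passed in here would only raise `total` and would not count toward either metric.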
Provider planning rows are review-only. They do not count toward completion,
do not write or merge manifests, do not write `name_map.tsv`, do not write
`external_genomes.tsv`, do not install FASTA files, and do not create
`cache/ncbi/download_plan.tsv`.
Provider-native IDs remain external identifiers. They must not be written to
`assembly_accession`, used as NCBI accessions, or passed into NCBI Datasets
download planning.