refactor: rename plotting subpackage to visualization by edkerk · Pull Request #21 · SysBioChalmers/raven-toolbox

edkerk · 2026-06-08T22:12:36Z

What

Renames the (stub) plotting subpackage to visualization, matching RAVEN's folder layout where pathway/ + plotting/ were unified into visualization/ (SysBioChalmers/RAVEN#614). This is the deferred follow-up noted in #20.

src/raven_python/plotting/ → src/raven_python/visualization/
pip extra [plotting] → [visualization] (matplotlib)
CI (.github/workflows/ci.yml) and ReadTheDocs (.readthedocs.yaml) install lines updated so the renamed extra still resolves
docs updated: README.md, CHANGELOG.md, docs/installation.md, docs/README.md, docs/reference/api/index.md, docs/reference/todo.md

Notes

The subpackage is an unimplemented stub (empty __init__.py), so nothing imports it — no behaviour change, no broken imports.
Generic prose uses of the word "plotting" (seaborn / heatmap descriptions) are intentionally left unchanged.
After this, RAVEN and raven-python share the visualization/ name. The one remaining structural gap is reconstruction/metacyc/ (feature work — porting MetaCyc — not a rename).
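
The claim that nothing imports the stub can be checked mechanically. The following is a minimal sketch, assuming the old dotted path is raven_python.plotting (taken from the rename above). The helper name find_stale_imports is hypothetical and is not part of raven-python.

```python
import re
from pathlib import Path

# Matches "from raven_python.plotting import ...", "import raven_python.plotting"
# and "from raven_python import ..., plotting". Comments and prose are ignored
# because the line must start with an import statement.
OLD_IMPORT = re.compile(
    r"^\s*(?:from\s+raven_python\.plotting\b"
    r"|import\s+raven_python\.plotting\b"
    r"|from\s+raven_python\s+import\s+.*\bplotting\b)"
)


def find_stale_imports(root):
    """Return (path, line_number, line) for every import of the old subpackage."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if OLD_IMPORT.match(line):
                hits.append((str(path), number, line.strip()))
    return hits
```

Calling find_stale_imports("src") and getting an empty list back would support the "no broken imports" note; generic prose uses of the word "plotting" are not flagged.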

Align the (stub) plotting subpackage with RAVEN's folder layout, where pathway + plotting were unified into visualization/ (SysBioChalmers/RAVEN#614).

- src/raven_python/plotting/ -> src/raven_python/visualization/
- pyproject optional-dependency extra [plotting] -> [visualization] (matplotlib); CI (.github/workflows/ci.yml) and ReadTheDocs (.readthedocs.yaml) install lines updated to match so the renamed extra still resolves.
- docs updated: README, CHANGELOG, installation, docs/README, api/index, todo.

The subpackage is an unimplemented stub (empty __init__), so nothing imports it and there is no behaviour change. Generic uses of the word "plotting" (seaborn / heatmap prose) are left as-is.

* feat(data): shared download manifest for artefacts + binaries (#16)

* feat(data): shared download manifest for artefacts and binaries

Introduce a single, language-agnostic manifest (data/manifest.schema.json) that lists every downloadable data artefact and external-binary bundle with a SHA256, consumed by both raven-python and (via the same JSON) MATLAB RAVEN.

The manifest is a superset of the two runtime registries:

  * manifest["data"] -> raven_python.data._DATA_REGISTRY
  * manifest["binaries"] -> raven_python.binaries._REGISTRY

Added:

  * data/manifest.schema.json (JSON Schema) + data/manifest.example.json (worked example) + data/manifest.json (empty, the live source of truth until assets are published).
  * raven_python.manifest: load_manifest / to_*_registry / load_into_registries.
  * Lazy autoload: data.ensure_* and binaries.ensure_binary populate themselves from $RAVEN_PYTHON_MANIFEST on first use when their registry is still empty (guarded; no effect when a registry is passed explicitly or the env var is unset).
  * scripts/make_registry_snippet.py: a `manifest` subcommand that computes url+sha256+bytes and writes/updates manifest.json.
  * tests/test_manifest.py (round-trip, converters, lazy autoload via file:// URLs, repo manifests valid).
  * docs/maintenance/data_manifest.md: format, Python + MATLAB consumers, GitHub-Releases vs Zenodo hosting (incl. a release→Zenodo GitHub Action), and per-asset recommendations.

* docs(data): host assets on existing-repo releases; KEGG redistribution permitted

Reflect the chosen distribution model: GitHub release assets live outside the git tree, so a separate data repository is optional. Attach assets to dedicated tags (e.g. kegg-kegg116, diamond-2.1.9) on an existing RAVEN repo and reuse the same URLs across raven-python and MATLAB RAVEN. Use Zenodo only for DOIs or files >2 GB. KEGG artefacts are redistributed with permission, so the prior 'confirm rights' caveat is removed.
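
The `manifest` subcommand above is described as computing url+sha256+bytes for an asset. A minimal sketch of that computation follows, assuming entry keys named url, sha256 and bytes (inferred from the commit wording; the real schema lives in data/manifest.schema.json and is not shown here). The helpers make_entry and verify_entry and the example URL are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Stream the file so large artefacts do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_entry(path, url):
    """Build one manifest entry (hypothetical key names)."""
    path = Path(path)
    return {"url": url, "sha256": sha256_of(path), "bytes": path.stat().st_size}


def verify_entry(path, entry):
    """True when the file on disk matches the recorded size and checksum."""
    path = Path(path)
    return (
        path.stat().st_size == entry["bytes"]
        and sha256_of(path) == entry["sha256"]
    )
```

Checking the byte count before the hash is a cheap way to reject truncated downloads without reading the whole file twice.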
Example/schema URLs repointed from a hypothetical raven-data repo to raven-python.

* Add yeast-GEM-derived shared modules (diff, annotation, conditions, biomass, curation) (#17)

* Add diff_models, annotation, and conditions modules for yeast-GEM port

Lands the upstream-shareable pieces that yeast-GEM has been implementing locally during its Python port (see yeast-GEM/code/python/PORTING_PLAN.md and UPSTREAM_CANDIDATES.md). These are organism-agnostic; yeast-GEM will consume them via a Python dependency on raven-python.

New modules
-----------

raven_python.comparison.diff
    diff_models(a, b, ...) -> DiffReport: strict two-model semantic-equality diff. Complements the existing compare_models (N-model presence-matrix overview). Used as a CI gate to verify that two toolchains (e.g. MATLAB RAVEN vs raven_python, pre/post refactor of one toolchain) produce equivalent models. Includes a python -m raven_python.comparison.diff CLI.

raven_python.annotation.sbo
    add_sbo_terms: SBO term assignment with "fill" semantic. Default parameter set reproduces yeast-GEM's behaviour; biomass metabolite names, biomass/NGAM reaction names, and pseudoreaction substrings are overridable. Transport detection is pluggable (default: same-met-name in two compartments). Includes an `only_last_reaction_for_pseudo` legacy bug-compat flag for yeast-GEM's lock-step migration; off by default for any new caller.

raven_python.annotation.delta_g
    load_delta_g_csv / save_delta_g_csv: generic side-car CSV persistence for scalar notes (ΔG by default, but the note key, column names, and id/value mapping are all configurable).

raven_python.conditions.apply
    apply_condition(model, yaml_or_dict): generic "apply this YAML condition" loader. Schema: prelude (reset_exchanges), cofactor_pseudoreaction (remove_mets + charge_balance_met), biomass_stoichiometry_delta, per-rxn bounds, expected_uptake_count. Project-specific extensions (e.g. yeast-GEM's amino_acid_ratio) are handled by the caller before/after this function; kept upstream-narrow on purpose. Also exposes set_reaction_bounds helper that bypasses cobra's lb<=ub validator for the (legitimate) cases where a condition lands on an infeasible bound state.

Tests
-----

46 new tests across the three modules; full pre-existing raven-python suite still passes (519 passed; 1 unrelated pre-existing openpyxl ImportError in tests/test_io_git.py; 2 skipped). ruff clean.

Not in this commit
------------------

The biomass / GAM / chemostat / fit_gam modules are still tracked as upstream candidates in yeast-GEM/code/python/UPSTREAM_CANDIDATES.md and remain local in yeast-GEM until phase 4 of the port (which would ideally land them directly here).

* Add raven_python.biomass: sum / scale / rescale / set_gam

Generic biomass-equation manipulation, extracted from yeast-GEM's sumBioMass / scaleBioMass / rescalePseudoReaction / changeGAM as yeast-GEM moves to depend on raven-python (yeast-GEM phase 4 of the porting plan).

Module layout
-------------

raven_python.biomass.config
    BiomassConfig: biomass_rxn id + proton_met id + ordered tuple of BiomassComponent (per-component pseudoreaction name + mass-computation strategy).

raven_python.biomass.scale
    sum_biomass(model, config) → {component: g/gDW, ..., "total": X}
    rescale_pseudoreaction(model, config, name, factor): multiply the pseudoreaction's substrate coefs by factor and rebalance H+ to keep ionic charge at zero.
    scale_biomass(model, config, name, new_value, balance_out=None): rescale a component to a target fraction, optionally balancing a second component so the total stays at 1 g/gDW.

raven_python.biomass.gam
    set_gam(model, value, *, biomass_rxn, cofactor_met_names, ngam_rxn=None, ngam_value=None): scales every metabolite in the biomass pseudoreaction whose `name` is in the supplied cofactor set, preserving its sign; optionally fixes the NGAM rxn bounds.
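
The rule described for set_gam (change every cofactor metabolite in the biomass pseudoreaction while preserving its sign) can be illustrated without cobra. This is a minimal sketch over a plain dict of coefficients, assuming that "scales" means setting each matching coefficient's magnitude to the GAM value. The function name set_gam_coefficients and the dict representation are hypothetical and are not raven-python's implementation.

```python
import math


def set_gam_coefficients(coefficients, value, cofactor_met_names):
    """Return a copy of {met_name: coefficient} with cofactor magnitudes set to `value`.

    Assumption: the sign of each existing coefficient (consumed < 0,
    produced > 0) is kept and only its magnitude changes. Metabolites that
    are not in the cofactor set are left untouched.
    """
    if value < 0:
        raise ValueError("GAM value must be non-negative")
    updated = dict(coefficients)
    for name in cofactor_met_names:
        if name in updated and updated[name] != 0:
            # copysign keeps the direction (substrate vs product) of the cofactor.
            updated[name] = math.copysign(value, updated[name])
    return updated
```

Returning a new dict rather than mutating the input keeps the sketch easy to test; a model-level function would instead write the new coefficients back to the reaction.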
Mass strategies (per BiomassComponent.mass_strategy):

    "mw"              plain MW from chemical formula (carbohydrate / ion / cofactor)
    "mw_minus_2h"     MW − 2.016 g/mol per substrate (protein: charged tRNAs release two protons)
    "mw_minus_water"  MW − 18.015 g/mol per substrate (RNA / DNA: polymerisation releases one water)
    "grams"           stoichiometry already in g/gDW (lipid backbone)

Tests: 19 new tests over a synthetic toy model that exercises every mass strategy, the H+ charge rebalance, scale_biomass with and without balance_out, set_gam on cofactor mets (and the NGAM bound path).

* Add raven_python.manipulation.find_duplicate_reactions (detection variant)

Detection-only counterpart to remove_duplicate_reactions. Returns duplicate groups instead of mutating the model. Ignores bounds / GPR / objective; only stoichiometry is compared, mirroring the typical curation use case ("find reactions that could be merged").

A new ``ignore_direction=True`` default (yeast-GEM convention) treats A→B and B→A as duplicates. Set False to require identical orientation.

Used by yeast-GEM's modelTests port (Tier 3 / phase 5) to flag duplicate reactions during curation review.

* Add raven_python.curation: batch_curate / batch_curate_from_tsv

Generic batch curation engine extracted from yeast-GEM's MATLAB curateMetsRxnsGenes (yeast-GEM phase 6). Adds or updates metabolites, reactions and genes from pandas DataFrames; a batch_curate_from_tsv convenience wrapper reads the equivalent TSVs.
Schema (matches yeast-GEM's data/modelCuration/template/ layout):

    mets_df         metNames, comps, formula, charge, inchi, metNotes + any number of MIRIAM-namespace columns
    genes_df        genes, geneShortNames + MIRIAM columns
    rxns_df         rxnNames, grRules, lb, ub, rev, subSystems, eccodes, rxnNotes, rxnReferences, rxnConfidenceScores + MIRIAM columns
    rxns_coeffs_df  rxnNames, metNames, comps, coefficient (one row per (reaction, metabolite) pair)

Match keys:

    metabolites  (name, compartment) tuple
    genes        gene id
    reactions    stoichiometric signature

Existing entities get their annotations overwritten (warning emitted); new entities are added with fresh ids generated from the supplied ``met_id_prefix`` / ``rxn_id_prefix`` (defaults M_ / R_ per the BiGG convention; yeast-GEM passes s_ / r_). Width of the existing zero-padded suffix is preserved so s_0001 → s_0002, not s_2.

"Everything after the core columns is MIRIAM": the header of any extra column becomes the annotation namespace key. Matches MATLAB behaviour exactly so yeast-GEM's existing TSVs work unchanged, and projects with different MIRIAM column sets need no code change.

CurationResult dataclass records what was added vs updated so callers can verify in tests / CI.

Tests: 13 new (add/update mets, add/update genes, add/update rxns by stoichiometry, miriam auto-detect, id-width preservation, combined mets+rxns in one call, missing-metabolite error, batch_curate_from_tsv round trip).

* io.yaml: byte-compatible round-trip with cobrapy + RAVEN MATLAB

Three things this fixes:

1. write_yaml_model dropped the !!omap tags entirely. _to_plain was flattening cobra's OrderedDict to plain dict, which causes ruamel to emit ordinary block mappings. RAVEN MATLAB's reader is a line-based parser keyed on !!omap and therefore could not load any file we wrote. _to_plain now returns OrderedDict so ruamel re-emits the !!omap tag.
2. eccodes was lost on round-trip: it wasn't in _RXN_FIELDS, so read_yaml_model didn't capture it into .notes and write_yaml_model couldn't lift it back. Added.
3. RAVEN MATLAB writes reaction notes as 'rxnNotes'; cobrapy and this writer use 'notes'. Added a read-time alias so existing yeast-GEM YAML files (which still say 'rxnNotes') load cleanly. Writes go out as 'notes' (cobrapy-canonical).

Top-level layout now matches RAVEN MATLAB: metaData first, then metabolites / reactions / genes / compartments, then optional gecko_light + ec-rxns + ec-enzymes. id/name/version live inside metaData (RAVEN convention); cobrapy reading these files still works, but cobra_model.id ends up None because cobrapy doesn't know about metaData. raven_python.read_yaml_model lifts both metaData.id/name/version onto model.id / model.name / model.notes['version'] so the rest of the codebase doesn't care which layout the file used.

Empty-name genes are no longer emitted as - that's a cobrapy quirk that drifts yeast-GEM YAML files away from RAVEN MATLAB's output.

Verified end-to-end:

  * cobra.io.load_yaml_model reads every file the new writer produces (yeast-GEM and a synthetic fixture).
  * RAVEN MATLAB readYAMLmodel reads every file the new writer produces.
  * Round-tripping yeast-GEM through raven_python preserves 2748/2748 metabolites, 4102/4102 reactions, 1143/1143 genes, 2411 eccodes, 3984 reaction deltaG, 2696 metabolite deltaG, 1788 SMILES, 1443 rxn-notes; no semantic drift.

Tests
-----

  * tests/test_io_yaml_parity.py is new: covers every RAVEN extension, the rxnNotes legacy alias, the SMILES YAML-special character case, metaData-first layout, and cobra readability.
  * tests/test_io_yaml.py::test_output_is_cobra_readable adjusts for the metaData layout (cobra recovers mets/rxns/annotation but not model.id, by design).

* conditions: switch from PyYAML to ruamel.yaml

PyYAML is not a project dependency; raven-python uses ruamel.yaml (already pulled in via cobra) everywhere else.
The conditions module and its tests still imported PyYAML, which broke pytest collection on clean CI runners with 'No module named yaml'. Both apply.py and the test now use a YAML(typ='safe') instance from ruamel.yaml — same plain-dict semantics as PyYAML's safe_load / safe_dump, no new dependency. * io.yaml: document the format + accept legacy geckoLight-in-metaData Adds docs/reference/yaml_format.md as the canonical schema reference for the cross-toolchain YAML format (cobrapy / raven-python / RAVEN MATLAB). Covers the full document shape, per-entry field order, RAVEN extensions, the GECKO ec-* sections, the metaData provenance block, number / string / quoting rules, and the cross-reader interoperability matrix. Linked from docs/reference/index.md and the I/O guide. Reader fix: pre-shim RAVEN MATLAB writes emitted GECKO models with geckoLight: "true" inside the metaData block (not as a top-level gecko_light). The reader now lifts that legacy key out of metaData so model.ec.gecko_light is populated whichever placement the file used. Round-trip writes always use the new top-level form. Regression tests: test_pre_shim_format_loads — synthetic fixture covering every legacy quirk we know about (--- doc marker, plain metaData, geckoLight inside metaData, top-level metabolite smiles, rxnNotes reaction key, integer bounds, double-quoted strings). Each quirk has its own assertion + comment. test_pre_shim_yeast_gem_loads_if_available — sanity-loads the real yeast-GEM.yml (2748 mets, 4102 rxns, 1143 genes) and asserts the documented preserved-counts table from the format reference. Skipped on CI runners where the working copy isn't mounted. * Cobra-aligned hardening pass from full code review (#18) * Add diff_models, annotation, and conditions modules for yeast-GEM port Lands the upstream-shareable pieces that yeast-GEM has been implementing locally during its Python port (see yeast-GEM/code/python/PORTING_PLAN.md and UPSTREAM_CANDIDATES.md). 
These are organism-agnostic; yeast-GEM will consume them via a Python dependency on raven-python. New modules ----------- raven_python.comparison.diff diff_models(a, b, ...) -> DiffReport — strict two-model semantic- equality diff. Complements the existing compare_models (N-model presence-matrix overview). Used as a CI gate to verify that two toolchains (e.g. MATLAB RAVEN vs raven_python, pre/post refactor of one toolchain) produce equivalent models. Includes a python -m raven_python.comparison.diff CLI. raven_python.annotation.sbo add_sbo_terms — SBO term assignment with "fill" semantic. Default parameter set reproduces yeast-GEM's behaviour; biomass metabolite names, biomass/NGAM reaction names, and pseudoreaction substrings are overridable. Transport detection is pluggable (default: same- met-name in two compartments). Includes an `only_last_reaction_ for_pseudo` legacy bug-compat flag for yeast-GEM's lock-step migration; off by default for any new caller. raven_python.annotation.delta_g load_delta_g_csv / save_delta_g_csv — generic side-car CSV persistence for scalar notes (ΔG by default, but the note key, column names, and id/value mapping are all configurable). raven_python.conditions.apply apply_condition(model, yaml_or_dict) — generic "apply this YAML condition" loader. Schema: prelude (reset_exchanges), cofactor_pseudoreaction (remove_mets + charge_balance_met), biomass_stoichiometry_delta, per-rxn bounds, expected_uptake_count. Project-specific extensions (e.g. yeast-GEM's amino_acid_ratio) are handled by the caller before/after this function — kept upstream-narrow on purpose. Also exposes set_reaction_bounds helper that bypasses cobra's lb<=ub validator for the (legitimate) cases where a condition lands on an infeasible bound state. Tests ----- 46 new tests across the three modules; full pre-existing raven-python suite still passes (519 passed; 1 unrelated pre-existing openpyxl ImportError in tests/test_io_git.py; 2 skipped). ruff clean. 
Not in this commit ------------------ The biomass / GAM / chemostat / fit_gam modules are still tracked as upstream candidates in yeast-GEM/code/python/UPSTREAM_CANDIDATES.md and remain local in yeast-GEM until phase 4 of the port (which would ideally land them directly here). * Add raven_python.biomass — sum / scale / rescale / set_gam Generic biomass-equation manipulation, extracted from yeast-GEM's sumBioMass / scaleBioMass / rescalePseudoReaction / changeGAM as yeast-GEM moves to depend on raven-python (yeast-GEM phase 4 of the porting plan). Module layout ------------- raven_python.biomass.config BiomassConfig — biomass_rxn id + proton_met id + ordered tuple of BiomassComponent (per-component pseudoreaction name + mass- computation strategy). raven_python.biomass.scale sum_biomass(model, config) → {component: g/gDW, ..., "total": X} rescale_pseudoreaction(model, config, name, factor) — multiply the pseudoreaction's substrate coefs by factor and rebalance H+ to keep ionic charge at zero. scale_biomass(model, config, name, new_value, balance_out=None) — rescale a component to a target fraction, optionally balancing a second component so the total stays at 1 g/gDW. raven_python.biomass.gam set_gam(model, value, *, biomass_rxn, cofactor_met_names, ngam_rxn=None, ngam_value=None) — scales every metabolite in the biomass pseudoreaction whose `name` is in the supplied cofactor set, preserving its sign; optionally fixes the NGAM rxn bounds. 
Mass strategies (per BiomassComponent.mass_strategy): "mw" plain MW from chemical formula (carbohydrate / ion / cofactor) "mw_minus_2h" MW − 2.016 g/mol per substrate (protein — charged tRNAs release two protons) "mw_minus_water" MW − 18.015 g/mol per substrate (RNA / DNA — polymerisation releases one water) "grams" stoichiometry already in g/gDW (lipid backbone) Tests: 19 new tests over a synthetic toy model that exercises every mass strategy, the H+ charge rebalance, scale_biomass with and without balance_out, set_gam on cofactor mets (and the NGAM bound path). * Add raven_python.manipulation.find_duplicate_reactions (detection variant) Detection-only counterpart to remove_duplicate_reactions. Returns duplicate groups instead of mutating the model. Ignores bounds / GPR / objective — only stoichiometry is compared, mirroring the typical curation use case ("find reactions that could be merged"). A new ``ignore_direction=True`` default (yeast-GEM convention) treats A→B and B→A as duplicates. Set False to require identical orientation. Used by yeast-GEM's modelTests port (Tier 3 / phase 5) to flag duplicate reactions during curation review. * Add raven_python.curation — batch_curate / batch_curate_from_tsv Generic batch curation engine extracted from yeast-GEM's MATLAB curateMetsRxnsGenes (yeast-GEM phase 6). Adds or updates metabolites, reactions and genes from pandas DataFrames; a batch_curate_from_tsv convenience wrapper reads the equivalent TSVs. 
Schema (matches yeast-GEM's data/modelCuration/template/ layout): mets_df metNames, comps, formula, charge, inchi, metNotes + any number of MIRIAM-namespace columns genes_df genes, geneShortNames + MIRIAM columns rxns_df rxnNames, grRules, lb, ub, rev, subSystems, eccodes, rxnNotes, rxnReferences, rxnConfidenceScores + MIRIAM columns rxns_coeffs_df rxnNames, metNames, comps, coefficient (one row per (reaction, metabolite) pair) Match keys: metabolites — (name, compartment) tuple genes — gene id reactions — stoichiometric signature Existing entities get their annotations overwritten (warning emitted); new entities are added with fresh ids generated from the supplied ``met_id_prefix`` / ``rxn_id_prefix`` (defaults M_ / R_ per the BiGG convention; yeast-GEM passes s_ / r_). Width of the existing zero-padded suffix is preserved so s_0001 → s_0002, not s_2. "Everything after the core columns is MIRIAM" — the header of any extra column becomes the annotation namespace key. Matches MATLAB behaviour exactly so yeast-GEM's existing TSVs work unchanged, and projects with different MIRIAM column sets need no code change. CurationResult dataclass records what was added vs updated so callers can verify in tests / CI. Tests: 13 new (add/update mets, add/update genes, add/update rxns by stoichiometry, miriam auto-detect, id-width preservation, combined mets+rxns in one call, missing-metabolite error, batch_curate_from_tsv round trip). * io.yaml: byte-compatible round-trip with cobrapy + RAVEN MATLAB Three things this fixes: 1. write_yaml_model dropped the !!omap tags entirely. _to_plain was flattening cobra's OrderedDict to plain dict, which causes ruamel to emit ordinary block mappings. RAVEN MATLAB's reader is a line-based parser keyed on !!omap and therefore could not load any file we wrote. _to_plain now returns OrderedDict so ruamel re-emits the !!omap tag. 2. 
eccodes was lost on round-trip — it wasn't in _RXN_FIELDS, so read_yaml_model didn't capture it into .notes and write_yaml_model couldn't lift it back. Added. 3. RAVEN MATLAB writes reaction notes as 'rxnNotes'; cobrapy and this writer use 'notes'. Added a read-time alias so existing yeast-GEM YAML files (which still say 'rxnNotes') load cleanly. Writes go out as 'notes' (cobrapy-canonical). Top-level layout now matches RAVEN MATLAB: metaData first, then metabolites / reactions / genes / compartments, then optional gecko_light + ec-rxns + ec-enzymes. id/name/version live inside metaData (RAVEN convention) — cobrapy reading these files still works, but cobra_model.id ends up None because cobrapy doesn't know about metaData. raven_python.read_yaml_model lifts both metaData.id/name/version onto model.id / model.name / model.notes['version'] so the rest of the codebase doesn't care which layout the file used. Empty-name genes are no longer emitted as — that's a cobrapy quirk that drifts yeast-GEM YAML files away from RAVEN MATLAB's output. Verified end-to-end: * cobra.io.load_yaml_model reads every file the new writer produces (yeast-GEM and a synthetic fixture). * RAVEN MATLAB readYAMLmodel reads every file the new writer produces. * Round-tripping yeast-GEM through raven_python preserves 2748/2748 metabolites, 4102/4102 reactions, 1143/1143 genes, 2411 eccodes, 3984 reaction deltaG, 2696 metabolite deltaG, 1788 SMILES, 1443 rxn-notes — no semantic drift. Tests ----- * tests/test_io_yaml_parity.py is new: covers every RAVEN extension, the rxnNotes legacy alias, the SMILES YAML-special character case, metaData-first layout, and cobra readability. * tests/test_io_yaml.py::test_output_is_cobra_readable adjusts for the metaData layout (cobra recovers mets/rxns/annotation but not model.id, by design). * conditions: switch from PyYAML to ruamel.yaml PyYAML is not a project dependency; raven-python uses ruamel.yaml (already pulled in via cobra) everywhere else. 
The conditions module and its tests still imported PyYAML, which broke pytest collection on clean CI runners with 'No module named yaml'. Both apply.py and the test now use a YAML(typ='safe') instance from ruamel.yaml — same plain-dict semantics as PyYAML's safe_load / safe_dump, no new dependency. * io.yaml: document the format + accept legacy geckoLight-in-metaData Adds docs/reference/yaml_format.md as the canonical schema reference for the cross-toolchain YAML format (cobrapy / raven-python / RAVEN MATLAB). Covers the full document shape, per-entry field order, RAVEN extensions, the GECKO ec-* sections, the metaData provenance block, number / string / quoting rules, and the cross-reader interoperability matrix. Linked from docs/reference/index.md and the I/O guide. Reader fix: pre-shim RAVEN MATLAB writes emitted GECKO models with geckoLight: "true" inside the metaData block (not as a top-level gecko_light). The reader now lifts that legacy key out of metaData so model.ec.gecko_light is populated whichever placement the file used. Round-trip writes always use the new top-level form. Regression tests: test_pre_shim_format_loads — synthetic fixture covering every legacy quirk we know about (--- doc marker, plain metaData, geckoLight inside metaData, top-level metabolite smiles, rxnNotes reaction key, integer bounds, double-quoted strings). Each quirk has its own assertion + comment. test_pre_shim_yeast_gem_loads_if_available — sanity-loads the real yeast-GEM.yml (2748 mets, 4102 rxns, 1143 genes) and asserts the documented preserved-counts table from the format reference. Skipped on CI runners where the working copy isn't mounted. * Cobra-aligned hardening pass from full code review No behaviour change on well-formed inputs. Highlights: - Packaging: derive __version__ from package metadata (was a stale hard-coded "0.0.1" that the docs site reported); pin ruff==0.15.15 in the dev extra and CI; fix two lint errors unpinned ruff started flagging. 
- Errors: solver/feasibility failures in run_init, run_ftinit, fill_tasks and random_sampling now raise cobra.exceptions.OptimizationError instead of bare RuntimeError (consistent with the rest of the package). - Consistency: single utils.parse.subsystem_to_str coerces reaction subsystem to cobra's canonical str across io.excel / comparison.compare / curation.batch / manipulation.add (fixes a crash on non-string items and the silent drop of multi-subsystem reactions); shared GPR score aggregators in utils.gpr used by init.score and init.genes; KEGG-download progress uses a module logger instead of print. - Robustness: zip path-traversal guard in binaries.py; penalty>0 check in connect_blocked_reactions; NaN-sample guard in random_sampling; all-zero ec coupling warning; optional verify= SHA256 re-check on data cache hits; non-finite z-score guard in reporter. Regression tests added for each. * io.yaml: reaction EC codes as cobra annotation ec-code (#19) * Add diff_models, annotation, and conditions modules for yeast-GEM port Lands the upstream-shareable pieces that yeast-GEM has been implementing locally during its Python port (see yeast-GEM/code/python/PORTING_PLAN.md and UPSTREAM_CANDIDATES.md). These are organism-agnostic; yeast-GEM will consume them via a Python dependency on raven-python. New modules ----------- raven_python.comparison.diff diff_models(a, b, ...) -> DiffReport — strict two-model semantic- equality diff. Complements the existing compare_models (N-model presence-matrix overview). Used as a CI gate to verify that two toolchains (e.g. MATLAB RAVEN vs raven_python, pre/post refactor of one toolchain) produce equivalent models. Includes a python -m raven_python.comparison.diff CLI. raven_python.annotation.sbo add_sbo_terms — SBO term assignment with "fill" semantic. Default parameter set reproduces yeast-GEM's behaviour; biomass metabolite names, biomass/NGAM reaction names, and pseudoreaction substrings are overridable. 
Transport detection is pluggable (default: same- met-name in two compartments). Includes an `only_last_reaction_ for_pseudo` legacy bug-compat flag for yeast-GEM's lock-step migration; off by default for any new caller. raven_python.annotation.delta_g load_delta_g_csv / save_delta_g_csv — generic side-car CSV persistence for scalar notes (ΔG by default, but the note key, column names, and id/value mapping are all configurable). raven_python.conditions.apply apply_condition(model, yaml_or_dict) — generic "apply this YAML condition" loader. Schema: prelude (reset_exchanges), cofactor_pseudoreaction (remove_mets + charge_balance_met), biomass_stoichiometry_delta, per-rxn bounds, expected_uptake_count. Project-specific extensions (e.g. yeast-GEM's amino_acid_ratio) are handled by the caller before/after this function — kept upstream-narrow on purpose. Also exposes set_reaction_bounds helper that bypasses cobra's lb<=ub validator for the (legitimate) cases where a condition lands on an infeasible bound state. Tests ----- 46 new tests across the three modules; full pre-existing raven-python suite still passes (519 passed; 1 unrelated pre-existing openpyxl ImportError in tests/test_io_git.py; 2 skipped). ruff clean. Not in this commit ------------------ The biomass / GAM / chemostat / fit_gam modules are still tracked as upstream candidates in yeast-GEM/code/python/UPSTREAM_CANDIDATES.md and remain local in yeast-GEM until phase 4 of the port (which would ideally land them directly here). * Add raven_python.biomass — sum / scale / rescale / set_gam Generic biomass-equation manipulation, extracted from yeast-GEM's sumBioMass / scaleBioMass / rescalePseudoReaction / changeGAM as yeast-GEM moves to depend on raven-python (yeast-GEM phase 4 of the porting plan). Module layout ------------- raven_python.biomass.config BiomassConfig — biomass_rxn id + proton_met id + ordered tuple of BiomassComponent (per-component pseudoreaction name + mass- computation strategy). 
raven_python.biomass.scale sum_biomass(model, config) → {component: g/gDW, ..., "total": X} rescale_pseudoreaction(model, config, name, factor) — multiply the pseudoreaction's substrate coefs by factor and rebalance H+ to keep ionic charge at zero. scale_biomass(model, config, name, new_value, balance_out=None) — rescale a component to a target fraction, optionally balancing a second component so the total stays at 1 g/gDW. raven_python.biomass.gam set_gam(model, value, *, biomass_rxn, cofactor_met_names, ngam_rxn=None, ngam_value=None) — scales every metabolite in the biomass pseudoreaction whose `name` is in the supplied cofactor set, preserving its sign; optionally fixes the NGAM rxn bounds. Mass strategies (per BiomassComponent.mass_strategy): "mw" plain MW from chemical formula (carbohydrate / ion / cofactor) "mw_minus_2h" MW − 2.016 g/mol per substrate (protein — charged tRNAs release two protons) "mw_minus_water" MW − 18.015 g/mol per substrate (RNA / DNA — polymerisation releases one water) "grams" stoichiometry already in g/gDW (lipid backbone) Tests: 19 new tests over a synthetic toy model that exercises every mass strategy, the H+ charge rebalance, scale_biomass with and without balance_out, set_gam on cofactor mets (and the NGAM bound path). * Add raven_python.manipulation.find_duplicate_reactions (detection variant) Detection-only counterpart to remove_duplicate_reactions. Returns duplicate groups instead of mutating the model. Ignores bounds / GPR / objective — only stoichiometry is compared, mirroring the typical curation use case ("find reactions that could be merged"). A new ``ignore_direction=True`` default (yeast-GEM convention) treats A→B and B→A as duplicates. Set False to require identical orientation. Used by yeast-GEM's modelTests port (Tier 3 / phase 5) to flag duplicate reactions during curation review. 
* Add raven_python.curation — batch_curate / batch_curate_from_tsv Generic batch curation engine extracted from yeast-GEM's MATLAB curateMetsRxnsGenes (yeast-GEM phase 6). Adds or updates metabolites, reactions and genes from pandas DataFrames; a batch_curate_from_tsv convenience wrapper reads the equivalent TSVs. Schema (matches yeast-GEM's data/modelCuration/template/ layout): mets_df metNames, comps, formula, charge, inchi, metNotes + any number of MIRIAM-namespace columns genes_df genes, geneShortNames + MIRIAM columns rxns_df rxnNames, grRules, lb, ub, rev, subSystems, eccodes, rxnNotes, rxnReferences, rxnConfidenceScores + MIRIAM columns rxns_coeffs_df rxnNames, metNames, comps, coefficient (one row per (reaction, metabolite) pair) Match keys: metabolites — (name, compartment) tuple genes — gene id reactions — stoichiometric signature Existing entities get their annotations overwritten (warning emitted); new entities are added with fresh ids generated from the supplied ``met_id_prefix`` / ``rxn_id_prefix`` (defaults M_ / R_ per the BiGG convention; yeast-GEM passes s_ / r_). Width of the existing zero-padded suffix is preserved so s_0001 → s_0002, not s_2. "Everything after the core columns is MIRIAM" — the header of any extra column becomes the annotation namespace key. Matches MATLAB behaviour exactly so yeast-GEM's existing TSVs work unchanged, and projects with different MIRIAM column sets need no code change. CurationResult dataclass records what was added vs updated so callers can verify in tests / CI. Tests: 13 new (add/update mets, add/update genes, add/update rxns by stoichiometry, miriam auto-detect, id-width preservation, combined mets+rxns in one call, missing-metabolite error, batch_curate_from_tsv round trip). * io.yaml: byte-compatible round-trip with cobrapy + RAVEN MATLAB Three things this fixes: 1. write_yaml_model dropped the !!omap tags entirely. 
   _to_plain was flattening cobra's OrderedDict to plain dict, which causes ruamel to emit ordinary block mappings. RAVEN MATLAB's reader is a line-based parser keyed on !!omap and therefore could not load any file we wrote. _to_plain now returns OrderedDict so ruamel re-emits the !!omap tag.

2. eccodes was lost on round-trip: it wasn't in _RXN_FIELDS, so read_yaml_model didn't capture it into .notes and write_yaml_model couldn't lift it back. Added.

3. RAVEN MATLAB writes reaction notes as 'rxnNotes'; cobrapy and this writer use 'notes'. Added a read-time alias so existing yeast-GEM YAML files (which still say 'rxnNotes') load cleanly. Writes go out as 'notes' (cobrapy-canonical).

Top-level layout now matches RAVEN MATLAB: metaData first, then metabolites / reactions / genes / compartments, then optional gecko_light + ec-rxns + ec-enzymes. id/name/version live inside metaData (RAVEN convention); cobrapy reading these files still works, but cobra_model.id ends up None because cobrapy doesn't know about metaData. raven_python.read_yaml_model lifts metaData.id/name/version onto model.id / model.name / model.notes['version'] so the rest of the codebase doesn't care which layout the file used.

Empty-name genes are no longer emitted as [...]; that's a cobrapy quirk that drifts yeast-GEM YAML files away from RAVEN MATLAB's output.

Verified end-to-end:

  * cobra.io.load_yaml_model reads every file the new writer produces (yeast-GEM and a synthetic fixture).
  * RAVEN MATLAB readYAMLmodel reads every file the new writer produces.
  * Round-tripping yeast-GEM through raven_python preserves 2748/2748 metabolites, 4102/4102 reactions, 1143/1143 genes, 2411 eccodes, 3984 reaction deltaG, 2696 metabolite deltaG, 1788 SMILES, 1443 rxn-notes; no semantic drift.

Tests
-----

  * tests/test_io_yaml_parity.py is new: covers every RAVEN extension, the rxnNotes legacy alias, the SMILES YAML-special character case, metaData-first layout, and cobra readability.
  * tests/test_io_yaml.py::test_output_is_cobra_readable adjusts for the metaData layout (cobra recovers mets/rxns/annotation but not model.id, by design).

* conditions: switch from PyYAML to ruamel.yaml

PyYAML is not a project dependency; raven-python uses ruamel.yaml (already pulled in via cobra) everywhere else. The conditions module and its tests still imported PyYAML, which broke pytest collection on clean CI runners with 'No module named yaml'.

Both apply.py and the test now use a YAML(typ='safe') instance from ruamel.yaml (same plain-dict semantics as PyYAML's safe_load / safe_dump, no new dependency).

* io.yaml: document the format + accept legacy geckoLight-in-metaData

Adds docs/reference/yaml_format.md as the canonical schema reference for the cross-toolchain YAML format (cobrapy / raven-python / RAVEN MATLAB). Covers the full document shape, per-entry field order, RAVEN extensions, the GECKO ec-* sections, the metaData provenance block, number / string / quoting rules, and the cross-reader interoperability matrix. Linked from docs/reference/index.md and the I/O guide.

Reader fix: pre-shim RAVEN MATLAB writes emitted GECKO models with geckoLight: "true" inside the metaData block (not as a top-level gecko_light). The reader now lifts that legacy key out of metaData so model.ec.gecko_light is populated whichever placement the file used. Round-trip writes always use the new top-level form.

Regression tests:

  test_pre_shim_format_loads: synthetic fixture covering every legacy quirk we know about (--- doc marker, plain metaData, geckoLight inside metaData, top-level metabolite smiles, rxnNotes reaction key, integer bounds, double-quoted strings). Each quirk has its own assertion + comment.

  test_pre_shim_yeast_gem_loads_if_available: sanity-loads the real yeast-GEM.yml (2748 mets, 4102 rxns, 1143 genes) and asserts the documented preserved-counts table from the format reference. Skipped on CI runners where the working copy isn't mounted.
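
The legacy geckoLight handling described in the reader fix can be sketched on an already-parsed document (a plain dict). This is an illustration under stated assumptions, not the reader's actual code: the helper names `lift_gecko_light` and `_as_bool`, the boolean type of the top-level value, and the rule that an explicit top-level `gecko_light` wins over the legacy key are all hypothetical.

```python
import copy


def _as_bool(value):
    # Legacy files store the flag as the string "true"; accept real booleans too.
    if isinstance(value, str):
        return value.strip().lower() == "true"
    return bool(value)


def lift_gecko_light(doc):
    """Return a copy of ``doc`` with metaData.geckoLight moved to top-level gecko_light."""
    out = copy.deepcopy(doc)  # never mutate the caller's parsed document
    meta = out.get("metaData")
    if isinstance(meta, dict) and "geckoLight" in meta:
        legacy = meta.pop("geckoLight")  # the legacy key is always dropped
        # Assumption: an explicit top-level key takes precedence over the legacy one.
        if "gecko_light" not in out:
            out["gecko_light"] = _as_bool(legacy)
    return out


# Hypothetical pre-shim document shape: the flag lives inside metaData.
legacy_doc = {"metaData": {"id": "ecModel", "geckoLight": "true"}, "reactions": []}
lifted = lift_gecko_light(legacy_doc)
```

After the call, `lifted` carries the flag at top level and its `metaData` block no longer contains `geckoLight`, while `legacy_doc` itself is left untouched.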
* io.yaml: represent reaction EC codes as cobra annotation['ec-code']

EC numbers are a standard MIRIAM cross-reference, so the cobra-native representation is annotation['ec-code'] (a list) -- exactly where cobrapy and geckopy read them. raven-python was instead routing RAVEN's legacy top-level `eccodes` key into model.notes['eccodes'], so reaction EC codes written by RAVEN-MATLAB never reached the annotation['ec-code'] location geckopy reads from.

- Drop `eccodes` from _RXN_FIELDS (it is not a RAVEN-only notes field).
- Add _lift_eccodes_to_annotation: a legacy top-level `eccodes` (a ;-joined string or a list) is lifted into annotation['ec-code'] on read, mirroring the existing _lift_smiles_to_annotation; a native annotation['ec-code'] wins.
- On write, EC codes serialise via cobra's annotation block; no top-level `eccodes` is emitted.
- Update test_io_yaml_parity expectations to the cobra-aligned location (verified against the real yeast-GEM.yml: 2411 reactions).

* docs: update RAVEN cross-references for the post-reorg folder layout (#20)

RAVEN moved its functions out of the core/ catch-all into purpose-based top-level folders (SysBioChalmers/RAVEN#614). Repoint every RAVEN file path in the cross-reference docs (IMPROVEMENTS.md, docs/reference/matlab_raven_backports.md):

- FSEOF / randomSampling / reporterMetabolites -> analysis/
- parseTaskList / checkTasks -> tasks/
- fillGaps -> gapfilling/
- addRxns / changeRxns / standardizeGrRules -> manipulation/
- getIndexes / checkModelStruct / getElementalBalance -> queries/
- getModelFromHomology -> reconstruction/homology/
- getKEGGModelForOrganism -> reconstruction/kegg/
- runINIT / ftINIT -> INIT/

Also corrects references that were stale even before the reorg (getKEGGModelForOrganism was in external/kegg/) and points the proposed GPR-lint back-port findPotentialErrors at manipulation/, alongside standardizeGrRules.
Doc-only: raven-python's module layout already matches RAVEN's new structure (it was the template the reorg mirrored), so no code changes are needed.

* refactor: rename plotting subpackage to visualization (#21)

Align the (stub) plotting subpackage with RAVEN's folder layout, where pathway + plotting were unified into visualization/ (SysBioChalmers/RAVEN#614).

- src/raven_python/plotting/ -> src/raven_python/visualization/
- pyproject optional-dependency extra [plotting] -> [visualization] (matplotlib); CI (.github/workflows/ci.yml) and ReadTheDocs (.readthedocs.yaml) install lines updated to match so the renamed extra still resolves.
- docs updated: README, CHANGELOG, installation, docs/README, api/index, todo.

The subpackage is an unimplemented stub (empty __init__), so nothing imports it and there is no behaviour change. Generic uses of the word "plotting" (seaborn / heatmap prose) are left as-is.

* Ship type information and enforce it; make gpr_to_dnf public (#22)

Three related "make the package's contracts real" changes:

- Add the PEP 561 py.typed marker so the package's extensive type hints are visible to downstream type checkers (geckopy included). The hatchling wheel ships raven_python/py.typed.
- Add mypy to the dev extra, a lenient [tool.mypy] config (ignore_missing_imports for the un-stubbed cobra/optlang/scipy/ruamel), and a mypy CI job. Fix the 36 type errors this surfaced -- all type-only (Path vs str annotations, None-guards that match existing behaviour, optlang Variable typing, isinstance/cast narrowing). No runtime behaviour changes; the full test suite stays green.
- Promote manipulation.expand._gpr_to_dnf to a public gpr_to_dnf (re-exported from raven_python.manipulation). geckopy's call sites switch to it in lockstep (separate PR), so no deprecated alias is kept.
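
For readers unfamiliar with the term, the following sketch shows what expanding a gene-protein-reaction (GPR) rule into disjunctive normal form means: an OR over gene sets that must each be present together. The signature and return type of the real raven_python.manipulation.gpr_to_dnf are not shown in this log, so `gpr_to_dnf_sketch`, its list-of-frozensets output, and the use of Python's `ast` parser (which restricts gene ids to valid Python identifiers) are all assumptions made for illustration.

```python
import ast
from itertools import product


def _expand(node):
    """Return the DNF clauses (list of frozensets of gene ids) for one AST node."""
    if isinstance(node, ast.Name):
        return [frozenset([node.id])]
    if isinstance(node, ast.BoolOp):
        parts = [_expand(value) for value in node.values]
        if isinstance(node.op, ast.Or):
            # OR: any clause of any operand is an alternative.
            return [clause for part in parts for clause in part]
        # AND: pick one clause from each operand and merge them.
        return [frozenset().union(*combo) for combo in product(*parts)]
    raise ValueError(f"unsupported GPR element: {ast.dump(node)}")


def gpr_to_dnf_sketch(rule):
    """Expand a rule like '(g1 or g2) and g3' into a list of gene sets."""
    if not rule.strip():
        return []
    clauses = _expand(ast.parse(rule, mode="eval").body)
    # Drop repeated clauses while keeping first-seen order.
    unique = []
    for clause in clauses:
        if clause not in unique:
            unique.append(clause)
    return unique


isozymes_with_subunit = gpr_to_dnf_sketch("(g1 or g2) and g3")
nested = gpr_to_dnf_sketch("g1 and (g2 or (g3 and g4))")
```

Here `(g1 or g2) and g3` becomes two alternatives, {g1, g3} and {g2, g3}, which is the shape enzyme-constrained tooling typically needs when it assigns one enzyme complex per alternative.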
* Harden curation, EC-data and archive-handling modules (#23)

Tier-2 audit of the post-review modules surfaced four targeted fixes:

- curation/batch.py: new reactions coerce a list-valued subSystems via subsystem_to_str (";"-joined) instead of str(list), matching the update path.
- io/ec_data.py: _eccodes_to_yaml strips stray separators in the single-EC case so a trailing ";" never leaks into the written YAML.
- binaries.py: _safe_extract_zip rejects symlink members, defence-in-depth alongside the existing path-traversal guard.
- binaries.py / data.py: archive and dataset downloads pass a socket timeout to urlopen so a stalled server cannot hang the process.

Adds regression tests for each fix.

* Surgical performance pass on hot paths (#24)

Targeted, behaviour-preserving optimisations from the review:

- manipulation/add.py + change.py: resolve equation tokens through a shared (name, compartment) -> metabolite index (_build_met_index) instead of re-scanning model.metabolites per token. Bulk reaction add/change by name drops from O(R*k*M) to O(R*k); the index is updated as new mets are created so cross-token and cross-reaction dedup is preserved.
- reconstruction/homology/homology.py: replace DataFrame.apply(axis=1) in the ortholog filter with a comprehension over the columns (membership is already O(1); avoids per-row Series construction).
- analysis/sampling.py: build the random objective with optlang add() instead of sum(), which re-canonicalises the expression on every term (O(n^2)).

Adds a cross-reaction metabolite-dedup regression test for the add path.

* Robustness and polish fixes (#25)

Tier 4 of the review: small, targeted hardening, no behaviour change on valid input.

- gapfilling/fill.py: clamp the connectivity gap-fill big-M to the largest finite bound magnitude, so a template reaction with an infinite bound no longer puts an infinite coefficient into the MILP (which broke the solver).
- reconstruction/kegg/download.py: a malformed or unreadable .netrc now raises a ValueError explaining how to fix it, instead of a raw NetrcParseError/OSError.
- io/excel.py: always write the metabolite formula to the METS COMPOSITION column; it was dropped whenever an InChI was present.
- visualization: the empty stub package raises a clear NotImplementedError (with a roadmap pointer) on attribute access, via a PEP 562 module __getattr__.

A regression test per fix.

* Add code-built-model YAML round-trip test (covers the objective) (#26)

The existing YAML round-trip and parity tests originate their model from a parsed doc; none builds a model directly from cobra objects, and none asserts the objective coefficient survives (the parity fixture pins it to 0). Add one round-trip test that builds a model in code with a non-zero objective and asserts metabolites, reactions, bounds, stoichiometry, GPR, subsystem, formula, annotation and the objective all survive write -> read.

* Share the linear-chain INIT model fixture via tests/conftest.py (#27)

test_init.py, test_init_build.py and test_init_solvers.py each built the same linear-chain INIT model (EX_A -> A -> B -> C -> D) independently, differing only in the model id and whether gene rules were attached. Move that construction into a new tests/conftest.py as linear_chain_model / linear_chain_model_with_genes fixtures; the three files now reuse it (test bodies unchanged). The bespoke _toy_ftinit_model stays local. No behaviour change.

* Publish kegg116 KEGG artefacts (v0.1.0) (#28)

* Publish kegg116 KEGG artefacts as gzip, version-prefixed assets (v0.1.0)

First downloadable KEGG artefact set, wired into the runtime resolvers:

- All artefacts are gzip and version-prefixed (kegg116_<name>.gz) so MATLAB and Windows read them with the built-in gunzip, no external tool. organism_gene_ko moves from xz to gzip for the same reason.
- HMM libraries ship as one gzip concatenated flatfile per domain; ensure_kegg_hmm_library decompresses and hmmpresses on first use, ~10x smaller than the pressed index and portable across HMMER versions.
- Add a version-prefix-tolerant artefact resolver (_resolve_artefact) used by the organism/sequence entry points; parse_kegg_dump and build_kegg_artefacts.py gain an opt-in --version.
- Populate data/manifest.json and _DATA_REGISTRY with the kegg116 release assets (real SHA256 + bytes); refresh the maintainer docs and manifest example.
- Bump version to 0.1.0 and update CHANGELOG.

* Add KEGG taxonomy artefact and phyl_dist (RAVEN getPhylDist port)

Publish kegg116_taxonomy.gz and regenerate RAVEN's keggPhylDist from it, so GECKO's organism-distance kcat selection needs no MATLAB .mat file:

- reconstruction.kegg.phyl_dist + PhylDist faithfully reproduce RAVEN getPhylDist's (asymmetric, occasionally negative) distance metric; parse_taxonomy_records exposes ids/names/lineages and reads .gz transparently.
- data.ensure_kegg_taxonomy fetches the artefact; build_kegg_artefacts.py emits it.
- Register kegg116_taxonomy.gz in data/manifest.json and _DATA_REGISTRY (8 files).
- Tests for phyl_dist (hand-checked against RAVEN) and the taxonomy fetch; update migration/IMPROVEMENTS/maintainer docs and CHANGELOG.

* Publish kegg116 KEGG artefacts as gzip, version-prefixed assets (v0.1.0) (#29)

First downloadable KEGG artefact set, wired into the runtime resolvers:

- All artefacts are gzip and version-prefixed (kegg116_<name>.gz) so MATLAB and Windows read them with the built-in gunzip, no external tool. organism_gene_ko moves from xz to gzip for the same reason.
- HMM libraries ship as one gzip concatenated flatfile per domain; ensure_kegg_hmm_library decompresses and hmmpresses on first use, ~10x smaller than the pressed index and portable across HMMER versions.
- Add a version-prefix-tolerant artefact resolver (_resolve_artefact) used by the organism/sequence entry points; parse_kegg_dump and build_kegg_artefacts.py gain an opt-in --version.
- Populate data/manifest.json and _DATA_REGISTRY with the kegg116 release assets (real SHA256 + bytes); refresh the maintainer docs and manifest example.
- Bump version to 0.1.0 and update CHANGELOG.

Add KEGG taxonomy artefact and phyl_dist (RAVEN getPhylDist port)

Publish kegg116_taxonomy.gz and regenerate RAVEN's keggPhylDist from it, so GECKO's organism-distance kcat selection needs no MATLAB .mat file:

- reconstruction.kegg.phyl_dist + PhylDist faithfully reproduce RAVEN getPhylDist's (asymmetric, occasionally negative) distance metric; parse_taxonomy_records exposes ids/names/lineages and reads .gz transparently.
- data.ensure_kegg_taxonomy fetches the artefact; build_kegg_artefacts.py emits it.
- Register kegg116_taxonomy.gz in data/manifest.json and _DATA_REGISTRY (8 files).
- Tests for phyl_dist (hand-checked against RAVEN) and the taxonomy fetch; update migration/IMPROVEMENTS/maintainer docs and CHANGELOG.

Bundle core KEGG artefacts into kegg116_core.tar.gz

Combine the five core model files (reference model + KO/reaction/organism-gene/rxn-flag tables) into one kegg116_core.tar.gz; HMM libraries and taxonomy stay separate. The release drops from 8 assets to 4.

- ensure_kegg_data now fetches the single bundle, SHA-verifies it, and extracts the version-prefixed members into the cache once (safe extraction, matching download.py).
- build_kegg_artefacts.py groups the core files into the bundle after the HMM step.
- Regenerate data/manifest.json and _DATA_REGISTRY (4 entries); update manifest.example, tests (bundle fixture), and docs.

* Remove the visualization stub and [visualization] extra (#30)

Mirror MATLAB RAVEN removing its pathway-map / omics-overlay plotting functions (drawMap, colorPathway, drawPathway, markPathwayWith*, setOmicDataToRxns, ...)
as obsolete/low-value (SysBioChalmers/RAVEN #618). raven-python only had a not-implemented `visualization` stub reserving that domain; drop it and its scaffolding. cobrapy + Escher cover pathway/omics visualization externally.

- Delete src/raven_python/visualization/ and tests/test_visualization.py.
- Drop the [visualization] (matplotlib) extra; remove it from CI, ReadTheDocs, and the installation / README / api-index / todo docs.
- CHANGELOG: record the removal.

The other functions RAVEN removed (MetaCyc, xml_toolbox, Excel-import wrappers, solveQP) were never ported to raven-python, so no further changes are needed.

* Auto-resolve the taxonomy artefact in domain-mode from_artefacts (#31)

get_kegg_model_for_organism_from_artefacts("prokaryotes"/"eukaryotes") builds a whole-domain model, which needs the KEGG taxonomy file. Taxonomy is a separate artefact (not part of the core set ensure_kegg_data fetches), so the call raised "Domain mode needs the KEGG taxonomy file; pass taxonomy=." unless the caller supplied a path by hand.

It now auto-resolves taxonomy for domain mode: from the artefact directory if present, else via ensure_kegg_taxonomy(version). An explicit taxonomy= still wins; species mode is unchanged. Adds a regression test.

* Use hmmsearch (not hmmscan) for the de-novo KEGG query (#32)

get_kegg_model_from_sequences now runs one hmmsearch over the concatenated KO library instead of an hmmscan against a pressed database:

- run_hmmsearch / parse_hmmsearch_tblout replace run_hmmscan / parse_hmmscan_tblout. hmmsearch is HMMER's faster, better-parallelising direction (profiles as the query) and needs no hmmpress. -Z is fixed to the profile count so per-hit E-values (and thus assign_kos output) are identical to the previous hmmscan path; verified on real HMMs (same hits, same E-values, same assignments).
- ensure_kegg_hmm_library just gunzips the library (no hmmpress, no .h3* sidecars).
- build_hmm_library concatenates the per-KO HMMs without pressing; the published .hmm.gz artefact is unchanged.
- Docs / IMPROVEMENTS (K7) / CHANGELOG updated.

* Replace the on-disk KEGG test fixture with a synthetic in-code dump (#33)

tests/data/kegg_dump contained real KEGG records (e.g. reaction R00010 and KO K01194 with their EC/RHEA/ChEBI cross-references) which the project is not licensed to redistribute. Remove the directory and instead generate an equivalent, fully fictional KEGG-format dump at test time via a new session-scoped fixture in tests/conftest.py.

The synthetic dump mimics the flat-file format so it still exercises the parser (reaction flags, overview-map skipping, InChI/formula handling, mapformula irreversibility, KO/gene grouping, taxonomy lineages) but all identifiers, names, sequences and cross-references are invented. The four dependent test modules (parse, query, hmm, organism) consume the fixture and assert against the fictional ids. No real KEGG content is committed and coverage is unchanged.

* Rename project and import package: raven-python -> raven-toolbox (#34)

* Rename project and import package: raven-python -> raven-toolbox

Rename the distribution (raven-python -> raven-toolbox) and the import package (raven_python -> raven_toolbox) across all source, tests, scripts, docs, and packaging metadata. Project URLs now point to SysBioChalmers/raven-toolbox.

* Complete the rename: remaining raven-python/raven_python -> raven-toolbox/raven_toolbox

The package/distribution rename left occurrences behind after the rebase:

- import statements () in the reconstruction.kegg modules and data.py, which would have failed at import time;
- monkeypatch string targets and the cache-path assertions in the tests;
- the wheel/package and mypy paths in pyproject.toml (still pointing at the now-removed src/raven_python), plus the distribution name and project URLs;
- docs, data manifests and GitHub URLs.
Replace them so the import package is consistently raven_toolbox and all distribution/repo references point to raven-toolbox. Also drop the empty src/raven_python directory left behind by the rebase.

* Wrap homology imports to satisfy ruff isort after the rename

raven_python -> raven_toolbox widened the homology hits import past the 100-char line length, so ruff isort (I001) wanted it split across lines. Format it as a multiline import block; ruff check . is clean again.

* CI: bump actions to Node 24 versions (checkout v5, setup-python v6) (#35)

actions/checkout@v4 and actions/setup-python@v5 run on the deprecated Node.js 20 runtime. Bump to actions/checkout@v5 and actions/setup-python@v6, both of which run on Node.js 24, to clear the GitHub Actions deprecation warning.

* Prepare 0.2.0 release

Bump version 0.1.0 -> 0.2.0 and complete the CHANGELOG 0.2.0 section (raven-toolbox rename, hmmsearch de-novo KEGG query, domain-mode taxonomy auto-resolve, synthetic KEGG test fixture, visualization stub removal, Node 24 CI).

edkerk merged commit 693bc08 into develop Jun 9, 2026
5 checks passed

edkerk deleted the refactor/rename-plotting-to-visualization branch June 9, 2026 05:33

edkerk merged 1 commit into develop from refactor/rename-plotting-to-visualization
