Modular dependency management: optional extras for expanded modules #59

@enriquea

Context

As hvantk expands with new modules (HGC, ancestry, annotation pipelines, etc.), some dependencies bring heavy transitive requirements that can fail to install on hosts without system-level dev packages. For example, gnomad pulls in ga4gh.vrs[extras] → hgvs → psycopg2, which requires gcc, libpq-devel, and python-devel to compile from source. This is a pain for collaborators on different OS environments.

Current state

  • gnomad has been moved to an optional extra under hgc (the only actual import is already guarded with try/except ImportError in hvantk/hgc/converters.py)
  • Base poetry install now works without system-level C/PostgreSQL dev libraries

Going forward

As the tool grows, modules should follow this pattern:

  1. Core dependencies (hail, click, pandas, etc.) stay required — they are needed by all users
  2. Module-specific heavy dependencies should be declared as optional extras in pyproject.toml
  3. Code imports of optional packages must be guarded with try/except ImportError and clear error messages guiding the user to install the relevant extra
  4. Install instructions in README should document available extras, e.g.:
    poetry install                    # core functionality
    poetry install --extras hgc       # HGC pipeline (includes gnomad, plotting)
    poetry install --extras viz       # visualization support
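
A minimal sketch of the guard described in point 3, assuming a small shared helper. The name `require_optional` is hypothetical and not part of hvantk today; the existing guard in hvantk/hgc/converters.py may look different.

```python
import importlib


def require_optional(module_name: str, extra: str):
    """Import an optional module, or raise an ImportError naming the extra to install.

    `require_optional` is a hypothetical helper used here for illustration only.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise with a message that tells the user which extra is missing.
        raise ImportError(
            f"'{module_name}' is required for this feature but is not installed. "
            f"Install it with: poetry install --extras {extra}"
        ) from exc
```

A module in the hgc package could then call `gnomad = require_optional("gnomad", "hgc")` inside the function that needs it, so that importing hvantk itself never fails on a base install.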

TODO

  • Review current extras (viz, interactive, hgc, psroc) and ensure they cover all optional modules
  • Audit all imports for optional packages and add guards where missing
  • Update README install section with extras documentation
  • Consider whether scikit-learn should also be optional (only needed by specific modules)
  • Test clean install on a minimal environment (no system dev packages) to verify base install works
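
For the import audit, one possible starting point is a small script based on the standard-library `ast` module. This is a rough sketch, not an agreed approach: the function name `find_unguarded_imports` and the set of optional package names passed to it are assumptions, and it treats any import inside a `try` body as guarded without checking which exception the handler catches.

```python
import ast


def find_unguarded_imports(source: str, optional_packages: set[str]) -> list[tuple[int, str]]:
    """Return (line number, package) for imports of optional packages outside a try body."""
    tree = ast.parse(source)

    # Collect every node that sits inside the body of a try statement.
    # Simplification: we do not verify that the handler catches ImportError.
    guarded: set[int] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Try):
            for stmt in node.body:
                for child in ast.walk(stmt):
                    guarded.add(id(child))

    hits = []
    for node in ast.walk(tree):
        if id(node) in guarded:
            continue
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Absolute "from pkg import x"; relative imports are skipped.
            names = [node.module]
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level in optional_packages:
                hits.append((node.lineno, top_level))
    return sorted(hits)
```

Running this over each file's source with something like `{"gnomad", "sklearn"}` as the optional set would list the imports that still need a guard. Note that the import name (`sklearn`) can differ from the distribution name (`scikit-learn`).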

Labels: enhancement (New feature or request)
