Commit 81c417b

dpark01 and claude committed
Fix CI failures: ARM64 builds, pytest, docs, and parallel tests
- Split x86-only classify packages (bmtagger, kallisto, kb-python) into classify-x86.txt for ARM64 compatibility
- Add pytest, pytest-cov, pytest-xdist to core.txt for test jobs
- Update install-conda-deps.sh with --x86-only: inline syntax for single-pass dependency resolution (prevents version regressions)
- Fix docs/conf.py mock modules (add Bio.SeqIO.FastaIO, fix zstandard)
- Enable parallel pytest with -n auto in all CI test jobs
- Add conda+pip installation instructions to README.md
- Update AGENTS.md with x86-only package patterns and doc build notes

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent 0261125 commit 81c417b

12 files changed: +169 -58 lines changed

.github/workflows/docker.yml

Lines changed: 4 additions & 4 deletions
@@ -484,7 +484,7 @@ jobs:
  -v --tb=short \
  --cov=viral_ngs \
  --cov-report=xml:/workspace/coverage-core.xml \
- -x
+ -n auto -x

  - name: Upload coverage to Codecov
  uses: codecov/codecov-action@v4
@@ -561,7 +561,7 @@ jobs:
  -v --tb=short \
  --cov=viral_ngs \
  --cov-report=xml:/workspace/coverage-assemble.xml \
- -x
+ -n auto -x

  - name: Upload coverage to Codecov
  uses: codecov/codecov-action@v4
@@ -638,7 +638,7 @@ jobs:
  -v --tb=short \
  --cov=viral_ngs \
  --cov-report=xml:/workspace/coverage-classify.xml \
- -x
+ -n auto -x

  - name: Upload coverage to Codecov
  uses: codecov/codecov-action@v4
@@ -715,7 +715,7 @@ jobs:
  -v --tb=short \
  --cov=viral_ngs \
  --cov-report=xml:/workspace/coverage-phylo.xml \
- -x
+ -n auto -x

  - name: Upload coverage to Codecov
  uses: codecov/codecov-action@v4

AGENTS.md

Lines changed: 26 additions & 5 deletions
@@ -130,9 +130,10 @@ viral-ngs/
  │ └── requirements/
  │ ├── baseimage.txt
  │ ├── core.txt
- │ ├── core-x86.txt # x86-only packages
+ │ ├── core-x86.txt # x86-only core packages
  │ ├── assemble.txt
  │ ├── classify.txt
+ │ ├── classify-x86.txt # x86-only classify packages
  │ ├── phylo.txt
  │ └── phylo-x86.txt # x86-only phylo packages

@@ -255,18 +256,30 @@ The `pyproject.toml` has empty dependencies - conda handles everything.
  - `docker/requirements/classify.txt` - classification-specific
  - `docker/requirements/phylo.txt` - phylo-specific

- 3. For x86-only packages (no ARM64 build), add to `core-x86.txt`
+ 3. For x86-only packages (no ARM64 build), add to the appropriate `-x86.txt` file:
+ - `core-x86.txt` - novoalign, mvicuna
+ - `classify-x86.txt` - bmtagger, kallisto, kb-python
+ - `phylo-x86.txt` - table2asn

  ### Dependency Resolution

- When building derivative images, ALL dependencies must be installed in a single resolver call:
+ When building derivative images, ALL dependencies (including x86-only) must be installed in a **single resolver call** using the `--x86-only:` prefix:

  ```bash
- /tmp/install-conda-deps.sh /tmp/requirements/core.txt /tmp/requirements/assemble.txt
+ # Single resolver call - x86-only files skipped on ARM64
+ /tmp/install-conda-deps.sh \
+     /tmp/requirements/baseimage.txt \
+     /tmp/requirements/core.txt \
+     /tmp/requirements/classify.txt \
+     --x86-only:/tmp/requirements/classify-x86.txt
  ```

  This prevents version regressions. **Never install incrementally.**

+ The `install-conda-deps.sh` script:
+ - On x86: Includes all files in one micromamba call
+ - On ARM64: Skips files tagged with `--x86-only:` but includes others
+
  ---

  ## Docker Images
@@ -343,9 +356,17 @@ Each test job uploads coverage to Codecov with flavor-specific flags.
  ### Multi-Architecture Support

  - Images built for `linux/amd64` and `linux/arm64`
- - x86-only packages (novoalign, mvicuna, table2asn) handled gracefully on ARM
+ - x86-only packages (novoalign, mvicuna, bmtagger, kallisto, kb-python, table2asn) handled via `--x86-only:` prefix in `install-conda-deps.sh`
+ - Python tool wrappers still importable on ARM64; only runtime execution fails for missing binaries
  - Cache stored on Quay.io registry (GHA 10GB limit too small)

+ ### Documentation Build
+
+ The `docs.yml` workflow builds Sphinx documentation. Key points:
+ - Uses `mock` to stub heavy dependencies (`Bio`, `pysam`, `scipy`, etc.) in `docs/conf.py`
+ - When adding new imports to source code, add corresponding mocks to `MOCK_MODULES` in `docs/conf.py`
+ - Runs `sphinx-build -W` (warnings as errors)
+
  ### Feature Branch Images

  Feature branch images get `quay.expires-after=10w` label for automatic cleanup.

README.md

Lines changed: 52 additions & 1 deletion
@@ -60,6 +60,57 @@ docker pull ghcr.io/broadinstitute/viral-ngs:latest

  The recommended way to use viral-ngs is via Docker images, which include all bioinformatics tool dependencies pre-configured.

+ ### Conda + pip (Development)
+
+ For local development with bioinformatics tools:
+
+ 1. **Clone the repository:**
+ ```bash
+ git clone https://github.com/broadinstitute/viral-ngs.git
+ cd viral-ngs
+ ```
+
+ 2. **Create and activate a conda environment:**
+ ```bash
+ conda create -n viral-ngs python=3.12
+ conda activate viral-ngs
+ ```
+
+ Or using micromamba (faster):
+ ```bash
+ micromamba create -n viral-ngs python=3.12
+ micromamba activate viral-ngs
+ ```
+
+ 3. **Install bioinformatics tools via conda:**
+ ```bash
+ # Core tools only
+ conda install -c conda-forge -c bioconda \
+     --file docker/requirements/baseimage.txt \
+     --file docker/requirements/core.txt
+
+ # Or for all tools (mega)
+ conda install -c conda-forge -c bioconda \
+     --file docker/requirements/baseimage.txt \
+     --file docker/requirements/core.txt \
+     --file docker/requirements/assemble.txt \
+     --file docker/requirements/classify.txt \
+     --file docker/requirements/phylo.txt
+ ```
+
+ 4. **Install the viral-ngs Python package:**
+ ```bash
+ pip install -e .
+ ```
+
+ Note: Installing into an activated conda environment is safe - pip installs into the conda environment, not system Python.
+
+ 5. **Verify installation:**
+ ```bash
+ read_utils --version
+ python -c "from viral_ngs.core import samtools; print(samtools.SamtoolsTool().version())"
+ ```
+
  ### pip (Python package only)

  For use as a Python library without bioinformatics tools:
@@ -68,7 +119,7 @@ For use as a Python library without bioinformatics tools:
  pip install viral-ngs
  ```

- Note: pip installation does not include external tools (samtools, bwa, etc.). Use Docker for the complete toolset.
+ Note: pip installation does not include external tools (samtools, bwa, etc.). Use Docker or Conda for the complete toolset.

  ## Documentation


docker/Dockerfile.classify

Lines changed: 4 additions & 3 deletions
@@ -20,12 +20,13 @@ LABEL org.opencontainers.image.source="https://github.com/broadinstitute/viral-n
  ARG MAMBA_DOCKERFILE_ACTIVATE=1

  # Copy requirements and dependency installation script
- COPY docker/requirements/baseimage.txt docker/requirements/core.txt docker/requirements/classify.txt /tmp/requirements/
+ COPY docker/requirements/baseimage.txt docker/requirements/core.txt docker/requirements/classify.txt docker/requirements/classify-x86.txt /tmp/requirements/
  COPY docker/install-conda-deps.sh /tmp/

  # Install conda dependencies (classify tools)
- # Install all requirements together in single resolver call for proper dependency resolution
- RUN /tmp/install-conda-deps.sh /tmp/requirements/baseimage.txt /tmp/requirements/core.txt /tmp/requirements/classify.txt
+ # All files resolved together in single micromamba call; x86-only files skipped on ARM
+ RUN /tmp/install-conda-deps.sh /tmp/requirements/baseimage.txt /tmp/requirements/core.txt /tmp/requirements/classify.txt \
+     --x86-only:/tmp/requirements/classify-x86.txt

  # Copy source code (includes classify module)
  COPY src/ /opt/viral-ngs/source/src/

docker/Dockerfile.core

Lines changed: 3 additions & 4 deletions
@@ -22,10 +22,9 @@ COPY docker/requirements/baseimage.txt docker/requirements/core.txt docker/requi
  COPY docker/install-conda-deps.sh /tmp/

  # Install conda dependencies (bioinformatics tools)
- # Install baseimage.txt + core.txt together in single resolver call for proper dependency resolution
- # Then install x86-only packages (skipped on ARM)
- RUN /tmp/install-conda-deps.sh /tmp/requirements/baseimage.txt /tmp/requirements/core.txt && \
-     /tmp/install-conda-deps.sh --x86-only /tmp/requirements/core-x86.txt
+ # All files resolved together in single micromamba call; x86-only files skipped on ARM
+ RUN /tmp/install-conda-deps.sh /tmp/requirements/baseimage.txt /tmp/requirements/core.txt \
+     --x86-only:/tmp/requirements/core-x86.txt

  # Copy source code
  COPY src/ /opt/viral-ngs/source/src/

docker/Dockerfile.mega

Lines changed: 5 additions & 4 deletions
@@ -20,13 +20,14 @@ LABEL org.opencontainers.image.source="https://github.com/broadinstitute/viral-n
  ARG MAMBA_DOCKERFILE_ACTIVATE=1

  # Copy requirements and dependency installation script
- COPY docker/requirements/baseimage.txt docker/requirements/core.txt docker/requirements/assemble.txt docker/requirements/classify.txt docker/requirements/phylo.txt docker/requirements/phylo-x86.txt /tmp/requirements/
+ COPY docker/requirements/baseimage.txt docker/requirements/core.txt docker/requirements/assemble.txt docker/requirements/classify.txt docker/requirements/classify-x86.txt docker/requirements/phylo.txt docker/requirements/phylo-x86.txt /tmp/requirements/
  COPY docker/install-conda-deps.sh /tmp/

  # Install ALL conda dependencies in single resolver call for proper dependency resolution
- # Then install x86-only packages (skipped on ARM)
- RUN /tmp/install-conda-deps.sh /tmp/requirements/baseimage.txt /tmp/requirements/core.txt /tmp/requirements/assemble.txt /tmp/requirements/classify.txt /tmp/requirements/phylo.txt && \
-     /tmp/install-conda-deps.sh --x86-only /tmp/requirements/phylo-x86.txt
+ # All files resolved together; x86-only files skipped on ARM
+ RUN /tmp/install-conda-deps.sh /tmp/requirements/baseimage.txt /tmp/requirements/core.txt /tmp/requirements/assemble.txt /tmp/requirements/classify.txt /tmp/requirements/phylo.txt \
+     --x86-only:/tmp/requirements/classify-x86.txt \
+     --x86-only:/tmp/requirements/phylo-x86.txt

  # Copy source code (includes all modules)
  COPY src/ /opt/viral-ngs/source/src/

docker/Dockerfile.phylo

Lines changed: 3 additions & 4 deletions
@@ -24,10 +24,9 @@ COPY docker/requirements/baseimage.txt docker/requirements/core.txt docker/requi
  COPY docker/install-conda-deps.sh /tmp/

  # Install conda dependencies (phylo tools)
- # Install all requirements together in single resolver call for proper dependency resolution
- # Then install x86-only packages (skipped on ARM)
- RUN /tmp/install-conda-deps.sh /tmp/requirements/baseimage.txt /tmp/requirements/core.txt /tmp/requirements/phylo.txt && \
-     /tmp/install-conda-deps.sh --x86-only /tmp/requirements/phylo-x86.txt
+ # All files resolved together in single micromamba call; x86-only files skipped on ARM
+ RUN /tmp/install-conda-deps.sh /tmp/requirements/baseimage.txt /tmp/requirements/core.txt /tmp/requirements/phylo.txt \
+     --x86-only:/tmp/requirements/phylo-x86.txt

  # Copy source code (includes phylo module)
  COPY src/ /opt/viral-ngs/source/src/

docker/install-conda-deps.sh

Lines changed: 47 additions & 25 deletions
@@ -2,20 +2,31 @@
  #
  # Install conda dependencies from one or more requirements files.
  #
- # Usage: install-conda-deps.sh [options] requirements1.txt [requirements2.txt ...]
+ # Usage: install-conda-deps.sh [options] file1.txt [--x86-only:file2.txt] [file3.txt ...]
  #
- # Options:
- # --x86-only Only install if running on x86_64 architecture (skip on ARM)
+ # Arguments:
+ # file.txt Regular requirements file (installed on all architectures)
+ # --x86-only:file.txt x86-only requirements file (skipped on ARM64)
  #
  # This script installs all dependencies in a SINGLE resolver call to prevent
- # version regressions when derivative images add packages. Always pass ALL
- # requirements files together (e.g., core.txt + classify.txt for classify image).
+ # version regressions when derivative images add packages. x86-only files are
+ # included in the same resolver call on x86, ensuring consistent dependency
+ # resolution across all packages.
+ #
+ # Examples:
+ # # Install core packages only
+ # install-conda-deps.sh baseimage.txt core.txt
+ #
+ # # Install with x86-only packages (single resolver call on x86, skips x86-only on ARM)
+ # install-conda-deps.sh baseimage.txt core.txt --x86-only:core-x86.txt
+ #
+ # # Multiple x86-only files
+ # install-conda-deps.sh baseimage.txt core.txt classify.txt --x86-only:classify-x86.txt phylo.txt --x86-only:phylo-x86.txt

  set -e -o pipefail

  # Architecture detection
  ARCH=$(uname -m)
- X86_ONLY=false

  is_x86() {
      [[ "$ARCH" == "x86_64" || "$ARCH" == "amd64" ]]
@@ -49,41 +60,52 @@ stop_keepalive() {

  trap stop_keepalive EXIT SIGINT SIGQUIT SIGTERM

- # Parse options and build requirements arguments
+ # Parse arguments and build requirements list
  REQUIREMENTS=""
+ SKIPPED_X86_FILES=""
+
  for arg in "$@"; do
-     case "$arg" in
-         --x86-only)
-             X86_ONLY=true
-             ;;
-         *)
-             if [ -f "$arg" ]; then
-                 echo "Adding requirements from: $arg"
-                 REQUIREMENTS="$REQUIREMENTS --file $arg"
+     if [[ "$arg" == --x86-only:* ]]; then
+         # Extract filename from --x86-only:filename.txt
+         file="${arg#--x86-only:}"
+         if is_x86; then
+             if [ -f "$file" ]; then
+                 echo "Adding x86-only requirements from: $file"
+                 REQUIREMENTS="$REQUIREMENTS --file $file"
              else
-                 echo "Warning: requirements file not found: $arg" >&2
+                 echo "Warning: x86-only requirements file not found: $file" >&2
              fi
-             ;;
-     esac
+         else
+             echo "Skipping x86-only file on $ARCH: $file"
+             SKIPPED_X86_FILES="$SKIPPED_X86_FILES $file"
+         fi
+     else
+         # Regular requirements file
+         if [ -f "$arg" ]; then
+             echo "Adding requirements from: $arg"
+             REQUIREMENTS="$REQUIREMENTS --file $arg"
+         else
+             echo "Warning: requirements file not found: $arg" >&2
+         fi
+     fi
  done

- # Skip x86-only packages on non-x86 architectures
- if $X86_ONLY && ! is_x86; then
-     echo "Skipping x86-only packages on $ARCH architecture"
-     exit 0
- fi
-
  if [ -z "$REQUIREMENTS" ]; then
+     if [ -n "$SKIPPED_X86_FILES" ]; then
+         echo "No packages to install (all files were x86-only and skipped on $ARCH)"
+         exit 0
+     fi
      echo "Error: No valid requirements files provided" >&2
      exit 1
  fi

  echo ""
  echo "Installing conda dependencies..."
+ echo "Architecture: $ARCH"
  echo "micromamba version: $(micromamba --version)"
  echo ""

- # Install all dependencies together
+ # Install all dependencies together in single resolver call
  start_keepalive
  micromamba install -y -n base $REQUIREMENTS
  stop_keepalive
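
The argument handling in this script can be sketched in isolation. What follows is a minimal POSIX shell sketch under stated assumptions, not the script's actual code: the function name `select_requirements` is hypothetical, the architecture is passed as the first argument (instead of being read from `uname -m`) so the logic can be exercised without micromamba, and the file-existence checks are left out.

```shell
# Hypothetical helper (not part of install-conda-deps.sh).
# Usage: select_requirements ARCH file1.txt [--x86-only:file2.txt] ...
# Prints the "--file ..." arguments that a single resolver call would receive.
select_requirements() (
    arch="$1"
    shift
    selected=""
    for arg in "$@"; do
        case "$arg" in
            --x86-only:*)
                # Keep x86-only files only on x86; strip the prefix to get the path.
                case "$arch" in
                    x86_64|amd64)
                        selected="$selected --file ${arg#--x86-only:}"
                        ;;
                esac
                ;;
            *)
                # Regular files are always included.
                selected="$selected --file $arg"
                ;;
        esac
    done
    # Drop the leading space left by the accumulation above.
    printf '%s\n' "${selected# }"
)
```

Calling `select_requirements "$(uname -m)" baseimage.txt core.txt --x86-only:core-x86.txt` prints the file arguments for the current machine. Because skipped files are dropped before the resolver is invoked, both architectures still make exactly one resolver call.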

docker/requirements/classify-x86.txt

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+ # Classification tools - x86-only (no ARM64 builds available)
+ # These packages are skipped on ARM64 architecture
+ # Install with: install-conda-deps.sh --x86-only:classify-x86.txt
+
+ # BMTagger: Host sequence filtering tool
+ bmtagger>=3.101
+
+ # Kallisto/kb-python: Pseudoalignment for RNA-seq
+ # kb-python depends on ngs-tools -> pyseq-align which has no ARM64 build
+ kallisto>=0.51.1
+ kb-python>=0.29.5
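
Since these packages are skipped on ARM64, a script that depends on one of them may want to check for the binary before calling it. A minimal sketch, assuming a hypothetical helper named `require_tool` (not part of this commit) that relies only on the POSIX `command -v` builtin:

```shell
# Hypothetical guard (an illustration, not code from this repository).
# Returns 0 if the named executable is on PATH, otherwise prints a hint and returns 1.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    printf 'error: %s not found on PATH (arch: %s); it may be an x86-only package\n' \
        "$1" "$(uname -m)" >&2
    return 1
}
```

For example, `require_tool kallisto || exit 1` near the top of a script would stop early on an ARM64 image with a readable message, rather than failing later at the point of use.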

docker/requirements/classify.txt

Lines changed: 1 addition & 3 deletions
@@ -1,8 +1,6 @@
  # Metagenomic classification tools
+ # Note: x86-only packages (bmtagger, kallisto, kb-python) are in classify-x86.txt
  blast>=2.15.0
- bmtagger>=3.101
- kallisto>=0.51.1
- kb-python>=0.29.5
  kma>=1.4.0
  kmc>=3.2.1
  kraken2>=2.1.3
