Commit d3bba97

dpark01 and claude committed
Restructure as proper Python package with Docker/CI support
- Convert standalone scripts to installable Python package with src/ layout
- Create single CLI entry point `qprimer` with 7 subcommands:
  - generate, pick-representatives, prepare-input, evaluate, filter, build-output, select-multiplex
- Reduce environment.yml from 165 to ~15 packages (top-level only)
- Move ML models to package data (src/qprimer_designer/data/)
- Add external tool wrappers (ViennaRNA, bowtie2, MAFFT) using PATH
- Fix hardcoded paths in training/ to use environment variables
- Move Snakefiles to workflows/ and update to use new CLI syntax
- Delete 37MB ViennaRNA tarball (use bioconda instead)
- Add Dockerfile with mambaorg/micromamba base
- Add GitHub Actions workflows:
  - test.yml: pytest runner
  - docker.yml: multi-arch Docker builds
  - snakemake.yml: workflow dry-run validation
- Add placeholder tests for sequences and params modules
- Add CLAUDE.md development guide and update README.md

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent 9685b51 commit d3bba97

45 files changed, +2728 -362 lines changed

.dockerignore

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
+# Git
+.git
+.gitignore
+
+# Python
+__pycache__
+*.py[cod]
+*$py.class
+*.egg-info
+.eggs
+dist
+build
+*.egg
+
+# IDE
+.vscode
+.idea
+*.swp
+*.swo
+
+# Testing
+.pytest_cache
+.coverage
+htmlcov
+
+# Development
+*.log
+.DS_Store
+
+# Training data (not needed for runtime)
+training/
+
+# Large files
+*.tar.bz2
+*.tar.gz
+*.zip
+
+# Analysis artifacts
+CODEBASE_ANALYSIS.md

.github/workflows/docker.yml

Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
+name: Docker Build
+
+on:
+  push:
+    branches: ['**']
+    tags: ['**']
+  pull_request:
+    branches: [main]
+
+env:
+  REGISTRY: ghcr.io
+  IMAGE_NAME: ${{ github.repository }}
+
+jobs:
+  build:
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read
+      packages: write
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: docker/setup-qemu-action@v3
+      - uses: docker/setup-buildx-action@v3
+
+      - name: Log in to Container Registry
+        if: github.event_name != 'pull_request'
+        uses: docker/login-action@v3
+        with:
+          registry: ${{ env.REGISTRY }}
+          username: ${{ github.actor }}
+          password: ${{ secrets.GITHUB_TOKEN }}
+
+      - name: Extract metadata
+        id: meta
+        uses: docker/metadata-action@v5
+        with:
+          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
+          tags: |
+            type=ref,event=branch
+            type=ref,event=tag
+            type=ref,event=pr
+            type=raw,value=latest,enable={{is_default_branch}}
+
+      - name: Build and push
+        uses: docker/build-push-action@v6
+        with:
+          context: .
+          platforms: linux/amd64,linux/arm64
+          push: ${{ github.event_name != 'pull_request' }}
+          tags: ${{ steps.meta.outputs.tags }}
+          labels: ${{ steps.meta.outputs.labels }}
+          cache-from: type=gha
+          cache-to: type=gha,mode=max

.github/workflows/snakemake.yml

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+name: Snakemake Validation
+
+on:
+  push:
+    branches: [main]
+  pull_request:
+    branches: [main]
+
+jobs:
+  validate:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: mamba-org/setup-micromamba@v2
+        with:
+          environment-file: environment.yml
+          cache-environment: true
+
+      - name: Install package
+        run: pip install -e .
+        shell: micromamba-shell {0}
+
+      - name: Validate Snakefile syntax (dry-run)
+        run: |
+          cd workflows
+          snakemake -s Snakefile.example --dry-run --quiet
+        shell: micromamba-shell {0}

.github/workflows/test.yml

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+name: Tests
+
+on:
+  push:
+    branches: [main]
+  pull_request:
+    branches: [main]
+
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: mamba-org/setup-micromamba@v2
+        with:
+          environment-file: environment.yml
+          cache-environment: true
+
+      - name: Install package
+        run: pip install -e ".[dev]"
+        shell: micromamba-shell {0}
+
+      - name: Run tests
+        run: pytest tests/ -v
+        shell: micromamba-shell {0}

.gitignore

Lines changed: 93 additions & 0 deletions
@@ -0,0 +1,93 @@
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*$py.class
+
+# C extensions
+*.so
+
+# Distribution / packaging
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+
+# PyInstaller
+*.manifest
+*.spec
+
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.nox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+*.py,cover
+.hypothesis/
+.pytest_cache/
+
+# Translations
+*.mo
+*.pot
+
+# Environments
+.env
+.venv
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/
+
+# IDEs
+.idea/
+.vscode/
+*.swp
+*.swo
+*~
+
+# macOS
+.DS_Store
+
+# Jupyter Notebook
+.ipynb_checkpoints
+
+# Pipeline outputs
+alignments/
+bt2_index/
+final/
+inputs/
+outputs/
+primer_seqs/
+target_seqs/choice/
+target_seqs/msa/
+target_seqs/representative/
+target_seqs/selected/
+
+# Large files
+*.tar.bz2
+*.tar.gz
+*.zip
+
+# Logs
+*.log

CLAUDE.md

Lines changed: 94 additions & 0 deletions
@@ -0,0 +1,94 @@
+# qprimer_designer Development Guide
+
+## Project Structure
+
+This is a Python package for ML-guided qPCR primer design. Key directories:
+
+- `src/qprimer_designer/` - Main package source
+  - `cli.py` - Single CLI entry point with subcommands
+  - `commands/` - Individual subcommand implementations
+  - `models/` - PyTorch ML model architectures
+  - `external/` - Wrappers for external tools (ViennaRNA, bowtie2, MAFFT)
+  - `utils/` - Shared utilities (sequence ops, encoding, params)
+  - `data/` - Pre-trained ML models (bundled as package data)
+- `workflows/` - Snakemake workflow templates
+- `training/` - Historical model training code (not production)
+- `tests/` - pytest test suite
+
+## Development Setup
+
+```bash
+# Create conda environment with external tools
+conda env create -f environment.yml
+conda activate qprimer-designer
+
+# Install package in editable mode with dev dependencies
+pip install -e ".[dev]"
+
+# Run tests
+pytest tests/ -v
+```
+
+## CLI Usage
+
+Single entry point with subcommands:
+```bash
+qprimer generate --help
+qprimer evaluate --help
+qprimer pick-representatives --help
+qprimer prepare-input --help
+qprimer filter --help
+qprimer build-output --help
+qprimer select-multiplex --help
+```
+
+## Key Patterns
+
+### External Tool Wrappers
+External bioinformatics tools (RNAduplex, bowtie2, mafft) are assumed to be
+in PATH via conda installation. Use `shutil.which()` to verify availability.
+
+### ML Model Loading
+Models are bundled as package data. Load using `importlib.resources`:
+```python
+from importlib.resources import files
+model_path = files('qprimer_designer.data').joinpath('combined_classifier.pth')
+```
+
+### Adding New Subcommands
+1. Create module in `src/qprimer_designer/commands/`
+2. Implement `register(subparsers)` function to add argparse subparser
+3. Import and register in `cli.py`
+
+## Testing
+
+- Run all tests: `pytest tests/ -v`
+- Run with coverage: `pytest tests/ -v --cov=qprimer_designer`
+- Tests should not require external tools (mock them)
+
+## Docker
+
+Build locally:
+```bash
+docker build -t qprimer-designer:local .
+docker run --rm qprimer-designer:local qprimer --help
+```
+
+## Snakemake Workflows
+
+Workflows are in `workflows/`. They use the `qprimer` CLI internally:
+```bash
+cd workflows
+snakemake -s Snakefile.example --cores all
+```
+
+Dry-run validation:
+```bash
+snakemake -s Snakefile.example --dry-run
+```
+
+## Environment Variables
+
+- `QPRIMER_FONT_PATH`: Custom font directory for training plots (optional)
+- `QPRIMER_TOOLPATH`: Custom tool installation path for training scripts (optional)
+- `RNASTRUCTURE_DATAPATH`: Path to RNAstructure data tables (optional)
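
Review note: the three "Adding New Subcommands" steps in CLAUDE.md can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code from this commit. Only the `qprimer` program name, the `generate` and `filter` subcommand names, and the `register(subparsers)` convention come from the guide. The function names, the `--input` and `--min-score` flags, and the handler return values are hypothetical.

```python
import argparse


def run_generate(args):
    # Placeholder handler; the real command logic is not shown in this diff.
    return f"generate: {args.input}"


def register_generate(subparsers):
    """Stand-in for the register(subparsers) hook of one commands/ module."""
    parser = subparsers.add_parser("generate", help="Generate candidate primers")
    # Hypothetical flag, used here only to show argument wiring.
    parser.add_argument("--input", required=True, help="input file (assumed)")
    parser.set_defaults(func=run_generate)


def run_filter(args):
    # Placeholder handler for a second subcommand.
    return f"filter: {args.min_score}"


def register_filter(subparsers):
    """Second hypothetical command module, registered the same way."""
    parser = subparsers.add_parser("filter", help="Filter evaluated primers")
    parser.add_argument("--min-score", type=float, default=0.5, help="assumed flag")
    parser.set_defaults(func=run_filter)


def build_parser():
    """Stand-in for cli.py: one top-level parser, each command registers itself."""
    parser = argparse.ArgumentParser(prog="qprimer")
    subparsers = parser.add_subparsers(dest="command", required=True)
    for register in (register_generate, register_filter):
        register(subparsers)
    return parser


def main(argv=None):
    # Dispatch to whichever handler the chosen subcommand attached via set_defaults.
    args = build_parser().parse_args(argv)
    return args.func(args)
```

Calling `main(["generate", "--input", "targets.fa"])` dispatches to `run_generate`. With this layout, adding a subcommand means adding one more `register_*` function to the tuple in `build_parser`, which matches step 3 of the guide.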

Dockerfile

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+FROM mambaorg/micromamba:2.0.0-ubuntu24.04
+
+COPY --chown=$MAMBA_USER:$MAMBA_USER environment.yml /tmp/environment.yml
+RUN micromamba install -y -n base -f /tmp/environment.yml && \
+    micromamba clean --all --yes
+
+ARG MAMBA_DOCKERFILE_ACTIVATE=1
+ENV PATH="/opt/conda/bin:$PATH"
+
+COPY --chown=$MAMBA_USER:$MAMBA_USER . /app
+WORKDIR /app
+
+RUN pip install --no-cache-dir .
+
+# Verify installation
+RUN qprimer --help && \
+    qprimer generate --help && \
+    RNAduplex --version && \
+    bowtie2 --version | head -1
+
+WORKDIR /data
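
Review note: the Dockerfile's verification step checks the external tools at build time. A matching runtime check, along the lines of the `shutil.which()` advice in CLAUDE.md, might look like the sketch below. The tool names are taken from the guide. The function names and the injectable `which` parameter are assumptions made for illustration and are not part of the package's actual `external/` wrappers.

```python
import shutil

# Executable names listed in CLAUDE.md as expected on PATH.
REQUIRED_TOOLS = ("RNAduplex", "bowtie2", "mafft")


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names of tools that cannot be found on PATH.

    `which` is injectable so tests can pass a fake lookup function
    without needing the real tools installed.
    """
    return [tool for tool in tools if which(tool) is None]


def require_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Raise a RuntimeError naming every missing tool, or return None."""
    missing = find_missing_tools(tools, which)
    if missing:
        raise RuntimeError("Missing external tools on PATH: " + ", ".join(missing))
```

Passing a fake `which` keeps tests independent of the conda environment, which is in the spirit of the guide's rule that tests should not require external tools.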
