Commit 393f53d

committed
Devdoc updates
1 parent 8567c0e commit 393f53d

5 files changed

Lines changed: 43 additions & 86 deletions

docs/dev/Issues/Logs/2025-10-15 to 2025-10-17.md

Lines changed: 3 additions & 0 deletions
@@ -30,7 +30,10 @@ Next step should be to come up with an overarching test strategy.
3030
+ Individual processing steps
3131
+ Stub run with very large numbers of samples (i.e. 100k-1M)
3232
+ Find some way to validate the outputs
33+
34+
All nf-test tests currently pass.
3335
# Projects
36+
+ Formal test plan
3437
+ Migrate test assets to git lfs
3538
+ Replace end-to-end test with M129
3639
# Changes
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
1+
```csvtable
2+
source: Tests/TEST_MATRIX.csv
3+
```
4+
5+
6+
7+

docs/dev/Tests/TEST_MATRIX.csv

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
Feature/component,Behavior Tested,Test,File,Level,Coverage
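The header above defines six columns for the test matrix. As a minimal sketch of how such a file could be loaded and sanity-checked, the Python below parses CSV text and rejects any header that differs from those six names. The function name `read_test_matrix` and the sample row (test name, file path, level, and coverage values) are hypothetical and exist only for illustration; they are not taken from the Hich repository.

```python
import csv
import io

# Column names copied from the header line of TEST_MATRIX.csv
EXPECTED_COLUMNS = [
    "Feature/component",
    "Behavior Tested",
    "Test",
    "File",
    "Level",
    "Coverage",
]


def read_test_matrix(text):
    """Parse test-matrix CSV text and return its rows as a list of dicts.

    Raises ValueError if the header does not match EXPECTED_COLUMNS.
    """
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    return list(reader)


# Hypothetical row, for illustration only
sample = (
    "Feature/component,Behavior Tested,Test,File,Level,Coverage\n"
    "Interface,Rejects malformed sampleFile,hypothetical_test,"
    "tests/hypothetical.nf.test,workflow,partial\n"
)
rows = read_test_matrix(sample)
```

A header-only file, which is the state of the matrix in this commit, parses to an empty list.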

docs/dev/Tests/Test strategy.md

Lines changed: 32 additions & 86 deletions
Original file line numberDiff line numberDiff line change
@@ -1,95 +1,41 @@
1-
[**nf-test documentation**](https://www.nf-test.com/docs/getting-started/)
2-
3-
## Tests of tests (metatests)
4-
* Scripts to download the large test datasets
5-
## Scalability
6-
* Time and memory cost of evaluating the pipeline vs. order of magnitude of samples.
7-
8-
9-
# Tags
10-
11-
## priority
12-
0to3s (<3s) -> 3to10s (3-10s) -> 10to60s (10-60s) -> 1to3m (1-3m)
13-
14-
0to3s: 114s, 40 tests
15-
16-
# Factoring Hich into testable units
17-
18-
Components of Nextflow workflow
19-
* Processes
20-
* Workflow
21-
* Utils
22-
* Containers
23-
* Command line parameters
24-
* Profiles
25-
* Scalability
26-
* Workflow control
27-
28-
Components of Python tool
29-
30-
Runtime
31-
* Run fast tests first, slower tests later
32-
33-
# Infrastructure
34-
35-
## Small test data objects
36-
## Large test data objects
37-
### Downloaders
38-
39-
### nf-test plugins
40-
* None currently, but can be created by putting groovy files in tests/lib, causing them to be added to the classpath.
41-
42-
<iframe width=500, src="https://raw.githubusercontent.com/bskubi/hich/refs/heads/main/nf-test.config"></iframe>
43-
### Test workflow automation
44-
bash script to run tests with the right priority order, sharding, logging
1+
# Hich Attributes
2+
**Reliable:** Ensures the user gets the behavior they expect.
3+
**Versatile:** Adapts to diverse use cases.
4+
**Powerful:** Handles large, complex analysis tasks.
5+
**Clear:** Easy to figure out how to get it working.
6+
# Hich Components
7+
+ **Interface** (config, params, profiles, sampleFile)
8+
+ **Orchestration** (Nextflow, containers)
9+
+ **Preprocessing** (alignment, filtering and contact matrix generation)
10+
+ **Analysis** (feature calling and QC)
11+
# Hich Capabilities
12+
**Interface**
13+
- Reliable: Input validation with informative errors.
14+
- Versatile: All tools used during processing fully configurable.
15+
- Powerful: Run the exact analysis you want with analysis plans and sample selection strategies.
16+
- Clear: Zero-install. Dictate output with declarative sample attributes.
17+
18+
**Orchestration**
19+
- Reliable: Built on industry-standard workflow management system.
20+
- Versatile: Compatible with wide variety of computing environments.
21+
- Powerful: Take full advantage of cluster computing.
22+
- Powerful: Scale to hundreds of thousands of samples and datasets of hundreds of terabytes.
23+
- Clear: Standardized execution control via Nextflow profiles.
24+
25+
**Preprocessing**
26+
+ Reliable: Based on standard bioinformatics tools or well-tested new tools.
27+
+ Versatile: Accepts wide variety of assay types (multi-enzyme digest, single-cell, methyl-Hi-C, capture) and intermediate data formats.
28+
+ Powerful: Handle replicates and cell types with merge and split.
29+
+ Clear:
30+
31+
# Test productivity
4532
#### Why not use [GitHub Actions](https://docs.github.com/en/actions/get-started/understand-github-actions) to run Hich tests?
4633
* I looked into this on 2025-10-14. **GitHub Actions is only free when using the standard GitHub-hosted runners. Those runners have only 4 CPUs, 16GB RAM, and 14GB storage, which is too little for some of the tests we'll need to run.** [source](https://docs.github.com/en/actions/how-tos/write-workflows/choose-where-workflows-run/choose-the-runner-for-a-job#standard-github-hosted-runners-for-public-repositories)
4734
* I'm not clear on whether it would be possible or helpful to use self-hosted runners on ARC, but this seems like a can of worms. I'd rather develop 1+ SLURM scripts that test Hich on ARC.
48-
49-
# Key decisions
50-
* Test suite composition
51-
* Automated testing tools
52-
* Workflow to run automated tests
53-
* Test resource management
54-
* Test documentation
55-
* Release governance
56-
57-
# Test performance
58-
427s for 57 tests in `nf-test test --tag fastest`
59-
+ Most tests take 2s, but longest test takes 50s
60-
+
61-
# Strategy
62-
63-
* Identify integration points (places where components interact)
64-
* Define scope and boundaries
65-
* Risk assessment to set priorities
66-
* Tooling and environment strategy
67-
* Test framework (nf-test)
68-
* Test environment
69-
* Approach for test data management
70-
71-
- [x] Explore github actions to set up and run tests
72-
- [x] Ankify nf-test as we've made significant progress and it's clear we'll use it.
73-
- [ ] Set up dedicated test environment
74-
- [ ] Determine how to do test data management more systematically
75-
- [ ] Inventory existing tests
76-
- [ ] Identify tests that currently fail
77-
- [ ] Identify highest-priority new tests
78-
- [ ] Create build and test script suite for individual containers in the repo
79-
- [ ] Get entire existing test suite functional
80-
- [ ] Get simple Github actions set up to run automated tests of pipeline
81-
8235
# nf-test issues
8336
* nf-test shards by skipping tests, which could interact oddly with obsolete snapshot detection, but I don't know if this is the case.
8437
* Increasing the number of shards fivefold for the fastest runs only decreased runtime by a factor of 2, possibly due to outlier slow tests.
8538
* Is there a way to rerun only tests that failed on the last run?
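
The sharding observation above is consistent with a simple bound: parallel wall-clock time can never drop below the duration of the slowest single test. The sketch below illustrates this with hypothetical durations (56 tests at 2 s plus one 50 s outlier) and assumes round-robin assignment of tests to shards. That assignment rule is an assumption made for illustration, not a description of how nf-test actually distributes tests.

```python
def shard_makespan(durations, n_shards):
    """Wall-clock time when test i is assigned to shard i % n_shards
    and all shards run in parallel (round-robin is an assumption)."""
    totals = [0.0] * n_shards
    for i, duration in enumerate(durations):
        totals[i % n_shards] += duration
    return max(totals)


# Hypothetical durations: one 50 s outlier plus 56 tests at 2 s each
durations = [50.0] + [2.0] * 56

one_shard = shard_makespan(durations, 1)    # 162.0
five_shards = shard_makespan(durations, 5)  # 72.0
speedup = one_shard / five_shards           # 2.25
```

With these numbers, five shards give a speedup of only 2.25x because the shard holding the outlier dominates. Adding more shards cannot bring the runtime below 50 s.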
8639
# Basics
8740

88-
[**nf-test**](https://www.nf-test.com/) is the automated test suite for Hich.
89-
### tests/assets
90-
91-
* Small test files are stored in folders based on file type (bam, cool, hic, index, etc) and are backed up on the github repo
92-
* What is the limit on test file size?
93-
* Large test files must be downloaded to `tests/assets/downloads` (in #gitignore) after cloning the repo using `obtain_large_resources.sh` or `obtain_large_resources_slurm.sh` (a simple SLURM runner for `obtain_large_resources.sh`)
94-
### tests/contents
95-
* Contains tests with templates autogenerated by `nf-test create`
41+
[**nf-test**](https://www.nf-test.com/) is the automated test suite for Hich.

docs/dev/Tests/Untitled.md

Whitespace-only changes.
