Commit eb11e2e

Merge pull request #398 from nf-core/dev
Release 4.3.1 update tutorials
2 parents 132ab3d + dc023ff commit eb11e2e

9 files changed

Lines changed: 143 additions & 258 deletions

.github/workflows/awsfulltest.yml

Lines changed: 11 additions & 32 deletions
Original file line numberDiff line numberDiff line change
@@ -4,61 +4,40 @@ name: nf-core AWS full size tests
44
# It runs the -profile 'test_full' on AWS batch
55

66
on:
7-
pull_request:
8-
branches:
9-
- main
10-
- master
117
workflow_dispatch:
128
pull_request_review:
139
types: [submitted]
10+
release:
11+
types: [published]
1412

1513
jobs:
1614
run-platform:
1715
name: Run AWS full tests
18-
# run only if the PR is approved by at least 2 reviewers and against the master branch or manually triggered
19-
if: github.repository == 'nf-core/airrflow' && github.event.review.state == 'approved' && github.event.pull_request.base.ref == 'master' || github.event_name == 'workflow_dispatch'
16+
# run only if the PR is approved by at least 2 reviewers and against the master/main branch or manually triggered
17+
if: github.repository == 'nf-core/airrflow' && github.event.review.state == 'approved' && (github.event.pull_request.base.ref == 'master' || github.event.pull_request.base.ref == 'main') || github.event_name == 'workflow_dispatch' || github.event_name == 'release'
2018
runs-on: ubuntu-latest
2119
steps:
22-
- name: Get PR reviews
23-
uses: octokit/request-action@v2.x
24-
if: github.event_name != 'workflow_dispatch'
25-
id: check_approvals
26-
continue-on-error: true
27-
with:
28-
route: GET /repos/${{ github.repository }}/pulls/${{ github.event.pull_request.number }}/reviews?per_page=100
29-
env:
30-
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
31-
32-
- name: Check for approvals
33-
if: ${{ failure() && github.event_name != 'workflow_dispatch' }}
34-
run: |
35-
echo "No review approvals found. At least 2 approvals are required to run this action automatically."
36-
exit 1
37-
38-
- name: Check for enough approvals (>=2)
39-
id: test_variables
40-
if: github.event_name != 'workflow_dispatch'
20+
- name: Set revision variable
21+
id: revision
4122
run: |
42-
JSON_RESPONSE='${{ steps.check_approvals.outputs.data }}'
43-
CURRENT_APPROVALS_COUNT=$(echo $JSON_RESPONSE | jq -c '[.[] | select(.state | contains("APPROVED")) ] | length')
44-
test $CURRENT_APPROVALS_COUNT -ge 2 || exit 1 # At least 2 approvals are required
23+
echo "revision=${{ (github.event_name == 'workflow_dispatch' || github.event_name == 'release') && github.sha || 'dev' }}" >> "$GITHUB_OUTPUT"
4524
4625
- name: Launch workflow via Seqera Platform
4726
uses: seqeralabs/action-tower-launch@v2
4827
with:
4928
workspace_id: ${{ secrets.TOWER_WORKSPACE_ID }}
5029
access_token: ${{ secrets.TOWER_ACCESS_TOKEN }}
5130
compute_env: ${{ secrets.TOWER_COMPUTE_ENV }}
52-
revision: ${{ github.sha }}
53-
workdir: s3://${{ secrets.AWS_S3_BUCKET }}/work/airrflow/work-${{ github.sha }}
31+
revision: ${{ steps.revision.outputs.revision }}
32+
workdir: s3://${{ secrets.AWS_S3_BUCKET }}/work/airrflow/work-${{ steps.revision.outputs.revision }}
5433
parameters: |
5534
{
5635
"hook_url": "${{ secrets.MEGATESTS_ALERTS_SLACK_HOOK_URL }}",
57-
"outdir": "s3://${{ secrets.AWS_S3_BUCKET }}/airrflow/results-${{ github.sha }}"
36+
"outdir": "s3://${{ secrets.AWS_S3_BUCKET }}/airrflow/results-${{ steps.revision.outputs.revision }}"
5837
}
5938
profiles: test_full
6039

61-
- uses: actions/upload-artifact@v4
40+
- uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4
6241
with:
6342
name: Seqera Platform debug log file
6443
path: |

CHANGELOG.md

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -3,6 +3,16 @@
33
The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/)
44
and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).
55

6+
## [4.3.1] - Revelio hotfix
7+
8+
### `Added`
9+
10+
- [#399](https://github.com/nf-core/airrflow/pull/399) Bump versions.
11+
12+
### `Fixed`
13+
14+
- [#392](https://github.com/nf-core/airrflow/pull/392) Updated tutorials.
15+
616
## [4.3.0] - Revelio
717

818
### `Added`

docs/usage.md

Lines changed: 5 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,7 @@
66
77
## Introduction
88

9-
The nf-core/airrflow pipeline allows processing B-cell receptor (BCR) and and T-cell receptor (TCR) sequencing data from bulk and single-cell sequencing protocols. It allows the processing of targeted bulk and single-cell adaptive immune receptor sequencing data (AIRR-seq), as well as the extraction of TCR and BCR sequences from untargeted bulk and single-cell RNA-seq data. The pipeline enables and end-to-end analysis, departing from raw reads or readily assembled sequences, and performs sequence assembly, V(D)J assignment, clonal group inference, lineage reconstruction and repertoire analysis using the [Immcantation](https://immcantation.readthedocs.io/en/stable/) framework, as well as other immune repertoire analysis tools.
9+
nf-core/airrflow allows processing B-cell receptor (BCR) and T-cell receptor (TCR) sequencing data from bulk and single-cell sequencing protocols. It allows the processing of targeted bulk and single-cell adaptive immune receptor sequencing data (AIRR-seq), as well as the extraction of TCR and BCR sequences from untargeted bulk and single-cell RNA-seq data. The pipeline enables an end-to-end analysis, starting from raw reads or readily assembled sequences, and performs sequence assembly, V(D)J assignment, clonal group inference, lineage reconstruction and repertoire analysis using the [Immcantation](https://immcantation.readthedocs.io/en/stable/) framework, as well as other immune repertoire analysis tools.
1010

1111
In addition to this page, you can find additional information on how to use the pipeline on the following pages:
1212

@@ -198,10 +198,10 @@ An example samplesheet is:
198198
199199
It is possible to provide several fastq files per sample (e.g. sequenced over different chips or lanes). In this case the different fastq files per sample will be provided to the same cellranger process. These rows should then have an identical `sample_id` field.
200200

201-
### Fastq input samplesheet (untargeted bulk or single-cell RNAseq)
201+
### Fastq input samplesheet (untargeted bulk or single-cell RNA-seq)
202202

203203
When running the untargeted protocol, BCR or TCR sequences will be extracted from the untargeted bulk or single-cell RNA sequencing with tools such as [TRUST4](https://github.com/liulab-dfci/TRUST4).
204-
The required input file is the same as for the [Fastq bulk AIRR samplesheet](#fastq-input-samplesheet-bulk-airr-sequencing) or [Fastq single-cell AIRR samplesheet](#fastq-input-samplesheet-single-cell-sequencing) depending on the input data type (bulk RNAseq or single-cell RNAseq).
204+
The required input file is the same as for the [Fastq bulk AIRR samplesheet](#fastq-input-samplesheet-bulk-airr-sequencing) or [Fastq single-cell AIRR samplesheet](#fastq-input-samplesheet-single-cell-sequencing) depending on the input data type (bulk RNA-seq or single-cell RNA-seq).
205205

206206
### Assembled input samplesheet (bulk or single-cell sequencing)
207207

@@ -535,7 +535,7 @@ nextflow run nf-core/airrflow \
535535
```
536536

537537
- If UMIs are present, the read containing them must be specified using the `--umi_read` parameter.
538-
- The `--read_format` parameter can be used to specify the Cell Barcode and UMI position within the reads (see TRUST4 [docs](https://github.com/liulab-dfci/TRUST4?tab=readme-ov-file#10x-genomics-data-and-barcode-based-single-cell-data)). For scRNAseq with 10X Genomics the R1 read usually contains both the cell barcode (barcode) and UMI. So we specify "R1" for both `--umi_read` and `--cell_barcode_read`, and the positions of both the cell barcode and UMI with the `--read_format` parameter as in the example ("bc:0:15,um:16:27"). Then specify the R1 read in the filename_R1 column of the samplesheet, and the read containing the actual sequence (usually R2) in the filename_R2 column of the samplesheet.
538+
- The `--read_format` parameter can be used to specify the Cell Barcode and UMI position within the reads (see TRUST4 [docs](https://github.com/liulab-dfci/TRUST4?tab=readme-ov-file#10x-genomics-data-and-barcode-based-single-cell-data)). For scRNA-seq with 10X Genomics the R1 read usually contains both the cell barcode (barcode) and UMI. So we specify "R1" for both `--umi_read` and `--cell_barcode_read`, and the positions of both the cell barcode and UMI with the `--read_format` parameter as in the example ("bc:0:15,um:16:27"). Then specify the R1 read in the filename_R1 column of the samplesheet, and the read containing the actual sequence (usually R2) in the filename_R2 column of the samplesheet.
539539

540540
## Important considerations for clonal analysis
541541

@@ -622,7 +622,7 @@ Specify the path to a specific config file (this is a core Nextflow command). Se
622622

623623
### Resource requests
624624

625-
Whilst the default requirements set within the pipeline will hopefully work for most people and with most input data, you may find that you want to customise the compute resources that the pipeline requests. Each step in the pipeline has a default set of requirements for number of CPUs, memory and time. For most of the pipeline steps, if the job exits with any of the error codes specified [here](https://github.com/nf-core/rnaseq/blob/4c27ef5610c87db00c3c5a3eed10b1d161abf575/conf/base.config#L18) it will automatically be resubmitted with higher resources request (2 x original, then 3 x original). If it still fails after the third attempt then the pipeline execution is stopped.
625+
Whilst the default requirements set within the pipeline will hopefully work for most people and with most input data, you may find that you want to customise the compute resources that the pipeline requests. Each step in the pipeline has a default set of requirements for number of CPUs, memory and time. For most of the pipeline steps, if the job exits with any of the error codes specified [here](https://github.com/nf-core/airrflow/blob/132ab3d129c0df3f2de0ede7a7afaf549277c512/conf/base.config#L17), it will automatically be resubmitted with a higher resource request (2 x original, then 3 x original). If it still fails after the third attempt, the pipeline execution is stopped.
626626

627627
To change the resource requests, please see the [max resources](https://nf-co.re/docs/usage/configuration#max-resources) and [tuning workflow resources](https://nf-co.re/docs/usage/configuration#tuning-workflow-resources) section of the nf-core website.
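
The resubmission rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the pipeline's actual configuration: the function name `requests_per_attempt` and the 6 GB base value are hypothetical.

```python
def requests_per_attempt(base, max_attempts=3):
    """Return the resource request for each attempt: 1 x, 2 x, then 3 x the base."""
    return [base * attempt for attempt in range(1, max_attempts + 1)]


# Hypothetical process that asks for 6 GB of memory on its first attempt.
memory_gb = requests_per_attempt(6)
print(memory_gb)  # [6, 12, 18]
```

If the third attempt also fails, no further request is made and the run stops.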
628628

docs/usage/FAQ.md

Lines changed: 52 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,52 @@
1+
# nf-core/airrflow: Frequently Asked Questions
2+
3+
## How to update process resource requests and resource limits?
4+
5+
By default, the pipeline defines reasonable resource requests for each process (number of CPUs, memory, time limits) based on typical compute environments. However, you can adjust these settings to better match the size of your datasets or the capabilities of your compute infrastructure. You can customize the limits and requests in a `resource.config` file and provide it to the pipeline with the `-c` parameter at execution. The `resourceLimits` option sets an upper bound on the resource requests of all processes in the pipeline. Ensure that these limits do not exceed the resources available on your compute system.
6+
7+
```groovy title="resource.config"
8+
process {
9+
resourceLimits = [cpus: 8, memory: 72.GB, time: 24.h]
10+
}
11+
```
12+
13+
To update the resource requests for a specific pipeline process, you can also provide specific process requests in this config file. For example, to update the resource requests for the `CHANGEO_ASSIGNGENES` process:
14+
15+
```groovy title="resource.config"
16+
process {
17+
resourceLimits = [cpus: 8, memory: 72.GB, time: 24.h]
18+
19+
withName:CHANGEO_ASSIGNGENES {
20+
cpus = 2
21+
memory = 10.GB
22+
time = 5.h
23+
}
24+
}
25+
```
26+
27+
In nf-core pipelines, each process has a label indicating the resources that are being requested (`process_low`, `process_medium`, `process_high`, ...). The CPUs, memory and time configured for each of these labels can be found in the [base.config](https://github.com/nf-core/airrflow/blob/master/conf/base.config) file. You can update the resource requests for all processes with a specific label by providing the updated configuration. For example, here we update the resource requests of processes with the `process_high` label:
28+
29+
```groovy title="resource.config"
30+
process {
31+
resourceLimits = [cpus: 24, memory: 100.GB, time: 24.h]
32+
33+
withLabel:process_high {
34+
cpus = 24
35+
memory = 100.GB
36+
time = 10.h
37+
}
38+
}
39+
```
40+
41+
Note that the resource requests will never exceed what is specified in the `resourceLimits` line, so if you want to increase the resource requests for specific processes, you should also raise the `resourceLimits` values and run the pipeline on a compute infrastructure with sufficient resources. In this example we have also updated `resourceLimits` to reflect that.
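
The capping behaviour can be pictured with a short Python sketch. It is a simplified model under stated assumptions, not Nextflow's implementation: the helper `apply_limits` and the plain-number units (GB, hours) are hypothetical, while the values are taken from the examples above.

```python
def apply_limits(request, limits):
    """Cap each requested resource at its limit; resources without a limit pass through."""
    return {name: min(value, limits.get(name, value)) for name, value in request.items()}


# Limits from the first example, combined with the process_high request from the last one.
limits = {"cpus": 8, "memory_gb": 72, "time_h": 24}
request = {"cpus": 24, "memory_gb": 100, "time_h": 10}

effective = apply_limits(request, limits)
print(effective)  # {'cpus': 8, 'memory_gb': 72, 'time_h': 10}
```

With the original limits the process would receive only 8 CPUs and 72 GB, which is why the last example raises `resourceLimits` together with the `process_high` request.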
42+
43+
> [!TIP]
44+
> For more information about nf-core pipeline resource configurations, check out the [nf-core pipeline configuration docs](https://nf-co.re/docs/usage/getting_started/configuration).
45+
46+
## How to customize the analysis and figures?
47+
48+
nf-core/airrflow is a standardized pipeline that performs the different computational analysis steps and provides standard figures for a first data exploration. You can use nf-core/airrflow results as input for customized analyses using R and the Immcantation tools. There are three options to customize your analysis:
49+
50+
- Option 1: some of the intermediate analysis steps are stored in `RData` objects that can be loaded in R to customize your figures. For instance, clonal abundance calculations can be time-consuming, so the results are stored in the results folder (`clonal_abundance/define_clones/all_reps_clone_report/ggplots/abundanceSample.RData`). With the `load()` function in R, both the abundance plot and the clonal abundance object can be loaded.
51+
- Option 2: perform your own downstream analysis with the Immcantation framework. You can load the nf-core/airrflow results in AIRR format in R and use the Immcantation tools to plot the data as you need for publications. Check the [Immcantation tutorials](https://immcantation.readthedocs.io/en/stable/getting_started/getting-started.html) for this purpose; for example, the Immcantation [single-cell V(D)J analysis tutorial](https://immcantation.readthedocs.io/en/stable/getting_started/10x_tutorial.html) walks through a single-cell data analysis.
52+
- Option 3: for more advanced users, and in case you need to repeat the exact same analysis for multiple projects, you can customize the [Airrflow report](https://github.com/nf-core/airrflow/blob/master/assets/repertoire_comparison.Rmd) R Markdown file that comes with the pipeline and provide the updated version to the pipeline with the `--report_rmd` option. The pipeline will then use this file instead to create the report.
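
The options above use R and Immcantation. Because AIRR-format results are plain tab-separated tables, a first look at them is also possible with other tools. The Python sketch below is one such alternative and is not part of the pipeline: it counts sequences per clone from a small inline table. The three rows are made-up sample data, and the column names `sequence_id`, `clone_id` and `v_call` are assumed to follow the AIRR rearrangement schema.

```python
import csv
import io
from collections import Counter

# Hypothetical three-row excerpt of an AIRR rearrangement table (tab-separated).
AIRR_TSV = (
    "sequence_id\tclone_id\tv_call\n"
    "seq1\t1\tIGHV3-23*01\n"
    "seq2\t1\tIGHV3-23*01\n"
    "seq3\t2\tIGHV1-69*01\n"
)


def clone_sizes(handle):
    """Count sequences per clone_id in an AIRR-format TSV."""
    reader = csv.DictReader(handle, delimiter="\t")
    return Counter(row["clone_id"] for row in reader)


sizes = clone_sizes(io.StringIO(AIRR_TSV))
print(sizes.most_common())  # [('1', 2), ('2', 1)]
```

To run this on real output, replace the `io.StringIO` handle with an opened results file, for example `open(path, newline="")`.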
