Commit 6aaefe4

Merge pull request #6 from OpenOmics/dev
Version 3 Update
2 parents 30b44ae + 767d53b commit 6aaefe4

21 files changed, with 1822 additions and 149 deletions

.github/workflows/main.yaml

Lines changed: 20 additions & 0 deletions
@@ -32,3 +32,23 @@ jobs:
       run: |
         docker run -v $PWD:/opt2 snakemake/snakemake:v5.24.2 snakemake --lint -s /opt2/output/workflow/Snakefile -d /opt2/output || \
         echo 'There may have been a few warnings or errors. Please read through the log to determine if its harmless.'
+  Dry_Run_and_Lint_cellranger:
+    runs-on: ubuntu-latest
+    steps:
+    - uses: actions/checkout@v2
+    - uses: docker://snakemake/snakemake:v5.24.2
+    - name: Dry Run with test data
+      run: |
+        docker run -v $PWD:/opt2 snakemake/snakemake:v5.24.2 \
+        /opt2/cell-seek run --input \
+        /opt2/.tests/WT/ \
+        --output /opt2/output --genome hg38 --pipeline gex --cellranger 8.0.0 --mode local --dry-run
+    - name: View the pipeline config file
+      run: |
+        echo "Generated config file for pipeline...." && cat $PWD/output/config.json
+    - name: Lint Workflow
+      continue-on-error: true
+      run: |
+        docker run -v $PWD:/opt2 snakemake/snakemake:v5.24.2 snakemake --lint -s /opt2/output/workflow/Snakefile -d /opt2/output || \
+        echo 'There may have been a few warnings or errors. Please read through the log to determine if its harmless.'
+

.tests/WT/outs/web_summary.html

Whitespace-only changes.

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-2.0.1
+3.0.0

cell-seek

Lines changed: 86 additions & 16 deletions
@@ -314,7 +314,7 @@ def parsed_arguments(name, description):
                 [--aggregate {{mapped, none}}][--libraries LIBRARIES] \\
                 [--features FEATURES] [--cmo-reference CMOREFERENCE] \\
                 [--cmo-sample CMOSAMPLE] [--exclude-introns] [--filter FILTER] \\
-                [--create-bam] [--rename RENAMEFILE] \\
+                [--create-bam] [--rename RENAMEFILE] [--forcecells FORCECELLS]\\
                 --input INPUT [INPUT ...] \\
                 --output OUTPUT \\
                 --pipeline {{gex, ...}} \\
@@ -324,17 +324,25 @@ def parsed_arguments(name, description):
 
         {3}{4}Description:{5}
           To run the cell-seek pipeline with your raw data, please
-          provide a space seperated list of FastQ (globbing is supported) and an output
+          provide a space separated list of FastQ (globbing is supported) and an output
           directory to store results.
 
         {3}{4}Required arguments:{5}
           --input INPUT [INPUT ...]
-                              Input FastQ file(s) to process. The pipeline does NOT
-                              support single-end data. FastQ files for one or more
-                              samples can be provided. Multiple input FastQ files
-                              should be seperated by a space. Globbing for multiple
-                              file is also supported.
+                              Input FastQ file(s) or Cell Ranger output folders to
+                              process. The pipeline does NOT support single-end data.
+                              FastQ files for one or more samples can be provided.
+                              Multiple input FastQ files per sample can be provided.
+                              Multiple input FastQ files should be separated by a
+                              space.
+                              Cell Ranger output folders can be provided. It is
+                              expected that the outs folder is contained within the
+                              Cell Ranger output folders.
+                              Globbing for multiple files/folders is also supported.
+                              FastQ Input:
                                 Example: --input .tests/*.R?.fastq.gz
+                              Cell Ranger Input:
+                                Example: --input .tests/*/
           --output OUTPUT
                               Path to an output directory. This location is where
                               the pipeline will create all of its output files, also
@@ -359,11 +367,11 @@ def parsed_arguments(name, description):
                               options: hg38, mm10, hg2024, mm2024.
                                 Example: --genome hg38
         {3}{4}Analysis options:{5}
-          --cellranger {{7.1.0, 7.2.0, 8.0.0}}
+          --cellranger {{7.1.0, 7.2.0, 8.0.0, 9.0.0}}
                               The version of CellRanger to run. This option specifies
                               which version of CellRanger to use when running GEX,
                               CITE, or MULTI. Please select one of the following
-                              options: 7.1.0, 7.2.0, 8.0.0
+                              options: 7.1.0, 7.2.0, 8.0.0, 9.0.0
                                 Example: --cellranger 7.1.0
           --aggregate {{mapped,none}}
                               Cell Ranger aggregate. This option defines the
@@ -372,9 +380,11 @@ def parsed_arguments(name, description):
                               from higher depth samples until each library type has an
                               equal number of reads per cell that are confidently mapped.
                               None means to not normalize at all. If this flag is not
-                              used then aggregate will not be run. To run Cell Ranger
-                              aggregate, please select one of the following options:
-                              mapped, none.
+                              used then aggregate will not be run. Aggregate analysis
+                              is generally not needed, but it can be used to generate a
+                              Loupe Browser file for interactive exploration of the data.
+                              To run Cell Ranger aggregate, please select one of the
+                              following options: mapped, none.
                                 Example: --aggregate mapped
           --libraries LIBRARIES
                               Libraries file. A CSV file containing information about
@@ -556,16 +566,67 @@ def parsed_arguments(name, description):
                               Here is an example rename.csv file:
                                 FASTQ,Name
                                 original_name1,new_name1
-                                original_name2,new_name1
-                                original_name3,new_name2
-                                original_name4,new_name3
+                                original_name2,new_name2
+                                original_name3,new_name3
+                                original_name3-2,new_name3
+                                original_name4,original_name4
+                              where:
+                                • FASTQ: The name that is used in the FASTQ file
+                                • Name: Unique sample ID that is the sample name used for
+                                  Cell Ranger count.
                               In this example, new_name3 has FASTQ files with two different
                               names. With this input, both sets of FASTQ files will be used
                               when processing the sample as new_name3. original_name4 will not
                               be renamed. Any FASTQ file that does not have the name
                               original_name1, original_name2, original_name3, or original_name4
                               will not be run.
                                 Example: --rename rename.csv
+          --forcecells FORCECELLS
+                              Force cells file. A CSV file containing the name of the sample
+                              (the Cell Ranger outputted name) and the number of cells to
+                              force the sample to. This flag is applicable when using the GEX,
+                              CITE, MULTI, and ATAC pipelines. It will generally be used if
+                              the first analysis run appears to do a poor job at estimating
+                              the number of cells, and a re-run is needed to adjust the number
+                              of cells in the sample.
+
+                              This file can be created in two different formats. The first one
+                              can be used for the GEX, CITE, MULTI, and ATAC pipelines. It
+                              will contain the name of the sample and the number of cells
+                              to be forced to.
+                              Here is an example forcecells.csv file:
+                                Sample,Cells
+                                Sample1,3000
+                                Sample2,5000
+                              where:
+                                • Sample: The sample name used as the Cell Ranger output
+                                • Cells: The number of cells the sample should be forced to
+                              In this example, Sample1 and Sample2 will be run while being forced
+                              to have 3000 and 5000 cells respectively. Any other samples that
+                              are processed will be run without using the force cells flag and
+                              will use the default cell calling algorithm.
+
+                              The second format is only compatible with the MULTI pipeline and
+                              would be used when hashtag multiplexing is used and the number of
+                              cells needs to be forced for a specific hashtagged sample.
+                              Here is an example forcecells.csv file:
+                                Name,Sample,Cells
+                                Library1,HTO_1,3000
+                                Library1,HTO_2,5000
+                              where:
+                                • Name: The name of the library that is provided to Cell
+                                  Ranger when running multi analysis. This should match the
+                                  name that is given in the libraries.csv file.
+                                • Sample: The sample ID used for the associated hashtag. This
+                                  will have to match the value used in the CMO sample file or
+                                  the CMO reference file that is provided as input. If only a
+                                  CMO reference file is provided, the pipeline default assigns
+                                  each hashtag with the IDs of HTO_1, HTO_2, etc.
+                                • Cells: The number of cells the sample should be forced to
+                              In this example, the hashtags HTO_1 and HTO_2 in Library1 will
+                              be run while being forced to 3000 and 5000 cells respectively.
+                              Any other libraries or samples that are processed will be run
+                              without using the force cells flag.
 
         {3}{4}Orchestration options:{5}
           --mode {{slurm,local}}
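
The two forcecells.csv layouts described in the help text above can be told apart by their header row. The following is a minimal sketch of such a reader, not the pipeline's own parser: the function name `parse_forcecells` and the dictionary it returns are assumptions made for illustration, and only the column names (`Sample`, `Cells`, `Name`) come from the help text.

```python
import csv
import io


def parse_forcecells(handle):
    """Read a forcecells CSV and return a {key: cell_count} mapping.

    Hypothetical helper: the key is the sample name for the two-column
    layout, or a (library, hashtag_sample) tuple for the three-column
    MULTI layout.
    """
    reader = csv.DictReader(handle)
    fields = set(reader.fieldnames or [])
    if {"Name", "Sample", "Cells"} <= fields:
        # MULTI-only layout: one row per hashtagged sample in a library
        make_key = lambda row: (row["Name"], row["Sample"])
    elif {"Sample", "Cells"} <= fields:
        # General layout: one row per Cell Ranger sample
        make_key = lambda row: row["Sample"]
    else:
        raise ValueError("Expected header 'Sample,Cells' or 'Name,Sample,Cells'")
    return {make_key(row): int(row["Cells"]) for row in reader}


# The two example files from the help text
simple = parse_forcecells(io.StringIO("Sample,Cells\nSample1,3000\nSample2,5000\n"))
multi = parse_forcecells(
    io.StringIO("Name,Sample,Cells\nLibrary1,HTO_1,3000\nLibrary1,HTO_2,5000\n")
)
print(simple)
print(multi)
```

A sample missing from the mapping would then fall back to the default cell calling, which matches the behavior the help text describes for samples not listed in the file.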
@@ -836,7 +897,16 @@ def parsed_arguments(name, description):
         type = str.lower,
         required = False,
         default = "",
-        choices = ['7.1.0', '7.2.0', '8.0.0'],
+        choices = ['7.1.0', '7.2.0', '8.0.0', '9.0.0'],
+        help = argparse.SUPPRESS
+    )
+
+    # Number of cells to force samples to when running Cell Ranger analysis
+    subparser_run.add_argument(
+        '--forcecells',
+        # Check if the file exists and if it is readable
+        type = lambda file: permissions(parser, file, os.R_OK),
+        required = False,
         help = argparse.SUPPRESS
     )
 
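
The new `--forcecells` option validates its value through a `type` callable. The body of `permissions` is not part of this diff, so the sketch below uses a stand-in called `readable_file` (a hypothetical name) to show the general argparse pattern of checking that a path exists and is readable while the arguments are parsed. It may differ from what the repository's helper actually does.

```python
import argparse
import os
import tempfile


def readable_file(parser, path, mode=os.R_OK):
    """Hypothetical stand-in for `permissions`: exit with a usage error
    unless `path` exists and is accessible with `mode`."""
    if not os.path.exists(path):
        parser.error(f"'{path}' does not exist")
    if not os.access(path, mode):
        parser.error(f"'{path}' is not readable")
    return os.path.abspath(path)


parser = argparse.ArgumentParser(prog="demo")
parser.add_argument(
    "--forcecells",
    # argparse calls this with the raw string from the command line
    type=lambda file: readable_file(parser, file),
    required=False,
)

# Create a small file so that the check has something real to inspect
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write("Sample,Cells\nSample1,3000\n")

args = parser.parse_args(["--forcecells", tmp.name])
absent = parser.parse_args([])  # the option is optional, so this is allowed
os.unlink(tmp.name)
print(args.forcecells, absent.forcecells)
```

Because the check runs inside argparse, a bad path is reported as a normal usage error before any pipeline work starts.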

config/cluster.json

Lines changed: 6 additions & 0 deletions
@@ -25,5 +25,11 @@
         "threads": "16",
         "mem": "96g",
         "time": "1-00:00:00"
+    },
+    "seuratIntegrate": {
+        "threads": "8",
+        "mem": "350g",
+        "partition": "largemem",
+        "time": "1-00:00:00"
     }
 }
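
To illustrate how a per-rule entry such as `seuratIntegrate` might be consumed, here is a minimal sketch that merges a rule's settings over a default entry and turns them into Slurm `sbatch` options. The `__default__` entry and its values, and the helper names `resources_for` and `sbatch_flags`, are assumptions for this example; only the `seuratIntegrate` values are taken from the diff above, and the pipeline may build its submission command differently.

```python
import json

# Excerpt modeled on config/cluster.json; the "__default__" entry and its
# values are assumed for illustration and do not appear in this diff.
CLUSTER = json.loads("""
{
    "__default__": {"threads": "4", "mem": "8g", "partition": "norm", "time": "0-04:00:00"},
    "seuratIntegrate": {"threads": "8", "mem": "350g", "partition": "largemem", "time": "1-00:00:00"}
}
""")


def resources_for(rule, cluster=CLUSTER):
    """Return the rule's resources, falling back to the default entry."""
    merged = dict(cluster.get("__default__", {}))
    merged.update(cluster.get(rule, {}))
    return merged


def sbatch_flags(rule, cluster=CLUSTER):
    """Translate a rule's resources into a list of sbatch options."""
    res = resources_for(rule, cluster)
    return [
        "--cpus-per-task", res["threads"],
        "--mem", res["mem"],
        "--partition", res["partition"],
        "--time", res["time"],
    ]


print(" ".join(sbatch_flags("seuratIntegrate")))
```

A rule without its own entry would inherit every value from the default entry, so only rules with unusual needs, such as the 350g memory request here, have to be listed.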

config/modules.json

Lines changed: 3 additions & 2 deletions
@@ -1,10 +1,11 @@
 {
     "tools": {
-        "cellranger": {"7.1.0": "cellranger/7.1.0", "7.2.0": "cellranger/7.2.0", "8.0.0": "cellranger/8.0.0"},
+        "cellranger": {"7.1.0": "cellranger/7.1.0", "7.2.0": "cellranger/7.2.0", "8.0.0": "cellranger/8.0.0", "9.0.0": "cellranger/9.0.0"},
         "cellranger-atac": "cellranger-atac/2.1.0",
         "cellranger-arc": "cellranger-arc/2.0.1",
         "python2": "python/2.7",
-        "python3": "python/3.8"
+        "python3": "python/3.8",
+        "rversion": "R/4.4.0"
     },
     "r_libs": {
         "ext": "/data/OpenOmics/references/cyte-seek/R/4.1/library"
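
The `cellranger` entry maps each value accepted by `--cellranger` to an environment module name. The sketch below shows one way that lookup could work, using an excerpt of the `tools` block from this diff. The function `cellranger_module` is a hypothetical helper, not a function from the repository.

```python
import json

# Excerpt of the "tools" block as it reads after this commit
MODULES = json.loads("""
{
    "tools": {
        "cellranger": {
            "7.1.0": "cellranger/7.1.0",
            "7.2.0": "cellranger/7.2.0",
            "8.0.0": "cellranger/8.0.0",
            "9.0.0": "cellranger/9.0.0"
        },
        "rversion": "R/4.4.0"
    }
}
""")


def cellranger_module(version, tools=MODULES["tools"]):
    """Hypothetical helper: map a --cellranger value to its module name."""
    try:
        return tools["cellranger"][version]
    except KeyError:
        supported = ", ".join(sorted(tools["cellranger"]))
        raise ValueError(
            f"Unsupported Cell Ranger version {version!r}; choose from: {supported}"
        ) from None


print(cellranger_module("9.0.0"))
```

Keeping the version-to-module mapping in JSON means that supporting a new Cell Ranger release needs only a new key here plus a matching entry in the `choices` list of the argument parser.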
