@@ -314,7 +314,7 @@ def parsed_arguments(name, description):
314314 [--aggregate {{mapped, none}}][--libraries LIBRARIES] \\
315315 [--features FEATURES] [--cmo-reference CMOREFERENCE] \\
316316 [--cmo-sample CMOSAMPLE] [--exclude-introns] [--filter FILTER] \\
317- [--create-bam] [--rename RENAMEFILE] \\
317+ [--create-bam] [--rename RENAMEFILE] [--forcecells FORCECELLS] \\
318318 --input INPUT [INPUT ...] \\
319319 --output OUTPUT \\
320320 --pipeline {{gex, ...}} \\
@@ -324,17 +324,25 @@ def parsed_arguments(name, description):
324324
325325 {3}{4}Description:{5}
326326 To run the cell-seek pipeline with your raw data, please
327- provide a space seperated list of FastQ (globbing is supported) and an output
327+ provide a space separated list of FastQ (globbing is supported) and an output
328328 directory to store results.
329329
330330 {3}{4}Required arguments:{5}
331331 --input INPUT [INPUT ...]
332- Input FastQ file(s) to process. The pipeline does NOT
333- support single-end data. FastQ files for one or more
334- samples can be provided. Multiple input FastQ files
335- should be seperated by a space. Globbing for multiple
336- file is also supported.
332+ Input FastQ file(s) or Cell Ranger output folders to
333+ process. The pipeline does NOT support single-end data.
334+ FastQ files for one or more samples can be provided.
335+ Multiple input FastQ files per sample can be provided.
336+ Multiple input FastQ files should be separated by a
337+ space.
338+ Cell Ranger output folders can also be provided. Each
339+ Cell Ranger output folder is expected to contain an
340+ outs folder.
341+ Globbing for multiple files/folders is also supported.
342+ FastQ Input:
337343 Example: --input .tests/*.R?.fastq.gz
344+ Cell Ranger Input:
345+ Example: --input .tests/*/
338346 --output OUTPUT
339347 Path to an output directory. This location is where
340348 the pipeline will create all of its output files, also
@@ -359,11 +367,11 @@ def parsed_arguments(name, description):
359367 options: hg38, mm10, hg2024, mm2024.
360368 Example: --genome hg38
361369 {3}{4}Analysis options:{5}
362- --cellranger {{7.1.0, 7.2.0, 8.0.0}}
370+ --cellranger {{7.1.0, 7.2.0, 8.0.0, 9.0.0}}
363371 The version of CellRanger to run. This option specifies
364372 which version of CellRanger to use when running GEX,
365373 CITE, or MULTI. Please select one of the following
366- options: 7.1.0, 7.2.0, 8.0.0
374+ options: 7.1.0, 7.2.0, 8.0.0, 9.0.0
367375 Example: --cellranger 7.1.0
368376 --aggregate {{mapped,none}}
369377 Cell Ranger aggregate. This option defines the
@@ -372,9 +380,11 @@ def parsed_arguments(name, description):
372380 from higher depth samples until each library type has an
373381 equal number of reads per cell that are confidently mapped.
374382 None means to not normalize at all. If this flag is not
375- used then aggregate will not be run. To run Cell Ranger
376- aggregate, please select one of the following options:
377- mapped, none.
383+ used then aggregate will not be run. Aggregate analysis
384+ is generally not needed, but it can be used to generate a
385+ Loupe Browser file for interactive exploration of the data.
386+ To run Cell Ranger aggregate, please select one of the
387+ following options: mapped, none.
378388 Example: --aggregate mapped
379389 --libraries LIBRARIES
380390 Libraries file. A CSV file containing information about
@@ -556,16 +566,67 @@ def parsed_arguments(name, description):
556566 Here is an example rename.csv file:
557567 FASTQ,Name
558568 original_name1,new_name1
559- original_name2,new_name1
560- original_name3,new_name2
561- original_name4,new_name3
569+ original_name2,new_name2
570+ original_name3,new_name3
571+ original_name3-2,new_name3
572+ original_name4,original_name4
573+ where:
574+ • FASTQ: The name that is used in the FASTQ file
575+ • Name: Unique sample ID that is the sample name used for
576+ Cell Ranger count.
562577 In this example, new_name3 has FASTQ files with two different
563578 names. With this input, both sets of FASTQ files will be used
564579 when processing the sample as new_name3. original_name4 will not
565580 be renamed. Any FASTQ file that does not have the name
566581 original_name1, original_name2, original_name3, or original_name4
567582 will not be run.
568583 Example: --rename rename.csv
584+ --forcecells FORCECELLS
585+ Force cells file. A CSV file containing the name of the sample
586+ (the Cell Ranger outputted name) and the number of cells to
587+ force the sample to. This flag is applicable when using the GEX,
588+ CITE, MULTI, and ATAC pipelines. It will generally be used if
589+ the first analysis run appears to do a poor job at estimating
590+ the number of cells, and a re-run is needed to adjust the number
591+ of cells in the sample.
592+
593+ This file can be created in two different formats. The first one
594+ can be used for the GEX, CITE, MULTI, and ATAC pipelines. It
595+ will contain the name of the sample and the number of cells
596+ to be forced to.
597+ Here is an example forcecells.csv file:
598+ Sample,Cells
599+ Sample1,3000
600+ Sample2,5000
601+ where:
602+ • Sample: The sample name used as the Cell Ranger output
603+ • Cells: The number of cells the sample should be forced to
604+ In this example, Sample1 and Sample2 will be run while being forced
605+ to have 3000 and 5000 cells respectively. Any other samples that
606+ are processed will be run without using the force cells flag and
607+ will use the default cell calling algorithm.
608+
609+ The second format is only compatible with the MULTI pipeline and
610+ would be used when hashtag multiplexing is used and the number of
611+ cells needs to be forced for a specific hashtagged sample.
612+ Here is an example forcecells.csv file:
613+ Name,Sample,Cells
614+ Library1,HTO_1,3000
615+ Library1,HTO_2,5000
616+ where:
617+ • Name: The name of the library that is provided to Cell
618+ Ranger when running multi analysis. This should match the
619+ name that is given in the libraries.csv file.
620+ • Sample: The sample ID used for the associated hashtag. This
621+ will have to match the value used in the CMO sample file or
622+ the CMO reference file that is provided as input. If only a
623+ CMO reference file is provided, the pipeline default assigns
624+ each hashtag with the IDs of HTO_1, HTO_2, etc.
625+ • Cells: The number of cells the sample should be forced to
626+ In this example, the hashtags HTO_1 and HTO_2 in Library1 will
627+ be run while being forced to 3000 and 5000 cells respectively.
628+ Any other libraries or samples that are processed will be run
629+ without using the force cells flag.
569630
570631 {3}{4}Orchestration options:{5}
571632 --mode {{slurm,local}}
@@ -836,7 +897,16 @@ def parsed_arguments(name, description):
836897 type = str.lower,
837898 required = False,
838899 default = "",
839- choices = ['7.1.0', '7.2.0', '8.0.0'],
900+ choices = ['7.1.0', '7.2.0', '8.0.0', '9.0.0'],
901+ help = argparse.SUPPRESS
902+ )
903+
904+ # Number of cells to force samples to when running Cell Ranger analysis
905+ subparser_run.add_argument(
906+ '--forcecells',
907+ # Check if the file exists and if it is readable
908+ type = lambda file: permissions(parser, file, os.R_OK),
909+ required = False,
840910 help = argparse.SUPPRESS
842912