manipulateGTCs_wiki
All possible command line arguments for method manipulateGTCs:
| Argument | Defaults | Description |
|---|---|---|
| --bpm | required by user | Full path to the bead pool manifest file (.bpm); must be the same one used to generate the gtc files |
| --gtcDir | required by user | Full path to the directory/folder containing the gtc files to process (files must end in .gtc). Sub-directories are not searched recursively |
| --updates | required by user | Full path to the file containing the snps and/or metadata to update. An example file can be found here or a detailed explanation can be found here. |
| --overrides | optional, default=None | A tab-delimited text file used to temporarily update snps listed in the bpm file (not the GTC!), one snp per line, giving the snp name and the allele change. Ex: rs12248560.1 [T/A] updates snp rs12248560.1 to have alleles T and A instead of what is listed in the bpm. An example file can be found here or a detailed explanation can be found here. |
| --outDir | optional, default=current working directory | Full path to the directory or folder for output results. If the path does not exist, the program will attempt to create it |
| --logName | optional, default=gtcFuncs.log | Name of the log file to output; it will be created in the --outDir directory |
| --modDir | optional, default=current working directory | Full path to the module .py files from github; the default is the current working directory with the modules folder appended |
The minimum command required to manipulate gtc files is the following:
python3 gtcFuncs.py manipulateGTCs --bpm /path/to/manifest.bpm --gtcDir /path/to/gtcLocations/ --updates myUpdates.txt
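Before running the command, it can help to confirm which files `--gtcDir` will actually pick up. The following is a minimal pre-flight sketch, not part of gtcFuncs.py: the helper names `list_gtc_files` and `minimal_command` are hypothetical, and the only behavior assumed is what the table above states (files must end in .gtc, and sub-directories are not searched).

```python
from pathlib import Path


def list_gtc_files(gtc_dir):
    """Return the .gtc files directly inside gtc_dir, sorted by name.

    Hypothetical helper: mirrors the documented --gtcDir behavior
    (no recursion into sub-directories); it is not the tool's own code.
    """
    return sorted(
        p for p in Path(gtc_dir).iterdir()
        if p.is_file() and p.suffix == ".gtc"
    )


def minimal_command(bpm, gtc_dir, updates):
    """Build the minimal manipulateGTCs command as an argument list."""
    return [
        "python3", "gtcFuncs.py", "manipulateGTCs",
        "--bpm", str(bpm),
        "--gtcDir", str(gtc_dir),
        "--updates", str(updates),
    ]
```

The list returned by `minimal_command` can be passed to `subprocess.run` as-is, which avoids shell quoting problems with paths that contain spaces.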
The --updates file is tab-delimited; do not use spaces as separators. Each tab marks the start of the next column.
For the sample update and metadata line:
- first column = must start with the character ">" for each new sample, directly followed by the name of the base gtc
- second column = name of the new gtc to be created
- third column = comma-separated list of keywords in the gtc to update. Each keyword must be followed by "=" and its new value (no spaces before or after the "="). Possible keywords are the following: sampleName, sentrixBarcode, plateName, well, and sex. Only use keywords that need to be updated. Any keyword not listed will inherit its value from the base gtc file.
For snp updates:
- one snp per line
- first column = the name of the snp as listed in the bpm file
- second column = the alleles to change it to. Must always be an allele pair.
- The snp updates apply to the current sample until the next sample is reached, as indicated by the next metadata line starting with ">"
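The layout described above can be illustrated with a small parser. This is a sketch only, not the code gtcFuncs.py uses: the names `SampleUpdate` and `parse_updates` are hypothetical, the sample file names and values in `example` are made up, and the allele pair is kept as an opaque string because this page does not specify its notation in the updates file.

```python
from dataclasses import dataclass, field

# Keywords this page lists as valid in the third column of a ">" line.
VALID_KEYS = {"sampleName", "sentrixBarcode", "plateName", "well", "sex"}


@dataclass
class SampleUpdate:
    base_gtc: str                                   # name after ">"
    new_gtc: str                                    # second column
    metadata: dict = field(default_factory=dict)    # keyword=value pairs
    snps: dict = field(default_factory=dict)        # snp name -> allele pair


def parse_updates(text):
    """Parse updates-file text into a list of SampleUpdate records."""
    samples = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        cols = line.split("\t")
        if len(cols) < 2:
            raise ValueError(f"line {lineno}: expected tab-delimited columns")
        if cols[0].startswith(">"):
            meta = {}
            if len(cols) > 2 and cols[2]:
                for item in cols[2].split(","):
                    key, sep, value = item.partition("=")
                    if not sep or key not in VALID_KEYS:
                        raise ValueError(f"line {lineno}: bad keyword {item!r}")
                    meta[key] = value
            samples.append(SampleUpdate(cols[0][1:], cols[1], meta))
        else:
            if not samples:
                raise ValueError(f"line {lineno}: snp line before any '>' line")
            # snp updates attach to the most recent ">" sample line
            samples[-1].snps[cols[0]] = cols[1]
    return samples


# Hypothetical input: two samples, the first with one snp update.
example = (
    ">base1\tnew1\tsampleName=NA12878,sex=F\n"
    "rs12248560.1\tTA\n"
    ">base2\tnew2\twell=A01\n"
)
samples = parse_updates(example)
```

Attaching each snp line to `samples[-1]` is what implements the rule that snp updates belong to the most recent ">" line.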
Another option available exclusively to the manipulateGTCs method is the --overrides argument. If the alleles listed for a snp in the bpm need to be updated, this can be done by supplying an overrides file with this argument. This does not create an updated bpm file; it is only used so that a gtc can pass validation when the bpm lists the wrong allele(s) for a snp.
The bpm is updated only within the scope of the running program; no new bpm is written, and the original bpm on disk is left unchanged. The overrides file is tab-delimited, and the brackets "[]" surrounding the allele pair are required, along with the forward slash "/" between the alleles. An example of an override file can be seen here.
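As a rough illustration of the overrides format, the sketch below checks each line for a tab separator, the surrounding brackets, and the forward slash. It is an assumption-laden example, not the validation gtcFuncs.py performs: `parse_overrides` and `OVERRIDE_RE` are hypothetical names, and the set of characters accepted as an allele is deliberately left open.

```python
import re

# "[X/Y]": brackets and slash are required; the allele tokens themselves are
# not restricted here because this page does not list the allowed values.
OVERRIDE_RE = re.compile(r"^\[([^\[\]/\s]+)/([^\[\]/\s]+)\]$")


def parse_overrides(text):
    """Map snp name -> (allele1, allele2) from overrides-file text."""
    overrides = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        cols = line.split("\t")
        if len(cols) != 2:
            raise ValueError(f"line {lineno}: expected 2 tab-delimited columns")
        name, pair = cols
        match = OVERRIDE_RE.match(pair.strip())
        if match is None:
            raise ValueError(f"line {lineno}: allele pair must look like [T/A]")
        overrides[name] = (match.group(1), match.group(2))
    return overrides


parsed = parse_overrides("rs12248560.1\t[T/A]\n")
```

A line that uses spaces instead of a tab, or that omits the brackets or slash, raises `ValueError` with the offending line number.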
Here is an example of how this looks on the command line:
python3 gtcFuncs.py manipulateGTCs --bpm /path/to/manifest.bpm --gtcDir /path/to/gtcLocations/ --updates examples/example_gtcManipulationFile_input.txt --overrides examples/override.txt --outDir /path/to/output/Directory/



