@@ -213,7 +213,7 @@ specific commands to see if they apply.
  *--regions-overlap* '0'|'1'|'2'::
  This option controls how overlapping records are determined:
  set to *0* if the VCF record has to have POS inside a region
- (this corresponds to the default behavior of *-t/-T*);
+ (this corresponds to the default behavior of *-t/-T*);
  set to *1* if also overlapping records with POS outside a region
  should be included (this is the default behavior of *-r/-R*); or set
  to *2* to include only true overlapping variation (compare
@@ -278,7 +278,7 @@ The program ignores the first column and the last indicates sex (1=male, 2=femal

  *-T, --targets-file* \[^]'FILE'::
  Same as *-t, --targets*, but reads regions from a file. Note that *-T*
- cannot be used in combination with *-t*.
+ cannot be used in combination with *-t*.
  +
  With the *call -C* 'alleles' command, the third column of the targets file must
  be a comma-separated list of alleles, starting with the reference allele.
@@ -478,7 +478,7 @@ Add or remove annotations.
  *--single-overlaps*::
  use this option to keep memory requirements low with very large annotation
  files. Note, however, that this comes at a cost: only single overlapping intervals
- are considered in this mode. This was the default mode until the commit
+ are considered in this mode. This was the default mode until the commit
  af6f0c9 (Feb 24 2019).

  *--threads* 'INT'::
@@ -633,7 +633,7 @@ demand. The original calling model can be invoked with the *-c* option.
  text file with sample names in the first column and group names in the second column. If '-' is
  given instead, no HWE assumption is made at all and single-sample calling is performed. (Note that
  in low coverage data this inflates the rate of false positives.) The *-G* option requires the presence of
- per-sample FORMAT/QS or FORMAT/AD tag generated with *bcftools mpileup -a QS* (or *-a AD*).
+ per-sample FORMAT/QS or FORMAT/AD tag generated with *bcftools mpileup -a QS* (or *-a AD*).

  *-g, --gvcf* 'INT'::
  output also gVCF blocks of homozygous REF calls. The parameter 'INT' is the
@@ -892,7 +892,7 @@ depth information, such as INFO/AD or FORMAT/AD. For that, consider using the

  *-H, --haplotype* '1'|'2'|'R'|'A'|'I'|'LR'|'LA'|'SR'|'SA'|'1pIu'|'2pIu'::
  choose which allele from the FORMAT/GT field to use (the codes are case-insensitive):
-
+
  '1';;
  the first allele, regardless of phasing

@@ -1018,8 +1018,8 @@ depth information, such as INFO/AD or FORMAT/AD. For that, consider using the
  ==== GEN/SAMPLE conversion:
  *-G, --gensample2vcf* 'prefix' or 'gen-file','sample-file'::
  convert IMPUTE2 output to VCF. One of the ID columns ("SNP ID" or "rsID" in
- https://www.cog-genomics.org/plink/2.0/formats#gen) must be of the form
- "CHROM:POS_REF_ALT" to detect possible strand swaps.
+ https://www.cog-genomics.org/plink/2.0/formats#gen) must be of the form
+ "CHROM:POS_REF_ALT" to detect possible strand swaps.
  {nbsp} +
  When the *--vcf-ids* option is given, the other column (autodetected) is used
  to fill the ID column of the VCF.
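As an illustration of the "CHROM:POS_REF_ALT" form required above, the following is a minimal Python sketch of how such an ID could be split into its parts. It is not how bcftools parses the column; the function name `parse_variant_id` and the sample IDs are hypothetical, and the sketch assumes that REF and ALT contain no underscores.

```python
def parse_variant_id(variant_id):
    """Split an ID of the form CHROM:POS_REF_ALT into its four parts.

    Illustrative only: the last ':' is used as the separator so that
    chromosome names containing ':' or '_' are left intact.
    """
    chrom, sep, rest = variant_id.rpartition(":")
    parts = rest.split("_")
    if not sep or not chrom or len(parts) != 3:
        raise ValueError(f"not of the form CHROM:POS_REF_ALT: {variant_id!r}")
    pos, ref, alt = parts
    return chrom, int(pos), ref, alt


# Hypothetical IDs, for illustration only
print(parse_variant_id("1:12345_A_G"))
print(parse_variant_id("chrUn_gl000220:77_AT_A"))
```

An ID that does not match the pattern, such as a plain rsID, raises `ValueError` in this sketch.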
@@ -1279,7 +1279,7 @@ output VCF and are ignored for the prediction analysis.
  #
  # Attributes required for
  # gene lines:
- # - ID=gene:<gene_id>
+ # - ID=gene:<gene_id>
  # - biotype=<biotype>
  # - Name=<gene_name> [optional]
  #
@@ -1553,7 +1553,7 @@ Without the *-g* option, multi-sample cross-check of samples in 'query.vcf.gz' i
  that average score is used to determine the top matches, not absolute values.

  *--no-HWE-prob*::
- Disable calculation of HWE probability to reduce memory requirements with
+ Disable calculation of HWE probability to reduce memory requirements with
  comparisons between a very large number of sample pairs.

  *-p, --pairs* 'LIST'::
@@ -1622,11 +1622,11 @@ Without the *-g* option, multi-sample cross-check of samples in 'query.vcf.gz' i
  // present, a constant value '99' is used for the unseen genotypes. With
  // *-G*, the value '1' can be used instead; the discordance value then
  // gives exactly the number of differing genotypes.
- //
+ //
  // ERR, error rate;;
  // Pairwise error rate calculated as number of differences divided
  // by the total number of comparisons.
- //
+ //
  // CLUSTER, TH, DOT;;
  // In presence of multiple samples, related samples and outliers can be
  // identified by clustering samples by error rate. A simple hierarchical
@@ -1861,7 +1861,7 @@ For "vertical" merge take a look at *<<concat,bcftools concat>>* or *<<norm,bcft
  alternate alleles relevant (local) for the current sample. The number 'INT' gives the
  maximum number of alternate alleles that can be included in the PL tag. The default value
  is 0 which disables the feature and outputs values for all alternate alleles.
-
+
  *-m, --merge* 'snps'|'indels'|'both'|'all'|'none'|'id'::
  The option controls what types of multiallelic records can be created:
  ----
@@ -2150,8 +2150,8 @@ INFO/DPR .. Deprecated in favor of INFO/AD; Number of high-quality bases for

  1.12 -Q13 -h100 -m1
  illumina [ default values ]
- ont -B -Q5 --max-BQ 30 --indel-bias 1.01 -I
- pacbio-ccs -D -Q5 --max-BQ 50 --indel-bias 1.01 -F0.1 -o25 -e1 -M99999
+ ont -B -Q5 --max-BQ 30 --no-indelQ-tweaks -I
+ pacbio-ccs -D -Q5 --max-BQ 50 --no-indelQ-tweaks -F0.1 -o25 -e1 -M99999

  *--ar, --ambig-reads* 'drop'|'incAD'|'incAD0'::
  What to do with ambiguous indel reads that do not span an entire
@@ -2195,6 +2195,13 @@ INFO/DPR .. Deprecated in favor of INFO/AD; Number of high-quality bases for
  Note that although the window size approximately corresponds to the maximum
  indel size considered, it is not an exact threshold [110]

+ *--no-indelQ-tweaks*::
+ Increase sensitivity of indel calling, especially from long reads.
+ The indel calling algorithm was designed for short reads and uses heuristics
+ to estimate the maximum tolerable deviation of the query sequence
+ from the reference. However, for long reads this sometimes leads to incorrect
+ rejection of valid indels.
+
  *-I, --skip-indels*::
  Do not perform INDEL calling

@@ -2256,7 +2263,7 @@ the *<<fasta_ref,--fasta-ref>>* option is supplied.
  See also *--atom-overlaps* and *--old-rec-tag*.

  *--atom-overlaps* '.'|'*'::
- Alleles missing because of an overlapping variant can be set either
+ Alleles missing because of an overlapping variant can be set either
  to missing (.) or to the star allele (*), as recommended by
  the VCF specification. IMPORTANT: Note that the asterisk is expanded
  by the shell and must be put in quotes or escaped by a backslash:
@@ -2286,7 +2293,7 @@ the *<<fasta_ref,--fasta-ref>>* option is supplied.
  can swap alleles and will update genotypes (GT) and AC counts,
  but will not attempt to fix PL or other fields. Also note, and this
  cannot be stressed enough, that 's' will NOT fix strand issues in
- your VCF, do NOT use it for that purpose!!! (Instead see
+ your VCF, do NOT use it for that purpose!!! (Instead see
  <http://samtools.github.io/bcftools/howtos/plugin.af-dist.html> and
  <http://samtools.github.io/bcftools/howtos/plugin.fixref.html>.)

@@ -2330,7 +2337,7 @@ the *<<fasta_ref,--fasta-ref>>* option is supplied.

  *--old-rec-tag* 'STR'::
  Add INFO/STR annotation with the original record. The format of the
- annotation is CHROM|POS|REF|ALT|USED_ALT_IDX.
+ annotation is CHROM|POS|REF|ALT|USED_ALT_IDX.

  *-o, --output* 'FILE'::
  see *<<common_options,Common Options>>*
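To make the CHROM|POS|REF|ALT|USED_ALT_IDX layout of the *--old-rec-tag* annotation concrete, here is a small Python sketch that splits such a value into named fields. The function name `parse_old_rec` and the example value are hypothetical. The sketch assumes that all five fields are present and that ALT may hold a comma-separated list, and it makes no claim about whether the index is 0-based or 1-based.

```python
def parse_old_rec(value):
    """Split a CHROM|POS|REF|ALT|USED_ALT_IDX string into a dict.

    Illustrative only: assumes exactly five '|'-separated fields.
    """
    fields = value.split("|")
    if len(fields) != 5:
        raise ValueError(f"expected 5 '|'-separated fields, got {len(fields)}")
    chrom, pos, ref, alt, used_alt_idx = fields
    return {
        "chrom": chrom,
        "pos": int(pos),
        "ref": ref,
        "alt": alt.split(","),  # original ALT alleles, possibly several
        "used_alt_idx": int(used_alt_idx),
    }


# Hypothetical annotation value, for illustration only
print(parse_old_rec("chr1|1000|AC|A,ACC|2"))
```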
@@ -2949,11 +2956,11 @@ Transition probabilities:

  *-M, --rec-rate* 'FLOAT'::
  constant recombination rate per bp. In combination with *--genetic-map*,
- the *--rec-rate* parameter is interpreted differently, as 'FLOAT'-fold increase of
+ the *--rec-rate* parameter is interpreted differently, as 'FLOAT'-fold increase of
  transition probabilities, which allows the model to become more sensitive
  yet still account for recombination hotspots. Note that also the range
  of the values is therefore different in both cases: normally the
- parameter will be in the range (1e-3,1e-9) but with *--genetic-map*
+ parameter will be in the range (1e-3,1e-9) but with *--genetic-map*
  it will be in the range (10,1000).

  *-o, --output* 'FILE'::
@@ -3192,7 +3199,7 @@ Convert between VCF and BCF. Former *bcftools subset*.
  Note that filter options below dealing with counting the number of alleles
  will, for speed, first check for the values of AC and AN in the INFO column to
  avoid parsing all the genotype (FORMAT/GT) fields in the VCF. This means
- that a filter like '--min-af 0.1' will be calculated from INFO/AC and INFO/AN
+ that a filter like '--min-af 0.1' will be calculated from INFO/AC and INFO/AN
  when available or FORMAT/GT otherwise. However, it will not attempt to use any other existing
  field, like INFO/AF for example. For that, use '--exclude AF<0.1' instead.

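The lookup order described in this hunk (INFO/AC and INFO/AN first, FORMAT/GT as a fallback, INFO/AF never) can be sketched in Python as follows. This is a simplified model rather than the bcftools implementation: the function name `nonref_frequency`, the dict and tuple data layout, and the choice to pool all non-reference alleles into one frequency are assumptions made for the example.

```python
def nonref_frequency(info, genotypes):
    """Frequency of non-reference alleles, preferring INFO/AC and INFO/AN.

    info      : dict of INFO fields; AC is a list (one count per ALT allele)
    genotypes : list of per-sample allele tuples, None marks a missing allele
    Any existing INFO/AF value is deliberately ignored.
    """
    if "AC" in info and "AN" in info:
        ac, an = sum(info["AC"]), info["AN"]
    else:
        # Fallback: count alleles directly from the genotypes
        alleles = [a for gt in genotypes for a in gt if a is not None]
        an = len(alleles)
        ac = sum(1 for a in alleles if a > 0)
    return ac / an if an else 0.0


# AC/AN are present, so the (inconsistent) AF value plays no role
print(nonref_frequency({"AC": [3], "AN": 20, "AF": [0.5]}, []))
# No AC/AN: fall back to counting alleles in the genotypes
print(nonref_frequency({}, [(0, 1), (1, 1), (0, 0), (None, None)]))
```

In this model a threshold of 0.1 would keep both records, because the first is evaluated as 3/20 and the second as 3/6.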
@@ -3411,7 +3418,7 @@ to require that all alleles are of the given type. Compare
  * array subscripts (0-based), "*" for any element, "-" to indicate a range. Note that
  for querying FORMAT vectors, the colon ":" can be used to select a sample and an
  element of the vector, as shown in the examples below
-
+
  INFO/AF[0] > 0.3 .. first AF value bigger than 0.3
  FORMAT/AD[0:0] > 30 .. first AD value of the first sample bigger than 30
  FORMAT/AD[0:1] .. first sample, second AD value
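The 0-based "sample:element" addressing in these examples maps directly onto nested list indexing. The Python sketch below mirrors the three expressions on made-up values; the variable names, the helper `format_value`, and the data are hypothetical and only model the indexing, not the bcftools expression engine.

```python
# Hypothetical values for one record: INFO/AF has one entry per ALT allele,
# FORMAT/AD has one vector per sample.
info_af = [0.35, 0.10]
format_ad = [[35, 4], [12, 20]]


def format_value(per_sample, sample, element):
    """Return element `element` of the vector of sample `sample` (both 0-based)."""
    return per_sample[sample][element]


print(info_af[0] > 0.3)                     # INFO/AF[0] > 0.3
print(format_value(format_ad, 0, 0) > 30)   # FORMAT/AD[0:0] > 30
print(format_value(format_ad, 0, 1))        # FORMAT/AD[0:1]
```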
@@ -3524,7 +3531,7 @@ used on the result. For example, when querying "TAG=1,2,3,4", it will be evaluat

  TYPE="snp" && QUAL>=10 && (DP4[2]+DP4[3] > 2)

- COUNT(GT="hom")=0 .. no homozygous genotypes at the site
+ COUNT(GT="hom")=0 .. no homozygous genotypes at the site

  AVG(GQ)>50 .. average (arithmetic mean) of genotype qualities bigger than 50

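The two aggregate expressions in this hunk, COUNT(GT="hom")=0 and AVG(GQ)>50, can be read as simple reductions over the samples at a site. The Python sketch below models them on made-up data. The helper `is_hom`, the tuple encoding of genotypes, and the treatment of missing or haploid genotypes as "not homozygous" are assumptions made for the example and are not taken from the bcftools source.

```python
from statistics import mean


def is_hom(gt):
    """True if a genotype has two or more non-missing, identical alleles."""
    return len(gt) >= 2 and None not in gt and len(set(gt)) == 1


# Hypothetical site with three samples
genotypes = [(0, 1), (0, 1), (1, 2)]
gqs = [60, 55, 48]

count_hom = sum(is_hom(gt) for gt in genotypes)
print(count_hom == 0)   # models COUNT(GT="hom")=0
print(mean(gqs) > 50)   # models AVG(GQ)>50
```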