Commit 3ecf62e

committed
add cram support section to FAQ
1 parent 7e5929f commit 3ecf62e

1 file changed

+17
-4
lines changed

docs/FAQ.md

Lines changed: 17 additions & 4 deletions
@@ -7,7 +7,7 @@
77
These may or may not have actually been asked, but it's a collection of hints that the
88
programmer understands but a casual user might not, as well as rationale.
99

10-
- Certain UCSC utilities no longer support `stdin`.
10+
- My `wigToBigWig` no longer supports `stdin`.
1111

1212
Several of the UCSC command line utilities for big files (bigWig and bigBed in
1313
particular) used to support a barely documented feature of using `stdin` or
@@ -27,16 +27,29 @@ programmer understands but a casual user might not, as well as rationale.
2727
`userApps.v439` release (October 2022), so you will need an earlier version. See
2828
the section on UCSC library in the [Advanced Install](AdvancedInstallation.md)
2929
document for hints on compiling the utilities (you don't need to install the
30-
library).
30+
Perl library in this case).
3131

32-
- Do you support `CSV` files?
32+
- Does BioToolBox support Cram files?
33+
34+
Reading Cram files is supported through the
35+
[Bio::DB::HTS](https://metacpan.org/pod/Bio::DB::HTS) Perl adapter, which in turn
36+
is dependent on the linked [HTSlib](https://github.com/samtools/htslib) library.
37+
However, the only Cram files that can be used must either have a valid reference
38+
`UR` tag in the `@SQ` header, i.e. the original local reference fasta file is
39+
still available, or have an embedded reference sequence in the Cram file itself,
40+
i.e. generated with output option `embed_ref=1`. Using an external reference
41+
fasta file is not supported, a limitation unfortunately imposed by Bio::DB::HTS,
42+
not by Bio::ToolBox. Lacking these, your best option is to back-convert the Cram
43+
file to Bam format using `samtools` prior to usage.
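
As a rough illustration of the back-conversion mentioned above, the sketch below builds a `samtools view` command that decodes a Cram file against an external reference and writes Bam. The helper name `cram_to_bam_cmd` and the file names `sample.cram` and `genome.fa` are hypothetical and not part of BioToolBox; `-b`, `-T` and `-o` are standard `samtools view` options. It assumes the fasta is the same reference that was used to create the Cram file.

```shell
# Hypothetical helper: print (but do not run) the samtools command that
# converts a Cram file to Bam using an external reference fasta.
# Usage: cram_to_bam_cmd <input.cram> <reference.fa>
cram_to_bam_cmd() {
    cram="$1"
    ref="$2"
    # Derive the output name by swapping the .cram suffix for .bam
    bam="${cram%.cram}.bam"
    # -b: write Bam, -T: reference fasta for decoding, -o: output file
    printf 'samtools view -b -T %s -o %s %s\n' "$ref" "$bam" "$cram"
}

# Print the command for review (example file names are placeholders)
cram_to_bam_cmd sample.cram genome.fa
```

The helper only prints the command so it can be checked before anything is run; it does not quote paths, so names containing spaces would need extra care. After conversion, the new Bam file would normally be indexed with `samtools index` before use.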
44+
45+
- Does BioToolBox support CSV files?
3346

3447
CSV files appear perfectly benign, but are in fact a can of worms: mandatory or
3548
optional quoting, empty or undefined values, spaces, character escaping, text
3649
encoding, and so on. This mostly affects reading files. Most (all?) bioinformatic
3750
text formats are tab-delimited, so CSV support is intentionally absent.
3851

39-
- Programs don't recognize a UCSC gene table (refFlat, knownGene, genePred, etc)
52+
- How do I get my UCSC gene table (refFlat, knownGene, genePred, etc) recognized?
4053

4154
UCSC doesn't have official file extensions, and their downloads page just
4255
has `.txt.gz` extensions. Furthermore, they don't have proper column headers. Downloads
