Read CRAM files in pure JavaScript, in Node.js or the browser. Supports CRAM 2.x
and 3.x, .crai indexes, and the bzip2 and lzma codecs.
npm install @gmod/cram

import { IndexedCramFile, CraiIndex } from '@gmod/cram'
import { IndexedFasta } from '@gmod/indexedfasta'
const fasta = new IndexedFasta({
  path: '/path/to/reference.fa',
  faiPath: '/path/to/reference.fa.fai',
})
const idToName = []
const nameToId = {}
const indexedFile = new IndexedCramFile({
  cramPath: '/path/to/file.cram',
  // alternatives: cramUrl, cramFilehandle (see generic-filehandle2)
  index: new CraiIndex({
    path: '/path/to/file.cram.crai',
    // alternatives: url, filehandle
  }),
  seqFetch: async (seqId, start, end) => {
    // seqId is numeric; coordinates are 1-based but IndexedFasta is 0-based
    return fasta.getSequence(idToName[seqId], start - 1, end)
  },
  checkSequenceMD5: false,
})
// Build numeric refId <-> name mappings from the SAM header
const samHeader = await indexedFile.cram.getSamHeader()
samHeader
  .filter(l => l.tag === 'SQ')
  .forEach((sqLine, refId) => {
    sqLine.data.forEach(item => {
      if (item.tag === 'SN') {
        nameToId[item.value] = refId
        idToName[refId] = item.value
      }
    })
  })
// Fetch records for a range (1-based, closed coordinates)
const records = await indexedFile.getRecordsForRange(
  nameToId['chr1'],
  10000,
  20000,
)
for (const record of records) {
  console.log(record.readName, record.alignmentStart, record.mappingQuality)
  // Extract variants from read features
  for (const feature of record.readFeatures ?? []) {
    if (feature.code === 'X') {
      // SNP: single base substitution
      console.log(`SNP at ${feature.refPos}: ${feature.ref}->${feature.sub}`)
    } else if (feature.code === 'I') {
      // Insertion: full inserted sequence
      console.log(`Insertion at ${feature.refPos}: ${feature.data}`)
    } else if (feature.code === 'i') {
      // Insertion: padding only (no sequence stored)
      console.log(`Insertion at ${feature.refPos} (no sequence)`)
    } else if (feature.code === 'D') {
      // Deletion: bases deleted from reference
      console.log(`Deletion at ${feature.refPos}: ${feature.data} bases`)
    }
  }
}

See the example directory for browser usage with a <script> tag and
the bundled cram-bundle.js.
For more complex operations like generating CIGAR strings from read features, see the JBrowse readFeaturesToNumericCIGAR implementation.
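To give a rough idea of what CIGAR generation involves, here is a simplified sketch. It is not the JBrowse implementation. The function name readFeaturesToCigar is hypothetical, and it assumes features arrive sorted by read position, that data is a string of bases for I, S, and b, and that data is a numeric length for D, N, H, and P. Check those assumptions against real records before relying on it.

```javascript
// Simplified sketch: derive a CIGAR string from CRAM read features.
// Assumed `data` shapes (not confirmed by this README):
//   I, S, b -> string of bases; D, N, H, P -> numeric length.
function readFeaturesToCigar(readFeatures, readLength) {
  const ops = []
  // Append an operation, merging it into the previous one when the op matches.
  const push = (op, len) => {
    if (len <= 0) return
    const last = ops[ops.length - 1]
    if (last && last.op === op) last.len += len
    else ops.push({ op, len })
  }

  let cursor = 1 // next 1-based read position not yet covered by an op
  for (const f of readFeatures ?? []) {
    // Read bases between the previous feature and this one are emitted as M.
    push('M', f.pos - cursor)
    cursor = Math.max(cursor, f.pos)

    switch (f.code) {
      case 'X': // substitution: one read base, still an alignment match (M)
      case 'B': // single base plus quality
        push('M', 1)
        cursor += 1
        break
      case 'b': // stretch of bases
        push('M', f.data.length)
        cursor += f.data.length
        break
      case 'I': // insertion with sequence
      case 'S': // soft clip
        push(f.code, f.data.length)
        cursor += f.data.length
        break
      case 'i': // single-base insertion
        push('I', 1)
        cursor += 1
        break
      case 'D': // deletion
      case 'N': // reference skip
      case 'H': // hard clip
      case 'P': // padding
        push(f.code, f.data) // consumes no read bases
        break
      default: // q, Q: quality-only features do not affect the CIGAR
        break
    }
  }
  // Whatever remains of the read is emitted as M.
  push('M', readLength - cursor + 1)
  return ops.map(o => `${o.len}${o.op}`).join('')
}
```

A record with no read features yields a single M operation spanning the whole read, which is a quick sanity check when trying this on real data.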
new IndexedCramFile({
  cramPath, // local path
  cramUrl, // remote URL
  cramFilehandle, // generic-filehandle2 compatible handle
  index, // CraiIndex instance (or any object with getEntriesForRange)
  seqFetch, // async (seqId, start, end) => string
  checkSequenceMD5, // default true; set false to avoid large reference fetches
  cacheSize, // max cached records, default 20000
})

Methods:
- getRecordsForRange(seqId, start, end, opts?) → Promise<CramRecord[]>: 1-based closed coords. opts: { viewAsPairs, pairAcrossChr, maxInsertSize }
- hasDataForReferenceSequence(seqId) → Promise<boolean>
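Since getRecordsForRange takes 1-based closed coordinates while tools such as IndexedFasta and BED files use 0-based half-open ones, a small pair of helpers can keep the conversion in one place. The helper names below are hypothetical and not part of @gmod/cram.

```javascript
// Convert a 0-based, half-open interval (BED / IndexedFasta style) into the
// 1-based, closed interval described for getRecordsForRange.
function toOneBasedClosed({ start, end }) {
  // Only the start shifts: [start, end) covers the same bases as [start + 1, end].
  return { start: start + 1, end }
}

// Inverse conversion, matching the `start - 1` used in the seqFetch callback.
function toZeroBasedHalfOpen({ start, end }) {
  return { start: start - 1, end }
}
```

Both forms describe the same number of bases (end minus start in 0-based half-open terms), so the end coordinate never changes.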
The CraiIndex constructor takes { path, url, filehandle }; one of the three is required.
CramRecord properties:
- readName: read name
- sequenceId: numeric reference ID
- alignmentStart: 1-based start position
- qualityScores: Int8Array of per-base quality scores
- readFeatures: array of read features (see below)
- tags: auxiliary tags object
Flag methods (all return boolean):
- isPaired()
- isProperlyPaired()
- isSegmentUnmapped()
- isMateUnmapped()
- isReverseComplemented()
- isMateReverseComplemented()
- isRead1()
- isRead2()
- isSecondary()
- isFailedQc()
- isDuplicate()
- isSupplementary()
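The flag method names line up with the standard SAM FLAG bits. This README does not describe how the library implements them, so the sketch below only illustrates the bit meanings on a plain integer; SAM_FLAGS and decodeSamFlags are hypothetical names, not part of @gmod/cram.

```javascript
// Standard SAM FLAG bits, keyed by the corresponding method name above.
const SAM_FLAGS = {
  isPaired: 0x1,
  isProperlyPaired: 0x2,
  isSegmentUnmapped: 0x4,
  isMateUnmapped: 0x8,
  isReverseComplemented: 0x10,
  isMateReverseComplemented: 0x20,
  isRead1: 0x40,
  isRead2: 0x80,
  isSecondary: 0x100,
  isFailedQc: 0x200,
  isDuplicate: 0x400,
  isSupplementary: 0x800,
}

// Decode an integer SAM FLAG into an object of booleans.
function decodeSamFlags(flags) {
  return Object.fromEntries(
    Object.entries(SAM_FLAGS).map(([name, bit]) => [name, (flags & bit) !== 0]),
  )
}
```

For example, a FLAG of 99 (1 + 2 + 32 + 64) decodes as paired, properly paired, mate reverse-complemented, and first in pair.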
Methods:
- getReadBases() → string: returns the read sequence string. Requires seqFetch to be configured; the bases are populated automatically by getRecordsForRange.
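When exporting reads to FASTQ or SAM text, the qualityScores array listed among the record properties usually needs to become a Phred+33 string. A minimal sketch, assuming the array holds raw Phred values with no offset applied and no missing-quality placeholders; the function name is hypothetical.

```javascript
// Encode raw Phred quality scores as a Phred+33 ASCII string (FASTQ/SAM style).
// Assumes each value is a raw Phred score; placeholder values are not handled.
function qualityToPhred33(qualityScores) {
  let out = ''
  for (const q of qualityScores) {
    out += String.fromCharCode(q + 33)
  }
  return out
}
```

The result has one character per base, so its length should match the length of the string returned by getReadBases().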
Each entry in record.readFeatures:
- code: feature type (one of bqBXIDiQNSPH, see CRAM spec §8)
- pos: read position (1-based)
- refPos: reference position (1-based)
- ref / sub: reference and substituted base (code X only)
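A quick way to see which feature types a set of records actually contains is to tally the code field. This is a small sketch that works on any array of objects shaped like the records described above; countFeatureCodes is a hypothetical helper, not part of @gmod/cram.

```javascript
// Count how often each read feature code occurs across a set of records.
function countFeatureCodes(records) {
  const counts = {}
  for (const record of records) {
    // readFeatures may be absent on a record, as in the usage example.
    for (const feature of record.readFeatures ?? []) {
      counts[feature.code] = (counts[feature.code] ?? 0) + 1
    }
  }
  return counts
}
```

Passing the array returned by getRecordsForRange gives an object keyed by feature code, for example to check whether a region contains any D or I features before doing more expensive work.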
Error classes:
- CramUnimplementedError: unimplemented spec feature
- CramMalformedError: malformed file data
- CramBufferOverrunError: read past end of data
Written with NHGRI funding as part of JBrowse. If you use this in a publication, please cite the most recent JBrowse paper at jbrowse.org.
MIT © Robert Buels
Trusted publishing via GitHub Actions.
npm version patch # or minor/major

See CODEC_SUPPORT.md.