Releases: vgteam/vg
vg 1.69.0 - Bologna
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.69.0
Buildable Source Tarball: vg-v1.69.0.tar.gz
Includes source for vg and all submodules. Use this instead of Github's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg build process needs.
Release Note! Compared to the previous v1.68.0 release, vg giraffe is faster on long reads, but may be less accurate for variant calling from HiFi reads, when using available trained DeepVariant models.
This release includes:
- vg inject now produces useful error messages when reads go out of range on paths
- vg autoindex now gives you hints about what files would help it, when it can't make the indexes it wants to make.
- vg chains subcommand for extracting top-level chains from a distance index or a snarls file for GBZ-base.
- vg inject will no longer spontaneously map SAM/BAM reads that have their mapping fields filled in but are flagged as unmapped.
- vg inject will now throw away scores for unmapped reads
- vg stats and vg inject can now understand reads that are asserted to be "mapped", but where the position/path is not provided, a thing the SAM spec does not appear to prohibit.
- Zip code trees for vg giraffe's chaining mode now have non-heuristic* distances in non-DAG snarls [*intra-chain reversals are still not handled at all]. As a practical matter, we get significant speedups on HiFi and R10 reads (especially for the slowest reads) and a tiny increase in read identity scores (though some increase and some decrease)
- vg mapping tools can now produce supplementary alignments for SAM/BAM output
- vg giraffe now implements a recombination aware chaining algorithm
- GBWTGraph can again be built for more than 64 paths
- vg find -G now includes regions of paths touched by the extracted graph
- vg haplotypes --include-reference now also includes reference paths that do not visit any snarls.
- Breaking changes to the haplotype information (.hapl) files used by vg haplotypes. Old files can no longer be used.
- Improve automatic manpage generation
- Fixed haplotypes supported by minimizers (for recombination-aware vg giraffe)
- Add tiebreak on identity for alignments with identical score (vg giraffe)
- Heuristically detect & fix when snarl ranks are sorted backwards in zip code tree
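The identity tiebreak for equal-scoring alignments listed above can be pictured as a two-key sort. The following is a minimal Python sketch of that idea, not vg's implementation; the Alignment record and its field names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    # Hypothetical record for illustration; not a vg data structure.
    name: str
    score: int
    identity: float


def rank_alignments(alignments):
    """Order alignments by score, breaking score ties on identity (both descending)."""
    return sorted(alignments, key=lambda a: (a.score, a.identity), reverse=True)


candidates = [
    Alignment("a", 100, 0.97),
    Alignment("b", 100, 0.99),  # same score as "a", higher identity
    Alignment("c", 90, 1.00),   # best identity, but lower score
]
ranked = rank_alignments(candidates)
print([a.name for a in ranked])
```

Score stays the primary key, so a lower-scoring alignment never outranks a higher-scoring one; identity only decides between alignments whose scores are equal.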
Updated Submodules
- gbwtgraph
- sdsl-lite
vg 1.68.0 - Rimbocchi
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.68.0
Buildable Source Tarball: vg-v1.68.0.tar.gz
Includes source for vg and all submodules. Use this instead of Github's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg build process needs.
This release includes:
- vg index now accepts a -w option to up-weight nodes to push the top-level chain through them when finding snarls
- Added a warning that path selection options are not compatible with vg paths -g
- vg haplotypes exits with an error if the snarl decomposition contains a cyclical top-level chain.
- scripts/check_options.py now catches if something other than "," is between shortform and longform options
- Add option vg autoindex --no-guessing to allow force-regenerating indices
- Lookup of regions within paths that are themselves subpaths (like Stella_v1p1#0#Chr4__Stella_v1p1[11578420-11580540]:0-100) should now work again.
- Add errors when using incompatible options in vg depth
- SAM-style tags are no longer lost on unmapped reads during surject
- vg's vcflib build will now use the default python3 instead of the latest installed Python (which might not have its headers)
- Add nodes as a vg filter --tsv-out field option; prints a comma-separated list of nodes traversed by the read's path
- vg giraffe now has a --softclip-penalty flag to reduce alignment scores per-base for softclips
- vg filter now has a -W/--overwite-score flag to save the scores from --rescore.
- vg filter now checks to make sure you aren't using --rescore or related options when they would do nothing.
- Internal changes in vg giraffe to allow multiple presets to potentially share settings.
- Bug fixes for chain transition distance measurement with the zip code tree in vg giraffe
- vg now supports Protobuf 30+ and its string view return types.
- vg mod now has an --invert-keep-paths option to save the complement of path names passed to --keep-paths
- vg giraffe -b hifi preset now uses a --max-min-chain-score of 100
- vg now has a libbdsg that can run is_regular_snarl() on a distance-less distance index.
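The subpath region example in the list above packs three pieces into one string: a path name, a bracketed subrange, and a queried interval. A minimal Python sketch of pulling them apart follows. The assumed shape (name, optional [start-end], optional :start-end) is inferred from that single example; parse_region is a hypothetical helper, not part of vg, and the sketch does not say how vg interprets the coordinates.

```python
import re

# Assumed shape: <path name>[<sub_start>-<sub_end>]:<start>-<end>
# Both the bracketed subrange and the trailing interval are optional here.
REGION_RE = re.compile(
    r"^(?P<name>.+?)"
    r"(?:\[(?P<sub_start>\d+)(?:-(?P<sub_end>\d+))?\])?"
    r"(?::(?P<start>\d+)-(?P<end>\d+))?$"
)


def parse_region(region):
    """Split a region string into path name, subrange, and interval."""
    m = REGION_RE.match(region)
    if m is None:
        raise ValueError(f"unparseable region: {region!r}")

    def to_int(key):
        value = m.group(key)
        return None if value is None else int(value)

    subrange = None
    if m.group("sub_start") is not None:
        subrange = (to_int("sub_start"), to_int("sub_end"))
    interval = None
    if m.group("start") is not None:
        interval = (to_int("start"), to_int("end"))
    return {"name": m.group("name"), "subrange": subrange, "interval": interval}


parsed = parse_region("Stella_v1p1#0#Chr4__Stella_v1p1[11578420-11580540]:0-100")
print(parsed)
```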
Updated Submodules
- gbwtgraph
- libbdsg
- libvgio
vg 1.67.0 - Vetria
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.67.0
Buildable Source Tarball: vg-v1.67.0.tar.gz
Includes source for vg and all submodules. Use this instead of Github's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg build process needs.
This release includes:
- GAF path end positions are calculated correctly in some edge cases.
- --keep-path can now be used multiple times in vg mod
- vg giraffe --track-correctness should no longer crash when read truth positions are on paths that exist in the graph, but are too short to reach where the read is.
- Bring vg cluster up-to-date: now accepts GBZ files, can do short-read or long-read giraffe, and allows --prefix for better compatibility with vg autoindex
- Add some options to vg cluster to help with chaining issue diagnosis: print out cyclic snarl sizes, seeds with high hit amounts
- Fix GFA haplotype sniffing for GFAs with P-lines
- Use graph metadata and not path name to determine reference/haplotype status for paths in vg call and vg deconstruct.
- Loading transcript files will now produce a human-readable error message when there are duplicate transcripts with the same ID on different paths.
- The GBWT built while sorting GAF with vg gamsort is now forward-only by default.
- vg sim now can output in FASTQ format via --fastq-out
- Make vg mod -t take an argument and stop -E from requiring one
- In vg chunk, fix the long names for -P, -c, -r, and -R, and make the latter two accept arguments.
- Register command line options correctly & put them under test (scripts/check_options.py). This involved a lot of minor bugfixes and helptext modifications, collected in a Google Doc.
- Manually wrap option helptext lines after 80 characters
- vg sim now works with sample name even when no GBWT is provided.
- CI now enforces the minimum required GCC version.
- vg now requires a minimum GCC version of 7, the oldest major version available in the Ubuntu releases we test on for CI.
- vg giraffe usage example now shows using a .zipcodes file and a .withzip.min file.
- vg can now be built with the mimalloc allocator (v3 beta)
Updated Submodules
- BBHash
- libbdsg
- libvgio
- sdsl-lite
- sparsepp
- vcflib
New Submodules
- mimalloc
Removed Submodules
- fastahack (now used via vcflib)
vg 1.66.0 - Navetta
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.66.0
Buildable Source Tarball: vg-v1.66.0.tar.gz
Includes source for vg and all submodules. Use this instead of Github's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg build process needs.
This release includes:
- vg can build on CMake 4
- Mention the new wiki page about the --tsv-out option within its helptext
- Add transcript paths to spliced transcriptome graph before saving in autoindex
- Added -u/--cut-alignment option to vg chunk to allow cutting long alignments down to the portion(s) actually in the chunk.
- Renamed GAM-related vg chunk options to be about generic alignments (with fallback aliases of the old names).
- Change README and vg giraffe's helptext to indicate that it supports long reads as well as short reads
- vg giraffe no longer un-maps the read being used for rescue when no significant alignment is found for the read being rescued
- vg can now surject to CRAM. To use an external reference, you will need to pre-populate the reference cache (see REF_PATH and REF_CACHE documentation at https://www.htslib.org/workflow/cram.html).
- vg surject now slides by twice the anchor length looking for suspicious anchors
- vg surject now uses a slightly increased gap open penalty in DP to encourage idiomatic gaps.
- vg filter --progress now shows you the progress of reading through the input file.
- Standardize capitalization in helptext
- vg surject should no longer suffer from score overflows in tail alignment from the new higher gap open penalty.
- Fix broken link in README due to wiki page rename
- giraffe-facts.py script works again (with backported changes from the long read Giraffe experiments repo) and is now under CI test
- vg surject no longer increases the gap open penalty by default, but allows it to be adjusted with --extra-gap-cost/-E.
- vg giraffe will always assign MAPQ 0 to unmapped reads in pairs
- Assertion error fixed in vg paths -n
- Add option to vg convert to promote generic-sense paths to haplotype-sense paths
- Old vg CI jobs will now be canceled automatically when a new commit is pushed to a branch
- Reads fetched from indexed GAF files are now deduplicated because one read might intersect multiple ID ranges
- Reads fetched from indexed GAF files are now parsed and checked for actual intersection with the query range
- Snarl clipping options in vg clip expanded and refined
- vg giraffe now admits to taking FASTA input with -f
- Add a CIGAR option to --tsv-out
- Add new option --track-last to track either last correct alignment (current/default) or last existing alignment in scripts/giraffe-facts.py
- vg construct progress bars should fit longer PanSN contig names
- Zip code tree iterator no longer tries to handle a stack depth that can't occur in S_SKIP_CHAIN
- Option to output the GBWT of the paths when sorting GAF with vg gamsort.
- vg clip -d and -D can now be used together (depth threshold applied to edge clipping)
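The two indexed-GAF items above (deduplication, and checking for actual intersection with the query range) can be illustrated together. This is a minimal Python sketch under stated assumptions, not vg's code: candidate records are modeled as hypothetical (read_name, node_ids) pairs, reads are identified by name, and the query is an inclusive node ID range.

```python
def fetch_reads(candidates, query_start, query_end):
    """Return names of reads touching [query_start, query_end], once each.

    candidates: iterable of (read_name, node_ids) pairs, as a coarse index
    lookup might return them. The same read can appear more than once when
    it spans several indexed ID ranges, and some candidates may not touch
    the query range at all.
    """
    seen = set()
    result = []
    for name, node_ids in candidates:
        if name in seen:
            continue  # already reported via another ID range
        if any(query_start <= node <= query_end for node in node_ids):
            seen.add(name)
            result.append(name)
    return result


candidates = [
    ("r1", [5, 6, 7]),
    ("r2", [1, 2]),      # returned by the coarse lookup, but outside the query
    ("r1", [5, 6, 7]),   # same read, found again through a second ID range
    ("r3", [9, 10]),
]
print(fetch_reads(candidates, 6, 9))
```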
Updated Submodules
- DYNAMIC
- atomic_queue
- gbwtgraph
- libbdsg
- sdsl-lite
vg 1.65.0 - Carfon
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.65.0
Buildable Source Tarball: vg-v1.65.0.tar.gz
Includes source for vg and all submodules. Use this instead of Github's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg build process needs.
This release includes:
- Fix call -v to use consistent ref traversals (resulting in more accurate coverage info)
- Fixed a few minor typos in comments
Updated Submodules
N/A
vg 1.64.1 - Vibbiana
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.64.1
Buildable Source Tarball: vg-v1.64.1.tar.gz
Includes source for vg and all submodules. Use this instead of Github's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg build process needs.
This release includes:
- Giraffe will no longer cut nodes wrong when anchoring an extracted subgraph's end on the reverse strand of the same node as its start
- Giraffe will make fewer tips, by enforcing path length strictly for connecting subgraph extraction. Dagification will still produce tips that will still be trimmed.
- vg giraffe will no longer confuse the two orientations of single bounding nodes when aligning between positions, a source of invalid alignments.
- New vg align --between POS,POS feature for manually exercising the align-between-positions code used by vg giraffe.
- vg now only builds required parts of vcflib
- vgmanmd.py now properly handles --help failures
Updated Submodules
The gbwtgraph, gcsa2, libhandlegraph, and libvgio submodules have been updated.
vg 1.64.0 - Vibbiana
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.64.0
Buildable Source Tarball: vg-v1.64.0.tar.gz
Includes source for vg and all submodules. Use this instead of Github's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg build process needs.
This release includes:
- vg giraffe logs now include more information about candidate chains.
- vg find -r will no longer include empty nodes for nonexistent node IDs in the range being extracted.
- TLEN for read 1 starting inside read 2 will no longer be shorter than the union of the reads in vg surject.
- TLEN computation handles more CIGAR operations in case we decide to use them.
- vg chunk now outputs and accepts base path ranges.
- vg giraffe has new hifi and r10 preset parameters that improve speed, are mostly neutral on calling accuracy, but add more wrong MAPQ 60 HiFi reads.
- Zip code ancestor orientation bugfix for vg giraffe: all minimizer and zipcode files will need to be re-generated to take advantage of it.
- GFA L-lines now written before P/W lines when converting to GFA (using vg's own implementation). The main upshot is that paths in the output of vg convert -fW can now be viewed in BandageNG.
- Giraffe now rebuilds both the minimizers and zipcodes if either needs to be rebuilt
- Loading an old minimizer file now fails
- Haplotype sampling can sample fragmented haplotypes in large snarls.
- vg man page now includes options hidden behind --help.
- Wiki pages for long-read Giraffe are now under CI testing
- vg giraffe should no longer produce final mappings with nonzero offsets or other wrong answers when a read tail doubles back on itself.
- GBWTGraph GFA parsing algorithm can handle missing SeqStart/SeqEnd fields in W-lines.
- Reference samples for haplotype sampling can be selected with --set-reference in vg haplotypes and vg giraffe.
- vg giraffe can now handle self-loop cycles that happen to fall exactly where it wants to put an all-insertion edit.
- vg giraffe allows multiple tips for anchor nodes if they are the same length and back-translation can work.
- vg giraffe should no longer find and be very confident in obviously insignificant rescue alignments for paired reads.
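The TLEN item above concerns a pair where read 1 starts inside read 2. A minimal Python sketch of a union-based template length follows, showing why a start-to-end shortcut can come out too short in that case. It is not vg's implementation: the function name and the 0-based, end-exclusive coordinates are assumptions, and the sign convention (leftmost read positive, the other negative) follows the usual SAM reading of TLEN.

```python
def template_lengths(start1, end1, start2, end2):
    """Return (tlen1, tlen2) for two reads mapped to the same reference.

    Coordinates are 0-based and end-exclusive (an assumption of this sketch).
    The magnitude is the length of the union span of both reads.
    """
    span = max(end1, end2) - min(start1, start2)
    if start1 <= start2:
        return span, -span
    return -span, span


# Read 2 covers 100..250; read 1 covers 120..200 and so starts inside read 2.
tlen1, tlen2 = template_lengths(120, 200, 100, 250)
print(tlen1, tlen2)

# A shortcut that measures from the leftmost start to read 1's end gives
# 200 - 100 = 100, which is shorter than the 150-base union of the reads.
naive = 200 - 100
print(naive < abs(tlen1))
```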
Updated Submodules
The gbwt and gbwtgraph submodules have been updated.
vg 1.63.1 - Boccaleone
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.63.1
Buildable Source Tarball: vg-v1.63.1.tar.gz
Includes source for vg and all submodules. Use this instead of Github's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg build process needs.
This release includes:
- Removed unit tests for primer filtering that require the source tree
vg 1.63.0 - Boccaleone
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.63.0
Buildable Source Tarball: vg-v1.63.0.tar.gz
Includes source for vg and all submodules. Use this instead of Github's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg build process needs.
This release includes:
- Add a man page (make doc/man/vg.1, https://github.com/vgteam/vg/wiki/vg-manpage)
- Better description of how to use input options for types of JSONs in vg view
- Since vg rna can't parse .gz files, have it give a useful error for those inputs
- Long Read Giraffe is now in vg. The vg giraffe subcommand now supports long reads.
  - vg giraffe now has --parameter-preset hifi and --parameter-preset r10 for using a new chaining-based algorithm to map long reads. --parameter-preset chaining-sr uses the new algorithm for single-ended short reads; the old --parameter-preset default and --parameter-preset fast remain available with the old non-chaining algorithm for short reads or paired-end inputs.
  - giraffe-facts.py script now knows how to read GAM files internally and no longer needs JSON preprocessing.
  - The vg giraffe minimizer file format has changed.
  - There is also a new .zipcodes index file used in vg giraffe mapping.
  - Improvements have been made to the distance index format used in vg giraffe.
- Haplotype information files used in haplotype sampling are a bit smaller. Existing files can still be used.
- Allow selecting the identity field in vg filter --tsv-out
- vg giraffe, vg mpmap, and vg map will now fail early with an error when encountering a read with a quality string of the wrong length (as from a truncated FASTQ)
- vg now tries to limit itself to a good number of threads for the number of CPUs in any enclosing Slurm job, via SLURM_JOB_CPUS_PER_NODE and CPU affinity masks.
- vg chunk can now properly take a chunk of a path that already has a subrange
- vg inject now has --add-identity to calculate 'identity' statistic (e.g. for linear mapper output BAMs)
- Add vg primers to get stats about variants in PCR primers from primer3
- Stop identity() from penalizing soft clips (insertions at start/end of path) as part of the total length
  - Note that this changes calculation used for the identity field in GAM files!
- vg autoindex will no longer duplicate input gbz as .giraffe.gbz when indexing for Giraffe.
- GAF sorting with vg gamsort is much faster than before.
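The identity() change above (soft clips no longer counted in the total length) is easiest to see with numbers. The sketch below is a simplified Python model, not vg's identity() function: edits are hypothetical (op, length) pairs with "=" for match, "X" for mismatch, "I" for insertion and "D" for deletion, a leading or trailing "I" is treated as a soft clip, and identity is assumed to be matched bases divided by read bases.

```python
def identity(edits, count_softclips=False):
    """Fraction of read bases that match, optionally counting soft clips.

    edits: list of (op, length) with op in {"=", "X", "I", "D"}.
    A leading or trailing "I" is treated as a soft clip (assumption).
    """
    ops = list(edits)
    lead = ops[0][1] if ops and ops[0][0] == "I" else 0
    trail = ops[-1][1] if len(ops) > 1 and ops[-1][0] == "I" else 0
    matched = sum(n for op, n in ops if op == "=")
    read_len = sum(n for op, n in ops if op != "D")  # deletions use no read bases
    total = read_len if count_softclips else read_len - lead - trail
    return matched / total if total > 0 else 0.0


# 10 soft-clipped bases, then 85 matches and 5 mismatches.
edits = [("I", 10), ("=", 85), ("X", 5)]
print(identity(edits, count_softclips=True))   # clips count against the read
print(identity(edits))                         # clips excluded from the total
```

With the clips counted, the example scores 85/100; with them excluded it scores 85/90, which is why identity values in existing GAM files are not directly comparable across this change.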
Updated Submodules
The dozeu, gbwt, gbwtgraph, libbdsg, libhandlegraph, libvgio, and sublinear-Li-Stephens submodules have been updated.
vg 1.62.0 - Ranzano
Don't forget to mark the static binary executable:
chmod +x vg
Docker Image: quay.io/vgteam/vg:v1.62.0
Buildable Source Tarball: vg-v1.62.0.tar.gz
Includes source for vg and all submodules. Use this instead of Github's "Source Code" downloads; those will not build as they do not include code for bundled dependencies that the vg build process needs.
This release includes:
- GBWT construction from a GAM/GAF file now uses parallel construction jobs.
- vg chunk and vg find now generate subpaths with subrange metadata when cutting up paths.
- vg gbwt will accept subranges on fragment 0 and discard the fragment number.
- vg map, vg mpmap, and vg giraffe can now annotate output with SAM-style flags from FASTQ comments with --comments-as-tags
- vg surject now detects when multipath alignments obviously don't belong to the graph they are being surjected to.
- Updated libbdsg to check if a distance index actually has distances and to improve memory use of distance indexing when not including distances
- vg index now includes flag --no-nested-distance to build a distance index with distances only on the top-level chain
- Add --snarl-sample to vg stats -R. This adds BED-style reference coordinates to the front of each row in the snarl output table, using the input sample to select reference paths. If no selected path is found, .'s are written. If multiple paths / intervals are found (in case of cycles), the first one found is printed.
- vg deconstruct -n bug that bypassed some nested sites fixed.
- When reading a .gff3 file with vg rna, validate exon ordering by base-pair position instead of number attribute. This allows reverse-strand exons to be numbered either by base-pair order or transcription order.
- Have vg rna gracefully ignore features with bad chromosome names if they're not included in --feature-type and thus won't be parsed anyways
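The --comments-as-tags item above relies on FASTQ header comments that already look like SAM optional fields (TAG:TYPE:VALUE, tab-separated). A minimal Python sketch of extracting such fields from a header line follows. It is not vg's parser: split_header is a hypothetical helper, and silently dropping comment fields that do not look like SAM tags is a choice made for this sketch only.

```python
import re

# SAM optional field shape: two-character tag, one-letter type, value.
TAG_RE = re.compile(r"^[A-Za-z][A-Za-z0-9]:[AifZHB]:.+$")


def split_header(header):
    """Split a FASTQ header into (read_name, [SAM-style tags])."""
    body = header[1:] if header.startswith("@") else header
    parts = body.split(None, 1)  # name, then everything after the first whitespace
    name = parts[0] if parts else ""
    comment = parts[1] if len(parts) > 1 else ""
    tags = [field for field in comment.split("\t") if TAG_RE.match(field)]
    return name, tags


print(split_header("@read1 BC:Z:ACGT\tRX:Z:TTAG"))
print(split_header("@read2 1:N:0:ATCACG"))  # Illumina-style comment, not a SAM tag
```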
Updated Submodules
The gbwt, gbwtgraph, gcsa2, libbdsg, libvgio, sdsl-lite, and sublinear-Li-Stephens submodules have been updated.
