Troubleshooting

Adam Novak edited this page Oct 27, 2025 · 8 revisions

This page lists solutions to common vg problems, along with techniques for diagnosing uncommon ones.

Build Problems

My vg build fails because of something about protobuf.

This can be caused by not having checked out the submodules when you checked out vg. Go to the directory where you checked out vg and do:

git submodule update --init --recursive

Then try building again.

If that doesn't help, try going into the Protobuf directory and building it separately from vg:

cd deps/protobuf
make

If that doesn't work, the problem is likely with Protobuf and not vg itself, although to fix it vg may need to be updated to a newer version of Protobuf.

My vg build fails because of something about a regex and/or an LRUCache, with a bunch of template errors.

or

My vg build fails and "4.8" is present in the error log.

This can be caused by using an old compiler and/or C++ standard library. The minimum version of GCC that can build vg is 4.9. Make sure you have GCC 4.9 or newer installed, and that running g++ --version reports version 4.9 or greater. Also make sure that GCC is not picking up headers from an older C++ standard library. If you had to install the compiler or the standard library yourself, adjust the environment variables PATH, C_INCLUDE_PATH, CPLUS_INCLUDE_PATH, LIBRARY_PATH, LD_LIBRARY_PATH, and LD_RUN_PATH as necessary to select the correct versions.
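
If you want to script the version check, the comparison can be sketched in Python as below. This is only an illustration: the function names are hypothetical and not part of vg, and the only fact taken from this page is the 4.9 minimum.

```python
import re

MINIMUM_GCC = (4, 9)  # minimum GCC version stated on this page


def parse_version(text):
    """Return the first dotted version number found in text as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def gcc_is_new_enough(version_text, minimum=MINIMUM_GCC):
    """Compare a version string such as '4.8.5' against the minimum version."""
    version = parse_version(version_text)
    # Pad short versions so that '11' compares as (11, 0), not as (11,).
    padded = version + (0,) * (len(minimum) - len(version))
    return padded >= minimum
```

You could feed this the output of g++ -dumpversion, which on some newer GCC builds prints only the major version; the padding step is there to handle that case.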

Runtime Problems

My vg run is slow/stuck!

Some classes of bugs manifest as a vg command never finishing. Debugging these problems can be greatly expedited by obtaining a backtrace from the stuck process, to show where it is stuck.

To get a debug trace from a running vg, first you need to know the PID (process ID) of the vg you want to debug. You can get that with pgrep vg if there is only one vg running. Otherwise you can do ps xau | grep $USER | grep vg and sort out which running vg you want, and then take the number in the second column.

Once you have the PID, start gdb by running gdb. Then, if your PID is "1234", run attach 1234 inside gdb to attach to the running process (which will pause it). You can then run bt in gdb to get a backtrace (which may be multiple screens long, so you may have to press Enter to page through it).

If your backtrace doesn't look informative (for example, if it's just sitting in "futex wait" or something similar), it could be you are on a thread that isn't doing anything interesting and some other thread is really the hold-up. In that case, do info threads and find the number of a thread that isn't just sitting around waiting. (Often you will see all the threads sitting in the same wait function except for one that is toiling away at some impossible problem). Then once you know the number of the thread you want to look at (say it is 10), do thread 10 to switch to that thread, and then bt again to get a backtrace for that thread.

When you are done with gdb, do quit and let the vg process continue on.

Then you can open a GitHub issue describing the slow or stuck command (for example, slow snarl finding) and attach the backtrace(s).
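
The interactive steps above can also be run non-interactively. The sketch below builds a gdb command line that attaches to a PID, prints a backtrace for every thread, and exits. It uses gdb's standard -p, -batch, and -ex options and the thread apply all bt command; the Python function names are hypothetical, and it assumes gdb is installed and that you have permission to attach to the process.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that prints a backtrace for every thread."""
    return [
        "gdb",
        "-p", str(pid),                 # attach to the running process (pauses it)
        "-batch",                       # run the -ex commands, then exit
        "-ex", "thread apply all bt",   # backtrace for all threads at once
    ]


def collect_backtraces(pid):
    """Run gdb and return whatever it printed on standard output."""
    result = subprocess.run(
        gdb_backtrace_command(pid),
        capture_output=True,
        text=True,
        check=False,  # gdb may exit non-zero, e.g. if attaching is not permitted
    )
    return result.stdout
```

Calling collect_backtraces(1234) returns text that you can save to a file and attach to an issue. Getting every thread at once avoids having to pick the interesting thread by hand with info threads.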

My code is slow and I want to profile it with callgrind!

If you are trying to speed up part of vg, you can use the callgrind profiler from Valgrind to collect profiling information, and the KCachegrind GUI viewer tool (also distributed as QCachegrind) to view it.

  1. If you are interested in a particular part of the code, make sure to #include <valgrind/callgrind.h>, and call CALLGRIND_START_INSTRUMENTATION; just before the code you are interested in. This lets you skip profiling setup/index loading steps, and look only at the actual operation. Several vg subcommands already have this machinery in place, behind a USE_CALLGRIND #define switch. See for example https://github.com/vgteam/vg/blob/379c37db5d3f0f7f1a084782ce72dfcad1d6f60d/src/subcommand/giraffe_main.cpp#L42
  2. Run your command with valgrind --tool=callgrind vg .... If you are using CALLGRIND_START_INSTRUMENTATION to profile only part of the code, use valgrind --tool=callgrind --instr-atstart=no vg ... instead. This will create a file named callgrind.out.<pid of your process> with your profiling data.
  3. If profiling on a remote machine, download the callgrind.out.<pid> file to your local machine.
  4. Open the file with KCachegrind or QCachegrind. On a Mac with Homebrew, you can brew install qcachegrind to get it. Linux distributions package it as kcachegrind. Windows builds exist, but they are of dubious provenance and advanced age.
  5. Explore the profile and figure out where your time is going. You can consult this useful guide to interpreting KCachegrind's interface, or this overview blog post, to learn to use the tool. You can also read the full KCachegrind Handbook.

I do not like what vg giraffe is doing with my reads!

To investigate why vg giraffe might be making poor decisions, you can run one or a few offending reads with the --show-work option. You should also redirect the standard error log to a file, because it may be quite large: --show-work 2>log.txt. Note that this will also write some files in the current directory (for things like debugging seed chaining for long reads).

For long reads, the log will look something like this:

T0:     Read c0b60621-b15f-4fa3-93f1-856ff7e2f305: <1298 bp>
T0:     All minimizers:
T0:     Minimizer 0: GTCCCAGCTACTTGAGTGGCTGAAGTGGGAG@1253 with 1 hits
T0:     Minimizer 1: TAACTGGGACTACAGGCGCCCCCCACCACGC@64 with 1 hits
T0:     Minimizer 2: ATGGTCTCGATCTCCAACCTCGTGATCCACC@151 with 1 hits
T0:     Minimizer 3: TAGCTGGGCGCGGTGGCTCAGGCCTGTAATC@925 with 1 hits
T0:     Minimizer 4: AGCTGGGCGCGGTGGCTCAGGCCTGTAATCC@926 with 2 hits
T0:     Minimizer 5: CCCCCCACCACGCCCGGCTAATTTTTTTTGT@82 with 4 hits
T0:     Minimizer 6: GCTTGCAGTGAGCTGAGATTGCGCCACGGCA@1137 with 5 hits
T0:     Minimizer 7: GCAGGAGAATGGTATGAACCCAGGAGGCAGA@1106 with 23 hits
T0:     Minimizer 8: GAGGCAGGAGAATGGTATGAACCCAGGAGGC@1103 with 48 hits
T0:     Minimizer 9: GGGCGCGGTGGCTCAGGCCTGTAATCCCAGC@930 with 55 hits
T0:     <More minimizers from 0 to 1298>
T0:     Total hits overall: 4189679
T0:     Total minimizers with hits overall: 27
T0:     Downsampled 59 minimizers of 1 lengths to 50 minimizers
T0:     Total hits after downsampling: 4189523
T0:     Total minimizers with hits after downsampling: 26
T0:     Found 1293 seeds from 16 minimizers, rejected 43
T0:     Found 1250 zip code trees, scores 49.8819 best, 14.031 second best, coverages 0.121726 best, 0.0477658 second best
T0:     =====Creating fragments=====
T0:     Making fragments for zip code tree 96 with score 49.8819 and coverage 0.121726
T0:     Seed read:64+ = 88102848-3 from minimizer 1(1), #1
T0:     Seed read:892- = 88102813-21 from minimizer 12(101), #327
T0:     Seed read:1133- = 88102798-69 from minimizer 8(48), #60
T0:     Seed read:1136- = 88102798-72 from minimizer 7(23), #29
T0:     Seed read:1167- = 88102795-2 from minimizer 6(5), #11
T0:     Seed read:1253+ = 88102782-17 from minimizer 0(1), #0
T0:     Computing fragments over 6 anchors
T0:     Found 3 fragments in zip code tree 96 running {R:64=G:88102848-3(+0)-88102848-34(-31)*31} to {R:1253=G:88102782-17(+0)-88102782-48(-31)*31}
T0:             Fragment with score 155 and length 4 running (9){R:871=G:88102813-0(+21)-88102813-22(-1)*22} to {R:1253=G:88102782-17(+0)-88102782-48(-31)*31}
T0:             Fragment with score 51 and length 1 running {R:1103=G:88102798-39(+30)-88102798-70(-1)*31} to {R:1103=G:88102798-39(+30)-88102798-70(-1)*31}
T0:             Fragment with score 51 and length 1 running {R:64=G:88102848-3(+0)-88102848-34(-31)*31} to {R:64=G:88102848-3(+0)-88102848-34(-31)*31}
T0:     Making fragments for zip code tree 28 with score 14.031 and coverage 0.0261941
T0:     Seed read:1133- = 86665544-13 from minimizer 8(48), #59
T0:     Seed read:1136- = 86665544-16 from minimizer 7(23), #28
...
T0:             Kept 2/2 fragments.
T0:     Keeping, of the 2 fragments in 703, those with score of at least 2
T0:             Kept 2/2 fragments.
T0:     Keeping, of the 2 fragments in 1089, those with score of at least 2
T0:             Kept 2/2 fragments.
T0:     Keeping, of the 2 fragments in 335, those with score of at least 2
T0:             Kept 2/2 fragments.
T0:     Keeping, of the 2 fragments in 741, those with score of at least 2
T0:             Kept 2/2 fragments.
T0:     Keeping, of the 2 fragments in 940, those with score of at least 2
T0:             Kept 2/2 fragments.
T0:     =====Creating chains=====
T0:     Tree 96 has a good enough fragment set (score=257)
T0:     Chaining fragments from zip code tree 96
Chaining group of 3 items
                Come from score 71 from nowhere across {R:64=G:88102848-3(+0)-88102848-34(-31)*31} to (9){R:871=G:88102813-0(+21)-88102782-48(-31)*413}
                        For read distance 776 and graph distance 636 an indel of length 140 seems plausible
                We can reach #1 with 71 from #0 + -3 from transition + 175 from item = 243 from #0
                Come from score 71 from nowhere across {R:64=G:88102848-3(+0)-88102848-34(-31)*31} to {R:1103=G:88102798-39(+30)-88102798-70(-1)*31}
                        For read distance 1008 and graph distance 863 an indel of length 145 seems plausible
                We can reach #2 with 71 from #0 + -3 from transition + 71 from item = 139 from #0
        Best way to reach #0 {R:64=G:88102848-3(+0)-88102848-34(-31)*31} is 71 from nowhere
        Best chain end so far: 71 from #0
        Best way to reach #1 (9){R:871=G:88102813-0(+21)-88102782-48(-31)*413} is 243 from #0
        Best chain end so far: 243 from #1
        Best way to reach #2 {R:1103=G:88102798-39(+30)-88102798-70(-1)*31} is 139 from #0
        Best chain end so far: 243 from #1
T0:     Chain 0 with score 206 is composed from local fragments: 0 1
T0:     Chain 0 with score 206 is composed from global fragments: 2 0
T0:     Chain 0 with score 206 contains seeds: 1 327 29 11 0
T0:     Chain 1 with score 51 is composed from local fragments: 2
T0:     Chain 1 with score 51 is composed from global fragments: 1
T0:     Chain 1 with score 51 contains seeds: 60
T0:     Tree 395 has a good enough fragment set (score=102)
T0:     Chaining fragments from zip code tree 395
Chaining group of 2 items
        Best way to reach #0 {R:1103=G:161416495+62(+30)-161416495+93(-1)*31} is 71 from nowhere
        Best chain end so far: 71 from #0
        Best way to reach #1 {R:1106=G:161416495+65(+30)-161416495+96(-1)*31} is 71 from nowhere
        Best chain end so far: 71 from #0
...
T0:     =====Creating alignments=====
T0:     Chain 0 is good enough (score=206, min=0, threshold 160 off best)
T0:     Chain 0 overlaps none of the 0 read-node matchings used in previous alignments
T0:     Chain 0 is chain 0 in its tree 96 and is kept
T0:     Align chain of:  1 327 29 11 0 in 1293 items
T0:     Aligned left tail length 64
T0:     Aligned and added link of 1158 bp read and 997 bp graph via align_sequence_between with score of 0
T0:     Aligned right tail length 14
T0:     Composed alignment is length 1298 with score of 107
T0:     Composed alignment starts with: {"edit": [{"sequence": "GTGGT", "to_length": 5}, {"from_length": 13, "to_length": 13}], "position": {"is_reverse": true, "node_id": "88102856", "offset": "28"}}
T0:     Composed alignment ends with: {"edit": [{"from_length": 6, "to_length": 6}, {"from_length": 1, "sequence": "C", "to_length": 1}, {"from_length": 5, "to_length": 5}, {"sequence": "CA", "to_length": 2}], "position": {"is_reverse": true, "node_id": "88102782", "offset": "48"}}
T0:     Produced alignment from chain 0 with score 107: 5I52M3I50M1I32M2X22M1D21M1X37M1X9M3X7M1X6M1X2M1X3M6I2M4D8M6I6M1X1M6D15M1X2M1I3M2X5D3M1X2M6D1M1X2M1X2M1D16M1D3M1X2M2X2M6I3M7D3M1I6M2D9M2X4M1X1M3D3M6I6M1X3M1X4M2D2M3X1M2I7M1D3M1X1M3D2M1X4M2X4M5D2M2I3M1X10M1X3M1X2M1X2M1X5M1X3M2D5M1X2D5M2X2M1D1M4I3M1X5M1X2M1X20I4M2X1M3I9M1I3M99I5M29I6M1X1M4D7M1X2M1X6I8M1X3M1X4M2D4M1X7M2X1M1X1M1D20M1X1M1X1M1I5M1I7M2X2M1X7M2I8M1X1I13M1X1M1X4M1I2M1X3M2X5M2X59M1I3M1X4M1I3M1X3D2M3I4M1I1M1X1M1X37M1X8M1X1I3M1X33M1X27M1I1M1X2M1X7M1X156M1X2M1X16I50M1X5M2I@88102856- score 107
T0:     Chain 1 is good enough (score=51, min=0, threshold 160 off best)
T0:     Chain 1 overlaps a previous alignment at read pos 1133 and graph pos 88102798-69 with matching 88102798, 1, 1064
T0:     Chain 2 is good enough (score=51, min=0, threshold 160 off best)
T0:     Chain 2 overlaps none of the 85 read-node matchings used in previous alignments
T0:     Chain 2 is chain 0 in its tree 395 and is kept
...
T0:     chain 8 failed because there were too many good chains (score=51)
T0:     chain 9 failed because there were too many good chains (score=51)
T0:     chain 10 failed because there were too many good chains (score=51)
T0:     chain 11 failed because there were too many good chains (score=51)
T0:     alignment 2 accepted because 1 of it is from nodes not already used
T0:     alignment 0 accepted because 1 of it is from nodes not already used
T0:     alignment 1 accepted because 1 of it is from nodes not already used
T0:     Picked best alignment 939I6M1X21M1X11M1X2M1X3M6D2M1X9M1X13M1X4M1X11M1X2M1X18M1X1I2M1X3M1X2M2X11M1X74M1X13M1X15M1X8M1X6M2X3M1X13M83I@197272273+ score 143
T0:     For scores: 143, 107, 76
T0:     Scaled scores: 16.5254, 12.3652, 8.78274
T0:     MAPQ is 14

(Almost) every line starts with an indicator of which thread it came from, and the log describes the minimizer finding, seed selection, zip code tree construction, fragment generation, chaining, alignment, and winning-alignment-selection stages of the algorithm in order. Each stage talks about which of the candidates from the previous stage look good enough to continue with, and what various heuristic scores they achieved. During the chaining stage, a log of the chaining dynamic programming process is also dumped, so you can see all the fragments and the transitions that were considered between them, and how those transitions were scored.
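
Because the log can be large, it can help to split it by stage before reading it. Here is a minimal sketch, assuming the thread prefix and the =====Stage name===== header lines look like the ones in the excerpt above. The function name and the "Preamble" key are hypothetical, and lines without a thread prefix are simply assigned to whichever stage was most recently seen, so this is only reliable for a single-threaded run.

```python
import re

THREAD_PREFIX = re.compile(r"^T(\d+):\s+")        # e.g. "T0:     "
STAGE_HEADER = re.compile(r"^=====(.+)=====$")    # e.g. "=====Creating chains====="


def split_stages(lines):
    """Group log lines by the stage header that precedes them.

    Returns a dict mapping stage name to a list of lines with the thread
    prefix removed. Lines before the first header go under "Preamble".
    """
    stages = {"Preamble": []}
    current = "Preamble"
    for raw in lines:
        line = THREAD_PREFIX.sub("", raw.rstrip("\n"))
        header = STAGE_HEADER.match(line.strip())
        if header:
            current = header.group(1)
            stages.setdefault(current, [])
        else:
            stages[current].append(line)
    return stages
```

For example, split_stages(open("log.txt"))["Creating chains"] would give just the chaining portion of the log, including the unprefixed dynamic programming dump.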

If you are mapping simulated reads from a GAM with embedded truth positions (like you would use as a vg gamcompare truth set), and you use --track-correctness, candidate items that agree with the input read's annotated linear reference truth positions will be marked as CORRECT in the log.

At several places, a compact notation like {R:1103=G:161416495+62(+30)-161416495+93(-1)*31} is used to represent an anchoring match (or an entire fragment) aligning the read and the graph. This breaks down into:

  • R:1103: The anchor starts at 0-based position 1103 on the read
  • =G:161416495+62: The anchor matches that read position to graph node 161416495, on the + strand, at 0-based position 62
  • (+30): The "hint" position for the anchor's start (i.e. the location that has a zip code stored) is 30bp to the right of its graph start position
  • -: The anchor represents a range of matched positions
  • 161416495+93: The past-end position of the matched range in the graph is on node 161416495, on the + strand, at 0-based position 93
  • (-1): The hint position for the anchor's end is 1bp to the left of the past-end position for the anchor in the graph
  • *31: The anchor is 31bp long in the read
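
The breakdown above maps directly onto a regular expression. The following sketch pulls the fields out of one anchor string; the Anchor type and its field names are hypothetical labels chosen for this example and do not come from vg itself.

```python
import re
from typing import NamedTuple


class Anchor(NamedTuple):
    read_start: int     # R:1103
    start_node: int     # 161416495
    start_strand: str   # "+" or "-"
    start_offset: int   # 62
    start_hint: int     # (+30)
    end_node: int       # 161416495
    end_strand: str     # "+" or "-"
    end_offset: int     # 93 (past-end position)
    end_hint: int       # (-1)
    read_length: int    # *31


ANCHOR = re.compile(
    r"\{R:(\d+)"              # read start
    r"=G:(\d+)([+-])(\d+)"    # graph start: node, strand, offset
    r"\(([+-]\d+)\)"          # start hint offset
    r"-"                      # range separator
    r"(\d+)([+-])(\d+)"       # graph past-end: node, strand, offset
    r"\(([+-]\d+)\)"          # end hint offset
    r"\*(\d+)\}"              # length in the read
)


def parse_anchor(text):
    """Parse the first anchor found in text; raise ValueError if there is none."""
    match = ANCHOR.search(text)
    if match is None:
        raise ValueError(f"no anchor found in {text!r}")
    g = match.groups()
    return Anchor(int(g[0]), int(g[1]), g[2], int(g[3]), int(g[4]),
                  int(g[5]), g[6], int(g[7]), int(g[8]), int(g[9]))
```

The pattern uses search rather than a full match, so it also accepts log fragments that carry a prefix before the opening brace, such as the (9) seen in the chaining output.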

If you want to debug chaining, you can consult the best-chain-dotplot*.tsv file that is written. It describes the zip code tree that had the highest-scoring chain in it. For each anchor, and for each occurrence of an anchor in a fragment or chain, it gives where it is along reference paths embedded in the graph, and then where it is along the read. It has entries like this (note that this is not the same read as in the log above):

CHM13#0#chr3	95824937	162
CHM13#0#chr3	95786934	162
CHM13#0#chr3-chain-0	95786934	162
CHM13#0#chr3-chain-0	95786666	430
CHM13#0#chr3-frag-13	95843741	130
CHM13#0#chr3-frag-14	95786934	162
CHM13#0#chr3-frag-14	95786666	430

Each name consists of a path name (or NO_PATH), and then possibly a type (frag, chain, etc.) and a number for which item of that type the anchor appears in. Then there is a coordinate along the path (or 0 if there is no path) and a coordinate along the read.
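
If you would rather load these rows into your own analysis than plot them directly, one row can be split as sketched below. This assumes the columns are tab-separated and that the optional type and number are appended to the path name as -type-number, as in the entries above. A path name that itself ends in that shape would be misparsed. The DotplotRow type and its field names are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class DotplotRow(NamedTuple):
    path: str                    # path name, or NO_PATH
    item_type: Optional[str]     # "frag", "chain", ... or None for a bare anchor
    item_number: Optional[int]   # which item of that type, or None
    path_position: int           # coordinate along the path
    read_position: int           # coordinate along the read


# Assumed suffix shape: "<path>-<type>-<number>"
SUFFIX = re.compile(r"^(.*)-([A-Za-z]+)-(\d+)$")


def parse_dotplot_row(line):
    """Split one TSV line into its path, optional item, and coordinates."""
    name, path_position, read_position = line.rstrip("\n").split("\t")
    match = SUFFIX.match(name)
    if match:
        path = match.group(1)
        item_type = match.group(2)
        item_number = int(match.group(3))
    else:
        path, item_type, item_number = name, None, None
    return DotplotRow(path, item_type, item_number,
                      int(path_position), int(read_position))
```

Rows with item_type set to None are the plain anchors, and the rest can be grouped by item_type and item_number to recover each fragment or chain.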

If you are using a graph where only one path of your chosen reference is in a connected component, and your reference is CHM13, you can plot this information with this scatterplot script and this command:

scatter.py \
  --category_regex \
  --categories "chain" "frag" "CHM13" \
  --types "line" "line" "point" \
  --markers "o" "." "." \
  --colors "g" "r" "b" \
  --no_legend \
  --x_label "Reference" \
  --y_label "Read" \
  --title "Chain Dotplot" \
  --save chain-dotplot.png \
  <(cat best-chain-dotplot*.tsv | grep "CHM13")

This gives an image like this:

[Image: A scatter plot titled 'chain dotplot (n = 2615)'. Reference is plotted on the x-axis from 9.574E7 to 9.588E7 and read is plotted on the y-axis from 0 to about 25000, both using linear scales. 520 lines are plotted. The plot is full of short red lines connecting runs of red points and declining at a steep angle. Near the left side there is a long green line connecting the points in some of the red lines, declining unevenly at a steep angle.]

The red points connected together indicate fragments, and the green line behind some of them shows the best chain.
