Description
RAPPAS may miscalculate the scores of k-mers in a window when k is sufficiently high and many k-mers are alive in that window. I found examples (D652 dataset, k=10, o=1.5) where the scores computed by RAPPAS v1.21 differ from the real values (computed manually) by up to 1e-5 (in non-log values). That may not seem like much, but these small differences compound while placing queries and produce placements that differ from those of RAPPAS2. (For the examples I found, XPAS scores are much closer to the real values.)
This happens in src/core/algos/WordExplorer_v3.java:
currentLogSum+=session.parsedProbas.getPP(nodeId, i, j);
...
currentLogSum-=session.parsedProbas.getPP(nodeId, i, j);
where the class variable currentLogSum accumulates the rounding error from repeatedly adding and subtracting the same value. The error grows with the number of k-mers alive in the window, since the number of times the variable is modified is O(|alive k-mers|).
The easy fix is to make currentLogSum a local variable, so that it is changed only O(k) times instead.
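To illustrate the effect outside of RAPPAS, here is a minimal standalone sketch. It is not the WordExplorer_v3 code: the class LogSumDrift, its methods, the use of double, and the sample values are all hypothetical and chosen only for the demonstration. It compares a long-lived accumulator that is updated by add-then-subtract against a sum rebuilt in a local variable.

```java
import java.util.Random;

// Hypothetical demo class, not part of RAPPAS.
class LogSumDrift {

    /** Long-lived accumulator, standing in for the class variable currentLogSum. */
    private double currentLogSum;

    LogSumDrift(double prefixLogSum) {
        this.currentLogSum = prefixLogSum;
    }

    /** Add-then-subtract update, performed once per alive k-mer. */
    double scoreShared(double logProba) {
        currentLogSum += logProba;
        double score = currentLogSum;
        currentLogSum -= logProba; // rounding may leave a residue behind
        return score;
    }

    /** Current value of the long-lived accumulator. */
    double accumulated() {
        return currentLogSum;
    }

    /** Local alternative: start from the prefix each time, so no residue is carried over. */
    static double scoreLocal(double prefixLogSum, double logProba) {
        double localLogSum = prefixLogSum + logProba;
        return localLogSum;
    }

    public static void main(String[] args) {
        Random rng = new Random(42);   // arbitrary seed, for repeatability
        double prefix = -0.1;          // made-up log value
        LogSumDrift shared = new LogSumDrift(prefix);
        double maxDiff = 0.0;
        for (int i = 0; i < 1_000_000; i++) {
            double pp = -10.0 * rng.nextDouble(); // made-up log probability
            double diff = Math.abs(shared.scoreShared(pp) - scoreLocal(prefix, pp));
            maxDiff = Math.max(maxDiff, diff);
        }
        System.out.println("residue in shared accumulator: " + (shared.accumulated() - prefix));
        System.out.println("max score difference: " + maxDiff);
    }
}
```

Even a single round trip can leave a residue: starting from -0.1, adding -0.2 and subtracting it again does not return exactly -0.1 in IEEE 754 double arithmetic. The local version performs one rounding per score and carries nothing over to the next k-mer.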