In https://github.com/artsy/mongoid_fulltext/blob/master/lib/mongoid_fulltext.rb:
    # Figure out how many ngrams to extract from the string. If we can't afford to extract all ngrams,
    # step over the string in evenly spaced strides to extract ngrams. For example, to extract 3 3-letter
    # ngrams from 'abcdefghijk', we'd want to extract 'abc', 'efg', and 'ijk'.
    if bound_number_returned
      step_size = [((filtered_str.length - config[:ngram_width]).to_f / config[:max_ngrams_to_search]).ceil, 1].max
    else
      step_size = 1
    end
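To get the 3 n-grams abc, efg, and ijk from abcdefghijk (11 characters), the n-grams have to start at offsets 0, 4, and 8, so we need a step of 4, not 3. The current formula gives:
(11.to_f - 3) / 3 = 2.67, ceil to 3
I think this should not subtract config[:ngram_width]: 11.to_f / 3 = 3.67, which ceils to 4.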
However, I wonder whether the comment is incorrect and we want the first 3 n-grams instead of skipping characters.
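To make the arithmetic easy to check, here is a minimal standalone sketch that compares the two step sizes on the example string. The helper names (step_size_current, step_size_without_width, sample_ngrams) are hypothetical and not part of mongoid_fulltext, and the sampling loop is my assumption about how the step is applied (one n-gram at every step_size-th start offset), not the library's actual code.

```ruby
# Hypothetical helpers for illustration only; not part of mongoid_fulltext.

# Mirrors the step_size expression quoted from lib/mongoid_fulltext.rb.
def step_size_current(length, ngram_width, max_ngrams)
  [((length - ngram_width).to_f / max_ngrams).ceil, 1].max
end

# Proposed variant: same expression without subtracting the n-gram width.
def step_size_without_width(length, max_ngrams)
  [(length.to_f / max_ngrams).ceil, 1].max
end

# Assumption: n-grams are taken at every step_size-th start offset,
# keeping at most max_ngrams of them.
def sample_ngrams(str, ngram_width, step_size, max_ngrams)
  last_start = str.length - ngram_width
  (0..last_start).step(step_size).first(max_ngrams).map { |i| str[i, ngram_width] }
end

str = 'abcdefghijk'
current  = step_size_current(str.length, 3, 3)
proposed = step_size_without_width(str.length, 3)

current_ngrams  = sample_ngrams(str, 3, current, 3)
proposed_ngrams = sample_ngrams(str, 3, proposed, 3)

puts "current step:  #{current} -> #{current_ngrams.inspect}"
puts "proposed step: #{proposed} -> #{proposed_ngrams.inspect}"
```

Under these assumptions the current formula yields a step of 3 and the n-grams abc, def, ghi, while dropping the subtraction yields a step of 4 and the abc, efg, ijk from the comment.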
cc: @aaw