Description
We recently began experimenting with Blacklight 7 (BL7), Solr 7.4, Rails 5.2 and a 7 million item catalog. In early testing we've noticed that the suggester fails to build its index. The Solr web UI logging screen shows this:
ERROR true
x:blacklight-core
SuggestComponent
Exception in building suggester index for: mySuggester
Looking into the logs, the full trace looks like this:
2018-09-13 19:40:12.318 INFO (searcherExecutor-10-thread-1-processing-x:blacklight-core) [ x:blacklight-core] o.a.s.h.c.SpellCheckComponent Index is not optimized therefore skipping building spell check index for: default
2018-09-13 19:40:12.318 INFO (searcherExecutor-10-thread-1-processing-x:blacklight-core) [ x:blacklight-core] o.a.s.h.c.SpellCheckComponent Index is not optimized therefore skipping building spell check index for: author
2018-09-13 19:40:12.318 INFO (searcherExecutor-10-thread-1-processing-x:blacklight-core) [ x:blacklight-core] o.a.s.h.c.SpellCheckComponent Index is not optimized therefore skipping building spell check index for: subject
2018-09-13 19:40:12.318 INFO (searcherExecutor-10-thread-1-processing-x:blacklight-core) [ x:blacklight-core] o.a.s.h.c.SpellCheckComponent Index is not optimized therefore skipping building spell check index for: title
2018-09-13 19:40:12.318 INFO (searcherExecutor-10-thread-1-processing-x:blacklight-core) [ x:blacklight-core] o.a.s.h.c.SuggestComponent buildOnCommit: mySuggester
2018-09-13 19:40:12.318 INFO (searcherExecutor-10-thread-1-processing-x:blacklight-core) [ x:blacklight-core] o.a.s.s.s.SolrSuggester SolrSuggester.build(mySuggester)
2018-09-13 19:40:12.818 ERROR (searcherExecutor-10-thread-1-processing-x:blacklight-core) [ x:blacklight-core] o.a.s.h.c.SuggestComponent Exception in building suggester index for: mySuggester
java.lang.IllegalArgumentException: input automaton is too large: 1001
at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1298) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
(this goes on for 1000 lines)
at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
at org.apache.lucene.util.automaton.Operations.topoSortStates(Operations.java:1275) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
at org.apache.lucene.search.suggest.analyzing.AnalyzingSuggester.replaceSep(AnalyzingSuggester.java:292) ~[lucene-suggest-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:52:17]
at org.apache.lucene.search.suggest.analyzing.AnalyzingSuggester.toAutomaton(AnalyzingSuggester.java:854) ~[lucene-suggest-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:52:17]
at org.apache.lucene.search.suggest.analyzing.AnalyzingSuggester.build(AnalyzingSuggester.java:430) ~[lucene-suggest-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:52:17]
at org.apache.lucene.search.suggest.Lookup.build(Lookup.java:190) ~[lucene-suggest-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:52:17]
at org.apache.solr.spelling.suggest.SolrSuggester.build(SolrSuggester.java:181) ~[solr-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:55:13]
at org.apache.solr.handler.component.SuggestComponent$SuggesterListener.buildSuggesterIndex(SuggestComponent.java:534) ~[solr-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:55:13]
at org.apache.solr.handler.component.SuggestComponent$SuggesterListener.newSearcher(SuggestComponent.java:521) ~[solr-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:55:13]
at org.apache.solr.core.SolrCore.lambda$getSearcher$18(SolrCore.java:2322) ~[solr-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:55:13]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_181]
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209) ~[solr-solrj-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:55:14]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
2018-09-13 19:40:12.824 INFO (searcherExecutor-10-thread-1-processing-x:blacklight-core) [ x:blacklight-core] o.a.s.c.SolrCore [blacklight-core] Registered new searcher Searcher@b5ea2d0[blacklight-core] main{ExitableDirectoryReader(UninvertingDirectoryReader(Uninverting(_gy(7.4.0):C2138244/151:delGen=3) Uninverting(_k0(7.4.0):C2025787/58:delGen=2) Uninverting(_if(7.4.0):c125682/2:delGen=1) Uninverting(_mk(7.4.0):C2086012/4:delGen=2) Uninverting(_k8(7.4.0):C10433/38:delGen=2) Uninverting(_l1(7.4.0):C8823/2:delGen=2) Uninverting(_lj(7.4.0):C16528/1:delGen=1) Uninverting(_lk(7.4.0):C17041/1:delGen=1) Uninverting(_lx(7.4.0):C16192) Uninverting(_m0(7.4.0):C17514) Uninverting(_mw(7.4.0):c248023) Uninverting(_mc(7.4.0):C18352/1:delGen=1) Uninverting(_n7(7.4.0):c255323) Uninverting(_mx(7.4.0):C16296/3:delGen=1) Uninverting(_mu(7.4.0):C17043/2:delGen=1) Uninverting(_n0(7.4.0):C20258/2:delGen=1) Uninverting(_n5(7.4.0):C16103/1:delGen=1) Uninverting(_n6(7.4.0):C13497) Uninverting(_n3(7.4.0):C13603) Uninverting(_n8(7.4.0):C4865) Uninverting(_o8(7.4.0):c575/50:delGen=14) Uninverting(_nv(7.4.0):C9761/51:delGen=21) Uninverting(_nw(7.4.0):C5308/51:delGen=18) Uninverting(_p2(7.4.0):c1175/2:delGen=2) Uninverting(_p0(7.4.0):C1364/4:delGen=1) Uninverting(_p3(7.4.0):C97/2:delGen=1) Uninverting(_p4(7.4.0):C51/1:delGen=1) Uninverting(_p5(7.4.0):C98)))}
2018-09-13 19:40:12.825 INFO (qtp817348612-15) [ x:blacklight-core] o.a.s.u.p.LogUpdateProcessorFactory [blacklight-core] webapp=/solr path=/update/json params={commit=true}{commit=} 0 572
So it appears that the recursive function in the trace, topoSortStatesRecurse, is only permitted to recurse 1,000 levels deep before it throws the "input automaton is too large" exception. This is, apparently, a change that was introduced in Lucene 7.0. Here's an email thread that discusses it:
http://lucene.472066.n3.nabble.com/solr-5-2-gt-7-2-suggester-failure-td4383551.html
Here's the relevant commit (as pointed out in the email thread):
I'll mention too that the suggester worked fine for us with a small corpus. It appears that once the corpus becomes too large, the suggester struggles. Also, all of our settings in schema.xml and solrconfig.xml are default.
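
For anyone trying to picture the failure, here is a minimal Python sketch of a depth-limited recursive topological sort. It is not Lucene's code: the function names, the dictionary representation of the automaton, and the linear chain of states are all assumptions made for illustration, and the limit of 1000 is taken from the "too large: 1001" message in the trace above. It only shows how a recursion guard of this kind rejects an input once the walk goes one level past the limit.

```python
import sys

# Assumed limit, inferred from the "input automaton is too large: 1001" message.
MAX_RECURSION_LEVEL = 1000


def topo_sort_states(transitions, start=0, max_level=MAX_RECURSION_LEVEL):
    """Return states in topological order using depth-first recursion.

    transitions maps a state to the list of states it points to.
    Raises ValueError once the recursion goes deeper than max_level.
    """
    visited = set()
    post_order = []

    def recurse(state, level):
        # The guard: give up instead of recursing without bound.
        if level > max_level:
            raise ValueError(f"input automaton is too large: {level}")
        visited.add(state)
        for nxt in transitions.get(state, []):
            if nxt not in visited:
                recurse(nxt, level + 1)
        post_order.append(state)

    recurse(start, 0)
    return post_order[::-1]


def chain(length):
    """Build a hypothetical linear automaton 0 -> 1 -> ... -> length-1."""
    return {i: [i + 1] for i in range(length - 1)}


# Python's own default recursion limit is close to 1000, so raise it for the demo.
sys.setrecursionlimit(10_000)

short = topo_sort_states(chain(50))  # shallow input sorts without trouble

try:
    topo_sort_states(chain(1002))  # deepest state sits at level 1001
    outcome = "built"
except ValueError as err:
    outcome = str(err)

print(outcome)
```

In this sketch a chain of 1001 states still sorts, while a chain of 1002 states fails, because the guard only trips when the level exceeds the limit. Whether our catalog triggers the real check through one very long value or through something else is not something this sketch can tell us.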