Description
Background
I am doing a portal catalog search that returns more than 100K brains. When I iterate over all the brains, I get the error below:
2020-05-18 17:45:57,857 INFO [elasticsearch:83][waitress] POST http://127.0.0.1:9200/danbioapp-backend-portal_catalog/portal_catalog/_bulk [status:200 request:0.024s]
2020-05-18 17:45:58,861 INFO [elasticsearch:83][waitress] GET http://127.0.0.1:9200/_nodes/_all/http [status:200 request:0.002s]
2020-05-18 17:45:58,912 WARNING [elasticsearch:97][waitress] GET http://127.0.0.1:9200/danbioapp-backend-portal_catalog/portal_catalog/_search?from=10000&stored_fields=path.path&size=50 [status:500 request:0.050s]
2020-05-18 17:45:59,971 ERROR [Zope.SiteErrorLog:251][waitress] 1589816759.920.727241015445 http://localhost:9090/danbioapp-backend/f....
Traceback (innermost last):
Module ZPublisher.WSGIPublisher, line 156, in transaction_pubevents
Module ZPublisher.WSGIPublisher, line 338, in publish_module
Module ZPublisher.WSGIPublisher, line 256, in publish
Module ZPublisher.mapply, line 85, in mapply
Module ZPublisher.WSGIPublisher, line 62, in call_object
Module Products.ExternalMethod.ExternalMethod, line 230, in __call__
- __traceback_info__: ((<PloneSite at /danbioapp-backend>,), {}, None)
Module <string>, line 32, in main
Module ZTUtils.Lazy, line 201, in __getitem__
Module collective.elasticsearch.es, line 104, in __getitem__
Module collective.elasticsearch.es, line 170, in _search
Module elasticsearch.client.utils, line 76, in _wrapped
Module elasticsearch.client, line 660, in search
Module elasticsearch.transport, line 318, in perform_request
Module elasticsearch.connection.http_urllib3, line 186, in perform_request
Module elasticsearch.connection.base, line 125, in _raise_error
TransportError: TransportError(500, u'search_phase_execution_exception',
u'Result window is too large, from + size must be less than or equal to: [10000]
but was [10050]. See the scroll api for a more efficient way to request large data sets.
This limit can be set by changing the [index.max_result_window] index level setting.')
I found a related question: https://stackoverflow.com/questions/35206409/elasticsearch-2-1-result-window-is-too-large-index-max-result-window
I know it is possible to increase index.max_result_window, but that would increase memory usage.
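For reference, if raising the limit turned out to be acceptable, this is a minimal sketch of how the setting could be changed with the elasticsearch Python client (the index name is taken from the log above; the value 200000 is only an illustrative number):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(['http://127.0.0.1:9200'])

# Raise the per-index result window so that from + size may exceed 10000.
# Note: deep from/size paging keeps every hit up to that offset in memory on
# the coordinating node, which is exactly the memory cost mentioned above.
es.indices.put_settings(
    index='danbioapp-backend-portal_catalog',
    body={'index': {'max_result_window': 200000}},
)
```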
My idea is to use the Elasticsearch scroll API here: https://github.com/collective/collective.elasticsearch/blob/master/src/collective/elasticsearch/es.py#L48
I am not sure if that would solve the problem; your expert opinion is requested.
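To illustrate the idea, here is a rough sketch (not a patch to es.py) of iterating over all matching documents with the scroll API via the elasticsearch helpers; the match_all query is an assumption standing in for the real catalog query, and the stored field is the one requested in the failing search above:

```python
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch(['http://127.0.0.1:9200'])

# helpers.scan wraps the scroll API: it keeps a scroll context open and pages
# through all hits in fixed-size batches, so there is no from/size offset and
# no index.max_result_window limit to hit.
for hit in helpers.scan(
    es,
    index='danbioapp-backend-portal_catalog',
    query={'query': {'match_all': {}}},  # assumption: replace with the real query
    stored_fields='path.path',           # same field as in the failing request
    size=1000,                           # batch size per scroll round trip
    scroll='5m',                         # how long each scroll context stays alive
):
    path = hit['fields']['path.path'][0]
    # ... resolve the brain / object from its path here
```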
Thanks
@vangheem