Not ready for merge: Elasticsearch backend #80
base: develop
* This is essentially just a search and replace on mongostorage.py
* I assumed that a modular-odm collection was the same thing as an elasticsearch index. Nope. A collection is the doc-type in ES parlance. Add a required es_index attribute to ElasticsearchStorage constructor and update ES calls to use correct nomenclature.
* The test ES storage object now provides es_index.
* ElasticsearchQuerySet.data is an array, not a real cursor. Stop calling count() on it, just use len() instead.
* ES search is only near-real-time, so we have to refresh the index to make sure all of the fixtures have been inserted before searching.
* Get find(), find_one(), update(), remove() working properly
* Turn modular-odm query objects into filter structures suitable for passing to Elasticsearch. We now support everything in the tests except for icontains.
* Since we're storing a list of results rather than an actual cursor, our queryset implementation is basically the same as Pickle's rather than Mongo's.
* ES is returning integer ID fields as strings instead of integers. Add a stub _to_native_types method where casting will take place.
* delete_by_query() won't accept a plain filter as the query body. Instead, pass a "filtered" query that does a match_all + filter
* Backrefs are stored as tuples of (id, ref_name). If id is an integer, then Elasticsearch assumes all elements in the tuple will be integers and chokes when it encounters ref_name (a string). For now, explicitly cast the first element of a tuple as a str().
* Elasticsearch search is only near-real-time, so searching immediately after save() may not return up-to-date results. Add a default no-op refresh() method to Storage() and implement it for the Elasticsearch backend. Call it after save().
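As a rough illustration of the filter translation and the "filtered" wrapper described in the commits above, here is a minimal sketch. It does not use modular-odm's real query classes: the `(field, op, value)` triples and the helper names `to_es_filter`, `and_filters`, and `filtered_query` are hypothetical, and the filter shapes assume the Elasticsearch 1.x query DSL.

```python
# Hypothetical sketch, not the code in this branch.
# Queries are modelled as plain (field, op, value) triples for illustration.

RANGE_OPS = ("gt", "gte", "lt", "lte")


def to_es_filter(field, op, value):
    """Translate one (field, op, value) triple into an ES 1.x-style filter dict."""
    if op == "eq":
        return {"term": {field: value}}
    if op == "ne":
        return {"not": {"term": {field: value}}}
    if op == "in":
        return {"terms": {field: list(value)}}
    if op in RANGE_OPS:
        return {"range": {field: {op: value}}}
    raise ValueError("unsupported operator: %r" % (op,))


def and_filters(filters):
    """Combine several filter dicts so that all of them must match."""
    return {"and": list(filters)}


def filtered_query(es_filter):
    """Wrap a bare filter in a match_all + filter "filtered" query.

    Depending on the ES version, the result may need to be nested under a
    top-level "query" key before it is sent as a request body.
    """
    return {"filtered": {"query": {"match_all": {}}, "filter": es_filter}}


body = filtered_query(and_filters([
    to_es_filter("name", "eq", "foo"),
    to_es_filter("age", "gte", 3),
]))
```

Keeping the translation as pure dict-building functions means it can be unit-tested without a running Elasticsearch node; only the final `body` has to be handed to the client.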
Hello,
I've updated this PR with elasticsearch v1.0 support. It's still failing three tests:
The test_eq_abstract tests are failing because searching for an array like ["b3e4d", "foo"] in ES has match-any semantics. Since the "foo" key is present in all of the stored backrefs, it will falsely return a match. More details and possible solutions are here:
The icontains test is failing because the default analyzer in ES is case sensitive. To be able to search both case-sensitive and case-insensitive, you'll need to set up a custom analyzer on init.
I'm not sure I'll be able to spend more time on this, but if anyone else would like to use the code or ask me some questions, feel free!
Cheers,
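To make the match-any problem described in the comment above concrete, here is a small pure-Python model. It does not talk to Elasticsearch; `match_any` and `match_exact` are hypothetical helpers that only mimic the two semantics, and the second backref (`"zzz99"`) is made-up sample data.

```python
# Pure-Python model of the two matching semantics; no Elasticsearch involved.

def match_any(stored_values, wanted):
    """True if at least one wanted value appears in the stored array."""
    return any(value in stored_values for value in wanted)


def match_exact(stored_values, wanted):
    """True only if the stored array equals the wanted array, element by element."""
    return list(stored_values) == list(wanted)


# Two stored backrefs of the form (id, ref_name); both share the ref_name "foo".
backrefs = [["b3e4d", "foo"], ["zzz99", "foo"]]
wanted = ["b3e4d", "foo"]

any_hits = [b for b in backrefs if match_any(b, wanted)]
exact_hits = [b for b in backrefs if match_exact(b, wanted)]
```

Under match-any semantics both backrefs are hits because each contains "foo", while the equality test the abstract-backref tests expect keeps only the first one.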
Thanks for the updates! Will look into the remaining failing tests and merge when I get a chance.
Hello,
I'm filing this dummy pull request so that my branch adding elasticsearch support will be visible from the main repo. There are four outstanding tasks before it's ready for merge:
1.) mapping returned values to native types
2.) case-insensitive searching
3.) escaping metacharacters in contains regex
4.) abstract backref support
I plan on working on these this week and will hopefully have a final PR soon.
Cheers,
Fitz
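For task 3 in the list above (escaping metacharacters in the contains regex), one possible approach is sketched below. The helper name `contains_pattern` is hypothetical, and the check uses Python's own `re` module; it is an assumption that the backslash escapes produced by `re.escape` are also accepted by Elasticsearch's regular-expression syntax, which is not identical to Python's.

```python
import re


def contains_pattern(substring):
    """Build a regex matching any string that contains `substring` literally."""
    # re.escape neutralises metacharacters such as '.', '*' and '(' so that
    # user input is treated as plain text rather than as regex syntax.
    return ".*" + re.escape(substring) + ".*"


pattern = contains_pattern("a.b*")

# With escaping, only the literal text "a.b*" counts as a match.
literal_hit = re.fullmatch(pattern, "xx a.b* yy") is not None
false_hit = re.fullmatch(pattern, "xx aXbbb yy") is not None
```

Without the `re.escape` call, the unescaped pattern `.*a.b*.*` would also match "xx aXbbb yy", which is the kind of false positive the task is meant to prevent.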