Not ready for merge: Elasticsearch backend #80

felliott · 2014-03-03T02:10:21Z

Hello,

I'm filing this dummy pull request so that my branch adding Elasticsearch support will be visible from the main repo. There are four outstanding tasks before it's ready to merge:

1.) mapping returned values to native types
2.) case-insensitive searching
3.) escaping metacharacters in contains regex
4.) abstract backref support
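
For task 3, a minimal sketch of what the escaping could look like, assuming the contains operator is translated into an ES regexp query. The helper names are hypothetical and not taken from the branch. ES regexps use Lucene syntax, whose patterns are always anchored, which is why the value is wrapped in `.*`:

```python
# Characters that are reserved in Lucene regular expression syntax.
# The last six (# @ & < > ~) are only special when optional operators
# are enabled, but escaping them is harmless.
_LUCENE_REGEX_RESERVED = set('.?+*|{}[]()"\\#@&<>~')


def escape_lucene_regex(text):
    """Backslash-escape every reserved character so it matches literally."""
    return ''.join(
        '\\' + ch if ch in _LUCENE_REGEX_RESERVED else ch
        for ch in text
    )


def contains_pattern(value):
    """Build an anchored pattern that matches any string containing value."""
    return '.*' + escape_lucene_regex(value) + '.*'
```

Python's `re.escape` is not a drop-in replacement here, since it targets Python's regex dialect rather than Lucene's.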

I plan on working on these this week and will hopefully have a final PR soon.

Cheers,
Fitz

* This is essentially just a search and replace on mongostorage.py

* I assumed that a modular-odm collection was the same thing as an Elasticsearch index. Nope. A collection is the doc-type in ES parlance. Add a required es_index attribute to the ElasticsearchStorage constructor and update ES calls to use the correct nomenclature.

* The test ES storage object now provides es_index.

* ElasticsearchQuerySet.data is an array, not a real cursor. Stop calling count() on it, just use len() instead.

* ES search is only near-real-time, so we have to refresh the index to make sure all of the fixtures have been inserted before searching.

* Get find(), find_one(), update(), remove() working properly

* Turn modular-odm query objects into filter structures suitable for passing to Elasticsearch. We now support everything in the tests except for icontains.

* Since we're storing a list of results rather than an actual cursor, our queryset implementation is basically the same as Pickle's rather than Mongo's.

* ES is returning integer ID fields as strings instead of integers. Add a stub _to_native_types method where casting will take place.

* delete_by_query() won't accept a plain filter as the query body. Instead, pass a "filtered" query that does a match_all + filter

* Backrefs are stored as tuples of (id, ref_name). If id is an integer, then Elasticsearch assumes all elements in the tuple will be integers and chokes when it encounters ref_name (a string). For now, explicitly cast the first element of a tuple as a str().

* Elasticsearch search is only near-real-time, so searching immediately after save() may not return up-to-date results. Add a default no-op refresh() method to Storage() and implement it for the Elasticsearch backend. Call after save().
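
The refresh() hook described in the last item can be sketched roughly as follows. This is a trimmed-down illustration, not the code from the branch: the constructor arguments are assumptions, and the only client call relied on is `indices.refresh(index=...)` from the elasticsearch-py client.

```python
class Storage(object):
    """Stand-in for the storage base class (only the relevant method)."""

    def refresh(self):
        """Default no-op: most backends make writes visible immediately."""
        pass


class ElasticsearchStorage(Storage):
    """Hypothetical, trimmed-down backend used only for illustration."""

    def __init__(self, client, es_index, collection):
        self.client = client          # e.g. an elasticsearch.Elasticsearch instance
        self.es_index = es_index      # name of the ES index
        self.collection = collection  # ES doc-type for this collection

    def refresh(self):
        # Ask ES to make recently indexed documents searchable.
        self.client.indices.refresh(index=self.es_index)
```

Callers can then invoke `storage.refresh()` after every save without knowing which backend is in use.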

felliott · 2014-07-07T12:14:01Z

Hello,

I've updated this PR with elasticsearch v1.0 support. It's still failing three tests:

test_foreign_queries.py:test_eq_abstract
test_foreign_queries.py:test_eq_abstract_list
test_string_operators.py:test_icontains

The test_eq_abstract tests are failing because searching for an array like ["b3e4d", "foo"] in ES has match-any semantics. Since the "foo" key is present in all of the stored backrefs, it will falsely return a match. More details and possible solutions are here:

http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_finding_multiple_exact_values.html
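
To illustrate the difference, here is a rough sketch of the two filter shapes in ES 1.x filter syntax. The function names and the `ref` field name are made up for the example, and the second form is one possible direction rather than a tested fix for these tests:

```python
def any_of_filter(field, values):
    """A terms filter matches a document if ANY listed value is present."""
    return {"terms": {field: list(values)}}


def all_of_filter(field, values):
    """One term filter per value under bool/must requires ALL values."""
    return {"bool": {"must": [{"term": {field: v}} for v in values]}}


# With ["b3e4d", "foo"], the first filter matches every document whose
# field contains "foo"; the second also requires "b3e4d" to be present.
loose = any_of_filter("ref", ["b3e4d", "foo"])
strict = all_of_filter("ref", ["b3e4d", "foo"])
```

Note that the second form still means "contains all of these values", not "equals exactly this array", so a document with extra values in the field would also match.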

The icontains test is failing because the default analyzer in ES is case sensitive. To be able to search both case-sensitive and case-insensitive, you'll need to set up a custom analyzer on init.
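
One way such a setup might look, as a sketch only: the body below would be passed when creating the index, and it keeps an exact copy of a string field alongside a lowercased sub-field. The analyzer name, sub-field name, and helper are all hypothetical, and the mapping assumes the ES 1.x `string` type with `fields` sub-fields.

```python
def build_index_body(doc_type, field):
    """Build an index-creation body with exact and lowercased variants of a field."""
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    # Keep the whole value as a single token, lowercased.
                    "lowercase_keyword": {
                        "type": "custom",
                        "tokenizer": "keyword",
                        "filter": ["lowercase"],
                    }
                }
            }
        },
        "mappings": {
            doc_type: {
                "properties": {
                    field: {
                        "type": "string",
                        "index": "not_analyzed",  # exact, case-sensitive
                        "fields": {
                            "lower": {            # queried as "<field>.lower"
                                "type": "string",
                                "analyzer": "lowercase_keyword",
                            }
                        },
                    }
                }
            }
        },
    }
```

A case-insensitive query would then lowercase its search value and target the `lower` sub-field, while case-sensitive queries keep using the main field.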

I'm not sure I'll be able to spend more time on this, but if anyone else would like to use the code or ask me some questions, feel free!

Cheers,
Fitz

jmcarp · 2014-07-07T13:41:11Z

Thanks for the updates! Will look into the remaining failing tests and merge when I get a chance.

felliott added 23 commits July 7, 2014 07:28

add elasticsearch daemon task

ce668bf

first pass at new elasticsearchstorage.py

e1ce884

* This is essentially just a search and replace on mongostorage.py

add Elasticsearch test object

031cdc8

fix crash in ElasticsearchQuerySet

5e2f3e5

* ElasticsearchQuerySet.data is an array, not a real cursor. Stop calling count() on it, just use len() instead.

delete ES test index when done

2f3968e

refresh ES index after inserting fixtures

44b53b0

* ES search is only near-real-time, so we have to refresh the index to make sure all of the fixtures have been inserted before searching.

fill out storage retrieval methods

51c7e16

* Get find(), find_one(), update(), remove() working properly

add search by query functionality

6932b08

* Turn modular-odm query objects into filter structures suitable for passing to Elasticsearch. We now support everything in the tests except for icontains.

EsQuerySet now copies PickleQuerySet

dc0fec8

* Since we're storing a list of results rather than an actual cursor, our queryset implementation is basically the same as Pickle's rather than Mongo's.

add stub for converting ES resp. to native types

5b221ce

* ES is returning integer ID fields as strings instead of integers. Add a stub _to_native_types method where casting will take place.

doc/style cleanup

fdb87ea

fix ES delete query syntax

633a2ea

* delete_by_query() won't accept a plain filter as the query body. Instead, pass a "filtered" query that does a match_all + filter
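
A sketch of the query body this commit message describes, in ES 1.x syntax. The helper name and the flag are illustrative only; the extra top-level "query" wrapper reflects the ES 1.0 requirement for delete-by-query bodies.

```python
def build_delete_body(es_filter, wrap_for_v1=True):
    """Wrap a plain filter in a filtered query (match_all + filter)."""
    filtered = {
        "filtered": {
            "query": {"match_all": {}},
            "filter": es_filter,
        }
    }
    # ES 1.0 expects the query under a top-level "query" key;
    # earlier versions took the query itself as the body.
    return {"query": filtered} if wrap_for_v1 else filtered


body = build_delete_body({"term": {"_id": "1"}})
```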

fix linting complaints in test_simple_queries

a2a8863

make Elasticsearch evaluate lazily

bf355de

fix lint complaints in elasticsearch storage

61dd33e

add NullHandler to silence ES.trace logging

d989498

fix foolish unpacking of ES matches

374c8df

add refresh() method to Storage base

3b170cb

* Elasticsearch search is only near-real-time, so searching immediately after save() may not return up-to-date results. Add a default no-op refresh() method to Storage() and implement it for the Elasticsearch backend. Call after save().

require ES v1.0 or greater

9dda082

add option to tasks.py to run ES as daemon

31c8ba0

update ES delete syntax for v1.0

2b50b38
