
Call "insert" method when reindexing #53

Description

@thejcannon

Realistically, there should be little difference between calling insert_foo and update_foo, since the writer's insert and update are essentially the same thing (and are interchangeable). However, there is a semantic difference between the two methods that clients might care about.

As an example, I might write my own Whoosheer that uses two models, User and Review, and is used to search for Reviews (in this example the user's name is indexed too, so people can search for reviews by the user's name).
insert_review is straightforward.
insert_user is a no-op (since there aren't any reviews for that user yet).
update_review is also straightforward.
update_user is a bit more complex: in update_user I loop through all the user's reviews and update the document for each review.
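
The four hooks described above might look roughly like the following sketch. It is deliberately self-contained: `User`, `Review`, `RecordingWriter`, and the document field names are hypothetical stand-ins for the SQLAlchemy models and the Whoosh index writer, and a real whoosheer would also subclass flask_whooshee's base class and declare its schema and models.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    id: int
    name: str
    reviews: list = field(default_factory=list)


@dataclass
class Review:
    id: int
    text: str
    user: User


class RecordingWriter:
    """Hypothetical stand-in for an index writer; it only records calls."""

    def __init__(self):
        self.calls = []

    def add_document(self, **fields):
        self.calls.append(("add", fields))

    def update_document(self, **fields):
        self.calls.append(("update", fields))


def review_document(review):
    # One document per review, with the author's name denormalized into it.
    return {"review_id": review.id, "text": review.text, "user_name": review.user.name}


class ReviewWhoosheer:
    @classmethod
    def insert_review(cls, writer, review):
        writer.add_document(**review_document(review))

    @classmethod
    def insert_user(cls, writer, user):
        pass  # a brand-new user has no reviews to index yet

    @classmethod
    def update_review(cls, writer, review):
        writer.update_document(**review_document(review))

    @classmethod
    def update_user(cls, writer, user):
        # The user's name lives in every review document, so refresh them all.
        for review in user.reviews:
            writer.update_document(**review_document(review))
```

The cost of `update_user` grows with the number of reviews a user has, which is what makes an early exit attractive.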

Now if a user changes something that doesn't affect the index (say, their email preferences), the update_user code still triggers, and I loop through all their reviews and update the documents for a net-zero change. To mitigate this, I've implemented a short-circuit in the update_foo methods that bails early if none of the fields we care about have changed. I've left the insert_foo methods unchanged because we'll always add the document when a Review gets inserted.

Now, when I reindex, flask_whooshee calls update_foo instead of insert_foo, so my "clever" code notices that no fields have changed and bails early, leaving my index barren and dry.
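
The failure can be reproduced in miniature. The sketch below is an assumption-laden model, not flask_whooshee's code: the index is a plain dict, `Review.dirty` is a hypothetical stand-in for SQLAlchemy's change tracking, and `reindex` simply rebuilds an empty index by calling whichever hook it is given.

```python
TRACKED_FIELDS = ("text",)


class Review:
    """Hypothetical stand-in for a model; `dirty` mimics pending attribute changes."""

    def __init__(self, id, text):
        self.id = id
        self.text = text
        self.dirty = set()  # rows freshly loaded from the database have no pending changes


def insert_review(index, review):
    index[review.id] = review.text


def update_review(index, review):
    if not any(name in review.dirty for name in TRACKED_FIELDS):
        return  # short-circuit: nothing relevant changed
    index[review.id] = review.text


def reindex(reviews, hook):
    index = {}  # assumption: reindexing rebuilds the index from scratch
    for review in reviews:
        hook(index, review)
    return index


reviews = [Review(1, "great"), Review(2, "meh")]
via_update = reindex(reviews, update_review)  # every call bails early
via_insert = reindex(reviews, insert_review)  # every review gets indexed
```

With the update hook the rebuilt index ends up empty, while the insert hook indexes every review, which is the behavior this issue asks for.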


P.S. If you're interested in the "unchanged fields" optimization, it looks something like this (I've implemented it as a decorator, but omitted that part for brevity):
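from sqlalchemy import inspect

fields_i_care_about = [...]

def update_foo(cls, writer, target):
    # Attribute names on target that have no uncommitted changes.
    unmodified_fields = inspect(target).unmodified
    # Bail early if none of the fields we care about has changed.
    if all(key in unmodified_fields for key in fields_i_care_about):
        return

    ...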
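
The decorator form was omitted from the snippet above; one possible shape is sketched here. The name `skip_if_unchanged` and the injectable `get_unmodified` callable are assumptions made for illustration (the callable keeps the example runnable without a database), and with SQLAlchemy one would presumably pass `lambda target: inspect(target).unmodified`.

```python
import functools


def skip_if_unchanged(fields, get_unmodified):
    """Skip the wrapped update hook when none of `fields` has pending changes.

    `get_unmodified(target)` must return the attribute names that have no
    pending changes (hypothetical hook point, see the note above).
    """

    def decorator(hook):
        @functools.wraps(hook)
        def wrapper(cls, writer, target):
            unmodified = get_unmodified(target)
            if all(name in unmodified for name in fields):
                return None  # nothing we index has changed
            return hook(cls, writer, target)

        return wrapper

    return decorator


class FakeUser:
    """Hypothetical stand-in for a model instance with change tracking."""

    def __init__(self, unmodified):
        self.unmodified = set(unmodified)


class DemoWhoosheer:
    updated = []

    @classmethod
    @skip_if_unchanged(["name"], lambda target: target.unmodified)
    def update_user(cls, writer, user):
        cls.updated.append(user)
        return "updated"


untouched = FakeUser({"name", "email_prefs"})  # name not modified
renamed = FakeUser({"email_prefs"})            # name has a pending change
skipped_result = DemoWhoosheer.update_user(None, untouched)
applied_result = DemoWhoosheer.update_user(None, renamed)
```

Only the update hooks would be wrapped this way; the insert hooks stay undecorated so that they always write the document.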
