Index update, Phase 2: Use new large_user_postcode index#572
Merged
Conversation
974b49a to
3c31f6c
Compare
sihugh
reviewed
Apr 23, 2025
Contributor
sihugh
left a comment
There was a problem hiding this comment.
Looks like a nice change. I had a couple of questions:
- is the migration quick enough to make it through a standard deploy? (We've had some recent ones where it's hit a timeout and had to be decoupled into schema updates and rake tasks, or some such arrangement)
- can we drop any of the other indices after this one has gone in if we're worried about write performance?
Contributor
Author
Good question - I think so, because I have deployed this to integration. Let me do that again, see how long the migration takes.
I think we're not worried per se about write performance, I just mention it for completeness. Writing is relatively stable (we have a fixed rate of 3 writes per second, except during ONSPD imports), and it's all done in workers so doesn't affect the end user. But we can certainly audit the indices somehow to confirm whether there are any vestigial ones. |
sihugh
approved these changes
Apr 23, 2025
3c31f6c to
71058b0
Compare
71058b0 to
8b61fa4
Compare
8b61fa4 to
b482bf8
Compare
b482bf8 to
c78722e
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The updates added in PR #556 included a naive method of identifying relevant ONSPD records to update (fetching a list which had to be filtered by a select statement), resulting in a notable bump in CPU usage by the worker threads. This PR moves the "size" element from the free-form JSON results field into a specific field which can be indexed and searched in a single search.
Speed of the query code as reported by benchmark:
Before:
0.473827 s / call
After
0.000685 s / call
(Speed improvement of 690x on this code, which is called once a second)
Obviously there's a cost in writing, because the index has to be updated, so it's not plain sailing, but the benefits should outweigh the cost, we can monitor that on CPU usage.
Integration worker CPU and memory load over the last two weeks - low before PR 556, high after it, then low again after this PR is temporarily deployed - CPU and memory load seem to be marginally better even than before PR 556
Actual index and migration task added in #610
https://trello.com/c/Pf89f3wr/350-improve-worker-cpu-usage-on-locations-api, Jira issue PNP-6398
Follow these steps if you are doing a Rails upgrade.