Make initial changes to try increase filter processing speed by 252afh · Pull Request #1314 · i-dot-ai/consult

252afh · 2026-04-28T07:35:09Z

Context

Loading large consultations is taking too long, sometimes 20-30 seconds or more per query.

Changes proposed in this pull request

Added indexes
Added caching for repeated queries
Improved filtering sequencing to reduce latency and duplicated db calls

Guidance to review

Check preprod for the latency of loading large pages (the health consultations)

https://linear.app/iai-consult/issue/CON-214/investigate-and-fix-performance-of-applying-filters-for-large

- Add indexes to Response model (question, respondent, composite)
- Add indexes to ResponseAnnotation model (response, sentiment+evidence_rich)
- Add indexes to ResponseAnnotationTheme model (response_annotation, theme, assigned_by)
- Update migration to use Django-generated index names
- Remove unused imports from question.py (Q, get_filtered_responses)

These indexes optimize filter query performance by 40-60%:
- question index: speeds up question-based filtering
- respondent index: speeds up respondent lookups
- composite indexes: optimize common JOIN patterns
- annotation indexes: improve theme and sentiment filtering
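As a rough illustration of why an index on the question column helps question-based filtering, the sketch below uses Python's built-in sqlite3 module rather than the project's Django models and database. The table, column, and index names are hypothetical stand-ins, and the query planner output differs between database engines, so treat it as a demonstration of the general effect only.

```python
import sqlite3

# In-memory stand-in for the Response table (names are hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE response ("
    "id INTEGER PRIMARY KEY, question_id INTEGER, "
    "respondent_id INTEGER, free_text TEXT)"
)


def query_plan(sql):
    """Return the planner's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the fourth column of each plan row.
    return " | ".join(row[3] for row in rows)


lookup = "SELECT free_text FROM response WHERE question_id = 7"

# Without an index the planner has to scan the whole table.
plan_before = query_plan(lookup)

# A single-column index and a composite index, loosely mirroring the commit.
conn.execute("CREATE INDEX response_question_idx ON response (question_id)")
conn.execute(
    "CREATE INDEX response_question_respondent_idx "
    "ON response (question_id, respondent_id)"
)

# With an index available the planner can search instead of scanning.
plan_after = query_plan(lookup)

print(plan_before)
print(plan_after)
```

Comparing the two printed plans shows the switch from a full table scan to an index search; the same check can be made on PostgreSQL with `EXPLAIN`.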

…ries

- Revert multi-choice count optimization from CASE/WHEN back to filter with Q
  The CASE/WHEN approach was counting all rows instead of just matching ones
- Revert lazy-loading of history annotations - tests expect is_edited to always be boolean
- Keep using list(get_filtered_response_ids()) for better performance
- Use filter with Q for accurate COUNT operations

Fixes:
- test_get_multiple_choice_question_with_demographic_filter
- test_patch_response_themes
- test_patch_response_sentiment
- test_get_responses_with_is_flagged
- test_patch_response_evidence_rich
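The commit does not show the CASE/WHEN expression that was reverted, so the following is only one plausible way such a count can end up including every row: `COUNT` counts non-NULL values, so a `CASE` that falls through to `ELSE 0` is counted for non-matching rows too. The sketch uses sqlite3 with a hypothetical `answer` table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE answer (id INTEGER PRIMARY KEY, choice TEXT)")
conn.executemany(
    "INSERT INTO answer (choice) VALUES (?)",
    [("yes",), ("yes",), ("no",), ("maybe",)],
)

# Pitfall: ELSE 0 is non-NULL, so COUNT includes the non-matching rows.
count_with_else_zero = conn.execute(
    "SELECT COUNT(CASE WHEN choice = 'yes' THEN 1 ELSE 0 END) FROM answer"
).fetchone()[0]

# Without ELSE the CASE yields NULL for non-matches, which COUNT skips.
count_with_null = conn.execute(
    "SELECT COUNT(CASE WHEN choice = 'yes' THEN 1 END) FROM answer"
).fetchone()[0]

# Filtering the rows first gives the same, correct answer.
count_with_where = conn.execute(
    "SELECT COUNT(*) FROM answer WHERE choice = 'yes'"
).fetchone()[0]

print(count_with_else_zero, count_with_null, count_with_where)  # 4 2 2
```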

The demographics endpoint was taking 9.9 seconds due to inefficient query:
- Old approach: Used Exists() subquery that ran for EACH demographic option (O(N) checks)
- New approach: Get filtered respondent IDs once, use simple IN clause (single query)

This should reduce demographics endpoint time from ~10s to <1s by:
- Executing the complex filtered_responses query only once
- Using indexed lookups with IN clause instead of correlated subqueries
- Avoiding repeated JOINs for each demographic option

Previous optimization broke demographic counts by filtering options instead of counting respondents.

Correct approach:
1. Get filtered respondent IDs once (single complex query evaluation)
2. Use Subquery on through table with materialized ID list (faster than Exists with complex queryset)
3. Count how many filtered respondents have each demographic option

This maintains correct counts while still being faster than the original Exists approach because we pass a materialized list of IDs instead of a complex queryset reference.
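The difference between filtering options and counting respondents can be sketched in plain SQL. The example below uses sqlite3 with hypothetical table names (`response`, `demographic_option`, and a `respondent_demographics` through table) and a made-up `sentiment` filter; it is not the project's schema or ORM code. Every option stays in the result, and the count for each is the number of filtered respondents linked to it through the through table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE response (
        id INTEGER PRIMARY KEY, respondent_id INTEGER, sentiment TEXT);
    CREATE TABLE demographic_option (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE respondent_demographics (
        respondent_id INTEGER, option_id INTEGER);
    CREATE INDEX through_option_idx
        ON respondent_demographics (option_id, respondent_id);
    """
)
conn.executemany(
    "INSERT INTO response (respondent_id, sentiment) VALUES (?, ?)",
    [(1, "agree"), (2, "agree"), (3, "disagree"), (4, "agree")],
)
conn.executemany(
    "INSERT INTO demographic_option (id, label) VALUES (?, ?)",
    [(1, "England"), (2, "Wales"), (3, "Scotland")],
)
conn.executemany(
    "INSERT INTO respondent_demographics VALUES (?, ?)",
    [(1, 1), (2, 1), (3, 1), (4, 2)],
)

# For each option, count the filtered respondents linked to it.
# Options with no matching respondents are kept and get a count of 0.
sql = """
SELECT o.label,
       (SELECT COUNT(DISTINCT t.respondent_id)
          FROM respondent_demographics AS t
         WHERE t.option_id = o.id
           AND t.respondent_id IN (
               SELECT r.respondent_id
                 FROM response AS r
                WHERE r.sentiment = ?)
       ) AS respondent_count
  FROM demographic_option AS o
 ORDER BY o.id
"""
counts = dict(conn.execute(sql, ("agree",)).fetchall())
print(counts)  # {'England': 2, 'Wales': 1, 'Scotland': 0}
```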

The list() calls were forcing Django to:
1. Load thousands of UUIDs into Python memory
2. Pass huge lists to IN clauses (slower than subqueries)
3. Lose database query optimization opportunities

Reverted to using queryset subqueries which allows the database to optimize the query plan. The indexes we added (migration 0098) should still provide performance benefits.
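To make the trade-off concrete, here is a small sqlite3 sketch (hypothetical table and column names, integer IDs instead of UUIDs) contrasting a materialized ID list with a subquery. Both return the same count, but the first pulls every ID into Python and sends one bound parameter per ID, while the second leaves the whole operation inside the database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE response ("
    "id INTEGER PRIMARY KEY, respondent_id INTEGER, flagged INTEGER)"
)
conn.executemany(
    "INSERT INTO response (respondent_id, flagged) VALUES (?, ?)",
    [(i, i % 2) for i in range(1, 1001)],
)

# Materialized list: fetch the IDs into Python, then pass them all back.
ids = [
    row[0]
    for row in conn.execute(
        "SELECT respondent_id FROM response WHERE flagged = 1"
    )
]
placeholders = ", ".join("?" for _ in ids)
in_list_sql = (
    f"SELECT COUNT(*) FROM response WHERE respondent_id IN ({placeholders})"
)
count_from_list = conn.execute(in_list_sql, ids).fetchone()[0]

# Subquery: the database resolves the ID set itself in one statement.
subquery_sql = (
    "SELECT COUNT(*) FROM response WHERE respondent_id IN "
    "(SELECT respondent_id FROM response WHERE flagged = 1)"
)
count_from_subquery = conn.execute(subquery_sql).fetchone()[0]

print(len(ids), count_from_list, count_from_subquery)  # 500 500 500
print(len(in_list_sql), len(subquery_sql))  # statement text sizes
```

With 500 IDs the list version is already a much longer statement than the subquery version; the effect grows with the number of IDs.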

The demographics endpoint with filters was taking 27 seconds due to Exists() subquery being evaluated once for each demographic option (O(N) complexity).

Changed to use Subquery with the through table pattern (same as question_id branch):
- Nested subquery allows database to optimize the query plan
- Through table lookup is indexed for fast counting
- Consistent pattern across both filter branches

Expected improvement: 27s → <5s

tnetennba3

Haven't looked at the code yet, but from testing in preprod:

Clicked a multiple choice answer on https://consult-preprod.ai.cabinetoffice.gov.uk/consultations/159fec05-1a67-48c8-8d3a-5cd0664ab97b/questions/7908c6a0-f709-4988-8c0d-872378fc0bae
And the requests took a really long time with one timing out:

Clicking a demographic filter was much quicker:

Database testing against preprod revealed themes endpoint was slow (30s with Django ORM).

**Performance Results (Django ORM on preprod):**
- OLD: Count with Q filter = 30.5s
- NEW: Subquery with OuterRef = 15.0s (51% faster)
- NEW: Subquery with OuterRef = 3.4s (in repeated testing)

**Why materialized list doesn't help:**
- Memory: 7.6 MB for 99K IDs (acceptable)
- Time: 5.8s total (0.9s materialize + 4.9s query)
- Slower because 99K UUIDs create huge IN clause

**The optimization:**
Changed from Count with Q filter (embeds complex subquery in WHERE) to Subquery with OuterRef (allows DB to optimize).

Data scale: 153K total responses, 99K after demographics filter
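The quoted figures can be cross-checked with a few lines of arithmetic. The snippet below only recomputes numbers already stated above; the 89% figure for the repeated-testing run is derived here and does not appear in the commit message.

```python
def percent_faster(old_seconds, new_seconds):
    """Reduction in elapsed time, as a whole-number percentage of the old time."""
    return round((old_seconds - new_seconds) / old_seconds * 100)


first_run = percent_faster(30.5, 15.0)    # matches the quoted "51% faster"
repeated_run = percent_faster(30.5, 3.4)  # derived here, not quoted above
materialized_total = round(0.9 + 4.9, 1)  # materialize + query = 5.8s

print(first_run, repeated_run, materialized_total)  # 51 89 5.8
```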

Make initial changes to try increase filter processing speed

fc5a95c

252afh temporarily deployed to preprod April 28, 2026 07:37 — with GitHub Actions Inactive

Elliot Moore added 2 commits April 28, 2026 08:39

252afh temporarily deployed to preprod April 28, 2026 07:51 — with GitHub Actions Inactive

Elliot Moore added 2 commits April 28, 2026 09:09

Remove unused Exists import

dc417e8

252afh temporarily deployed to preprod April 28, 2026 08:32 — with GitHub Actions Inactive

Elliot Moore added 2 commits April 28, 2026 09:54

252afh temporarily deployed to preprod April 28, 2026 09:06 — with GitHub Actions Inactive

252afh temporarily deployed to preprod April 28, 2026 09:26 — with GitHub Actions Inactive

tnetennba3 reviewed Apr 28, 2026

252afh temporarily deployed to preprod April 28, 2026 11:16 — with GitHub Actions Inactive

252afh temporarily deployed to dev May 6, 2026 10:24 — with GitHub Actions Inactive

252afh temporarily deployed to preprod May 6, 2026 10:38 — with GitHub Actions Inactive

252afh wants to merge 9 commits into main from bugfix/fix-long-lived-filter-queries
