fix: use `selectinload()` rather than `joinedload()` on one-to-many relationships to avoid Cartesian-product blow-up by cfm · Pull Request #7865 · freedomofpress/securedrop

cfm · 2026-06-17T23:54:35Z

Fixes #7862. Since #7628, we've explicitly "prefer[red] expensive eager queries over apparently cheaper lazy queries". That's the right instinct, but since #7604 we've overoptimized for the number of queries rather than the number of results.

Here we replace joinedload() for one-to-many relationships, which guarantees $O(1)$ queries but gives us $O(n^2)$ results, with selectinload(), which gives us queries linear in the number of relationships (i.e., tables) involved and $O(n)$ results. In the test suite, assert_query_count() now enforces a maximum rather than an exact number of queries.
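To make the row-count trade-off concrete, here is a minimal sketch using only the standard library's sqlite3 module. It is not SecureDrop's schema or the SQL that SQLAlchemy emits: the table names, column names, and counts are made up for illustration. It only contrasts the two query shapes: a single query that joins two one-to-many tables at once (the joinedload() style) versus one extra `IN (...)` query per relationship (the selectinload() style).

```python
import sqlite3

# Illustrative schema only; these are NOT SecureDrop's actual tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sources (id INTEGER PRIMARY KEY);
    CREATE TABLE submissions (id INTEGER PRIMARY KEY, source_id INTEGER);
    CREATE TABLE replies (id INTEGER PRIMARY KEY, source_id INTEGER);
""")

N_SOURCES, SUBS_PER_SOURCE, REPLIES_PER_SOURCE = 3, 40, 15
for source_id in range(1, N_SOURCES + 1):
    conn.execute("INSERT INTO sources (id) VALUES (?)", (source_id,))
    conn.executemany(
        "INSERT INTO submissions (source_id) VALUES (?)",
        [(source_id,)] * SUBS_PER_SOURCE,
    )
    conn.executemany(
        "INSERT INTO replies (source_id) VALUES (?)",
        [(source_id,)] * REPLIES_PER_SOURCE,
    )

# Joined style: one query, but every submission of a source is paired with
# every reply of that source, so the rows multiply.
joined_rows = conn.execute("""
    SELECT s.id, sub.id, r.id
    FROM sources s
    LEFT JOIN submissions sub ON sub.source_id = s.id
    LEFT JOIN replies r ON r.source_id = s.id
""").fetchall()

# Select-IN style: one query for the parents, then one per relationship.
source_ids = [row[0] for row in conn.execute("SELECT id FROM sources")]
placeholders = ",".join("?" * len(source_ids))
sub_rows = conn.execute(
    f"SELECT id, source_id FROM submissions WHERE source_id IN ({placeholders})",
    source_ids,
).fetchall()
reply_rows = conn.execute(
    f"SELECT id, source_id FROM replies WHERE source_id IN ({placeholders})",
    source_ids,
).fetchall()
selectin_rows = len(source_ids) + len(sub_rows) + len(reply_rows)

print("joined:   1 query, ", len(joined_rows), "rows")
print("selectin: 3 queries,", selectin_rows, "rows")
```

With these made-up counts the joined query returns 3 × 40 × 15 rows, while the three separate queries return 3 + 120 + 45 rows in total.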

Test plan

Have a large server.
- Bonus: Have a large production server (either hardware or VM).

On develop:

Log in with the Inbox, sync a bunch, and observe high memory usage.

On this branch (or, on a production installation, just overwrite models.py from this branch):

Restart the server.
Log in with the Inbox, sync a bunch, and observe lower memory usage.

Copilot

Pull request overview

This PR adjusts eager-loading strategy in the SecureDrop SQLAlchemy models and related APIv2 tests to avoid Cartesian-product row blow-ups from joinedload() on one-to-many relationships, switching to selectinload() while keeping eager loading behavior.

Changes:

Replace joinedload() with selectinload() for key one-to-many relationships (e.g., Source.submissions, Source.replies, and various seen_* collections).
Update APIv2 query-count test helper to enforce an upper bound (maximum) instead of an exact query count.
Update affected query-count expectations in APIv2 journalist tests to reflect the new eager-loading behavior.
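The "upper bound instead of exact count" idea can be sketched as a small context manager. This is not the project's assert_query_count(), whose implementation is not shown here; the name `assert_max_queries` is hypothetical, and the sketch counts statements on a plain sqlite3 connection through its trace callback rather than hooking SQLAlchemy.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def assert_max_queries(conn, maximum):
    """Hypothetical helper: fail if more than `maximum` SELECTs run in the block."""
    seen = []

    def record(statement):
        # Count only SELECTs so implicit transaction statements are ignored.
        if statement.lstrip().upper().startswith("SELECT"):
            seen.append(statement)

    conn.set_trace_callback(record)
    try:
        yield seen
    finally:
        conn.set_trace_callback(None)
    # An upper bound: running fewer queries than `maximum` still passes.
    assert len(seen) <= maximum, (
        f"expected at most {maximum} queries, saw {len(seen)}"
    )


demo = sqlite3.connect(":memory:")
with assert_max_queries(demo, maximum=3) as seen:
    demo.execute("SELECT 1").fetchall()
    demo.execute("SELECT 2").fetchall()
print(len(seen), "queries, within the bound of 3")
```

A bound like this tolerates a loader that skips a query when there is nothing to load, while still catching an unexpected increase in query count.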

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File	Description
`securedrop/models.py`	Switches eager-loading options from `joinedload()` to `selectinload()` on one-to-many relationships to reduce result-set multiplication.
`securedrop/tests/test_journalist_api2.py`	Changes `assert_query_count()` semantics to “max queries” and updates several expectations accordingly.

cfm · 2026-06-18T00:01:58Z

Log in with the Inbox, sync a bunch, and observe lower memory usage.

How much lower, you ask? Claude generated a profiling wrapper that found a difference of more than an order of magnitude:

RESULT strategy=joined    sources=113    submissions=4652    replies=1626    items=0       cartesian_rows~=93304     python_peak= 5926.8 MB elapsed=45.43 s
RESULT strategy=selectin  sources=113    submissions=4652    replies=1626    items=0       cartesian_rows~=93304     python_peak=   83.7 MB elapsed= 1.19 s

DELTA joined -> selectin: 5926.8 MB -> 83.7 MB (70.8x less python-heap peak)
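The profiling wrapper itself is not included in this thread. As a rough sketch of how a Python-heap peak and elapsed time like those above could be collected, the standard library's tracemalloc and time modules are enough. The function name `measure_peak` and the synthetic workloads below are assumptions for illustration only; they stand in for materializing a large versus a small result set and do not reproduce the numbers reported here.

```python
import time
import tracemalloc


def measure_peak(build):
    """Hypothetical helper: run `build()` and report peak traced heap and time."""
    tracemalloc.start()
    started = time.perf_counter()
    result = build()
    elapsed = time.perf_counter() - started
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    # Peak is reported in MiB; only allocations made by Python are traced.
    return result, peak_bytes / 2**20, elapsed


def fake_rows(n):
    # Stand-in for materializing `n` result rows of three columns each.
    return [(i, i + 1, i + 2) for i in range(n)]


_, big_peak, big_elapsed = measure_peak(lambda: fake_rows(200_000))
_, small_peak, small_elapsed = measure_peak(lambda: fake_rows(2_000))

print(f"large result set: peak={big_peak:.1f} MiB elapsed={big_elapsed:.3f} s")
print(f"small result set: peak={small_peak:.1f} MiB elapsed={small_elapsed:.3f} s")
```

Because tracemalloc only sees Python-level allocations, a measurement like this says nothing about memory held by the database driver or by Apache and mod_wsgi, which is why a test on a real server is still needed.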

We know from the test suite that this change maintains the correctness of the API, so what we're interested in now is seeing what the optimization actually looks like under Apache and mod_wsgi.

legoktm · 2026-06-18T20:56:17Z

This looks really promising, thanks. Oops on optimizing in the wrong direction!!

Code looks good, I'll test it on a prod server on Monday.

cfm added this to the SecureDrop 2.17.0 milestone Jun 17, 2026

cfm self-assigned this Jun 17, 2026

cfm added this to SecureDrop Jun 17, 2026

cfm moved this to In Progress in SecureDrop Jun 17, 2026

cfm requested a review from Copilot June 17, 2026 23:54

Copilot started reviewing on behalf of cfm June 17, 2026 23:55

Copilot AI reviewed Jun 17, 2026

Comment thread securedrop/models.py

Comment thread securedrop/tests/test_journalist_api2.py

cfm moved this from In Progress to Ready For Review in SecureDrop Jun 18, 2026

cfm marked this pull request as ready for review June 18, 2026 00:02

cfm requested a review from a team as a code owner June 18, 2026 00:02

cfm mentioned this pull request Jun 18, 2026

query_options(base) is unused #7866

Open

nathandyer assigned legoktm and unassigned cfm Jun 18, 2026

nathandyer requested a review from legoktm June 18, 2026 16:09

legoktm moved this from Ready For Review to Under Review in SecureDrop Jun 18, 2026

cfm wants to merge 1 commit into develop from 7862-descartes-before-the-horse
