-
-
Notifications
You must be signed in to change notification settings - Fork 132
Add support for "true" cursor based pagination in connections #730
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for "true" cursor based pagination in connections #730
Conversation
Add support for nested cursor-pagination connections
Reviewer's Guide by SourceryThis pull request introduces Updated class diagram for StrawberryDjangoQuerySetConfigclassDiagram
class StrawberryDjangoQuerySetConfig {
-optimized: bool
-optimized_by_prefetching: bool
-type_get_queryset_did_run: bool
-ordering_descriptors: list[OrderingDescriptor] | None
}
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @diesieben07 - I've reviewed your changes - here's some feedback:
Overall Comments:
- Consider adding a section to the documentation that compares and contrasts
ListConnection
andDjangoCursorConnection
, highlighting when each should be used. - It might be worth adding a note about the implications of strictly ordered QuerySets in the documentation.
Here's what I looked at during the review
- 🟡 General issues: 1 issue found
- 🟢 Security: all looks good
- 🟡 Testing: 1 issue found
- 🟡 Complexity: 1 issue found
- 🟢 Documentation: all looks good
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
## Cursor based connections | ||
|
||
As an alternative to the default `ListConnection`, `DjangoCursorConnection` is also available. | ||
It supports pagination through a Django `QuerySet` via "true" cursors. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: Clarify the meaning of "true cursors". Suggest using "offset-based cursors" vs "range-based cursors" to distinguish the approaches.
The term "true cursors" might be confusing to users. Using more descriptive terms like "offset-based cursors" and "range-based cursors" would improve clarity.
Suggested implementation:
It supports pagination through a Django `QuerySet` via range-based cursors.
`ListConnection` uses offset-based cursors (slicing) to achieve pagination, which can negatively affect performance for huge datasets,
strawberry_django/relay_cursor.py
Outdated
return [getattr(obj, descriptor.attname) for descriptor in descriptors] | ||
|
||
|
||
def build_tuple_compare( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (complexity): Consider refactoring the complex, nested logic in build_tuple_compare
and apply_cursor_pagination
into smaller, well-named helper functions to improve readability and maintainability by reducing nested code blocks and improving readability..
Consider splitting the complex, nested logic into smaller helper functions. For example, the loop in build_tuple_compare
mixes comparator and equality handling with nested conditionals. You can extract the "compare" logic into a helper:
def _compare_field(descriptor: OrderingDescriptor, field_value: Any, before: bool) -> Q:
value_expr = Value(field_value, output_field=descriptor.order_by.expression.output_field)
comparator = descriptor.get_comparator(value_expr, before)
eq = descriptor.get_eq(value_expr)
if comparator is None:
return eq
return comparator | (eq & Q())
def build_tuple_compare(
descriptors: list[OrderingDescriptor],
cursor_values: list,
before: bool,
) -> Q:
comparators = [
_compare_field(descriptor, value, before)
for descriptor, value in zip(reversed(descriptors), reversed(cursor_values))
]
# Combine comparators with an 'AND' chain
current = Q()
for comp in comparators:
current &= comp
return current
Similarly, consider isolating parts of the slicing logic in apply_cursor_pagination
into a small helper. For example:
def _apply_slice(qs: QuerySet, slice_: slice, related_field_id: Optional[str]) -> QuerySet:
if related_field_id:
offset = slice_.start or 0
return apply_window_pagination(qs, related_field_id=related_field_id, offset=offset, limit=slice_.stop - offset)
return qs[slice_]
# Then in apply_cursor_pagination replace:
if slice_ is not None:
qs = _apply_slice(qs, slice_, related_field_id)
This approach keeps all functionality intact while reducing nested code blocks and improving readability.
…rsor-based-connection
I don't know why the Typing test fails. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #730 +/- ##
==========================================
+ Coverage 88.45% 89.16% +0.70%
==========================================
Files 42 45 +3
Lines 4002 4253 +251
==========================================
+ Hits 3540 3792 +252
+ Misses 462 461 -1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job here, really appreciate your contributions ❤️
Left a couple of comments/suggestions
for more information, see https://pre-commit.ci
See #736 (comment) for the typing issue. @bellini666 As I said on Discord I have updated this for the changes in Strawberry Core, feel free to review this again or merge it if you think it's good to go! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have one last ask, and then it is an automerge.
Can we have all of this under strawberry_django.relay
?
We can do that by creating a relay
directory, rename the current relay.py
to utils.py
, move the ListConnectionWithTotalCount
in it to a connections.py
module, and put everything you have on relay_cursor
inside it. Then we export the important stuff on relay/__init__.py
. Wdyt?
We would end up with something like
strawberry_django/
relay/
__init__.py
connections.py
utils.py
Co-authored-by: Thiago Bellini Ribeiro <[email protected]>
Co-authored-by: Thiago Bellini Ribeiro <[email protected]>
for more information, see https://pre-commit.ci
Yes, 100%! That was my plan all along, I did it this way for now to make the diff easier to parse and limit the changes to what I am actually changing in this PR. I'll refactor it as you suggested. |
@bellini666 I've applied your suggestion, DjangoCursorConnection can now be imported from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ty very much! ❤️
This pull request adds a new connection class
DjangoCursorConnection
which supports efficient cursor-based pagination through any DjangoQuerySet
without relying on offset-slicing.Description
ListConnection
uses slicing to achieve pagination. This works for the general case, but can be inefficient for large datasets, because large page numbers result in a largeOFFSET
in SQL. An alternative to limit/offset pagination is cursor based pagination, which replacesOFFSET
by range queries such asQ(due_date__gt=...)
.DjangoCursorConnection
implements this approach.How it works
DjangoCursorConnection
inspects theQuerySet
and extracts its ordering parameters. It then uses those parameters to construct the cursors. For example, fororder_by("due_date", "pk")
the ordering parameters would bedue_date
andpk
. If the ordering parameter is an expression or not a direct field on the model (e.g.order_by(Upper("name"))
ororder_by("project__name")
, a new annotation will be added to the queryset, mirroring the ordering expression, so that the value can be extracted later.The extracted values are then encoded into a cursor.
When paginating, the cursor is deconstructed into its parts again and those parts are then used to build a pagination filter.
For example, when ordering by
"due_date", "pk"
, the cursor might contain the parts"2025-03-01"
and,"3"
. If that cursor is passed forafter
, the following filter would be constructed:Q(due_date__gt="2025-03-01") | (Q(due_date="2025-03-01") & Q(pk__gt="3"))
Serializing and deserializing the field values to strings is dedicated to the model field implementation, ensuring maximal compatibility.
Other
During the implementation I discovered a bug in
get_queryset_config
. It is used in the code as if it did a "get or create" operation, setting the config on the QuerySet if not already present. However it did not actually do so.This is likely the cause for why this hack was implemented.
While writing tests I have noticed that
strawberry_django.field(disable_optimization=True)
had no effect when used on a top level field. I have fixed this.Currently, the code lives in a separate
relay_cursor.py
file. I have chosen to do this for now to make the diff of this PR easier to parse. However I think the relay code should be refactored so thatrelay.py
is removed and we haverelay/__init__.py
instead. Then the code can be split up into multiple files and still offer the same imports. What do you think?I have fixed
get_queryset_config
, added specific tests for it and removed the (now unnecessary) hack inis_optimized_by_prefetching
.Types of Changes
Checklist