Pagination for LWS Containers #82
Add a link-based pagination model for large containers with first/last/next/prev navigation URIs. Add a pagination section to the spec, extend the container representation with pagination properties, and add pagination terms to the JSON-LD context.
|
This was discussed during the #lws meeting on 23 February 2026. View the transcript
Pull Request 82 Pagination for LWS Containers (by laurensdeb)
laurens: this one introduces a pagination mechanism |
|
"The server determines page boundaries and page size. " This is fundamentally flawed for paging. How in the world does a server know what paging size app needs. It might be thousands for time series, large data sets, analysis, etc - this is an app decision, not the server, making this approach to paging seriously flawed. |
Co-authored-by: Ted Thibodeau Jr <tthibodeau@openlinksw.com>
A server will always have the final say on the generation of an HTTP response. A client may provide hints related to what it needs, and there are mechanisms for this, such as sending I would suggest adding some text to the paging section, describing how clients can send hints related to the size of a page of container members. For example:
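One way a client might express such a hint is the HTTP `Prefer` header (RFC 7240). The sketch below only builds the request headers. The `max-member-count` parameter name is an assumption, loosely modeled on the page-size hints in LDP Paging; this PR does not define it, and a server would remain free to ignore it.

```python
def build_page_size_hint(max_members: int) -> dict[str, str]:
    """Return request headers carrying a page-size hint for a container listing.

    ASSUMPTION: `max-member-count` is a hypothetical preference parameter used
    for illustration only. A server may ignore the hint entirely.
    """
    if max_members < 1:
        raise ValueError("max_members must be positive")
    return {
        "Accept": "application/ld+json",
        "Prefer": f'return=representation; max-member-count="{max_members}"',
    }


headers = build_page_size_hint(500)
print(headers["Prefer"])
```

Because the hint is advisory, the client still has to cope with whatever page size the server actually returns.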
|
|
This is exactly why the LDP paging approach was never implemented, as a variable page size per request makes caching impossible. Are there examples in production at scale where a server is setting the page size for all requests? Also, the paging solution should apply to anything that needs paging, such as type search, not just to a container. Ultimately, with any paging (using qty), it gets down to an ordered list, the point at which you start in that list, and how many items you return from that start. Adding another unit, like Page, is not solving any problems but making them. |
|
@gibsonf1 you seem to be making two contradictory arguments. On the one hand, you want client-provided hints for variable response sizes. On the other hand, you are concerned about caching these variable responses. If a server determines the size of page responses, there are fewer issues related to caching and the implementation is considerably easier. The Prefer header feature could optionally be layered on top. Using ETags and Cache-Control headers is the standard way a server provides hints to caches, and a paging feature is no different in that regard.
I realize that you would like to expand the scope of this feature. At present, however, a Container is the only LWS resource type that is under discussion. There is no proposal for a type index. Is there a reason why this approach is incompatible with some future type index feature?
I do not understand this statement. |
The most important thing I'm saying here is that the server should not determine the paging item quantity. This is something the app should request, because it is unworkable if, for example, an app requires thousands of items in a request (such as for analytics) when the server limits items to, say, 20. The other important thing I'm saying is that a key reason why LDP paging was not implemented is the use of "page" as a unit that varies per request. If LWS uses something as a model, shouldn't we use things that are implemented and used in production at scale? LDP paging does not qualify. Ultimately, any request has to be translated into the core units (in the case of an ordered list that has a quantity): where the response should start, and how many items are in it. Why not use those two things? What in the world is introducing an opaque "page" unit doing for us? Where has this approach been used successfully at scale and in production? |
|
@gibsonf1 Note that the Github API, Facebook API and Zalando REST Guidelines all include some notion of pagination as well as the concept of a ‘page’ in a collection of entities. The use of opaque links as is proposed in this PR (and which was discussed over several WG calls) reduces the complexity of the specification text, and could be supplemented with a Prefer header to allow for client negotiation. An alternative could be to explicitly define an interface for navigating through a collection, e.g. through a cursor or offset based mechanism, which has trade-offs I won’t dive into now. If you don’t think the opaque link-based approach proposed here is the best way of navigating large containers, I would encourage you to open a PR with a proposal of how you’d envision this. |
Opaque links are more complex to specify and implement than simply having a query-string paging directive for any resource, such as ?start=20&rows=40. But given that I seem to be making no headway simplifying things by not using "page" with opaque links... The examples you list have two forms of pagination: offset-based pagination and cursor-based pagination. The current proposal is offset-based pagination with server-forced page sizes, in a format not seen in any of the listed examples and based on the non-implemented LDP paging spec. If we go in that direction, why not use one of the approaches listed, such as GitHub's offset pagination (which seems like the best one), using Link headers that include the first and last links as well? Importantly, that approach also supports changing the number of items per page directly in the request URI with the per_page key. Of course, the simplest thing to do is just use the request URI with something like ?start=20&rows=40 with the total count in the header. That would be my proposal (and the current implementation on TwinPod). |
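For readers comparing the two models, here is a minimal sketch of the query-string approach described in this comment. It is not TwinPod's implementation: the function name, the default of 20 rows, and the `X-Total-Count` header name are all assumptions made for illustration.

```python
from urllib.parse import parse_qs, urlsplit


def page_from_query(members: list[str], request_uri: str, default_rows: int = 20):
    """Slice `members` according to ?start=N&rows=M and report the total count.

    ASSUMPTION: the total is returned in an `X-Total-Count` header; the
    comment above only says "the total count in the header".
    """
    query = parse_qs(urlsplit(request_uri).query)
    start = max(int(query.get("start", ["0"])[0]), 0)
    rows = max(int(query.get("rows", [str(default_rows)])[0]), 0)
    body = members[start:start + rows]
    headers = {"X-Total-Count": str(len(members))}
    return body, headers


members = [f"/container/item-{i}" for i in range(100)]
body, headers = page_from_query(members, "/container/?start=20&rows=40")
print(len(body), body[0], headers["X-Total-Count"])
```

In this model the client computes every subsequent request itself from `start`, `rows`, and the total, so no navigation links are needed, at the cost of fixing the URI structure for every server.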
|
@gibsonf1 can you please explain how opaque links are more complex to specify and implement? To me it seems exactly the opposite. An opaque link allows an implementer to use whatever format they want, such as the one you propose without requiring all implementers to follow that same pattern (since every approach has trade-offs). From a client perspective, you just follow the opaque link, no parsing or structured analysis required. I would also remind you of the Opacity Axiom that forms the basis of web architecture. |
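The "just follow the opaque link" client behavior can be sketched as follows. The Link parser is deliberately simplified (it is not a complete RFC 8288 parser), and `fetch`, `walk_pages`, and the `pod.example` URIs are hypothetical names used only for this illustration.

```python
import re

# Matches one link-value such as: <https://pod.example/c/?cursor=x1>; rel="next"
_LINK_RE = re.compile(r'<([^>]*)>\s*;[^,]*?\brel="?([^";,]+)"?')


def parse_link_header(value: str) -> dict[str, str]:
    """Map each rel value to its target URI (simplified parser)."""
    links = {}
    for uri, rels in _LINK_RE.findall(value):
        for rel in rels.split():
            links[rel] = uri
    return links


def walk_pages(first_uri, fetch):
    """Yield members page by page, following rel="next" until it is absent.

    `fetch(uri)` is a caller-supplied function returning (members, link_header).
    The client never inspects or constructs the page URIs itself.
    """
    uri = first_uri
    seen = set()  # guard against a server that links pages in a loop
    while uri and uri not in seen:
        seen.add(uri)
        members, link_header = fetch(uri)
        yield from members
        uri = parse_link_header(link_header or "").get("next")


# Stand-in for HTTP responses so the sketch runs without a network.
FAKE_PAGES = {
    "https://pod.example/c/": (["a", "b"], '<https://pod.example/c/?cursor=x1>; rel="next"'),
    "https://pod.example/c/?cursor=x1": (["c"], '<https://pod.example/c/>; rel="first"'),
}
all_members = list(walk_pages("https://pod.example/c/", FAKE_PAGES.__getitem__))
print(all_members)
```

The same client code works whether the server uses cursors, offsets, or any other scheme behind those URIs.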
Yes, I'm proposing no links at all in the header, which is less complex, and instead only a total count. Then both apps and servers only need to look at the request query string for paging, in the form of a start position and payload quantity. Also, this approach to paging is supported on the app side by all major UI component libraries, such as Vue, making it extremely simple to implement for apps (and for servers too). |
|
@gibsonf1 please propose specification text for this as a pull request. The text would include requirements for both clients and servers. This would be an alternative to what is proposed in this PR. Please ensure that your proposal addresses the Web Architecture guidance on URI opacity and prior art in W3C and/or IETF specifications. The group as a whole can decide which approach to pursue. |
|
I see some points above being argued based on their impact on caching. As all LWS connections are meant to be secure (e.g., HTTPS or other TLS-based e2ee), caching is already of limited value, as it can only be applied to reloads of the same resource (or portion thereof) by the same client, which might include server-side caches of client requests. Caching is mostly a red herring on the modern web. |
|
Ted, the recent discussion has centered primarily on a simpler way to do paging, and I'll write up the simple approach we take (dropping all the links in the headers). As for the caching discussed earlier: yes, this is actually a big deal, especially for enterprise. In the TwinPod case, everyone who gains access using the same ACL is on the same cache. For example, if an entire company or department is given access to a hierarchy of documents, then once the first person from that company looks at them, anyone else from the same company (or department, etc.) gets the cache. So it is highly valuable for an enterprise use case. For paging, TwinPod tracks, for each resource, the ranges that have been requested with paging (until a change deletes the cache), so that no loading is needed when a paged request for that resource falls within a range already requested. So that is the generic resource cache on the server, and then there is a specific cache related to the access a user/group has for that resource, in the form of sending a 304, etc. |
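The range-tracking idea described above can be illustrated with a small coverage check. This is a rough sketch under assumed data structures (half-open item ranges kept in a list), not TwinPod's actual implementation.

```python
def is_covered(cached: list[tuple[int, int]], start: int, limit: int) -> bool:
    """Return True if items [start, start + limit) lie inside the cached ranges.

    `cached` holds half-open (begin, end) item ranges already served for a
    resource; it would be cleared whenever the resource changes.
    """
    position = start
    end = start + limit
    for begin, stop in sorted(cached):
        if begin > position:
            break  # gap before the next cached range
        position = max(position, stop)
        if position >= end:
            return True
    return position >= end


cached_ranges = [(0, 30), (30, 60)]
print(is_covered(cached_ranges, 20, 30))  # request spans two adjacent cached ranges
print(is_covered(cached_ranges, 50, 20))  # request runs past the cached data
```

A real server would also need to key this per resource and per access group, as the comment describes.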
pchampin
left a comment
Approving modulo a few remarks below.
| - **`rel="first"`**: The URI of the first page of results. MUST be present on paginated responses. | ||
| - **`rel="last"`**: The URI of the last page of results. MUST be present on paginated responses. | ||
| - **`rel="next"`**: The URI of the next page of results. MUST be present when there are subsequent | ||
| pages. MUST be omitted on the last page. | ||
| - **`rel="prev"`**: The URI of the previous page of results. MUST be present when there are preceding | ||
| pages. MUST be omitted on the first page. |
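As a reading aid, the four rules in the quoted text can be sketched as a function that builds the `Link` header value for one page. It assumes the server already knows every page URI up front, which may not hold for all pagination strategies; `pagination_links` and the `pod.example` URIs are hypothetical.

```python
def pagination_links(page_uris: list[str], index: int) -> str:
    """Build a Link header value for page `index`, following the quoted rules.

    ASSUMPTION: `page_uris` lists every page in order.
    """
    if not 0 <= index < len(page_uris):
        raise IndexError("page index out of range")
    # first and last are always present on paginated responses
    links = [(page_uris[0], "first"), (page_uris[-1], "last")]
    if index + 1 < len(page_uris):  # next is omitted on the last page
        links.append((page_uris[index + 1], "next"))
    if index > 0:  # prev is omitted on the first page
        links.append((page_uris[index - 1], "prev"))
    return ", ".join(f'<{uri}>; rel="{rel}"' for uri, rel in links)


pages = [f"https://pod.example/c/?page={name}" for name in ("a", "b", "c")]
for i in range(len(pages)):
    print(pagination_links(pages, i))
```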
I'm -1 on requiring prev and last. Those may be significantly hard to compute depending on the pagination strategy of the server.
I'm OK with merging this PR and reflecting this comment in an issue to solve later on.
This is one of the things handled by proper implementation of something like ODBC's Scrollable Cursors, as I suggested elsewhere. prev and last only need to be computed when the client has requested they be available and when the server supports them.
|
@gibsonf1 I think you are reading too much into the current text, both in terms of ambitions, and in terms of what it specifies (note that the examples are not normative, and granted, they probably need to be changed). The ambition of the PR is simply to make LWS servers "good web citizens", by not flooding clients with huge representations of containers. It is not to allow clients to fine-tune how they access the content of a container (let alone other kinds of resources). And the PR does not specify how to construct the In contrast, it seems that you propose that the spec impose one specific implementation strategy (the I'm more inclined to favor the flexibility for server implementers, while keeping the ability to advertise more capabilities in the Storage Description Resource, that clients understanding those features could discover and use when appropriate. The |
As I think I pointed out earlier, there are valid exceptions to URI opacity. The That being said, and as explained just above, I prefer the proposal in this PR over enforcing a query-parameter-based pagination. |
|
I think it may be worthwhile to adjust the way things are being discussed here, and change from referring only to unqualified generic "caching", and instead refer to qualified server-side "query caching", server-side and/or intermediary "web caching", and the like. I think it will also be worthwhile for folks to look into "scrollable cursors", as are commonly used in the relational database world. (Note -- the link just previous appears to target Microsoft SQL Server documentation, but it is really targeting the DBMS-agnostic ODBC API documentation.) This is not a new challenge, and we would be well served by learning from the solutions developed by those who've traveled this road before us, rather than painfully and painstakingly reinventing the wheels they invented in years and decades past. Especially of note —
|
Co-authored-by: Pierre-Antoine Champin <pierre-antoine@w3.org>
@pchampin Actually, very good points. This PR works well if only the next link is there, but I think the totalItems needs to be an explicit integer in the header; it seems a bit murky right now how that gets defined. As for examples, maybe we could show two examples of actually implemented paging: one request could show, for the TwinPod case, ?start=0&limit=30 (or something like that, to indicate one way things could work on the implementation side; I think limit is better than rows), and the other could be for Inrupt (once they have paging). (Right now TwinPod is using an integer in X-Total-Count in the header for the totalItems.) @TallTed For more advanced cursor-based approaches, I think maybe that could come later or be a server capability listing. |
Sure. If you read up on the ODBC/JDBC world, you'll note that there are default behaviors (these being the easiest to implement and work with), and capability-checks by which client applications can learn what cursors are supported by the driver and/or DBMS they're talking to. ODBC/JDBC drivers are not required to implement anything beyond the most basic, and client applications are expected to ask about the more advanced features they want to use, and generally handle the lowest-common-denominator, which usually just means "slower" performance. That said, it's best to plan for the cursors at the start, so that early components on both client and server side continue to work with later components. |
| ### Pagination | ||
|
|
||
| Containers may hold a large number of resources. To allow clients to retrieve container | ||
| listings incrementally, servers MUST support pagination for containers whose membership |
This text is a bit ambiguous: the server MUST support pagination, but the server is allowed to determine for itself that the threshold is Infinity. So, for more clarity, in my opinion, either there MUST always be pagination (independent of server-determined thresholds), or there SHOULD be pagination (allowing a server to pick and choose for itself which containers contain pagination or not)
I believe I left, or at least gave a 👍 to, a similar comment on the prior Google doc.
just moving from MUST to SHOULD and minor edits should be fine
Containers may hold a large number of resources. To allow clients to retrieve container listings incrementally, servers SHOULD support pagination (e.g., for containers whose membership exceeds a server-determined threshold).
| @@ -0,0 +1,123 @@ | |||
| ### Pagination | |||
|
|
|||
Suggestion to remove specific references to containers.
| ##### Pagination Model | ||
|
|
||
| Pagination is link-based: the server provides pagination URIs via HTTP `Link` headers [[!RFC8288]], | ||
| allowing clients to navigate the full listing without relying on numeric offsets. The server |
Remove determines page boundaries & size.
Removed redundant sentence about server determining page boundaries and size.
|
This was discussed during the #lws meeting on 30 March 2026. View the transcript
PR #82 - Pagination
<laurens> w3c/lws-protocol#82
<gb> Pull Request 82 Pagination for LWS Containers (by laurensdeb)
laurens: a few minor changes since last week. Pagination section has been moved. Now a SHOULD in general, and some wording changes.
<Zakim> bendm, you wanted to ask about total items: must be exact?
bendm: are totalItems an integer and MUST?
gibsonf1: about pagination, is the server in charge of how many items to put in a page?
laurens: yes, the text says the server defines page boundary
gibsonf1: could we take that out?
laurens: we could, but note that no text specifies how the client would specify page boundary
gibsonf1: agreed, but let's not prevent that
pchampin: I think we can remove that server decides page counts - support removing that constraint. I think total-items can be absent
<Zakim> acoburn, you wanted to ask about totalItems
gibsonf1: it would be really difficult for an application to provide good UX without knowing the total number of items
acoburn: I agree with pchampin just said. Could be difficult to determine total items. Maybe not have total items
bendm: I'm happy to change the MUST to a SHOULD
<acoburn> +1 for SHOULD on totalItems
<eBremer> +1 for SHOULD on totalItems
laurens: fine by me to change total items to Should
<laurens> PROPOSAL: to accept the pull request #82 as proposed.
<gb> Pull Request 82 Pagination for LWS Containers (by laurensdeb)
laurens: would like to vote on proposal
<gibsonf1> +1
<pchampin> +1
<eBremer> +1
<laurens> +1
<TallTed> +0
<acoburn> +1
<uvdsl> +0
<jeswr> +1
<AZ> +0 (need more careful reading)
<bendm> +1
laurens: looks like consensus to merge
RESOLUTION: to accept the pull request #82 as proposed.
<gb> Pull Request 82 Pagination for LWS Containers (by laurensdeb)
<Zakim> bendm, you wanted to ask about modifications
bendm: Just adding a new issue is fine for follow up?
laurens: issue or pull request is fine for modifications |
|
The LWS WG discussed and voted to approve this PR on 2026-03-30 |
This PR continues on the discussions of the proposed mechanics for containers in LWS.
Changes from the initial discussions
Context
It is part of a series of three PRs: