Skip to content

Align read window with part boundaries#1618

Closed
vladem wants to merge 3 commits into
awslabs:mainfrom
vladem:aligned-read-window-end
Closed

Align read window with part boundaries#1618
vladem wants to merge 3 commits into
awslabs:mainfrom
vladem:aligned-read-window-end

Conversation

@vladem

@vladem vladem commented Sep 25, 2025

Copy link
Copy Markdown
Contributor

Memory limiter currently doesn’t account for the “rounded up” memory by CRT, which means up to 8MiB of unaccounted memory per file handle. Based on this flaw in the memory model, limiter may choose larger windows, while there is actually no free memory for those. We should “round up” the read window to the next part boundary on our side. The metric s3.client.read_window_excess_bytes may be used to track the amount of “rounded up” memory, not accounted in the memory limiter.

To test the change, PR introduces read_window_increment_failed flag to MockClient and adds a basic_read_test_mock_large test.

Note that this fix doesn't apply to CachingPartStream: for this stream BackpressureController is not aware of second_request_start offset, which is required for the alignment.

Does this change impact existing behavior?

In a memory constrained environment, this may result in smaller read window sizes and less memory consumption.

Does this change need a changelog entry? Does it require a version change?

Patch version change and a change log to mountpoint-s3-fs, will add later.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and I agree to the terms of the Developer Certificate of Origin (DCO).

@vladem vladem temporarily deployed to PR integration tests September 25, 2025 16:50 — with GitHub Actions Inactive
@vladem vladem had a problem deploying to PR integration tests September 25, 2025 16:50 — with GitHub Actions Failure
@vladem vladem had a problem deploying to PR integration tests September 25, 2025 16:50 — with GitHub Actions Failure
@vladem vladem temporarily deployed to PR integration tests September 25, 2025 16:50 — with GitHub Actions Inactive
@vladem vladem had a problem deploying to PR integration tests September 25, 2025 16:50 — with GitHub Actions Failure
@vladem vladem had a problem deploying to PR integration tests September 25, 2025 16:50 — with GitHub Actions Failure
@vladem vladem had a problem deploying to PR integration tests September 25, 2025 16:50 — with GitHub Actions Failure
@vladem vladem had a problem deploying to PR integration tests September 25, 2025 16:50 — with GitHub Actions Failure
@vladem vladem had a problem deploying to PR integration tests September 25, 2025 16:50 — with GitHub Actions Failure
@vladem vladem had a problem deploying to PR integration tests September 25, 2025 16:50 — with GitHub Actions Failure
@vladem vladem had a problem deploying to PR integration tests September 25, 2025 16:50 — with GitHub Actions Failure
@vladem vladem had a problem deploying to PR integration tests September 25, 2025 16:50 — with GitHub Actions Failure
@vladem vladem had a problem deploying to PR integration tests September 25, 2025 16:51 — with GitHub Actions Failure
@vladem vladem had a problem deploying to PR integration tests September 25, 2025 16:51 — with GitHub Actions Failure
@vladem vladem had a problem deploying to PR integration tests September 25, 2025 16:51 — with GitHub Actions Failure
@vladem vladem had a problem deploying to PR integration tests September 25, 2025 16:51 — with GitHub Actions Failure
@vladem vladem had a problem deploying to PR integration tests September 25, 2025 16:51 — with GitHub Actions Failure
@vladem vladem had a problem deploying to PR integration tests September 25, 2025 16:51 — with GitHub Actions Failure
@vladem vladem had a problem deploying to PR integration tests September 25, 2025 16:51 — with GitHub Actions Failure
@vladem vladem had a problem deploying to PR integration tests September 25, 2025 16:51 — with GitHub Actions Failure
@vladem vladem had a problem deploying to PR integration tests September 25, 2025 16:51 — with GitHub Actions Failure
@vladem vladem marked this pull request as draft September 26, 2025 14:31
@vladem vladem temporarily deployed to PR integration tests October 3, 2025 10:05 — with GitHub Actions Inactive
@vladem vladem temporarily deployed to PR integration tests October 3, 2025 10:05 — with GitHub Actions Inactive
@vladem vladem temporarily deployed to PR integration tests October 3, 2025 10:05 — with GitHub Actions Inactive
@vladem vladem temporarily deployed to PR integration tests October 3, 2025 10:05 — with GitHub Actions Inactive
@vladem vladem temporarily deployed to PR integration tests October 3, 2025 10:05 — with GitHub Actions Inactive
@vladem vladem temporarily deployed to PR integration tests October 3, 2025 10:05 — with GitHub Actions Inactive
@vladem vladem temporarily deployed to PR integration tests October 3, 2025 10:05 — with GitHub Actions Inactive
@vladem vladem temporarily deployed to PR integration tests October 3, 2025 10:05 — with GitHub Actions Inactive
@vladem vladem temporarily deployed to PR integration tests October 3, 2025 10:05 — with GitHub Actions Inactive
@vladem vladem temporarily deployed to PR integration tests October 3, 2025 10:05 — with GitHub Actions Inactive
@vladem vladem temporarily deployed to PR integration tests October 3, 2025 10:05 — with GitHub Actions Inactive
@vladem vladem temporarily deployed to PR integration tests October 3, 2025 10:05 — with GitHub Actions Inactive
@vladem vladem temporarily deployed to PR integration tests October 3, 2025 10:05 — with GitHub Actions Inactive
@vladem vladem temporarily deployed to PR integration tests October 3, 2025 10:05 — with GitHub Actions Inactive
@vladem vladem temporarily deployed to PR integration tests October 3, 2025 10:05 — with GitHub Actions Inactive
@vladem vladem temporarily deployed to PR integration tests October 3, 2025 10:05 — with GitHub Actions Inactive
@vladem vladem temporarily deployed to PR integration tests October 3, 2025 10:05 — with GitHub Actions Inactive
@vladem vladem temporarily deployed to PR integration tests October 3, 2025 10:05 — with GitHub Actions Inactive
@vladem vladem added the performance PRs to run benchmarks on label Oct 3, 2025
@vladem vladem marked this pull request as ready for review October 6, 2025 09:33
Vlad Volodkin and others added 3 commits October 17, 2025 09:51
Signed-off-by: Vlad Volodkin <vlaad@amazon.com>
Signed-off-by: Vlad Volodkin <vlaad@amazon.com>
Signed-off-by: Vlad Volodkin <vlaad@amazon.co.uk>
@vladem vladem force-pushed the aligned-read-window-end branch from c526574 to eea0347 Compare October 17, 2025 09:57
@vladem vladem temporarily deployed to PR integration tests October 17, 2025 09:57 — with GitHub Actions Inactive

@passaro passaro left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General concerns:

  • The fix is quite fragile, since it requires the BackpressureController to know what the PartStream will do with the window increment.
  • As you noted, the first consequence is that there is no way to make it work with the cache.
  • Less importantly, the mock client becomes even more complicated.

Let's discuss in person whether there are other viable options.

let new_read_window_end_offset = Self::round_up_to_part_boundary(
new_read_window_end_offset,
self.second_request_start,
self.min_read_window_size as u64,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are assuming this is actually the read part size, right? We should make it explicit.

// NOTE: in the end of the object we still have up to "part_size" of unaccounted memory.
// For more accurate memory limiting we could reserve in "full parts", but that would make
// release more complicated. So accept this inaccuracy, and reserve only till `request_end_offset`.
let new_read_window_end_offset = new_read_window_end_offset.min(self.request_end_offset);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try to avoid shadowing here.

.saturating_add(self.preferred_read_window_size as u64)
.min(self.request_end_offset);
.saturating_add(self.preferred_read_window_size as u64);
let new_read_window_end_offset = Self::round_up_to_part_boundary(

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the right place? Or should we align in scale_up and scale_down, where we already modify preferred_read_window_size?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the existing incrementing logic, I think, it is the right place. Above this line we "blindly" adding preferred_read_window_size to new_read_window_end_offset. Having a fixed preferred_read_window_size won't work for arbitrary new_read_window_end_offset, which is dictated by the reading side (e.g. we may need to increment the window at offset 4,194,305 or 4,194,306 depending on how much data was read).

@vladem

vladem commented Oct 31, 2025

Copy link
Copy Markdown
Contributor Author

closing in favour of #1671

@vladem vladem closed this Oct 31, 2025
github-merge-queue Bot pushed a commit that referenced this pull request Dec 22, 2025
…t_size` (#1707)

Re-created #1618, in this
PR:

- we align read window end to the part boundary for the second request
(see `round_up_to_part_boundary` method);
- we update mock client to allow testing of this change;
- we add `PrefetcherConfig::initial_request_size` field and use it in
`mount_from_config.rs` example.

### Does this change impact existing behavior?

In a memory constrained environment, this may result in smaller read
window sizes and less memory consumption.

### Does this change need a changelog entry? Does it require a version
change?

Minor version change and a change log to `mountpoint-s3-fs`, will add
later. Patch version change to the `mountpoint-s3-fs-client`.

---

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license and I agree to the terms of
the [Developer Certificate of Origin
(DCO)](https://developercertificate.org/).

---------

Signed-off-by: Vlad Volodkin <vlaad@amazon.com>
Signed-off-by: Vlad Volodkin <vlaad@amazon.co.uk>
Co-authored-by: Vlad Volodkin <vlaad@amazon.com>
Co-authored-by: Vlad Volodkin <vlaad@amazon.co.uk>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PRs to run benchmarks on

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants