Skip to content

Fix race condition between mem acquire and cancel#521

Merged
DmitriyMusatkin merged 13 commits intomainfrom
racy_races
Jun 6, 2025
Merged

Fix race condition between mem acquire and cancel#521
DmitriyMusatkin merged 13 commits intomainfrom
racy_races

Conversation

@DmitriyMusatkin
Copy link
Contributor

@DmitriyMusatkin DmitriyMusatkin commented Jun 6, 2025

Issue #, if available:

Description of changes:
canceling meta request had a race condition with memory buffer acquisition. cancelling at the right time, might have resulted in seg fault due to some bookkeeping info gettting incorrect.

this pr covers several changes:

  • cancel no longer removes bookkeeping nodes for pending buffer futures (and relies on future callback to do that)
  • future callback is now registered on a separate thread to avoid any deadlocks between cancel and mem acquisition

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@codecov-commenter
Copy link

codecov-commenter commented Jun 6, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.57%. Comparing base (52c90d3) to head (88cc003).
Report is 4 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #521      +/-   ##
==========================================
+ Coverage   89.24%   89.57%   +0.32%     
==========================================
  Files          21       21              
  Lines        6539     6543       +4     
==========================================
+ Hits         5836     5861      +25     
+ Misses        703      682      -21     
Files with missing lines Coverage Δ
source/s3_client.c 91.75% <100.00%> (+1.47%) ⬆️
source/s3_default_buffer_pool.c 96.08% <ø> (ø)
source/s3_meta_request.c 91.25% <100.00%> (+0.35%) ⬆️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

@TingDaoK TingDaoK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix&ship

@DmitriyMusatkin DmitriyMusatkin merged commit 233c587 into main Jun 6, 2025
34 checks passed
@DmitriyMusatkin DmitriyMusatkin deleted the racy_races branch June 6, 2025 23:19
github-merge-queue bot pushed a commit to awslabs/mountpoint-s3 that referenced this pull request Jun 9, 2025
> [!NOTE]
> This PR reapplies the changes in #1430, previously reverted in #1435,
with the addition of a fix to a race condition in `aws-c-s3`
(awslabs/aws-c-s3#521).

In particular, we pick up - but do not adopt in this change - the new
Memory pool interface
([awslabs/aws-c-s3#517](awslabs/aws-c-s3#517)),
which requires minor adjustments to the bindings and the
`poll_buffer_pool_usage_stats` function.

<details>
  <summary>Full CRT changelog:</summary>

```
Submodule mountpoint-s3-crt-sys/crt/aws-c-cal fa108de5..938d0fea:
  > [FIX] heap use after free on aws_ecc_key_pair_new_from_asn1 (#219)
  > Remove clang-3 from CI (#218)
  > Fix casing on Windows header files (#217)
  > dlopen(NULL) returns NULL on static linked executable (#215)
Submodule mountpoint-s3-crt-sys/crt/aws-c-common 8ae8f48e..aaa2f11e:
  > Fix invalid XML Buffer Overflow Error (#1201)
  > Add aws_cbor_decoder_reset_src api for aws_cbor_decoder (#1202)
  > Fix casing on Windows header files (#1199)
  > Error handling docs (#1197)
  > make exports consistent (#1196)
Submodule mountpoint-s3-crt-sys/crt/aws-c-http ca7e0e29..3eedf1ef:
  > fix mock server window update on 0 length body (#517)
Submodule mountpoint-s3-crt-sys/crt/aws-c-io 8286c781..689dee3c:
  > Fix warnings in iOS Cross Compile CI (#733)
  > Remove clang-3 from CI (#731)
  > Acquire/Release Event Loop (#725)
  > Fix casing on Windows header files (#730)
Submodule mountpoint-s3-crt-sys/crt/aws-c-s3 7d2d4b30..52c90d39:
  > Fix race condition between mem acquire and cancel (#521)
  > Memory pool interface (#517)
  > Remove clang-3 from CI (#520)
  > Revert "[s3_meta_request]: Retry on ExpiredToken" (#518)
Submodule mountpoint-s3-crt-sys/crt/aws-c-sdkutils ba6a28fa..f678bda9:
  > Fix double free on malformed rulesets (#53)
  > make exports consistent (#52)
```
</details>

### Does this change impact existing behavior?

No change in behavior.

### Does this change need a changelog entry? Does it require a version
change?

Yes.

---

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license and I agree to the terms of
the [Developer Certificate of Origin
(DCO)](https://developercertificate.org/).

---------

Signed-off-by: Alessandro Passaro <alexpax@amazon.co.uk>
mansi153 pushed a commit to mansi153/mountpoint-s3 that referenced this pull request Jul 24, 2025
> [!NOTE]
> This PR reapplies the changes in awslabs#1430, previously reverted in awslabs#1435,
with the addition of a fix to a race condition in `aws-c-s3`
(awslabs/aws-c-s3#521).

In particular, we pick up - but do not adopt in this change - the new
Memory pool interface
([awslabs/aws-c-s3#517](awslabs/aws-c-s3#517)),
which requires minor adjustments to the bindings and the
`poll_buffer_pool_usage_stats` function.

<details>
  <summary>Full CRT changelog:</summary>

```
Submodule mountpoint-s3-crt-sys/crt/aws-c-cal fa108de5..938d0fea:
  > [FIX] heap use after free on aws_ecc_key_pair_new_from_asn1 (awslabs#219)
  > Remove clang-3 from CI (awslabs#218)
  > Fix casing on Windows header files (awslabs#217)
  > dlopen(NULL) returns NULL on static linked executable (awslabs#215)
Submodule mountpoint-s3-crt-sys/crt/aws-c-common 8ae8f48e..aaa2f11e:
  > Fix invalid XML Buffer Overflow Error (awslabs#1201)
  > Add aws_cbor_decoder_reset_src api for aws_cbor_decoder (awslabs#1202)
  > Fix casing on Windows header files (awslabs#1199)
  > Error handling docs (awslabs#1197)
  > make exports consistent (awslabs#1196)
Submodule mountpoint-s3-crt-sys/crt/aws-c-http ca7e0e29..3eedf1ef:
  > fix mock server window update on 0 length body (awslabs#517)
Submodule mountpoint-s3-crt-sys/crt/aws-c-io 8286c781..689dee3c:
  > Fix warnings in iOS Cross Compile CI (awslabs#733)
  > Remove clang-3 from CI (awslabs#731)
  > Acquire/Release Event Loop (awslabs#725)
  > Fix casing on Windows header files (awslabs#730)
Submodule mountpoint-s3-crt-sys/crt/aws-c-s3 7d2d4b30..52c90d39:
  > Fix race condition between mem acquire and cancel (awslabs#521)
  > Memory pool interface (awslabs#517)
  > Remove clang-3 from CI (awslabs#520)
  > Revert "[s3_meta_request]: Retry on ExpiredToken" (awslabs#518)
Submodule mountpoint-s3-crt-sys/crt/aws-c-sdkutils ba6a28fa..f678bda9:
  > Fix double free on malformed rulesets (awslabs#53)
  > make exports consistent (awslabs#52)
```
</details>

### Does this change impact existing behavior?

No change in behavior.

### Does this change need a changelog entry? Does it require a version
change?

Yes.

---

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license and I agree to the terms of
the [Developer Certificate of Origin
(DCO)](https://developercertificate.org/).

---------

Signed-off-by: Alessandro Passaro <alexpax@amazon.co.uk>
dannycjones pushed a commit to c-hagem/mountpoint-s3 that referenced this pull request Oct 9, 2025
> [!NOTE]
> This PR reapplies the changes in awslabs#1430, previously reverted in awslabs#1435,
with the addition of a fix to a race condition in `aws-c-s3`
(awslabs/aws-c-s3#521).

In particular, we pick up - but do not adopt in this change - the new
Memory pool interface
([awslabs/aws-c-s3#517](awslabs/aws-c-s3#517)),
which requires minor adjustments to the bindings and the
`poll_buffer_pool_usage_stats` function.

<details>
  <summary>Full CRT changelog:</summary>

```
Submodule mountpoint-s3-crt-sys/crt/aws-c-cal fa108de5..938d0fea:
  > [FIX] heap use after free on aws_ecc_key_pair_new_from_asn1 (awslabs#219)
  > Remove clang-3 from CI (awslabs#218)
  > Fix casing on Windows header files (awslabs#217)
  > dlopen(NULL) returns NULL on static linked executable (awslabs#215)
Submodule mountpoint-s3-crt-sys/crt/aws-c-common 8ae8f48e..aaa2f11e:
  > Fix invalid XML Buffer Overflow Error (awslabs#1201)
  > Add aws_cbor_decoder_reset_src api for aws_cbor_decoder (awslabs#1202)
  > Fix casing on Windows header files (awslabs#1199)
  > Error handling docs (awslabs#1197)
  > make exports consistent (awslabs#1196)
Submodule mountpoint-s3-crt-sys/crt/aws-c-http ca7e0e29..3eedf1ef:
  > fix mock server window update on 0 length body (awslabs#517)
Submodule mountpoint-s3-crt-sys/crt/aws-c-io 8286c781..689dee3c:
  > Fix warnings in iOS Cross Compile CI (awslabs#733)
  > Remove clang-3 from CI (awslabs#731)
  > Acquire/Release Event Loop (awslabs#725)
  > Fix casing on Windows header files (awslabs#730)
Submodule mountpoint-s3-crt-sys/crt/aws-c-s3 7d2d4b30..52c90d39:
  > Fix race condition between mem acquire and cancel (awslabs#521)
  > Memory pool interface (awslabs#517)
  > Remove clang-3 from CI (awslabs#520)
  > Revert "[s3_meta_request]: Retry on ExpiredToken" (awslabs#518)
Submodule mountpoint-s3-crt-sys/crt/aws-c-sdkutils ba6a28fa..f678bda9:
  > Fix double free on malformed rulesets (awslabs#53)
  > make exports consistent (awslabs#52)
```
</details>

### Does this change impact existing behavior?

No change in behavior.

### Does this change need a changelog entry? Does it require a version
change?

Yes.

---

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license and I agree to the terms of
the [Developer Certificate of Origin
(DCO)](https://developercertificate.org/).

---------

Signed-off-by: Alessandro Passaro <alexpax@amazon.co.uk>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants