
Commit fd5803f

Merge branch 'main' into unknown_checksum
2 parents db79c84 + 10be224 commit fd5803f

2 files changed

Lines changed: 90 additions & 20 deletions


README.md

Lines changed: 45 additions & 18 deletions
@@ -18,34 +18,61 @@ The AWS-C-S3 library is an asynchronous AWS S3 client focused on maximizing thro
1818

1919
### Configuration
2020

21-
#### Memory Limit
21+
#### Environment Variables
2222

23-
The S3 client uses a buffer pool to manage memory for concurrent transfers. You can control the memory limit in two ways:
23+
1. **Memory Limit - `AWS_CRT_S3_MEMORY_LIMIT_IN_GIB`**
2424

25-
1. **Via Configuration** (Recommended): Set `memory_limit_in_bytes` in `aws_s3_client_config`:
25+
The S3 client uses a buffer pool to manage memory for concurrent transfers.
2626

27-
```c
28-
struct aws_s3_client_config config = {
29-
.memory_limit_in_bytes = GB_TO_BYTES(4), // 4 GiB limit
30-
// ... other configuration
31-
};
27+
Example Usage:
28+
29+
```bash
30+
export AWS_CRT_S3_MEMORY_LIMIT_IN_GIB=4 # 4 GiB limit
3231
```
3332

34-
2. **Via Environment Variable**: Set the `AWS_CRT_S3_MEMORY_LIMIT_IN_GIB` environment variable:
33+
**Default Behavior**:
34+
When neither the environment variable nor `memory_limit_in_bytes` in the client config is set, the client sets a default memory limit based on the target throughput.
3535

36-
```bash
37-
export AWS_CRT_S3_MEMORY_LIMIT_IN_GIB=4 # 4 GiB limit
36+
**Notes**:
37+
* The limit applies per client. If multiple clients are created, the limit applies to each one separately.
38+
* The environment variable value must be a valid positive integer representing gibibytes (GiB).
39+
* The value is converted from GiB to bytes internally (1 GiB = 1024³ bytes).
40+
* Invalid values or overflow conditions will cause client creation to fail with `AWS_ERROR_INVALID_ARGUMENT`.
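The parsing and conversion rules in the notes above can be sketched in standard C. This is only an illustration under stated assumptions, not the library's implementation: the helper name `gib_string_to_bytes` is hypothetical, and it returns `false` where the client would fail creation with `AWS_ERROR_INVALID_ARGUMENT`.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not part of aws-c-s3): parse a decimal GiB count
 * and convert it to bytes. Returns false for empty, signed, non-numeric,
 * zero, or overflowing input. */
static bool gib_string_to_bytes(const char *text, uint64_t *out_bytes) {
    assert(out_bytes != NULL);

    /* Require a leading digit so that whitespace and '+'/'-' signs,
     * which strtoull would otherwise accept, are rejected. */
    if (text == NULL || *text < '0' || *text > '9') {
        return false;
    }

    errno = 0;
    char *end = NULL;
    unsigned long long gib = strtoull(text, &end, 10);
    if (errno != 0 || *end != '\0' || gib == 0) {
        return false;
    }

    /* 1 GiB = 1024^3 bytes; reject values whose product exceeds 64 bits. */
    const uint64_t bytes_per_gib = 1024ULL * 1024ULL * 1024ULL;
    if (gib > UINT64_MAX / bytes_per_gib) {
        return false;
    }

    *out_bytes = (uint64_t)gib * bytes_per_gib;
    return true;
}
```

With this sketch, the string `"4"` maps to 4 × 1024³ bytes, while `"0"`, `"-1"`, and `"4GiB"` are rejected.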
41+
42+
> [!TIP]
43+
> You can also set the memory limit *in bytes* through the client config. When `memory_limit_in_bytes` is set to a non-zero value, it takes precedence over the environment variable.
44+
> ```c
45+
> struct aws_s3_client_config config = {
46+
> .memory_limit_in_bytes = GB_TO_BYTES(4), // 4 GiB limit
47+
> // ... other configuration
48+
> };
49+
> ```
50+
51+
2. **Maximum Parts Pending Read - `AWS_CRT_S3_MAX_PARTS_PENDING_READ`**
52+
53+
Controls the maximum number of parts that can be pending read from the input stream during a single multipart upload. Higher values may improve upload throughput for large files by letting more parts be read in parallel, but only when the disk can sustain the additional concurrent reads.
54+
55+
Example Usage:
56+
57+
```bash
58+
export AWS_CRT_S3_MAX_PARTS_PENDING_READ=20
3859
```
3960
40-
**Priority**: The configuration value takes precedence over the environment variable. If `memory_limit_in_bytes` is set to a non-zero value in the config, the environment variable is ignored.
61+
**Default Behavior**:
62+
If not set, the default value is 5.
63+
64+
**Notes**:
65+
* Only affects multipart uploads. Small files that fit in a single part are not affected.
66+
* If there are multiple parallel multipart upload requests, each upload is limited by the value individually (not cumulatively).
67+
* Setting this too low may introduce delays between reads, as the meta-request waits for the client to schedule more work.
68+
* Setting this too high may cause a single upload to hog work tokens, starving other concurrent uploads.
69+
* The value must be a positive integer (1–4294967295). Invalid or zero values are ignored with a warning, and the default is used.
70+
* The value is read once on first use and cached for the lifetime of the process.
71+
* If the device's network bandwidth is too low, a higher value may have no effect, because the upload is then bounded by the maximum number of requests allowed in flight.
4172
42-
**Default Behavior**: If neither is set (config is 0 and environment variable is not set), the client sets a default memory limit based on the target throughput.
73+
3. **Test Bucket - `CRT_S3_TEST_BUCKET_NAME`**
4374
44-
**Notes**:
45-
* The limit applies per client. If multiple clients created, limit will apply to each separately.
46-
* The environment variable value must be a valid positive integer representing gigabytes (GiB).
47-
* The value is converted from GiB to bytes internally (1 GiB = 1024³ bytes).
48-
* Invalid values or overflow conditions will cause client creation to fail with `AWS_ERROR_INVALID_ARGUMENT`.
75+
The S3 bucket name used for running unit tests. See the [test_helper documentation](./tests/test_helper/) for setup instructions.
4976
5077
## License
5178

source/s3_auto_ranged_put.c

Lines changed: 45 additions & 2 deletions
@@ -12,7 +12,9 @@
1212
#include "aws/s3/private/s3_util.h"
1313
#include <aws/common/clock.h>
1414
#include <aws/common/encoding.h>
15+
#include <aws/common/environment.h>
1516
#include <aws/common/string.h>
17+
#include <aws/common/thread.h>
1618
#include <aws/common/xml_parser.h>
1719
#include <aws/io/stream.h>
1820

@@ -38,8 +40,48 @@ static const uint32_t s_unknown_length_default_num_parts = 32;
3840
* (1st meta-request has a queue of 100 "work tokens" that it needs to read
3941
* the stream for, while later meta-requests are doing nothing waiting for work tokens)
4042
*
43+
* Can be overridden via env var AWS_CRT_S3_MAX_PARTS_PENDING_READ.
44+
*
4145
* TODO: this value needs further benchmarking. */
42-
static const uint32_t s_max_parts_pending_read = 5;
46+
static const uint32_t s_max_parts_pending_read_default = 5;
47+
static const char *s_max_parts_pending_read_env_var = "AWS_CRT_S3_MAX_PARTS_PENDING_READ";
48+
static uint32_t s_max_parts_pending_read = 0;
49+
static aws_thread_once s_max_parts_pending_read_once = AWS_THREAD_ONCE_STATIC_INIT;
50+
51+
static void s_max_parts_pending_read_init(void *user_data) {
52+
struct aws_allocator *allocator = user_data;
53+
s_max_parts_pending_read = s_max_parts_pending_read_default;
54+
struct aws_string *from_env = aws_get_env_nonempty(allocator, s_max_parts_pending_read_env_var);
55+
if (from_env) {
56+
uint64_t parsed = 0;
57+
if (!aws_byte_cursor_utf8_parse_u64(aws_byte_cursor_from_string(from_env), &parsed) && parsed > 0 &&
58+
parsed <= UINT32_MAX) {
59+
s_max_parts_pending_read = (uint32_t)parsed;
60+
AWS_LOGF_INFO(
61+
AWS_LS_S3_META_REQUEST,
62+
"Using %s=%" PRIu32 " from environment.",
63+
s_max_parts_pending_read_env_var,
64+
s_max_parts_pending_read);
65+
} else {
66+
AWS_LOGF_WARN(
67+
AWS_LS_S3_META_REQUEST,
68+
"Ignoring invalid value for env var %s; using default %" PRIu32 ".",
69+
s_max_parts_pending_read_env_var,
70+
s_max_parts_pending_read_default);
71+
}
72+
aws_string_destroy(from_env);
73+
} else {
74+
AWS_LOGF_INFO(
75+
AWS_LS_S3_META_REQUEST,
76+
"Using %" PRIu32 " because no value was set from environment.",
77+
s_max_parts_pending_read);
78+
}
79+
}
80+
81+
static uint32_t s_get_max_parts_pending_read(struct aws_allocator *allocator) {
82+
aws_thread_call_once(&s_max_parts_pending_read_once, s_max_parts_pending_read_init, allocator);
83+
return s_max_parts_pending_read;
84+
}
4385

4486
static const struct aws_byte_cursor s_create_multipart_upload_copy_headers[] = {
4587
AWS_BYTE_CUR_INIT_FROM_STRING_LITERAL("x-amz-server-side-encryption-customer-algorithm"),
@@ -487,7 +529,8 @@ static bool s_should_skip_scheduling_more_parts_based_on_flags(
487529
}
488530

489531
/* In all other cases, cap the number of pending-reads to something reasonable */
490-
return auto_ranged_put->synced_data.num_parts_pending_read >= s_max_parts_pending_read;
532+
return auto_ranged_put->synced_data.num_parts_pending_read >=
533+
s_get_max_parts_pending_read(auto_ranged_put->base.allocator);
491534
}
492535

493536
static void s_s3_auto_ranged_put_send_request_finish(
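The read-once-and-cache behavior that this change introduces can be reproduced outside the CRT with plain POSIX calls. The sketch below is an assumption-laden analogue, not the library code: it substitutes `pthread_once`, `getenv`, and `strtoull` for `aws_thread_call_once`, `aws_get_env_nonempty`, and `aws_byte_cursor_utf8_parse_u64`, omits logging, and uses hypothetical names (`get_max_parts`, `g_max_parts`).

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_PARTS_DEFAULT 5u
#define MAX_PARTS_ENV_VAR "AWS_CRT_S3_MAX_PARTS_PENDING_READ"

static uint32_t g_max_parts = 0;
static pthread_once_t g_max_parts_once = PTHREAD_ONCE_INIT;

/* Runs at most once per process: start from the default, then override it
 * only if the environment holds a decimal integer in [1, UINT32_MAX]. */
static void max_parts_init(void) {
    g_max_parts = MAX_PARTS_DEFAULT;

    const char *text = getenv(MAX_PARTS_ENV_VAR);
    /* Unset, empty, signed, or non-numeric values keep the default. */
    if (text == NULL || *text < '0' || *text > '9') {
        return;
    }

    errno = 0;
    char *end = NULL;
    unsigned long long parsed = strtoull(text, &end, 10);
    if (errno == 0 && *end == '\0' && parsed > 0 && parsed <= UINT32_MAX) {
        g_max_parts = (uint32_t)parsed;
    }
}

/* Thread-safe accessor: the first caller triggers initialization,
 * every later caller sees the cached value. */
static uint32_t get_max_parts(void) {
    int rc = pthread_once(&g_max_parts_once, max_parts_init);
    assert(rc == 0);
    (void)rc;
    return g_max_parts;
}
```

One consequence of this design is that changing the environment variable after the first call has no effect for the rest of the process, which matches the README note that the value is cached for the process lifetime.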
