|
3 | 3 | The AWS-C-S3 library is an asynchronous AWS S3 client focused on maximizing throughput and network utilization. |
4 | 4 |
|
5 | 5 | ### Key features: |
6 | | -- **Automatic Request Splitting**: Improves throughput by automatically splitting the request into part-sized chunks and performing parallel uploads/downloads of these chunks over multiple connections. There's a cap on the throughput of single S3 connection, the only way to go faster is multiple parallel connections. |
7 | | -- **Automatic Retries**: Increases resilience by retrying individual failed chunks of a file transfer, eliminating the need to restart transfers from scratch after an intermittent error. |
8 | | -- **DNS Load Balancing**: DNS resolver continuously harvests Amazon S3 IP addresses. When load is spread across the S3 fleet, overall throughput more reliable than if all connections are going to a single IP. |
9 | | -- **Advanced Network Management**: The client incorporates automatic request parallelization, effective timeouts and retries, and efficient connection reuse. This approach helps to maximize throughput and network utilization, and to avoid network overloads. |
10 | | -- **Thread Pools and Async I/O**: Avoids bottlenecks associated with single-thread processing. |
11 | | -- **Parallel Reads**: When uploading a large file from disk, reads from multiple parts of the file in parallel. This is faster than reading the file sequentially from beginning to end. |
| 6 | + |
| 7 | +* **Automatic Request Splitting**: Improves throughput by automatically splitting the request into part-sized chunks and performing parallel uploads/downloads of these chunks over multiple connections. The throughput of a single S3 connection is capped, so the only way to go faster is to use multiple parallel connections. |
| 8 | +* **Automatic Retries**: Increases resilience by retrying individual failed chunks of a file transfer, eliminating the need to restart transfers from scratch after an intermittent error. |
| 9 | +* **DNS Load Balancing**: The DNS resolver continuously harvests Amazon S3 IP addresses. When load is spread across the S3 fleet, overall throughput is more reliable than when all connections go to a single IP. |
| 10 | +* **Advanced Network Management**: The client incorporates automatic request parallelization, effective timeouts and retries, and efficient connection reuse. This approach helps to maximize throughput and network utilization, and to avoid network overloads. |
| 11 | +* **Thread Pools and Async I/O**: Avoids bottlenecks associated with single-thread processing. |
| 12 | +* **Parallel Reads**: When uploading a large file from disk, reads from multiple parts of the file in parallel. This is faster than reading the file sequentially from beginning to end. |
12 | 13 |
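To illustrate the request splitting described in the feature list, here is a minimal sketch of how an object of a given size could be divided into part-sized byte ranges that are then transferred in parallel. The names `part_range`, `part_count`, and `part_range_at` are hypothetical and are not part of the aws-c-s3 API; the library's actual splitting logic may differ.

```c
#include <stdint.h>

/* Hypothetical type for illustration only; not an aws-c-s3 structure. */
struct part_range {
    uint64_t start; /* first byte offset of the part, inclusive */
    uint64_t end;   /* last byte offset of the part, inclusive */
};

/* Number of parts needed to cover object_size bytes using parts of
 * part_size bytes. The last part may be shorter than part_size. */
static uint64_t part_count(uint64_t object_size, uint64_t part_size) {
    if (object_size == 0 || part_size == 0) {
        return 0;
    }
    return object_size / part_size + (object_size % part_size != 0 ? 1 : 0);
}

/* Compute the byte range of the part at the given 0-based index.
 * Returns 0 on success, -1 if the index is out of range. */
static int part_range_at(
    uint64_t object_size,
    uint64_t part_size,
    uint64_t index,
    struct part_range *out) {

    if (index >= part_count(object_size, part_size)) {
        return -1;
    }
    out->start = index * part_size;
    uint64_t remaining = object_size - out->start;
    uint64_t length = remaining < part_size ? remaining : part_size;
    out->end = out->start + length - 1;
    return 0;
}
```

Each range computed this way is independent of the others, which is what allows the parts to be sent over separate connections and lets a failed part be retried without restarting the whole transfer.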
|
13 | 14 | ### Documentation |
14 | 15 |
|
15 | | -- [GetObject](docs/GetObject.md): A visual representation of the GetObject request flow. |
16 | | -- [Memory Aware Requests Execution](docs/memory_aware_request_execution.md): An in-depth guide on optimizing memory usage during request executions. |
| 16 | +* [GetObject](docs/GetObject.md): A visual representation of the GetObject request flow. |
| 17 | +* [Memory Aware Requests Execution](docs/memory_aware_request_execution.md): An in-depth guide on optimizing memory usage during request executions. |
| 18 | + |
| 19 | +### Configuration |
| 20 | + |
| 21 | +#### Memory Limit |
| 22 | + |
| 23 | +The S3 client uses a buffer pool to manage memory for concurrent transfers. You can control the memory limit in two ways: |
| 24 | + |
| 25 | +1. **Via Configuration** (Recommended): Set `memory_limit_in_bytes` in `aws_s3_client_config`: |
| 26 | + |
| 27 | +```c |
| 28 | + struct aws_s3_client_config config = { |
| 29 | + .memory_limit_in_bytes = GB_TO_BYTES(4), // 4 GiB limit |
| 30 | + // ... other configuration |
| 31 | + }; |
| 32 | + ``` |
| 33 | +
|
| 34 | +2. **Via Environment Variable**: Set the `AWS_CRT_S3_MEMORY_LIMIT_IN_GIB` environment variable: |
| 35 | +
|
| 36 | +```bash |
| 37 | + export AWS_CRT_S3_MEMORY_LIMIT_IN_GIB=4 # 4 GiB limit |
| 38 | + ``` |
| 39 | + |
| 40 | +**Priority**: The configuration value takes precedence over the environment variable. If `memory_limit_in_bytes` is set to a non-zero value in the config, the environment variable is ignored. |
| 41 | + |
| 42 | +**Default Behavior**: If neither is set (config is 0 and environment variable is not set), the client sets a default memory limit based on the target throughput. |
| 43 | + |
| 44 | +**Notes**: |
| 45 | +* The limit applies per client. If multiple clients are created, the limit applies to each one separately. |
| 46 | +* The environment variable value must be a valid positive integer representing gibibytes (GiB). |
| 47 | +* The value is converted from GiB to bytes internally (1 GiB = 1024³ bytes). |
| 48 | +* Invalid values or overflow conditions will cause client creation to fail with `AWS_ERROR_INVALID_ARGUMENT`. |
17 | 49 |
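The precedence and validation rules above can be sketched as a single function. This is an illustration under stated assumptions, not the library's implementation: `resolve_memory_limit` is a hypothetical helper, the throughput-based default is passed in as a parameter because its calculation is not described here, and a plain `-1` stands in for the `AWS_ERROR_INVALID_ARGUMENT` failure. Treating an empty or signed value as invalid is also an assumption.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define BYTES_PER_GIB (UINT64_C(1024) * 1024 * 1024) /* 1 GiB = 1024^3 bytes */

/* Hypothetical helper, not part of the aws-c-s3 API.
 *   config_limit_bytes:  value of memory_limit_in_bytes (0 means "not set")
 *   env_value:           value of AWS_CRT_S3_MEMORY_LIMIT_IN_GIB, or NULL if unset
 *   default_limit_bytes: stand-in for the throughput-based default
 * Returns 0 and writes the limit to *out_bytes, or -1 on an invalid value. */
static int resolve_memory_limit(
    uint64_t config_limit_bytes,
    const char *env_value,
    uint64_t default_limit_bytes,
    uint64_t *out_bytes) {

    /* 1. A non-zero config value wins; the environment variable is ignored. */
    if (config_limit_bytes != 0) {
        *out_bytes = config_limit_bytes;
        return 0;
    }

    /* 2. Neither source is set: fall back to the default. */
    if (env_value == NULL) {
        *out_bytes = default_limit_bytes;
        return 0;
    }

    /* 3. Parse the environment variable as a positive integer (GiB).
     *    Requiring a leading digit rejects empty strings, signs, and spaces. */
    if (env_value[0] < '0' || env_value[0] > '9') {
        return -1;
    }
    errno = 0;
    char *end = NULL;
    unsigned long long gib = strtoull(env_value, &end, 10);
    if (errno != 0 || *end != '\0' || gib == 0) {
        return -1;
    }

    /* 4. Reject values whose conversion to bytes would overflow 64 bits. */
    if (gib > UINT64_MAX / BYTES_PER_GIB) {
        return -1;
    }
    *out_bytes = (uint64_t)gib * BYTES_PER_GIB;
    return 0;
}
```

In a real program the second argument would typically come from `getenv("AWS_CRT_S3_MEMORY_LIMIT_IN_GIB")`, which returns `NULL` when the variable is not set.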
|
18 | 50 | ## License |
19 | 51 |
|
@@ -86,14 +118,19 @@ cmake --build aws-c-s3/build --target install |
86 | 118 | After installing all the dependencies and building aws-c-s3, you can run the sample directly from the s3 build directory. |
87 | 119 |
|
88 | 120 | To download: |
| 121 | + |
89 | 122 | ``` |
90 | 123 | aws-c-s3/build/samples/s3/s3 cp s3://<bucket-name>/<object-name> <download-path> --region <region> |
91 | 124 | ``` |
| 125 | + |
92 | 126 | To upload: |
| 127 | + |
93 | 128 | ``` |
94 | 129 | aws-c-s3/build/samples/s3/s3 cp <upload-path> s3://<bucket-name>/<object-name> --region <region> |
95 | 130 | ``` |
| 131 | + |
96 | 132 | To list objects: |
| 133 | + |
97 | 134 | ``` |
98 | 135 | aws-c-s3/build/samples/s3/s3 ls s3://<bucket-name> --region <region> |
99 | 136 | ``` |
|