Commit 237e9e1
waahm7 and graebm authored
GetObject Flow Documentation (#402)
Co-authored-by: Michael Graeb <[email protected]>
1 parent 1dd55be commit 237e9e1

4 files changed, +28 -2 lines changed

README.md

+14 -1
@@ -1,6 +1,19 @@
 ## AWS C S3
 
-C99 library implementation for communicating with the S3 service, designed for maximizing throughput on high bandwidth EC2 instances.
+The AWS-C-S3 library is an asynchronous AWS S3 client focused on maximizing throughput and network utilization.
+
+### Key features:
+- **Automatic Request Splitting**: Improves throughput by automatically splitting the request into part-sized chunks and performing parallel uploads/downloads of these chunks over multiple connections. There's a cap on the throughput of single S3 connection, the only way to go faster is multiple parallel connections.
+- **Automatic Retries**: Increases resilience by retrying individual failed chunks of a file transfer, eliminating the need to restart transfers from scratch after an intermittent error.
+- **DNS Load Balancing**: DNS resolver continuously harvests Amazon S3 IP addresses. When load is spread across the S3 fleet, overall throughput is better than if all connections were hammering the same IP simultaneously.
+- **Advanced Network Management**: The client incorporates automatic request parallelization, effective timeouts and retries, and efficient connection reuse. This approach helps to maximize throughput and network utilization, and to avoid network overloads.
+- **Thread Pools and Async I/O**: Avoids bottlenecks associated with single-thread processing.
+- **Parallel Reads**: When uploading a large file from disk, reads from multiple parts of the file in parallel. This is faster than reading the file sequentially from beginning to end.
+
+### Documentation
+
+- [GetObject](docs/GetObject.md): A visual representation of the GetObject request flow.
+- [Memory Aware Requests Execution](docs/memory_aware_request_execution.md): An in-depth guide on optimizing memory usage during request executions.
 
 ## License
 
docs/GetObject.md

+9
@@ -0,0 +1,9 @@
+# GetObject
+
+## Overview
+The `GetObject` is used to download objects from Amazon S3. Optimized for throughput, the CRT S3 client enhances performance and reliability by parallelizing multiple part-sized `GetObject` with range requests.
+
+## Flow Diagram
+Below is the typical flow of a GetObject request made by the user.
+
+![GetObject Flow Diagram](images/GetObjectFlow.svg)
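
The part-sized range requests described in the Overview above can be illustrated with a small sketch. This is not the client's actual splitting code: `struct byte_range`, `num_parts`, `part_range`, and `format_range_header` are hypothetical names introduced here, and the real client may choose part boundaries differently. The sketch only shows how an object of known size could be divided into inclusive byte ranges, each of which maps to one HTTP `Range` header value of the form `bytes=first-last`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper type, not part of the aws-c-s3 API. */
struct byte_range {
    uint64_t first; /* first byte offset, inclusive */
    uint64_t last;  /* last byte offset, inclusive */
};

/* Number of part-sized ranges needed to cover object_size bytes. */
static uint64_t num_parts(uint64_t object_size, uint64_t part_size) {
    assert(part_size > 0);
    return object_size / part_size + (object_size % part_size != 0 ? 1 : 0);
}

/* Inclusive byte range for the 0-based part_index.
 * The final part may be shorter than part_size. */
static struct byte_range part_range(uint64_t object_size, uint64_t part_size, uint64_t part_index) {
    assert(part_index < num_parts(object_size, part_size));
    struct byte_range r;
    r.first = part_index * part_size;
    uint64_t remaining = object_size - r.first;
    uint64_t length = remaining < part_size ? remaining : part_size;
    r.last = r.first + length - 1;
    return r;
}

/* Writes a Range header value such as "bytes=0-7" into buf.
 * Returns the value returned by snprintf. */
static int format_range_header(struct byte_range r, char *buf, size_t buf_len) {
    return snprintf(buf, buf_len, "bytes=%llu-%llu",
                    (unsigned long long)r.first, (unsigned long long)r.last);
}
```

With a 20-byte object and an 8-byte part size, this yields three ranges: 0-7, 8-15, and 16-19. Each range could then be fetched on its own connection and, if it fails, retried without refetching the others.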

docs/images/GetObjectFlow.svg

+4

docs/memory_aware_request_execution.md

+1 -1
@@ -5,7 +5,7 @@ receiving for each request and scaling requests in flight has direct impact on
 memory used. In practice, setting high target throughput or larger part size can
 lead to high observed memory usage.
 
-To mitigate high memory usages, memory reuse improvements were recently added to
+To mitigate high memory usages, memory reuse improvements were added to
 the client along with options to limit max memory used. The following sections
 will go into more detail on aspects of those changes and how the affect the
 client.
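
The relationship between part size, requests in flight, and memory noted in the context lines above can be made concrete with a rough estimate. This is only a back-of-the-envelope sketch under a simplifying assumption (one part-sized buffer per in-flight request); `estimate_buffer_bytes` is a hypothetical helper, not a function of the library, and the client's real accounting may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Rough estimate of buffer memory in bytes.
 * Assumption (not from the library): each in-flight request holds
 * exactly one buffer of part_size bytes. */
static uint64_t estimate_buffer_bytes(uint64_t part_size, uint64_t requests_in_flight) {
    /* Guard against 64-bit overflow in the multiplication. */
    assert(part_size == 0 || requests_in_flight <= UINT64_MAX / part_size);
    return part_size * requests_in_flight;
}
```

Under this assumption, an 8 MiB part size with 100 requests in flight would account for 800 MiB of buffers, which is why raising either value raises observed memory usage.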
