Implement S3's diverse-IP performance recommendation internally #2331

Closed
@josh-newman

Description

Is your feature request related to a problem? Please describe.

My team runs batch data-processing jobs on dozens of EC2 machines. The machines tend to boot at the same time, and each then reads tens of thousands of files from S3. Sometimes this loading phase is significantly slowed by S3 throttling (503 SlowDown responses, connection resets, etc.), likely depending on S3's internal scaling for the (many) prefixes involved (we observed this both before and after 2018-07-17), and perhaps on the number of concurrent jobs.

S3 performance recommendations say:

> Finally, it’s worth paying attention to DNS and double-checking that requests are being spread over a wide pool of Amazon S3 IP addresses. DNS queries for Amazon S3 cycle through a large list of IP endpoints. But caching resolvers or application code that reuses a single IP address do not benefit from address diversity and the load balancing that follows from it.

I observed that the AWS-provided DNS resolver in our VPC seemed to internally cache results for S3 hostnames (bucket.us-west-2.s3.amazonaws.com) for around 4 seconds each. Since our machines initiate tens of thousands of S3 object reads shortly after booting (and also periodically throughout the job, which works in phases), this apparently led to them connecting to relatively few S3 peers (demonstration program). I believe this caused throttling even when our request rates were below S3's theoretical limits.

Describe the solution you'd like

It'd be great if the SDK handled this internally, transparently (for example, diversifying connection pools).

Describe alternatives you've considered

We're trying out a workaround: a custom "net/http".RoundTripper implementation that rewrites requests to spread load over all known S3 peers. Over time (across many VPC DNS cache intervals) we resolve more S3 IPs, spreading load over many peers and, in our experience so far, avoiding throttling. However, this implementation is relatively inelegant and inconvenient, and there are probably better ways to handle it.
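For anyone wanting to try something similar, the same idea can also be sketched one layer down, at the dialer rather than the RoundTripper. This is a minimal illustration, not our actual implementation; the names (`ipPool`, `diverseDialContext`) are made up for this sketch. It re-resolves the host on every dial, accumulates all answers seen so far, and connects to a randomly chosen peer. Because the request URL and Host header are untouched, TLS SNI still works normally.

```go
package main

import (
	"context"
	"fmt"
	"math/rand"
	"net"
	"net/http"
	"sync"
)

// ipPool accumulates unique resolved addresses per host over time, so
// successive dials can be spread across every peer seen so far.
type ipPool struct {
	mu   sync.Mutex
	ips  map[string][]string        // host -> unique resolved IPs
	seen map[string]map[string]bool // host -> set of IPs already recorded
}

func newIPPool() *ipPool {
	return &ipPool{ips: map[string][]string{}, seen: map[string]map[string]bool{}}
}

func (p *ipPool) add(host string, addrs []string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.seen[host] == nil {
		p.seen[host] = map[string]bool{}
	}
	for _, a := range addrs {
		if !p.seen[host][a] {
			p.seen[host][a] = true
			p.ips[host] = append(p.ips[host], a)
		}
	}
}

// pick returns a random known IP for host, or ok=false if none yet.
func (p *ipPool) pick(host string) (string, bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	addrs := p.ips[host]
	if len(addrs) == 0 {
		return "", false
	}
	return addrs[rand.Intn(len(addrs))], true
}

// diverseDialContext returns a DialContext that re-resolves on every dial,
// records all answers in pool, and dials a randomly chosen known peer.
func diverseDialContext(pool *ipPool) func(ctx context.Context, network, addr string) (net.Conn, error) {
	dialer := &net.Dialer{}
	return func(ctx context.Context, network, addr string) (net.Conn, error) {
		host, port, err := net.SplitHostPort(addr)
		if err != nil {
			return nil, err
		}
		if resolved, lerr := net.DefaultResolver.LookupIPAddr(ctx, host); lerr == nil {
			addrs := make([]string, 0, len(resolved))
			for _, r := range resolved {
				addrs = append(addrs, r.IP.String())
			}
			pool.add(host, addrs)
		}
		if ip, ok := pool.pick(host); ok {
			addr = net.JoinHostPort(ip, port)
		}
		return dialer.DialContext(ctx, network, addr)
	}
}

func main() {
	pool := newIPPool()
	// Simulate two DNS answers over time: one repeated IP, one new one.
	pool.add("bucket.s3.example", []string{"192.0.2.1", "192.0.2.2"})
	pool.add("bucket.s3.example", []string{"192.0.2.2", "192.0.2.3"})
	fmt.Println(len(pool.ips["bucket.s3.example"])) // 3 unique peers accumulated

	// The client built this way could then be handed to the SDK's HTTP config.
	_ = &http.Client{Transport: &http.Transport{DialContext: diverseDialContext(pool)}}
}
```

The dialer-level approach sidesteps the request-rewriting awkwardness: the URL stays as the bucket hostname, so signing, redirects, and certificate verification are unaffected, and only the TCP destination varies.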

In other issues I've seen recommendations to use s3manager to retry throttling errors. Unfortunately, I don't think we can use that in our application: we stream data (read, compute, discard), and buffering in memory or on local disk might increase costs. Also, s3manager seems to use the same HTTP client as the regular interface, so I'd expect retries to succeed slowly, whereas connecting to more peers could let requests succeed quickly.

Additional context

I noticed that issues aws/aws-sdk-go#1763, aws/aws-sdk-go#3707, aws/aws-sdk-go#1242 mention throttling so there's a chance those users could benefit from this, too.

CC @jcharum @yasushi-saito

Labels

- feature-request: A feature should be added or improved.
- feature/s3/manager: Pertains to S3 transfer manager HLL (feature/s3/manager).
- p2: This is a standard priority issue.
- queued: This issue is on the AWS team's backlog.
