
When putting an S3Object connecting to S3, the contentEncoding for that object is always "aws-chunked" #5769

Closed · @ngudbhav

Description

Describe the bug

s3.txt

This is a re-opened thread from #4746 (comment).

I have attached the packet details of the PUT Object call from SDK to S3 (Localstack).

The SDK always sends Content-Encoding as aws-chunked. This causes the result to fail to decompress. I have tried explicitly setting the Content-Length to a sufficiently high number, but in vain. This is only reproducible with LocalStack, not with real AWS.

[Screenshot 2025-01-03: Wireshark capture showing the decompression error]

Regression Issue

  • Select this option if this issue appears to be a regression.

Expected Behavior

Content-encoding should not always be aws-chunked.

Current Behavior

Content-encoding is always aws-chunked.

Reproduction Steps

I have used the below code to upload the file

S3AsyncClient buildS3Client() {
        S3CrtAsyncClientBuilder builder = S3AsyncClient.crtBuilder()
                .credentialsProvider(getAwsCredentialsProvider())
                .region(Region.of(region));
        Optional<String> s3Endpoint = getLocalStackEndpoint();
        s3Endpoint.ifPresent(s -> {
            builder.endpointOverride(URI.create("https://s3.localhost.localstack.cloud:4566"));
            builder.forcePathStyle(true);
            builder.minimumPartSizeInBytes((long) (8 * 1024 * 1024));
        });
        return builder.build();
    }
s3Client = buildS3Client();
s3TransferManager = S3TransferManager.builder()
        .s3Client(s3Client)
        .build();

The above snippet initialises the S3 client. I arrived at the minimumPartSizeInBytes value through trial and error.

putObjectRequest = PutObjectRequest.builder()
            .bucket(bucket)
            .key(key)
            .contentEncoding(GZIP_ENCODING)
            .contentType(contentType)
            .contentLength((long) (8 * 1024 * 1024))
            .tagging(tagging)
            .build();
uploadRequest = UploadRequest.builder()
            .putObjectRequest(putObjectRequest)
            .requestBody(AsyncRequestBody.fromBytes(bytes))
            .build();
s3TransferManager.upload(uploadRequest).completionFuture().join();

This code actually facilitates the transfer!

Possible Solution

No response

Additional Information/Context

No response

AWS Java SDK version used

2.29.15

JDK version used

17.0.13

Operating System and version

Ubuntu 22.04.5 LTS, Linux 6.10.14-linuxkit, Inside Docker 27.4.0

Activity

Jan 3, 2025 — labels added: bug, needs-triage
Jan 3, 2025 — title changed from "When putting an S3Object connecting to S3 via http, the contentEncoding for that object is always "aws-chunked"" to "When putting an S3Object connecting to S3, the contentEncoding for that object is always "aws-chunked""
Jan 3, 2025 — labels added: investigating, p1, potential-regression; removed: needs-triage
Jan 3, 2025 — bhoradc self-assigned this
bhoradc commented on Jan 3, 2025

Hi @ngudbhav,

Thank you for reporting the issue. I tried to reproduce this scenario but found the behavior to be consistent between AWS S3 and LocalStack.

Both environments:

  • Accept the dual content-encoding (gzip,aws-chunked)
  • Successfully process the request
  • Return 200 status codes

Could you please go through the reproduction steps below and let me know of any deviation that may explain the behavior you reported?

pom.xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>org.example</groupId>
    <artifactId>V2_ContentEncoding_5769</artifactId>
    <version>1.0-SNAPSHOT</version>
    <properties>
        <maven.compiler.source>11</maven.compiler.source>
        <maven.compiler.target>11</maven.compiler.target>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <aws.sdk.version>2.29.15</aws.sdk.version>
    </properties>
    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>software.amazon.awssdk</groupId>
                <artifactId>bom</artifactId>
                <version>2.29.15</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
            <dependency>
                <groupId>org.apache.logging.log4j</groupId>
                <artifactId>log4j-bom</artifactId>
                <version>2.19.0</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>software.amazon.awssdk</groupId>
            <artifactId>s3</artifactId>
        </dependency>
        <dependency>
            <groupId>software.amazon.awssdk</groupId>
            <artifactId>s3-transfer-manager</artifactId>
        </dependency>
        <dependency>
            <groupId>software.amazon.awssdk.crt</groupId>
            <artifactId>aws-crt</artifactId>
            <version>0.33.7</version>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-core</artifactId>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-api</artifactId>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-slf4j2-impl</artifactId>
        </dependency>
    </dependencies>
</project>

1. AWS S3 Behavior:

Code snippet
public static void main(String[] args) {

        Log.initLoggingToFile(Log.LogLevel.Trace, "/Users/***/IdeaProjects/V2_ContentEncoding_5769/log.txt");

        String bucket = "<<bucket_name>>";
        String key = "testing-file.txt";
        String contentType = "text/plain";

        String content = "Hello Java SDK!";
        byte[] bytes = content.getBytes(StandardCharsets.UTF_8);

        S3AsyncClient s3Client = buildS3Client();

        S3TransferManager s3TransferManager = S3TransferManager.builder()
                .s3Client(s3Client)
                .build();

        PutObjectRequest putObjectRequest = PutObjectRequest.builder()
                .bucket(bucket)
                .key(key)
                .contentEncoding(GZIP_ENCODING)
                .contentType(contentType)
                .contentLength((long) bytes.length)
                .build();

        UploadRequest uploadRequest = UploadRequest.builder()
                .putObjectRequest(putObjectRequest)
                .requestBody(AsyncRequestBody.fromBytes(bytes))
                .build();

        try {
            s3TransferManager.upload(uploadRequest).completionFuture().join();
            System.out.println("Upload completed successfully");
        } catch (Exception e) {
            System.err.println("Upload failed: " + e.getMessage());
            e.printStackTrace();
        } finally {
            s3TransferManager.close();
            s3Client.close();
        }
    }
    private static S3AsyncClient buildS3Client() {
        S3CrtAsyncClientBuilder builder = S3AsyncClient.crtBuilder()
                .credentialsProvider(DefaultCredentialsProvider.create())
                .region(Region.of(REGION))
                .minimumPartSizeInBytes((long) (8 * 1024 * 1024));
        return builder.build();
    }
}
CRT debug log
content-encoding:gzip,aws-chunked
content-length:56
content-type:text/plain
host:<<bucket_name>>.s3.amazonaws.com

2. LocalStack Behavior:

Code snippet
public class Main {
    private static final String GZIP_ENCODING = "gzip";
    private static final String REGION = "us-east-1";

    public static void main(String[] args) {

        Log.initLoggingToFile(Log.LogLevel.Trace, "/Users/bhoradc/IdeaProjects/V2_ContentEncoding_5769/log.txt");

        String bucket = "<<bucket_name>>";
        String key = "testing-file.txt";
        String contentType = "text/plain";

        String content = "Hello Java SDK!";
        byte[] bytes = content.getBytes(StandardCharsets.UTF_8);

        // S3AsyncClient s3Client = buildS3Client();
        S3AsyncClient s3Client = localstackbuildS3Client();

        S3TransferManager s3TransferManager = S3TransferManager.builder()
                .s3Client(s3Client)
                .build();

        s3Client.createBucket(CreateBucketRequest.builder()
                        .bucket(bucket)
                        .build())
                .join();
        System.out.println("Bucket created successfully: " + bucket);

        PutObjectRequest putObjectRequest = PutObjectRequest.builder()
                .bucket(bucket)
                .key(key)
                .contentEncoding(GZIP_ENCODING)
                .contentType(contentType)
                .contentLength((long) bytes.length)
                .build();

        UploadRequest uploadRequest = UploadRequest.builder()
                .putObjectRequest(putObjectRequest)
                .requestBody(AsyncRequestBody.fromBytes(bytes))
                .build();

        try {
            s3TransferManager.upload(uploadRequest).completionFuture().join();
            System.out.println("Upload completed successfully");
        } catch (Exception e) {
            System.err.println("Upload failed: " + e.getMessage());
            e.printStackTrace();
        } finally {
            s3TransferManager.close();
            s3Client.close();
        }
    }
    private static S3AsyncClient buildS3Client() {
        S3CrtAsyncClientBuilder builder = S3AsyncClient.crtBuilder()
                .credentialsProvider(DefaultCredentialsProvider.create())
                .region(Region.of(REGION))
                .minimumPartSizeInBytes((long) (8 * 1024 * 1024));
        return builder.build();
    }

    private static S3AsyncClient localstackbuildS3Client() {
        S3CrtAsyncClientBuilder builder = S3AsyncClient.crtBuilder()
                .credentialsProvider(StaticCredentialsProvider.create(
                AwsBasicCredentials.create("test", "test")))
                .region(Region.of(REGION));
        Optional<String> s3Endpoint = getLocalStackEndpoint();
        s3Endpoint.ifPresent(s -> {
            builder.endpointOverride(URI.create("http://localhost:4566"));
            builder.forcePathStyle(true);
            builder.minimumPartSizeInBytes((long) (8 * 1024 * 1024));
        });
        return builder.build();
    }

    private static Optional<String> getLocalStackEndpoint() {
        return Optional.of("http://localhost:4566");
    }
}
CRT debug log
PUT
/<<bucket_name>>/testing-file.txt

amz-sdk-invocation-id:16b7fdce-8be2-0a1e-26af-fe4488c7e0b9
amz-sdk-request:attempt=1; max=1
content-encoding:gzip,aws-chunked
content-length:56
content-type:text/plain
host:localhost:4566
x-amz-content-sha256:STREAMING-UNSIGNED-PAYLOAD-TRAILER
x-amz-date:20250103T182315Z
x-amz-decoded-content-length:15
x-amz-trailer:x-amz-checksum-crc32
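The content-length:56 versus x-amz-decoded-content-length:15 in the log above is the aws-chunked framing overhead. As a rough illustration (the wire format below is my reading of the unsigned-payload-with-trailer variant that STREAMING-UNSIGNED-PAYLOAD-TRAILER indicates, not something confirmed in this thread), the 15-byte body plus a hex chunk-size line, a zero-length terminal chunk, and a CRC32 trailer comes to exactly 56 bytes:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.CRC32;

public class AwsChunkedFraming {
    // Frame a payload the way aws-chunked with an unsigned trailer appears to:
    // <hex-size>\r\n<payload>\r\n0\r\nx-amz-checksum-crc32:<base64>\r\n\r\n
    static byte[] frame(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload);
        // CRC32 value as 4 big-endian bytes, base64-encoded (always 8 chars)
        long v = crc.getValue();
        byte[] crcBytes = {(byte) (v >>> 24), (byte) (v >>> 16), (byte) (v >>> 8), (byte) v};
        String checksum = Base64.getEncoder().encodeToString(crcBytes);

        String head = Integer.toHexString(payload.length).toUpperCase() + "\r\n";
        String tail = "\r\n0\r\nx-amz-checksum-crc32:" + checksum + "\r\n\r\n";

        byte[] out = new byte[head.length() + payload.length + tail.length()];
        System.arraycopy(head.getBytes(StandardCharsets.US_ASCII), 0, out, 0, head.length());
        System.arraycopy(payload, 0, out, head.length(), payload.length);
        System.arraycopy(tail.getBytes(StandardCharsets.US_ASCII), 0, out,
                head.length() + payload.length, tail.length());
        return out;
    }

    public static void main(String[] args) {
        byte[] payload = "Hello Java SDK!".getBytes(StandardCharsets.UTF_8); // 15 bytes
        // 3 ("F\r\n") + 15 + 2 + 3 ("0\r\n") + 29 (trailer) + 2 + 2 = 56
        System.out.println(frame(payload).length); // prints 56
    }
}
```

This is only meant to show why LocalStack must interpret the chunk framing (and drop aws-chunked from Content-Encoding) rather than store the body verbatim.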
LocalStack version
~ % docker inspect localstack/localstack:3.7.2 | grep -i version
        "DockerVersion": "",
                "PYTHON_VERSION=3.11.9",
                "PYTHON_PIP_VERSION=24.0",
                "PYTHON_SETUPTOOLS_VERSION=65.5.1",
                "LOCALSTACK_BUILD_VERSION=3.7.2"

The only notable difference I see is in the networking setup: your environment uses localstack:4566 (Docker's internal network) whereas I am running it on localhost:4566. But I believe this difference should not affect the content-encoding behavior.

Regards,
Chaitanya

Jan 3, 2025 — labels added: response-requested; removed: investigating
ngudbhav (Author) commented on Jan 4, 2025

Hi @bhoradc
Thanks a lot for the quick reply.
Is there any way I can disable the aws-chunked content encoding?

I have tried various ways, but downloading the file requires manual decompression.

As you can see in the screenshot, even Wireshark displays an error that decompression failed. I have tried using a browser and the Go AWS client, but automatic decompression does not work in either.

However, if I explicitly write code to decompress the GZIP file, I get the expected contents back.

I am not sure if the dual header values are the cause of this behaviour.
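The manual decompression workaround mentioned above can be sketched with just the JDK (a hypothetical helper, not the reporter's actual code; the gzip step stands in for the compressed body S3 returns):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipRoundTrip {
    // Manually decompress a gzip body that a client failed to auto-decode
    static byte[] gunzip(byte[] gz) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gz));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            in.transferTo(out);
            return out.toByteArray();
        }
    }

    // Stand-in for the compressed object body stored in S3/LocalStack
    static byte[] gzip(byte[] plain) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(plain);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "Hello Java SDK!".getBytes(StandardCharsets.UTF_8);
        byte[] downloaded = gzip(original); // body comes back still compressed
        System.out.println(new String(gunzip(downloaded), StandardCharsets.UTF_8));
    }
}
```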

Jan 4, 2025 — label removed: response-requested
bhoradc commented on Jan 7, 2025

Hi @ngudbhav,

Currently, I don’t see the CRT builder having any support for disabling chunked encoding through signer parameters or configuration settings, similar to what the standard S3 client builders offer.

However, I don't see this as a regression from #5043. The results I shared in my previous comment demonstrate the expected behavior for dual content-encoding with the Java SDK.

Regards,
Chaitanya

Jan 7, 2025 — labels added: p2, response-requested; removed: p1, potential-regression
ngudbhav (Author) commented on Jan 8, 2025

Thanks a lot for your reply.

I understand that adding support to the CRT builder may not be in the pipeline. Is this something I can pick up? Our development experience is blocked because the browser cannot decompress the JSONs and CSVs from LocalStack's S3.

Also, can you please help me understand why the clients cannot decompress the server response? Maybe that is something that can be fixed without adding new support.

Thank you

Jan 8, 2025 — label removed: response-requested
DmitriyMusatkin commented on Jan 16, 2025

I don't think it's really a CRT issue.

aws-chunked is a fairly old S3 protocol for sending the payload in chunks and supporting trailing headers. Clients use it to compute a checksum while the data is streamed and send it in a trailing header. The server should remove aws-chunked from Content-Encoding after interpreting the chunks, but it looks like LocalStack has limited support for that and might not do it the same way S3 does.

The Java transfer manager calculates checksums by default, so aws-chunked has been in use since launch. It might be possible to disable checksums, or to provide a precomputed checksum for the payload, to work around this.
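The precomputed-checksum idea could look roughly like this. S3's CRC32 checksum format is the base64 encoding of the CRC32 value as 4 big-endian bytes; assuming PutObjectRequest.Builder#checksumCRC32 accepts that string (shown only as a comment here), the value itself needs just the JDK:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.CRC32;

public class PrecomputedChecksum {
    // Base64 of the CRC32 value as 4 big-endian bytes -- the format used by
    // the x-amz-checksum-crc32 header.
    static String crc32Base64(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload);
        long v = crc.getValue();
        byte[] be = {(byte) (v >>> 24), (byte) (v >>> 16), (byte) (v >>> 8), (byte) v};
        return Base64.getEncoder().encodeToString(be);
    }

    public static void main(String[] args) {
        byte[] bytes = "Hello Java SDK!".getBytes(StandardCharsets.UTF_8);
        String checksum = crc32Base64(bytes);
        System.out.println(checksum); // 8 base64 characters

        // Hypothetical usage with the request from the reproduction steps,
        // supplying the checksum up front instead of streaming it in a trailer:
        // PutObjectRequest.builder()
        //         .bucket(bucket).key(key)
        //         .checksumCRC32(checksum)
        //         .build();
    }
}
```

Whether supplying the checksum up front makes the CRT client skip the aws-chunked trailer is an assumption worth verifying against the SDK's request log.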

ngudbhav (Author) commented on Jan 24, 2025

Hi @DmitriyMusatkin, @bhoradc

The LocalStack team has resolved this issue following @DmitriyMusatkin's comment. Thanks a lot for the guidance.

Please let me know if you would like to keep this issue open.

github-actions commented on Jan 24, 2025

This issue is now closed. Comments on closed issues are hard for our team to see.
If you need more assistance, please open a new issue that references this one.


      When putting an S3Object connecting to S3, the contentEncoding for that object is always "aws-chunked" · Issue #5769 · aws/aws-sdk-java-v2