Description
Describe the bug
Using Ec2MetadataClient
to query the IMDSv2 service on an EC2 instance will eventually throw an SdkClientException
for a 401 Unauthorized
HTTP status code as shown here.
Regression Issue
- Select this option if this issue appears to be a regression.
Expected Behavior
This reproduction example should not print any errors. The Ec2MetadataClient
should not use stale tokens, which could be the cause of the 401 Unauthorized
response.
Current Behavior
Let it run long enough and the reproduction example produces the following log:
DEBUG 2024-12-28 18:40:33.451 UTC [main] software.amazon.awssdk.core.internal.http.loader.ClasspathSdkHttpServiceProvider - The HTTP implementation loaded is software.amazon.awssdk.http.apache.ApacheSdkHttpService@4b45a2f5
INFO 2024-12-28 18:40:33.802 UTC [main] App - Started.
DEBUG 2024-12-28 18:40:33.804 UTC [pool-1-thread-1] software.amazon.awssdk.utils.cache.CachedSupplier - (Ec2MetadataClient()) Cached value is stale and will be refreshed.
DEBUG 2024-12-28 18:40:33.805 UTC [pool-1-thread-1] software.amazon.awssdk.utils.cache.CachedSupplier - (Ec2MetadataClient()) Refreshing cached value.
DEBUG 2024-12-28 18:40:33.886 UTC [pool-1-thread-1] software.amazon.awssdk.utils.cache.CachedSupplier - (Ec2MetadataClient()) Successfully refreshed cached value. Next Prefetch Time: +1000000000-12-31T23:59:59.999999999Z. Next Stale Time: 2024-12-28T18:41:33.886487462Z
INFO 2024-12-28 18:40:33.912 UTC [pool-1-thread-1] App - MetadataResponse(body=InService)
INFO 2024-12-28 18:40:48.803 UTC [pool-1-thread-1] App - MetadataResponse(body=InService)
INFO 2024-12-28 18:41:03.803 UTC [pool-1-thread-1] App - MetadataResponse(body=InService)
INFO 2024-12-28 18:41:18.803 UTC [pool-1-thread-1] App - MetadataResponse(body=InService)
DEBUG 2024-12-28 18:41:33.807 UTC [pool-1-thread-1] software.amazon.awssdk.imds.internal.DefaultEc2MetadataClient - Metadata request response body:
DEBUG 2024-12-28 18:41:33.811 UTC [pool-1-thread-1] software.amazon.awssdk.imds.internal.DefaultEc2MetadataClient - Error while executing EC2Metadata request. Total attempts: 1. The requested metadata at path '/latest/meta-data/autoscaling/target-lifecycle-state' returned Http code 401
ERROR 2024-12-28 18:41:33.812 UTC [pool-1-thread-1] java.lang.Throwable - software.amazon.awssdk.core.exception.SdkClientException: The requested metadata at path '/latest/meta-data/autoscaling/target-lifecycle-state' returned Http code 401
ERROR 2024-12-28 18:41:33.812 UTC [pool-1-thread-1] java.lang.Throwable - at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:111)
ERROR 2024-12-28 18:41:33.812 UTC [pool-1-thread-1] java.lang.Throwable - at software.amazon.awssdk.imds.internal.DefaultEc2MetadataClient.sendRequest(DefaultEc2MetadataClient.java:178)
ERROR 2024-12-28 18:41:33.812 UTC [pool-1-thread-1] java.lang.Throwable - at software.amazon.awssdk.imds.internal.DefaultEc2MetadataClient.get(DefaultEc2MetadataClient.java:123)
ERROR 2024-12-28 18:41:33.812 UTC [pool-1-thread-1] java.lang.Throwable - at App.lambda$new$0(AppManager.java:58)
ERROR 2024-12-28 18:41:33.812 UTC [pool-1-thread-1] java.lang.Throwable - at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
ERROR 2024-12-28 18:41:33.812 UTC [pool-1-thread-1] java.lang.Throwable - at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:358)
ERROR 2024-12-28 18:41:33.812 UTC [pool-1-thread-1] java.lang.Throwable - at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
ERROR 2024-12-28 18:41:33.812 UTC [pool-1-thread-1] java.lang.Throwable - at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
ERROR 2024-12-28 18:41:33.812 UTC [pool-1-thread-1] java.lang.Throwable - at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
ERROR 2024-12-28 18:41:33.813 UTC [pool-1-thread-1] java.lang.Throwable - at java.base/java.lang.Thread.run(Thread.java:1583)
DEBUG 2024-12-28 18:41:48.800 UTC [pool-1-thread-1] software.amazon.awssdk.utils.cache.CachedSupplier - (Ec2MetadataClient()) Cached value is stale and will be refreshed.
DEBUG 2024-12-28 18:41:48.801 UTC [pool-1-thread-1] software.amazon.awssdk.utils.cache.CachedSupplier - (Ec2MetadataClient()) Refreshing cached value.
DEBUG 2024-12-28 18:41:48.804 UTC [pool-1-thread-1] software.amazon.awssdk.utils.cache.CachedSupplier - (Ec2MetadataClient()) Successfully refreshed cached value. Next Prefetch Time: +1000000000-12-31T23:59:59.999999999Z. Next Stale Time: 2024-12-28T18:42:48.803994713Z
INFO 2024-12-28 18:41:48.806 UTC [pool-1-thread-1] App - MetadataResponse(body=InService)
INFO 2024-12-28 18:42:03.803 UTC [pool-1-thread-1] App - MetadataResponse(body=InService)
Reproduction Steps
This reproduction example uses a short token TTL, but longer TTLs produce the same errors.
Ec2MetadataClient ec2Metadata = Ec2MetadataClient.builder().tokenTtl(ofMinutes(1)).build();
newSingleThreadScheduledExecutor().scheduleAtFixedRate(() -> {
try {
System.out.println(ec2Metadata.get("/latest/meta-data/autoscaling/target-lifecycle-state"));
} catch (Exception exception) {
exception.printStackTrace();
}
}, 0, 15, SECONDS);
Possible Solution
This line checks if the current IMDSv2 token is expired before sending the request to the IMDS service, but there appears to be some sort of issue that causes a stale token to be used for the IMDS request. The subsequent call after a 401 Unauthorized
exception will refresh the token before the IMDS request, so I suspect a potential race condition (there may need to be some buffer time between the IMDS token TTL and the JVM-cached TTL) or some other timing issue.
Additional Information/Context
No response
AWS Java SDK version used
2.29.43
JDK version used
OpenJDK 64-Bit Server VM Corretto-21.0.5.11.1
Operating System and version
Ubuntu 24.04.1 LTS