Upgrading sdk packages causes "28000: PAM authentication failed for user" error #4433

Description

@mrizzini

Describe the bug

Hi there,

My team has been investigating an error that occurs when connecting to our Postgres DB instance from our EKS container, and we have not been able to find the root cause. The error we are seeing is "28000: PAM authentication failed for user {ourUser}".

We have recently started upgrading our AWSSDK packages in our EKS container, and that is when we started getting these errors. Before the errors began, we were stable, and we were on the following packages:

<PackageReference Include="AWSSDK.Core" Version="4.0.1.3" />
<PackageReference Include="AWSSDK.EventBridge" Version="4.0.5.2" />
<PackageReference Include="AWSSDK.RDS" Version="4.0.10.7" />
<PackageReference Include="AWSSDK.S3" Version="4.0.9.1" />
<PackageReference Include="AWSSDK.SecretsManager" Version="4.0.2" />
<PackageReference Include="AWSSDK.SecurityToken" Version="4.0.3" />
<PackageReference Include="AWSSDK.SQS" Version="4.0.2" />
<PackageReference Include="AWSSDK.SSO" Version="4.0.2" />
<PackageReference Include="AWSSDK.SSOOIDC" Version="4.0.3.1" />
<PackageReference Include="AWSSDK.StepFunctions" Version="4.0.1" />
<PackageReference Include="AWSSDK.Transfer" Version="4.0.4.3" />

Sometime in April or May, we upgraded to the following:

<PackageReference Include="AWSSDK.Core" Version="4.0.3.31" />
<PackageReference Include="AWSSDK.EventBridge" Version="4.0.5.2" />
<PackageReference Include="AWSSDK.RDS" Version="4.0.10.7" />
<PackageReference Include="AWSSDK.S3" Version="4.0.9.1" />
<PackageReference Include="AWSSDK.SecretsManager" Version="4.0.2" />
<PackageReference Include="AWSSDK.SecurityToken" Version="4.0.5.20" />
<PackageReference Include="AWSSDK.SQS" Version="4.0.2" />
<PackageReference Include="AWSSDK.SSO" Version="4.0.2" />
<PackageReference Include="AWSSDK.SSOOIDC" Version="4.0.3.1" />
<PackageReference Include="AWSSDK.StepFunctions" Version="4.0.1" />
<PackageReference Include="AWSSDK.Transfer" Version="4.0.4.3" />

Notably in the above, we upgraded to "AWSSDK.Core" Version="4.0.3.31" and "AWSSDK.SecurityToken" Version="4.0.5.20".

When we made that switch, the errors began at a frequent rate, happening multiple times an hour. Note that many connections to the DB also succeeded during this stretch, so not all attempts were failing. The failures appear to happen at random.

While triaging the issue, we downgraded to the original versions and the errors stopped. We then tried upgrading to even newer packages and moved to the following:

<PackageReference Include="AWSSDK.Core" Version="4.0.6.1" />
<PackageReference Include="AWSSDK.EventBridge" Version="4.0.5.29" />
<PackageReference Include="AWSSDK.RDS" Version="4.0.20.3" />
<PackageReference Include="AWSSDK.S3" Version="4.0.23" />
<PackageReference Include="AWSSDK.SecretsManager" Version="4.0.4.20" />
<PackageReference Include="AWSSDK.SecurityToken" Version="4.0.6.3" />
<PackageReference Include="AWSSDK.SQS" Version="4.0.2.28" />
<PackageReference Include="AWSSDK.SSO" Version="4.0.2.27" />
<PackageReference Include="AWSSDK.SSOOIDC" Version="4.0.3.28" />
<PackageReference Include="AWSSDK.StepFunctions" Version="4.0.2.23" />
<PackageReference Include="AWSSDK.Transfer" Version="4.0.9.1" />

Notably in the above, we upgraded to "AWSSDK.Core" Version="4.0.6.1" and "AWSSDK.SecurityToken" Version="4.0.6.3".

The errors came back this time, happening again multiple times per hour, and again with many successful connections as well. Finally, we went back to the original versions again and the errors stopped.

We believe the SDK version upgrades are causing these PAM errors, but we have been unable to diagnose them ourselves. We looked into the error on your GitHub page and found the following:

This person started getting PAM errors when they upgraded from SDK 3.7.4 to 4.0.1: #3873

Someone from the aws-sdk team then said it was caused by a change they had introduced: #3873 (comment)

AWS reverted that change in Core 4.0.0.32: https://github.com/aws/aws-sdk-net/blob/main/changelogs/SDK.CHANGELOG.2025.md#401030-2025-10-01-183

Presumably this fixed the problem by going back to the old code.

Finally, they reintroduced the change in Core 4.0.3.0: https://github.com/aws/aws-sdk-net/blob/main/changelogs/SDK.CHANGELOG.2025.md#401290-2025-11-07-194…

We had been on "AWSSDK.Core" Version="4.0.1.3" for a while without problems, and that was the version that had the rolled-back code.

When we upgraded to "AWSSDK.Core" Version="4.0.3.31", we started seeing the errors. That was the version that had the re-released code, which presumably worked for other people but produces errors for us.

Then we tried to upgrade to 4.0.6.1 and still had the problem. Currently there is one newer version, Core 4.0.6.2, which we have not tried yet.

Could you take a look into this for us?

Thank you!

Regression Issue

  • Select this option if this issue appears to be a regression.

Expected Behavior

No PAM authentication failed errors
OR
A document explaining what changed and what specific values/settings are needed

Current Behavior

Npgsql.PostgresException (0x80004005): 28000: PAM authentication failed for user "{ourUser}"

Reproduction Steps

The line of code we execute where this error occurs is:
RDSAuthTokenGenerator.GenerateAuthToken(region, hostName, portNumber, dbUserName);
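
For anyone trying to reproduce this, the flow around that call can be sketched roughly as below. This is only a sketch under assumptions, not our production code: it assumes the AWSSDK.RDS and Npgsql packages, and the region, host name, database name, attempt count, and delay are hypothetical placeholders. It generates a fresh token on every attempt, opens a non-pooled connection so that each attempt authenticates again, and counts how many attempts fail with SQLSTATE 28000.

```csharp
// Minimal repro sketch. Assumes the AWSSDK.RDS and Npgsql NuGet packages.
// Every endpoint value below is a hypothetical placeholder.
using System;
using System.Threading;
using Amazon;
using Amazon.RDS.Util;
using Npgsql;

var region = RegionEndpoint.USEast1;                                 // assumption
const string hostName = "mydb.example.us-east-1.rds.amazonaws.com";  // placeholder
const int portNumber = 5432;                                         // placeholder
const string dbUserName = "ourUser";                                 // placeholder
const int attempts = 200;                                            // arbitrary

int failures = 0;
for (int i = 0; i < attempts; i++)
{
    // Same call as in the report; credentials come from the SDK's default chain.
    string token = RDSAuthTokenGenerator.GenerateAuthToken(region, hostName, portNumber, dbUserName);

    var csb = new NpgsqlConnectionStringBuilder
    {
        Host = hostName,
        Port = portNumber,
        Username = dbUserName,
        Password = token,          // the IAM auth token is used as the password
        Database = "postgres",     // placeholder
        SslMode = SslMode.Require, // adjust TLS settings to match your setup
        Pooling = false            // force a fresh authentication on every attempt
    };

    try
    {
        using var conn = new NpgsqlConnection(csb.ConnectionString);
        conn.Open();
    }
    catch (PostgresException ex) when (ex.SqlState == "28000")
    {
        // Record only the failures this issue is about; other errors propagate.
        failures++;
        Console.WriteLine($"{DateTime.UtcNow:O} attempt {i}: {ex.MessageText} (token length {token.Length})");
    }

    Thread.Sleep(TimeSpan.FromSeconds(5)); // arbitrary delay between attempts
}

Console.WriteLine($"{failures} of {attempts} attempts failed with SQLSTATE 28000");
```

Running the same loop against the older and newer package sets should show whether the failure rate follows the AWSSDK.Core version.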

Possible Solution

No response

Additional Information/Context

No response

AWS .NET SDK and/or Package version used

<PackageReference Include="AWSSDK.Core" Version="4.0.6.1" />
<PackageReference Include="AWSSDK.EventBridge" Version="4.0.5.29" />
<PackageReference Include="AWSSDK.RDS" Version="4.0.20.3" />
<PackageReference Include="AWSSDK.S3" Version="4.0.23" />
<PackageReference Include="AWSSDK.SecretsManager" Version="4.0.4.20" />
<PackageReference Include="AWSSDK.SecurityToken" Version="4.0.6.3" />
<PackageReference Include="AWSSDK.SQS" Version="4.0.2.28" />
<PackageReference Include="AWSSDK.SSO" Version="4.0.2.27" />
<PackageReference Include="AWSSDK.SSOOIDC" Version="4.0.3.28" />
<PackageReference Include="AWSSDK.StepFunctions" Version="4.0.2.23" />
<PackageReference Include="AWSSDK.Transfer" Version="4.0.9.1" />

Targeted .NET Platform

Originally net8.0; as part of the upgrade we are trying to move to net10.0.

Operating System and version

EKS container running on Amazon Linux 2023.10.20260216

Labels

bug (This issue is a bug.), credentials, needs-reproduction (This issue needs reproduction.), potential-regression (Marking this issue as a potential regression to be checked by team member)
