-
Notifications
You must be signed in to change notification settings - Fork 258
Description
When using postgresql_database
to provision extra databases on RDS, during a destroy operation (specifically, when used in a terraform test) it intermittently seems to get into a state where it waits indefinitely trying to acquire a lock until the operation times out.
Based on examining the state of the RDS instance at the point where the operation gets stuck, it appears that the problem is here:
"SELECT pg_advisory_xact_lock(member::bigint) FROM pg_auth_members JOIN pg_roles ON roleid = pg_roles.oid WHERE rolname = $1", |
I see the connection from the provider waiting for this query to complete, and reviewing the pg_locks
table, the only advisory lock that I see is one for the root
user, and originating from the same PID as the above operation. Based on this, it appears that the lock operation on line 575 in the same file is taking out an advisory lock on the current user (root), then somehow it's trying to take out another advisory lock for the same user, and getting stuck because there's already a lock, which it itself created.
I'm not able to identify the cause of this, querying pg_auth_members
shortly after it gets into this state doesn't give any indication to me of how this could be happening, but the transient nature of the issue leads me to believe it could be a race condition somehow related to the granting/revoking of roles that this provider does when performing certain operations.
Conditions to recreate
- Configuration contains an AWS RDS serverless cluster running postgres with one instance
- Postgres provider configuration:
provider "postgresql" {
host = aws_rds_cluster.this.endpoint
port = 5432
username = "root"
password = random_password.rds_admin_password.result
superuser = false
sslmode = "disable"
max_connections = 1
}
- Configuration contains one
postgresql_database
, twopostgresql_role
resources, and severalpostgresql_grant
resources. Usesdepends_on
to ensure the grants are applied sequentially, and both the roles and grants wait for thepostgresql_database
to be applied. Between this andmax_connections
, this guarantees that all postgresql provider operations are single threaded. postgresql_database
configuration:
resource "postgresql_database" "extra_databases" {
name = "testdb2"
template = aws_rds_cluster.this.database_name
}
- An integration test (
.tftest.hcl
) is created and run usingterraform test
, which creates the above configuration, then immediately destroys it.