DB performance issues after upgrading to v25.4.0

### Preflight checklist

- [x] I could not find a solution in the existing issues, docs, nor discussions.
- [x] I agree to follow this project's [Code of Conduct](https://github.com/ory/kratos/blob/master/CODE_OF_CONDUCT.md).
- [x] I have read and am following this repository's [Contribution Guidelines](https://github.com/ory/kratos/blob/master/CONTRIBUTING.md).
- [x] I have joined the [Ory Community Slack](https://slack.ory.sh).
- [x] I am signed up to the [Ory Security Patch Newsletter](https://www.ory.sh/l/sign-up-newsletter).

### Ory Network Project

_No response_

### Describe the bug

Hi,
we upgraded our self-hosted Kratos version from `v1.3.0` to `v25.4.0` yesterday and we've been having DB performance issues since then.

I tried to pinpoint what's going wrong by looking at the query plans from before/after. The new ones indeed look better in paper, but this is what I actually see in the cpu resources:

This is a period of 2 hours, ~2M calls to this query

### Before:

```sql
SELECT
  COALESCE(identity_credential_identifiers.identifier, $3) cred_identifier,
  ict.id cred_type_id,
  ict.name cred_type,
  identity_credentials.config cred_config,
  identity_credentials.created_at created_at,
  identity_credentials.id cred_id,
  identity_credentials.identity_id identity_id,
  identity_credentials.nid nid,
  identity_credentials.updated_at updated_at,
  identity_credentials.version cred_version
FROM
  identity_credentials AS identity_credentials
INNER JOIN
  identity_credential_types ict
ON
  (identity_credentials.identity_credential_type_id = ict.id)
LEFT JOIN
  identity_credential_identifiers
ON
  identity_credential_identifiers.identity_credential_id = identity_credentials.id
  AND identity_credential_identifiers.nid = identity_credentials.nid
WHERE
  ((identity_credentials.identity_id = $1
      AND identity_credentials.nid = $2))
```

<img width="489" height="690" alt="Image" src="https://github.com/user-attachments/assets/d68f5afb-82e8-4236-84e6-1a0021921dcc" />

---

### After:

```sql
SELECT
  COALESCE(identity_credential_identifiers.identifier, $3) cred_identifier,
  identity_credentials.config,
  identity_credentials.created_at,
  identity_credentials.id,
  identity_credentials.identity_credential_type_id,
  identity_credentials.identity_id,
  identity_credentials.nid,
  identity_credentials.updated_at,
  identity_credentials.version
FROM
  identity_credentials AS identity_credentials
LEFT JOIN
  identity_credential_identifiers
ON
  identity_credential_identifiers.identity_credential_id = identity_credentials.id
  AND identity_credential_identifiers.nid = identity_credentials.nid
WHERE
  (identity_credentials.identity_id = $1)
  AND (identity_credentials.nid = $2)
ORDER BY
  identity_credential_identifiers.identifier ASC
```

<img width="490" height="685" alt="Image" src="https://github.com/user-attachments/assets/8f85dfdf-cf4f-4566-a71c-021f6bb500b6" />


> [!IMPORTANT]  
>  Notice how the 50th percentile is almost the same, but the 95th, 99th is way higher. So it could be some kind of edge case or another query/operation affecting it. The database load went up by a lot for this query as a consequence.


---

### Other metrics

These are other metrics that spiked quite a lot after the upgrade (~14h in the graphs) and I don't understand why yet:

The number of new connections is a lot higher, same amount of pods are connecting to it 4~6 instances of Kratos. Is there any change around the connection pool that would explain this?

<img width="1093" height="307" alt="Image" src="https://github.com/user-attachments/assets/19c46633-06af-4464-bc38-eb78273f1716" />

Amount of fetched/returned rows used to be 7x less, but I cannot find where this is happening. If you could point me into the right direction I could investigate which query could be fetching/returning more.

<img width="1096" height="321" alt="Image" src="https://github.com/user-attachments/assets/eda9c2bf-ffed-46f3-88de-90e4c3942fe9" />

And of course, the CPU usage

<img width="1091" height="306" alt="Image" src="https://github.com/user-attachments/assets/81af8b97-648d-4071-95e6-71cae5ecf311" />

---

I'm not super familiar with Kratos internals so maybe I'm looking at the wrong query or plans. Please let me know which other information you need to make this clearer.


### Reproducing the bug

The only change we did was to upgrade kratos from v1.3.0 to v25.4.0
This is in a Postgres DB

### Relevant log output

```shell

```

### Relevant configuration

```yml

```

### Version

v25.4.0

### On which operating system are you observing this issue?

None

### In which environment are you deploying?

None

### Additional Context

_No response_

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DB performance issues after upgrading to v25.4.0 #4520

Preflight checklist

Ory Network Project

Describe the bug

Before:

After:

Other metrics

Reproducing the bug

Relevant log output

Relevant configuration

Version

On which operating system are you observing this issue?

In which environment are you deploying?

Additional Context

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

DB performance issues after upgrading to v25.4.0 #4520

Description

Preflight checklist

Ory Network Project

Describe the bug

Before:

After:

Other metrics

Reproducing the bug

Relevant log output

Relevant configuration

Version

On which operating system are you observing this issue?

In which environment are you deploying?

Additional Context

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions