III-6975 Fix replay of role created by grubolsch · Pull Request #2153 · cultuurnet/udb3-backend

grubolsch · 2026-02-11T10:32:42Z

Fixed

When trying to replay a role created, you got a duplicate key error. Has a unit test to prove it works.

Ticket: https://jira.uitdatabank.be/browse/III-6975

lucwollants · 2026-02-11T10:39:18Z

Nice find, but I'm not sure we need to handle it in this way. There are even more projections that are stored inside MySQL tables that have the same issue. There used to be a page on Confluence about this, with a list of tables to clear.

It's a known issue that we need to clear the read models before doing a replay. We also need to clean Redis before a replay.

grubolsch · 2026-02-11T10:39:37Z

I am worried about this bug, I am wondering how many other events there are we cannot replay.

grubolsch · 2026-02-11T10:58:58Z

I think each event should be able to be replayed individually, so I don't see the problem in this bugfix.

src/Role/ReadModel/Search/Doctrine/DBALRepository.php

bertramakers · 2026-02-11T11:42:46Z

There are even more projections that are stored inside MySQL tables that have the same issue. There used to be a page on Confluence about this, with a list of tables to clear.

It's a known issue that we need to clear the read models before doing a replay. We also need to clean Redis before a replay.

This is technically correct, if we do a complete replay we need to clear the read models first because a lot of them are not programmed to handle duplicate key constraints. This "purge" is handled with a specific command for the MySQL tables, so we don't need to keep a list of those tables in Confluence and truncate them manually:

udb3-backend/app/Console/Command/PurgeModelCommand.php

Line 12 in 3fcd0d8

class PurgeModelCommand extends Command

This specific PR will not fix that for complete replays because this only makes one read model flexible enough to handle duplicates. Additionally I think we will always need to do complete replays on an environment separate from production, because they take too long and the read models would be outdated until the replay is completely finished. So it probably doesn't really matter that the read models need to be emptied in the case of complete replays, except that the person doing the replay needs to remember to run that purge command first (on the replay instance).

In the context of the ticket however it seems like we only need to replay a specific role or specific list of roles. There are two ways we can handle this:

Update the relevant projector(s) to better handle the duplicate key constraint so we can replay those role(s) without errors - aka the suggested changes in this PR
Delete the relevant row(s) of those role(s) and then do the replay of those role(s)

I would try to compare the risks and benefits of these two regardless of the situation of complete replays.

Option 1:

✅ No manual work in the production DB = less/zero chance of accidentally deleting or altering incorrect rows
⚠️ Make sure to double check that this change does not introduce unintended side effects in the read model aside from replays. In other words, does it make sense that it also updates an existing role when a new one is created outside of a replay? This could cover up other bugs that cause new roles to be created with existing IDs and incorrectly overwrites data. Maybe it would be safer to first catch the exception, and then check if we're doing a replay before either updating or re-throwing the exception? We should have a flag on domain messages to indicate whether they are sent from a replay or not:

udb3-backend/src/Broadway/Domain/DomainMessageIsReplayed.php

Line 9 in ae62e3a

class DomainMessageIsReplayed implements DomainMessageSpecificationInterface

(Double-check that this still works as expected though!)
⚠️ Probably requires thorough testing on acc or test before pushing this to prod, which might take longer to achieve your goal compared to manually deleting the specific rows and re-running the replay command without pushing code changes

Option 2:

⚠️ Manual DB changes on prod = high risk
✅ No code changes = 0% chance of new bugs or unintended side effects in the new code
✅ Probably faster than deploying and testing code changes

If time is not an important factor I would be inclined to go for option 1 just to avoid manual changes to the prod DB, BUT with the caveat that we need to be sure that there are no unintended side effects.

As a bonus option 1 could also be useful in the context of complete replays to get rid of the purge command completely in the long run, but I'm not sure yet if that would be a real improvement because I imagine that @willaerk has included that in some kind of script he runs for replays and it's probably not something he has to actively think about. If it is a manual step however, which could be overlooked, it could be helpful to make the projectors smart enough to handle duplicate keys in the context of replays so we can skip that purge step completely.

bertramakers · 2026-02-11T14:06:47Z

After some more investigation with @grubolsch we discovered the actual issue is not with the role read model, but with the "user roles" read model in Redis. This is used to show the roles of a user on the user detail page in UDB, and also for the permission checks it seems. The projector for that read model does not take RoleDeleted into account:

udb3-backend/src/Role/ReadModel/Users/UserRolesProjector.php

Line 17 in 0c7d36e

class UserRolesProjector extends RoleProjector

So this PR will not solve the specific issue mentioned in the ticket. TBD if it's still useful or not to invest time in this particular PR (feedback + deploy + testing). Probably not high priority.

As a quick fix for the actual problem, @grubolsch is going to check with @willaerk how to manually clean up the roles of the user that is experiencing issues.

The complete fix is quite involved:

UserRolesProjector needs to take RoleDeleted into account, and it will need to use either the user_roles table or JSON projection of a role to find all users related to the deleted role, so it will also be crucial that the UserRolesProjector processes the RoleDeleted before that relationship info is deleted
All RoleDeleted messages will need to be replayed so they are correctly processed by the UserRolesProjector

Or alternatively, the code needs to be reworked to e.g. only work with a single user_roles table and not have 3 read models for the user & roles relationships.

While important to fix correctly, either solution will take too much time to fix the current situation with the user that is not able to work in UDB because their amount of roles is too big and the permission checks are broken.

lucwollants · 2026-02-12T11:05:53Z

We have some nice follow up tickets https://jira.publiq.be/browse/III-7029 and https://jira.publiq.be/browse/III-7030, so I would wait with this.

Fix replay of role created

f1d6430

grubolsch requested review from JonasVHG, bertramakers and lucwollants as code owners February 11, 2026 10:32

lucwollants reviewed Feb 11, 2026

View reviewed changes

src/Role/ReadModel/Search/Doctrine/DBALRepository.php Outdated Show resolved Hide resolved

Change to a query builder

a801786

grubolsch requested a review from lucwollants February 11, 2026 15:37

cs

a2e5f5f

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

III-6975 Fix replay of role created#2153

III-6975 Fix replay of role created#2153
grubolsch wants to merge 3 commits intomasterfrom
III-6975/fix-replay-role-added

grubolsch commented Feb 11, 2026

Uh oh!

lucwollants commented Feb 11, 2026

Uh oh!

grubolsch commented Feb 11, 2026

Uh oh!

grubolsch commented Feb 11, 2026

Uh oh!

Uh oh!

bertramakers commented Feb 11, 2026 •

edited

Loading

Uh oh!

bertramakers commented Feb 11, 2026 •

edited

Loading

Uh oh!

lucwollants commented Feb 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

grubolsch commented Feb 11, 2026

Fixed

Uh oh!

lucwollants commented Feb 11, 2026

Uh oh!

grubolsch commented Feb 11, 2026

Uh oh!

grubolsch commented Feb 11, 2026

Uh oh!

Uh oh!

bertramakers commented Feb 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

bertramakers commented Feb 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

lucwollants commented Feb 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

bertramakers commented Feb 11, 2026 •

edited

Loading

bertramakers commented Feb 11, 2026 •

edited

Loading