Conversation
|
Nice find, but I'm not sure we need to handle it this way. Many more projections stored in MySQL tables have the same issue. There used to be a page on Confluence about this, with a list of tables to clear. It's a known issue that we need to clear the read models before doing a replay. We also need to clean Redis before a replay. |
|
I am worried about this bug. I wonder how many other events there are that we cannot replay. |
|
I think it should be possible to replay each event individually, so I don't see a problem with this bugfix.
This is technically correct: if we do a complete replay we need to clear the read models first, because a lot of them are not programmed to handle duplicate key constraints. This "purge" is handled with a specific command for the MySQL tables, so we don't need to keep a list of those tables in Confluence and truncate them manually:
This specific PR will not fix that for complete replays, because it only makes one read model flexible enough to handle duplicates. Additionally, I think we will always need to do complete replays on an environment separate from production, because they take too long and the read models would be outdated until the replay has completely finished. So it probably doesn't really matter that the read models need to be emptied for complete replays, except that the person doing the replay needs to remember to run that purge command first (on the replay instance).
In the context of the ticket, however, it seems like we only need to replay a specific role or a specific list of roles. There are two ways we can handle this:
I would try to compare the risks and benefits of these two regardless of the situation of complete replays.
Option 1:
Option 2:
If time is not an important factor, I would be inclined to go for option 1, just to avoid manual changes to the prod DB, but with the caveat that we need to be sure there are no unintended side effects.
As a bonus, option 1 could also be useful in the context of complete replays, to get rid of the purge command completely in the long run. I'm not sure yet if that would be a real improvement, though, because I imagine that @willaerk has included it in some kind of script he runs for replays, so it's probably not something he has to actively think about. If it is a manual step that could be overlooked, however, it could be helpful to make the projectors smart enough to handle duplicate keys during replays, so we can skip that purge step completely. |
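To make the "projectors smart enough to handle duplicate keys" idea concrete, here is a minimal sketch, not the actual projector code. Everything in it is an assumption for illustration: the `roles` table, the event tuples, and the function names are hypothetical, and an in-memory SQLite database stands in for the MySQL read model (in MySQL the analogous statement would be something like `INSERT ... ON DUPLICATE KEY UPDATE`). It only shows why a naive projector fails on a second replay while an idempotent one does not.

```python
import sqlite3

# Hypothetical event shape: (event_type, role_id, role_name).
EVENTS = [
    ("RoleCreated", "role-1", "Editors"),
    ("RoleRenamed", "role-1", "Senior editors"),
]


def create_read_model() -> sqlite3.Connection:
    """Create an empty, hypothetical 'roles' read model in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE roles (id TEXT PRIMARY KEY, name TEXT NOT NULL)")
    return conn


def project_naive(conn: sqlite3.Connection, event: tuple) -> None:
    """Plain INSERT: fails on the primary key if the row already exists."""
    event_type, role_id, name = event
    if event_type == "RoleCreated":
        conn.execute("INSERT INTO roles (id, name) VALUES (?, ?)", (role_id, name))
    elif event_type == "RoleRenamed":
        conn.execute("UPDATE roles SET name = ? WHERE id = ?", (name, role_id))


def project_idempotent(conn: sqlite3.Connection, event: tuple) -> None:
    """INSERT OR REPLACE: an existing row with the same key is overwritten."""
    event_type, role_id, name = event
    if event_type == "RoleCreated":
        conn.execute(
            "INSERT OR REPLACE INTO roles (id, name) VALUES (?, ?)", (role_id, name)
        )
    elif event_type == "RoleRenamed":
        conn.execute("UPDATE roles SET name = ? WHERE id = ?", (name, role_id))


def replay(conn: sqlite3.Connection, project, events) -> None:
    """Feed the events to the given projector in their original order."""
    for event in events:
        project(conn, event)
```

With this sketch, replaying `EVENTS` twice through `project_naive` raises `sqlite3.IntegrityError` on the second pass, whereas `project_idempotent` ends up with the same single row after any number of replays, without purging the table first. Whether overwriting is free of unintended side effects would still need to be checked per read model.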
|
After some more investigation with @grubolsch, we discovered that the actual issue is not with the role read model, but with the "user roles" read model in Redis. This is used to show the roles of a user on the user detail page in UDB, and apparently also for the permission checks. The projector for that read model does not take
So this PR will not solve the specific issue mentioned in the ticket. TBD whether it's still useful to invest time in this particular PR (feedback + deploy + testing). Probably not high priority.
As a quick fix for the actual problem, @grubolsch is going to check with @willaerk how to manually clean up the roles of the user who is experiencing issues. The complete fix is quite involved:
Or alternatively, the code needs to be reworked to e.g. only work with a single
While it is important to fix this correctly, either solution will take too long to resolve the current situation, in which a user is not able to work in UDB because their number of roles is too large and the permission checks are broken. |
|
We have some nice follow-up tickets, https://jira.publiq.be/browse/III-7029 and https://jira.publiq.be/browse/III-7030, so I would hold off on this for now. |
Fixed
Ticket: https://jira.uitdatabank.be/browse/III-6975