guusdk
Member

@guusdk guusdk commented Oct 1, 2025

This commit shifts responsibility for maintaining the last stanza that changed a MUC room's subject from its HistoryStrategy to the MUCRoom implementation itself.

The purpose of this change is to ensure that a MUCRoom has all data required to broadcast its latest subject.

The changes in this commit include:

  • datatype change for MUCRoom#subject from String to Stanza
  • deprecation of various subject-related methods in HistoryStrategy
  • database migration to copy the latest subject stanza from table ofMucConversationLog to ofMucRoom

The database migration is not perfect: it does not migrate the stanza's timestamp, which means that the subject that is broadcast no longer mentions when the subject was changed. Because the stanzas in ofMucConversationLog do have an occupantJID in their 'from' attribute, the 'who' is correctly retained.

On rare occasions (see OF-3131) the room can have a subject while the history does not. In those cases, the database migration scripts leave the (non-stanza) subject in the ofMucRoom table intact, which means that the column holds a mixture of plain text and XMPP data. The code that reads from the database will generate a stanza on the fly from such a plain text subject.
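To illustrate the mixed-content column, here is a minimal sketch of how a reader could tell a stored stanza from a plain text subject and wrap the latter on the fly. It is not the code from this pull request: the class `SubjectColumnSketch`, the method names, the `<message` prefix heuristic and the use of the room JID as the 'from' address are all assumptions made for illustration. A real implementation would more likely parse the value as XML.

```java
// Hypothetical sketch; none of these names come from the Openfire code base.
final class SubjectColumnSketch {

    /** Escapes characters that cannot appear verbatim in XML text or attribute values. */
    static String escapeXml(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;");
    }

    /**
     * Returns a subject stanza (as XML text) for a value read from the subject column.
     * Assumption: a value that starts with "<message" is an already-migrated stanza;
     * anything else is treated as a legacy plain text subject.
     */
    static String toSubjectStanza(String stored, String roomJid) {
        if (stored == null) {
            return null; // the room has no subject
        }
        String trimmed = stored.trim();
        if (trimmed.startsWith("<message")) {
            return trimmed; // already a stanza: use it as-is
        }
        // Legacy plain text: build a minimal groupchat stanza around it.
        // Assumption: with no known author, the room JID is used as the sender.
        return "<message type=\"groupchat\" from=\"" + escapeXml(roomJid) + "\">"
                + "<subject>" + escapeXml(stored) + "</subject>"
                + "</message>";
    }
}
```

Because the stanza is derived from the stored text each time, the plain text value stays untouched in the database.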

@guusdk
Member Author

guusdk commented Oct 1, 2025

Before merging, the database migration scripts should be tested.

@guusdk guusdk requested review from Fishbowler and akrherz October 1, 2025 17:54
Member

@akrherz akrherz left a comment

Seems sensible

@guusdk
Member Author

guusdk commented Oct 2, 2025

I've added some to-be-migrated data to the upgrade scripts (prior to the migration taking place). This is intended to test the migration via GitHub-invoked CI. I'll revert the commit that introduces this data once CI has run.

@guusdk
Member Author

guusdk commented Oct 2, 2025

Although these scripts cannot check the result of the migration, they at least verified that the migration didn't crash. @Fishbowler I'd love your feedback on the way this was tested, and any improvements you can suggest.

@guusdk
Member Author

guusdk commented Oct 2, 2025

I've reverted the commit that introduces the test data. This way, the data remains part of the commit history, without continuing to be part of the migration scripts.

@guusdk
Member Author

guusdk commented Oct 2, 2025

Related changes in the REST API plugin: igniterealtime/openfire-restAPI-plugin#214

@Fishbowler
Member

In the case where a stanza is generated because the history is missing, should the stanza be persisted, to avoid repeating the same work possibly many times over?

@Fishbowler
Member

Were you able to do any testing of the upgrade scripts locally?

@guusdk
Member Author

guusdk commented Oct 2, 2025

> In the case where a stanza is generated because the history is missing, should the stanza be persisted, to avoid repeating the same work possibly many times over?

I'm on the fence. It would replace one 'broken' solution with another. If we retain the non-stanza subject, it remains possible to manually detect the affected rooms and correct things (somehow). That's arguably a low-value benefit, but then so is avoiding such a tiny amount of duplicated work. Maybe wanting to avoid the added complexity of saving things in the database (while we're already iterating over the database) is the deciding factor here. No hill for me to die on.

> Were you able to do any testing of the upgrade scripts locally?

Not really. I dabbled a bit with HSQLDB, and used some external databases (without hooking up Openfire) to do some manual testing of partial scripts, but nothing of much consequence.

guusdk added 6 commits October 3, 2025 16:16
This reverts commit 92569f6.

The tests in that commit have served their purpose (they passed the migration test that happens as part of the continuous integration workflows).
@guusdk guusdk force-pushed the OF-3131_MUC-subject branch from 05cc534 to 645e354 Compare October 3, 2025 14:16
@guusdk
Member Author

guusdk commented Oct 3, 2025

Rebased

@Fishbowler
Member

Based on the dummy data tests you did, we know that it upgrades without error on four systems (MySQL, MSSQL, Postgres, Oracle). You've also had a play with HSQLDB. From a risk POV, I think we're mostly there.

Comment on lines +2726 to +2728
changeSubject(roomSubject, this.selfOccupantData);
} catch (ForbiddenException e) {
Log.warn("Unable to change the subject of room {}", this.getJID(), e);
Member

Why is this the only exception that could happen? Surely there could (should?) be database errors and other failures that could occur and be trapped and bubbled up?

Member Author

This is the only checked exception that is thrown. Things like database exceptions are handled within these calls.

There's a lot to be said about handling of checked vs unchecked exceptions, but also on the handling of database errors in Openfire. That, however, is probably best addressed in a distinct issue/topic/pull request. What we do here is at least roughly consistent with the rest of the code.
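The distinction discussed here can be sketched in isolation. The example below is hypothetical and does not reproduce Openfire code: `ForbiddenSketchException` stands in for a checked exception such as `ForbiddenException`, and `changeSubject` is a simplified stand-in for the real method. It shows that the compiler only forces the caller to handle the checked exception, while unchecked exceptions propagate unless something catches them explicitly.

```java
// Hypothetical sketch of checked versus unchecked exception handling.
final class CheckedExceptionSketch {

    /** Stand-in for a checked exception: callers must catch or declare it. */
    static class ForbiddenSketchException extends Exception {
        ForbiddenSketchException(String message) {
            super(message);
        }
    }

    /** Simplified stand-in for a subject change that may be refused. */
    static String changeSubject(String subject, boolean allowed) throws ForbiddenSketchException {
        if (!allowed) {
            throw new ForbiddenSketchException("not allowed to change the subject");
        }
        if (subject == null) {
            // Unchecked: not part of the method signature, so callers are not forced to handle it.
            throw new IllegalArgumentException("subject is required");
        }
        return subject;
    }

    /** Handles only the checked exception, as the reviewed snippet does. */
    static String tryChangeSubject(String subject, boolean allowed) {
        try {
            return changeSubject(subject, allowed);
        } catch (ForbiddenSketchException e) {
            // The reviewed code logs a warning at this point; the sketch just reports failure.
            return null;
        }
    }
}
```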

Comment on lines 11 to 23
UPDATE ofMucRoom r
SET r.subject = (
    SELECT l.stanza
    FROM ofMucConversationLog l
    WHERE l.roomID = r.roomID
      AND l.subject IS NOT NULL
      AND l.logTime = (
          SELECT MAX(l2.logTime)
          FROM ofMucConversationLog l2
          WHERE l2.roomID = r.roomID
            AND l2.subject IS NOT NULL
      )
);
Member

An LLM complained that this might cause many full table scans, and suggested using ROW_NUMBER like you did for Oracle. I've no experience here, but the docs suggest that it's possible.

Member Author

ROW_NUMBER in HSQLDB appears to have quirks. An LLM suggests that we can use it for this particular purpose only after upgrading from 2.7.1 to 2.7.4 (which we may want to do anyway).

I do believe that the query can be rewritten for a roughly comparable performance gain, using something that works under 2.7.1. I have added a commit with that improvement.

guusdk added 3 commits October 9, 2025 14:41
This is an LLM-based optimization.

How it works:
- Inner derived table lm computes the latest logTime per room (ignoring NULL subjects).
- Joins l1 back to lm on (roomID, maxTime) to get the exact row(s).
- Filters out rows with stanza IS NULL.
- Updates r.subject with l1.stanza.

Advantages:
- Clear separation: first compute “latest logTime per room,” then pick the stanza.
- Deterministic if each room has at most one message per logTime.
- Efficient for large tables because it aggregates once per room.
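The two-step selection described in this commit message can be mimicked in memory, which makes the logic easy to check outside a database. This is only a sketch: the class `LatestSubjectSketch`, the `LogRow` type and its fields are hypothetical stand-ins for rows of `ofMucConversationLog`, and the actual migration is done in SQL.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory illustration of "latest logTime per room, then join back".
final class LatestSubjectSketch {

    /** Stand-in for one row of the conversation log. */
    static final class LogRow {
        final long roomId;
        final long logTime;
        final String subject; // null for messages that do not change the subject
        final String stanza;

        LogRow(long roomId, long logTime, String subject, String stanza) {
            this.roomId = roomId;
            this.logTime = logTime;
            this.subject = subject;
            this.stanza = stanza;
        }
    }

    static Map<Long, String> latestSubjectStanzas(List<LogRow> log) {
        // Step 1 (derived table "lm"): latest logTime per room, ignoring NULL subjects.
        Map<Long, Long> maxTime = new HashMap<>();
        for (LogRow row : log) {
            if (row.subject != null) {
                maxTime.merge(row.roomId, row.logTime, Math::max);
            }
        }
        // Step 2: join the log back on (roomId, maxTime) and skip rows without a stanza.
        // If several rows share that logTime, the last one iterated wins in this sketch.
        Map<Long, String> subjectByRoom = new HashMap<>();
        for (LogRow row : log) {
            Long latest = maxTime.get(row.roomId);
            if (latest != null && latest == row.logTime && row.stanza != null) {
                subjectByRoom.put(row.roomId, row.stanza);
            }
        }
        return subjectByRoom;
    }
}
```

Rooms whose log contains no subject change never appear in the result, which corresponds to the case where the plain text subject is left in place.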
This reverts commit 732f4f4.

The tests in that commit have served their purpose (they passed the migration test that happens as part of the continuous integration workflows).
@guusdk guusdk requested a review from Fishbowler October 9, 2025 12:55