-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Fix ConfigurationVO load exception after schema change #10485
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #10485 +/- ##
============================================
+ Coverage 16.15% 16.26% +0.10%
- Complexity 13274 13387 +113
============================================
Files 5666 5675 +9
Lines 498078 498943 +865
Branches 60267 60337 +70
============================================
+ Hits 80481 81162 +681
- Misses 408584 408740 +156
- Partials 9013 9041 +28
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
@blueorangutan package |
@abh1sar a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
Packaging result [SF]: ✖️ el8 ✖️ el9 ✖️ debian ✖️ suse15. SL-JID 12634 |
@blueorangutan package |
@abh1sar a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 12636 |
@blueorangutan test |
@abh1sar a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests |
[SF] Trillian Build Failed (tid-12539) |
@blueorangutan test |
@rohityadavcloud a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests |
[SF] Trillian Build Failed (tid-12546) |
@blueorangutan package |
@abh1sar a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
[SF] Trillian Build Failed (tid-12553) |
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 12648 |
[SF] Trillian Build Failed (tid-12555) |
Instead of changing how we get the value, shouldn't we normalize the database data so that it works with the current way of getting the values? Otherwise, if someone in the future creates a method to get the value the old way and only tests on a new install, it might introduce a bug for people running old installs. |
@JoaoJandre We identified the issue with how BackupDaoImpl class caches the columns of the table. Even though both configurations table and ConfigurationVO code has the new schema, the ConfigurationsDao._allColumns field still had the older schema from before upgrade. That's why after management server restart ConfigurationsDaoImpl_allColumns was getting regenerated with the correct fields. |
@blueorangutan package |
@rohityadavcloud a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 12767 |
@blueorangutan test |
@DaanHoogland a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests |
[SF] Trillian Build Failed (tid-12681) |
@@ -98,6 +100,8 @@ protected void migrateConfigurationScopeToBitmask(Connection conn) { | |||
migrateExistingConfigurationScopeValues(conn); | |||
DbUpgradeUtils.dropTableColumnsIfExist(conn, "configuration", List.of("scope")); | |||
DbUpgradeUtils.changeTableColumnIfNotExist(conn, "configuration", "new_scope", "scope", "BIGINT NOT NULL DEFAULT 0 COMMENT 'Bitmask for scope(s) of this parameter'"); | |||
ConfigurationDao dao = new ConfigurationDaoImpl(); | |||
dao.markForColumnsRefresh(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of marking columns for refresh later, why not do it on the upgrade?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JoaoJandre cc @abh1sar there could be a better way to refresh columns but I wasn't able to do it after the upgrade from this class. At runtime, we have a different ConfigurationDaoImpl instance which was created as the bean and needs columns refresh. I was not being able to access it from here directly. Maybe something like this can help,
try {
ConfigurationDao dao =
ComponentContext.getDelegateComponentOfType(ConfigurationDao.class);
dao.refreshColumns();
} catch (NoSuchBeanDefinitionException ignored) {
logger.debug("No ConfigurationDao bean found");
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shwstppr @JoaoJandre
sorry for the delay. I tested the above, but ComponentContext only keeps track of PluggableService class.
I was not able to access ConfigurationDaoImpl any other way as well.
Any other ideas are welcome.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@abh1sar thanks for checking. No ideas at the moment
Only improvement suggestion I've at the moment is making markForColumnsRefresh
static so we don't need to create new instance and as it only toggles a static member of the class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shwstppr markForColumnRefresh accesses the non-static _table
field in GenericDaoBase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @abh1sar for checking. My bad.
@abh1sar could you explain how does the class structure get cached? I think I may be missing something, as on upgrade, the MGMT server will be down, after it is started which cache could it have? How can we reproduce this error? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code LGTM but others will have a better say
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@abh1sar based on your testing and discussion this would need changes.
I was able to not see the error by evicting old connections when HIkariCP is used for pooling.
diff --git a/engine/schema/src/main/java/com/cloud/upgrade/DatabaseUpgradeChecker.java b/engine/schema/src/main/java/com/cloud/upgrade/DatabaseUpgradeChecker.java
index 1e3b3a7e5e..6ea7242f83 100644
--- a/engine/schema/src/main/java/com/cloud/upgrade/DatabaseUpgradeChecker.java
+++ b/engine/schema/src/main/java/com/cloud/upgrade/DatabaseUpgradeChecker.java
@@ -379,6 +379,7 @@ public class DatabaseUpgradeChecker implements SystemIntegrityChecker {
} finally {
txn.close();
}
+ TransactionLegacy.refreshConnections(TransactionLegacy.CLOUD_DB);
return version;
}
diff --git a/framework/db/src/main/java/com/cloud/utils/db/TransactionLegacy.java b/framework/db/src/main/java/com/cloud/utils/db/TransactionLegacy.java
index 88af397c06..18a90749e4 100644
--- a/framework/db/src/main/java/com/cloud/utils/db/TransactionLegacy.java
+++ b/framework/db/src/main/java/com/cloud/utils/db/TransactionLegacy.java
@@ -605,6 +605,15 @@ public class TransactionLegacy implements Closeable {
return _conn;
}
+ public static void refreshConnections(final short dbId) {
+ if (dbId != CLOUD_DB) {
+ return;
+ }
+ if (s_ds instanceof HikariDataSource) {
+ ((HikariDataSource)s_ds).getHikariPoolMXBean().softEvictConnections();
+ }
+ }
+
protected boolean takeOver(final String name, final boolean create) {
if (_stack.size() != 0) {
if (!create) {
I'm not sure about DBCP. I've not reproduced the issue there yet and based on my little research it may need closing the datasource.
Also, we need to remove that markForColumnsRefresh logic as that doesn't seem to be the problem.
Description
This PR fixes #10480
The configuration table schema was changed in PR #10300
But it causes problem if the ConfigurationVO class structure was cached with the old fields.
Types of changes
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
Bug Severity
Screenshots (if appropriate):
How Has This Been Tested?
How did you try to break this feature and the system with this change?