Fix ConfigurationVO load exception after schema change #10485

Open · wants to merge 5 commits into main

Conversation

abh1sar
Collaborator

@abh1sar abh1sar commented Feb 28, 2025

Description

This PR fixes #10480.

The configuration table schema was changed in PR #10300.
That change causes a load exception if the ConfigurationVO class structure was cached with the old fields.

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • build/CI
  • test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

How did you try to break this feature and the system with this change?


codecov bot commented Feb 28, 2025

Codecov Report

Attention: Patch coverage is 33.33333% with 10 lines in your changes missing coverage. Please review.

Project coverage is 16.26%. Comparing base (eab37ec) to head (e56109a).
Report is 27 commits behind head on main.

Files with missing lines | Patch % | Lines
...c/main/java/com/cloud/utils/db/GenericDaoBase.java | 25.00% | 9 Missing ⚠️
...b/src/main/java/com/cloud/utils/db/GenericDao.java | 0.00% | 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #10485      +/-   ##
============================================
+ Coverage     16.15%   16.26%   +0.10%     
- Complexity    13274    13387     +113     
============================================
  Files          5666     5675       +9     
  Lines        498078   498943     +865     
  Branches      60267    60337      +70     
============================================
+ Hits          80481    81162     +681     
- Misses       408584   408740     +156     
- Partials       9013     9041      +28     
Flag | Coverage Δ
uitests | 3.99% <ø> (-0.01%) ⬇️
unittests | 17.12% <33.33%> (+0.11%) ⬆️

@abh1sar abh1sar changed the title from "Fix ConfigurationVO load exception on fresh install" to "Fix ConfigurationVO load exception after schema change" Feb 28, 2025
@DaanHoogland DaanHoogland added this to the 4.21.0 milestone Mar 3, 2025
@abh1sar
Collaborator Author

abh1sar commented Mar 3, 2025

@blueorangutan package

@blueorangutan

@abh1sar a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✖️ el8 ✖️ el9 ✖️ debian ✖️ suse15. SL-JID 12634

@abh1sar
Collaborator Author

abh1sar commented Mar 3, 2025

@blueorangutan package

@blueorangutan

@abh1sar a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 12636

@abh1sar
Collaborator Author

abh1sar commented Mar 3, 2025

@blueorangutan test

@blueorangutan

@abh1sar a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@blueorangutan

[SF] Trillian Build Failed (tid-12539)

@rohityadavcloud
Member

@blueorangutan test

@blueorangutan

@rohityadavcloud a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@blueorangutan

[SF] Trillian Build Failed (tid-12546)

@abh1sar
Collaborator Author

abh1sar commented Mar 4, 2025

@blueorangutan package

@blueorangutan

@abh1sar a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

[SF] Trillian Build Failed (tid-12553)

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 12648

@blueorangutan

[SF] Trillian Build Failed (tid-12555)

@JoaoJandre
Contributor

Instead of changing how we get the value, shouldn't we normalize the database data so that it works with the current way of getting the values?

Otherwise, if someone in the future creates a method to get the value the old way and only tests on a new install, it might introduce a bug for people running old installs.

@abh1sar
Collaborator Author

abh1sar commented Mar 11, 2025

@JoaoJandre We identified the issue: it lies in how the BackupDaoImpl class caches the columns of the table. Even though both the configuration table and the ConfigurationVO code have the new schema, the ConfigurationDaoImpl._allColumns field still held the older schema from before the upgrade. That's why, after a management server restart, ConfigurationDaoImpl._allColumns was regenerated with the correct fields.
I have reverted the older commit and added a commit that regenerates ConfigurationDaoImpl._allColumns when the configuration table schema is changed.
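
To make the failure mode described in this comment concrete, here is a minimal Java sketch. It is not the actual GenericDaoBase code: the class ColumnCacheSketch, its SCHEMA map and the nested Dao class are hypothetical, and the sketch assumes column metadata is cached in a static map shared by all DAO instances. Only the method name markForColumnsRefresh and the idea of a per-instance table field come from this PR.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model (hypothetical) of a DAO layer that caches a table's columns. */
final class ColumnCacheSketch {

    /** Stand-in for the live database schema: table name -> column definitions. */
    static final Map<String, List<String>> SCHEMA = new ConcurrentHashMap<>();

    /** Cache shared by every DAO instance, keyed by table name. */
    private static final Map<String, List<String>> CACHED_COLUMNS = new ConcurrentHashMap<>();

    static final class Dao {
        private final String table; // instance field, so the methods below cannot be static

        Dao(String table) {
            this.table = table;
        }

        /** Returns the cached columns, reading the schema only on a cache miss. */
        List<String> allColumns() {
            return CACHED_COLUMNS.computeIfAbsent(table, t -> List.copyOf(SCHEMA.get(t)));
        }

        /** Drops the cached entry so the next read goes back to the schema. */
        void markForColumnsRefresh() {
            CACHED_COLUMNS.remove(table);
        }
    }

    public static void main(String[] args) {
        SCHEMA.put("configuration", List.of("name VARCHAR", "scope VARCHAR"));
        Dao runtimeDao = new Dao("configuration");
        System.out.println(runtimeDao.allColumns()); // populates the cache

        // Simulated upgrade step: the scope column changes type.
        SCHEMA.put("configuration", List.of("name VARCHAR", "scope BIGINT"));
        System.out.println(runtimeDao.allColumns()); // still served from the stale cache

        // A second instance can invalidate the entry because the cache is static.
        new Dao("configuration").markForColumnsRefresh();
        System.out.println(runtimeDao.allColumns()); // re-read from the schema
    }
}
```

Because the cache in this sketch is static, a throwaway instance created during the upgrade can invalidate the entry that the long-lived instance reads, which is the shape of the `new ConfigurationDaoImpl()` call followed by `markForColumnsRefresh()` in the diff under review.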

@abh1sar abh1sar self-assigned this Mar 11, 2025
@rohityadavcloud
Member

@blueorangutan package

@blueorangutan

@rohityadavcloud a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 12767

@abh1sar abh1sar requested a review from nvazquez March 13, 2025 12:07
@DaanHoogland
Contributor

@blueorangutan test

@blueorangutan

@DaanHoogland a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@blueorangutan

[SF] Trillian Build Failed (tid-12681)

@@ -98,6 +100,8 @@ protected void migrateConfigurationScopeToBitmask(Connection conn) {
migrateExistingConfigurationScopeValues(conn);
DbUpgradeUtils.dropTableColumnsIfExist(conn, "configuration", List.of("scope"));
DbUpgradeUtils.changeTableColumnIfNotExist(conn, "configuration", "new_scope", "scope", "BIGINT NOT NULL DEFAULT 0 COMMENT 'Bitmask for scope(s) of this parameter'");
ConfigurationDao dao = new ConfigurationDaoImpl();
dao.markForColumnsRefresh();
Contributor

Instead of marking columns for refresh later, why not do it on the upgrade?

Contributor

@JoaoJandre cc @abh1sar there may be a better way to refresh the columns, but I wasn't able to do it after the upgrade from this class. At runtime, a different ConfigurationDaoImpl instance, the one created as the bean, is what needs the columns refresh, and I was not able to access it from here directly. Maybe something like this can help:

        try {
            ConfigurationDao dao =
                    ComponentContext.getDelegateComponentOfType(ConfigurationDao.class);
            dao.refreshColumns();
        } catch (NoSuchBeanDefinitionException ignored) {
            logger.debug("No ConfigurationDao bean found");
        }

Collaborator Author

@shwstppr @JoaoJandre
Sorry for the delay. I tested the above, but ComponentContext only keeps track of PluggableService classes.
I was not able to access ConfigurationDaoImpl any other way either.
Any other ideas are welcome.

Contributor

@abh1sar thanks for checking. No ideas at the moment.
The only improvement I can suggest right now is making markForColumnsRefresh static, so we don't need to create a new instance, since it only toggles a static member of the class.

Collaborator Author

@abh1sar abh1sar Apr 8, 2025

@shwstppr markForColumnsRefresh accesses the non-static _table field in GenericDaoBase.

Contributor

Thanks @abh1sar for checking. My bad.

@JoaoJandre
Contributor

@abh1sar could you explain how the class structure gets cached? I think I may be missing something: during an upgrade the MGMT server is down, so which cache could it have after it is started? How can we reproduce this error?

Contributor

@shwstppr shwstppr left a comment

Code LGTM but others will have a better say

Contributor

@shwstppr shwstppr left a comment

@abh1sar based on your testing and the discussion, this would need changes.

I no longer saw the error after evicting old connections when HikariCP is used for pooling.

diff --git a/engine/schema/src/main/java/com/cloud/upgrade/DatabaseUpgradeChecker.java b/engine/schema/src/main/java/com/cloud/upgrade/DatabaseUpgradeChecker.java
index 1e3b3a7e5e..6ea7242f83 100644
--- a/engine/schema/src/main/java/com/cloud/upgrade/DatabaseUpgradeChecker.java
+++ b/engine/schema/src/main/java/com/cloud/upgrade/DatabaseUpgradeChecker.java
@@ -379,6 +379,7 @@ public class DatabaseUpgradeChecker implements SystemIntegrityChecker {
         } finally {
             txn.close();
         }
+        TransactionLegacy.refreshConnections(TransactionLegacy.CLOUD_DB);
         return version;
     }
 
diff --git a/framework/db/src/main/java/com/cloud/utils/db/TransactionLegacy.java b/framework/db/src/main/java/com/cloud/utils/db/TransactionLegacy.java
index 88af397c06..18a90749e4 100644
--- a/framework/db/src/main/java/com/cloud/utils/db/TransactionLegacy.java
+++ b/framework/db/src/main/java/com/cloud/utils/db/TransactionLegacy.java
@@ -605,6 +605,15 @@ public class TransactionLegacy implements Closeable {
         return _conn;
     }
 
+    public static void refreshConnections(final short dbId) {
+        if (dbId != CLOUD_DB) {
+            return;
+        }
+        if (s_ds instanceof HikariDataSource) {
+            ((HikariDataSource)s_ds).getHikariPoolMXBean().softEvictConnections();
+        }
+    }
+
     protected boolean takeOver(final String name, final boolean create) {
         if (_stack.size() != 0) {
             if (!create) {

I'm not sure about DBCP. I've not reproduced the issue there yet, and based on my limited research it may require closing the datasource.
Also, we need to remove the markForColumnsRefresh logic, as that doesn't seem to be the problem.
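
For readers unfamiliar with soft eviction, the toy pool below sketches the behaviour that HikariCP documents for softEvictConnections(): idle connections are closed immediately, while connections that are currently borrowed are only marked and get closed when they are returned. The class SoftEvictPoolSketch and its Conn type are hypothetical stand-ins written for illustration; this is not HikariCP's implementation and it ignores thread safety.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

/** Toy, single-threaded connection pool (hypothetical) illustrating soft eviction. */
final class SoftEvictPoolSketch {

    /** Stand-in for a pooled database connection. */
    static final class Conn {
        final int id;
        boolean closed;
        boolean evictOnReturn;

        Conn(int id) {
            this.id = id;
        }
    }

    private final Deque<Conn> idle = new ArrayDeque<>();
    private final Set<Conn> inUse = new HashSet<>();
    private int nextId = 1;

    /** Hands out an idle connection if one exists, otherwise opens a new one. */
    Conn borrow() {
        Conn c = idle.isEmpty() ? new Conn(nextId++) : idle.pop();
        inUse.add(c);
        return c;
    }

    /** Returns a connection; marked connections are closed instead of pooled. */
    void giveBack(Conn c) {
        inUse.remove(c);
        if (c.evictOnReturn) {
            c.closed = true;
        } else {
            idle.push(c);
        }
    }

    /** Closes idle connections now and marks borrowed ones for closing on return. */
    void softEvictConnections() {
        for (Conn c : idle) {
            c.closed = true;
        }
        idle.clear();
        for (Conn c : inUse) {
            c.evictOnReturn = true;
        }
    }
}
```

In this model, every connection opened before the eviction call is eventually replaced without interrupting work in flight, so later borrowers always get a connection opened after the schema change.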

Development

Successfully merging this pull request may close these issues.

[DB] Exceptions logged on fresh management server start
8 participants