Opt into purging by destroy for container entities, use delete elsewhere #23389

jrafanie · 2025-03-20T18:59:58Z

Fixes most of the issues in #23307

We were leaving lots of orphaned container* rows behind when we removed the container entities. This change lets us opt into using destroy on the primary table in situations where we know the associated records are NOT going to number in the many tens of thousands of rows. If they are, those associations should NOT use dependent :destroy and should have their own purger.

TODO:

Note: I've moved additional tasks for more non-container purging to #23394
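The opt-in described above can be pictured with a small in-memory sketch. This is not the ManageIQ implementation: `FakeRecord`, `purge_in_batches`, and the `*Sketch` classes are hypothetical names. Only the idea of a per-class `purge_method` that defaults to delete comes from this PR.

```ruby
# Hypothetical in-memory sketch; names are illustrative, not ManageIQ's.
class FakeRecord
  # Each class keeps its own list of "rows".
  def self.rows
    @rows ||= []
  end

  # Default purge strategy is the cheap one; classes opt into :destroy.
  def self.purge_method
    :delete
  end

  # Purge every row matching the block, one batch at a time.
  def self.purge_in_batches(batch_size, &condition)
    total = 0
    loop do
      batch = rows.select(&condition).first(batch_size)
      break if batch.empty?

      batch.each { |row| row.public_send(purge_method) }
      total += batch.size
    end
    total
  end

  attr_reader :children

  def initialize(children: [])
    @children = children
    self.class.rows << self
  end

  # delete: removes only this row, so any children are left orphaned.
  def delete
    self.class.rows.delete(self)
  end

  # destroy: removes dependent children first, then this row.
  def destroy
    children.each(&:destroy)
    delete
  end
end

class ConditionSketch < FakeRecord; end
class EventSketch < FakeRecord; end

# A container-like entity that opts into destroy.
class GroupSketch < FakeRecord
  def self.purge_method
    :destroy
  end
end

GroupSketch.new(children: [ConditionSketch.new, ConditionSketch.new])
GroupSketch.purge_in_batches(100) { |_row| true }
# The group's ConditionSketch rows are gone along with the group.

EventSketch.new(children: [ConditionSketch.new])
EventSketch.purge_in_batches(100) { |_row| true }
# The event is gone, but its ConditionSketch row is left behind.
```

The trade-off is the one the description names: destroy visits every dependent row, so it is only reasonable when those dependents are not expected to run into the tens of thousands.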

app/models/mixins/purging_mixin.rb

Fryguy · 2025-03-20T20:15:22Z

@jrafanie Just for clarification, this changes everything to use destroy as part of the purger, but what schedules the purging of those? Or is that still TODO?

Is the plan to let things orphan out the way they do now and then change that in a followup? Or is the plan to switch to destroy and also change the models to do dependent => destroy?

jrafanie · 2025-03-20T20:49:33Z

> @jrafanie Just for clarification, this changes everything to use destroy as part of the purger, but what schedules the purging of those? Or is that still TODO?

Just the container entities handled by the purger were changed to destroy. Everything else is still delete. The schedules remain the same:

      :container_entities_purge_interval: 1.day

Under the covers, the scheduler calls the same methods with the same interval. When they're executed in batches, they'll now use destroy.

> Is the plan to let things orphan out the way they do now and then change that in a followup? Or is the plan to switch to destroy and also change the models to do dependent => destroy?

Most of the container* records associated with these container entities that were missed, as highlighted in #23307 (container conditions, volumes, etc.), should already be dependent destroy from the entities (container/groups/node/project/etc.).

jrafanie · 2025-03-20T20:57:09Z

> Is the plan to let things orphan out the way they do now and then change that in a followup? Or is the plan to switch to destroy and also change the models to do dependent => destroy?

> Most of the container* records associated with these container entities that were missed, as highlighted in #23307 (container conditions, volumes, etc.), should already be dependent destroy from the entities (container/groups/node/project/etc.).

FYI, I added a bullet list to the PR description of what container entities and associations I'm verifying.

Fryguy · 2025-03-21T13:17:00Z

I'd like @agrare to also review this so it plays nice with the refresh workers possibly deleting entities.

agrare · 2025-03-21T13:23:24Z

> so it plays nice with the refresh workers possibly deleting entities.

Anything being considered for purging should already have been disconnected/archived by the RefreshWorker so there should be no contention here.

      def purge_scope(older_than)
        where(arel_table[:deleted_on].lteq(older_than))
      end
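As a rough illustration of why there should be no contention, the scope above only matches rows that have already been archived. The sketch below mimics that predicate in plain Ruby; `ArchivedRow` and `purge_candidates` are made-up names, and the real scope runs in SQL.

```ruby
# Illustrative stand-in for a row with an archive timestamp.
ArchivedRow = Struct.new(:name, :deleted_on)

# Mirrors the idea of purge_scope: a row qualifies only if it has been
# archived (deleted_on is set) on or before the cutoff.
def purge_candidates(rows, older_than)
  rows.select { |row| row.deleted_on && row.deleted_on <= older_than }
end

now = Time.utc(2025, 3, 21)
rows = [
  ArchivedRow.new("active", nil),                         # never archived
  ArchivedRow.new("recently_archived", now - 3600),       # archived an hour ago
  ArchivedRow.new("long_archived", now - (30 * 86_400))   # archived 30 days ago
]

cutoff = now - (7 * 86_400)
purge_candidates(rows, cutoff).map(&:name) # => ["long_archived"]
```

Active rows have no `deleted_on`, so they never match, which is the same outcome a SQL comparison against NULL gives.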

jrafanie · 2025-03-21T13:59:32Z

app/models/container_image.rb

@@ -38,9 +38,9 @@ class ContainerImage < ApplicationRecord
           :inverse_of => :resource
  has_one :last_scan_result, :class_name => "ScanResult", :as => :resource, :dependent => :destroy, :autosave => true

-  has_many :metric_rollups, :as => :resource, :dependent => :nullify, :inverse_of => :resource
-  has_many :metrics, :as => :resource, :dependent => :nullify, :inverse_of => :resource
-  has_many :vim_performance_states, :as => :resource, :dependent => :nullify, :inverse_of => :resource


Not sure why these were nullify

Yeah weird - in other models we just let those be orphaned, right?

Correct, this is the only one that does nullify on metrics* and vim_perf*

jrafanie · 2025-03-21T14:00:55Z

app/models/container_image_registry.rb

@@ -2,13 +2,13 @@ class ContainerImageRegistry < ApplicationRecord
  belongs_to :ext_management_system, :foreign_key => "ems_id"

  # Associated with images in the registry.
-  has_many :container_images, :dependent => :nullify


This should be handled by the purger as long as refresh marks them as archived via deleted_on.

ContainerImages are also related to Containers which could still be around after a container_image_registry is deleted. I think dependent nullify is appropriate here
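For readers less familiar with the options being debated in these review comments, here is a plain-Ruby simulation of what happens to child rows when the parent is destroyed under each `:dependent` setting. `ImageRow` and `after_parent_destroyed` are hypothetical helpers for illustration only; Rails does this work through association callbacks.

```ruby
# Illustrative stand-in for a child row with a foreign key to its parent.
ImageRow = Struct.new(:id, :registry_id)

# Returns the child rows as they would look after the parent is destroyed.
def after_parent_destroyed(children, parent_id, dependent: nil)
  case dependent
  when :destroy
    # Children of the parent are removed with it.
    children.reject { |c| c.registry_id == parent_id }
  when :nullify
    # Children survive, but their foreign key is cleared.
    children.map { |c| c.registry_id == parent_id ? ImageRow.new(c.id, nil) : c }
  else
    # No option: rows are untouched and keep a dangling foreign key.
    children
  end
end

images = [ImageRow.new(1, 10), ImageRow.new(2, 10), ImageRow.new(3, 20)]

after_parent_destroyed(images, 10, dependent: :destroy).map(&:id)          # => [3]
after_parent_destroyed(images, 10, dependent: :nullify).map(&:registry_id) # => [nil, nil, 20]
after_parent_destroyed(images, 10).map(&:registry_id)                      # => [10, 10, 20]
```

Nullify fits the registry case described above because the images may still be referenced by containers that outlive the registry.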

jrafanie · 2025-03-21T14:01:59Z

app/models/container_node.rb

@@ -44,7 +44,7 @@ class ContainerNode < ApplicationRecord
  has_many :metrics, :as => :resource
  has_many :metric_rollups, :as => :resource
  has_many :vim_performance_states, :as => :resource
-  has_many :miq_alert_statuses, :as => :resource
+  has_many :miq_alert_statuses, :as => :resource, :dependent => :destroy


not sure why these 2 were not being destroyed ☝️

Seeing thousands of these likely because nodes don't have the same fast lifecycle as containers/pods... should be handled by existing node purgers once this is in.

👍 this looks like a good fix

TODO: consider a data migration to do orphan cleanup for anything like this that was already removed and associated rows left orphaned.

Added a bullet item to do a data migration in the high level issue: #23394
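A data migration of the kind mentioned here would first need to identify rows whose parent is already gone. The sketch below shows that check on in-memory data, assuming a polymorphic reference like the `:as => :resource` associations in the diff. `StatusRow` and `orphaned_rows` are hypothetical, and a real migration would do this in SQL, in batches.

```ruby
require "set"

# Illustrative row with a polymorphic reference to its parent.
StatusRow = Struct.new(:id, :resource_type, :resource_id)

# Returns rows whose referenced parent no longer exists.
# existing_ids_by_type maps a type name to the Set of ids still present.
def orphaned_rows(rows, existing_ids_by_type)
  rows.reject do |row|
    existing_ids_by_type.fetch(row.resource_type, Set.new).include?(row.resource_id)
  end
end

statuses = [
  StatusRow.new(1, "ContainerNode", 100),
  StatusRow.new(2, "ContainerNode", 101), # node 101 was already removed
  StatusRow.new(3, "Vm", 7)
]
existing = { "ContainerNode" => Set[100], "Vm" => Set[7] }

orphaned_rows(statuses, existing).map(&:id) # => [2]
```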

jrafanie · 2025-03-21T14:04:10Z

app/models/container_project.rb

  has_many :all_container_groups, :class_name => "ContainerGroup", :inverse_of => :container_project
  has_many :archived_container_groups, -> { archived }, :class_name => "ContainerGroup"
-  has_many :persistent_volume_claims
+  has_many :persistent_volume_claims, :dependent => :destroy


WAT, yes, we probably don't have that many container projects but still, I think these should all be destroyed... otherwise the purger for projects will just leave these orphaned.

@agrare do any of these have a chance to be impossible to delete in the UI/backend due to many tens of thousands of rows? Maybe builds? I'm not sure.

persistent_volume_claims are associated to a container_volume, which does the dependent destroy:

app/models/container_volume.rb: belongs_to :persistent_volume_claim, :dependent => :destroy

jrafanie · 2025-03-21T14:05:38Z

app/models/container_service.rb

@@ -6,7 +6,7 @@ class ContainerService < ApplicationRecord

  belongs_to  :ext_management_system, :foreign_key => "ems_id"
  has_and_belongs_to_many :container_groups, :join_table => :container_groups_container_services
-  has_many :container_routes
+  has_many :container_routes, :dependent => :destroy


Is this one correct @agrare ? projects has many routes and services has many routes? Should projects have routes through services? Or maybe there can be routes not attached to a service? 🤷

Are we leaving routes orphaned @jrafanie ?

A container route is its own "top-level" managed object, so we would destroy it when we get the destroy event from k8s (versus, e.g., a container, which is only part of a container_group and wouldn't get its own destroy event). I'm surprised it isn't at least dependent nullify though.

No. I'm seeing fewer than 200 routes, whereas container conditions, env vars, volumes, security contexts, port configs, and custom attributes are the big ones in the container area, at over 1 million rows.

I can change it to nullify. Are there others here that should be treated in the same way? I don't want to change behavior. I'll verify with my table counts from various databases.

Kept it the same and added a comment

jrafanie · 2025-03-21T14:06:30Z

app/models/persistent_volume_claim.rb

@@ -1,7 +1,7 @@
 class PersistentVolumeClaim < ApplicationRecord
  belongs_to :ext_management_system, :foreign_key => "ems_id"
  belongs_to :container_project
-  has_many :container_volumes
+  has_many :container_volumes, :dependent => :destroy


not sure why this wasn't being destroyed... too many maybe? We don't have a separate purger for volumes though.

A persistent volume claim will point to a persistent volume when the claim is satisfied, but it doesn't own the volume. The volume could be reused by a future claim, so deleting the claim leaves the volume around.

Or more accurately, if the PVC is marked as "Retain", then on deleting it won't also delete the PV.

Kubernetes is hard this way, because technically all these objects are loosely bound to each other.

Will add a comment. That does make sense when thinking about a claim.

jrafanie · 2025-03-21T15:12:59Z

app/models/container_build.rb

@@ -9,7 +9,7 @@ class ContainerBuild < ApplicationRecord
           :as         => :resource,
           :inverse_of => :resource

-  has_many :container_build_pods
+  has_many :container_build_pods, :dependent => :destroy


seeing only 10s of these, perhaps they're managed elsewhere, such as by delete events.

container_build_pods are a top level managed object which should be deleted during refresh

jrafanie · 2025-03-21T15:20:21Z

app/models/container_image_registry.rb

  has_many :containers, :through => :container_images
  has_many :container_groups, :through => :container_images

  # Associated with serving the registry itself - for openshift's internal
  # image registry. These will be empty for external registries.
-  has_many :container_services
+  has_many :container_services, :dependent => :destroy


seeing tens of registries to many hundreds of services. Should be handled if registries are removed with the container manager.

ContainerServices should be deleted by refresh when they are removed

Fryguy · 2025-03-21T15:25:19Z

I'm starting to wonder if this is a refresh problem. I thought just about everything in Kubernetes was a "top-level" object, since you can create objects willy-nilly that do anything, and there are loose associations between many things by using labels and selectors. There aren't really ownership references between them. Are we just missing events during refresh for destroying these orphaned things, or perhaps they just aren't in the events/watches?

jrafanie · 2025-03-21T15:26:53Z

app/models/container_project.rb

-  has_many :container_routes
-  has_many :container_replicators
-  has_many :container_services
+  has_many :container_routes, :dependent => :destroy


Many 10s of these. Maybe handled by event handling or just lower lifecycle churn.

container_routes are a top-level managed entity that should be deleted by refresh when they are removed from k8s

jrafanie · 2025-03-21T15:27:04Z

app/models/container_project.rb

-  has_many :container_replicators
-  has_many :container_services
+  has_many :container_routes, :dependent => :destroy
+  has_many :container_replicators, :dependent => :destroy


jrafanie · 2025-03-21T15:28:54Z

app/models/container_project.rb

-  has_many :container_services
+  has_many :container_routes, :dependent => :destroy
+  has_many :container_replicators, :dependent => :destroy
+  has_many :container_services, :dependent => :destroy


See ☝️ registries: https://github.com/ManageIQ/manageiq/pull/23389/files#r2007811459

There are hundreds of these across hundreds of projects. Low churn or handled elsewhere?

jrafanie · 2025-03-21T15:29:21Z

app/models/container_project.rb

  has_many :containers, :through => :container_groups
  has_many :container_images, -> { distinct }, :through => :container_groups
  has_many :container_nodes, -> { distinct }, :through => :container_groups
  has_many :container_quotas, -> { active }, :inverse_of => :container_project
  has_many :container_quota_scopes, :through => :container_quotas
  has_many :container_quota_items, :through => :container_quotas
-  has_many :container_limits
+  has_many :container_limits, :dependent => :destroy


0 from example data

These are a top-level object which should be deleted by the refresher

app/models/container_project.rb

spec/models/mixins/purging_mixin_spec.rb

Add comment about pvcs from projects as pvcs are removed via the container volume belongs to.

Add comment about pvc having container_volumes that can live on their own and be used by a different claim, no need to delete the volume when a claim is removed.

Leave it to the purger where we already have purgers

No other model nullifies metrics|states.

jrafanie · 2025-03-25T17:59:13Z

spec/models/mixins/purging_mixin_spec.rb

@@ -1,6 +1,19 @@
 RSpec.describe PurgingMixin do
  let(:example_class) { PolicyEvent }
  let(:purge_date) { 2.weeks.ago }
+  purge_by_delete_classes, purge_by_destroy_classes = ActiveRecord::Base.descendants.select { |m| m.ancestors.include?(PurgingMixin) && m.base_model == m }.partition { |m| m.purge_method == :delete }


I had to add the base_model check as we don't need to test all the descendant classes such as:

ManageIQ::Providers::Azure::ContainerManager::ContainerGroup.purge_method is destroy
ManageIQ::Providers::Kubernetes::ContainerManager::ContainerGroup.purge_method is destroy
ManageIQ::Providers::Vmware::ContainerManager::ContainerGroup.purge_method is destroy
ManageIQ::Providers::OracleCloud::ContainerManager::Container.purge_method is destroy

TODO: add base_model? check and use it here

Yeah as discussed there's a base_class and base_class? method, so base_model should match that pattern with a base_model? method. Then we can use that here.
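The filtering discussed here can be sketched without ActiveRecord. In the example below every name (`PurgeMethodSketch`, `RecordSketch`, and the `*Sketch` model classes) is hypothetical, and `base_model?` is approximated as "inherits directly from the root class". The point is only that provider subclasses inherit the parent's `purge_method`, so checking base models alone is enough.

```ruby
# Hypothetical mixin: adds a class-level purge_method defaulting to :delete.
module PurgeMethodSketch
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def purge_method
      :delete
    end
  end
end

class RecordSketch
  # Approximation of base_model?: true only for direct children of the root.
  def self.base_model?
    superclass == RecordSketch
  end
end

class PolicyEventSketch < RecordSketch
  include PurgeMethodSketch
end

class ContainerGroupSketch < RecordSketch
  include PurgeMethodSketch

  # Opt into destroy; subclasses inherit this.
  def self.purge_method
    :destroy
  end
end

# Provider-specific subclass: same purge_method as its base model.
class KubernetesContainerGroupSketch < ContainerGroupSketch; end

candidates = [PolicyEventSketch, ContainerGroupSketch, KubernetesContainerGroupSketch]
base_models = candidates.select { |k| k.ancestors.include?(PurgeMethodSketch) && k.base_model? }
by_delete, by_destroy = base_models.partition { |k| k.purge_method == :delete }

by_delete  # => [PolicyEventSketch]
by_destroy # => [ContainerGroupSketch]
```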

jrafanie · 2025-03-25T21:22:44Z

app/models/container_quota_item/purging.rb

@@ -3,6 +3,9 @@ module Purging
    extend ActiveSupport::Concern
    include PurgingMixin

+    # According to 022e15256fd07fa7bf5b3ade7ce16b13daa87b84
+    # This is necessary because ContainerQuotaItem may be archived due to edits
+    # to parent ContainerQuota that is still alive.


Added a bullet item to review if archiving/purging is needed for container quota/quota scopes/quota items in #23394

jrafanie · 2025-03-25T21:37:29Z

lib/extensions/ar_base_model.rb

@@ -1,7 +1,8 @@
 module ActiveRecord
  class Base
    class << self
-      alias_method :base_model, :base_class
+      alias_method :base_model,  :base_class
+      alias_method :base_model?, :base_class?


This is the new alias to simplify the purging_mixin test 👇
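One Ruby detail worth keeping in mind with an alias like this: `alias_method` copies the method body as it exists at that moment, so the predicate keeps calling `base_class` even if a subclass later overrides `base_model`. The sketch below uses hypothetical classes (`RootSketch`, `OverridingSketch`) and simplified method bodies to show the mismatch. Whether this is exactly the case addressed by the follow-up #23395 is an assumption based on its title.

```ruby
class RootSketch
  class << self
    # Simplified stand-ins for the real methods.
    def base_class
      self
    end

    def base_class?
      base_class == self
    end

    alias_method :base_model,  :base_class
    alias_method :base_model?, :base_class?
  end
end

class OverridingSketch < RootSketch
  # Hypothetical override: this class reports a different base model.
  def self.base_model
    RootSketch
  end
end

OverridingSketch.base_model                     # => RootSketch
OverridingSketch.base_model == OverridingSketch # => false
OverridingSketch.base_model?                    # => true, because the alias still runs base_class == self
```

Defining the predicate in terms of `base_model` instead of aliasing it would keep the two consistent under overrides.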

Fryguy · 2025-03-26T17:38:41Z

Backported to spassky in commit 1cf5193.

commit 1cf51930a4ca5f4cf435b7f8352e712b195bf332
Author: Jason Frey <[email protected]>
Date:   Tue Mar 25 18:01:22 2025 -0400

    Merge pull request #23389 from jrafanie/purge-revamp
    
    Opt into purging by destroy for container entities, use delete elsewhere
    
    (cherry picked from commit 27854bb139c662d52e4de8c139362d581ecd23fe)

jrafanie added bug wip core spassky/yes? labels Mar 20, 2025

jrafanie requested review from agrare, Fryguy and kbrock as code owners March 20, 2025 18:59

jrafanie mentioned this pull request Mar 20, 2025

Change Container related tables purging to use archiving and destroy_all on archived records as they age out #23307

Closed

jrafanie force-pushed the purge-revamp branch 2 times, most recently from 0cd93cb to 0311e21 Compare March 20, 2025 22:43

jrafanie force-pushed the purge-revamp branch from 0311e21 to e0ad93d Compare March 21, 2025 15:25

jrafanie force-pushed the purge-revamp branch from 934b59c to 7d6f784 Compare March 25, 2025 15:26

jrafanie changed the title from "[WIP] Opt into purging by destroy for container entities, use delete elsewhere" to "Opt into purging by destroy for container entities, use delete elsewhere" Mar 25, 2025

jrafanie removed the wip label Mar 25, 2025

Fryguy assigned Fryguy and agrare Mar 25, 2025

Fryguy reviewed Mar 25, 2025

Fryguy reviewed Mar 25, 2025

jrafanie added 3 commits March 25, 2025 12:08

Tell why quota item purger exists if quota already has a purger

af998f0

Add destroy verification

3d008a1

jrafanie force-pushed the purge-revamp branch 2 times, most recently from 385b3b9 to 47b2ba1 Compare March 25, 2025 17:50

jrafanie added 2 commits March 25, 2025 13:55

Add test confirming which classes purge by destroy vs. delete

08dedad

Extract a method purge_one_batch

7eec344

jrafanie force-pushed the purge-revamp branch from 47b2ba1 to 7eec344 Compare March 25, 2025 17:55

Add purging tests for container classes

eaedae2

Fryguy approved these changes Mar 25, 2025

Add base_model? alias and use it in purging_mixin spec

39ae690

Fryguy merged commit 27854bb into ManageIQ:master Mar 25, 2025
8 of 12 checks passed

jrafanie mentioned this pull request Mar 26, 2025

Fix issue where base_model? is incorrect when base_model is overridden #23395

Merged

jrafanie deleted the purge-revamp branch March 26, 2025 13:48

Fryguy added spassky/yes and removed spassky/yes? labels Mar 26, 2025

Fryguy added a commit that referenced this pull request Mar 26, 2025

Merge pull request #23389 from jrafanie/purge-revamp

1cf5193

Opt into purging by destroy for container entities, use delete elsewhere (cherry picked from commit 27854bb)

Fryguy added spassky/backported and removed spassky/yes labels Mar 26, 2025
