Skip to content

feat: run deferred events with fresh charm instances #1631

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 34 commits into
base: main
Choose a base branch
from

Conversation

tonyandrewmeyer
Copy link
Contributor

@tonyandrewmeyer tonyandrewmeyer commented Mar 14, 2025

Create a new charm object for each emission of a deferred event, and for the event that triggered the ops process.

When using the testing.Context as a context manager, the charm that is made available is the one for the primary event. If there is demand in the future, we could add a mechanism for accessing the other charm objects for manipulation prior to running the deferred events, or inspection after.

This will break charms that are relying on the charm object to be the same object for deferred and the primary event. However, we have always told charmers to not do that, and it's risky behaviour because it'll differ based on whether it's the non-deferred or deferred handler call, so we're ok with this change.

The other significant change for charms is that, because we need to commit the framework before finishing up with each charm, the pre-commit and commit Lifecycle events are emitted after each deferred event as well as after the primary event. I'm only aware of two charms that use these:

  • ops-sunbeam, which uses commit to put statuses in stored state. Running more than once should be fine, and the performance impact not noticeable.
  • statustest (of @benhoyt's test-charms collection), which seems almost identical to the sunbeam one. I assume this isn't really used, and it wouldn't break anyway.

There are three main parts of this implementation:

  • Fairly minor adjustments to _main, particularly _Manager, so that the charm is passed around and the framework is accessed via the charm, rather than creating the framework and charm once, and the actual creation of each charm prior to re-emitting.
  • Functionality to "forget" charms. The framework registers event types on the class, not the instance, so creating multiple charm instances, even on different Frameworks, is not normally possible (Scenario gets around this by wrapping the event source in a new class for each run). CharmBase gains a _destory_charm function that undoes all of the event registration done in __init__ (for all the dynamic events from relations, containers, and so forth). I considered using __del__ for this, but it's awkward to cleanly remove everything when destruction is happening, and we don't really want charmers to start creating more than one charm instance, so this leaves the functionality private.
  • Reworking the Scenario "capture events" mechanism, to use a Framework subclass, rather than a context manager. This needed changes to handle capturing deferred events, but it also fixes a bug. Previously, the Framework class was patched, with methods that used local variables to determine which events would be captured: this meant that the capture_events() context manager was not threadsafe.

Fixes #1174

…it doesn't use cls, so a static method makes more sense than a class method.
This is a public API change so that _Manager can continue to use only public framework methods. It means that anyone using a framework object directly can also do this, but that seems acceptable.
…e type, even though it's not required right now.
@tonyandrewmeyer tonyandrewmeyer requested a review from benhoyt March 14, 2025 06:04
@tonyandrewmeyer
Copy link
Contributor Author

@benhoyt would you mind giving this a 'pre-review' - no need to go over it line-by-line, but check whether the general approach seems reasonable?

Copy link
Collaborator

@benhoyt benhoyt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One comment about whether we can restructure to avoid the "forgetting" (which seems error-prone to me).

Regarding this:

This will break charms that are relying on the charm object to be the same object for deferred and the primary event. However, we have always told charmers to not do that, and it's risky behaviour because it'll differ based on whether it's the non-deferred or deferred handler call, so we're ok with this change.

I agree. But can we get any more verification of this? Perhaps (once we've finalised the approach) run several charms' integration tests to get a better idea.

Can we check with the ops-sunbeam folks about the commit change, and make sure the original authors of that approach seem okay with this?

The reemit change is an optional argument so shouldn't break even those existing users, right?

@tonyandrewmeyer
Copy link
Contributor Author

tonyandrewmeyer commented Mar 16, 2025

This will break charms that are relying on the charm object to be the same object for deferred and the primary event. However, we have always told charmers to not do that, and it's risky behaviour because it'll differ based on whether it's the non-deferred or deferred handler call, so we're ok with this change.

I agree. But can we get any more verification of this? Perhaps (once we've finalised the approach) run several charms' integration tests to get a better idea.

I can try that, yes. The trouble I've had previously is that the most common use of defer is in the data platform charms, and it's challenging to run their integration tests with a custom branch of ops.

Can we check with the ops-sunbeam folks about the commit change, and make sure the original authors of that approach seem okay with this?

Sure.

The reemit change is an optional argument so shouldn't break even those existing users, right?

Yes, it's entirely backwards compatible. I mentioned it because it means we support that going forward, and it's not strictly necessary (since _Manager could just use a private Framework method).

@benhoyt
Copy link
Collaborator

benhoyt commented Mar 16, 2025

I can try that, yes. The trouble I've had previously is that the most common use of defer is in the data platform charms, and it's challenging to run their integration tests with a custom branch of ops.

Well, let's try it on some non-data charms before merge, and then a couple of data platform ones after merge but before release. That would be easier, right? We can always revert or fix before release.

Regarding reemit and supporting that new arg. That's a good point. I think I'd prefer to use the private method in that case, rather than exporting an new arg that others won't / probably shouldn't use.

@tonyandrewmeyer tonyandrewmeyer marked this pull request as ready for review March 18, 2025 08:14
Copy link
Collaborator

@benhoyt benhoyt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me now, thanks

Copy link
Contributor

@dimaqq dimaqq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About multiple lifecycle events: what happens if the collected status values are inconsistent?

@dimaqq
Copy link
Contributor

dimaqq commented Mar 25, 2025

Or did I perhaps misunderstand, and it's pre-commit and commit events are multiplied, but collect app/unit status events are not?

Copy link
Contributor

@dimaqq dimaqq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed about 50% so far.

ops/charm.py Outdated
Comment on lines 1380 to 1386
for relation_name in self.framework.meta.relations:
relation_name = relation_name.replace('-', '_')
self.on._undefine_event(f'{relation_name}_relation_created')
self.on._undefine_event(f'{relation_name}_relation_joined')
self.on._undefine_event(f'{relation_name}_relation_changed')
self.on._undefine_event(f'{relation_name}_relation_departed')
self.on._undefine_event(f'{relation_name}_relation_broken')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somehow I'm not too happy with this.
Is there a better way?

For example, remembering the events that were defined?
A context manager?
Or removing events by name or pattern matching?
Or removing all events except the hardcoded set / allow list?
Or simply resetting the whole self.on?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, remembering the events that were defined?

The events that were defined are the ones in the meta list - it seems to me that also putting them in some other list would just be duplication.

A context manager?

The defining is done in the CharmBase __init__, and I don't think we can change that and keep backwards compatibility. So the defining couldn't be in an __enter__. I guess main could do something like:

            with self._make_charm(event_handle.kind) as charm:
                charm.framework._reemit_single_path(event_path)
                self._commit(charm.framework)
                self._close()

And the __exit__ could call _destroy_charm. That doesn't seem significantly better to me, though. The calls to _undefine_event could go into the __exit__, but I think it's better to have them right next to the code that does the defining - for example, if we end up adding a new dynamic event then it's more likely that the person doing that will realise that they need to undefine it as well than if the code is in _main.py.

Or removing events by name or pattern matching?

This is by name. I'd rather be explicit (iterating through the meta list in exactly the same way that __init__ does) than do something like for event in self.on.events() with if event.kind.endswith('relation_created') or something like that.

Or removing all events except the hardcoded set / allow list?

This is roughly:

for event in self.on.events():
    if event.kind not in ('install', 'start', 'etc'):
        self.on._undefine_event(event)

We could probably generate and save the set of events in __init__.

To me this feels less clean, because it's doing more than just the inverse of what __init__ does, but it would have the advantage that if anyone had dynamically added to on those would get removed as well (but generally I think we should discourage that).

Or simply resetting the whole self.on?

The problem is that these are defined on the CharmEvents class, so that means we'd need to have something to produce the class itself, and we need to do this in a backwards compatible way. This ends up being much messier (at least in the attempts I made at it).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's your view of this now?

@tonyandrewmeyer
Copy link
Contributor Author

About multiple lifecycle events: what happens if the collected status values are inconsistent?
Or did I perhaps misunderstand, and it's pre-commit and commit events are multiplied, but collect app/unit status events are not?

Yes, you get a pre-commit and commit after every event, but there's still just one collect-status run. If there was multiple collect-status emits then you'd just get the status after the first event until it was replaced with a status set from the next event - but it seems unlikely that there would be changes here, so I think it's better to take the performance win and only do that once.

For pre-commit and commit, mostly they're unused, but since we need to commit the framework it seems like we have to emit those events as well.

Possibly the confusion comes from the Sunbeam charms using the commit event to do status management.

@tonyandrewmeyer tonyandrewmeyer requested a review from dimaqq March 28, 2025 00:09
Comment on lines +34 to +35
ctx = Context(MyCharm, meta=MyCharm.META, capture_framework_events=True)
ctx.run(ctx.on.start(), State())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this change allows to de-trigger the other testing tests.
(obv., a separate PR if that's the case)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean stop using the "trigger" helper? That can definitely be done everywhere. In general, I've been doing it when tests need to be updated, but there could be a single PR that gets rid of all the remaining ones (but yes agreed in a separate PR).

@@ -961,6 +998,7 @@ def _reemit(self, single_event_path: Optional[str] = None):
try:
event = self.load_snapshot(event_handle)
except NoTypeError:
logger.debug('Skipping notice %s - cannot find event class.', event_path)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's probably OK, given that it should not happen.
At the same time, recall that we (often? typically?) run with debug on, and this line would likely appear in e.g. action log output.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logging output shouldn't appear in action logs, only in juju-log, because we send it to juju-log rather than printing it to stdout/stderr.

We do essentially always have logging set to debug (all the filtering, including level, being done on the Juju side), so that's correct that it would always end up there. But I think that's ok - this really isn't expected to happen, so when it does having a log seems ok (it almost feels like it could be info, but I think most of the time you don't care at all).

Comment on lines +1383 to +1385
for event in self.on.events():
if event in self._static_events:
continue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set(...) - set(...) may be more expressive.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure. It's briefer (one set of parentheses might not be required here):

Suggested change
for event in self.on.events():
if event in self._static_events:
continue
for event in (self._static_events - set(self.on.events())):

Personally, I feel like having the if/continue is more readable, as well as avoiding creating an additional set (although set-set does avoid the O(n) checking for membership).

@dimaqq
Copy link
Contributor

dimaqq commented Apr 7, 2025

This branch probably needs main to be merged into, even if no conflicts are detected by GitHub.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Run deferred Juju events on a fresh charm instance
3 participants