feat: run deferred events with fresh charm instances #1631

tonyandrewmeyer · 2025-03-14T00:10:20Z

Create a new charm object for each emission of a deferred event, and for the event that triggered the ops process.

When using the testing.Context as a context manager, the charm that is made available is the one for the primary event. If there is demand in the future, we could add a mechanism for accessing the other charm objects for manipulation prior to running the deferred events, or inspection after.

This will break charms that are relying on the charm object to be the same object for deferred and the primary event. However, we have always told charmers to not do that, and it's risky behaviour because it'll differ based on whether it's the non-deferred or deferred handler call, so we're ok with this change.
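To make the behaviour change concrete, here is a toy model in plain Python. It is not the ops API: `ToyCharm`, `run_hook`, and the event names are hypothetical stand-ins. It sketches how state kept on the charm instance by a deferred handler stops being visible to the primary event's handler once every emission gets its own instance.

```python
class ToyCharm:
    """Stand-in for a charm class; NOT ops.CharmBase."""

    def __init__(self):
        self.cache = {}  # per-instance state that some charms (wrongly) rely on

    def handle(self, event_name):
        # Report how many events this particular instance handled before this one.
        seen_before = len(self.cache)
        self.cache[event_name] = True
        return seen_before


def run_hook(deferred, primary, fresh_instances):
    """Run the deferred events, then the primary event; return what each handler saw."""
    shared = ToyCharm()
    results = []
    for name in [*deferred, primary]:
        charm = ToyCharm() if fresh_instances else shared
        results.append(charm.handle(name))
    return results


# One shared instance: state leaks from the deferred run into the primary one.
old = run_hook(["config_changed"], "update_status", fresh_instances=False)
# Fresh instance per emission: every handler starts from a clean object.
new = run_hook(["config_changed"], "update_status", fresh_instances=True)
```

A charm that only behaves correctly in the `old` case depends on whether a handler happens to run deferred or not, which is the risky behaviour described above.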

The other significant change for charms is that, because we need to commit the framework before finishing up with each charm, the pre-commit and commit Lifecycle events are emitted after each deferred event as well as after the primary event. I'm only aware of two charms that use these:

ops-sunbeam, which uses commit to put statuses in stored state. Running more than once should be fine, and the performance impact not noticeable.
statustest (of @benhoyt's test-charms collection), which seems almost identical to the sunbeam one. I assume this isn't really used, and it wouldn't break anyway.

There are three main parts of this implementation:

Fairly minor adjustments to _main, particularly _Manager, so that the charm is passed around and the framework is accessed via the charm, rather than creating the framework and charm once, and the actual creation of each charm prior to re-emitting.
Functionality to "forget" charms. The framework registers event types on the class, not the instance, so creating multiple charm instances, even on different Frameworks, is not normally possible (Scenario gets around this by wrapping the event source in a new class for each run). CharmBase gains a _destroy_charm function that undoes all of the event registration done in __init__ (for all the dynamic events from relations, containers, and so forth). I considered using __del__ for this, but it's awkward to cleanly remove everything when destruction is happening, and we don't really want charmers to start creating more than one charm instance, so this leaves the functionality private.
Reworking the Scenario "capture events" mechanism, to use a Framework subclass, rather than a context manager. This needed changes to handle capturing deferred events, but it also fixes a bug. Previously, the Framework class was patched, with methods that used local variables to determine which events would be captured: this meant that the capture_events() context manager was not threadsafe.
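The "forgetting" part can be sketched with a toy model. This is plain Python rather than ops code: `ToyEvents`, `ToyCharm`, `define_event`, `undefine_event`, and `destroy` are hypothetical names chosen for illustration. Because registration lands on the class, a second instance collides until the first one has undone its registrations.

```python
class ToyEvents:
    """Stand-in for an events container; NOT ops.CharmEvents."""

    @classmethod
    def define_event(cls, kind):
        # Registration happens on the class, so a second attempt collides.
        if kind in cls.__dict__:
            raise RuntimeError(f"{cls.__name__}.{kind} is already defined")
        setattr(cls, kind, f"<event {kind}>")

    @classmethod
    def undefine_event(cls, kind):
        # Inverse of define_event, so that a later instance can register again.
        if kind in cls.__dict__:
            delattr(cls, kind)


class ToyCharm:
    """Stand-in for a charm that defines dynamic relation events in __init__."""

    def __init__(self, relation_names):
        self.relation_names = list(relation_names)
        self.on = ToyEvents()
        for name in self.relation_names:
            self.on.define_event(f"{name.replace('-', '_')}_relation_changed")

    def destroy(self):
        # Walk the same list that __init__ used, undoing each registration.
        for name in self.relation_names:
            self.on.undefine_event(f"{name.replace('-', '_')}_relation_changed")


first = ToyCharm(["db-admin"])
try:
    ToyCharm(["db-admin"])  # second instance while the first is still registered
    collided = False
except RuntimeError:
    collided = True

first.destroy()
second = ToyCharm(["db-admin"])  # succeeds once the first charm is forgotten
```

Keeping the undo logic next to the define logic, as in `destroy` here, is the design choice argued for later in this thread.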

Fixes #1174


tonyandrewmeyer · 2025-03-14T06:05:10Z

@benhoyt would you mind giving this a 'pre-review' - no need to go over it line-by-line, but check whether the general approach seems reasonable?


benhoyt

One comment about whether we can restructure to avoid the "forgetting" (which seems error-prone to me).

Regarding this:

> This will break charms that are relying on the charm object to be the same object for deferred and the primary event. However, we have always told charmers to not do that, and it's risky behaviour because it'll differ based on whether it's the non-deferred or deferred handler call, so we're ok with this change.

I agree. But can we get any more verification of this? Perhaps (once we've finalised the approach) run several charms' integration tests to get a better idea.

Can we check with the ops-sunbeam folks about the commit change, and make sure the original authors of that approach seem okay with this?

The reemit change is an optional argument so shouldn't break even those existing users, right?

ops/framework.py

tonyandrewmeyer · 2025-03-16T22:35:15Z

> > This will break charms that are relying on the charm object to be the same object for deferred and the primary event. However, we have always told charmers to not do that, and it's risky behaviour because it'll differ based on whether it's the non-deferred or deferred handler call, so we're ok with this change.

> I agree. But can we get any more verification of this? Perhaps (once we've finalised the approach) run several charms' integration tests to get a better idea.

I can try that, yes. The trouble I've had previously is that the most common use of defer is in the data platform charms, and it's challenging to run their integration tests with a custom branch of ops.

> Can we check with the ops-sunbeam folks about the commit change, and make sure the original authors of that approach seem okay with this?

Sure.

> The reemit change is an optional argument so shouldn't break even those existing users, right?

Yes, it's entirely backwards compatible. I mentioned it because it means we support that going forward, and it's not strictly necessary (since _Manager could just use a private Framework method).

benhoyt · 2025-03-16T22:39:04Z

> I can try that, yes. The trouble I've had previously is that the most common use of defer is in the data platform charms, and it's challenging to run their integration tests with a custom branch of ops.

Well, let's try it on some non-data charms before merge, and then a couple of data platform ones after merge but before release. That would be easier, right? We can always revert or fix before release.

Regarding reemit and supporting that new arg. That's a good point. I think I'd prefer to use the private method in that case, rather than exporting a new arg that others won't / probably shouldn't use.


benhoyt

Looks good to me now, thanks

ops/_main.py

dimaqq

About multiple lifecycle events: what happens if the collected status values are inconsistent?

ops/_main.py

dimaqq · 2025-03-25T08:43:27Z

Or did I perhaps misunderstand, and it's the pre-commit and commit events that are multiplied, but the collect app/unit status events are not?

dimaqq

Reviewed about 50% so far.

test/test_main.py

dimaqq · 2025-03-25T08:52:07Z

ops/charm.py

+        for relation_name in self.framework.meta.relations:
+            relation_name = relation_name.replace('-', '_')
+            self.on._undefine_event(f'{relation_name}_relation_created')
+            self.on._undefine_event(f'{relation_name}_relation_joined')
+            self.on._undefine_event(f'{relation_name}_relation_changed')
+            self.on._undefine_event(f'{relation_name}_relation_departed')
+            self.on._undefine_event(f'{relation_name}_relation_broken')


Somehow I'm not too happy with this.
Is there a better way?

For example, remembering the events that were defined?
A context manager?
Or removing events by name or pattern matching?
Or removing all events except the hardcoded set / allow list?
Or simply resetting the whole self.on?

> For example, remembering the events that were defined?

The events that were defined are the ones in the meta list - it seems to me that also putting them in some other list would just be duplication.

> A context manager?

The defining is done in the CharmBase __init__, and I don't think we can change that and keep backwards compatibility. So the defining couldn't be in an __enter__. I guess main could do something like:

    with self._make_charm(event_handle.kind) as charm:
        charm.framework._reemit_single_path(event_path)
        self._commit(charm.framework)
        self._close()

And the __exit__ could call _destroy_charm. That doesn't seem significantly better to me, though. The calls to _undefine_event could go into the __exit__, but I think it's better to have them right next to the code that does the defining - for example, if we end up adding a new dynamic event then it's more likely that the person doing that will realise that they need to undefine it as well than if the code is in _main.py.
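For comparison, the context-manager shape discussed here could look roughly like the following toy sketch. It uses only `contextlib` and hypothetical names (`ToyCharm`, `make_charm`, `live_instances`); it is not how `_Manager` is implemented.

```python
from contextlib import contextmanager


class ToyCharm:
    """Stand-in for a charm; NOT ops.CharmBase."""

    live_instances = 0

    def __init__(self):
        type(self).live_instances += 1

    def destroy(self):
        type(self).live_instances -= 1


@contextmanager
def make_charm():
    """Create a charm and guarantee it is destroyed, even if a handler raises."""
    charm = ToyCharm()
    try:
        yield charm
    finally:
        charm.destroy()


handled = []
for event_path in ["deferred/1", "deferred/2"]:
    with make_charm() as charm:
        # Only one charm is alive at any time while its event is handled.
        handled.append((event_path, ToyCharm.live_instances))
```

The cleanup is guaranteed either way; the trade-off raised above is only about where the undefine calls live, not about whether they run.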

> Or removing events by name or pattern matching?

This is by name. I'd rather be explicit (iterating through the meta list in exactly the same way that __init__ does) than do something like for event in self.on.events() with if event.kind.endswith('relation_created') or something like that.

> Or removing all events except the hardcoded set / allow list?

This is roughly:

    for event in self.on.events():
        if event.kind not in ('install', 'start', 'etc'):
            self.on._undefine_event(event)

We could probably generate and save the set of events in __init__.

To me this feels less clean, because it's doing more than just the inverse of what __init__ does, but it would have the advantage that if anyone had dynamically added to on those would get removed as well (but generally I think we should discourage that).

> Or simply resetting the whole self.on?

The problem is that these are defined on the CharmEvents class, so that means we'd need to have something to produce the class itself, and we need to do this in a backwards compatible way. This ends up being much messier (at least in the attempts I made at it).

What's your view of this now?

tonyandrewmeyer · 2025-03-27T23:44:08Z

> About multiple lifecycle events: what happens if the collected status values are inconsistent?
> Or did I perhaps misunderstand, and it's the pre-commit and commit events that are multiplied, but the collect app/unit status events are not?

Yes, you get a pre-commit and commit after every event, but there's still just one collect-status run. If there were multiple collect-status emits then you'd just get the status after the first event until it was replaced with a status set from the next event - but it seems unlikely that there would be changes here, so I think it's better to take the performance win and only do that once.

For pre-commit and commit, mostly they're unused, but since we need to commit the framework it seems like we have to emit those events as well.

Possibly the confusion comes from the Sunbeam charms using the commit event to do status management.
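The resulting ordering can be sketched as a small pure-Python function. This is a toy model built from the description in this thread, not code from `_main.py`; the function name is hypothetical, and the exact position of collect-status relative to the primary handler is an assumption.

```python
def emission_order(deferred, primary):
    """Return the order of emissions for one hook in this toy model."""
    order = []
    for name in deferred:
        # Each deferred event is followed by its own framework commit.
        order += [name, "pre_commit", "commit"]
    # collect-status runs once, with the primary event, not per deferred event.
    order += [primary, "collect_status", "pre_commit", "commit"]
    return order


order = emission_order(["config_changed"], "start")
```

With one deferred `config_changed` and a primary `start`, `order` contains two `pre_commit`/`commit` pairs but a single `collect_status`.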

testing/tests/test_emitted_events_util.py

dimaqq · 2025-04-07T04:58:48Z

testing/tests/test_emitted_events_util.py

+    ctx = Context(MyCharm, meta=MyCharm.META, capture_framework_events=True)
+    ctx.run(ctx.on.start(), State())


I wonder if this change allows us to de-trigger the other testing tests.
(obv., a separate PR if that's the case)

Do you mean stop using the "trigger" helper? That can definitely be done everywhere. In general, I've been doing it when tests need to be updated, but there could be a single PR that gets rid of all the remaining ones (but yes agreed in a separate PR).

testing/src/scenario/state.py

test/test_main.py

test/charms/test_main/src/charm.py

dimaqq · 2025-04-07T05:17:15Z

ops/framework.py

@@ -961,6 +998,7 @@ def _reemit(self, single_event_path: Optional[str] = None):
            try:
                event = self.load_snapshot(event_handle)
            except NoTypeError:
+                logger.debug('Skipping notice %s - cannot find event class.', event_path)


That's probably OK, given that it should not happen.
At the same time, recall that we (often? typically?) run with debug on, and this line would likely appear in e.g. action log output.

Logging output shouldn't appear in action logs, only in juju-log, because we send it to juju-log rather than printing it to stdout/stderr.

We do essentially always have logging set to debug (all the filtering, including level, being done on the Juju side), so that's correct that it would always end up there. But I think that's ok - this really isn't expected to happen, so when it does having a log seems ok (it almost feels like it could be info, but I think most of the time you don't care at all).
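As a rough illustration of why a debug record does not show up on stdout/stderr, the sketch below routes records through a handler instead of printing them. `CollectingHandler` and the logger name are hypothetical stand-ins for a handler that forwards to juju-log, and the event path is made up; only the standard `logging` module is used.

```python
import logging


class CollectingHandler(logging.Handler):
    """Hypothetical stand-in for a handler that forwards records to juju-log."""

    def __init__(self):
        super().__init__()
        self.sent = []

    def emit(self, record):
        # A real handler would hand the message to Juju; here we just keep it.
        self.sent.append((record.levelname, record.getMessage()))


logger = logging.getLogger("toy.framework")
logger.setLevel(logging.DEBUG)  # level filtering is left to the receiving side
logger.propagate = False        # nothing reaches the root logger's stream handlers
handler = CollectingHandler()
logger.addHandler(handler)

logger.debug("Skipping notice %s - cannot find event class.", "ToyCharm/on/gone[1]")
```

The record ends up in `handler.sent` only, which mirrors the point that the message goes to the log sink rather than to action output.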

ops/framework.py

dimaqq · 2025-04-07T05:24:02Z

ops/charm.py

+        for event in self.on.events():
+            if event in self._static_events:
+                continue


set(...) - set(...) may be more expressive.

I'm not sure. It's briefer (one set of parentheses might not be required here):

Suggested change

-        for event in self.on.events():
-            if event in self._static_events:
-                continue
+        for event in (set(self.on.events()) - self._static_events):

Personally, I feel like having the if/continue is more readable, as well as avoiding creating an additional set (although set-set does avoid the O(n) checking for membership).
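The two forms can be compared side by side in a small standalone sketch. It assumes, for illustration only, that iterating the events mapping yields event names and that the static events are held as a set of names; the variable names and event names here are made up.

```python
static_events = {"install", "start", "config_changed"}

# Stand-in for self.on.events(): a mapping of event name to bound event.
all_events = {
    "install": object(),
    "start": object(),
    "config_changed": object(),
    "db_relation_changed": object(),
    "workload_pebble_ready": object(),
}

# Form 1: skip static events while iterating.
dynamic_loop = []
for event in all_events:
    if event in static_events:
        continue
    dynamic_loop.append(event)

# Form 2: set difference. The operand order matters: all events minus static ones.
dynamic_set = set(all_events) - static_events

# Subtracting in the other direction yields nothing to undefine in this example.
reversed_difference = static_events - set(all_events)
```

Both forms select the same dynamic events; the loop preserves iteration order and avoids building an extra set, while the set form is shorter.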

ops/charm.py

dimaqq · 2025-04-07T05:52:48Z

This branch probably needs main to be merged into it, even if no conflicts are detected by GitHub.


Revert #1631 (f97e707). An issue that was missed was that because the framework doesn't provide a recommended way to run code prior to the event handler (an inverse of the `commit` event) charms are doing this work in `__init__`. With the change from #1631, that code would be run more than once per hook if there are any queued notices, which may be problematic. We also don't provide a mechanism for knowing if a handler is running from a queued notice or as a new event (other than looking in the environment), so charms couldn't trigger only on one of the events (although this would be simpler to solve). For now, we're reverting this change. We'll discuss further whether we abandon this change, find an alternative implementation, or provide alternatives to running this type of code in `__init__` and do the change later.
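The problem that led to the revert can be shown with a toy model in plain Python. `ToyCharm`, `setup_calls`, and `run_hook` are hypothetical names, not ops APIs. Work placed in `__init__` runs once per charm instance, so with a fresh instance per emission it runs once per queued notice plus once for the primary event.

```python
class ToyCharm:
    """Stand-in for a charm that does setup work in __init__; NOT ops.CharmBase."""

    setup_calls = 0

    def __init__(self):
        # Work placed here because there is no "before the handler" event.
        type(self).setup_calls += 1


def run_hook(queued_notices, fresh_instances):
    """Instantiate charms as each strategy would and return the setup count."""
    ToyCharm.setup_calls = 0
    emissions = queued_notices + 1  # each queued notice plus the primary event
    instances = emissions if fresh_instances else 1
    for _ in range(instances):
        ToyCharm()
    return ToyCharm.setup_calls


single_instance = run_hook(queued_notices=2, fresh_instances=False)
fresh_per_event = run_hook(queued_notices=2, fresh_instances=True)
```

With two queued notices, the setup work runs once with a single instance and three times with fresh instances, which is harmless for idempotent setup but may be problematic otherwise.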

tonyandrewmeyer added 5 commits March 12, 2025 15:24

_run doesn't use self, so make it explicitly a static method. log_spl…

48d2c33

…it doesn't use cls, so a static method makes more sense than a class method.

Allow reemit'ing a single event path.

4926451

This is a public API change so that _Manager can continue to use only public framework methods. It means that anyone using a framework object directly can also do this, but that seems acceptable.

feat: emit deferred events with a fresh charm instance

6b54dd6

Merge origin/main.

79245bb

Fix merge.

40b20e1

tonyandrewmeyer commented Mar 14, 2025

ops/_main.py

Close each event so that the Scenario state is updated. unregister th…

1256e64

…e type, even though it's not required right now.

tonyandrewmeyer commented Mar 14, 2025

testing/tests/test_e2e/test_event.py

tonyandrewmeyer commented Mar 14, 2025

testing/src/scenario/_ops_main_mock.py

Merge origin/main

8e25005

tonyandrewmeyer requested a review from benhoyt March 14, 2025 06:04

tonyandrewmeyer added 7 commits March 14, 2025 19:09

Adjust after merge.

632fb96

Add test to make sure that we get all the logs even though the backen…

9936a27

…d changes.

Resolve TODO via a comment.

ae0191a

Remove unused argument and make static checker less sad.

2940a85

Ensure the event being deferred is actually observed.

2b3baec

We must observe the event in order to have had it deferred.

1c123bd

Minor tweaks.

56feef0

tonyandrewmeyer commented Mar 14, 2025

testing/src/scenario/state.py

benhoyt reviewed Mar 16, 2025

ops/framework.py

tonyandrewmeyer added 7 commits March 18, 2025 13:20

Don't bother testing the length when the content is tested right after.

c4a7bd5

Make changes private.

a501a38

Use the new method to defer, so that it's clear when we are deferring…

769d4f6

… and when we are not.

Use private methods.

5440855

import ops, ops.x

4db2223

Adjust capturing.

c3325f7

Note that this is not (and was not before) threadsafe, so if there are tests in different threads that are using different capture arguments, things may break.

Fix capturing events.

ad739e3

Add test to make sure that charm instances are not reused (except for…

41580eb

… custom and lifecycle events).

tonyandrewmeyer requested review from benhoyt and dimaqq March 18, 2025 08:13

tonyandrewmeyer marked this pull request as ready for review March 18, 2025 08:14

benhoyt approved these changes Mar 19, 2025

ops/_main.py

ops/_main.py

Improve docstring, add comment.

8a1baaa

dimaqq reviewed Mar 25, 2025

ops/_main.py

dimaqq reviewed Mar 25, 2025


tonyandrewmeyer added 4 commits March 28, 2025 12:27

import pathlib, not from pathlib import.

2824fe7

Remove unnecessary change.

1fef404

Minor docstring fix.

0e8ceff

Remove unnnecessary parentheses.

c56333d

Simplify the undefining, per review.

eda6599

tonyandrewmeyer requested a review from dimaqq March 28, 2025 00:09

dimaqq approved these changes Apr 7, 2025


tonyandrewmeyer and others added 6 commits April 25, 2025 17:16

Merge origin/main.

bbca999

post-merge tweaks.

741298a

post-merge fixes.

ba745c7

Add comment, per review.

cb5856b

Update ops/framework.py

38aa725

Co-authored-by: Dima Tisnek <[email protected]>

Tweaks per review.

2445c14

tonyandrewmeyer merged commit f97e707 into canonical:main Apr 27, 2025
31 of 32 checks passed

tonyandrewmeyer deleted the defer-new-charm-object branch April 27, 2025 22:37

tonyandrewmeyer mentioned this pull request May 1, 2025

revert: run deferred events with fresh charm instances #1711

Merged

tonyandrewmeyer mentioned this pull request Jun 23, 2025

feat: expose trace data in testing #1842

Merged
