Skip to content

Change ValueChangedEvent to a struct #6114

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

peppy
Copy link
Member

@peppy peppy commented Jan 9, 2024

There's very little reason for this to be a class.

It is constructed by each bindable locally when it has listeners, and generally never modified or passed further. Changing to a struct reduces GC overhead.

(honestly we shouldn't be firing these enough for it to be an issue in the first place, but there's some egregious usage of bindables in the game which needs further consideration to fix, and this may alleviate things somewhat).

Not sure whether this needs profiling. A well structured benchmark may show lower GC counts, but allocations should not change.

osu!-side changes:

diff --git a/osu.Game/Screens/OnlinePlay/Match/RoomSubScreen.cs b/osu.Game/Screens/OnlinePlay/Match/RoomSubScreen.cs
index 2cd8e45d28..d9c7ee3ffd 100644
--- a/osu.Game/Screens/OnlinePlay/Match/RoomSubScreen.cs
+++ b/osu.Game/Screens/OnlinePlay/Match/RoomSubScreen.cs
@@ -493,7 +493,7 @@ private void endHandlingTrack()
             cancelTrackLooping();
         }
 
-        private void applyLoopingToTrack(ValueChangedEvent<WorkingBeatmap> _ = null)
+        private void applyLoopingToTrack(ValueChangedEvent<WorkingBeatmap> _ = default)
         {
             if (!this.IsCurrentScreen())
                 return;
diff --git a/osu.Game/Screens/Select/SongSelect.cs b/osu.Game/Screens/Select/SongSelect.cs
index 2d5c44e5a5..6b229936d5 100644
--- a/osu.Game/Screens/Select/SongSelect.cs
+++ b/osu.Game/Screens/Select/SongSelect.cs
@@ -493,9 +493,9 @@ public void FinaliseSelection(BeatmapInfo? beatmapInfo = null, RulesetInfo? rule
 
         private ScheduledDelegate? selectionChangedDebounce;
 
-        private void updateCarouselSelection(ValueChangedEvent<WorkingBeatmap>? e = null)
+        private void updateCarouselSelection(ValueChangedEvent<WorkingBeatmap> e = default)
         {
-            var beatmap = e?.NewValue ?? Beatmap.Value;
+            var beatmap = e.NewValue ?? Beatmap.Value;
             if (beatmap is DummyWorkingBeatmap || !this.IsCurrentScreen()) return;
 
             Logger.Log($"Song select working beatmap updated to {beatmap}");

@bdach
Copy link
Collaborator

bdach commented Jan 9, 2024

My main problem with this change is that I have no idea how to gauge if it does anything good or if it breaks anything (loudly or silently).

Like yes it takes 5 minutes to review the diff but this may as well break somewhere in the hundreds of usages of the struct somewhere else due to type semantics changing. And without conclusive profiling I'm not even seeing what we get out of this. It is not obvious to me that the decreased heap pressure is worth trading off for increased number of struct copies.

I'm currently running game-side test suites as an initial sweep in a weak attempt to estimate the blast radius of this. But I don't know what comes next.

@@ -7,7 +7,7 @@ namespace osu.Framework.Bindables
/// An event fired when a value changes, providing the old and new value for reference.
/// </summary>
/// <typeparam name="T">The type of bindable.</typeparam>
public class ValueChangedEvent<T>
public struct ValueChangedEvent<T>
Copy link
Collaborator

@bdach bdach Jan 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could probably declare as readonly struct at this point, but again, not sure what that does or doesn't do (in fact it'd be safer to do so if possible as that decreases the chance of defensive copying happening)

@peppy
Copy link
Member Author

peppy commented Jan 9, 2024

for increased number of struct copies.

Do you have an example of where this struct is copied? Most every usage I looked at was (as you'd hope) immediate usage and then throw-away. That's the only reason I made this change and am confident in doing so without much thought.

@bdach
Copy link
Collaborator

bdach commented Jan 9, 2024

Do you have an example of where this struct is copied

No but that's part of the point. I'm not sure how I would be able to tell where it is or isn't copied without spending hours combing through every usage.

@peppy
Copy link
Member Author

peppy commented Jan 9, 2024

Well even if it's copies, ignoring performance issues (since logically i'm sure we can agree any copies would be a an edge case) it should be enough to check for zero mutation of the event's properties to ensure behaviour is not regressed, right?

@bdach
Copy link
Collaborator

bdach commented Jan 9, 2024

since logically i'm sure we can agree any copies would be a an edge case

Yeah see I'm not sure about that either.

https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/builtin-types/struct#passing-structure-type-variables-by-reference says:

When you pass a structure-type variable to a method as an argument or return a structure-type value from a method, the whole instance of a structure type is copied. Pass by value can affect the performance of your code in high-performance scenarios that involve large structure types. You can avoid value copying by passing a structure-type variable by reference. Use the ref, out, in, or ref readonly method parameter modifiers to indicate that an argument must be passed by reference. Use ref returns to return a method result by reference. For more information, see Avoid allocations.

Which suggests that copies are going to be happening as there is no readonly carve-out here, and moreover, they're not going to be rare, since the point of the thing is to be passed to callbacks, which - if I am interpreting the above correctly - will incur struct copies rather than by-ref passing that was there before.

If we were wanting to fix that, we'd have to change the signature of ValueChanged et al. to pass the struct by ref (or probably in), and I don't think I need to explain how sweeping of a change that would be...

@bdach
Copy link
Collaborator

bdach commented Jan 9, 2024

I tried to write a benchmark to confirm or deny the above and I'm even more confused than I was before.

https://gist.github.com/bdach/80815935278b1ff580cebe16f9931ec3

The results make zero sense to me.

  • Why are classes twice as slow?
  • Why is there no difference between struct, readonly struct by-val, and readonly struct by-in?

Did I screw up the benchmark somehow?

@smoogipoo
Copy link
Contributor

Why are classes twice as slow?

Likely due to having to dereference from non-contiguous references.

Why is there no difference between struct, readonly struct by-val, and readonly struct by-in?

Because the JIT is smart enough to pass structs by value if they're small enough. You should see an effect by increasing the size of the struct.

https://devblogs.microsoft.com/premier-developer/the-in-modifier-and-the-readonly-structs-in-c/

@bdach
Copy link
Collaborator

bdach commented Jan 9, 2024

You should see an effect by increasing the size of the struct

I spammed a few extra fields... but am equally unsure how to interpret the results...?

Okay sure the gen0-2 counts are gone (which is expected? I think?) but also structs are now... slower? Universally, too? in doesn't seem to change much?

I must be somehow benchmarking wrong because this is nonsense. The blogpost linked above even says this:

If the size of a readonly struct is bigger than IntPtr.Size you should pass it as an in-parameter for performance reasons.

which confirms my initial thinking, and is kind of where I'm stuck at deciding if it should be in for not, because the size of the struct is gonna depend on the generic T in ValueChangedEvent, but I think it's likely that 60-70% of usages are going to exceed that (because either two references, which will be IntPtr.Size wide each, or a value type larger than IntPtr.Size, which is long / double / probably most non-trivial structs).

@peppy
Copy link
Member Author

peppy commented Jan 10, 2024

gen0-2 counts are gone (which is expected? I think?)

I would have expected these to go away for even the earlier benchmarks, can you explain why they are still there? o_O

@bdach
Copy link
Collaborator

bdach commented Jan 10, 2024

Nope, I cannot.

@peppy
Copy link
Member Author

peppy commented Jan 10, 2024

I'm going to drop this into a draft state in the hope of investigating and understanding this for the future. Let's not make rash changes here for now.

@peppy peppy marked this pull request as draft January 10, 2024 06:05
@smoogipoo
Copy link
Contributor

smoogipoo commented May 11, 2025

Reviving this a little bit because these structs are a large number of allocations we're seeing in gameplay. I'd still like to see this go in.

The issue you're seeing in that benchmark is that frobnicate is trivial therefore getting inlined. You'll see differences if you attach [MethodImpl(MethodImplOptions.NoInlining)] to those methods.

Here's a representative raw ASM - it doesn't exactly match the benchmark code (I had to modify it for sharplab) but you should notice how similar class and in readonly struct are :

BenchmarkStructPassing.PassClass()
    L0000: push eax
    L0001: mov edx, [ecx+4]
    L0004: cmp dword ptr [edx+4], 0
    L0008: jbe short L001c
    L000a: mov edx, [edx+8]
    L000d: call 0x2da600f0
    L0012: fstp dword ptr [esp], st
    L0015: vmovss xmm0, [esp]
    L001a: pop ecx
    L001b: ret
    L001c: call Internal.Runtime.CompilerHelpers.ThrowHelpers.ThrowIndexOutOfRangeException()
    L0021: int3

BenchmarkStructPassing.PassStruct()
    L0000: push ebp
    L0001: mov ebp, esp
    L0003: push eax
    L0004: mov eax, [ecx+8]
    L0007: cmp dword ptr [eax+4], 0
    L000b: jbe short L003b
    L000d: add eax, 8
    L0010: push dword ptr [eax+0x20]
    L0013: push dword ptr [eax+0x1c]
    L0016: push dword ptr [eax+0x18]
    L0019: push dword ptr [eax+0x14]
    L001c: push dword ptr [eax+0x10]
    L001f: push dword ptr [eax+0xc]
    L0022: push dword ptr [eax+8]
    L0025: push dword ptr [eax+4]
    L0028: push dword ptr [eax]
    L002a: call 0x2da60120
    L002f: fstp dword ptr [ebp-4], st
    L0032: vmovss xmm0, [ebp-4]
    L0037: mov esp, ebp
    L0039: pop ebp
    L003a: ret
    L003b: call Internal.Runtime.CompilerHelpers.ThrowHelpers.ThrowIndexOutOfRangeException()
    L0040: int3

BenchmarkStructPassing.PassReadonlyStruct()
    L0000: push ebp
    L0001: mov ebp, esp
    L0003: push eax
    L0004: mov eax, [ecx+0xc]
    L0007: cmp dword ptr [eax+4], 0
    L000b: jbe short L003b
    L000d: add eax, 8
    L0010: push dword ptr [eax+0x20]
    L0013: push dword ptr [eax+0x1c]
    L0016: push dword ptr [eax+0x18]
    L0019: push dword ptr [eax+0x14]
    L001c: push dword ptr [eax+0x10]
    L001f: push dword ptr [eax+0xc]
    L0022: push dword ptr [eax+8]
    L0025: push dword ptr [eax+4]
    L0028: push dword ptr [eax]
    L002a: call 0x2da60150
    L002f: fstp dword ptr [ebp-4], st
    L0032: vmovss xmm0, [ebp-4]
    L0037: mov esp, ebp
    L0039: pop ebp
    L003a: ret
    L003b: call Internal.Runtime.CompilerHelpers.ThrowHelpers.ThrowIndexOutOfRangeException()
    L0040: int3

BenchmarkStructPassing.PassReadonlyStructWithIn()
    L0000: push eax
    L0001: mov edx, [ecx+0xc]
    L0004: cmp dword ptr [edx+4], 0
    L0008: jbe short L001c
    L000a: add edx, 8
    L000d: call 0x2da60180
    L0012: fstp dword ptr [esp], st
    L0015: vmovss xmm0, [esp]
    L001a: pop ecx
    L001b: ret
    L001c: call Internal.Runtime.CompilerHelpers.ThrowHelpers.ThrowIndexOutOfRangeException()
    L0021: int3

With 2 fields:

|                   Method |     Mean |     Error |    StdDev |    Gen0 |    Gen1 |    Gen2 | Allocated |
|------------------------- |---------:|----------:|----------:|--------:|--------:|--------:|----------:|
|                PassClass | 6.450 ms | 0.1272 ms | 0.1361 ms | 70.3125 | 70.3125 | 31.2500 |   3.81 MB |
|               PassStruct | 3.468 ms | 0.0279 ms | 0.0261 ms | 54.6875 | 54.6875 | 35.1563 |   3.81 MB |
|       PassReadonlyStruct | 3.424 ms | 0.0677 ms | 0.0695 ms | 62.5000 | 62.5000 | 42.9688 |   3.81 MB |
| PassReadonlyStructWithIn | 3.179 ms | 0.0192 ms | 0.0160 ms | 54.6875 | 54.6875 | 35.1563 |   3.81 MB |

With 9 fields:

|                   Method |     Mean |     Error |    StdDev |   Median |    Gen0 |    Gen1 |    Gen2 | Allocated |
|------------------------- |---------:|----------:|----------:|---------:|--------:|--------:|--------:|----------:|
|                PassClass | 7.751 ms | 0.1501 ms | 0.1606 ms | 7.723 ms | 54.6875 | 54.6875 | 15.6250 |   3.81 MB |
|               PassStruct | 2.797 ms | 0.0532 ms | 0.0522 ms | 2.806 ms | 42.9688 | 42.9688 | 23.4375 |   3.81 MB |
|       PassReadonlyStruct | 2.706 ms | 0.0522 ms | 0.1279 ms | 2.659 ms | 54.6875 | 54.6875 | 15.6250 |   3.81 MB |
| PassReadonlyStructWithIn | 3.824 ms | 0.0289 ms | 0.0242 ms | 3.829 ms | 46.8750 | 46.8750 | 27.3438 |   3.81 MB |

With 2 string fields which is the largest I foresee us using constantly. Yes there are larger (e.g. Bindable<MarginPadding>) but I don't foresee those being used as much:

|                   Method |     Mean |     Error |    StdDev |   Median |    Gen0 |    Gen1 |    Gen2 | Allocated |
|------------------------- |---------:|----------:|----------:|---------:|--------:|--------:|--------:|----------:|
|                PassClass | 9.528 ms | 0.1221 ms | 0.0953 ms | 9.504 ms | 93.7500 | 93.7500 | 15.6250 |   3.81 MB |
|               PassStruct | 8.862 ms | 0.1427 ms | 0.2849 ms | 8.726 ms | 93.7500 | 93.7500 | 15.6250 |   3.81 MB |
|       PassReadonlyStruct | 8.784 ms | 0.0792 ms | 0.0618 ms | 8.776 ms | 93.7500 | 93.7500 | 15.6250 |   3.81 MB |
| PassReadonlyStructWithIn | 9.465 ms | 0.1855 ms | 0.2719 ms | 9.318 ms | 93.7500 | 93.7500 | 15.6250 |   3.81 MB |

(Though the results seem multi-modal, possibly suggesting GC influence)

In any case, I think passing by value is best. You probably shouldn't be using bindables at all at a point where raw performance is becoming an issue to this point. Passing by ref is a bit more pain for the callsite to handle - requires actual changes in code whereas by value should be relatively painless.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants