Cache StringNames and NodePaths in implicit casts from string in C##100452
Cache StringNames and NodePaths in implicit casts from string in C##100452Joy-less wants to merge 1 commit intogodotengine:masterfrom
Conversation
|
I ran some benchmarks using BenchmarkDotNet.Godot with alternative caching mechanisms:
Benchmark Code#nullable enable
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Godot.Attributes;
using BenchmarkDotNet.Godot.Attributes.Jobs;
using Godot;
using Microsoft.Extensions.Caching.Memory;
using ZiggyCreatures.Caching.Fusion;
[MemoryDiagnoser, GodotSimpleJob]
public partial class StringNameBenchmark : Node {
[Export] public int TestInteger = 53;
public const int Count = 200_000;
private readonly FusionCache FusionCache = new(new FusionCacheOptions());
private readonly MemoryCache MemoryCache = new(new MemoryCacheOptions());
private readonly ConcurrentDictionary<string, StringName> ConcurrentDictionary = [];
private readonly Dictionary<string, StringName> Dictionary = [];
private readonly Lock Lock = new();
private string TargetString = "TestInteger";
[GodotBenchmark]
public void NoneTest() {
for (int Counter = 0; Counter < Count; Counter++) {
Get(TargetString);
}
}
[GodotBenchmark]
public void FusionCacheTest() {
for (int Counter = 0; Counter < Count; Counter++) {
Get(FusionCache.GetOrSet(TargetString, _ => (StringName)TargetString));
}
}
[GodotBenchmark]
public void MemoryCacheTest() {
for (int Counter = 0; Counter < Count; Counter++) {
Get(MemoryCache.GetOrCreate(TargetString, _ => (StringName)TargetString));
}
}
[GodotBenchmark]
public void MemoryCacheStaticTest() {
for (int Counter = 0; Counter < Count; Counter++) {
Get(MemoryCache.GetOrCreate(TargetString, static (ICacheEntry Entry) => (StringName)(string)Entry.Key));
}
}
[GodotBenchmark]
public void ConcurrentDictionaryTest() {
for (int Counter = 0; Counter < Count; Counter++) {
Get(ConcurrentDictionary.GetOrAdd(TargetString, _ => (StringName)TargetString));
}
}
[GodotBenchmark]
public void ConcurrentDictionaryStaticTest() {
for (int Counter = 0; Counter < Count; Counter++) {
Get(ConcurrentDictionary.GetOrAdd(TargetString, static (string Key) => (StringName)Key));
}
}
[GodotBenchmark]
public void DictionaryWithLockTest() {
for (int Counter = 0; Counter < Count; Counter++) {
lock (Lock) {
if (!Dictionary.TryGetValue(TargetString, out StringName? Result)) {
Dictionary[TargetString] = (StringName)TargetString;
}
Get(Result);
}
}
}
}
I think the best option is to change to string TargetString = "TestNumber";
StringName TargetStringName = StringNameCache.GetOrCreate(TargetString, static (ICacheEntry Entry) => new StringName((string)Entry.Key));The static lambda avoids allocating any delegates. The unboxing should not be a problem since The implicit cast would end up looking like this: public static implicit operator StringName(string from) => _stringNameCache.GetOrCreate(from, static (ICacheEntry entry) => new StringName((string)entry.Key))!; |
|
Hi all, FusionCache creator here: let me know if you need anything. Just FYI: Also, after months of work, FusionCache V2 is about to be released in the next few weeks with a lot of extra features. Here is the last preview. Hope this helps! |
|
I don't know that much about Godot, so take this with a pinch of salt, but: for something so core and so core and so low level I would probably not use FusionCache, nor MemoryCache (its API surface area is not really great) nor anything like that at all. I think I'd create something uber low level and 100% specialized for this exact purpose, basically something like a To be clear: I'm saying this not because FusionCache is bad, I mean, I created it (and FWIW I think it's really good 😬). The thing is that it does a lot like L1+L2, backplane, fail-safe, eager-refresh, conditional refresh, auto-recovery and way more, and if what is needed here is "just" an in-memory dictionary with some concurrent access mechanism, that may be overkill. Hope this helps. |
|
Hi @Joy-less
I'm not sure I got this 100% right, but to be clear: FusionCache can of course work with a static delegate and no allocation at all. Hope this helps. |
|
I have changed the cache again to Benchmark code#nullable enable
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Godot.Attributes;
using BenchmarkDotNet.Godot.Attributes.Jobs;
using BitFaster.Caching;
using BitFaster.Caching.Lfu;
using BitFaster.Caching.Lru;
using Godot;
using Microsoft.Extensions.Caching.Memory;
using ZiggyCreatures.Caching.Fusion;
[MemoryDiagnoser, GodotSimpleJob]
public partial class StringNameBenchmark : Node {
[Export] public int TestInteger = 53;
public const int Count = 200_000;
private readonly FusionCache FusionCache = new(new FusionCacheOptions());
private readonly MemoryCache MemoryCache = new(new MemoryCacheOptions());
private readonly ConcurrentLru<string, StringName> ConcurrentLru = new(capacity: 1_000);
private readonly ConcurrentLfu<string, StringName> ConcurrentLfu = new(capacity: 1_000);
private readonly ICache<string, StringName> ExpiringConcurrentLru = new ConcurrentLruBuilder<string, StringName>()
.WithCapacity(1_000)
.WithExpireAfterAccess(TimeSpan.FromSeconds(30))
.Build();
private readonly ConcurrentDictionary<string, StringName> ConcurrentDictionary = [];
private readonly Dictionary<string, StringName> Dictionary = [];
private readonly Lock Lock = new();
private string TargetString = "TestInteger";
[GodotBenchmark]
public void NoneTest() {
for (int Counter = 0; Counter < Count; Counter++) {
Get(TargetString);
}
}
[GodotBenchmark]
public void FusionCacheTest() {
for (int Counter = 0; Counter < Count; Counter++) {
Get(FusionCache.GetOrSet(TargetString, _ => (StringName)TargetString));
}
}
[GodotBenchmark]
public void MemoryCacheTest() {
for (int Counter = 0; Counter < Count; Counter++) {
Get(MemoryCache.GetOrCreate(TargetString, _ => (StringName)TargetString));
}
}
[GodotBenchmark]
public void MemoryCacheStaticTest() {
for (int Counter = 0; Counter < Count; Counter++) {
Get(MemoryCache.GetOrCreate(TargetString, static (ICacheEntry Entry) => (StringName)(string)Entry.Key));
}
}
[GodotBenchmark]
public void ConcurrentDictionaryTest() {
for (int Counter = 0; Counter < Count; Counter++) {
Get(ConcurrentDictionary.GetOrAdd(TargetString, _ => (StringName)TargetString));
}
}
[GodotBenchmark]
public void ConcurrentDictionaryStaticTest() {
for (int Counter = 0; Counter < Count; Counter++) {
Get(ConcurrentDictionary.GetOrAdd(TargetString, static (string Key) => (StringName)Key));
}
}
[GodotBenchmark]
public void DictionaryWithLockTest() {
for (int Counter = 0; Counter < Count; Counter++) {
lock (Lock) {
if (!Dictionary.TryGetValue(TargetString, out StringName? Result)) {
Dictionary[TargetString] = (StringName)TargetString;
}
Get(Result);
}
}
}
[GodotBenchmark]
public void ConcurrentLruTest() {
for (int Counter = 0; Counter < Count; Counter++) {
Get(ConcurrentLru.GetOrAdd(TargetString, static (string TargetString) => (StringName)TargetString));
}
}
[GodotBenchmark]
public void ConcurrentLfuTest() {
for (int Counter = 0; Counter < Count; Counter++) {
Get(ConcurrentLfu.GetOrAdd(TargetString, static (string TargetString) => (StringName)TargetString));
}
}
[GodotBenchmark]
public void ExpiringConcurrentLruTest() {
for (int Counter = 0; Counter < Count; Counter++) {
Get(ExpiringConcurrentLru.GetOrAdd(TargetString, static (string TargetString) => (StringName)TargetString));
}
}
}Benchmark results:
Conclusion |
|
This can also be emulated before the pull request is accepted using the following helper functions: Open meglobal using static RecycleExtensions;
using System.Diagnostics.CodeAnalysis;
using BitFaster.Caching;
using BitFaster.Caching.Lru;
public static class RecycleExtensions {
private static readonly ICache<string, StringName> StringNameCache = new ConcurrentLruBuilder<string, StringName>()
.WithCapacity(1_000)
.WithExpireAfterAccess(TimeSpan.FromSeconds(30))
.Build();
private static readonly ICache<string, NodePath> NodePathCache = new ConcurrentLruBuilder<string, NodePath>()
.WithCapacity(1_000)
.WithExpireAfterAccess(TimeSpan.FromSeconds(30))
.Build();
/// <summary>
/// Converts a <see cref="string"/> to a <see cref="Godot.StringName"/>.<br/>
/// The resulting <see cref="Godot.StringName"/> is temporarily cached for future casts.
/// </summary>
[return: NotNullIfNotNull(nameof(String))]
public static StringName? StringName(string? String) {
if (String is null) {
return null;
}
return StringNameCache.GetOrAdd(String, static (string From) => new StringName(From));
}
/// <summary>
/// Converts a <see cref="string"/> to a <see cref="Godot.NodePath"/>.<br/>
/// The resulting <see cref="Godot.NodePath"/> is temporarily cached for future casts.
/// </summary>
[return: NotNullIfNotNull(nameof(String))]
public static NodePath? NodePath(string? String) {
if (String is null) {
return null;
}
return NodePathCache.GetOrAdd(String, static (string From) => new NodePath(From));
}
}Use like so: StringName Apple = StringName("apple");
StringName Banana = NodePath("banana"); |
|
I removed the |
|
Is there a solid reason why we should invalidate the caches by time? |
The caches must be invalidated at some point to prevent Ruby's symbols denial-of-service problem (at least before mortal symbols were introduced). |
It almost sounds like your solution aims to address the use case where the engine user suddenly creates an unreasonable number of StringNames (or NodePaths). Is this common in a typical project?
Generally, the tiered GC (as opposed to the Unity's legacy non-moving non-tiered GC) can take care of a small number (<500? benchmark needed) of these creations (or not to create stuttering), as tier 0 object GC is generally very fast. |
Very common, any project which uses them for getting nodes, checking input actions, setting shader parameters, or using GDScript variables will benefit from this.
Yes, C# is performant, but that doesn't mean it makes sense to make hidden allocations every frame for no reason. |
|
I'm curious how the performance would be if you used a lower number instead of 200,000 in previous benchmarks. |
I believe we need to first prove the performance issue exists before addressing them,
The documentation did tell you to use the cached StringNames. |
|
The benchmark project as requested by @Delsin-Yu: |
I made an updated benchmark that creates about 60 unique StringNames per frame to get a shader parameter: |
|
The test numbers are here: In general, any form of caching starts to make a difference when the numbers have reached Details
You may grab the test project an try it yourself: This project also includes the ability to test GC spike interactively: QQ2025326-172515.mp4 |
|
Since some people raised an issue with the Codepublic sealed class CustomCache<TKey, TValue> where TKey : notnull {
public TimeSpan ExpireAfterAccessDuration { get; set; }
public TimeSpan EvictInterval {
get => TimeSpan.FromMilliseconds(EvictTimer.Interval);
set => EvictTimer.Interval = value.TotalMilliseconds;
}
private readonly ConcurrentDictionary<TKey, Entry> Entries = [];
private readonly IEnumerator<KeyValuePair<TKey, Entry>> EntriesEnumerator;
private readonly System.Timers.Timer EvictTimer = new();
private readonly Lock EvictLock = new();
public CustomCache(TimeSpan expireAfterAccessDuration, TimeSpan evictInterval) {
ExpireAfterAccessDuration = expireAfterAccessDuration;
EvictInterval = evictInterval;
EntriesEnumerator = Entries.GetEnumerator();
EvictTimer.Elapsed += (_, _) => EvictExpired();
EvictTimer.Start();
}
public TValue GetOrAdd(TKey key, Func<TKey, TValue> getValue) {
Entry Entry = Entries.GetOrAdd(key, static (TKey key, Func<TKey, TValue> getValue) => {
return new Entry() {
Value = getValue(key),
Timestamp = Environment.TickCount64
};
}, getValue);
Entry.Timestamp = Environment.TickCount64;
return Entry.Value;
}
public void EvictExpired() {
lock (EvictLock) {
// Get each cache entry
while (EntriesEnumerator.MoveNext()) {
(TKey key, Entry entry) = EntriesEnumerator.Current;
// Evict if expired
if (TimeSpan.FromTicks(Environment.TickCount64 - entry.Timestamp) > ExpireAfterAccessDuration) {
Entries.TryRemove(key, out _);
}
}
EntriesEnumerator.Reset();
}
}
public void EvictAll() {
lock (EvictLock) {
Entries.Clear();
}
}
private sealed class Entry {
public required TValue Value { get; set; }
public required long Timestamp { get; set; }
}
}
The implementation uses If using the alternative implementation, the timer should likely be replaced with a native Godot timer if possible. |
|
I believe caching can bring enough performance improvement, but I wonder if it's possible to implement a more fundamental solution to reduce the overhead of implicit conversion. |
In GDScript it's not as big of a problem for three reasons:
|
|
I refined the custom caching implementation into two alternative implementations. One uses a Implementation 1public sealed class CustomCache1<TKey, TValue> where TKey : notnull {
public TimeSpan ExpireAfterAccessDuration { get; set; }
public TimeSpan EvictInterval { get; set; }
private readonly ConcurrentDictionary<TKey, Entry> Entries = [];
private readonly IEnumerator<KeyValuePair<TKey, Entry>> EntriesEnumerator;
private readonly Lock EvictExpiredLock = new();
private long LastEvictionTimestamp;
public CustomCache1(TimeSpan expireAfterAccessDuration, TimeSpan evictInterval) {
ExpireAfterAccessDuration = expireAfterAccessDuration;
EvictInterval = evictInterval;
EntriesEnumerator = Entries.GetEnumerator();
}
public TValue GetOrAdd(TKey key, Func<TKey, TValue> getValue) {
// Evict expired entries every interval
long tickCount64 = System.Environment.TickCount64;
if (TimeSpan.FromTicks(tickCount64 - LastEvictionTimestamp) >= EvictInterval) {
LastEvictionTimestamp = tickCount64;
EvictExpired();
}
// Get or add entry
Entry entry = Entries.GetOrAdd(key, static (TKey key, Func<TKey, TValue> getValue) => {
return new Entry() {
Value = getValue(key),
Timestamp = System.Environment.TickCount64,
};
}, getValue);
// Update access timestamp
entry.Timestamp = tickCount64;
return entry.Value;
}
public void Evict(TKey key) {
Entries.Remove(key, out _);
}
public void EvictExpired() {
lock (EvictExpiredLock) {
// Get each cache entry
while (EntriesEnumerator.MoveNext()) {
(TKey key, Entry entry) = EntriesEnumerator.Current;
// Evict if expired
if (TimeSpan.FromTicks(System.Environment.TickCount64 - entry.Timestamp) >= ExpireAfterAccessDuration) {
Entries.TryRemove(key, out _);
}
}
EntriesEnumerator.Reset();
}
}
public void EvictAll() {
Entries.Clear();
}
private sealed class Entry {
public required TValue Value { get; set; }
public required long Timestamp { get; set; }
}
}Implementation 2public sealed class CustomCache2<TKey, TValue> where TKey : notnull {
public TimeSpan ExpireAfterAccessDuration { get; set; }
public TimeSpan AutoEvictInterval { get; set; }
private readonly Dictionary<TKey, TValue> Values = [];
private readonly PriorityQueue<TKey, long> Timestamps = new(new TimestampComparer());
private readonly Lock Lock = new();
private long LastEvictionTimestamp = System.Environment.TickCount64;
public CustomCache2(TimeSpan expireAfterAccessDuration, TimeSpan minimumEvictInterval) {
ExpireAfterAccessDuration = expireAfterAccessDuration;
AutoEvictInterval = minimumEvictInterval;
}
public TValue GetOrAdd(TKey key, Func<TKey, TValue> getValue) {
lock (Lock) {
// Evict expired entries every interval
long tickCount64 = System.Environment.TickCount64;
if (TimeSpan.FromTicks(tickCount64 - LastEvictionTimestamp) > AutoEvictInterval) {
LastEvictionTimestamp = tickCount64;
EvictExpired();
}
// Try fetch from cache
if (Values.TryGetValue(key, out TValue? value)) {
// Update last access timestamp
Timestamps.DequeueEnqueue(key, System.Environment.TickCount64);
return value;
}
// Add to cache
value = getValue(key);
Values[key] = value;
Timestamps.Enqueue(key, System.Environment.TickCount64);
return value;
}
}
public void Evict(TKey key) {
lock (Lock) {
Values.Remove(key);
Timestamps.Remove(key, out _, out _);
}
}
public void EvictExpired() {
lock (Lock) {
while (Timestamps.TryPeek(out TKey? key, out long timestamp)) {
// Get time elapsed since last access
TimeSpan timeElapsed = TimeSpan.FromTicks(System.Environment.TickCount64 - timestamp);
// Ensure enough time elapsed
if (timeElapsed < ExpireAfterAccessDuration) {
break;
}
// Evict from cache and move on to next entry
Values.Remove(key);
Timestamps.Dequeue();
}
}
}
public void EvictAll() {
lock (Lock) {
Values.Clear();
Timestamps.Clear();
}
}
private readonly struct TimestampComparer : IComparer<long> {
public int Compare(long x, long y) {
// Older timestamps have higher priority than newer timestamps
return y.CompareTo(x);
}
}
}
The benchmarks for
So! I can implement either of these or stick with the Benchmarks: stringnamestest.zip |
|
This is a patch using WeakReference and ConcurrentDictionary as cache: Detailsdiff --git a/modules/mono/glue/GodotSharp/GodotSharp/Core/NodePath.cs b/modules/mono/glue/GodotSharp/GodotSharp/Core/NodePath.cs
index 0af640533d..68f7f37c12 100644
--- a/modules/mono/glue/GodotSharp/GodotSharp/Core/NodePath.cs
+++ b/modules/mono/glue/GodotSharp/GodotSharp/Core/NodePath.cs
@@ -1,4 +1,5 @@
using System;
+using System.Collections.Concurrent;
using System.Diagnostics.CodeAnalysis;
using Godot.NativeInterop;
@@ -48,8 +49,13 @@ namespace Godot
private WeakReference<IDisposable>? _weakReferenceToSelf;
+ private string? _cache_key;
+
+ private static readonly ConcurrentDictionary<string, WeakReference<NodePath>> NodePathCache = new();
+
~NodePath()
{
+ if (_cache_key != null) NodePathCache.TryRemove(_cache_key, out _);
Dispose(false);
}
@@ -129,16 +135,32 @@ namespace Godot
}
/// <summary>
- /// Converts a string to a <see cref="NodePath"/>.
+ /// Converts a <see cref="string"/> to a <see cref="NodePath"/>.<br/>
+ /// The resulting <see cref="NodePath"/> is temporarily cached for future casts.
/// </summary>
/// <param name="from">The string to convert.</param>
- public static implicit operator NodePath(string from) => new NodePath(from);
+ public static implicit operator NodePath(string from)
+ {
+ if (NodePathCache.TryGetValue(from, out WeakReference<NodePath>? weakref) && weakref != null)
+ {
+ if (weakref.TryGetTarget(out NodePath? val) && val != null)
+ {
+ return val;
+ }
+ }
+ var ret = new NodePath(from)
+ {
+ _cache_key = from
+ };
+ NodePathCache[from] = new(ret);
+ return ret;
+ }
/// <summary>
- /// Converts this <see cref="NodePath"/> to a string.
+ /// Converts a <see cref="NodePath"/> to a <see cref="string"/>.
/// </summary>
/// <param name="from">The <see cref="NodePath"/> to convert.</param>
- [return: NotNullIfNotNull("from")]
+ [return: NotNullIfNotNull(nameof(from))]
public static implicit operator string?(NodePath? from) => from?.ToString();
/// <summary>
diff --git a/modules/mono/glue/GodotSharp/GodotSharp/Core/StringName.cs b/modules/mono/glue/GodotSharp/GodotSharp/Core/StringName.cs
index 21d9ada127..aa058e3a41 100644
--- a/modules/mono/glue/GodotSharp/GodotSharp/Core/StringName.cs
+++ b/modules/mono/glue/GodotSharp/GodotSharp/Core/StringName.cs
@@ -1,4 +1,5 @@
using System;
+using System.Collections.Concurrent;
using System.Diagnostics.CodeAnalysis;
using Godot.NativeInterop;
@@ -19,8 +20,13 @@ namespace Godot
private WeakReference<IDisposable>? _weakReferenceToSelf;
+ private string? _cache_key;
+
+ private static readonly ConcurrentDictionary<string, WeakReference<StringName>> StringNameCache = new();
+
~StringName()
{
+ if (_cache_key != null) StringNameCache.TryRemove(_cache_key, out _);
Dispose(false);
}
@@ -75,16 +81,32 @@ namespace Godot
}
/// <summary>
- /// Converts a string to a <see cref="StringName"/>.
+ /// Converts a <see cref="string"/> to a <see cref="StringName"/>.<br/>
+ /// The resulting <see cref="StringName"/> is temporarily cached for future casts.
/// </summary>
/// <param name="from">The string to convert.</param>
- public static implicit operator StringName(string from) => new StringName(from);
+ public static implicit operator StringName(string from)
+ {
+ if (StringNameCache.TryGetValue(from, out WeakReference<StringName>? weakref) && weakref != null)
+ {
+ if (weakref.TryGetTarget(out StringName? val) && val != null)
+ {
+ return val;
+ }
+ }
+ var ret = new StringName(from)
+ {
+ _cache_key = from
+ };
+ StringNameCache[from] = new(ret);
+ return ret;
+ }
/// <summary>
- /// Converts a <see cref="StringName"/> to a string.
+ /// Converts a <see cref="StringName"/> to a <see cref="string"/>.
/// </summary>
/// <param name="from">The <see cref="StringName"/> to convert.</param>
- [return: NotNullIfNotNull("from")]
+ [return: NotNullIfNotNull(nameof(from))]
public static implicit operator string?(StringName? from) => from?.ToString();
/// <summary>
@@ -95,7 +117,10 @@ namespace Godot
{
if (IsEmpty)
return string.Empty;
-
+ if (_cache_key != null)
+ {
+ return _cache_key;
+ }
var src = (godot_string_name)NativeValue;
NativeFuncs.godotsharp_string_name_as_string(out godot_string dest, src);
using (dest)The following two tests are in the release mode, where the improvement by cache is more significant. In my opinion, compared with time-based caching, WeakReference caching is good as well. |
|
@beicause Nice work! One possible problem with weak reference caching is that, if the garbage collector runs often, then string names / node paths could be collected and reallocated more often than they should. I think it could still be more ideal though since it will take into account the available memory of the system. |
|
See my comment to discuss other possible solutions: godotengine/godot-proposals#10826 (comment) |



This pull request addresses godotengine/godot-proposals#10826 (please take a look).
To summarise, StringNames and NodePaths are now cached when implicitly casting from strings to avoid allocations like:
This pull request adds a FusionCache dependency since there is no cache built-in to C#. This could be replaced with a MemoryCache dependency, or an in-house solution could be created using a
ConcurrentDictionary.In my opinion, StringNames (and NodePaths) do not make sense in C# and should be removed from the API. The reason is the garbage collection spike issues I detailed in the discussion. There are very few benefits to StringNames since identifiers are generally less than 20 characters long, and NodePaths don't seem to provide any value.
That being said, as long as StringNames and NodePaths remain in the API, implicit allocations should not exist, especially when it's so common to use string literals as StringNames and NodePaths.
If this pull request is added, the developer can still opt for a non-cached StringName or NodePath using
new StringName(string)ornew NodePath(string).Another related issue is there is currently no equality operator between StringName and string, so comparing them will result in the string being converted to a StringName, which is not ideal. It may be a good idea to add an
==operator to StringName which accepts a string and compares them by converting the StringName to a string (instead of converting the string to a StringName).By no means is this the only solution to the problem - I am open to discussions about better alternatives.