-
-
Notifications
You must be signed in to change notification settings - Fork 1k
Support very long string as benchmark arguments #1248
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
if (methodName.Length <= MaxBenchmarkNameLength) | ||
return methodName; | ||
|
||
return methodName.Substring(0, MaxBenchmarkNameLength); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that such an approach may lead to the same "BenchmarkName"s for different benchmarks. Can it be a problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good catch! In such case, we are going to overwrite the trace file.
This is obviously far from perfect. I could generate some unique file name (using guid for example) but it would make it harder for the end-user to connect the trace file name with the benchmark.
@eerhardt let's say that you want to run following benchmark and use EtwProfiler
to profile it:
public IEnumerable<object> Arguments()
{
yield return new string('a', 200_000);
yield return new string('a', 200_000 - 1) + "b";
}
[Benchmark]
[ArgumentsSource(nameof(Arguments))]
public void Some(string value)
We can't use full benchmark name with the super long string as a trace file name and we can not take just the first part of the string because the file name would be the same for two benchmarks.
Would using a guid in such case as a trace file name be suprising to the end user?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest keeping the beginning of the name the same, so I could tell that these 2 benchmarks were Some
benchmarks. But then appending the Guid to the end would be understandable to me.
So either leaving out the args all together:
MyNamespace.MyClass.MyMethod.NewGuid
or chopping the arguments NewGuid
characters before the max, and then appending NewGuid
to the end.
MyNamespace.MyClass.MyMethod.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaNewGuid
I don't think purely using a Guid
for the whole name would be understandable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be nice if generate the same name for the same benchmarking. I'm suggesting to use a hashcode of the whole string at the end of the "shortified" name instead of a random GUID.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be nice if generate the same name for the same benchmarking. I'm suggesting to use a hashcode of the whole string at the end of the "shortified" name instead of a random GUID
This is a good suggestion, however, afaik in .NET Core, the hash codes are the same for given string value only for current program lifetime. If you close the app and run it again, then the value is different (afaik for some security reasons)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One possibility would be to use a different hash algorithm, ex. SHA or MurmurHash.
We use Murmur hashing in ML.NET if you want to copy the code:
https://github.com/dotnet/machinelearning/blob/master/src/Microsoft.ML.Core/Utilities/Hashing.cs
# Conflicts: # src/BenchmarkDotNet.Diagnostics.Windows/Sessions.cs # tests/BenchmarkDotNet.IntegrationTests/ArgumentsTests.cs
…mentLength" This reverts commit 9cc7fbb.
…as a different command length limit)
@AndreyAkinshin @eerhardt it's been a while but the PR is now ready for review. |
You probably want to retain the copyright on this and Hashing.cs files. Refers to: src/BenchmarkDotNet/Extensions/PathFeatures.cs:1 in 183927e. [](commit_id = 183927e, deletion_comment = False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me (for what its worth).
# Conflicts: # tests/BenchmarkDotNet.IntegrationTests/ArgumentsTests.cs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you specify the source of a copied file like
// file copied from https://github.com/dotnet/machinelearning/blob/master/src/Microsoft.ML.Core/Utilities/Hashing.c
it's better to reference a specific commit or a specific tag. The source code in the master branch can be changed at any moment; it may complicate tracking the origin of the copied source code.
done |
Fixes #1431
/cc @eerhardt