Description
Description
In Lightning Talk: Saturday Is Coming Faster - Part 1 and Lightning Talk: Saturday Is Coming Faster -P art 2 @cassioneri presents a way to optimize % 7
for C++'s weekday::weekday{year_month_day}
function.
The same micro-optimization seems to be applicable for C#'s DateTime.DayOfWeek
, DateOnly.DayOfWeek
and possibly other places in Calendar.cs
and ISOWeek.cs
using % 7
.
In short: For the range of "days since epoch" between DateTime.MinValue
and DateTime.MaxValue
we can replace % 7
with this method.
private static uint FastMod7(uint n)
{
Debug.Assert(n < 178_956_973);
return 613_566_757 * n >> 29;
}
If we implement this fast path in DayOfWeek2
, we can similarly to the talk, proof this works by simply enumerating all days between DateTime.MinValue
and DateTime.MaxValue
.
public DayOfWeek DayOfWeek2 => (DayOfWeek)FastMod7((uint)(UTicks / TimeSpan.TicksPerDay) + 1);
private static uint FastMod7(uint n)
{
Debug.Assert(n < 178_956_973);
return 613_566_757 * n >> 29;
}
for (var ticks = DateTime.MinValue.Ticks; ticks <= DateTime.MaxValue.Ticks; ticks += TimeSpan.TicksPerDay)
{
var d = new MyDateTime(ticks);
Debug.Assert(d.DayOfWeek == d.DayOfWeek2);
}
I tried micro benchmarking this and it shows a 5% improvement and a 14 B code size reduction.
BenchmarkDotNet v0.14.1-nightly.20250503.223, Windows 11 (10.0.26100.3775)
Unknown processor
.NET SDK 10.0.100-preview.3.25201.16
[Host] : .NET 10.0.0 (10.0.25.17105), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
DefaultJob : .NET 10.0.0 (10.0.25.17105), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
Method | Mean | Error | StdDev | Ratio | RatioSD | Code Size |
---|---|---|---|---|---|---|
DayOfWeek | 0.2111 ns | 0.0015 ns | 0.0013 ns | 1.00 | 0.01 | 63 B |
DayOfWeek2 | 0.2010 ns | 0.0039 ns | 0.0035 ns | 0.95 | 0.02 | 49 B |
asm
.NET 10.0.0 (10.0.25.17105), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
; WeekDayBenchmark0.DayOfWeek()
mov rdx,[rcx+8]
mov rax,3FFFFFFFFFFFFFFF
and rdx,rax
mov rcx,28B8FFC778816079
mov rax,rdx
mul rcx
shr rdx,25
lea ecx,[rdx+1]
mov rdx,2492492492492493
mov eax,ecx
mul rdx
imul eax,edx,7
sub ecx,eax
mov eax,ecx
ret
; Total bytes of code 63
.NET 10.0.0 (10.0.25.17105), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
; WeekDayBenchmark0.DayOfWeek2()
mov rdx,[rcx+8]
mov rax,3FFFFFFFFFFFFFFF
and rdx,rax
mov rcx,28B8FFC778816079
mov rax,rdx
mul rcx
shr rdx,25
inc edx
imul eax,edx,24924925
shr eax,1D
ret
; Total bytes of code 49
Benchmark code
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System.Diagnostics;
using System;
BenchmarkRunner.Run<WeekDayBenchmark>();
[DisassemblyDiagnoser]
public class WeekDayBenchmark
{
public MyDateTime MyDateTime0 { get; set; }
[GlobalSetup]
public void GlobalSetup()
{
MyDateTime0 = new MyDateTime(DateTime.MaxValue.Ticks);
}
[Benchmark(Baseline = true)]
public DayOfWeek DayOfWeek() => MyDateTime0.DayOfWeek;
[Benchmark]
public DayOfWeek DayOfWeek2() => MyDateTime0.DayOfWeek2;
}
public readonly partial struct MyDateTime
{
private const ulong TicksMask = 0x3FFFFFFFFFFFFFFF;
internal readonly ulong _dateData;
public MyDateTime(long ticks)
{
_dateData = (ulong)ticks;
}
private ulong UTicks => _dateData & TicksMask;
public DayOfWeek DayOfWeek => (DayOfWeek)(((uint)(UTicks / TimeSpan.TicksPerDay) + 1) % 7);
public DayOfWeek DayOfWeek2 => (DayOfWeek)FastMod7((uint)(UTicks / TimeSpan.TicksPerDay) + 1);
private static uint FastMod7(uint n)
{
Debug.Assert(n < 178_956_973);
return 613_566_757 * n >> 29;
}
}
Is this something that would be worth pursuing in the BCL or is it either not worth the complexity, should be handled by the JIT or something else?