Benchmark Date: January 27, 2026
Comparison: MediatR 12.4.1 vs Routya v2.0 (Runtime) vs Routya v3.0 (Source Gen)
✅ Notifications: 21% faster than v2.0, competitive with MediatR
❌ Requests: 20% slower than v2.0, 79% slower than MediatR
The source generator successfully optimized notification handling but revealed deeper performance issues in request dispatch that require core architecture improvements.
| Method | Mean | Error | StdDev | Ratio | RatioSD | Allocated | Alloc Ratio |
|---|---|---|---|---|---|---|---|
| MediatR_Request | 195.0 ns | 3.16 ns | 2.95 ns | 1.00 | 0.02 | - | NA |
| RoutyaV2_Request | 290.8 ns | 2.23 ns | 1.87 ns | 1.49 | 0.02 | 776 B | NA |
| RoutyaV3_SourceGen_Request | 349.8 ns | 3.11 ns | 2.43 ns | 1.79 | 0.03 | - | NA |
| MediatR_Notification | 215.8 ns | 1.73 ns | 1.62 ns | 1.11 | 0.02 | 600 B | NA |
| RoutyaV2_Notification | 277.3 ns | 2.19 ns | 2.05 ns | 1.42 | 0.02 | 488 B | NA |
| RoutyaV3_SourceGen_Notification | 220.0 ns | 2.35 ns | 2.20 ns | 1.13 | 0.02 | - | NA |
- MediatR (Baseline): 195.0 ns
- Routya v2.0: 290.8 ns (+49% vs MediatR)
- Routya v3.0: 349.8 ns (+79% vs MediatR, +20% vs v2.0) ⚠️
- MediatR: 215.8 ns
- Routya v2.0: 277.3 ns (+28% vs MediatR)
- Routya v3.0: 220.0 ns (+2% vs MediatR, -21% vs v2.0) ✅
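
The percentages above can be reproduced from the Mean column of the table. Each one is the ratio of the two mean times minus one, rounded to a whole percent:

```latex
\begin{aligned}
\text{Requests, v2.0 vs MediatR:}      \quad & 290.8 / 195.0 \approx 1.49 \;\Rightarrow\; +49\% \\
\text{Requests, v3.0 vs MediatR:}      \quad & 349.8 / 195.0 \approx 1.79 \;\Rightarrow\; +79\% \\
\text{Requests, v3.0 vs v2.0:}         \quad & 349.8 / 290.8 \approx 1.20 \;\Rightarrow\; +20\% \\
\text{Notifications, v2.0 vs MediatR:} \quad & 277.3 / 215.8 \approx 1.28 \;\Rightarrow\; +28\% \\
\text{Notifications, v3.0 vs MediatR:} \quad & 220.0 / 215.8 \approx 1.02 \;\Rightarrow\; +2\% \\
\text{Notifications, v3.0 vs v2.0:}    \quad & 220.0 / 277.3 \approx 0.79 \;\Rightarrow\; -21\%
\end{aligned}
```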
- Compile-time handler discovery eliminates runtime reflection
- Direct dictionary registration reduces lookup overhead
- Notification handler resolution benefits most from pre-computed metadata
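
To make the first two points concrete, here is a minimal sketch contrasting reflection-based handler discovery with the kind of registration a source generator could emit. `INotificationHandler<T>`, `OrderPlaced`, `AuditHandler`, `EmailHandler`, `ReflectionRegistry` and `GeneratedRegistry` are hypothetical stand-ins for illustration. They are not Routya's actual types or its generated output.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical handler contract, used only for this illustration.
public interface INotificationHandler<TNotification>
{
    void Handle(TNotification notification);
}

public sealed record OrderPlaced(int OrderId);

public sealed class AuditHandler : INotificationHandler<OrderPlaced>
{
    public void Handle(OrderPlaced notification) { }
}

public sealed class EmailHandler : INotificationHandler<OrderPlaced>
{
    public void Handle(OrderPlaced notification) { }
}

public static class ReflectionRegistry
{
    // Runtime discovery: inspects every concrete class in the assembly at startup.
    public static Dictionary<Type, List<Type>> Scan(Assembly assembly)
    {
        var map = new Dictionary<Type, List<Type>>();
        foreach (var type in assembly.GetTypes().Where(t => t.IsClass && !t.IsAbstract))
        {
            foreach (var iface in type.GetInterfaces())
            {
                if (!iface.IsGenericType ||
                    iface.GetGenericTypeDefinition() != typeof(INotificationHandler<>))
                    continue;

                var notificationType = iface.GetGenericArguments()[0];
                if (!map.TryGetValue(notificationType, out var handlers))
                    map[notificationType] = handlers = new List<Type>();
                handlers.Add(type);
            }
        }
        return map;
    }
}

public static class GeneratedRegistry
{
    // What a generator could emit instead: the same map, written out at compile time,
    // so no assembly scanning or interface inspection happens at runtime.
    public static Dictionary<Type, List<Type>> Build() => new()
    {
        [typeof(OrderPlaced)] = new List<Type> { typeof(AuditHandler), typeof(EmailHandler) },
    };
}
```

Both registries produce the same mapping for `OrderPlaced`. The difference is only where the work happens: `Scan` pays for reflection on every startup, while `Build` is plain dictionary initialization.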
The dispatcher implementation remains identical between v2.0 and v3.0:
- Both use `DefaultRoutya` with runtime dispatch logic
- Source generator only optimizes registration, not execution
- Request dispatch still uses:
  - `RequestHandlerInfo` dictionary lookups
  - Generic type resolution via `typeof()`
  - Expression compilation overhead (from v2.0)
The V3 regression suggests the generated registration code adds overhead:
- Larger dictionary initialization
- More complex metadata structures
- Additional indirection layers
MediatR Advantages:
- Simplified generic constraints
- Optimized internal caching
- Minimal abstraction layers
- Type-specific compiled delegates
Routya Overhead Sources:
- Dictionary lookups: `RequestHandlerInfo<TRequest, TResponse>`
- Generic type resolution complexity
- Scope validation checks
- Handler instantiation through DI
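
One common .NET technique for avoiding an explicit per-call dictionary lookup is a static generic cache, where each closed generic type gets its own static field. The sketch below contrasts the two approaches under simplified assumptions. It is not taken from MediatR's or Routya's source, and `DictionaryDispatcher`, `CachedDispatcher` and `Slot` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class DictionaryDispatcher
{
    private static readonly Dictionary<Type, Delegate> Handlers = new();

    public static void Register<TRequest, TResponse>(Func<TRequest, TResponse> handler)
        => Handlers[typeof(TRequest)] = handler;

    public static TResponse Send<TRequest, TResponse>(TRequest request)
    {
        // Per call: hash the Type key, look it up, then cast the stored delegate.
        var handler = (Func<TRequest, TResponse>)Handlers[typeof(TRequest)];
        return handler(request);
    }
}

public static class CachedDispatcher
{
    // One static field per closed generic type. This avoids the explicit
    // Dictionary lookup and the delegate cast on the call path.
    private static class Slot<TRequest, TResponse>
    {
        public static Func<TRequest, TResponse>? Handler;
    }

    public static void Register<TRequest, TResponse>(Func<TRequest, TResponse> handler)
        => Slot<TRequest, TResponse>.Handler = handler;

    public static TResponse Send<TRequest, TResponse>(TRequest request)
    {
        var handler = Slot<TRequest, TResponse>.Handler
            ?? throw new InvalidOperationException($"No handler for {typeof(TRequest).Name}");
        return handler(request);
    }
}
```

Whether this closes the gap in practice would need to be measured. The sketch ignores DI scopes and handler lifetimes, which the list above also names as overhead sources.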
Source generation brought notification performance to within about 2% of MediatR (220.0 ns vs 215.8 ns).
Goal: Match or exceed v2.0 performance (290.8 ns target)
- Reduce `RequestHandlerInfo` metadata overhead
- Generate specialized dispatch methods per request type
- Eliminate redundant dictionary lookups
- Pre-compute generic type arguments

```csharp
// Instead of: dispatcher.SendAsync<TRequest, TResponse>(request)
// Generate:
public async Task<string> SendAsync(TestRequest request, CancellationToken ct = default)
{
    using var scope = _serviceProvider.CreateScope();
    var handler = scope.ServiceProvider.GetRequiredService<TestRequestHandler>();
    return await handler.HandleAsync(request, ct);
}
```

Estimated Impact: -100 to -150 ns (down to ~200-250 ns range)
Goal: <200 ns for request handling
- Generate dedicated dispatcher per request
  - Zero dictionary lookups
  - Direct handler resolution
  - Inline scope creation
- Compile-time generic resolution
  - No `typeof()` calls
  - Static handler type binding
  - Optimized async state machines
- Minimize allocations
  - Reuse scope instances where safe
  - ValueTask optimizations
  - Struct-based internal types

Estimated Impact: -50 to -100 ns (match MediatR 195 ns baseline)
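
As a rough illustration of the "static handler type binding" and "ValueTask optimizations" items above, the sketch below shows a dispatcher bound to one concrete handler type that returns `ValueTask<string>` without an `async` state machine. `PingRequest`, `PingHandler` and `PingDispatcher` are hypothetical names. The sketch also assumes the handler is stateless and safe to hold directly, which will not be true for scoped handlers.

```csharp
using System.Threading;
using System.Threading.Tasks;

public sealed record PingRequest(string Message);

public sealed class PingHandler
{
    // Completes synchronously, so no Task object is allocated to carry the result.
    public ValueTask<string> HandleAsync(PingRequest request, CancellationToken ct = default)
        => new ValueTask<string>("pong: " + request.Message);
}

public sealed class PingDispatcher
{
    private readonly PingHandler _handler;

    public PingDispatcher(PingHandler handler) => _handler = handler;

    // Not marked async: the handler's ValueTask is returned directly, so the
    // compiler generates no state machine for this method.
    public ValueTask<string> SendAsync(PingRequest request, CancellationToken ct = default)
        => _handler.HandleAsync(request, ct);
}
```

Calling `new PingDispatcher(new PingHandler()).SendAsync(new PingRequest("hi"))` yields an already-completed `ValueTask<string>`. How much of the estimated saving this accounts for is not something the sketch can show; it would need to be benchmarked.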
- Investigate V3 regression
  - Profile generated `AddGeneratedRoutya()` method
  - Identify added overhead sources
  - Optimize metadata structures
- Document notification wins
  - Highlight 21% improvement
  - Promote for notification-heavy workloads
- Add benchmark suite to CI
  - Track performance across releases
  - Prevent regressions
- Implement type-specific dispatchers
  - Generate `Send_TestRequest(TestRequest)` methods
  - Direct handler instantiation
  - Remove generic overhead
- Optimize handler resolution
  - Cached delegate invocation
  - Reduce DI container overhead
- Zero-allocation mode
  - ValueTask patterns
  - ArrayPool usage
  - Struct-based internal types
- AOT compilation support
  - NativeAOT compatibility
  - Trimming-safe metadata
- Performance parity with MediatR
  - <200 ns for all scenarios
  - Zero-allocation for sync paths
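
For the "Add benchmark suite to CI" item above, a regression gate can be as simple as comparing a new mean against a stored baseline with a tolerance. The sketch below is one possible shape. `RegressionGate` and the 5% default tolerance are assumptions made for illustration, not part of Routya or BenchmarkDotNet.

```csharp
public static class RegressionGate
{
    // Relative change of the current mean vs. the baseline, e.g. 0.20 for +20%.
    public static double RelativeChange(double baselineNs, double currentNs)
        => (currentNs - baselineNs) / baselineNs;

    // True when the current mean is slower than the baseline by more than the tolerance.
    public static bool IsRegression(double baselineNs, double currentNs, double tolerance = 0.05)
        => RelativeChange(baselineNs, currentNs) > tolerance;
}
```

Applied to the numbers in this report, `IsRegression(290.8, 349.8)` flags the v3.0 request path (about +20% vs v2.0), while `IsRegression(277.3, 220.0)` does not flag notifications, which improved.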
- ❌ Request/Response: 79% slower than MediatR
- ✅ Notifications: Competitive with MediatR (2% slower)
- ⚠️ Request/Response: <30% slower than MediatR (~250 ns)
- ✅ Notifications: Maintain parity (~220 ns)
- ✅ Request/Response: Match or beat MediatR (<195 ns)
- ✅ Notifications: Beat MediatR (<200 ns)
- ✅ Zero allocations for hot paths
- ✅ NativeAOT ready
The v3.0 source generator validated the compile-time approach: notifications are 21% faster than in v2.0. However, it also exposed architectural overhead in request dispatch that requires deeper optimization.
Next Steps:
- Fix v3.0 request regression (target: match v2.0 at 290.8 ns)
- Implement type-specific dispatch generation
- Work toward MediatR performance parity
The path forward is clear: generate specialized, zero-overhead dispatch code for each handler type, eliminating runtime generic resolution and dictionary lookups.
Generated by: Routya Source Generator Performance Analysis
Version: 3.0.0-preview.1
Benchmark Runtime: .NET 10.0.0, X64 RyuJIT AVX2