Description
Currently, we have the following heuristic: if the duration of the first WorkloadJitting iteration is more than IterationTime (which is 500ms by default), we skip the Pilot stage and perform a single invocation in further iterations.
This heuristic doesn't always work well, so in the scope of #1573, we introduced another heuristic: if the first WorkloadJitting iteration takes more than IterationTime but less than 1 second (a magic number), we discard the first results and perform another WorkloadJitting iteration.
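The two heuristics above can be summarized as a small decision function. This is only an illustrative sketch in Python, not the actual Engine code: the names `JittingDecision` and `decide_after_first_jitting` are hypothetical, and the comparisons at the exact boundaries are an assumption based on the wording "more than" and "less than".

```python
from enum import Enum

# Values taken from the description above.
ITERATION_TIME_NS = 500_000_000          # default IterationTime (500 ms)
REPEAT_UPPER_BOUND_NS = 1_000_000_000    # the "magic number" (1 second)


class JittingDecision(Enum):
    """Hypothetical names for the three possible outcomes."""
    RUN_PILOT = "run the Pilot stage as usual"
    REPEAT_JITTING = "discard the result and run WorkloadJitting again"
    SKIP_PILOT = "skip Pilot, use a single invocation per iteration"


def decide_after_first_jitting(
    duration_ns: float,
    iteration_time_ns: float = ITERATION_TIME_NS,
    upper_bound_ns: float = REPEAT_UPPER_BOUND_NS,
) -> JittingDecision:
    """Classify the first WorkloadJitting iteration by its duration."""
    if duration_ns <= iteration_time_ns:
        # Fast enough: the Pilot stage picks the invocation count.
        return JittingDecision.RUN_PILOT
    if duration_ns < upper_bound_ns:
        # Slow, but possibly only because of JIT overhead: measure again.
        return JittingDecision.REPEAT_JITTING
    # Slower than the upper bound: treated as a genuinely heavy benchmark.
    return JittingDecision.SKIP_PILOT


if __name__ == "__main__":
    for ns in (100_000_000, 700_000_000, 2_660_880_000):
        print(ns, decide_after_first_jitting(ns).name)
```

The upper bound is a parameter here so that different values of the magic number can be compared against the same measurements.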
Unfortunately, it doesn't always work well either. If we run these F# benchmarks (the problem was originally reported here by Matthew Crews), we can observe the following situation (the full log is available here):
WorkloadJitting 1: 1 op, 2660880000.00 ns, 2.6609 s/op
WorkloadWarmup 1: 1 op, 55999700.00 ns, 55.9997 ms/op
WorkloadWarmup 2: 1 op, 48296500.00 ns, 48.2965 ms/op
WorkloadWarmup 3: 1 op, 47339800.00 ns, 47.3398 ms/op
WorkloadWarmup 4: 1 op, 37181000.00 ns, 37.1810 ms/op
WorkloadWarmup 5: 1 op, 37303500.00 ns, 37.3035 ms/op
WorkloadWarmup 6: 1 op, 38082300.00 ns, 38.0823 ms/op
WorkloadWarmup 7: 1 op, 37534700.00 ns, 37.5347 ms/op
WorkloadWarmup 8: 1 op, 39494000.00 ns, 39.4940 ms/op
WorkloadWarmup 9: 1 op, 37381700.00 ns, 37.3817 ms/op
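To quantify the gap in the log above, the lines can be parsed and the first WorkloadJitting iteration compared against the warmed-up iterations. The following Python sketch is only an illustration: the regular expression is an assumption based on the ten lines shown here, not a documented log format, and the helper name `parse_stage_durations` is hypothetical.

```python
import re
from statistics import median

ITERATION_TIME_NS = 500_000_000  # default IterationTime (500 ms)

LOG = """\
WorkloadJitting 1: 1 op, 2660880000.00 ns, 2.6609 s/op
WorkloadWarmup 1: 1 op, 55999700.00 ns, 55.9997 ms/op
WorkloadWarmup 2: 1 op, 48296500.00 ns, 48.2965 ms/op
WorkloadWarmup 3: 1 op, 47339800.00 ns, 47.3398 ms/op
WorkloadWarmup 4: 1 op, 37181000.00 ns, 37.1810 ms/op
WorkloadWarmup 5: 1 op, 37303500.00 ns, 37.3035 ms/op
WorkloadWarmup 6: 1 op, 38082300.00 ns, 38.0823 ms/op
WorkloadWarmup 7: 1 op, 37534700.00 ns, 37.5347 ms/op
WorkloadWarmup 8: 1 op, 39494000.00 ns, 39.4940 ms/op
WorkloadWarmup 9: 1 op, 37381700.00 ns, 37.3817 ms/op
"""

# Assumed line shape: "<Stage> <index>: <ops> op, <total> ns, ..."
LINE_RE = re.compile(r"^(\w+)\s+(\d+): (\d+) op, ([\d.]+) ns")


def parse_stage_durations(log: str) -> dict[str, list[float]]:
    """Return nanoseconds per operation for each stage, in log order."""
    stages: dict[str, list[float]] = {}
    for line in log.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            stage, _index, ops, total_ns = match.groups()
            stages.setdefault(stage, []).append(float(total_ns) / int(ops))
    return stages


stages = parse_stage_durations(LOG)
jitting_ns = stages["WorkloadJitting"][0]
warm_ns = median(stages["WorkloadWarmup"])

# How much slower the first invocation is than a warmed-up one.
ratio = jitting_ns / warm_ns
# How many warmed-up invocations would fit into one IterationTime.
invocations_per_iteration = int(ITERATION_TIME_NS // warm_ns)

if __name__ == "__main__":
    print(f"median warmup: {warm_ns / 1e6:.2f} ms/op")
    print(f"jitting / warmup ratio: {ratio:.1f}x")
    print(f"invocations that fit into IterationTime: {invocations_per_iteration}")
```

With a single invocation per iteration, each measured iteration is much shorter than IterationTime, which is why the Pilot stage would have been useful for this benchmark.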
While the typical duration of the warmed benchmark is about 35-45ms, the first invocation takes more than 2.5 seconds. This situation leads to reasonable warnings:
I believe that the ultimate solution is to introduce the ability to restart the Pilot stage in the case of "too fast" WorkloadWarmup iterations. However, that would require a major refactoring of our Engine infrastructure. As a quick hotfix, we could just bump the magic number from 1 second to a higher value (let's say 10 seconds). The worst-case side effect of such a change is up to 10 extra seconds of total benchmarking time for heavy benchmarks (those that actually take more than 500ms, but less than 10 seconds).
@adamsitnik, what do you think?