Incorrectly detected number of invocations #1780

Open
@AndreyAkinshin

Currently, we have the following heuristic: if the duration of the first WorkloadJitting iteration is more than IterationTime (which is 500ms by default), we skip the Pilot stage and perform a single invocation in further iterations.

This heuristic doesn't always work well, so in the scope of #1573 we introduced another heuristic: if the first WorkloadJitting iteration takes more than IterationTime but less than 1 second (a magic number), we discard the first results and perform another WorkloadJitting iteration.
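To make the two heuristics concrete, here is a minimal sketch of the decision taken after the first WorkloadJitting iteration, as described above. It is written in Python for illustration only; `NextStep` and `decide_after_first_jitting` are hypothetical names, not BenchmarkDotNet's Engine API.

```python
from enum import Enum

NS_PER_MS = 1_000_000


class NextStep(Enum):
    RUN_PILOT = "run the Pilot stage"
    RERUN_JITTING = "discard the result and run WorkloadJitting again"
    SINGLE_INVOCATION = "skip Pilot and use 1 invocation per iteration"


def decide_after_first_jitting(
    duration_ns,
    iteration_time_ns=500 * NS_PER_MS,   # IterationTime, 500ms by default
    rerun_limit_ns=1000 * NS_PER_MS,     # the 1-second magic number
):
    """Hypothetical model of the heuristics described in this issue."""
    if duration_ns <= iteration_time_ns:
        # Fast enough: let the Pilot stage pick the invocation count.
        return NextStep.RUN_PILOT
    if duration_ns < rerun_limit_ns:
        # Slow, but possibly only because of JIT overhead: measure again.
        return NextStep.RERUN_JITTING
    # Assumed to be a genuinely heavy benchmark.
    return NextStep.SINGLE_INVOCATION


if __name__ == "__main__":
    for ms in (300, 700, 3000):
        print(f"{ms} ms: {decide_after_first_jitting(ms * NS_PER_MS).value}")
```

Any first iteration that lands at or above the upper limit falls into the last branch, regardless of how fast the benchmark becomes once it is warmed up.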

Unfortunately, it doesn't always work well either. If we run these F# benchmarks (the problem was originally reported here by Matthew Crews), we can observe the following situation (the full log is available here):

WorkloadJitting  1: 1 op, 2660880000.00 ns, 2.6609 s/op

WorkloadWarmup   1: 1 op, 55999700.00 ns, 55.9997 ms/op
WorkloadWarmup   2: 1 op, 48296500.00 ns, 48.2965 ms/op
WorkloadWarmup   3: 1 op, 47339800.00 ns, 47.3398 ms/op
WorkloadWarmup   4: 1 op, 37181000.00 ns, 37.1810 ms/op
WorkloadWarmup   5: 1 op, 37303500.00 ns, 37.3035 ms/op
WorkloadWarmup   6: 1 op, 38082300.00 ns, 38.0823 ms/op
WorkloadWarmup   7: 1 op, 37534700.00 ns, 37.5347 ms/op
WorkloadWarmup   8: 1 op, 39494000.00 ns, 39.4940 ms/op
WorkloadWarmup   9: 1 op, 37381700.00 ns, 37.3817 ms/op

While the typical duration of the warmed benchmark is about 35-45ms, the first invocation takes more than 2.5 seconds. This situation leads to reasonable warnings:

I believe that the ultimate solution is to introduce an ability to restart the Pilot stage in the case of "too fast" WorkloadWarmup iterations. However, it would require a major refactoring of our Engine infrastructure. As a quick hotfix, we could just bump the magic number from 1 second to a higher value (let's say 10 seconds). The worst-case side effect of such a change is an extra 10 seconds of total benchmarking time for heavy benchmarks (those that actually take more than 500ms but less than 10 seconds).
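A rough sketch of how the hotfix would change the outcome for the log above, in Python for illustration only. `plan_after_jitting` is a hypothetical helper, not BenchmarkDotNet code, and the second duration is an assumption: the first WorkloadWarmup measurement is used as a stand-in for what a repeated WorkloadJitting iteration might take.

```python
ITERATION_TIME_NS = 500_000_000  # IterationTime, 500ms by default


def plan_after_jitting(jitting_durations_ns, rerun_limit_ns):
    """Hypothetical model: return 'pilot' or 'single invocation'.

    jitting_durations_ns[0] is the first WorkloadJitting duration;
    jitting_durations_ns[1] is only consulted if the first is discarded.
    """
    first = jitting_durations_ns[0]
    if first <= ITERATION_TIME_NS:
        return "pilot"
    if first >= rerun_limit_ns:
        return "single invocation"
    # First result discarded: decide based on the repeated iteration.
    second = jitting_durations_ns[1]
    return "pilot" if second <= ITERATION_TIME_NS else "single invocation"


# 2.66 s is taken from the log; 56 ms is an assumed stand-in (see above).
observed = [2_660_880_000, 55_999_700]

if __name__ == "__main__":
    print("limit  1 s:", plan_after_jitting(observed, 1_000_000_000))
    print("limit 10 s:", plan_after_jitting(observed, 10_000_000_000))
```

Under these assumptions, the 1-second limit locks the benchmark into a single invocation per iteration, while the 10-second limit lets it reach the Pilot stage. The cost is one repeated WorkloadJitting iteration for every benchmark whose first iteration falls between IterationTime and the new limit.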

@adamsitnik, what do you think?
