Commit 433376f

psvvsp, Sergey Pavlov, and alliepiper authored
Restrict stopping criterion parameter usage in command line (#174)
* restrict stopping criterion parameter usage in command line
* Update docs for stopping criterion.
* Add convenience benchmark_base API for criterion params.
* Add more test cases for stopping criterion parsing.

---------

Co-authored-by: Sergey Pavlov <[email protected]>
Co-authored-by: Allison Piper <[email protected]>
1 parent ca0e795 commit 433376f

9 files changed: +481 -87 lines changed


docs/cli_help.md

Lines changed: 65 additions & 40 deletions
@@ -83,36 +83,6 @@
   * Applies to the most recent `--benchmark`, or all benchmarks if specified
     before any `--benchmark` arguments.
 
-* `--min-samples <count>`
-  * Gather at least `<count>` samples per measurement.
-  * Default is 10 samples.
-  * Applies to the most recent `--benchmark`, or all benchmarks if specified
-    before any `--benchmark` arguments.
-
-* `--stopping-criterion <criterion>`
-  * After `--min-samples` is satisfied, use `<criterion>` to detect if enough
-    samples were collected.
-  * Only applies to Cold measurements.
-  * Default is stdrel (`--stopping-criterion stdrel`)
-
-* `--min-time <seconds>`
-  * Accumulate at least `<seconds>` of execution time per measurement.
-  * Only applies to `stdrel` stopping criterion.
-  * Default is 0.5 seconds.
-  * If both GPU and CPU times are gathered, this applies to GPU time only.
-  * Applies to the most recent `--benchmark`, or all benchmarks if specified
-    before any `--benchmark` arguments.
-
-* `--max-noise <value>`
-  * Gather samples until the error in the measurement drops below `<value>`.
-  * Noise is specified as the percent relative standard deviation.
-  * Default is 0.5% (`--max-noise 0.5`)
-  * Only applies to `stdrel` stopping criterion.
-  * Only applies to Cold measurements.
-  * If both GPU and CPU times are gathered, this applies to GPU noise only.
-  * Applies to the most recent `--benchmark`, or all benchmarks if specified
-    before any `--benchmark` arguments.
-
 * `--skip-time <seconds>`
   * Skip a measurement when a warmup run executes in less than `<seconds>`.
   * Default is -1 seconds (disabled).
@@ -123,16 +93,6 @@
   * Applies to the most recent `--benchmark`, or all benchmarks if specified
     before any `--benchmark` arguments.
 
-* `--timeout <seconds>`
-  * Measurements will timeout after `<seconds>` have elapsed.
-  * Default is 15 seconds.
-  * `<seconds>` is walltime, not accumulated sample time.
-  * If a measurement times out, the default markdown log will print a warning to
-    report any outstanding termination criteria (min samples, min time, max
-    noise).
-  * Applies to the most recent `--benchmark`, or all benchmarks if specified
-    before any `--benchmark` arguments.
-
 * `--throttle-threshold <value>`
   * Set the GPU throttle threshold as percentage of the device's default clock rate.
   * Default is 75.
@@ -166,3 +126,68 @@
   * Intended for use with external profiling tools.
   * Applies to the most recent `--benchmark`, or all benchmarks if specified
     before any `--benchmark` arguments.
+
+## Stopping Criteria
+
+* `--timeout <seconds>`
+  * Measurements will timeout after `<seconds>` have elapsed.
+  * Default is 15 seconds.
+  * `<seconds>` is walltime, not accumulated sample time.
+  * If a measurement times out, the default markdown log will print a warning to
+    report any outstanding termination criteria (min samples, min time, max
+    noise).
+  * Applies to the most recent `--benchmark`, or all benchmarks if specified
+    before any `--benchmark` arguments.
+
+* `--min-samples <count>`
+  * Gather at least `<count>` samples per measurement before checking any
+    other stopping criterion besides the timeout.
+  * Default is 10 samples.
+  * Applies to the most recent `--benchmark`, or all benchmarks if specified
+    before any `--benchmark` arguments.
+
+* `--stopping-criterion <criterion>`
+  * After `--min-samples` is satisfied, use `<criterion>` to detect if enough
+    samples were collected.
+  * Only applies to Cold and CPU-only measurements.
+  * If both GPU and CPU times are gathered, GPU time is used for stopping
+    analysis.
+  * Stopping criteria provided by NVBench are:
+    * "stdrel": (default) Converges to a minimal relative standard deviation,
+      stdev / mean
+    * "entropy": Converges based on the cumulative entropy of all samples.
+  * Each stopping criterion may provide additional parameters to customize
+    behavior, as detailed below:
+
+### "stdrel" Stopping Criterion Parameters
+
+* `--min-time <seconds>`
+  * Accumulate at least `<seconds>` of execution time per measurement.
+  * Only applies to `stdrel` stopping criterion.
+  * Default is 0.5 seconds.
+  * Applies to the most recent `--benchmark`, or all benchmarks if specified
+    before any `--benchmark` arguments.
+
+* `--max-noise <value>`
+  * Gather samples until the error in the measurement drops below `<value>`.
+  * Noise is specified as the percent relative standard deviation (stdev/mean).
+  * Default is 0.5% (`--max-noise 0.5`)
+  * Applies to the most recent `--benchmark`, or all benchmarks if specified
+    before any `--benchmark` arguments.
+
+### "entropy" Stopping Criterion Parameters
+
+* `--max-angle <value>`
+  * Maximum linear regression angle of cumulative entropy.
+  * Smaller values give more accurate results.
+  * Default is 0.048.
+  * Applies to the most recent `--benchmark`, or all benchmarks if specified
+    before any `--benchmark` arguments.
+
+* `--min-r2 <value>`
+  * Minimum coefficient of determination for linear regression of cumulative
+    entropy.
+  * Larger values give more accurate results.
+  * Default is 0.36.
+  * Applies to the most recent `--benchmark`, or all benchmarks if specified
+    before any `--benchmark` arguments.

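The documentation above describes the "stdrel" criterion in terms of `--min-samples`, `--min-time`, and `--max-noise`. The following is a minimal, self-contained sketch of how those three options could combine into a single stopping check. It is not NVBench's implementation: the function names `relative_stdev` and `stdrel_should_stop`, the use of the sample (n - 1) variance, and the requirement that all three conditions hold at once are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Illustrative helper, not NVBench code: relative standard deviation
// (stdev / mean), computed with the sample (n - 1) variance.
inline double relative_stdev(const std::vector<double> &samples)
{
  assert(samples.size() >= 2 && "need at least two samples");
  const double n = static_cast<double>(samples.size());

  double sum = 0.0;
  for (const double s : samples)
  {
    sum += s;
  }
  const double mean = sum / n;

  double squared_dev = 0.0;
  for (const double s : samples)
  {
    squared_dev += (s - mean) * (s - mean);
  }
  return std::sqrt(squared_dev / (n - 1.0)) / mean;
}

// Hypothetical stopping check modelled on the documented options.
// `max_noise` is a fraction here, so `--max-noise 0.5` (percent) maps to 0.005.
inline bool stdrel_should_stop(const std::vector<double> &samples,
                               std::size_t min_samples,
                               double min_time,
                               double max_noise)
{
  // --min-samples gates every other check.
  if (samples.size() < min_samples || samples.size() < 2)
  {
    return false;
  }

  // --min-time: accumulated execution time, not walltime.
  double total_time = 0.0;
  for (const double s : samples)
  {
    total_time += s;
  }
  if (total_time < min_time)
  {
    return false;
  }

  // --max-noise: stop once stdev / mean drops below the threshold.
  return relative_stdev(samples) < max_noise;
}
```

With the documented defaults (10 samples, 0.5 seconds, 0.5% noise), a call would look like `stdrel_should_stop(times, 10, 0.5, 0.005)`, where `times` holds per-sample durations in seconds.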
examples/CMakeLists.txt

Lines changed: 8 additions & 3 deletions
@@ -16,7 +16,6 @@ set(example_srcs
 add_custom_target(nvbench.example.all)
 add_dependencies(nvbench.all nvbench.example.all)
 
-
 function (nvbench_add_examples_target target_prefix cuda_std)
   add_custom_target(${target_prefix}.all)
   add_dependencies(nvbench.example.all ${target_prefix}.all)
@@ -29,9 +28,15 @@ function (nvbench_add_examples_target target_prefix cuda_std)
     target_include_directories(${example_name} PRIVATE "${CMAKE_CURRENT_LIST_DIR}")
     target_link_libraries(${example_name} PRIVATE nvbench::main)
     set_target_properties(${example_name} PROPERTIES COMPILE_FEATURES cuda_std_${cuda_std})
+
+    set(example_args --timeout 0.1)
+    # The custom_criterion example doesn't support the --min-time argument:
+    if (NOT "${example_src}" STREQUAL "custom_criterion.cu")
+      list(APPEND example_args --min-time 1e-5)
+    endif()
+
     add_test(NAME ${example_name}
-             COMMAND "$<TARGET_FILE:${example_name}>" --timeout 0.1 --min-time 1e-5
-    )
+             COMMAND "$<TARGET_FILE:${example_name}>" ${example_args})
 
     # These should not deadlock. If they do, it may be that the CUDA context was created before
     # setting CUDA_MODULE_LOAD=EAGER in main, see NVIDIA/nvbench#136.

nvbench/benchmark_base.cuh

Lines changed: 40 additions & 9 deletions
@@ -266,21 +266,52 @@ struct benchmark_base
     return *this;
   }
 
-  [[nodiscard]] nvbench::criterion_params &get_criterion_params() { return m_criterion_params; }
-  [[nodiscard]] const nvbench::criterion_params &get_criterion_params() const
-  {
-    return m_criterion_params;
-  }
-
   /// Control the stopping criterion for the measurement loop.
   /// @{
   [[nodiscard]] const std::string &get_stopping_criterion() const { return m_stopping_criterion; }
-  benchmark_base &set_stopping_criterion(std::string criterion)
+  benchmark_base &set_stopping_criterion(std::string criterion);
+  /// @}
+
+  [[nodiscard]] bool has_criterion_param(const std::string &name) const
   {
-    m_stopping_criterion = std::move(criterion);
+    return m_criterion_params.has_value(name);
+  }
+
+  [[nodiscard]] nvbench::int64_t get_criterion_param_int64(const std::string &name) const
+  {
+    return m_criterion_params.get_int64(name);
+  }
+  benchmark_base &set_criterion_param_int64(const std::string &name, nvbench::int64_t value)
+  {
+    m_criterion_params.set_int64(name, value);
     return *this;
   }
-  /// @}
+
+  [[nodiscard]] nvbench::float64_t get_criterion_param_float64(const std::string &name) const
+  {
+    return m_criterion_params.get_float64(name);
+  }
+  benchmark_base &set_criterion_param_float64(const std::string &name, nvbench::float64_t value)
+  {
+    m_criterion_params.set_float64(name, value);
+    return *this;
+  }
+
+  [[nodiscard]] std::string get_criterion_param_string(const std::string &name) const
+  {
+    return m_criterion_params.get_string(name);
+  }
+  benchmark_base &set_criterion_param_string(const std::string &name, std::string value)
+  {
+    m_criterion_params.set_string(name, std::move(value));
+    return *this;
+  }
+
+  [[nodiscard]] nvbench::criterion_params &get_criterion_params() { return m_criterion_params; }
+  [[nodiscard]] const nvbench::criterion_params &get_criterion_params() const
+  {
+    return m_criterion_params;
+  }
 
 protected:
   friend struct nvbench::runner_base;

nvbench/benchmark_base.cxx

Lines changed: 8 additions & 0 deletions
@@ -17,6 +17,7 @@
  */
 
 #include <nvbench/benchmark_base.cuh>
+#include <nvbench/criterion_manager.cuh>
 #include <nvbench/detail/transform_reduce.cuh>
 
 namespace nvbench
@@ -88,4 +89,11 @@ std::size_t benchmark_base::get_config_count() const
   return per_device_count * m_devices.size();
 }
 
+benchmark_base &benchmark_base::set_stopping_criterion(std::string criterion)
+{
+  m_stopping_criterion = std::move(criterion);
+  m_criterion_params = criterion_manager::get().get_criterion(m_stopping_criterion).get_params();
+  return *this;
+}
+
 } // namespace nvbench

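In the change above, `set_stopping_criterion` now replaces `m_criterion_params` with the parameter set of the newly selected criterion. One likely consequence is that parameter overrides made before switching criteria are discarded, so the criterion should be selected first. The sketch below illustrates that ordering with stand-in types: `mock_benchmark`, `default_params`, and the plain `std::map` storage are hypothetical and only mirror the method names from the diff, and the default values are taken from docs/cli_help.md (with `max-noise` stored as a fraction, an assumption).

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Stand-in for nvbench::criterion_params; illustration only.
using params = std::map<std::string, double>;

// Hypothetical registry of per-criterion parameter defaults.
inline const params &default_params(const std::string &criterion)
{
  static const std::map<std::string, params> registry{
    {"stdrel", {{"max-noise", 0.005}, {"min-time", 0.5}}},
    {"entropy", {{"max-angle", 0.048}, {"min-r2", 0.36}}}};
  return registry.at(criterion); // throws std::out_of_range for unknown names
}

// Mock that mirrors the fluent setters shown in the diff.
struct mock_benchmark
{
  mock_benchmark &set_stopping_criterion(std::string criterion)
  {
    // Look up first so an unknown name leaves the object unchanged.
    const params &defaults = default_params(criterion);
    m_criterion            = std::move(criterion);
    m_params               = defaults; // earlier overrides are dropped here
    return *this;
  }

  mock_benchmark &set_criterion_param_float64(const std::string &name, double value)
  {
    m_params[name] = value;
    return *this;
  }

  bool has_criterion_param(const std::string &name) const
  {
    return m_params.count(name) != 0;
  }

  double get_criterion_param_float64(const std::string &name) const
  {
    return m_params.at(name);
  }

private:
  std::string m_criterion{"stdrel"};
  params m_params{default_params("stdrel")};
};
```

Under these assumptions, `bench.set_stopping_criterion("entropy").set_criterion_param_float64("min-r2", 0.5)` keeps the override, while calling the two setters in the opposite order would reset `min-r2` to its default.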
nvbench/criterion_manager.cuh

Lines changed: 3 additions & 0 deletions
@@ -50,6 +50,9 @@ public:
 
   using params_description = std::vector<std::pair<std::string, nvbench::named_values::type>>;
   params_description get_params_description() const;
+
+  using params_map = std::unordered_map<std::string, params_description>;
+  params_map get_params_description_map() const;
 };
 
 /**

nvbench/criterion_manager.cxx

Lines changed: 19 additions & 0 deletions
@@ -104,4 +104,23 @@ nvbench::criterion_manager::params_description criterion_manager::get_params_des
   return desc;
 }
 
+criterion_manager::params_map criterion_manager::get_params_description_map() const
+{
+  params_map result;
+
+  for (auto &[criterion_name, criterion] : m_map)
+  {
+    params_description &desc = result[criterion_name];
+    nvbench::criterion_params params = criterion->get_params();
+
+    for (auto param : params.get_names())
+    {
+      nvbench::named_values::type type = params.get_type(param);
+      desc.emplace_back(param, type);
+    }
+  }
+
+  return result;
+}
+
 } // namespace nvbench

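`get_params_description_map` returns, for each criterion name, the list of parameter names and types that criterion accepts. A map of that shape is enough to decide whether a command-line parameter is valid for the currently selected criterion, which is the restriction this commit's title describes. The sketch below shows one way such a lookup could work. The option-parser code is not part of this excerpt, so `accepts_param`, `example_params_map`, and the simplified `value_type` enum are hypothetical stand-ins, and the parameter types listed are assumed.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Simplified stand-ins for the NVBench types named in the diff.
enum class value_type
{
  int64,
  float64,
  string
};
using params_description = std::vector<std::pair<std::string, value_type>>;
using params_map         = std::unordered_map<std::string, params_description>;

// Hypothetical check: does `criterion` declare a parameter called `param`?
inline bool accepts_param(const params_map &map,
                          const std::string &criterion,
                          const std::string &param)
{
  const auto it = map.find(criterion);
  if (it == map.end())
  {
    return false; // unknown criterion: nothing is accepted
  }
  for (const auto &entry : it->second)
  {
    if (entry.first == param)
    {
      return true;
    }
  }
  return false;
}

// Example data following the parameters listed in docs/cli_help.md.
inline params_map example_params_map()
{
  return {{"stdrel", {{"max-noise", value_type::float64}, {"min-time", value_type::float64}}},
          {"entropy", {{"max-angle", value_type::float64}, {"min-r2", value_type::float64}}}};
}
```

A parser built this way could reject `--min-time` when the active criterion is "entropy" rather than silently ignoring it.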
nvbench/detail/measure_cold.cu

Lines changed: 19 additions & 7 deletions
@@ -30,6 +30,7 @@
 #include <algorithm>
 #include <chrono>
 #include <limits>
+#include <optional>
 #include <thread>
 
 namespace nvbench::detail
@@ -387,19 +388,30 @@ void measure_cold_base::generate_summaries()
 
   if (m_max_time_exceeded)
   {
-    const auto timeout   = m_walltime_timer.get_duration();
-    const auto max_noise = m_criterion_params.get_float64("max-noise");
-    const auto min_time  = m_criterion_params.get_float64("min-time");
+    const auto timeout = m_walltime_timer.get_duration();
 
-    if (cuda_noise > max_noise)
+    auto get_param = [this](std::optional<nvbench::float64_t> &param, const std::string &name) {
+      if (m_criterion_params.has_value(name))
+      {
+        param = m_criterion_params.get_float64(name);
+      }
+    };
+
+    std::optional<nvbench::float64_t> max_noise;
+    get_param(max_noise, "max-noise");
+
+    std::optional<nvbench::float64_t> min_time;
+    get_param(min_time, "min-time");
+
+    if (max_noise && cuda_noise > *max_noise)
     {
       printer.log(nvbench::log_level::warn,
                   fmt::format("Current measurement timed out ({:0.2f}s) "
                               "while over noise threshold ({:0.2f}% > "
                               "{:0.2f}%)",
                               timeout,
                               cuda_noise * 100,
-                              max_noise * 100));
+                              *max_noise * 100));
     }
     if (m_total_samples < m_min_samples)
     {
@@ -410,15 +422,15 @@ void measure_cold_base::generate_summaries()
                               m_total_samples,
                               m_min_samples));
     }
-    if (m_total_cuda_time < min_time)
+    if (min_time && m_total_cuda_time < *min_time)
     {
       printer.log(nvbench::log_level::warn,
                   fmt::format("Current measurement timed out ({:0.2f}s) "
                               "before accumulating min_time ({:0.2f}s < "
                               "{:0.2f}s)",
                               timeout,
                               m_total_cuda_time,
-                              min_time));
+                              *min_time));
     }
   }

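The timeout warnings above only make sense for criteria that define `max-noise` and `min-time`, so the change reads each parameter into a `std::optional` and skips the warning when it is absent. The same pattern can be written as a lookup that returns the optional directly, which avoids passing the destination variable by reference. This is a sketch using a plain `std::map` in place of `nvbench::criterion_params`. `find_param` and `noise_warning_needed` are hypothetical names, not NVBench functions.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Stand-in for nvbench::criterion_params; illustration only.
using params = std::map<std::string, double>;

// Returns the parameter value, or std::nullopt when the active criterion
// does not define a parameter with this name.
inline std::optional<double> find_param(const params &p, const std::string &name)
{
  const auto it = p.find(name);
  if (it == p.end())
  {
    return std::nullopt;
  }
  return it->second;
}

// Mirrors the guarded comparison in the diff: warn only if the criterion has
// a "max-noise" parameter and the measured noise exceeds it.
inline bool noise_warning_needed(const params &p, double measured_noise)
{
  const std::optional<double> max_noise = find_param(p, "max-noise");
  return max_noise && measured_noise > *max_noise;
}
```

Returning the optional ties each variable to the name it was looked up with, which makes it harder to fill the wrong destination by accident.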