@comeyrd comeyrd commented Oct 22, 2025

This pull request is based on the idea from issue #121.

Users would be able to provide the CUPTI metrics they want to collect for their benchmark
by filling in one or more CustomCuptiMetrics structs:

struct CustomCuptiMetrics
{
  const char *metric_name;
  const char *name;
  const char *hint;
  const char *description;
  const double divider;
};

measure_cupti.cuh/cu would append these custom metrics to the ones nvbench already defines and forward them to cupti_profiler.cuh/cxx.
To support that, cupti_profiler.cuh/cxx would need to check whether each selected metric is available on the device, drop the unavailable ones, and still run the experiment with the remaining metrics.

measure_cupti.cuh/cu would then report which metrics were unavailable.
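
The drop-and-report behavior described above could be sketched as follows. This is not code from the branch: metric_split and split_by_availability are hypothetical names, and the set of device metrics is assumed to have been obtained elsewhere.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical result type: metrics to profile and metrics to report as missing.
struct metric_split
{
  std::vector<std::string> available;
  std::vector<std::string> unavailable;
};

// Hypothetical helper: partitions the requested metrics by whether the device
// exposes them. Order within each group follows the order of the request.
metric_split split_by_availability(const std::vector<std::string> &requested,
                                   const std::unordered_set<std::string> &device_metrics)
{
  metric_split result;
  for (const auto &metric : requested)
  {
    if (device_metrics.count(metric) != 0)
    {
      result.available.push_back(metric);
    }
    else
    {
      result.unavailable.push_back(metric);
    }
  }
  return result;
}
```

With this split, the profiler would only see result.available, while the measurement code could print result.unavailable as a warning instead of aborting the run.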

Availability would be queried through the PerfWorks Metrics API: the NVPW_MetricsEvaluator_GetMetricNames() call would be used to see whether the user-specified metrics are available on the current device.
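
My reading of the header is that NVPW_MetricsEvaluator_GetMetricNames() hands back the names as one packed buffer of null-terminated strings plus an array of begin indices. That layout is an assumption to verify against nvperf_host.h, and whether the returned names include rollup suffixes such as .sum also needs checking. Under that assumption, unpacking could look like this sketch (unpack_metric_names is a hypothetical helper and does not call the real API):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: converts a packed name buffer into a vector of strings.
// `packed` holds null-terminated names back to back; `begin_indices[i]` is the
// offset of the i-th name inside `packed` (assumed layout, see the text above).
std::vector<std::string> unpack_metric_names(const char *packed,
                                             const std::size_t *begin_indices,
                                             std::size_t num_metrics)
{
  std::vector<std::string> names;
  names.reserve(num_metrics);
  for (std::size_t i = 0; i < num_metrics; ++i)
  {
    // std::string's const char* constructor stops at the terminating '\0'.
    names.emplace_back(packed + begin_indices[i]);
  }
  return names;
}
```

The resulting vector can then be turned into a set and compared against the user-specified metric names.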

For now, this branch only contains comments that document my understanding of the current implementation and mark the places where I plan to add the required code.

Any thoughts on this project?

copy-pr-bot bot commented Oct 22, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@comeyrd comeyrd marked this pull request as draft October 22, 2025 21:18

comeyrd commented Oct 23, 2025

Hi @gevtushenko!
I am working on the cupti_profiler that you wrote a few years ago,
and I have one question about counter_data_builder and its use.
In cupti_profiler.cxx, around line 476:

void cupti_profiler::initialize_counter_data_prefix_image()
{
  const std::uint8_t *counter_availability_image = nullptr;

  std::vector<NVPA_RawMetricRequest> raw_metric_requests =
    get_raw_metric_requests(m_chip_name, m_metric_names, counter_availability_image);

  counter_data_builder data_builder(m_chip_name, counter_availability_image);
...

From what I understand, the availability image is an array of uint8_t that will be used in the future (or is already used) instead of the chip name to identify the chip and its capabilities.

In cupti_profiler.cxx, around line 133, you initialize the member variable m_availability_image:

void cupti_profiler::initialize_availability_image()
{
  CUpti_Profiler_GetCounterAvailability_Params params{};

  params.structSize = CUpti_Profiler_GetCounterAvailability_Params_STRUCT_SIZE;
  params.ctx        = m_device.get_context();

  cupti_call(cuptiProfilerGetCounterAvailability(&params));

  m_availability_image.clear();
  m_availability_image.resize(params.counterAvailabilityImageSize);
  params.pCounterAvailabilityImage = m_availability_image.data();

  cupti_call(cuptiProfilerGetCounterAvailability(&params));
}
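
The function above uses a two-call pattern: the first call, made with a null image pointer, only reports the required size, and the second call fills the buffer. The same pattern in a self-contained sketch, where mock_params and mock_get_counter_availability are hypothetical stand-ins for the CUPTI types and call, not the real API:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for CUpti_Profiler_GetCounterAvailability_Params.
struct mock_params
{
  std::size_t image_size = 0;
  std::uint8_t *image    = nullptr;
};

// Hypothetical stand-in for cuptiProfilerGetCounterAvailability():
// with a null image it reports the size, otherwise it fills the buffer.
void mock_get_counter_availability(mock_params &params)
{
  static const std::uint8_t device_image[] = {0x01, 0x02, 0x03, 0x04};
  if (params.image == nullptr)
  {
    params.image_size = sizeof(device_image);
    return;
  }
  // Never write more than the caller allocated.
  const std::size_t n =
    params.image_size < sizeof(device_image) ? params.image_size : sizeof(device_image);
  std::memcpy(params.image, device_image, n);
}

std::vector<std::uint8_t> query_availability_image()
{
  mock_params params{};
  mock_get_counter_availability(params); // first call: size query only

  std::vector<std::uint8_t> image(params.image_size);
  params.image = image.data();
  mock_get_counter_availability(params); // second call: fills the buffer
  return image;
}
```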

But in cupti_profiler::initialize_counter_data_prefix_image(), you pass a nullptr availability image to counter_data_builder instead of m_availability_image.

Can you give me some insight on that?
Is there any CUPTI documentation other than the CUPTI usage guide, such as an API reference? Or are the library's headers the only source of information?

Thanks!
