Timing is inaccurate #6

I found two problems with the way kfusion measures how long the individual kernels take:
- Use of `return double(std::clock())/CLOCKS_PER_SEC`

`std::clock()` does not measure wall-clock time. It returns the processor time consumed by the process, so the result depends on CPU activity. Changing it to

``` c++
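// Needs <time.h>. CLOCK_MONOTONIC reports elapsed time, independent of CPU load.
struct timespec clockData;
clock_gettime(CLOCK_MONOTONIC, &clockData);
// Combine whole seconds and nanoseconds into seconds.
return (double) clockData.tv_sec + clockData.tv_nsec / 1000000000.0;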
```

makes the timing more accurate, and also shows that kfusion is around 10% faster than the old measurement suggested.
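
To illustrate how far the two clocks can disagree, here is a minimal sketch, assuming a POSIX system where `std::clock()` reports process CPU time. The names `cpuSeconds`, `wallSeconds`, `Elapsed` and `measureSleep` are made up for this example and are not part of kfusion. The program sleeps, so almost no CPU time is consumed while real time keeps passing:

```cpp
#include <ctime>    // std::clock, CLOCKS_PER_SEC
#include <time.h>   // clock_gettime, CLOCK_MONOTONIC (POSIX)
#include <chrono>
#include <thread>

// CPU time consumed by this process, in seconds (the old approach).
double cpuSeconds() {
    return double(std::clock()) / CLOCKS_PER_SEC;
}

// Elapsed time from a monotonic clock, in seconds (the new approach).
double wallSeconds() {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return double(ts.tv_sec) + ts.tv_nsec / 1e9;
}

struct Elapsed {
    double cpu;   // seconds according to std::clock()
    double wall;  // seconds according to CLOCK_MONOTONIC
};

// Hypothetical helper: sleeps for `ms` milliseconds and reports how much
// time each clock thinks has passed in the meantime.
Elapsed measureSleep(int ms) {
    const double cpu0 = cpuSeconds();
    const double wall0 = wallSeconds();
    std::this_thread::sleep_for(std::chrono::milliseconds(ms));
    return { cpuSeconds() - cpu0, wallSeconds() - wall0 };
}
```

Calling `measureSleep(200)` should give a `wall` value of at least roughly 0.2 seconds and a `cpu` value close to zero. The error can also go the other way: in a multithreaded process the CPU time is summed over all threads, so it can exceed the elapsed time.
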
- CUDA kernel launches are asynchronous: the call returns to the host before the kernel has finished. Host-side timing is therefore only meaningful after `cudaDeviceSynchronize()`, but kfusion doesn't call it in many places (apart from the `total` time, which consequently is correct). Adding it before each `Stats.sample` fixes that, and changes my measurement for `integrate` from `0.00*` milliseconds to `3.*` milliseconds, which makes a lot of sense.

Note though that `cudaDeviceSynchronize()` can sometimes slow things down, although it hasn't done so in my measurements.
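
The effect of the missing synchronization can be reproduced without a GPU. The following is only an analogy in plain C++, not kfusion or CUDA code: `FakeDevice`, `secondsSince`, `Timings` and `timeFakeKernel` are hypothetical names, and the "kernel" is modelled as a period during which the device is busy. `launch()` returns immediately, like a kernel launch, and `synchronize()` plays the role of `cudaDeviceSynchronize()`:

```cpp
#include <algorithm>
#include <chrono>
#include <thread>

using Clock = std::chrono::steady_clock;

// Hypothetical stand-in for a GPU: queued work "runs" in the background,
// modelled as a point in time until which the device stays busy.
struct FakeDevice {
    Clock::time_point busyUntil = Clock::now();

    // Like a kernel launch: queues the work and returns immediately.
    void launch(std::chrono::milliseconds duration) {
        busyUntil = std::max(busyUntil, Clock::now()) + duration;
    }

    // Like cudaDeviceSynchronize(): blocks until all queued work is done.
    void synchronize() const {
        std::this_thread::sleep_until(busyUntil);
    }
};

// Seconds elapsed since `start`.
double secondsSince(Clock::time_point start) {
    return std::chrono::duration<double>(Clock::now() - start).count();
}

struct Timings {
    double withoutSync;  // sampled right after the launch
    double withSync;     // sampled after synchronizing
};

Timings timeFakeKernel(std::chrono::milliseconds kernelDuration) {
    FakeDevice device;
    const auto start = Clock::now();
    device.launch(kernelDuration);
    const double withoutSync = secondsSince(start);  // launch overhead only
    device.synchronize();
    const double withSync = secondsSince(start);     // launch plus execution
    return { withoutSync, withSync };
}
```

With a 50 ms fake kernel, `withoutSync` should come out close to zero while `withSync` is at least about 0.05 seconds, which mirrors the jump I saw for `integrate` once the synchronization was added.
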

