Commit 9a89b7d
Refactor binary HNSW stats to use OpenMP reduction instead of destructor sync (#4910)
Summary:
Pull Request resolved: #4910
## Problem
Follow-up to the previous fix (replacing `#pragma omp critical` with `#pragma omp atomic` in `FlatHammingDis::~FlatHammingDis`). This commit implements the architecturally correct fix by matching the pattern already used in the float HNSW path (`IndexHNSW.cpp`).
The previous approach had two issues:
1. Stats accumulation in a destructor is fragile — the timing and frequency of destructor calls depends on the caller's object lifetime management, not on the search algorithm
2. `FlatHammingDis` maintained a redundant `ndis` counter (incremented on every `operator()` call) that duplicated what `HNSW::search()` already tracks internally via its returned `HNSWStats`
## Changes
### `IndexBinaryHNSW::search()` — use `#pragma omp for reduction`
- Capture the `HNSWStats` returned by `hnsw.search()` per query
- Accumulate `n1, n2, ndis, nhops` via `#pragma omp for reduction(+: ...)`, which uses lock-free per-thread copies merged at barrier — zero contention
- Call `hnsw_stats.combine()` once, single-threaded, after the parallel region
- This exactly matches the pattern in `hnsw_search()` in `IndexHNSW.cpp` (lines 259-291)
### `FlatHammingDis` — remove stats accumulation entirely
- Remove `size_t ndis` field (redundant with `HNSW::search()` internal tracking)
- Remove `ndis++` from `operator()` (eliminates per-distance-computation overhead)
- Remove destructor (no stats to accumulate → no sync needed)
### `IndexBinaryHNSWCagra::search()` — fix missing stats aggregation
- The `base_level_only` path's second parallel block (`search_level_0`) accumulated stats into a per-thread `HNSWStats search_stats` but never aggregated it into `hnsw_stats` — this was a pre-existing bug where stats were silently lost
- Added `#pragma omp critical { hnsw_stats.combine(search_stats); }` once per thread at the end of the parallel block, matching `IndexHNSW::search_level_0_wrapper()` (line 461)
## Performance Impact
- **Zero lock contention** in the main `IndexBinaryHNSW::search()` path (reduction is lock-free)
- Removes per-distance-computation `ndis++` overhead from `FlatHammingDis::operator()`
- `IndexBinaryHNSWCagra` path: `#pragma omp critical` once per thread (acceptable, matches float HNSW)
- No change to search results — only stats accumulation mechanism changes
Reviewed By: mnorris11
Differential Revision: D959114401 parent b023178 commit 9a89b7d
1 file changed
Lines changed: 15 additions & 13 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
217 | 217 | | |
218 | 218 | | |
219 | 219 | | |
| 220 | + | |
| 221 | + | |
220 | 222 | | |
221 | 223 | | |
222 | 224 | | |
223 | 225 | | |
224 | 226 | | |
225 | 227 | | |
226 | | - | |
| 228 | + | |
227 | 229 | | |
228 | 230 | | |
229 | 231 | | |
230 | 232 | | |
231 | 233 | | |
232 | 234 | | |
233 | 235 | | |
234 | | - | |
| 236 | + | |
| 237 | + | |
| 238 | + | |
| 239 | + | |
| 240 | + | |
235 | 241 | | |
236 | 242 | | |
237 | 243 | | |
238 | 244 | | |
| 245 | + | |
| 246 | + | |
239 | 247 | | |
240 | 248 | | |
241 | 249 | | |
| |||
267 | 275 | | |
268 | 276 | | |
269 | 277 | | |
270 | | - | |
271 | 278 | | |
272 | 279 | | |
273 | 280 | | |
274 | | - | |
275 | 281 | | |
276 | 282 | | |
277 | 283 | | |
| |||
281 | 287 | | |
282 | 288 | | |
283 | 289 | | |
284 | | - | |
285 | | - | |
286 | | - | |
287 | | - | |
| 290 | + | |
288 | 291 | | |
289 | 292 | | |
290 | 293 | | |
291 | 294 | | |
292 | 295 | | |
293 | 296 | | |
294 | | - | |
295 | | - | |
296 | | - | |
297 | | - | |
298 | | - | |
299 | 297 | | |
300 | 298 | | |
301 | 299 | | |
| |||
405 | 403 | | |
406 | 404 | | |
407 | 405 | | |
| 406 | + | |
| 407 | + | |
| 408 | + | |
| 409 | + | |
408 | 410 | | |
409 | 411 | | |
410 | 412 | | |
| |||
0 commit comments