Commit aa3ce37
Fix race condition, memory management, debug output, and hashtable lookup in sorting.cpp (facebookresearch#5078)
Summary:
Pull Request resolved: facebookresearch#5078
Four fixes in `faiss/utils/sorting.cpp`:
**1. OpenMP directive fix in `fvec_argsort_parallel`**
The initialization loop used `#pragma omp parallel` without the `for` directive. This caused every thread to execute the entire loop independently rather than distributing iterations. With `nt` threads, each `permA[i]` was written by all `nt` threads concurrently — a data race under the C++ memory model (multiple unsynchronized writes to the same non-atomic location), and O(n * nt) wasted work instead of O(n). Fixed by changing to `#pragma omp parallel for`.
In practice, all threads write the same value (`permA[i] = i`), so the output was always correct despite the UB. The fix eliminates the undefined behavior and the redundant work.
**2. RAII memory management in `fvec_argsort_parallel`**
Replaced `new size_t[n]` / `delete[] perm2` with `std::vector<size_t>`. The old code had no realistic exception path between allocation and deallocation (all intermediate code is either C functions or non-throwing OpenMP regions), but the manual `new`/`delete` pattern is fragile against future edits that might introduce a throwing path. The `std::vector` provides RAII lifetime management with no behavioral change.
**3. Removed debug `printf` in `fvec_argsort_parallel`**
A leftover `printf("merge %d %d, %d threads\n", ...)` in the parallel merge loop wrote to stdout during normal operation. Removed.
**4. Missing early termination in `hashtable_int64_to_int64_lookup`**
The linear probing loop did not check for empty slots (`tab[slot * 2] == -1`). In an open-addressing hash table with no deletion support, an empty slot is definitive proof that the key was not inserted — the insert function would have placed it there or earlier. Without this check, lookups for absent keys probed every slot in the bucket before the wrap-around termination at `slot == hk_i`. The fix adds the standard empty-slot check, matching the structure of the insert function (`hashtable_int64_to_int64_add`). This is a performance optimization — the old code always returned the correct result (`-1` after a full bucket scan), just slower.
Reviewed By: mnorris11
Differential Revision: D100317917
fbshipit-source-id: aadfe33b1d76c34e04db7fe0c9b7ca53b4a30c711 parent b068fd9 commit aa3ce37
2 files changed
+87
-6
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
134 | 134 | | |
135 | 135 | | |
136 | 136 | | |
137 | | - | |
| 137 | + | |
138 | 138 | | |
139 | | - | |
| 139 | + | |
140 | 140 | | |
141 | 141 | | |
142 | 142 | | |
| |||
148 | 148 | | |
149 | 149 | | |
150 | 150 | | |
151 | | - | |
152 | | - | |
| 151 | + | |
| 152 | + | |
153 | 153 | | |
154 | 154 | | |
155 | 155 | | |
| |||
184 | 184 | | |
185 | 185 | | |
186 | 186 | | |
187 | | - | |
188 | 187 | | |
189 | 188 | | |
190 | 189 | | |
| |||
197 | 196 | | |
198 | 197 | | |
199 | 198 | | |
200 | | - | |
201 | 199 | | |
202 | 200 | | |
203 | 201 | | |
| |||
816 | 814 | | |
817 | 815 | | |
818 | 816 | | |
| 817 | + | |
| 818 | + | |
| 819 | + | |
| 820 | + | |
819 | 821 | | |
820 | 822 | | |
821 | 823 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
0 commit comments