Commit 6bd749e
Fix IDSelector leak via SearchParameters.sel setter (#5208)
Summary:
Pull Request resolved: #5208
A user-reported leak (#2996 follow-up) reproduces on faiss 1.14.1 when an `IDSelectorBatch` is assigned to a `SearchParameters` field after construction. Each iteration of the reporter's loop leaks ~160 MB; in a long-running serving process this grows without bound.
# Root cause
The Python wrapper for `IDSelectorBatch` carries a SWIG `thisown` flag that controls who calls C++ `delete`. The SWIG-generated property setter for `SearchParameters.sel` flips `thisown` to 0 on the assumption that the enclosing C++ object will take responsibility, but `SearchParameters`'s C++ destructor does not free `sel` (the field is borrowed by design). Result: the C++ `IDSelectorBatch` (a 40 MB sorted `std::vector<idx_t>` for the reporter's case) plus the retained numpy buffer are orphaned forever.
`handle_SearchParameters` in `class_wrappers.py` already protected the kwargs construction path (`SearchParameters(sel=x)`) by wrapping the assignment in `RememberSwigOwnership` and `add_to_referenced_objects`. The bare-attribute path was unprotected.
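The ownership flip can be modeled in plain Python. This is a toy sketch, not FAISS code: `FakeSelector`, `FakeParams`, `set_sel_like_swig`, and `LIVE_NATIVE` are hypothetical names, and the set stands in for C++ heap allocations.

```python
import gc

LIVE_NATIVE = set()  # stands in for C++ objects that have not been deleted


class FakeSelector:
    """Hypothetical stand-in for a SWIG wrapper such as IDSelectorBatch."""

    _next_handle = 0

    def __init__(self):
        FakeSelector._next_handle += 1
        self.handle = FakeSelector._next_handle
        self.thisown = True  # Python owns the native object at first
        LIVE_NATIVE.add(self.handle)

    def __del__(self):
        # Like a SWIG proxy: call "delete" only while the wrapper owns the object.
        if self.thisown:
            LIVE_NATIVE.discard(self.handle)


class FakeParams:
    """Hypothetical stand-in for SearchParameters, which only borrows `sel`."""

    def set_sel_like_swig(self, sel):
        sel.thisown = False            # the generated setter disowns the wrapper
        self._sel_handle = sel.handle  # only the raw "pointer" is kept

    # No destructor frees the borrowed selector.


def leak_one():
    params = FakeParams()
    params.set_sel_like_swig(FakeSelector())
    # Both the wrapper and params go out of scope here.


leak_one()
gc.collect()
leaked = len(LIVE_NATIVE)
print("native objects still alive:", leaked)
```

The wrapper is collected, but because it no longer owns the native object and nothing else frees it, the entry stays in `LIVE_NATIVE`.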
# Fix
Override `__setattr__` on the base `SearchParameters` class so the same ownership protection runs for both `SearchParameters(sel=x)` and `params.sel = x`. The override filters via `hasattr(v, "thisown")` so only SWIG-wrapped C++ objects go through the dance; SWIG internals like `self.this` (a `SwigPyObject` without `thisown`), bookkeeping lists, and plain Python values bypass it.
`replacement_init` is slimmed to a pure delegator: the protection moves into `replacement_setattr` so a single source of truth handles both code paths. Keeping both layers would double-wrap on kwargs construction.
The filter changes from a denylist (`type(v) not in (int, float, bool, str)`) to a duck-type check (`hasattr(v, "thisown")`). The new filter is stricter: numpy arrays, lists, and `None` no longer get appended. Safe in practice because `SearchParameters` SWIG fields are typed as C++ pointers and only legitimately accept SWIG wrappers; numpy buffers passed indirectly (e.g., to `IDSelectorBatch`) are kept alive by the inner wrapper's own `add_to_referenced_objects`.
The duck-type filter is field-name-agnostic, so other `SearchParameters` subclass fields with the same C++ shape (`Foo* field = nullptr` with no destructor freeing `field`) benefit from the same protection automatically. Examples include `SearchParametersPreTransform.index_params` and `SearchParametersIVF.quantizer_params`.
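A small comparison of the two filters on sample values, as a sketch. The two predicates are taken from the expressions quoted above; `FakeSwigWrapper`, `old_filter`, and `new_filter` are hypothetical names used only for illustration.

```python
class FakeSwigWrapper:
    """Hypothetical stand-in for a SWIG proxy object; real ones expose `thisown`."""

    thisown = True


def old_filter(v):
    # Previous denylist: track anything that is not a basic scalar type.
    return type(v) not in (int, float, bool, str)


def new_filter(v):
    # Duck-type check: track only objects that look like SWIG wrappers.
    return hasattr(v, "thisown")


samples = {
    "int": 3,
    "str": "x",
    "None": None,
    "list": [1, 2],
    "swig wrapper": FakeSwigWrapper(),
}
for name, value in samples.items():
    print(f"{name:13} old={old_filter(value)!s:5} new={new_filter(value)}")
```

`None` and the list pass the old filter but not the new one, which is the stricter behavior described above.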
# Per-field ref tracking
A naive implementation that uses the existing `add_to_referenced_objects` helper (which appends to a list) leaks under a different access pattern: `params = SearchParameters(); for ...: params.sel = ...` accumulates one entry per reassignment because nothing drops the prior ref. This is a common pattern in long-lived servers that initialize a `SearchParameters` once and rebind `params.sel` per request.
Fixed by using a per-field dict `self._sp_field_refs` keyed by attribute name. Reassigning the same field replaces the entry in the dict, so the prior wrapper loses its Python ref and the C++ object is freed. The else branch of `replacement_setattr` also drops any prior ref when the field is set to None or to a non-SWIG value, so explicit clear (`params.sel = None`) releases the C++ object immediately.
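A minimal sketch of the per-field dict, assuming a toy class rather than the real wrapper. Only the `_sp_field_refs` name comes from this diff; `TrackedParams` and `FakeSwigWrapper` are hypothetical, the `thisown` restore step is omitted, and `id(value)` stands in for the raw pointer the C++ side would hold.

```python
import gc
import weakref


class FakeSwigWrapper:
    """Hypothetical stand-in for a SWIG proxy (real ones expose `thisown`)."""

    def __init__(self):
        self.thisown = True


class TrackedParams:
    """Toy class that keeps one Python ref per field name."""

    def __setattr__(self, name, value):
        refs = self.__dict__.setdefault("_sp_field_refs", {})
        if hasattr(value, "thisown"):
            refs[name] = value    # overwrites any earlier ref for this field
            stored = id(value)    # toy "raw pointer": holds no Python ref
        else:
            refs.pop(name, None)  # None or a plain value drops the prior ref
            stored = value
        object.__setattr__(self, name, stored)


params = TrackedParams()
first = FakeSwigWrapper()
first_alive = weakref.ref(first)
params.sel = first
del first  # the dict entry is now the only strong reference

params.sel = FakeSwigWrapper()  # rebinding replaces the entry
gc.collect()
print("prior wrapper released:", first_alive() is None)

params.sel = None  # explicit clear
print("tracked refs after clear:", params._sp_field_refs)
```

With a list instead of a dict, the first wrapper would stay referenced for the lifetime of `params`, which is the accumulation pattern described above.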
# Subclass guard
`__setattr__` is installed only once per class hierarchy. SWIG generates per-class `__init__` slots but does NOT generate per-class `__setattr__`, so subclasses (`SearchParametersIVF`, `SearchParametersHNSW`, ...) inherit the override via MRO. Without the `_protected_setattr` sentinel, calling `handle_SearchParameters(SearchParametersIVF)` would capture the inherited `replacement_setattr` as `parent_setattr`, then install a second `replacement_setattr` on top, so chained calls would fire field tracking twice. The guard uses `getattr` (which walks the MRO) rather than a `__dict__` lookup (own attributes only) so subclasses correctly see the base's marker.
`replacement_init` does not need this guard because each SWIG class generates its own `__init__` in its own `__dict__`; wrapping always captures SWIG's fresh init, never an inherited replacement.
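The guard can be sketched on toy classes. `install_setattr_guarded`, `Base`, `Child`, and the `log` list are hypothetical; only the `_protected_setattr` marker name and the `getattr` check come from the description above.

```python
def install_setattr_guarded(cls, log):
    """Install a logging __setattr__ once per hierarchy (illustrative only)."""
    if getattr(cls, "_protected_setattr", False):  # getattr walks the MRO
        return False
    parent_setattr = cls.__setattr__

    def replacement_setattr(self, name, value):
        log.append((cls.__name__, name))  # stands in for the field tracking
        parent_setattr(self, name, value)

    cls.__setattr__ = replacement_setattr
    cls._protected_setattr = True
    return True


class Base:
    pass


class Child(Base):
    pass


log = []
installed_base = install_setattr_guarded(Base, log)
installed_child = install_setattr_guarded(Child, log)  # sees Base's marker
Child().nprobe = 4
print(installed_base, installed_child, log)
```

A check written as `"_protected_setattr" in Child.__dict__` would be false here, so the second call would wrap the inherited replacement and the tracking step would run twice per assignment.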
# Prior attempts and why this approach is targeted
Fixes for this bug have been attempted twice before via the SWIG layer:
- #3139 (Apr 2024, abandoned)
- #3810 (Sept 2024, changes requested)
Both added a `%typemap(in) SWIGTYPE *` in `swigfaiss.swig` that strips the `$disown` flag globally, so SWIG never transfers ownership for any pointer field assignment anywhere in the bindings. The minimal change is one typemap.
The trade-off proved untenable. PR 3810 surfaced the smoking gun while testing: `test_graph_based.test_io_no_storage` started crashing with heap-use-after-free at `index.storage = faiss.clone_index(index.storage)` with `own_fields = True`. The author worked around it by introducing a temp variable and flipping `own_fields = False`, but that flip is itself a behavior change to a documented FAISS contract that downstream users build on.
The right framing is not "SWIG's default ownership transfer is always wrong" (the typemap assumption) but "SWIG's default ownership transfer is wrong specifically for `SearchParameters` because that class lacks a destructor for its borrowed pointer fields." This diff scopes the change to that class hierarchy; every other SWIG-managed field keeps its existing semantics. No `swigfaiss.swig` change, no surprising downstream breakage.
The memory regression test pattern (`np.arange(5_000_000)` × N iterations, measure `faiss.get_mem_usage_kb`) was independently validated by PR 3810's `test_ownership_2` and is reused here. ASAN cannot catch this leak because the C++ object remains reachable from a Python wrapper that simply has no Python-side references; it is a Python-level reference leak, not a C++-level dangling pointer.
Reviewed By: mnorris11
Differential Revision: D104750178
fbshipit-source-id: 41f865028d0dc863fe560d1fd1bbe979dc8e8e8f
1 parent 6376bc3 commit 6bd749e
2 files changed
Lines changed: 98 additions & 9 deletions