Skip to content

Commit 6bd749e

Browse files
alibeklfcmeta-codesync[bot]
authored andcommitted
Fix IDSelector leak via SearchParameters.sel setter (#5208)
Summary: Pull Request resolved: #5208 A user-reported leak (#2996 follow-up) reproduces on faiss 1.14.1 when an `IDSelectorBatch` is assigned to a `SearchParameters` field after construction. Each iteration of the reporter's loop leaks ~160 MB; over a long-running serving process this grows unbounded. # Root cause The Python wrapper for `IDSelectorBatch` carries a SWIG `thisown` flag that controls who calls C++ `delete`. The SWIG-generated property setter for `SearchParameters.sel` flips `thisown` to 0 on the assumption that the enclosing C++ object will take responsibility, but `SearchParameters`'s C++ destructor does not free `sel` (the field is borrowed by design). Result: the C++ `IDSelectorBatch` (a 40 MB sorted `std::vector<idx_t>` for the reporter's case) plus the retained numpy buffer are orphaned forever. `handle_SearchParameters` in `class_wrappers.py` already protected the kwargs construction path (`SearchParameters(sel=x)`) by wrapping the assignment in `RememberSwigOwnership` and `add_to_referenced_objects`. The bare-attribute path was unprotected. # Fix Override `__setattr__` on the base `SearchParameters` class so the same ownership protection runs for both `SearchParameters(sel=x)` and `params.sel = x`. The override filters via `hasattr(v, "thisown")` so only SWIG-wrapped C++ objects go through the dance; SWIG internals like `self.this` (a `SwigPyObject` without `thisown`), bookkeeping lists, and plain Python values bypass it. `replacement_init` is slimmed to a pure delegator: the protection moves into `replacement_setattr` so a single source of truth handles both code paths. Keeping both layers would double-wrap on kwargs construction. The filter changes from a denylist (`type(v) not in (int, float, bool, str)`) to a duck-type check (`hasattr(v, "thisown")`). The new filter is stricter: numpy arrays, lists, and `None` no longer get appended. Safe in practice because `SearchParameters` SWIG fields are typed as C++ pointers and only legitimately accept SWIG wrappers; numpy buffers passed indirectly (e.g., to `IDSelectorBatch`) are kept alive by the inner wrapper's own `add_to_referenced_objects`. The duck-type filter is field-name-agnostic, so other `SearchParameters` subclass fields with the same C++ shape (`Foo* field = nullptr` with no destructor freeing `field`) benefit from the same protection automatically. Examples include `SearchParametersPreTransform.index_params` and `SearchParametersIVF.quantizer_params`. # Per-field ref tracking A naive implementation that uses the existing `add_to_referenced_objects` helper (which appends to a list) leaks under a different access pattern: `params = SearchParameters(); for ...: params.sel = ...` accumulates one entry per reassignment because nothing drops the prior ref. This is a common pattern in long-lived servers that initialize a `SearchParameters` once and rebind `params.sel` per request. Fixed by using a per-field dict `self._sp_field_refs` keyed by attribute name. Reassigning the same field replaces the entry in the dict, so the prior wrapper loses its Python ref and the C++ object is freed. The else branch of `replacement_setattr` also drops any prior ref when the field is set to None or to a non-SWIG value, so explicit clear (`params.sel = None`) releases the C++ object immediately. # Subclass guard `__setattr__` is installed only once per class hierarchy. SWIG generates per-class `__init__` slots but does NOT generate per-class `__setattr__`, so subclasses (`SearchParametersIVF`, `SearchParametersHNSW`, ...) inherit the override via MRO. Without the `_protected_setattr` sentinel, calling `handle_SearchParameters(SearchParametersIVF)` would capture the inherited `replacement_setattr` as `parent_setattr`, then install a second `replacement_setattr` on top - chained calls would fire field tracking twice. The guard uses `getattr` (which walks the MRO) rather than `__dict__` lookup (own-attribute only) so subclasses correctly see the base's marker. `replacement_init` does not need this guard because each SWIG class generates its own `__init__` in its own `__dict__`; wrapping always captures SWIG's fresh init, never an inherited replacement. # Prior attempts and why this approach is targeted This bug has been attempted twice before via the SWIG layer: - #3139 (Apr 2024, abandoned) - #3810 (Sept 2024, changes requested) Both added a `%typemap(in) SWIGTYPE *` in `swigfaiss.swig` that strips the `$disown` flag globally, so SWIG never transfers ownership for any pointer field assignment anywhere in the bindings. The minimal change is one typemap. The trade-off proved untenable. PR 3810 surfaced the smoking gun while testing: `test_graph_based.test_io_no_storage` started crashing with heap-use-after-free at `index.storage = faiss.clone_index(index.storage)` with `own_fields = True`. The author worked around it by introducing a temp variable and flipping `own_fields = False` - but that flip is itself a behavior change to a documented FAISS contract that downstream users build on. The right framing is not "SWIG's default ownership transfer is always wrong" (the typemap assumption) but "SWIG's default ownership transfer is wrong specifically for `SearchParameters` because that class lacks a destructor for its borrowed pointer fields." This diff scopes the change to that class hierarchy; every other SWIG-managed field keeps its existing semantics. No `swigfaiss.swig` change, no surprising downstream breakage. The memory regression test pattern (`np.arange(5_000_000)` × N iterations, measure `faiss.get_mem_usage_kb`) was independently validated by PR 3810's `test_ownership_2` and is reused here. ASAN cannot catch this leak because the C++ object remains reachable from a Python wrapper that simply has no Python-side references - it is a Python-level reference leak, not a C++-level dangling pointer. Reviewed By: mnorris11 Differential Revision: D104750178 fbshipit-source-id: 41f865028d0dc863fe560d1fd1bbe979dc8e8e8f
1 parent 6376bc3 commit 6bd749e

2 files changed

Lines changed: 98 additions & 9 deletions

File tree

faiss/python/class_wrappers.py

Lines changed: 27 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -1316,25 +1316,43 @@ def __exit__(self, *ignored):
13161316

13171317

13181318
def handle_SearchParameters(the_class):
1319-
""" this wrapper is to enable initializations of the form
1320-
SearchParametersXX(a=3, b=SearchParamsYY)
1321-
This also requires the enclosing class to keep a reference on the
1322-
sub-object, since the C++ code assumes the object ownership is
1323-
handled externally.
1319+
"""Protect SearchParameters from leaking SWIG-owned sub-objects assigned
1320+
via either kwargs construction (SearchParametersXX(sel=x)) or bare
1321+
attribute assignment (params.sel = x).
13241322
"""
13251323
the_class.original_init = the_class.__init__
13261324

13271325
def replacement_init(self, **args):
13281326
self.original_init()
13291327
for k, v in args.items():
13301328
assert hasattr(self, k)
1331-
with RememberSwigOwnership(v):
1332-
setattr(self, k, v)
1333-
if type(v) not in (int, float, bool, str):
1334-
add_to_referenced_objects(self, v)
1329+
setattr(self, k, v)
13351330

13361331
the_class.__init__ = replacement_init
13371332

1333+
# Install __setattr__ once per hierarchy; subclasses inherit via MRO.
1334+
if getattr(the_class, "_protected_setattr", False):
1335+
return
1336+
parent_setattr = the_class.__setattr__
1337+
1338+
def replacement_setattr(self, k, v):
1339+
# Per-field ref dict. Reassigning the same field drops the prior
1340+
# ref instead of accumulating, so a long-lived SearchParameters
1341+
# with repeated `params.sel = ...` does not leak.
1342+
if v is not None and hasattr(v, "thisown"):
1343+
if not hasattr(self, "_sp_field_refs"):
1344+
parent_setattr(self, "_sp_field_refs", {})
1345+
with RememberSwigOwnership(v):
1346+
parent_setattr(self, k, v)
1347+
self._sp_field_refs[k] = v
1348+
else:
1349+
parent_setattr(self, k, v)
1350+
if hasattr(self, "_sp_field_refs"):
1351+
self._sp_field_refs.pop(k, None)
1352+
1353+
the_class.__setattr__ = replacement_setattr
1354+
the_class._protected_setattr = True
1355+
13381356

13391357
def handle_IDSelectorSubset(the_class, class_owns, force_int64=True):
13401358
the_class.original_init = the_class.__init__

tests/test_referenced_objects.py

Lines changed: 71 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -84,6 +84,77 @@ def test_shards(self):
8484
gc.collect()
8585
index.search(xb, 10)
8686

87+
def test_SearchParameters_setattr_no_leak(self):
88+
# Regression: assigning IDSelectorBatch to params.sel after
89+
# construction must not leak. Before the fix, the SWIG property
90+
# setter flipped IDSelectorBatch.thisown to 0 but
91+
# SearchParameters never freed it -> ~80 MB orphaned per iter.
92+
n_ids = 5_000_000
93+
n_iters = 10
94+
95+
def body():
96+
params = faiss.SearchParameters()
97+
params.sel = faiss.IDSelectorBatch(np.arange(n_ids))
98+
del params
99+
100+
body() # warmup: prime allocator
101+
gc.collect()
102+
start_kb = faiss.get_mem_usage_kb()
103+
for _ in range(n_iters):
104+
body()
105+
gc.collect()
106+
growth_mb = (faiss.get_mem_usage_kb() - start_kb) / 1024.0
107+
# One leaked IDSelectorBatch is ~80 MB; 10 iters would leak
108+
# ~800 MB. 200 MB cap leaves generous headroom for allocator
109+
# noise but is well below the leak signal.
110+
self.assertLess(growth_mb, 200)
111+
112+
def test_SearchParameters_subclass_no_double_wrap(self):
113+
# Regression: SWIG does not generate per-class __setattr__, so
114+
# the ownership-protecting override must be installed only on
115+
# the base SearchParameters; otherwise subclasses inherit and
116+
# re-wrap, doubling the refcount on the assigned selector.
117+
params = faiss.SearchParametersIVF()
118+
sel = faiss.IDSelectorBatch(np.arange(10))
119+
before = sys.getrefcount(sel)
120+
params.sel = sel
121+
delta = sys.getrefcount(sel) - before
122+
self.assertEqual(delta, 1)
123+
124+
def test_SearchParameters_reassignment_no_accumulation(self):
125+
# Reassigning the same field on a long-lived SearchParameters
126+
# must release the prior ref. Without per-field tracking the
127+
# protection list grows unbounded - reproducing the original
128+
# leak shape under a different access pattern.
129+
params = faiss.SearchParameters()
130+
sel1 = faiss.IDSelectorBatch(np.arange(10))
131+
sel2 = faiss.IDSelectorBatch(np.arange(10))
132+
before1 = sys.getrefcount(sel1)
133+
params.sel = sel1
134+
params.sel = sel2
135+
# sel1 ref dropped when sel2 took its slot
136+
self.assertEqual(sys.getrefcount(sel1), before1)
137+
138+
def test_SearchParameters_setattr_else_branch(self):
139+
# Else branch coverage: setting a field to None drops the prior
140+
# SWIG ref; assigning two different fields keeps both alive
141+
# independently so dropping one does not affect the other.
142+
params = faiss.SearchParametersIVF()
143+
sel = faiss.IDSelectorBatch(np.arange(10))
144+
quant = faiss.SearchParameters()
145+
sel_before = sys.getrefcount(sel)
146+
quant_before = sys.getrefcount(quant)
147+
148+
params.sel = sel
149+
params.quantizer_params = quant
150+
self.assertEqual(sys.getrefcount(sel) - sel_before, 1)
151+
self.assertEqual(sys.getrefcount(quant) - quant_before, 1)
152+
153+
# Drop sel via None; quantizer_params untouched.
154+
params.sel = None
155+
self.assertEqual(sys.getrefcount(sel), sel_before)
156+
self.assertEqual(sys.getrefcount(quant) - quant_before, 1)
157+
87158

88159
dbin = 32
89160
xtbin = np.random.randint(256, size=(100, int(dbin / 8))).astype('uint8')

0 commit comments

Comments
 (0)