
Commit 1e69d5a

yantis authored and meta-codesync[bot] committed
Fix peak-memory spike when loading IVF invlists via IO_FLAG_MMAP_IFC (#5122)
Summary:
On a 148 GB `IndexIVFRaBitQ`, `faiss.read_index(path, IO_FLAG_MMAP_IFC)` currently spikes to **165 GB of committed private heap** for ~50 seconds during load, before dropping back to the intended ~9 GB steady state. The spike is transient but matches the entire invlist payload, which defeats the purpose of `IO_FLAG_MMAP_IFC` (keep invlist codes and ids file-cache-backed, not pagefile-committed).

Root cause is a two-loop pattern in `read_InvertedLists_up` (`faiss/impl/index_read.cpp`) for both the `ilar` (`ArrayInvertedLists`) and `ilpn` (`ArrayInvertedListsPanorama`) branches:

```cpp
// Loop A: pre-allocate every list's owning std::vector (= full invlist size on heap)
for (i = 0 .. nlist) {
    ails->codes[i].resize(sizes[i] * code_size);
    ails->ids[i].resize(sizes[i]);
}
// Loop B: replace each owning vector with a view into mmap
for (i = 0 .. nlist) {
    read_vector_with_known_size(ails->codes[i], ...);
    read_vector_with_known_size(ails->ids[i], ...);
}
```

`read_vector_with_known_size` detects a `MappedFileIOReader` and replaces the target `MaybeOwnedVector` via `create_view`, correctly releasing the owning storage. But Loop A has already committed 100% of the invlist size to the heap before Loop B can release any of it.

## Fix

Merge the two loops into a single pass per list, so each list's owning heap allocation is released via the view-substitution before the next list's is made. Peak heap during load is now bounded by a single list's worth (~one nprobe cluster, typically under 1 MB), not the sum of all lists'.

Applied to both the `ilar` and `ilpn` branches. End state is byte-identical: with `MappedFileIOReader` every `MaybeOwnedVector` ends up as a view; with a regular `FileIOReader` every `MaybeOwnedVector` ends up as owning storage.

## Measured impact

Same IndexIVFRaBitQ file (IVF262144_HNSW32,RaBitQ8, 99.7 M vectors, nlist=262144, code_size=1556, 148 GB on disk), loaded via `faiss.read_index(path, IO_FLAG_MMAP_IFC)`:

| Metric              |  Before | After |
|---------------------|--------:|------:|
| Peak private memory |  164 GB | 10 GB |
| Load time           |    99 s |  16 s |
| Steady-state RSS    |    9 GB |  9 GB |

Load time drops because ~90 s of the original was spent in `std::vector::resize` zero-filling pages that were about to be discarded.

Pull Request resolved: #5122

Test Plan:
- [x] Manual: `faiss.read_index(path, IO_FLAG_MMAP_IFC)` on the 148 GB production index; `search(q, 20)` at `nprobe=64` returns the same top-k ids before and after the patch.
- [x] Manual: 1 Hz private-memory polling during a 10-query, `k=500`, `nprobe=256` stress; RSS grows identically to the pre-patch behaviour (file-cache page-fault growth, not private heap), confirming the view substitution still works.
- [ ] Existing FAISS CI: no new tests added. The bug is a transient peak that existing tests cannot observe, and the fix is an ordering change with no new code path.

## Safety

- No API, ABI, serialization-format, or behavioural change.
- `FAISS_CHECK_DESERIALIZATION_LOOP_LIMIT` hardening still runs in the same place, before the merged loop.
- Works identically on `FileIOReader` (heap-owning end state) and `MappedFileIOReader` (view end state).

Made with [Cursor](https://cursor.com)

Reviewed By: mdouze

Differential Revision: D103055438

Pulled By: mnorris11

fbshipit-source-id: 5cdc371aa78547f00a4e8e854cac84945d81b6b5
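The ordering argument in the summary can be checked with a toy model. The following is a minimal sketch, not FAISS code: `HeapMeter`, `load_two_loops`, and `load_single_pass` are hypothetical names, the list sizes are made up, and the view substitution is modelled simply as freeing the buffer that was just allocated.

```python
from dataclasses import dataclass


@dataclass
class HeapMeter:
    """Tracks live and peak bytes of simulated owning allocations."""
    live: int = 0
    peak: int = 0

    def alloc(self, nbytes: int) -> None:
        self.live += nbytes
        self.peak = max(self.peak, self.live)

    def free(self, nbytes: int) -> None:
        self.live -= nbytes


def load_two_loops(list_bytes, mmap_backed=True):
    """Pre-patch ordering: resize every list, then read every list."""
    heap = HeapMeter()
    for nbytes in list_bytes:      # Loop A: allocate all owning buffers
        heap.alloc(nbytes)
    for nbytes in list_bytes:      # Loop B: swap each buffer for a view
        if mmap_backed:
            heap.free(nbytes)      # owning storage released by the swap
    return heap


def load_single_pass(list_bytes, mmap_backed=True):
    """Post-patch ordering: resize and read one list at a time."""
    heap = HeapMeter()
    for nbytes in list_bytes:
        heap.alloc(nbytes)
        if mmap_backed:
            heap.free(nbytes)      # released before the next allocation
    return heap


# Hypothetical per-list payload sizes in bytes.
sizes = [600_000, 250_000, 900_000, 0, 400_000]
before = load_two_loops(sizes)
after = load_single_pass(sizes)
print("two loops peak:  ", before.peak)   # sum of all lists
print("single pass peak:", after.peak)    # largest single list
```

In this model both orderings finish with the same live byte count, whether or not the reader is mmap-backed; only the transient peak differs, which is the behaviour the commit message describes.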
1 parent 28b2b66 commit 1e69d5a

2 files changed

Lines changed: 52 additions & 17 deletions


faiss/impl/index_read.cpp

Lines changed: 19 additions & 16 deletions
@@ -518,16 +518,19 @@ std::unique_ptr<InvertedLists> read_InvertedLists_up(
                 nlist, code_size, n_levels, bs);
         std::vector<size_t> sizes(nlist);
         read_ArrayInvertedLists_sizes(f, sizes);
+        // Do resize + read in a single pass per list. See the matching
+        // comment in the `ilar` branch below for rationale.
         size_t byte_limit = get_deserialization_vector_byte_limit();
         for (size_t i = 0; i < nlist; i++) {
+            size_t n = sizes[i];
             FAISS_THROW_IF_NOT_FMT(
-                    sizes[i] <= byte_limit / sizeof(idx_t),
+                    n <= byte_limit / sizeof(idx_t),
                     "inverted list %zu ids size %zu exceeds "
                     "deserialization byte limit",
                     i,
-                    sizes[i]);
-            ailp->ids[i].resize(sizes[i]);
-            size_t num_elems = ((sizes[i] + bs - 1) / bs) * bs;
+                    n);
+            ailp->ids[i].resize(n);
+            size_t num_elems = ((n + bs - 1) / bs) * bs;
             size_t codes_bytes = mul_no_overflow(
                     num_elems, code_size, "inverted list codes");
             FAISS_THROW_IF_NOT_FMT(
@@ -549,9 +552,6 @@ std::unique_ptr<InvertedLists> read_InvertedLists_up(
                     i,
                     cum_sums_count);
             ailp->cum_sums[i].resize(cum_sums_count);
-        }
-        for (size_t i = 0; i < nlist; i++) {
-            size_t n = sizes[i];
             if (n > 0) {
                 read_vector_with_known_size(
                         ailp->codes[i], f, ailp->codes[i].size());
@@ -627,27 +627,30 @@ std::unique_ptr<InvertedLists> read_InvertedLists_up(
         ails->codes.resize(ails->nlist);
         std::vector<size_t> sizes(ails->nlist);
         read_ArrayInvertedLists_sizes(f, sizes);
+        // Resize + read in a single pass per list so that each list's
+        // heap allocation is released by the mmap view-substitution
+        // before the next list is allocated. This bounds peak heap to
+        // one list's worth of memory, which matters for large IVF
+        // indexes (hundreds of GB) under IO_FLAG_MMAP_IFC.
         size_t ilar_byte_limit = get_deserialization_vector_byte_limit();
-        for (size_t i = 0; i < ails->nlist; i++) {
+        for (size_t i = 0; i < sizes.size(); i++) {
+            size_t n = sizes[i];
             FAISS_THROW_IF_NOT_FMT(
-                    sizes[i] <= ilar_byte_limit / sizeof(idx_t),
+                    n <= ilar_byte_limit / sizeof(idx_t),
                     "inverted list %zu ids size %zu exceeds "
                     "deserialization byte limit",
                     i,
-                    sizes[i]);
-            ails->ids[i].resize(sizes[i]);
-            size_t codes_bytes = mul_no_overflow(
-                    sizes[i], ails->code_size, "inverted list codes");
+                    n);
+            ails->ids[i].resize(n);
+            size_t codes_bytes =
+                    mul_no_overflow(n, ails->code_size, "inverted list codes");
             FAISS_THROW_IF_NOT_FMT(
                     codes_bytes <= ilar_byte_limit,
                     "inverted list %zu codes size %zu exceeds "
                     "deserialization byte limit",
                     i,
                     codes_bytes);
             ails->codes[i].resize(codes_bytes);
-        }
-        for (size_t i = 0; i < ails->nlist; i++) {
-            size_t n = ails->ids[i].size();
             if (n > 0) {
                 read_vector_with_known_size(
                         ails->codes[i],
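
The size arithmetic in the two branches above differs in one step: the `ilpn` branch rounds the list size up to a multiple of `bs` before multiplying by `code_size`, while the `ilar` branch multiplies `n` directly. A rough Python sketch of that arithmetic follows. The function names and the sample values are hypothetical, `bs` is assumed to be a positive block size, and Python integers do not overflow, so the `mul_no_overflow` guard has no direct counterpart here.

```python
def padded_count(n: int, block_size: int) -> int:
    """Round n up to the next multiple of block_size (0 stays 0)."""
    return ((n + block_size - 1) // block_size) * block_size


def codes_bytes(n: int, code_size: int, byte_limit: int, block_size: int = 1) -> int:
    """Size in bytes of a list's code buffer, rejected if over the limit.

    block_size=1 mirrors the unpadded `ilar` computation; a larger value
    mirrors the padded `ilpn` computation.
    """
    total = padded_count(n, block_size) * code_size
    if total > byte_limit:
        raise ValueError(f"codes size {total} exceeds byte limit {byte_limit}")
    return total


# Hypothetical values chosen only for illustration.
limit = 1 << 20
print(codes_bytes(129, 16, limit))                  # unpadded: 129 entries
print(codes_bytes(129, 16, limit, block_size=128))  # padded up to 256 entries
```

The ids check in the diff is written as `n <= byte_limit / sizeof(idx_t)`, dividing the limit rather than multiplying `n`, which avoids an overflowing multiplication in C++.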

tests/test_io.py

Lines changed: 33 additions & 1 deletion
@@ -574,7 +574,39 @@ def test_mmap(self):
                 # unless index2 is collected by a GC
                 try:
                     os.unlink(fname)
-                except:
+                except Exception:
+                    pass
+
+    @unittest.skipIf(
+        platform.system() not in ["Windows", "Linux", "Darwin"],
+        "supported OSes only"
+    )
+    def test_mmap_ivf(self):
+        d, nlist = 32, 64
+        xt, xb, xq = get_dataset_2(d, 2000, 5000, 50)
+        index = faiss.index_factory(d, f"IVF{nlist},Flat")
+        index.train(xt)
+        index.add(xb)
+        index.nprobe = 8
+        Dref, Iref = index.search(xq, 10)
+
+        fd, fname = tempfile.mkstemp()
+        os.close(fd)
+
+        index2 = None
+        try:
+            faiss.write_index(index, fname)
+            index2 = faiss.read_index(fname, faiss.IO_FLAG_MMAP_IFC)
+            index2.nprobe = 8
+            Dnew, Inew = index2.search(xq, 10)
+            np.testing.assert_array_equal(Iref, Inew)
+            np.testing.assert_array_equal(Dref, Dnew)
+        finally:
+            del index2
+            if os.path.exists(fname):
+                try:
+                    os.unlink(fname)
+                except Exception:
                     pass
 
     def test_zerocopy(self):
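
The new test checks that an mmap-loaded IVF index returns the same results as the in-memory one. The distinction it relies on, a view into a mapped file versus an owning copy, can be illustrated with the Python standard library alone. This is only an analogy under stated assumptions: it uses `mmap` and `memoryview`, not FAISS's `MappedFileIOReader` or `MaybeOwnedVector`, and the payload and slice offsets are arbitrary.

```python
import mmap
import os
import tempfile

payload = bytes(range(256)) * 16  # 4096 bytes of sample data

fd, fname = tempfile.mkstemp()
os.close(fd)
try:
    with open(fname, "wb") as f:
        f.write(payload)

    with open(fname, "rb") as f:
        mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        full = memoryview(mapped)
        view = full[1024:2048]        # a view: no bytes are copied
        owned = bytes(view)           # an owning copy on the Python heap
        same = owned == payload[1024:2048]
        view_is_readonly = view.readonly
        # Release the views before closing the mapping; closing while
        # buffers are still exported raises BufferError.
        view.release()
        full.release()
        mapped.close()
finally:
    try:
        os.unlink(fname)
    except OSError:
        pass

print(same, view_is_readonly, len(owned))
```

As in the test above, the file is removed in a `finally` block after the mapping is released, since a file that is still mapped may not be deletable on some platforms.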
