Current Behavior
When using IndexFlatPanorama with METRIC_L2, adding the same offset vector to all stored vectors and the query can change the Top-1 result from the correct nearest neighbor to a wrong one.
This violates the translation invariance of L2 distance.
For squared L2 distance, if all database vectors and query vectors are translated by the same offset vector, all pairwise differences should remain unchanged:
d(x + o, q + o) = d(x, q)
Therefore, the squared L2 distances and the Top-k ordering should remain unchanged.
However, IndexFlatPanorama returns a different Top-1 after the translation. In the reproduction below, the original Top-1 is id 1, but after adding the same offset to both database vectors and the query, IndexFlatPanorama returns id 0.
This is not caused by approximate search. The reproduction uses IndexFlatPanorama as a flat-search index with a tiny deterministic dataset. It is also not caused by ties: the ground-truth distances are clearly different. In the same transformed float32 input, ordinary IndexFlatL2 still returns the correct Top-1.
Actual output on the affected path:
IndexFlatL2 original: id 1, distance 0.0001000000
IndexFlatL2 translated: id 1, distance 0.0001000507
IndexFlatPanorama original: id 1, distance 0.0001000017
IndexFlatPanorama translated: id 0, distance 0.0
Expected translated Top-1: id 1
Actual Panorama translated Top-1: id 0
For example, before translation:
q = [0.12573022, -0.13210486]
id 0 = [0.14393789, -0.11259123]
id 1 = [0.12316237, -0.14176954]
id 1 is the nearest vector.
After adding the same offset to the query and all database vectors:
offset = [91.70051, 111.15123]
the relative differences should remain unchanged. Therefore, id 1 should still be the nearest vector. Instead, IndexFlatPanorama returns id 0.
Steps to Reproduce
The following script creates two indexes:
one IndexFlatL2 index, used as a reference;
one IndexFlatPanorama index, which shows the wrong Top-1 after translation.
The script searches before and after adding the same offset vector to both the database vectors and the query.
import faiss
import numpy as np
np.set_printoptions(precision=10, suppress=True)
q = np.array([[0.12573022, -0.13210486]], dtype=np.float32)
# id 0 is farther, id 1 is nearest.
xb = np.array(
[
[0.14393789, -0.11259123],
[0.12316237, -0.14176954],
],
dtype=np.float32,
)
offset = np.array([91.70051, 111.15123], dtype=np.float32)
def run_indexflatl2():
index = faiss.IndexFlatL2(2)
index.add(xb)
d0, i0 = index.search(q, 1)
index2 = faiss.IndexFlatL2(2)
index2.add(xb + offset)
d1, i1 = index2.search(q + offset, 1)
return d0, i0, d1, i1
def run_panorama():
index = faiss.IndexFlatPanorama(2, faiss.METRIC_L2, 1, 1)
index.add(xb)
d0, i0 = index.search(q, 1)
index2 = faiss.IndexFlatPanorama(2, faiss.METRIC_L2, 1, 1)
index2.add(xb + offset)
d1, i1 = index2.search(q + offset, 1)
return d0, i0, d1, i1
def l2sqr(qv, xbv):
diff = xbv.astype(np.float64) - qv.astype(np.float64)
return np.sum(diff * diff, axis=1)
fd0, fi0, fd1, fi1 = run_indexflatl2()
pd0, pi0, pd1, pi1 = run_panorama()
gt_original = l2sqr(q[0], xb)
gt_translated = l2sqr((q + offset)[0], xb + offset)
print("faiss version:", getattr(faiss, "__version__", "unknown"))
print()
print("ground truth original distances:", gt_original)
print("ground truth original order:", np.argsort(gt_original))
print("ground truth translated distances:", gt_translated)
print("ground truth translated order:", np.argsort(gt_translated))
print()
print("IndexFlatL2 original:")
print("labels =", fi0)
print("distances =", fd0)
print()
print("IndexFlatL2 translated:")
print("labels =", fi1)
print("distances =", fd1)
print()
print("IndexFlatPanorama original:")
print("labels =", pi0)
print("distances =", pd0)
print()
print("IndexFlatPanorama translated:")
print("labels =", pi1)
print("distances =", pd1)
expected = int(fi0[0, 0])
actual = int(pi1[0, 0])
print()
print("Expected translated Top-1:", expected)
print("Actual Panorama translated Top-1:", actual)
if expected != actual:
print("BUG REPRODUCED")
print(
{
"mr": "VectorTranslationMutator",
"metric": "L2",
"index": "IndexFlatPanorama",
"nlevels": 1,
"batch_size": 1,
"offset": offset.tolist(),
}
)
else:
print("No bug reproduced")
Expected Behavior
For squared L2 distance, translating all stored vectors and the query by the same offset should preserve all pairwise differences.
Example:
q = [0.12573022, -0.13210486]
r0 = [0.14393789, -0.11259123]
r1 = [0.12316237, -0.14176954]
offset = [91.70051, 111.15123]
After translation:
q' = q + offset
r0' = r0 + offset
r1' = r1 + offset
The differences remain unchanged:
r0' - q' = r0 - q
r1' - q' = r1 - q
Therefore, the squared L2 distances and the Top-1 result should remain unchanged.
Before translation, id 1 is the nearest vector. After translation, id 1 should still be the nearest vector.
Expected translated IndexFlatPanorama Top-1:
Actual Behavior
IndexFlatPanorama returns id 1 before translation, but id 0 after translation:
IndexFlatPanorama original:
labels = [[1]]
distances = [[0.0001000017]]
IndexFlatPanorama translated:
labels = [[0]]
distances = [[0.0]]
This is incorrect because the translation does not change the relative geometry of the vectors. The expected translated Top-1 is still id 1.
In the same transformed float32 input, IndexFlatL2 still returns the correct Top-1:
IndexFlatL2 translated:
labels = [[1]]
distances = [[0.0001000507]]
This shows that the wrong result is specific to the Panorama flat-search path.
Why this is a bug
IndexFlatPanorama is a flat-search variant, not an approximate or quantized index. For L2 distance, adding the same offset to every database vector and the query should not change the Top-k ordering.
The observed behavior violates this invariant. The final reported L2 distance in IndexFlatPanorama appears to come from a norm-expansion computation of the form:
||x||^2 + ||y||^2 - 2 * <x, y>
After translation, the norm and dot-product terms become large and close to each other. In float32, this can suffer numerical cancellation. In the reproduction above, the translated Panorama distances collapse to 0.0, and the exact Top-1 changes from id 1 to id 0.
Environment
Faiss Python API.
Reproduced with faiss==1.14.2.
Current Behavior
When using IndexFlatPanorama with METRIC_L2, adding the same offset vector to all stored vectors and the query can change the Top-1 result from the correct nearest neighbor to a wrong one.
This violates the translation invariance of L2 distance.
For squared L2 distance, if all database vectors and query vectors are translated by the same offset vector, all pairwise differences should remain unchanged:
Therefore, the squared L2 distances and the Top-k ordering should remain unchanged.
However, IndexFlatPanorama returns a different Top-1 after the translation. In the reproduction below, the original Top-1 is id 1, but after adding the same offset to both database vectors and the query, IndexFlatPanorama returns id 0.
This is not caused by approximate search. The reproduction uses IndexFlatPanorama as a flat-search index with a tiny deterministic dataset. It is also not caused by ties: the ground-truth distances are clearly different. In the same transformed float32 input, ordinary IndexFlatL2 still returns the correct Top-1.
Actual output on the affected path:
IndexFlatL2 original: id 1, distance 0.0001000000
IndexFlatL2 translated: id 1, distance 0.0001000507
IndexFlatPanorama original: id 1, distance 0.0001000017
IndexFlatPanorama translated: id 0, distance 0.0
Expected translated Top-1: id 1
Actual Panorama translated Top-1: id 0
For example, before translation:
id 1 is the nearest vector.
After adding the same offset to the query and all database vectors:
offset =
[91.70051, 111.15123]the relative differences should remain unchanged. Therefore, id 1 should still be the nearest vector. Instead, IndexFlatPanorama returns id 0.
Steps to Reproduce
The following script creates two indexes:
one IndexFlatL2 index, used as a reference;
one IndexFlatPanorama index, which shows the wrong Top-1 after translation.
The script searches before and after adding the same offset vector to both the database vectors and the query.
Expected Behavior
For squared L2 distance, translating all stored vectors and the query by the same offset should preserve all pairwise differences.
Example:
After translation:
The differences remain unchanged:
Therefore, the squared L2 distances and the Top-1 result should remain unchanged.
Before translation, id 1 is the nearest vector. After translation, id 1 should still be the nearest vector.
Expected translated IndexFlatPanorama Top-1:
Actual Behavior
IndexFlatPanorama returns id 1 before translation, but id 0 after translation:
IndexFlatPanorama translated:
This is incorrect because the translation does not change the relative geometry of the vectors. The expected translated Top-1 is still id 1.
In the same transformed float32 input, IndexFlatL2 still returns the correct Top-1:
IndexFlatL2 translated:
This shows that the wrong result is specific to the Panorama flat-search path.
Why this is a bug
IndexFlatPanorama is a flat-search variant, not an approximate or quantized index. For L2 distance, adding the same offset to every database vector and the query should not change the Top-k ordering.
The observed behavior violates this invariant. The final reported L2 distance in IndexFlatPanorama appears to come from a norm-expansion computation of the form:
After translation, the norm and dot-product terms become large and close to each other. In float32, this can suffer numerical cancellation. In the reproduction above, the translated Panorama distances collapse to 0.0, and the exact Top-1 changes from id 1 to id 0.
Environment