Commit 0c1a952
[FSTORE-1970][APPEND] Fix similarity-search find_neighbors on OpenSearch 2.19.5 (k-too-large parsing + faiss efficient filtering) (#1008)
* [FSTORE-1970][APPEND] Parse OpenSearch 2.19.5 k-NN "k too large" error
OpenSearch 2.19.5 (k-NN plugin) reports an out-of-range k via
KNNQueryBuilder.Builder.validate() as
"[knn] requires k to be in the range (0, N]", whereas 1.3.6 reported
"[knn] requires k <= N". The vector-DB error parser only matched the
old form, so the max-k discovery probe in
VectorDbClient._find_neighbors could not extract the limit, left the
exception info empty, and re-raised instead of caching the limit.
find_neighbors() on a project index therefore failed with
"Requested k is too large".
Match both message forms (the new (0, N] range and the retained <= N
form, both present in the 2.19.5 source). Also tighten the guard so it
only classifies genuine upper-bound violations as REQUESTED_K_TOO_LARGE:
"[knn] requires k > 0" and "[knn] requires exactly one of k, distance
or score to be set" now fall through to OTHERS. The "Result window is
too large" parser is unchanged; its bracketed-number format is verified
against DefaultSearchContext.java at tag 2.19.5.
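
A minimal sketch of the two-form match described above, for illustration only. The names `parse_max_k`, `_K_RANGE` and `_K_LE` are hypothetical and the real parser may be structured differently. Only the message texts are taken from this commit.

```python
import re
from typing import Optional

# Hypothetical patterns for illustration; only the message texts come from
# the commit description, not the pattern names or their layout.
_K_RANGE = re.compile(r"\[knn\] requires k to be in the range \(0, (\d+)\]")
_K_LE = re.compile(r"\[knn\] requires k <= (\d+)")


def parse_max_k(message: str) -> Optional[int]:
    """Return the upper bound N for a "k too large" message, else None."""
    for pattern in (_K_RANGE, _K_LE):
        match = pattern.search(message)
        if match:
            return int(match.group(1))
    # Messages such as "[knn] requires k > 0" are not upper-bound
    # violations, so they yield None and can be classified as OTHERS.
    return None
```

Returning None for the lower-bound and "exactly one of" messages is what lets the caller route them to OTHERS rather than REQUESTED_K_TOO_LARGE.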
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* [FSTORE-1970][APPEND] Use faiss efficient k-NN filtering so find_neighbors returns k
The ee OpenSearch 2.19.5 upgrade switched the embedding-index engine from
nmslib to faiss. PR #951 had moved the similarity-search query to faiss
efficient filtering (the filter nested inside the knn clause), but #1005
reverted it to a bool/must post-filter after hitting "[knn] unknown token
[START_OBJECT] after [filter]" — the signature of an engine that does not
support in-knn filtering (nmslib), most likely seen against an index not
yet recreated under faiss.
Post-filtering retrieves the k nearest first and prunes with the filter
afterwards, so a selective filter (the per-feature-group exists clause on
a shared project index) returns fewer than k results, e.g. find_neighbors
k=10 returning 7. faiss has supported efficient filtering since OpenSearch
2.9 (GA in 2.19) and applies the filter during graph traversal, guaranteeing
k results when at least k exist.
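
The difference can be reproduced with a toy brute-force search. This is a sketch under assumptions: it uses one-dimensional points and made-up helper names, not the faiss graph traversal itself, and only shows why pruning after the top-k cut loses results.

```python
def nearest(points, query, k, keep=lambda p: True):
    """Brute-force k nearest (squared distance) among points passing `keep`."""
    candidates = [p for p in points if keep(p)]
    return sorted(candidates, key=lambda p: (p["x"] - query) ** 2)[:k]


def post_filter_search(points, query, k, keep):
    # Take the k nearest first, then prune: may return fewer than k.
    return [p for p in nearest(points, query, k) if keep(p)]


def efficient_filter_search(points, query, k, keep):
    # Apply the filter while selecting candidates: returns k if k matches exist.
    return nearest(points, query, k, keep)


# Toy shared index: every third point belongs to feature group "b".
points = [{"x": float(i), "fg": "a" if i % 3 else "b"} for i in range(30)]


def only_a(p):
    return p["fg"] == "a"


post = post_filter_search(points, 0.0, 10, only_a)
efficient = efficient_filter_search(points, 0.0, 10, only_a)
```

With this data the post-filter variant returns fewer than the requested 10 hits, while the filter-first variant returns all 10.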
Re-apply the efficient-filter query form (reverting #1005). The explicit
oversized-k path (find_neighbors k=2**31-1) still raises
VectorDatabaseException via the first search, and the parser fix in this
branch keeps the "k too large" message parseable on 2.19.5.
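
For reference, the two request shapes discussed above can be sketched as plain dicts. The builder names and the example field names are hypothetical, and the actual query built by the client may carry additional clauses. The nesting of `filter` inside the `knn` clause follows the OpenSearch k-NN query DSL.

```python
def efficient_filter_query(field, vector, k, filter_clause):
    """knn query with the filter nested inside the knn clause."""
    return {
        "size": k,
        "query": {
            "knn": {field: {"vector": vector, "k": k, "filter": filter_clause}}
        },
    }


def post_filter_query(field, vector, k, filter_clause):
    """bool/must form: knn runs first, the filter prunes its k hits."""
    return {
        "size": k,
        "query": {
            "bool": {
                "must": [
                    {"knn": {field: {"vector": vector, "k": k}}},
                    filter_clause,
                ]
            }
        },
    }


# Hypothetical field names, for illustration only.
exists = {"exists": {"field": "fg_1_embedding"}}
query = efficient_filter_query("fg_1_embedding", [0.1, 0.2], 10, exists)
```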
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1 parent dcc96b2 commit 0c1a952
4 files changed
Lines changed: 55 additions & 26 deletions
File tree
- python
- hopsworks_common/core
- hsfs/core
- tests/core