[BUG] Inconsistent KNN Search Results Across Identically Configured Indices

### Describe the bug

I'm using OpenSearch Vector Store version 2.17.1 and ran into an issue after creating 10 indices, each containing approximately 4000 .txt documents. The documents are a mix of physical models and table descriptions. All indices were created with the following configuration:

```
{
  "settings": {
    "index": {
      "number_of_shards": "2",
      "knn.algo_param": {
        "ef_search": "512"
      },
      "knn": "true"
    }
  },
  "mappings": {
    "properties": {
      "id": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "metadata": {
        "properties": {
          "source": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      },
      "text": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "vector_field": {
        "type": "knn_vector",
        "dimension": 1024,
        "method": {
          "engine": "nmslib",
          "space_type": "l2",
          "name": "hnsw",
          "parameters": {
            "ef_construction": 512,
            "m": 16
          }
        }
      }
    }
  }
}
```

However, when running the same vector search query against each of these indices, I observed inconsistent results: the same query returned different top documents in some indices, even though they were identically configured.

To work around this issue, I experimented with raising ef_search at query time from 512 (the value set in the index settings above) to 4096 using the following query:

```
{
  "size": k,
  "query": {
    "knn": {
      "vector_field": {
        "vector": question,
        "k": k,
        "method_parameters": {
          "ef_search": 4096
        }
      }
    }
  }
}
```

With this change, the search results became consistent across all indices, which is the expected behavior.
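For reference, the query-time override above can be wrapped in a small helper so that the same body is sent to every index. This is only a sketch: `build_knn_query` is a hypothetical name, and the defaults simply mirror the mapping and query shown in this report.

```python
from typing import Any, Dict, List, Optional


def build_knn_query(
    vector: List[float],
    k: int,
    ef_search: Optional[int] = None,
    field: str = "vector_field",
) -> Dict[str, Any]:
    """Build a k-NN query body, optionally overriding ef_search at query time.

    Hypothetical helper: it only assembles the JSON body used in this report.
    """
    knn_clause: Dict[str, Any] = {"vector": vector, "k": k}
    if ef_search is not None:
        # Query-time override. When omitted, the index-level
        # knn.algo_param.ef_search setting (512 in this report) applies.
        knn_clause["method_parameters"] = {"ef_search": ef_search}
    return {"size": k, "query": {"knn": {field: knn_clause}}}


# Body equivalent to the workaround query, with k = 5 and a toy 3-d vector.
body = build_knn_query([0.1, 0.2, 0.3], k=5, ef_search=4096)
```

Calling the helper without `ef_search` produces the original query, which makes it easy to run both variants against each index and compare them.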

### Questions

1. Is this inconsistent behavior with the index-level ef_search of 512 expected?

2. Is increasing ef_search to a high value (e.g., 4096) the correct and recommended approach to ensure consistent top-k results across identically configured indices?

3. Should I expect some level of randomness due to the HNSW algorithm, or is this pointing to an underlying issue?

Please let me know if you need any additional setup information.

### Related component

Search

### To Reproduce

1. Create 10 (or more) OpenSearch indices using the settings and mappings shown above.

2. Index approximately 4000 .txt documents per index, with content varying between table descriptions and physical models.

3. Run a knn query (without method_parameters, so the index-level ef_search of 512 applies) on all indices using the same vector input.

4. Observe the inconsistency in top-k documents returned across the indices.

5. Modify the same query to include ef_search: 4096 in method_parameters.

6. Observe that the results become consistent and repeatable across all indices.
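To make steps 4 and 6 measurable, the top-k document IDs returned by each index can be compared pairwise. The sketch below is an illustration only: it assumes the same documents are identifiable across indices by a shared key (for example `metadata.source`), and the function names and sample IDs are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def topk_overlap(ids_a: List[str], ids_b: List[str]) -> float:
    """Jaccard overlap of two top-k ID lists (1.0 means the same set of documents)."""
    set_a, set_b = set(ids_a), set(ids_b)
    if not set_a and not set_b:
        return 1.0  # two empty result lists are trivially identical
    return len(set_a & set_b) / len(set_a | set_b)


def pairwise_overlap(results: Dict[str, List[str]]) -> Dict[Tuple[str, str], float]:
    """Overlap for every pair of indices, keyed by (index_a, index_b)."""
    return {
        (a, b): topk_overlap(results[a], results[b])
        for a, b in combinations(sorted(results), 2)
    }


# Hypothetical top-3 results from three indices for the same query vector.
results = {
    "index_01": ["doc_a", "doc_b", "doc_c"],
    "index_02": ["doc_a", "doc_b", "doc_c"],
    "index_03": ["doc_a", "doc_b", "doc_d"],
}
overlaps = pairwise_overlap(results)
```

An overlap below 1.0 for a pair of indices marks the kind of inconsistency described in step 4. Note that this compares sets of IDs only and ignores ranking order.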


### Expected behavior

Given that all indices have identical configurations and similar volumes of data, running the same knn query with the same input vector should yield consistent top-k results across indices, even when using the default ef_search value.
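One way to check whether the differences come from the approximate HNSW search rather than from the data would be to compare each index's approximate top-k against an exact (brute-force) search on the same index. The sketch below assumes the k-NN plugin's scoring-script query (`knn_score` with `lang: knn`) is available on the cluster; the helper names are hypothetical, and I have not run this against my indices yet.

```python
from typing import Any, Dict, List


def build_exact_knn_query(
    vector: List[float],
    k: int,
    field: str = "vector_field",
    space_type: str = "l2",
) -> Dict[str, Any]:
    """Build an exact k-NN query body using the k-NN scoring script.

    Assumption: the cluster supports script_score with source "knn_score".
    """
    return {
        "size": k,
        "query": {
            "script_score": {
                # Score every document; no HNSW graph traversal is involved.
                "query": {"match_all": {}},
                "script": {
                    "source": "knn_score",
                    "lang": "knn",
                    "params": {
                        "field": field,
                        "query_value": vector,
                        "space_type": space_type,
                    },
                },
            }
        },
    }


def recall_at_k(approx_ids: List[str], exact_ids: List[str]) -> float:
    """Fraction of the exact top-k IDs that the approximate search also returned."""
    exact = set(exact_ids)
    if not exact:
        return 1.0  # nothing to recall
    return len(set(approx_ids) & exact) / len(exact)
```

If recall against the exact results is already close to 1.0 on every index with ef_search at 512, the differences between indices would more likely come from the indexed data than from the approximation.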

### Additional Details

**Host/Environment (please complete the following information):**
 - OS: Linux/Docker
 - Version: 2.17.1

[BUG] Inconsistent KNN Search Results Across Identically Configured Indices #18052

Describe the bug

Questions:

Related component

To Reproduce

Expected behavior

Additional Details

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[BUG] Inconsistent KNN Search Results Across Identically Configured Indices #18052

Description

Describe the bug

Questions:

Related component

To Reproduce

Expected behavior

Additional Details

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions