[Bug]: StreamingNode crashes in segcore retrieve during concurrent query and segment transfer #50366

## Existing Issue
- [x] I have searched the existing issues

## Environment
- Milvus version: unknown-20260604-825016cbd7
- Git commit: 825016cbd7
- Build time: Thu Jun 4 23:15:39 UTC 2026
- Deployment mode: cluster
- MQ type: pulsar
- SDK version: unknown
- OS: Kubernetes / 4am cluster
- CPU/Memory: GOMAXPROCS=4, TotalMem=17179869184 (16 GiB)
- GPU: N/A
- Others: namespace `chaos-testing`, instance `pulsar-cluster-reinstall-3801`

The Kubernetes pod list captured at `2026-06-05 16:44:58 UTC` shows that the target pod (`...-streamingnode-59d6c4d4bp78`) had restarted 3 times:

```text
pulsar-cluster-reinstall-3801-milvus-streamingnode-59d6c4d4bp78   1/1   Running   3 (18m ago)   32m   10.104.30.192   4am-node38
pulsar-cluster-reinstall-3801-milvus-streamingnode-59d6c4dt52xj   1/1   Running   2 (32m ago)   32m   10.104.14.238   4am-node18
```

## Current Behavior
The target StreamingNode pod restarted three times. Loki logs show that the first two restarts were startup-time failures caused by etcd not being ready:

```text
2026/06/05 16:12:10 panic: failed to create etcd client: context deadline exceeded
2026/06/05 16:12:23 panic: failed to create etcd client: context deadline exceeded
```

The third restart was a runtime crash with a different cause. At `2026-06-05 16:26:18 UTC`, the process crashed in the C++ segcore retrieve path while filling the target entries of a query result with FloatVector data from a growing segment.

No Loki evidence was found for OOMKilled, liveness/readiness probe failure, BackOff, or Kubernetes killing the container.

## Expected Behavior
StreamingNode should not crash when query/retrieve requests run concurrently with segment load/release transfer and delete-snapshot forwarding. It should either complete the query safely or return an error without terminating the process.

## Steps To Reproduce
The exact minimal reproducer is not available yet. The observed CI/test workload had the following pattern:

1. Deploy a Milvus cluster with Pulsar MQ using commit `825016cbd7`.
2. Run concurrent query/search workload while collections are actively receiving inserts/deletes/flushes.
3. Trigger load/release segment transfer on StreamingNode.
4. Observe StreamingNode crash during a retrieve/query path.

The relevant workload window involved collection `466793519583008866` and channel `by-dev-rootcoord-dml_12_466793519583008866v1`.
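Until a minimal reproducer exists, the workload pattern above can be approximated with a small driver that runs the query, write, and transfer operations in parallel loops. The sketch below is only a scaffold: `run_concurrently` and the placeholder operation are hypothetical names, and the real SDK calls against the cluster would have to be filled in.

```python
import threading
import time


def run_concurrently(operations, duration_s):
    """Run each named callable in its own loop thread for duration_s seconds.

    Returns {name: (successful_call_count, first_error_or_None)}.
    """
    stop = threading.Event()
    results = {name: [0, None] for name in operations}

    def loop(name, op):
        while not stop.is_set():
            try:
                op()
                results[name][0] += 1
            except Exception as exc:
                # Record the first failure and stop this loop only.
                results[name][1] = exc
                return

    threads = [
        threading.Thread(target=loop, args=item, daemon=True)
        for item in operations.items()
    ]
    for t in threads:
        t.start()
    time.sleep(duration_s)
    stop.set()
    for t in threads:
        t.join()
    return {name: tuple(value) for name, value in results.items()}


def placeholder_op():
    # Hypothetical stand-in; a real run would issue query, insert/delete/flush,
    # or load/release calls here.
    time.sleep(0.001)


summary = run_concurrently(
    {"query": placeholder_op, "write": placeholder_op, "transfer": placeholder_op},
    duration_s=0.05,
)
```

Each loop stops on its first error, so `summary` shows which operation failed first if the server goes away mid-run.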

## Milvus Log
Runtime crash context:

```text
[2026/06/05 16:26:17.892 +00:00] [INFO] [querynodev2/services.go:443]
["received load segments request"] [traceID=0a357e11364b4ad94a2ec58fd54df871]
[collectionID=466793519583008866] [segmentID=466793519703471716]
[currentNodeID=2] [dstNodeID=9] [needTransfer=true] [loadScope=Full]

[2026/06/05 16:26:18.484 +00:00] [INFO] [delegator/delegator_data.go:855]
["forward delete to worker (phase 2: snapshot)..."]
[collectionID=466793519583008866] [segmentID=466793519703471716]
[tsHitDeleteRowNum=1497] [bfHitDeleteRowNum=1497]

[2026/06/05 16:26:18.487 +00:00] [INFO] [delegator/delegator_data.go:909]
["load stream delete done"] [collectionID=466793519583008866]
```
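To line these entries up against the crash time, the bracketed log format can be parsed with a few lines of Python. This is a rough sketch, not a Milvus tool: `parse_entry` is a hypothetical helper, the regex only covers the bracket layout seen in this issue, and the sample entries are abridged from the log above.

```python
import re
from datetime import datetime

# Matches one [...] group; quoted strings inside may themselves contain brackets.
FIELD_RE = re.compile(r'\[((?:[^\[\]"]|"[^"]*")*)\]')


def parse_entry(text):
    """Split one (possibly line-wrapped) entry into timestamp, message, and fields."""
    groups = FIELD_RE.findall(text)
    entry = {
        "ts": datetime.strptime(groups[0], "%Y/%m/%d %H:%M:%S.%f %z"),
        "fields": {},
    }
    for group in groups[1:]:
        if group.startswith('"'):
            entry["msg"] = group.strip('"')
        elif "=" in group:
            key, value = group.split("=", 1)
            entry["fields"][key] = value.strip('"')
        # Level and file:line groups are ignored in this sketch.
    return entry


load_request = parse_entry(
    '[2026/06/05 16:26:17.892 +00:00] [INFO] [querynodev2/services.go:443] '
    '["received load segments request"] [collectionID=466793519583008866] '
    '[segmentID=466793519703471716] [needTransfer=true]'
)
delete_done = parse_entry(
    '[2026/06/05 16:26:18.487 +00:00] [INFO] [delegator/delegator_data.go:909] '
    '["load stream delete done"] [collectionID=466793519583008866]'
)

elapsed_ms = (delete_done["ts"] - load_request["ts"]).total_seconds() * 1000
print(load_request["fields"]["segmentID"], round(elapsed_ms))
```

For the two entries above, the load request and the "load stream delete done" message are about 595 ms apart, and both fall within roughly one second of the crash timestamp.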

Native stack excerpt:

```text
_ZNK6milvus7segcore18SegmentGrowingImpl19bulk_subscript_implINS_11FloatVectorEEEv...
../../../internal/core/src/segcore/SegmentGrowingImpl.cpp:1472

_ZNK6milvus7segcore18SegmentGrowingImpl14bulk_subscript...
../../../internal/core/src/segcore/SegmentGrowingImpl.cpp:1155

_ZNK6milvus7segcore24SegmentInternalInterface15FillTargetEntry...
../../../internal/core/src/segcore/SegmentInterface.cpp:239

_ZNK6milvus7segcore24SegmentInternalInterface8Retrieve...
../../../internal/core/src/segcore/SegmentInterface.cpp:163

AsyncRetrieve(...)
../../../internal/core/src/segcore/segment_c.cpp:374
```

Nearby query workload:

```text
[2026/06/05 16:26:03.477 +00:00] ["received query request"]
[traceID=dd790501e2d1eb73d28a93ec42fd082c]
[collectionID=466793519583008866]
[outputFields="[115,118,112,109,127,100,102,101,110,129,117,106,120,111,124,119,107,125,113,104,105,126,108,121,123,128,122,103,1]"]
[segmentIDs="[]"]

[2026/06/05 16:26:05.554 +00:00] ["received query request"]
[traceID=70500c4304887481b03adcb9aaae2abb]
[collectionID=466793519583008866]
[outputFields="[129,100,1]"]
```

## Anything else?
The crash stack points to `SegmentGrowingImpl::bulk_subscript_impl<FloatVector>` reading raw vector data by physical offset during `FillTargetEntry`. The surrounding logs suggest a concurrency/lifecycle issue around query/retrieve plus load/release segment transfer and delete snapshot loading, rather than an external Kubernetes restart.
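The root cause is not confirmed. As an illustration of the class of hazard suspected here (a reader indexing by offset while the segment's lifecycle changes underneath it), the toy model below shows one way a reader can pin a segment and fail with an error instead of touching released data. It is not Milvus code: `GuardedSegment` and `SegmentReleased` are hypothetical, and `bulk_subscript` only borrows the name from the stack trace.

```python
import threading


class SegmentReleased(Exception):
    """Raised instead of reading from a segment whose data is gone."""


class GuardedSegment:
    """Toy segment: readers pin it; release waits until no reader holds a pin."""

    def __init__(self, vectors):
        self._vectors = list(vectors)
        self._cond = threading.Condition()
        self._pins = 0
        self._released = False

    def bulk_subscript(self, offsets):
        # Pin under the lock so release() cannot free data mid-read.
        with self._cond:
            if self._released:
                raise SegmentReleased("segment already released")
            self._pins += 1
        try:
            size = len(self._vectors)
            for off in offsets:
                if not 0 <= off < size:
                    raise IndexError(f"offset {off} out of range [0, {size})")
            return [self._vectors[off] for off in offsets]
        finally:
            with self._cond:
                self._pins -= 1
                self._cond.notify_all()

    def release(self):
        with self._cond:
            # Reject new readers first, then wait for in-flight readers to finish.
            self._released = True
            self._cond.wait_for(lambda: self._pins == 0)
            self._vectors = []


seg = GuardedSegment([[0.1, 0.2], [0.3, 0.4]])
rows = seg.bulk_subscript([1, 0])
seg.release()
try:
    seg.bulk_subscript([0])
    error = None
except SegmentReleased as exc:
    error = str(exc)
```

In this model a late retrieve surfaces as `SegmentReleased` and an invalid offset as `IndexError`, which matches the expected behavior of returning an error rather than terminating the process.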

Grafana links:
- Startup etcd panic window: https://grafana-4am.zilliz.cc/explore?schemaVersion=1&panes=%7B%22a%22%3A%7B%22datasource%22%3A%22c2fa6d21-2b11-43ec-921f-ffefca84f260%22%2C%22queries%22%3A%5B%7B%22refId%22%3A%22A%22%2C%22expr%22%3A%22%7Bnamespace%3D%5C%22chaos-testing%5C%22%2C%20pod%3D%5C%22pulsar-cluster-reinstall-3801-milvus-streamingnode-59d6c4d4bp78%5C%22%2C%20filename%3D~%5C%22.%2A/streamingnode/%5B01%5D%5C%5C%5C%5C.log%5C%22%7D%22%2C%22queryType%22%3A%22range%22%2C%22datasource%22%3A%7B%22type%22%3A%22loki%22%2C%22uid%22%3A%22c2fa6d21-2b11-43ec-921f-ffefca84f260%22%7D%2C%22editorMode%22%3A%22code%22%2C%22direction%22%3A%22backward%22%7D%5D%2C%22range%22%3A%7B%22from%22%3A%222026-06-05T16%3A12%3A00.000Z%22%2C%22to%22%3A%222026-06-05T16%3A12%3A25.000Z%22%7D%7D%7D&orgId=1
- Runtime crash window: https://grafana-4am.zilliz.cc/explore?schemaVersion=1&panes=%7B%22a%22%3A%7B%22datasource%22%3A%22c2fa6d21-2b11-43ec-921f-ffefca84f260%22%2C%22queries%22%3A%5B%7B%22refId%22%3A%22A%22%2C%22expr%22%3A%22%7Bnamespace%3D%5C%22chaos-testing%5C%22%2C%20pod%3D%5C%22pulsar-cluster-reinstall-3801-milvus-streamingnode-59d6c4d4bp78%5C%22%2C%20filename%3D~%5C%22.%2A/streamingnode/2%5C%5C%5C%5C.log%5C%22%7D%22%2C%22queryType%22%3A%22range%22%2C%22datasource%22%3A%7B%22type%22%3A%22loki%22%2C%22uid%22%3A%22c2fa6d21-2b11-43ec-921f-ffefca84f260%22%7D%2C%22editorMode%22%3A%22code%22%2C%22direction%22%3A%22backward%22%7D%5D%2C%22range%22%3A%7B%22from%22%3A%222026-06-05T16%3A26%3A15.000Z%22%2C%22to%22%3A%222026-06-05T16%3A26%3A19.000Z%22%7D%7D%7D&orgId=1

