Environment
- Milvus Version: unknown-20260608-6193b7888c
- Milvus Git Commit: 6193b7888c
- Deployment Mode: standalone
- SDK: PyMilvus 3.1.0rc32
- Test Client: Milvus Python E2E, Python 3.12.13, pytest 8.3.4
- External Storage: MinIO external table source
Reproduction
Option B: Steps
- Create an external collection backed by MinIO parquet files with fields:
  - id as primary key, external field id
  - value as scalar field, external field value
  - embedding as float vector field, external field embedding
- Run refresh_external_collection, create index, and load_collection.
- Call add_collection_field on the loaded external collection:
  - Milvus field: score
  - data type: DOUBLE
  - nullable: true
  - external field: score
- Do not refresh or reload the external collection after adding the field.
- Query the newly added field, for example:
    client.query(
        collection_name,
        filter="id < 5",
        output_fields=["id", "score"],
        limit=10,
    )
The same issue also applies when the newly added field is used in a filter before refresh:
    client.query(
        collection_name,
        filter="score > 0",
        output_fields=["id", "score"],
        limit=10,
    )
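The two queries above can be wrapped in a small probe that records how each one fails. This is only a sketch: `probe_new_field` and `ProbeResult` are hypothetical names, not part of PyMilvus, and the only client call assumed is `client.query(...)` with the arguments already shown in this report.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ProbeResult:
    label: str             # which query variant was run
    rows: Optional[list]   # rows returned, if the query succeeded
    error: Optional[str]   # error text, if the query raised


def probe_new_field(client: Any, collection_name: str, field: str = "score") -> list:
    """Run the output-field query and the filter query; capture rows or error text."""
    cases = [
        ("output_field", "id < 5"),       # new field only in output_fields
        ("filter", f"{field} > 0"),       # new field also used in the filter
    ]
    results = []
    for label, expr in cases:
        try:
            rows = client.query(
                collection_name,
                filter=expr,
                output_fields=["id", field],
                limit=10,
            )
            results.append(ProbeResult(label, list(rows), None))
        except Exception as exc:
            # Before refresh/reload, this report observes an exception here.
            results.append(ProbeResult(label, None, str(exc)))
    return results
```

Collecting both outcomes in one pass makes it easy to attach the exact error text of each variant to the report.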
This is covered by the Milvus Python E2E test:
python -m pytest \
milvus_client/test_milvus_client_external_table.py::TestMilvusClientExternalTableAddField::test_milvus_client_external_table_add_field_without_refresh_not_silent \
--host <milvus-host> --port 19530 \
--minio_host <minio-host> --minio_bucket <bucket> \
--tb=short -s
Trigger Conditions
- Frequency: always reproduced in the above E2E scenario.
- Segment state: reproduced with the old external snapshot loaded as sealed segment data.
- Does NOT happen when: the external collection is refreshed and loaded after adding the field; positive add-field refresh/load/query cases pass.
Expected Behavior
Querying or filtering on a newly added external field before the external collection has been refreshed/loaded should be rejected with a user-facing and actionable error message.
For example, the error should explain that the field is not available in the currently loaded external snapshot and that the collection must be refreshed/reloaded before the newly added field can be queried.
The query should not silently return fabricated data, and it should not expose an internal segcore assertion.
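One way to make this expectation checkable in a test is a small predicate over the error text. The sketch below is an assumption about what "actionable" could mean here (the message names the field, mentions refresh or reload, and leaks no segcore internals); `is_actionable` and the marker list are hypothetical and not taken from the Milvus test suite.

```python
import re

# Substrings that indicate an internal segcore assertion leaked to the user.
INTERNAL_MARKERS = ("Assert ", "nullptr", ".cpp:")
# Matches "refresh", "refreshed", "reload", "reloaded", and so on.
ACTION_PATTERN = re.compile(r"\b(refresh|reload)", re.IGNORECASE)


def is_actionable(message: str, field_name: str) -> bool:
    """True if the message names the field, suggests refresh/reload,
    and does not expose internal assertion details."""
    if any(marker in message for marker in INTERNAL_MARKERS):
        return False
    return field_name in message and ACTION_PATTERN.search(message) is not None
```

Applied to the error currently returned (see Actual Behavior), this predicate returns False for the field score.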
Actual Behavior
The query fails, which is acceptable for this state, but the returned error message is an internal QueryNode/segcore assert:
MilvusException: (code=65535, message=fail to Query on QueryNode 4:
worker(4) query failed: Assert "column != nullptr" => field 104 must exist
when getting raw data at ../internal/core/src/segcore/ChunkedSegmentSealedImpl.cpp:3400)
This is not meaningful for users. It does not mention the newly added field name (score), the external collection snapshot state, or the required refresh/reload action.
Error Logs
Relevant observed flow from server logs:
LoadCollection schema had old fields only:
    fields:{fieldID:101 name:"id"...}
    fields:{fieldID:102 name:"value"...}
    fields:{fieldID:103 name:"embedding"...}
    properties:{key:"max_field_id" value:"103"}
AddCollectionField succeeded and updated schema:
    "FieldID":104,"Name":"score","DataType":11,"Nullable":true,"ExternalField":"score"
    properties:{key:"max_field_id" value:"104"}
Query used the new schema field id:
    Query received ... [OutputFields="[id,score]"]
    translate output fields to field ids ... [OutputFieldsID="[101,104,100,1]"]
QueryNode read the old sealed segment and returned internal assert:
    start do query segments ... [segmentType=Sealed]
    Assert "column != nullptr" => field 104 must exist when getting raw data
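To connect the assert back to the schema, the field ID in the message can be mapped to the field name seen in the logs. This is a triage sketch only: `missing_field_name` is a hypothetical helper, and the ID-to-name table is copied from the log lines above rather than read from a live collection.

```python
import re
from typing import Optional

# Field IDs and names as they appear in the server logs in this report.
FIELD_NAMES = {101: "id", 102: "value", 103: "embedding", 104: "score"}

# Matches the "field <id> must exist" fragment of the segcore assert.
FIELD_ID_PATTERN = re.compile(r"field (\d+) must exist")


def missing_field_name(message: str, field_names: dict = FIELD_NAMES) -> Optional[str]:
    """Return the name of the field the assert complains about, if recognizable."""
    match = FIELD_ID_PATTERN.search(message)
    if match is None:
        return None
    return field_names.get(int(match.group(1)))
```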
Non-default Configuration
No relevant non-default Milvus server configuration was identified. The reproduction uses an external table backed by MinIO.
Analysis Hints
This looks related to the same problem family as:
However, this issue is not a chaos/recovery case. It is a normal external table schema evolution case: add_collection_field succeeds while the loaded external snapshot still does not contain field data for the newly added field.
A reasonable fix could be either:
- Reject query/search planning for newly added external fields until the refreshed snapshot is loaded, with a user-facing refresh/reload error; or
- Ensure query target/schema selection does not send field ids to sealed segments that cannot contain the corresponding field data.
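The first option can be illustrated with a small model of the check. This is a Python sketch of the idea, not Milvus code: the real check would live in the server's query planning path, and `FieldNotLoadedError`, `check_fields_loaded`, and the error wording are all hypothetical.

```python
class FieldNotLoadedError(Exception):
    """Hypothetical user-facing error for fields absent from the loaded snapshot."""


def check_fields_loaded(requested: dict[str, int], loaded_ids: set[int]) -> None:
    """Reject a query that references fields the loaded snapshot cannot serve.

    requested:  field name -> field ID, for every user field used in the
                filter or output_fields
    loaded_ids: field IDs present in the currently loaded external snapshot
    """
    missing = sorted(name for name, fid in requested.items() if fid not in loaded_ids)
    if missing:
        raise FieldNotLoadedError(
            f"field(s) {', '.join(missing)} not available in the currently loaded "
            "external snapshot; refresh and reload the collection before querying them"
        )
```

With the IDs from the logs, a loaded set of {101, 102, 103} and a request for score (104) would be rejected before any segment is touched, while a request for id (101) alone would pass.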
The important requirement is that users should not see Assert "column != nullptr" from segcore for this expected invalid state.