Description
A node crashes with SIGSEGV while loading its AOF if the AOF contains an FT.INTERNAL_UPDATE entry, even a perfectly valid (empty) one. The module itself writes FT.INTERNAL_UPDATE to the AOF to persist index metadata in coordinator mode (MetadataManager::ReplicateFTInternalUpdate), so a node that has persisted index metadata can crash-loop on restart or AOF reload.
Steps to reproduce
Build a multi-part AOF containing a valid (empty) FT.INTERNAL_UPDATE and load it:
mkdir -p /tmp/poisonaof/appendonlydir
printf 'file appendonly.aof.1.incr.aof seq 1 type i\n' > /tmp/poisonaof/appendonlydir/appendonly.aof.manifest
printf '*2\r\n$6\r\nSELECT\r\n$1\r\n0\r\n*4\r\n$18\r\nFT.INTERNAL_UPDATE\r\n$3\r\nidx\r\n$0\r\n\r\n$0\r\n\r\n' > /tmp/poisonaof/appendonlydir/appendonly.aof.1.incr.aof
docker run -d --name n -v /tmp/poisonaof:/data -e VALKEY_EXTRA_FLAGS="--appendonly yes" valkey/valkey-bundle:unstable
sleep 5
docker inspect n --format='{{.State.ExitCode}}' # -> 139
The entry is a valid empty protobuf, so this is not a malformed-input problem.
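For anyone who wants to vary the payload (a different index name, more entries), the RESP framing used in the printf above can be generated rather than hand-counted. The following is a minimal sketch; EncodeRespCommand and BuildReproAof are hypothetical helper names for illustration and are not part of valkey-search.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: encode one command as a RESP array of bulk strings,
// which is the framing each command has in the incremental AOF file.
std::string EncodeRespCommand(const std::vector<std::string>& args) {
  std::string out = "*" + std::to_string(args.size()) + "\r\n";
  for (const std::string& arg : args) {
    // Each argument is "$<byte length>\r\n<bytes>\r\n"; an empty argument
    // therefore encodes as "$0\r\n\r\n".
    out += "$" + std::to_string(arg.size()) + "\r\n";
    out += arg;
    out += "\r\n";
  }
  return out;
}

// Hypothetical helper: the same bytes as the printf in the repro steps,
// i.e. SELECT 0 followed by FT.INTERNAL_UPDATE with the index name "idx"
// and two empty arguments.
std::string BuildReproAof() {
  return EncodeRespCommand({"SELECT", "0"}) +
         EncodeRespCommand({"FT.INTERNAL_UPDATE", "idx", "", ""});
}
```

Writing the result of BuildReproAof() to appendonlydir/appendonly.aof.1.incr.aof should give a file identical to the one produced by the printf command.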
Crash output
valkey ... crashed by signal: 11, si_code: 1
TriggerCallbacks at src/coordinator/metadata_manager.cc:168
<- CreateEntryOnReplica at src/coordinator/metadata_manager.cc:925
<- FTInternalUpdateCmd at src/commands/ft_internal_update.cc:86
<- loadSingleAppendOnlyFile
Root cause
Under VALKEYMODULE_CTX_FLAGS_LOADING, FTInternalUpdateCmd (src/commands/ft_internal_update.cc:82-88) calls MetadataManager::CreateEntryOnReplica, which calls TriggerCallbacks (src/coordinator/metadata_manager.cc:165-168). At metadata_manager.cc:168 the registered update_callback is invoked, but the callback / schema-manager state is not safely usable during early AOF replay, producing a null/invalid dereference. This is a load-path lifecycle bug, not input validation.
Suggested fix
Guard the callback/schema-manager state before invoking update_callback during loading, or defer metadata-entry application until load completes.
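The two options above can be combined. The sketch below is a rough illustration of that shape only, not the actual MetadataManager code: DeferredMetadataApplier and all of its members are hypothetical names, and the string id stands in for whatever the real entry type is. Entries that arrive while loading, or before a callback is registered, are queued instead of being dispatched, then applied once loading finishes.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the metadata manager's callback dispatch.
// None of these names come from valkey-search.
class DeferredMetadataApplier {
 public:
  using UpdateCallback = std::function<void(const std::string& id)>;

  void SetLoading(bool loading) { loading_ = loading; }
  void RegisterCallback(UpdateCallback cb) { update_callback_ = std::move(cb); }

  // Returns true if the callback ran immediately, false if the entry was
  // queued because the server is loading or no callback is registered yet.
  bool OnEntry(const std::string& id) {
    if (loading_ || !update_callback_) {
      pending_.push_back(id);
      return false;
    }
    update_callback_(id);
    return true;
  }

  // Call once loading has finished. Applies queued entries in arrival order
  // and returns how many were applied. If no callback is registered, the
  // entries stay queued and 0 is returned.
  std::size_t OnLoadComplete() {
    loading_ = false;
    if (!update_callback_) return 0;
    std::vector<std::string> ready;
    ready.swap(pending_);
    for (const std::string& id : ready) update_callback_(id);
    return ready.size();
  }

  std::size_t PendingCount() const { return pending_.size(); }

 private:
  bool loading_ = false;
  UpdateCallback update_callback_;  // Empty until RegisterCallback is called.
  std::vector<std::string> pending_;
};
```

In the real module, the "load complete" signal would presumably come from the server's loading lifecycle (for example a module server-event subscription); that wiring is an assumption and is left out here. The point of the design is that nothing on the load path dereferences callback or schema-manager state before it is known to be ready.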
Environment
- valkey-search origin/main (8c260db); also reproduces on valkey/valkey-bundle:unstable
- Confirmed live on current main: no recent commit touches ft_internal_update.cc or metadata_manager.cc.
Related: see the companion issue on corrupt-AOF FT.INTERNAL_UPDATE handling.
This issue was generated by AI but verified, with love, by a human.