FT.INTERNAL_UPDATE crashes server (SIGSEGV) on AOF load even with a valid entry; module writes these itself -> restart crash-loop #1108

@madolson

Description

A node crashes with a SIGSEGV while loading its AOF if the AOF contains an FT.INTERNAL_UPDATE entry, even a perfectly valid (empty) one. The module itself writes FT.INTERNAL_UPDATE to the AOF to persist index metadata in coordinator mode (MetadataManager::ReplicateFTInternalUpdate), so a node that has persisted index metadata can crash-loop on restart or AOF reload.

Steps to reproduce

Build a multi-part AOF containing a valid (empty) FT.INTERNAL_UPDATE and load it:

mkdir -p /tmp/poisonaof/appendonlydir
printf 'file appendonly.aof.1.incr.aof seq 1 type i\n' > /tmp/poisonaof/appendonlydir/appendonly.aof.manifest
printf '*2\r\n$6\r\nSELECT\r\n$1\r\n0\r\n*4\r\n$18\r\nFT.INTERNAL_UPDATE\r\n$3\r\nidx\r\n$0\r\n\r\n$0\r\n\r\n' > /tmp/poisonaof/appendonlydir/appendonly.aof.1.incr.aof
docker run -d --name n -v /tmp/poisonaof:/data -e VALKEY_EXTRA_FLAGS="--appendonly yes" valkey/valkey-bundle:unstable
sleep 5
docker inspect n --format='{{.State.ExitCode}}'   # -> 139 (128 + 11, i.e. killed by SIGSEGV)

The entry is a valid empty protobuf, so this is not a malformed-input problem.
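To double-check that the hand-written printf payload above is well-formed RESP (correct argument counts and length prefixes), the same bytes can be rebuilt programmatically. This is a minimal sketch; `EncodeResp` and `BuildPoisonAof` are hypothetical helper names introduced here for illustration and are not part of valkey-search.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of valkey-search): encode one command as a
// RESP array of bulk strings, which is the framing the AOF file uses.
std::string EncodeResp(const std::vector<std::string>& args) {
  std::string out = "*" + std::to_string(args.size()) + "\r\n";
  for (const std::string& arg : args) {
    out += "$" + std::to_string(arg.size()) + "\r\n";
    out += arg;
    out += "\r\n";
  }
  return out;
}

// Rebuilds the incremental AOF content from the reproduction steps:
// SELECT 0, then FT.INTERNAL_UPDATE idx "" "" (two empty payload arguments).
std::string BuildPoisonAof() {
  return EncodeResp({"SELECT", "0"}) +
         EncodeResp({"FT.INTERNAL_UPDATE", "idx", "", ""});
}
```

Writing the result of `BuildPoisonAof()` to appendonly.aof.1.incr.aof should give the same file as the printf line in the steps.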

Crash output

valkey ... crashed by signal: 11, si_code: 1
  TriggerCallbacks at src/coordinator/metadata_manager.cc:168
  <- CreateEntryOnReplica at src/coordinator/metadata_manager.cc:925
  <- FTInternalUpdateCmd at src/commands/ft_internal_update.cc:86
  <- loadSingleAppendOnlyFile

Root cause

Under VALKEYMODULE_CTX_FLAGS_LOADING, FTInternalUpdateCmd (src/commands/ft_internal_update.cc:82-88) calls MetadataManager::CreateEntryOnReplica, which calls TriggerCallbacks (src/coordinator/metadata_manager.cc:165-168). At metadata_manager.cc:168 the registered update_callback is invoked, but the callback / schema-manager state is not yet safe to use during early AOF replay, so the call dereferences a null or invalid pointer. This is a load-path lifecycle bug, not an input-validation bug.

Suggested fix

Guard the callback/schema-manager state before invoking update_callback during loading, or defer metadata-entry application until load completes.
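Both options (guarding and deferring) can be combined in one pattern, sketched below. This is not the real MetadataManager code: `DeferredMetadataApplier` and all of its members are hypothetical names, the entry is reduced to a string id, and the sketch assumes the module has some end-of-load signal on which `OnLoadingEnded` could be called.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Illustrative stand-in only; the real MetadataManager has a different shape.
class DeferredMetadataApplier {
 public:
  using Callback = std::function<void(const std::string&)>;

  void SetCallback(Callback cb) { callback_ = std::move(cb); }
  void SetLoading(bool loading) { loading_ = loading; }

  // Called for each replayed metadata entry. While the server is loading, or
  // while no callback is registered, the entry is queued instead of applied.
  void OnEntry(const std::string& id) {
    if (loading_ || !callback_) {
      pending_.push_back(id);
      return;
    }
    callback_(id);
  }

  // Called once loading has finished. Applies queued entries in arrival
  // order and returns how many were applied. If no callback is registered,
  // the entries stay queued.
  std::size_t OnLoadingEnded() {
    loading_ = false;
    if (!callback_) return 0;
    std::vector<std::string> ready;
    ready.swap(pending_);
    for (const std::string& id : ready) callback_(id);
    return ready.size();
  }

  std::size_t PendingCount() const { return pending_.size(); }

 private:
  bool loading_ = false;
  Callback callback_;
  std::vector<std::string> pending_;
};
```

The guard alone (`!callback_`) would avoid the crash but could silently drop entries replayed from the AOF; queueing them keeps the metadata and applies it once the dependent state exists. Whether deferral is acceptable for the real update path is a design decision for the maintainers.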

Environment

  • valkey-search origin/main (8c260db); also reproduces on valkey/valkey-bundle:unstable
  • Confirmed live on current main: no recent commit touches ft_internal_update.cc or metadata_manager.cc.

Related: see the companion issue on corrupt-AOF FT.INTERNAL_UPDATE handling.


This issue was generated by AI but verified, with love, by a human.

Metadata

Labels: bug (Something isn't working)
