Description
When a client is subscribed to a sharded-pubsub channel in cluster mode and the channel's slot is moved to another shard or deleted, the client receives a spontaneous sunsubscribe push message.
If a client has just sent a SUNSUBSCRIBE command, it cannot know whether the sunsubscribe message is a response to that command or a spontaneous message.
In the following scenario, client 1 believes that the [sunsubscribe, ch, 0] push message is the response to its SUNSUBSCRIBE command. The actual reply to SUNSUBSCRIBE, a CLUSTERDOWN error, remains to be read and appears to be out of sync, i.e. to the client it doesn't match any command it has sent. (If the client has sent another command in the pipeline, the CLUSTERDOWN appears to be the reply to that command.)
Client 1                   Client 2          Primary
|                          |                 |
|                          | DELSLOT         |
|                          |---------------->|
| SUNSUBSCRIBE ch          |                 |
|------------------------------------------->|
|                          |              OK |
| [sunsubscribe, ch, 0]    |<----------------|
|<-------------------------------------------|
|                          |                 |
| -CLUSTERDOWN             |                 |
|<-------------------------------------------|
|                          |                 |
Originally posted by @zuiderkwast in #759 (comment) (but edited)
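To make the ambiguity concrete, here is a minimal sketch of how a client might attribute incoming messages to pipelined commands. It is not taken from any real client library: the function name `attribute_replies` and the tuple shapes used for commands and messages are hypothetical, and the matching rule (a push message whose kind equals the name of the oldest pending command is taken as that command's confirmation) is an assumption about a naive client. The pipeline is the scenario above with one extra command, DBSIZE, sent after SUNSUBSCRIBE.

```python
from collections import deque


def attribute_replies(sent_commands, incoming):
    """Pair incoming messages with pending commands the way a naive client might.

    Hypothetical model: a push message ("push", kind, channel, count) is taken
    as the confirmation of the oldest pending command when the command name
    equals `kind`; any other message is taken as the reply to the oldest
    pending command. A message with no command to pair with gets None.
    """
    pending = deque(sent_commands)
    pairs = []
    for msg in incoming:
        is_push = msg[0] == "push"
        if not pending or (is_push and msg[1] != pending[0][0].lower()):
            pairs.append((None, msg))  # unsolicited, or out of sync
        else:
            pairs.append((pending.popleft(), msg))
    return pairs


# Client 1's pipeline and the messages it then reads from the primary.
sent = [("SUNSUBSCRIBE", "ch"), ("DBSIZE",)]
incoming = [
    ("push", "sunsubscribe", "ch", 0),              # spontaneous, caused by the slot deletion
    ("error", "CLUSTERDOWN Hash slot not served"),  # actual reply to SUNSUBSCRIBE
    ("int", 0),                                     # actual reply to DBSIZE
]

for command, message in attribute_replies(sent, incoming):
    print(command, "<-", message)
```

Under these assumptions the spontaneous push is paired with SUNSUBSCRIBE, the CLUSTERDOWN error is paired with DBSIZE, and the integer reply is left without a matching command. Nothing in the push message itself lets the client tell the two cases apart.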
Test case
This test case passes, i.e. it illustrates what the clients actually see.
diff --git a/tests/unit/cluster/pubsubshard-slot-migration.tcl b/tests/unit/cluster/pubsubshard-slot-migration.tcl
index c5a324f09..26d6afe56 100644
--- a/tests/unit/cluster/pubsubshard-slot-migration.tcl
+++ b/tests/unit/cluster/pubsubshard-slot-migration.tcl
@@ -187,6 +187,32 @@ test "Delete a slot, verify sunsubscribe message" {
$subscribeclient close
}
+test "Slot deleted and unsubscribed simultaneously" {
+ set channelname ch5
+ set slot [$cluster cluster keyslot $channelname]
+
+ array set primary_client [$cluster masternode_for_slot $slot]
+
+ set subscribeclient [valkey_deferring_client_by_addr $primary_client(host) $primary_client(port)]
+ $subscribeclient HELLO 3
+ $subscribeclient read
+ $subscribeclient SSUBSCRIBE $channelname
+ $subscribeclient read
+
+ # Delete a slot.
+ assert_equal OK [$primary_client(link) CLUSTER DELSLOTS $slot]
+
+ # Send SUNSUBSCRIBE and DBSIZE in a pipeline.
+ $subscribeclient SUNSUBSCRIBE $channelname
+ $subscribeclient DBSIZE
+
+ # We expect one reply per command, plus a spontaneous sunsubscribe push message.
+ assert_equal "sunsubscribe $channelname 0" [$subscribeclient read]
+ catch {$subscribeclient read} e
+ assert_equal "CLUSTERDOWN Hash slot not served" $e
+ assert_equal 0 [$subscribeclient read]
+}
+
test "Reset cluster, verify sunsubscribe message" {
set channelname ch4
set slot [$cluster cluster keyslot $channelname]