Skip to content

Comparing value in TXN on non-existing key always returns failure. #19678

Open
@purpleidea

Description

@purpleidea

Bug report criteria

  • This bug report is not security related, security issues should be disclosed privately via etcd maintainers.
    This is not a support request or question, support requests or questions should be raised in the etcd discussion forums.
    You have read the etcd bug reporting guidelines.
    Existing open issues along with etcd frequently asked questions have been checked and this is not a duplicate.

What happened?

Etcd has a .Succeeded field on a Txn response to know which branch of the transaction (then or else) was run. This is critically important to know what happened.

If you have an OpTxn then the child .Succeeded values are NOT SET and we seem to always see the false zero value.

A full code reproducer is included below.

What did you expect to happen?

At least one of the two branches should see a true. Neither does.

How can we reproduce it (as minimally and precisely as possible)?

Run etcd plainly in a terminal. Then run this code:
I left in my hacking mess for anyone curious. At least one of the "SUCCESS" lines should print true. They always print false.

// etcdbug

package main

import (
	"context"
	"fmt"

	pb "go.etcd.io/etcd/api/v3/etcdserverpb"
	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {

	client, err := clientv3.New(clientv3.Config{
		Endpoints: []string{"localhost:2379"},
	})
	if err != nil {
		panic(err)
	}
	defer client.Close()

	ops := []clientv3.Op{}
	{
		ifs := []clientv3.Cmp{}
		thn := []clientv3.Op{}
		els := []clientv3.Op{}

		path := "/foo"
		data := "hello"
		ifs = append(ifs, clientv3.Compare(clientv3.Value(path), "!=", data)) // THIS ONE IS !=
		els = append(els, clientv3.OpPut(path, data))

		op := clientv3.OpTxn(ifs, thn, els)
		ops = append(ops, op)
	}
	{
		ifs := []clientv3.Cmp{}
		thn := []clientv3.Op{}
		els := []clientv3.Op{}

		path := "/bar"
		data := "world"
		ifs = append(ifs, clientv3.Compare(clientv3.Value(path), "=", data)) // THIS ONE IS =
		els = append(els, clientv3.OpPut(path, data))

		op := clientv3.OpTxn(ifs, thn, els)
		ops = append(ops, op)
	}

	txn := client.Txn(context.Background())
	txnResp, err := txn.If().Then(ops...).Else().Commit()
	if err != nil {
		panic(err)
	}

	//
	// Who even knows the correct way to pull out nested data from the
	// protobuf mess? Various approaches have been tried, thoughts welcome.
	//

	realOut := txnResp.OpResponse().Txn()

	fmt.Printf("XXX: OUT(%T): %+v\n", realOut, realOut)
	if !realOut.Succeeded {
		// else branch happened
		panic("unexpected branch")
	}

	//responses := realOut.GetResponses() // doesn't work
	responses := realOut.Responses

	fmt.Printf("XXX: NUMBER OF RESPONSES: %+v\n", len(responses))

	for _, resp := range responses {
		fmt.Printf("XXX: RESP(%T): %+v\n", resp, resp)

		x := resp.GetResponse()
		fmt.Printf("X (%T) %+v\n", x, x)

		thing, ok := x.(*pb.ResponseOp_ResponseTxn)
		if !ok {
			panic("woops")
		}

		//x := pb.ResponseOp_ResponseTxn(resp)
		fmt.Printf("THING (%T) %+v\n", thing, thing)

		// eventually we should see `true` in one of the loop iterations
		fmt.Printf("SUCCESS1? %+v\n", thing.ResponseTxn.GetSucceeded())
		fmt.Printf("SUCCESS2? %+v\n", resp.GetResponseTxn().Succeeded)

		fmt.Printf("---------------------------------\n\n")
	}
}

Anything else we need to know?

No response

Etcd version (please run commands below)

$ etcd --version
# paste output here

$ etcdctl version
# paste output here

Etcd configuration (command line flags or environment variables)

paste your configuration here

Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)

$ etcdctl member list -w table
# paste output here

$ etcdctl --endpoints=<member list> endpoint status -w table
# paste output here

Relevant log output

Activity

siyuanfoundation

siyuanfoundation commented on Apr 10, 2025

@siyuanfoundation
Contributor
siyuanfoundation

siyuanfoundation commented on Apr 10, 2025

@siyuanfoundation
Contributor

This is working as expected

// Always fail if comparing a value on a key/keys that doesn't exist;

purpleidea

purpleidea commented on Apr 10, 2025

@purpleidea
ContributorAuthor

This is working as expected

You mis-read the above code. It should be true in at least one variant.

serathius

serathius commented on Apr 11, 2025

@serathius
Member

I think you need to simplify code to provide minimal reproduction.

purpleidea

purpleidea commented on Apr 11, 2025

@purpleidea
ContributorAuthor

I think you need to simplify code to provide minimal reproduction.

Simpler?

TL;DR: There's no way I've found to ever get the .Succeeded variable to be true, when it's a nested OpTxn. I assume those are not getting set and golang default to the zero value for the bool which is false.

siyuanfoundation

siyuanfoundation commented on Apr 14, 2025

@siyuanfoundation
Contributor

@purpleidea Can you help me understand your question?
The db starts empty.
Then in the two nested Txn,
Txn1: if current get("/foo") != "hello", do nothing, else put("/foo", "hello")
Txn2: if current get("/bar") == "world", do nothing, else put("/bar", "world")
And you expect one of Txn1.Succeeded or Txn2.Succeeded should be true?

purpleidea

purpleidea commented on Apr 14, 2025

@purpleidea
ContributorAuthor

The .Succeeded value tells the user which branch in the Txn ran. It should be true if the "then" branch runs, otherwise it should be false if the "else" branch runs.

I've made an example with two Txn's. They should have opposite .Succeeded values. Neither is ever true.

This works properly when using a Txn directly, but it does NOT work when the Txn is nested. That's the bug.

HTH

siyuanfoundation

siyuanfoundation commented on Apr 14, 2025

@siyuanfoundation
Contributor

If you start the db empty, both of the following Txn.Succeeded would return false, because Txn.Succeeded always returns true if the key does not exist, even though logically get("/foo") != "hello" if the key does not exist.
Txn1: if current get("/foo") != "hello", do nothing, else put("/foo", "hello")
Txn2: if current get("/bar") == "world", do nothing, else put("/bar", "world")

If you question is not related to the non-existing key, can you also write down the test case when it works when Txn is used directly as expected?

purpleidea

purpleidea commented on Apr 14, 2025

@purpleidea
ContributorAuthor

@siyuanfoundation Try and write down any code example where the OpTxn has a .Succeeded to true.

serathius

serathius commented on Apr 15, 2025

@serathius
Member

@siyuanfoundation Try and write down any code example where the OpTxn has a .Succeeded to true.

Please note that etcd is an open source and any response you get is by volunteers spending their own personal time to help community. There should be no expectation that someone will read your 100 lines of code and try to understand your intention when you cannot explain it otherwise.

Wrote very simple repro that shows what is happening here:

etcd $ ./bin/etcdctl txn -i 
compares:
val("/key") = ""

success requests (get, put, del):

failure requests (get, put, del):

FAILURE
etcd $ ./bin/etcdctl txn -i 
compares:
val("/key") != ""

success requests (get, put, del):

failure requests (get, put, del):

FAILURE
etcd $ ./bin/etcdctl put "/key" ""
OK
etcd $ ./bin/etcdctl txn -i 
compares:
val("/key") = "" 

success requests (get, put, del):

failure requests (get, put, del):

SUCCESS
etcd $ ./bin/etcdctl txn -i 
compares:
val("/key") != "1"           

success requests (get, put, del):

failure requests (get, put, del):

SUCCESS

You cannot compare value on non existing key, it will always return failure. You need to put the key for value comparison to work.

I expect it works like this to distinguish between key not existing and value being empty. That make "value" comparison not consistent with other type like mod revision.

etcd $ ./bin/etcdctl txn -i 
compares:
mod("/key") = "0"

success requests (get, put, del):

failure requests (get, put, del):

SUCCESS
etcd $ ./bin/etcdctl txn -i 
compares:
mod("/key") != "0"

success requests (get, put, del):

failure requests (get, put, del):

FAILURE
etcd $ ./bin/etcdctl put "/key" ""
OK
etcd $ ./bin/etcdctl txn -i 
compares:
mod("/key") = "0"

success requests (get, put, del):

failure requests (get, put, del):

FAILURE
etcd $ ./bin/etcdctl txn -i 
compares:
mod("/key") != "0"

success requests (get, put, del):

failure requests (get, put, del):

SUCCESS
serathius

serathius commented on Apr 15, 2025

@serathius
Member

To fix this we would need to treat a non existing key as key with empty value. This solves fixes the issue with the equality comparison not being symmetrical, but breaks detecting if key exists for empty value. We have two options, either:

  • Be ok with breaking empty value case.
  • Forbid setting empty value.

No matter what we do, this will be a breaking change for user if they depend on success result. The options differ by if change the response in a way it's not visible to user or explicitly return error. I would prefer that if we commit to make a breaking change we should make it explicit.

cc @ahrtr @fuweid for opinion.

changed the title [-]OpTxn .Succeeded value is not set![/-] [+]TXN comparing value on non existing key always returns failure.[/+] on Apr 15, 2025
changed the title [-]TXN comparing value on non existing key always returns failure.[/-] [+]Comparing value in TXN on non-existing key always returns failure.[/+] on Apr 15, 2025
ahrtr

ahrtr commented on Apr 15, 2025

@ahrtr
Member

@siyuanfoundation Try and write down any code example where the OpTxn has a .Succeeded to true.

@purpleidea Just executing your example twice, then you will get a .Succeeded to true.

$ go run main.go 
XXX: OUT(*clientv3.TxnResponse): &{Header:cluster_id:14841639068965178418 member_id:10276657743932975437 revision:2 raft_term:2  Succeeded:true Responses:[response_txn:<header:<> responses:<response_put:<header:<revision:2 > > > >  response_txn:<header:<> responses:<response_put:<header:<revision:2 > > > > ] XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
XXX: NUMBER OF RESPONSES: 2
XXX: RESP(*etcdserverpb.ResponseOp): response_txn:<header:<> responses:<response_put:<header:<revision:2 > > > > 
X (*etcdserverpb.ResponseOp_ResponseTxn) &{ResponseTxn:header:<> responses:<response_put:<header:<revision:2 > > > }
THING (*etcdserverpb.ResponseOp_ResponseTxn) &{ResponseTxn:header:<> responses:<response_put:<header:<revision:2 > > > }
SUCCESS1? false
SUCCESS2? false
---------------------------------

XXX: RESP(*etcdserverpb.ResponseOp): response_txn:<header:<> responses:<response_put:<header:<revision:2 > > > > 
X (*etcdserverpb.ResponseOp_ResponseTxn) &{ResponseTxn:header:<> responses:<response_put:<header:<revision:2 > > > }
THING (*etcdserverpb.ResponseOp_ResponseTxn) &{ResponseTxn:header:<> responses:<response_put:<header:<revision:2 > > > }
SUCCESS1? false
SUCCESS2? false
---------------------------------

$ etcdctl get foo
foo
hello
$ etcdctl get bar
bar
world

$ go run main.go 
XXX: OUT(*clientv3.TxnResponse): &{Header:cluster_id:14841639068965178418 member_id:10276657743932975437 revision:3 raft_term:2  Succeeded:true Responses:[response_txn:<header:<> responses:<response_put:<header:<revision:3 > > > >  response_txn:<header:<> succeeded:true > ] XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
XXX: NUMBER OF RESPONSES: 2
XXX: RESP(*etcdserverpb.ResponseOp): response_txn:<header:<> responses:<response_put:<header:<revision:3 > > > > 
X (*etcdserverpb.ResponseOp_ResponseTxn) &{ResponseTxn:header:<> responses:<response_put:<header:<revision:3 > > > }
THING (*etcdserverpb.ResponseOp_ResponseTxn) &{ResponseTxn:header:<> responses:<response_put:<header:<revision:3 > > > }
SUCCESS1? false
SUCCESS2? false
---------------------------------

XXX: RESP(*etcdserverpb.ResponseOp): response_txn:<header:<> succeeded:true > 
X (*etcdserverpb.ResponseOp_ResponseTxn) &{ResponseTxn:header:<> succeeded:true }
THING (*etcdserverpb.ResponseOp_ResponseTxn) &{ResponseTxn:header:<> succeeded:true }
SUCCESS1? true
SUCCESS2? true
---------------------------------

To fix this we would need to treat a non existing key as key with empty value. This solves fixes the issue with the equality comparison not being symmetrical, but breaks detecting if key exists for empty value. We have two options, either:

  • Be ok with breaking empty value case.
  • Forbid setting empty value.

No matter what we do, this will be a breaking change for user if they depend on success result. The options differ by if change the response in a way it's not visible to user or explicitly return error. I would prefer that if we commit to make a breaking change we should make it explicit.

cc @ahrtr @fuweid for opinion.

I would suggest not to change any existing behaviour. The proposed changes/actions:

  • [required] Update document to clarify that comparing a value on a non-exist key/keys will always lead to .Succeeded be false
  • [optional] Extend the Compare to support the Semantics if the key exist or not. For example, @purpleidea does this make sense to you?
    • key("foo") exist
    • key("foo") not_exist
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @purpleidea@serathius@siyuanfoundation@ahrtr

        Issue actions

          Comparing value in TXN on non-existing key always returns failure. · Issue #19678 · etcd-io/etcd