Skip to content

Conversation

@worryg0d
Copy link
Contributor

@worryg0d worryg0d commented Nov 28, 2025

Protocol version negotiation doesn't work if server replies with stream id different than 0

Previously, protocol negotiation didn't work properly when C* was responding with stream id different from 0.

This patch changes the way protocol negotiation works. Instead of parsing a supported protocol version from C* error response, the driver tries to connect with each supported protocol starting from the latest.

Patch by Bohdan Siryk; Reviewed by <> for CASSGO-98

@worryg0d
Copy link
Contributor Author

WIP: currently working on tests for the whole protocol negotiation logic....

…am id different than 0

Previously, protocol negotiation didn't work properly when C* was responding with stream id different from 0.

This patch changes the way protocol negotiation works. Instead of parsing a supported protocol version from C* error response, the driver tries to connect with each supported protocol starting from the latest.

Patch by Bohdan Siryk; Reviewed by <> for CASSGO-98
@worryg0d worryg0d marked this pull request as ready for review November 28, 2025 11:50
@worryg0d
Copy link
Contributor Author

Ready for review

Copy link
Contributor

@joao-r-reis joao-r-reis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good

}
if err == nil {
c.session.logger.Debug("Discovered protocol version using host.",
NewLogFieldInt("protocol_version", connCfg.ProtoVersion), NewLogFieldIP("host_addr", host.ConnectAddress()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any particular reason you removed the host_id log field from both calls? Is it because it's not populated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is not populated, so there is really no useful information


// if a client uses an unsupported protocol version, we respond with an error
if !r.supportsProtocol(reqFrame.header.version) {
respFrame.writeHeader(0, opError, reqFrame.header.stream)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we have a test case that replies with the stream id of the request in addition to this one? This behavior is not defined in the spec and I think some C* versions might behave differently

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please clarify what you exactly mean? The test server already replies with the stream id of the request

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I'm sorry I thought it was replying with stream id 0, in that case then can you add a test case where the response stream id is 0 and the request stream id is not 0?

c.session.logger.Debug("Discovered protocol version using host after parsing protocol error.",
NewLogFieldInt("protocol_version", proto), NewLogFieldIP("host_addr", host.ConnectAddress()), NewLogFieldString("host_id", host.HostID()))
return proto, nil
c.session.logger.Debug("Failed to connect to the host using protocol version.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should only attempt to reconnect to the same host with a lower protocol version if the error is assumed to be related to unsupported protocol version:

  • t's an error to the first request (OPTIONS)
  • the error type is PROTOCOL_ERROR or SERVER_ERROR - SERVER_ERROR is for old C* versions that reported this error as SERVER_ERROR instead of PROTOCOL_ERROR

I propose we use a custom internal error type (unsupportedProtocolVersionErr for example) just for this case that we can return inside the dial method and we test for it here in discoverProtocol(). Reference code in the java driver here.

Technically the java driver does check for the error string which I didn't know but I still believe we shouldn't do that, checking the above 2 conditions is enough.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, we don't really have to re-try each time if the error is not protocol-related

//go:build all || unit
// +build all unit

package gocql
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add license header, we should add a check for this in CI at some point

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

return nil
}

func TestProtocolNegotiation(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good job with this test, I feel like it was a weak spot in our test suite and now it's much better. Can you add -v to the go test command in the GH workflow step that runs the unit tests so we can double check which tests are running?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I'll add verbose for unit tests to ci

@worryg0d
Copy link
Contributor Author

Thanks, @joao-r-reis, for your feedback. I'm going to address this next week

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants