Skip to content

Conversation

@tengu-alt
Copy link
Contributor

@tengu-alt tengu-alt commented Jan 13, 2025

Fix for the #1370

Panic from the ConnectAddress() was removed.
HostInfo struct creation was refactored to create via constructor to make sure the address is valid.

@tengu-alt tengu-alt force-pushed the refactor-connect-address-method branch from 6f8773a to 8aae39b Compare January 14, 2025 08:25
Copy link
Contributor

@joao-r-reis joao-r-reis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments. The host selection / load balancing policies need some changes, we shouldn't assume the address is always valid on those implementations.

if hi.host.ConnectAddress().String() == host {
connAddr, err := hi.host.ConnectAddress()
if err != nil {
t.Error(err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe this should be t.Fatal so the behavior of this test is consistent after the change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, fixed

control.go Outdated
for _, host := range hosts {
connAddr, err := host.ConnectAddress()
if err != nil {
c.session.logger.Printf("gocql: %v\n", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually we add a message for this error before we add the actual error string so something like:
c.session.logger.Printf("gocql: unable to use host for control connection, skipping it: %v\n", err)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

control.go Outdated
for _, host := range hosts {
connAddr, err := host.ConnectAddress()
if err != nil {
c.session.logger.Printf("gocql: %v\n", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as comment above

control_test.go Outdated
t.Errorf("expected ip %v got %v for addr %q", test.ip, host.ConnectAddress(), test.addr)
connAddr, err := host.ConnectAddress()
if err != nil {
t.Errorf("%d: %v", i, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

t.Errorf("could not get connect address of host %q to compare with expected: %v", test.addr, err)

filters.go Outdated
m := make(map[string]bool, len(hostInfos))
for _, host := range hostInfos {
m[host.ConnectAddress().String()] = true
connAddr, _ := host.ConnectAddress()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if err != nil I don't think it should be whitelisted, it should just continue

policies.go Outdated
}

func (d *dcAwareRR) RemoveHost(host *HostInfo) {
connAddr, _ := host.ConnectAddress()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just return if unable to get conn addr but also need to add a check for this in AddHost so hosts with invalid connect addresses are not added

policies.go Outdated
dist := d.HostTier(host)
d.hosts[dist].remove(host.ConnectAddress())
connAddr, _ := host.ConnectAddress()
d.hosts[dist].remove(connAddr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above, check in AddHost so we don't add hosts with invalid addresses and then we can just return here when unable to get connAddr

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

session.go Outdated
buf := bytes.NewBufferString("Session.ring:")
for _, h := range hosts {
buf.WriteString("[" + h.ConnectAddress().String() + ":" + h.State().String() + "]")
connAddr, _ := h.ConnectAddress()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we shouldn't really ignore these errors, it's better to handle them than assuming that connAddr is always valid, just continue if unable to get connAddr

session.go Outdated
func (qm *queryMetrics) hostMetricsLocked(host *HostInfo) *hostMetrics {
metrics, exists := qm.m[host.ConnectAddress().String()]
connAddr, _ := host.ConnectAddress()
metrics, exists := qm.m[connAddr.String()]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We definitely should not be creating metrics objects for hosts with invalid connect addresses since they won't be used anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, fixed

token.go Outdated
buf.WriteString(th.token.String())
buf.WriteString(":")
buf.WriteString(th.host.ConnectAddress().String())
buf.WriteString(connAddr.String())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this print an empty string if err != nil ? Printing empty string on the address part could be fine but we could also print nil instead

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I have checked, it will print a nil

@joao-r-reis
Copy link
Contributor

Also can you create a JIRA for this?

@tengu-alt
Copy link
Contributor Author

Also can you create a JIRA for this?

Of course, currently working on refactoring.

@tengu-alt tengu-alt force-pushed the refactor-connect-address-method branch from 8aae39b to c884b90 Compare January 16, 2025 09:43
@tengu-alt tengu-alt changed the title ConnectAddress() was refactored CASSGO-45 - ConnectAddress() was refactored Jan 16, 2025
@tengu-alt tengu-alt changed the title CASSGO-45 - ConnectAddress() was refactored CASSGO-45 ConnectAddress() was refactored Jan 16, 2025
@joao-r-reis
Copy link
Contributor

I just had a realization that we are probably going about this the wrong way. Instead of changing ConnectAddress() to return an error we should probably ensure that we never create Host objects with invalid addresses in the first place, I think that would result in a clearer API and less intrusive changes to users.

Sorry for not realizing this earlier, I know you already worked a bit on this approach but I really think we should change the approach here.

First step is to find all occurences of straight Host struct initialization and replace them with a newHostInfo(addr net.IP, port int) (*Host, error) so that the ip validation occurs here in this method instead of host.ConnectAddress(). The only place where it will require some more changes is in func (r *ringDescriber) getClusterPeerInfo(localHost *HostInfo) ([]*HostInfo, error) because here the object is created without an IP address since it will be set after reading the columns from system.peers. A possible approach is to move the logic from h.connectAddressLocked() to a function outside of the hostinfo type, use it to compute the connect address and then use the result to create the host info object afterwards. After this change, I think h.ConnectAddress() can always read the value from the h.connectAddress field. After this change, h.SetConnectAddress should be changed to return an error in case the provided address is invalid (this is only in case a user is using this function because the driver itself doesn't use it).

@tengu-alt
Copy link
Contributor Author

Understood, I will work on it.

@tengu-alt
Copy link
Contributor Author

tengu-alt commented Jan 29, 2025

A possible approach is to move the logic from h.connectAddressLocked() to a function outside of the hostinfo type, use it to compute the connect address and then use the result to create the host info object afterwards.

I don't clearly understand how we can use the logic from the h.connectAddressLocked() if it computes connectAddress according to the IP's from the host which needs to be filled. As I see those IP's are filled inside of the hostInfoFromMap() method which requires *HostInfo.

I think it is better to create raw structure here as it is now, and add the validation inside of the hostInfoFromMap()
something like this:

	ip, port := s.cfg.translateAddressPort(host.ConnectAddress(), host.port)
	if !validIpAddr(ip) {
		return nil, errors.New("invalid connect address")
	}
	host.connectAddress = ip
	host.port = port

or validate it inside of getClusterPeerInfo()

@tengu-alt tengu-alt changed the title CASSGO-45 ConnectAddress() was refactored WIP CASSGO-45 ConnectAddress() was refactored Jan 29, 2025
@joao-r-reis
Copy link
Contributor

I don't clearly understand how we can use the logic from the h.connectAddressLocked() if it computes connectAddress according to the IP's from the host which needs to be filled. As I see those IP's are filled inside of the hostInfoFromMap() method which requires *HostInfo.

You would have those IPs as parameters of the function instead for example.

I think it is better to create raw structure here as it is now, and add the validation inside of the hostInfoFromMap()
something like this:

This can also work, as long as we guarantee that we don't have HostInfo objects going around with invalid IPs then I'm ok with it. I suggested the NewHostInfo function and replacing direct struct initialization with this function call because it would be a guaranteed way to ensure HostInfo objects always have valid IP addresses in the future

@tengu-alt tengu-alt force-pushed the refactor-connect-address-method branch from c884b90 to ffe9660 Compare January 30, 2025 15:29
@tengu-alt
Copy link
Contributor Author

@joao-r-reis I've made an update. I decided to add validation into hostInfoFromMap() for cases when we can't provide connectAddress for the constructor func (I have found another place inside of func (r *ringDescriber) getLocalHostInfo() (*HostInfo, error))

Copy link
Contributor

@joao-r-reis joao-r-reis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment about existing HostInfo struct initialization references in the codebase

host_source.go Outdated

ip, port := s.cfg.translateAddressPort(host.ConnectAddress(), host.port)
if !validIpAddr(ip) {
return nil, errors.New("invalid connect address")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hostAddr := host.ConnectAddress()
return nil, fmt.Errorf("invalid connect address after translating %v:%v", hostAddr.String(), host.port)

conn.go Outdated

for _, row := range rows {
host, err := c.session.hostInfoFromMap(row, &HostInfo{connectAddress: c.host.ConnectAddress(), port: c.session.cfg.Port})
h, err := newHostInfo(c.host.connectAddress, c.session.cfg.Port)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

c.host.ConnectAdress()

@joao-r-reis joao-r-reis changed the title WIP CASSGO-45 ConnectAddress() was refactored CASSGO-45 Do not create hosts from invalid ip addresses Jan 31, 2025
@joao-r-reis joao-r-reis changed the title CASSGO-45 Do not create hosts from invalid ip addresses CASSGO-45 Return error instead of panic when host address is invalid Jan 31, 2025
@tengu-alt tengu-alt force-pushed the refactor-connect-address-method branch from ffe9660 to 9e7fbd5 Compare February 5, 2025 13:10
@tengu-alt tengu-alt requested a review from joao-r-reis February 5, 2025 13:12
host_source.go Outdated

func (h *HostInfo) SetConnectAddress(address net.IP) *HostInfo {
// TODO(zariel): should this not be exported?
func (h *HostInfo) SetConnectAddress(address net.IP) (*HostInfo, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I honestly think we should just remove this method, what do you think @jameshartig ? Allowing users to just modify the connect address on a host is weird, there's already an AddressTranslator interface for this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typically I'd like to go through a version or so of it being deprecated but since we're about to cut a major release I think it makes sense to remove it and we can always add it back if there's a reason to.

Copy link
Contributor

@joao-r-reis joao-r-reis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of comments

host_source.go Outdated

ip, port := s.cfg.translateAddressPort(host.ConnectAddress(), host.port)
if !validIpAddr(ip) {
return nil, fmt.Errorf("invalid connect address after translating %v:%v", ip.String(), host.port)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just realized this will be called even if there is no address translator so maybe changing this error message to be less confusing:

"invalid host address (before translation: %v:%v, after translation: %v:%v)", host.ConnectAddress(), host.port, ip.String(), port)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored

host_source.go Outdated
} else if h.preferredIP != nil {
return h.preferredIP, "preferred_ip"
} else if validIpAddr(h.broadcastAddress) {
} else if h.broadcastAddress != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should keep the validIpAddr check on all of these (except the last) so that the "fallback" priority still works otherwise this would be a behavior change and could cause bugs.

Just skip the check on the last one (h.peer), this will be checked either way in hostInfoFromMap before returning the host info object so we're good.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@tengu-alt tengu-alt force-pushed the refactor-connect-address-method branch from 9e7fbd5 to 12406bf Compare February 6, 2025 10:17
Copy link
Contributor

@jameshartig jameshartig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall good, but I agree, let's remove SetConnectAddress

@tengu-alt tengu-alt force-pushed the refactor-connect-address-method branch from 12406bf to 55525fe Compare February 7, 2025 09:17
@tengu-alt
Copy link
Contributor Author

Done.

@jameshartig
Copy link
Contributor

jameshartig commented Feb 13, 2025

@joao-r-reis want to give your approval?

@tengu-alt tengu-alt force-pushed the refactor-connect-address-method branch from 55525fe to ad5de46 Compare February 17, 2025 09:31
HostInfo struct creation was refactored to create via constructor to make sure the connectAddress is valid.
Panic in case of invalid connect address inside of ConnectAddress() method was removed.

patch by Oleksandr Luzhniy; reviewed by João Reis, James Hartig, for CASSGO-45
@tengu-alt tengu-alt force-pushed the refactor-connect-address-method branch from ad5de46 to 744871b Compare February 17, 2025 09:32
@tengu-alt tengu-alt changed the title CASSGO-45 Return error instead of panic when host address is invalid CASSGO-45 ConnectAddress() should not panic and HostInfo should not be created with invalid connect address Feb 17, 2025
@tengu-alt
Copy link
Contributor Author

@joao-r-reis I have updated the changelog and the commit message.

@joao-r-reis joao-r-reis merged commit 65e2caf into apache:trunk Feb 18, 2025
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants