Refactor aro-dnsmasq-pre.sh to not overwrite /etc/resolv.conf #4100
base: master
/azp run ci,e2e
Azure Pipelines successfully started running 2 pipeline(s).
What testing do you think we should do before this merges?
Do we need to make a corresponding change to what the installer puts down? My understanding is that for new clusters, the changes in our operator won't get applied until the cluster is first upgraded, and the cluster will run with what the installer has until then.
@tsatam When the Operator is first installed, it is set to allow all reconciliations, and at the end of the install process that is switched to reconcile only on upgrades. So this will apply to new clusters (at the cost of a reboot + install time, so we should also update the installer wrapper).
830431a to 2c2ca5c
I've tested this now in a UDR + misconfigured DNS cluster (vnet dns = 172.16.0.0). I set After all nodes roll out they have the following config, which looks good
I made sure all the cluster operators were healthy and that worker machinesets can scale up. N.B. even with this change we still end up touching /etc/resolv.conf through dnsmasq.service's ExecStopPost=/bin/bash -c '/bin/mv /etc/resolv.conf.dnsmasq /etc/resolv.conf; /usr/sbin/restorecon /etc/resolv.conf'. I'll fix that up to delete
A concern I have with this is we're losing
Ok I've fixed the search domain by adding a
Companion installer PR openshift/installer-aro-wrapper#255
pkg/operator/controllers/dnsmasq/scripts/aro-dnsmasq-pre.sh.gotmpl
echo "$LOCAL_IPS_RAW" | while read -r line
do
    echo "nameserver $line" | cut -d'/' -f 1 >> $TMPSELFRESOLV
done
Unless you were having trouble with the code to retrieve the search domains and IP addresses, I'd be tempted to keep it the same.
We do have problems with the existing code. It tries to make guesses about which network interface to use based on if the interface br-ex exists. We've seen a number of instances where this fails, particularly if the service startup order changes.
@zaneb Another approach to finding the search domain:
for DEV in $(nmcli --fields device,state,type --terse device | awk 'BEGIN {FS=":"} ; {if ($2 == "connected") { print $1 }}'); do nmcli dev show $DEV | awk 'BEGIN {FS=":\\s*"}; { if ($1 ~ /DOMAIN/ && $2 ~ /.+/) { print $2} }'; done | sort -u
Fair enough, non-determinism is definitely not what you want here 😄
Looping over all interfaces doesn't look that bad though.
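A minimal sketch of the loop-over-all-interfaces idea, split into two filters so the parsing can be exercised without NetworkManager running. The function names `connected_devices` and `search_domains` are hypothetical and not part of this PR; the input layouts assumed are the terse `device:state:type` lines and the `KEY:   value` lines of `nmcli device show`, as used in the one-liner above.

```shell
#!/bin/bash
# Hypothetical helpers; names are illustrative, not taken from the PR.

# Read terse "device:state:type" lines on stdin and print the names of
# devices whose state is exactly "connected".
connected_devices() {
    awk -F: '$2 == "connected" { print $1 }'
}

# Read "KEY:   value" lines on stdin and print every non-empty value
# whose key contains DOMAIN, sorted and de-duplicated.
search_domains() {
    awk -F': *' '$1 ~ /DOMAIN/ && $2 != "" { print $2 }' | sort -u
}

# Possible use on a node (assumes nmcli is installed):
#   for dev in $(nmcli --terse --fields device,state,type device | connected_devices); do
#       nmcli device show "$dev"
#   done | search_domains
```

Because every connected device is inspected, the result does not depend on whether a particular interface such as br-ex exists yet.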
Which issue this PR addresses:
Fixes ARO-15180
Companion installer PR openshift/installer-aro-wrapper#255
Derived from the method in https://github.com/openshift/machine-config-operator/blob/master/templates/common/gcp/files/usr-local-bin-update-dns-server.yaml
What this PR does / why we need it:
We've been overwriting /etc/resolv.conf. NetworkManager owns this file, and if NetworkManager needs to refresh it we will lose our changes. Instead, create a NetworkManager drop-in, /etc/NetworkManager/conf.d/dns-servers.conf, with the node's IP.
Test plan for issue:
Is there any documentation that needs to be updated for this PR?
No, but the change needs to be socialized amongst ARO SRE since it affects how nameservers are managed.
How do you know this will function as expected in production?
Testing has been done on an existing UDR cluster with misconfigured DNS to ensure there are no external DNS dependencies. Nodes boot and scale correctly.
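For readers unfamiliar with NetworkManager drop-ins, the approach in the description could look roughly like the sketch below. This is an illustration under assumptions, not the script from this PR: the function name `render_dns_dropin` is hypothetical, and the `[global-dns]` and `[global-dns-domain-*]` sections are standard NetworkManager.conf sections that may differ from what the PR actually writes.

```shell
#!/bin/bash
# render_dns_dropin NODE_IP [SEARCH_DOMAIN...]
# Hypothetical helper: print a NetworkManager drop-in that points the
# resolver at NODE_IP and, if given, sets the search domains.
render_dns_dropin() {
    local node_ip="$1"
    shift
    local searches
    # Join the remaining arguments with commas (empty if there are none).
    searches=$(IFS=,; echo "$*")
    if [ -n "$searches" ]; then
        printf '[global-dns]\nsearches=%s\n\n' "$searches"
    fi
    printf '[global-dns-domain-*]\nservers=%s\n' "$node_ip"
}

# Possible use on a node (the reload step is an assumption):
#   render_dns_dropin "$NODE_IP" > /etc/NetworkManager/conf.d/dns-servers.conf
#   nmcli general reload
```

Keeping the rendering separate from the write makes it easy to compare the new content with the existing file and skip the reload when nothing changed.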