Let NetworkManager manage resolv.conf after all downloads #1001
Changing NetworkManager to manage resolv.conf results in the resolv.conf entries written by cloud-init being overwritten when using unprovisioned nodes. Let's change it just before running os-net-config so that tasks to download packages do not fail.
jira: https://issues.redhat.com/browse/OSPRH-19018
Signed-off-by: rabi <[email protected]>
Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/08b46ef8ba9540919ed2bc89543da1fe ✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 52m 59s
recheck
Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/e54e5842f4674649be9909eec7e07e09 ✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 55m 41s
recheck
slagle
left a comment
LGTM, but the nfv folks should probably review as well since they originally put this change in.
@dsneddon @Jaganathancse
retries: "{{ edpm_network_config_download_retries }}"
delay: "{{ edpm_network_config_download_delay }}"

- name: Import DNS NetworkManager configs tasks
Thanks Rabi for pushing this. I came across it while looking at an issue in a customer deployment caused by this.
It looks like this will clear the issue in the success case, i.e. when os-net-config apply succeeds. But if that fails for any reason, the next configure-network run will fail and stay stuck as before unless someone intervenes manually. WDYT, or am I missing something here? Considering it only changes the resolv.conf parts, maybe we can move this to after os-net-config runs?
AFAIK, those changes cannot be moved after the os-net-config run, as the DNS changes are made by NetworkManager during the os-net-config run. I think openstack-k8s-operators/edpm-image-builder#85 would fix all cases, as reloading NetworkManager any number of times won't have any impact after that, but for customers using images without that fix this would be useful.
But considering that the run proceeds without any new DNS config being applied on the system during the os-net-config run when edpm_bootstrap_network_resolvconf_update=false, it looks like moving it to a later stage should work; the new config would just roll out a step later.
Also, we can't even use edpm_bootstrap_network_resolvconf_update=false, as it has other issues with nmstate enabled.
Though theoretically, if we reload NetworkManager after the os-net-config run, it should update resolv.conf, I've not tested whether leaving dns=none in 99-cloud-init.conf while os-net-config runs has any impact. If this can be tested and works, we can move it after the os-net-config run. Feel free to update the PR. However, IMO we should probably focus on openstack-k8s-operators/edpm-image-builder#85, which would fix this without any of these hacks, assuming we want everyone to move to the nmstate provider with os-net-config.
Also, a failed os-net-config run would normally mess things up anyway, so I wouldn't worry too much about that use case here.
Checked more on this and proposed openstack-k8s-operators/openstack-baremetal-operator#316; let's see if that goes fine too.
Not sure if openstack-k8s-operators/edpm-image-builder#85 will work in this scenario considering the above patch, or has that already been validated in this scenario?
Not sure if openstack-k8s-operators/edpm-image-builder#85 will work.
I think I've tested it.
[cloud-admin@edpm-compute-0 ~]$ cat /etc/resolv.conf
; Created by cloud-init automatically, do not edit.
;
nameserver 192.168.122.80
search ctlplane.example.com
a. Remove dns=none
[cloud-admin@edpm-compute-0 ~]$ NetworkManager --print-config
# NetworkManager configuration: /etc/NetworkManager/NetworkManager.conf, /usr/lib/NetworkManager/conf.d/00-server.conf, /etc/NetworkManager/conf.d/99-cloud-init.conf
[main]
# plugins=
# rc-manager=auto
# migrate-ifcfg-rh=false
# auth-polkit=true
# dhcp=internal
# iwd-config-path=
no-auto-default=*
ignore-carrier=*
configure-and-quit=no
[logging]
# backend=journal
# audit=false
[device]
# wifi.backend=wpa_supplicant
# no-auto-default file "/var/lib/NetworkManager/no-auto-default.state"
[cloud-admin@edpm-compute-0 ~]$ sudo systemctl reload NetworkManager
[cloud-admin@edpm-compute-0 ~]$ cat /etc/resolv.conf
# Generated by NetworkManager
b. Update renderer
[cloud-admin@edpm-compute-0 ~]$ cat /etc/cloud/cloud.cfg | grep renderer
renderers: ['network-manager', 'sysconfig', 'eni', 'netplan', 'networkd']
[cloud-admin@edpm-compute-0 ~]$ sudo cloud-init clean --logs --reboot
Connection to 192.168.122.100 closed by remote host.
Connection to 192.168.122.100 closed.
c. Clean the network connection configured with sysconfig earlier (this won't be required once the image has the cloud.cfg changes)
[cloud-admin@edpm-compute-0 ~]$ nmcli connection
NAME UUID TYPE DEVICE
System enp1s0 c0ab6b8c-0eac-a1b4-1c47-efe4b2d1191f ethernet enp1s0
lo a3135099-7820-4f7a-94f5-c48210ae43eb loopback lo
cloud-init enp1s0 a41601f3-3acc-5f60-ac5f-9d9011ab7c25 ethernet --
ens3 35f7245a-9a2b-4111-ba3e-b6fb322a1f25 ethernet --
[cloud-admin@edpm-compute-0 ~]$ sudo nmcli connection delete c0ab6b8c-0eac-a1b4-1c47-efe4b2d1191f
Connection 'System enp1s0' (c0ab6b8c-0eac-a1b4-1c47-efe4b2d1191f) successfully deleted.
[cloud-admin@edpm-compute-0 ~]$ nmcli connection
NAME UUID TYPE DEVICE
cloud-init enp1s0 a41601f3-3acc-5f60-ac5f-9d9011ab7c25 ethernet enp1s0
lo a3135099-7820-4f7a-94f5-c48210ae43eb loopback lo
ens3 35f7245a-9a2b-4111-ba3e-b6fb322a1f25 ethernet --
[cloud-admin@edpm-compute-0 ~]$ cat /etc/resolv.conf
# Generated by NetworkManager
search ctlplane.example.com
nameserver 192.168.122.80
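Step (a) above can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name `strip_dns_none` is hypothetical, and it assumes `dns=none` sits on its own line in the file being edited. The reload is left as a comment because it needs root on a real node.

```shell
# Hypothetical helper: remove a standalone "dns=none" line from a
# NetworkManager config file (path passed as $1).
strip_dns_none() {
    conf="$1"
    tmp="$(mktemp)" || return 1
    rc=0
    # Delete lines that contain only dns=none, allowing spaces around "=".
    # The result is written back in place so the file keeps its owner and mode.
    sed '/^[[:space:]]*dns[[:space:]]*=[[:space:]]*none[[:space:]]*$/d' "$conf" > "$tmp" \
        && cat "$tmp" > "$conf" || rc=1
    rm -f "$tmp"
    return "$rc"
}

# On a node, the edit would be followed by a reload (not run here):
#   sudo systemctl reload NetworkManager
```

For example, `strip_dns_none /etc/NetworkManager/conf.d/99-cloud-init.conf` (run as root) would correspond to the manual edit in the transcript.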
<< Also, failed os-net-config run would normally mess up things anyway, so I won't overly bother about that use-case here.
But that depends on how it failed. For example, if the failure is due to a wrong os-net-config configuration, on the next attempt we can fix the config in the nodeset and rerun, and that should work; but with the current proposal it will just get stuck if the nameserver got wiped in the previous run.
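The stuck state described here (a resolv.conf left without any nameserver after a failed run) could be detected with a check along these lines. This is a minimal sketch: `has_nameserver` is a hypothetical helper, not something the role provides.

```shell
# Hypothetical helper: succeed if the given resolv.conf-style file
# contains at least one "nameserver <address>" entry.
has_nameserver() {
    grep -Eq '^[[:space:]]*nameserver[[:space:]]+[^[:space:]]+' "$1"
}

# Possible use on a node (not run here):
#   has_nameserver /etc/resolv.conf || echo "no nameserver configured" >&2
```

A file holding only the `# Generated by NetworkManager` header, as in the transcript above, would fail this check.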
Though theoretically, if we reload NetworkManager after the os-net-config run, it should update resolv.conf, I've not tested whether leaving dns=none in 99-cloud-init.conf while os-net-config runs has any impact. If this can be tested and works, we can move it after the os-net-config run. Feel free to update the PR. However, IMO we should probably focus on openstack-k8s-operators/edpm-image-builder#85, which would fix this without any of these hacks, assuming we want everyone to move to the nmstate provider with os-net-config.
OK, looking more into it: with openstack-k8s-operators/openstack-baremetal-operator#316 and/or openstack-k8s-operators/edpm-image-builder#85 this PR shouldn't be needed, but it also won't hurt, so the concerns raised about failure cases are not very relevant. I have also done some tests with nmstate=false, and those went fine too.
with openstack-k8s-operators/openstack-baremetal-operator#316
I've commented in above PR but adding it here as well...
We should allow global DNS settings as per the OpenStack network_data schema https://docs.openstack.org/nova/latest/_downloads/9119ca7ac90aa2990e762c08baea3a36/network_data.json and not only interface-level ones as done in this PR. We allow users to use custom networkData in the nodeset spec, and that can be anything the schema permits. As the current default is to use the nmstate provider with os-net-config (and we plan to remove support for ifcfg scripts), we should switch the renderer as proposed in openstack-k8s-operators/edpm-image-builder#85.
- /etc/NetworkManager/NetworkManager.conf
- /etc/NetworkManager/conf.d/99-cloud-init.conf
- name: Set 'rc-manager=unmanaged' in /etc/NetworkManager/NetworkManager.conf
- name: Unset 'rc-manager=unmanaged' in /etc/NetworkManager/NetworkManager.conf
The Reload task below can also be made conditional, i.e. there is no need to reload if the desired config is already in place.
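The idea can be illustrated outside Ansible with a shell sketch: compare the desired config with what is on disk and reload only when they differ. The helper name `nm_reload_needed` and the file arguments are hypothetical; in the role itself this would more likely be a `when:` condition on a registered result.

```shell
# Hypothetical helper: succeed (exit 0) when the file on disk ($2)
# differs from the desired config ($1), i.e. a reload is warranted.
nm_reload_needed() {
    if cmp -s "$1" "$2"; then
        return 1   # identical: nothing to do
    fi
    return 0       # different or missing: reload needed
}

# Possible use on a node (not run here):
#   if nm_reload_needed desired.conf /etc/NetworkManager/NetworkManager.conf; then
#       sudo cp desired.conf /etc/NetworkManager/NetworkManager.conf
#       sudo systemctl reload NetworkManager
#   fi
```

Skipping the reload when nothing changed keeps repeated runs from touching NetworkManager at all.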
Jaganathancse
left a comment
Looks Good. Thanks Rabi for this PR.
@rabi
Looks like the edpm_network_config_tool 'nmstate' block also requires these dns_nm_configs.yml changes.
- name: Configure network with network role from system roles [nmstate]
  when: edpm_network_config_tool == 'nmstate'
  become: true
  block:
    - name: Render network_state variable
      ansible.builtin.set_fact:
        network_state: "{{ edpm_network_config_template | from_yaml }}"
    - name: Load system-roles.network tasks [nmstate]
      ansible.builtin.include_role:
        name: "{{ lookup('ansible.builtin.env', 'EDPM_SYSTEMROLES', default='fedora.linux_system_roles') + '.network' }}"
- name: Render network_state variable
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: rabi
state: restarted
when: nm_ovs_status.changed # noqa: no-handler

- name: Import DNS NetworkManager configs tasks
This task only configures whether DNS updates will be handled by NM or not.
Does this update the DNS config for the nmstate provider when the initial setup uses a cloud-init nmstate config?
Closing this as openstack-k8s-operators/edpm-image-builder#85 has merged. But there are still issues with minor updates as mentioned in #1007 (comment)
Changing NetworkManager to manage resolv.conf results in resolv.conf entries by cloud-init overwritten when using unprovisioned nodes. Let's change it just before running os-net-config so that tasks to download packages do not fail.
This is a regression from #908
jira: https://issues.redhat.com/browse/OSPRH-19018