gvforwarder as a systemd service #1003

Open
wants to merge 1 commit into
base: release-4.17
32 changes: 29 additions & 3 deletions createdisk.sh
@@ -97,12 +97,38 @@ if podman manifest inspect quay.io/crcont/routes-controller:${OPENSHIFT_VERSION}
image_tag=${OPENSHIFT_VERSION}
fi

# create the tap device interface with specified mac address
# this mac address is used to allocate a specific IP to the VM
# when tap device is in use.
${SSH} core@${VM_IP} 'sudo bash -x -s' <<EOF
nmcli connection add type tun ifname tap0 con-name tap0 mode tap autoconnect yes 802-3-ethernet.cloned-mac-address 5A:94:EF:E4:0C:EE
EOF
Contributor

Isn't this equivalent to ${SSH} core@${VM_IP} 'sudo nmcli connection add type tun ifname tap0 con-name tap0 mode tap autoconnect yes 802-3-ethernet.cloned-mac-address 5A:94:EF:E4:0C:EE'?

Collaborator

@gbraad gbraad Feb 17, 2025

In short, you are asking why this needs to be wrapped in sudo bash -x -s.
Would an error occur otherwise? I do not see any characters that the host shell (zsh, for example) would misinterpret.

@anjannath How was this solved for the self-sufficient bundle?
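
One practical difference between the two forms is where expansion happens. The sketch below is an illustration, not part of the PR: `run_remote` is a hypothetical stand-in for `${SSH} core@${VM_IP} 'sudo bash -x -s'` that pipes the script into a local bash, so it runs without a VM. The third case mirrors the `\\\${GV_VSOCK_PORT}` escaping used in the unit file later in this diff.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for: ${SSH} core@${VM_IP} 'sudo bash -x -s'
run_remote() { bash -s; }

GV_VSOCK_PORT=host-value

# 1. Unquoted delimiter: the invoking (host) shell expands variables
#    before the script text is sent.
unquoted=$(run_remote <<EOF
GV_VSOCK_PORT=remote-value
echo "${GV_VSOCK_PORT}"
EOF
)

# 2. Quoted delimiter: the text is sent verbatim and the receiving
#    shell does the expansion.
quoted=$(run_remote <<'EOF'
GV_VSOCK_PORT=remote-value
echo "${GV_VSOCK_PORT}"
EOF
)

# 3. Two nested unquoted heredocs: \\\$ survives both expansion passes,
#    so the literal text ${GV_VSOCK_PORT} reaches the output.
nested=$(run_remote <<EOF
GV_VSOCK_PORT=remote-value
cat <<TEE
url=vsock://2:\\\${GV_VSOCK_PORT}/connect
TEE
EOF
)

printf 'unquoted: %s\nquoted:   %s\nnested:   %s\n' "$unquoted" "$quoted" "$nested"
```

The nmcli line in this hunk contains nothing for the host shell to expand, so for that command alone both forms should behave the same.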

Member

@anjannath anjannath Feb 17, 2025

This is not being changed for the self-sufficient bundle. It has been tested with the existing setup, in which a container image runs gvforwarder and also contains a DHCP client script that configures the interface using the DHCP service from gvisor-tap-vsock.

Member

I think we can also scp the NetworkManager config file to /etc/NetworkManager/system-connections instead of running nmcli commands. There is a config file in: https://github.com/containers/gvisor-tap-vsock/blob/main/contrib/networkmanager/vsock0.nmconnection

[connection]
id=tap0
type=tun
autoconnect=true
interface-name=tap0

[tun]
mode=2

[802-3-ethernet]
cloned-mac-address=5A:94:EF:E4:0C:EE

[ipv4]
method=auto

[proxy]
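
A minimal sketch of that approach, under stated assumptions: `write_tap0_keyfile` and `install_tap0_keyfile` are hypothetical helper names, `${SCP}` is assumed to be defined alongside `${SSH}` in createdisk.sh, and the `/tmp` staging path is arbitrary. The profile content is the one quoted above.

```shell
#!/usr/bin/env bash
# Sketch: ship tap0 as a NetworkManager keyfile instead of running nmcli.

# Write the profile quoted above to the path given as $1.
write_tap0_keyfile() {
    local dest="$1"
    cat > "${dest}" <<'KEYFILE'
[connection]
id=tap0
type=tun
autoconnect=true
interface-name=tap0

[tun]
mode=2

[802-3-ethernet]
cloned-mac-address=5A:94:EF:E4:0C:EE

[ipv4]
method=auto

[proxy]
KEYFILE
    # Keyfiles should be readable by root only, otherwise NetworkManager
    # may refuse to load them.
    chmod 600 "${dest}"
}

# Hypothetical install step (assumes SCP, SSH and VM_IP are set as in
# createdisk.sh). Copies the file, installs it as root, reloads profiles.
install_tap0_keyfile() {
    local src="$1"
    ${SCP} "${src}" "core@${VM_IP}:/tmp/tap0.nmconnection"
    ${SSH} "core@${VM_IP}" 'sudo bash -x -s' <<'EOF'
install -o root -g root -m 600 /tmp/tap0.nmconnection /etc/NetworkManager/system-connections/tap0.nmconnection
rm -f /tmp/tap0.nmconnection
nmcli connection reload
EOF
}
```

Usage would be along the lines of `write_tap0_keyfile ./tap0.nmconnection && install_tap0_keyfile ./tap0.nmconnection`. The upside of this variant is that the profile lives in version control as a plain file; the nmcli form keeps everything in a single command.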



# Add gvisor-tap-vsock service
${SSH} core@${VM_IP} 'sudo bash -x -s' <<EOF
-podman create --name=gvisor-tap-vsock --privileged --net=host -v /etc/resolv.conf:/etc/resolv.conf -it quay.io/crcont/gvisor-tap-vsock:latest
-podman generate systemd --restart-policy=no gvisor-tap-vsock > /etc/systemd/system/gvisor-tap-vsock.service
podman create --name=gvisor-tap-vsock quay.io/crcont/gvisor-tap-vsock:latest
podman cp gvisor-tap-vsock:/vm /usr/local/bin/gvforwarder
podman rm gvisor-tap-vsock
tee /etc/systemd/system/[email protected] <<TEE
[Unit]
Description=gvisor-tap-vsock Network Traffic Forwarder
After=NetworkManager.service
Member

Should we run this after NetworkManager-wait-online.service? According to man NetworkManager-wait-online.service, it delays reaching the network-online target until NetworkManager reports on D-Bus that startup is complete.

Contributor

Not sure NetworkManager-wait-online.service can complete before gv-user-network service is started, so I'm not sure it would work to order them the opposite way.

Member

The way I see it, the gv-user-network service runs the forwarder command for an existing interface, and the NetworkManager-wait-online service makes sure that interface exists because it waits until all the network profiles are enabled. Or is my understanding wrong here?

Contributor

@cfergeau cfergeau Feb 26, 2025

The gv-user-network unit file has:

BindsTo=sys-devices-virtual-net-%i.device
After=sys-devices-virtual-net-%i.device

This makes sure the interface exists before the gv-user-network unit tries to start.
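
To make the %i substitution concrete, here is a small sketch. `device_unit_for` is a hypothetical helper, and the plain string substitution it performs only matches systemd's result for interface names that need no escaping, such as tap0.

```shell
#!/usr/bin/env bash
# Hypothetical helper: expand %i in the device unit name by hand.
# Only valid for instance names that systemd would not need to escape.
device_unit_for() {
    printf 'sys-devices-virtual-net-%s.device\n' "$1"
}

device_unit_for tap0

# On the VM, systemd-escape derives the unit name from the sysfs path
# and takes care of any escaping:
#   systemd-escape --path --suffix=device /sys/devices/virtual/net/tap0
# and the device unit can then be inspected with:
#   systemctl status sys-devices-virtual-net-tap0.device
```

With BindsTo= plus After= on that device unit, the service is started only once the device unit is active and is stopped again if the device goes away.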

man NetworkManager-wait-online.service says:

  • Startup is not complete as long as NetworkManager profiles are in an activating state.
  • When a device reaches the activate state depends on its configuration. For example, with a profile that has both IPv4 and IPv6 enabled, by default, NetworkManager considers the device as fully activated already when only one of the address families is ready.
    [...]

From this, it is not clear when a tun interface reaches the activate state. If NM waits until it gets an IP for example, this means gvforwarder must be running before the interface "activates", in which case it's problematic to order it after NetworkManager-wait-online.service. Maybe a tun interface is activated before getting an IP, in which case it would be less problematic, but I don't know from just reading the man page.

However, ordering the unit after NetworkManager and after the tun device is available seems enough to me. I don't think ordering it after NetworkManager-wait-online.service brings us anything useful?
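
As a rough way to check which explicit ordering a unit file declares, the sketch below lists its After= entries. `after_deps` and `is_ordered_after` are hypothetical helpers, and the sample unit is a trimmed copy of the [Unit] section from this diff written to a temporary file.

```shell
#!/usr/bin/env bash
# List every unit named in the After= lines of the unit file given as $1.
after_deps() {
    sed -n 's/^After=//p' "$1" | tr -s ' ' '\n' | sed '/^$/d'
}

# Succeed if unit file $1 is explicitly ordered after unit $2.
is_ordered_after() {
    after_deps "$1" | grep -qxF -- "$2"
}

# Demo on a trimmed copy of the [Unit] section from this diff.
unit=$(mktemp)
cat > "$unit" <<'UNIT'
[Unit]
Description=gvisor-tap-vsock Network Traffic Forwarder
After=NetworkManager.service
BindsTo=sys-devices-virtual-net-%i.device
After=sys-devices-virtual-net-%i.device
UNIT

after_deps "$unit"
if ! is_ordered_after "$unit" NetworkManager-wait-online.service; then
    echo "not ordered after NetworkManager-wait-online.service"
fi
rm -f "$unit"
```

This only reads the file. On a running system, `systemctl show -p After <unit>` reports the effective ordering, including dependencies systemd adds implicitly.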

Member

> it is not clear when a tun interface reaches the activate state.

Yes, and looking at the CI failure I think it hit the network failure, so let's stick to NetworkManager.service only instead of NetworkManager-wait-online.service.

Author

I tried this and we should keep the After value as it is

BindsTo=sys-devices-virtual-net-%i.device
After=sys-devices-virtual-net-%i.device

[Service]
Environment=GV_VSOCK_PORT="1024"
EnvironmentFile=-/etc/sysconfig/gv-user-network
ExecStart=/usr/local/bin/gvforwarder -preexisting -iface %i -url vsock://2:\\\${GV_VSOCK_PORT}/connect

[Install]
WantedBy=multi-user.target

TEE
systemctl daemon-reload
-systemctl enable gvisor-tap-vsock.service
systemctl enable [email protected]
Collaborator

👍

Collaborator

... I like how this can be targeted with %i, but this 'depends' on the device having been created by earlier actions.

For this increment, this would work, but it will most likely change with the self-sufficient bundle.

Contributor

The nmcli command creates a NetworkManager configuration file in the bundle, so the self-sufficient bundle should be similar from that perspective?


EOF

# Add dummy crio-wipe service to instance