gvforwarder as a systemd service #1003
base: release-4.17
Skipping CI for Draft Pull Request.
/test all
I'd recommend also picking up the changes from https://github.com/cfergeau/snc/commits/gvisor_service/ which update the unit files used in this PR to use a unit file close to https://github.com/containers/gvisor-tap-vsock/tree/main/contrib/systemd
With the current code, I still have this question/concern: #673 (comment)
fb9c40e to 6278601
/retest
/retest
04595ac to 24924c0
/retest
24924c0 to 327901a
/retest
/retest
327901a to 5264a19
/retest
/retest
/retest
/retest
/retest
/retest
ccc602e to 4e0a92e
/retest
4e0a92e to a999631
/retest
a999631 to d1501b4
/retest
b83b47e to 53cf03b
53cf03b to 6cf746e
Looks good to me, thanks a lot for putting this into shape/testing it.
/test e2e-snc
6cf746e to abece15
- Create a tap device using nmcli with a hardcoded MAC address
- Start the gvforwarder systemd service, which will use this device
Signed-off-by: vyasgun <[email protected]>
abece15 to cf5affc
# when tap device is in use.
${SSH} core@${VM_IP} 'sudo bash -x -s' <<EOF
nmcli connection add type tun ifname tap0 con-name tap0 mode tap autoconnect yes 802-3-ethernet.cloned-mac-address 5A:94:EF:E4:0C:EE
EOF
Isn't this equivalent to the following?
${SSH} core@${VM_IP} 'sudo nmcli connection add type tun ifname tap0 con-name tap0 mode tap autoconnect yes 802-3-ethernet.cloned-mac-address 5A:94:EF:E4:0C:EE'
In short, you are questioning why this needs to be wrapped in sudo bash -x -s.
Would an error occur otherwise? I do not see any characters that the host shell (zsh, for example) would interpret wrongly.
@anjannath How was this solved for the self-sufficient bundle?
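The equivalence being asked about can be checked locally with a small sketch. Everything here is a stand-in: run_remote is a hypothetical helper that plays the role of ${SSH} core@${VM_IP}, echo replaces the real nmcli call so that nothing gets configured, and sudo and -x are left out (-x only adds tracing on stderr).

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for "${SSH} core@${VM_IP}": runs the command string
# in a local shell and passes stdin through, roughly as ssh does remotely.
run_remote() {
    bash -c "$1"
}

# Form 1: the script is supplied on stdin to "bash -s" (the heredoc form).
via_heredoc=$(run_remote 'bash -s' <<EOF
echo nmcli connection add type tun ifname tap0
EOF
)

# Form 2: the same command is passed directly as the command string.
via_argument=$(run_remote 'echo nmcli connection add type tun ifname tap0')

echo "heredoc:  ${via_heredoc}"
echo "argument: ${via_argument}"
```

For a single command with no special characters, both forms should produce the same result; the heredoc form mainly pays off when several commands have to run under one sudo.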
This is not being changed for the self-sufficient bundle; it has been tested with the existing setup, in which a container image runs gvforwarder, and that container also has a DHCP client script which configures the interface using the DHCP service from gvisor-tap-vsock.
I think we can also scp the NetworkManager config file to /etc/NetworkManager/system-connections instead of running nmcli commands. There is an example config file at: https://github.com/containers/gvisor-tap-vsock/blob/main/contrib/networkmanager/vsock0.nmconnection
[connection]
id=tap0
type=tun
autoconnect=true
interface-name=tap0
[tun]
mode=2
[802-3-ethernet]
cloned-mac-address=5A:94:EF:E4:0C:EE
[ipv4]
method=auto
[proxy]
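A minimal sketch of the file-based alternative, assuming the profile above is written out as a keyfile. The function name write_tap0_profile and the file name tap0.nmconnection are assumptions made for illustration; the sketch writes into a scratch directory rather than the real /etc/NetworkManager/system-connections on the VM.

```shell
#!/usr/bin/env bash
# Write the tap0 profile from the comment above into the given directory.
# On the VM the directory would be /etc/NetworkManager/system-connections.
# (Function name and file name are hypothetical.)
write_tap0_profile() {
    local dest_dir=$1
    local profile="${dest_dir}/tap0.nmconnection"
    cat > "${profile}" <<'EOF'
[connection]
id=tap0
type=tun
autoconnect=true
interface-name=tap0

[tun]
mode=2

[802-3-ethernet]
cloned-mac-address=5A:94:EF:E4:0C:EE

[ipv4]
method=auto

[proxy]
EOF
    # NetworkManager ignores keyfiles that other users can read.
    chmod 600 "${profile}"
    echo "${profile}"
}

# Local dry run into a scratch directory instead of the real VM path.
scratch_dir=$(mktemp -d)
profile_path=$(write_tap0_profile "${scratch_dir}")
echo "wrote ${profile_path}"
```

On a running system, a file copied into place this way would also need to be owned by root and picked up with something like nmcli connection reload; in a bundle that is rebooted before use, that step is presumably unnecessary.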
@praveenkumar can you take a look at this PR? You also looked into this in the past.
I would approve, but I question the use of this command. First, why is it invoked like this? And beyond that, why is the creation of the tap not a systemd unit of its own? In that case you could depend on it.
systemctl daemon-reload
systemctl enable gvisor-tap-vsock.service
systemctl enable [email protected]
👍
I like how this can be targeted with %i, but this 'depends' on the device having been created by an earlier step.
For this increment, this would work, but it would most likely change with the self-sufficient bundle.
The nmcli command creates a NetworkManager configuration file in the bundle; shouldn't the self-sufficient bundle be similar from that perspective?
In a way, the creation of the tap is part of a systemd unit: it is added to the NetworkManager configuration files, and NetworkManager is started through systemd.
tee /etc/systemd/system/[email protected] <<TEE
[Unit]
Description=gvisor-tap-vsock Network Traffic Forwarder
After=NetworkManager.service
Should we run this after NetworkManager-wait-online.service? According to man NetworkManager-wait-online.service, it delays reaching the network-online target until NetworkManager reports on D-Bus that startup is complete.
I'm not sure NetworkManager-wait-online.service can complete before the gv-user-network service is started, so I'm not sure it would work to order them the opposite way.
The way I see it, the gv-user-network service runs the forwarder command for an existing interface, and the NetworkManager-wait-online service makes sure that interface exists because it waits until all the network profiles are enabled. Or is my understanding wrong here?
The gv-user-network unit file has:
BindsTo=sys-devices-virtual-net-%i.device
After=sys-devices-virtual-net-%i.device
This makes sure the interface exists before the gv-user-network unit tries to start.
man NetworkManager-wait-online.service says:
- Startup is not complete as long as NetworkManager profiles are in an activating state.
- When a device reaches the activate state depends on its configuration. For example, with a profile that has both IPv4 and IPv6 enabled, by default, NetworkManager considers the device as fully activated already when only one of the address families is ready.
[...]
From this, it is not clear when a tun interface reaches the activate state. If NM waits until it gets an IP, for example, this means gvforwarder must be running before the interface "activates", in which case it's problematic to order it after NetworkManager-wait-online.service. Maybe a tun interface is activated before getting an IP, in which case it would be less problematic, but I don't know from just reading the man page.
However, ordering the unit after NetworkManager and after the tun device is available seems enough to me; I don't think ordering it after NetworkManager-wait-online.service brings us anything useful.
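The ordering argued for here can be sketched as follows. This shows only the [Unit] section, assembled from the lines quoted in this thread; the [Service] section is not quoted in the conversation and is deliberately left out, and the print_unit_section helper is hypothetical.

```shell
#!/usr/bin/env bash
# Print only the ordering-related [Unit] section of the template unit.
# %i stays literal here; systemd expands it to the instance name
# (for example tap0) when the template is instantiated.
print_unit_section() {
    cat <<'EOF'
[Unit]
Description=gvisor-tap-vsock Network Traffic Forwarder
# Start after NetworkManager itself, not after NetworkManager-wait-online.service.
After=NetworkManager.service
# Tie the unit to the network device and start only once the device exists.
BindsTo=sys-devices-virtual-net-%i.device
After=sys-devices-virtual-net-%i.device
EOF
}

unit_section=$(print_unit_section)
echo "${unit_section}"
```

With this arrangement the device unit, rather than the network-online target, gates the start of the forwarder, so the forwarder does not have to wait on an interface state that may itself depend on the forwarder running.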
it is not clear when a tun interface reaches the activate state.
Yes, and looking at the CI failure I think it hit a network failure, so let's stick to NetworkManager.service only instead of NetworkManager-wait-online.service.
I tried this, and we should keep the After value as it is.
f4bab4e to cf5affc
@vyasgun: The following test failed, say
Based on the following code:
#673
cfergeau@03a4054