Commit b4f321e

akerouanton authored and github-actions[bot] committed
vm networking: add flag vnet_hdr
When segmentation offload is enabled, and unsegmented packets are sent to a VM (i.e. when running a container in the root netns), the kernel will detect that packets are larger than expected and proceed.

That's not the case for containers (i.e. when running a container with its own netns and a veth pair). In that case, packets reach the virtio-net interface, are forwarded to the bridge, and then to the appropriate veth. Unsegmented packets with GSO fields unset are dropped by the kernel either at the bridge or at the veth level. That may be due to the current network topology, where the vnet interface is attached to a bridge.

In that case, we need to tell libkrun that the network backend sends and receives virtio_net_hdr structs with the packets. The backend needs to preserve GSO fields for VM-to-VM connections, or populate them for host-to-VM connections.

Signed-off-by: Albin Kerouanton <albin.kerouanton@docker.com>
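To make the GSO fields mentioned above concrete, here is a minimal Go sketch of the 12-byte `virtio_net_hdr` layout from the virtio 1.x specification, and of how a backend might populate it for an oversized TCP/IPv4 frame. The type `vnetHdr` and the helper `tcp4GSOHeader` are hypothetical names, not part of nerdbox or libkrun, and the sketch assumes no IP or TCP options. Which header variant libkrun actually expects on the socket (with or without the trailing `num_buffers` field) is an assumption to verify against libkrun itself.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Constants as defined by the virtio specification (network device section).
const (
	virtioNetHdrFNeedsCsum = 1 // VIRTIO_NET_HDR_F_NEEDS_CSUM
	virtioNetHdrGSONone    = 0 // VIRTIO_NET_HDR_GSO_NONE
	virtioNetHdrGSOTCPv4   = 1 // VIRTIO_NET_HDR_GSO_TCPV4
	virtioNetHdrGSOTCPv6   = 4 // VIRTIO_NET_HDR_GSO_TCPV6

	// Size of the virtio 1.x header, including num_buffers (assumed variant).
	virtioNetHdrLen = 12
)

// vnetHdr mirrors struct virtio_net_hdr (hypothetical Go name).
type vnetHdr struct {
	Flags      uint8
	GSOType    uint8
	HdrLen     uint16 // length of the Ethernet + IP + TCP headers
	GSOSize    uint16 // segment payload size (MSS)
	CsumStart  uint16 // offset where checksumming starts
	CsumOffset uint16 // offset of the checksum field after CsumStart
	NumBuffers uint16
}

// encode serializes the header in little-endian order, as virtio 1.x requires.
func (h vnetHdr) encode() []byte {
	b := make([]byte, virtioNetHdrLen)
	b[0] = h.Flags
	b[1] = h.GSOType
	binary.LittleEndian.PutUint16(b[2:], h.HdrLen)
	binary.LittleEndian.PutUint16(b[4:], h.GSOSize)
	binary.LittleEndian.PutUint16(b[6:], h.CsumStart)
	binary.LittleEndian.PutUint16(b[8:], h.CsumOffset)
	binary.LittleEndian.PutUint16(b[10:], h.NumBuffers)
	return b
}

// tcp4GSOHeader builds a header for an unsegmented TCP/IPv4 frame,
// assuming a 14-byte Ethernet header and 20-byte IP and TCP headers.
func tcp4GSOHeader(mss uint16) vnetHdr {
	const ethLen, ipLen, tcpLen = 14, 20, 20
	return vnetHdr{
		Flags:      virtioNetHdrFNeedsCsum,
		GSOType:    virtioNetHdrGSOTCPv4,
		HdrLen:     ethLen + ipLen + tcpLen,
		GSOSize:    mss,
		CsumStart:  ethLen + ipLen,
		CsumOffset: 16, // the checksum sits 16 bytes into the TCP header
	}
}

func main() {
	hdr := tcp4GSOHeader(1460)
	fmt.Printf("% x\n", hdr.encode())
}
```

A frame that needs no segmentation would instead carry `virtioNetHdrGSONone` with the size fields left at zero.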
1 parent ae64e08 commit b4f321e

2 files changed

Lines changed: 19 additions & 1 deletion

docs/vm-configuration.md

Lines changed: 2 additions & 0 deletions
@@ -59,6 +59,8 @@ that take the following fields:
   VFKIT magic sequence after connecting to the `socket`. Accept any of `1, t, T,
   TRUE, true, True, 0, f, F, FALSE, false, False`. Any other value is invalid and
   will produce an error.
+- `vnet_hdr` (optional, defaults to false): Indicate whether the VMM includes
+  virtio-net headers along with Ethernet frames.

 Note that the first network specified will be used as the default gateway.

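As a rough illustration of the accepted values, the sketch below feeds a few strings through Go's `strconv.ParseBool`, which is what the Go change in this commit uses for `vnet_hdr`. The wrapper `parseVnetHdr` is a hypothetical name introduced only for this example.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseVnetHdr is a hypothetical helper that parses a vnet_hdr value
// and wraps any error with the field name.
func parseVnetHdr(value string) (bool, error) {
	v, err := strconv.ParseBool(value)
	if err != nil {
		return false, fmt.Errorf("parsing vnet_hdr field: %w", err)
	}
	return v, nil
}

func main() {
	// "yes" is not in the accepted set and produces an error.
	for _, s := range []string{"true", "T", "0", "False", "yes"} {
		v, err := parseVnetHdr(s)
		fmt.Printf("%q -> value=%v err=%v\n", s, v, err)
	}
}
```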
internal/shim/task/networking.go

Lines changed: 17 additions & 1 deletion
@@ -31,6 +31,11 @@ import (
 	"github.com/containerd/nerdbox/internal/vm"
 )

+const (
+	NET_FLAG_VFKIT = 1 << iota // See https://github.com/containers/libkrun/blob/357ec63fee444b973e4fc76d2121fd41631f121e/include/libkrun.h#L271C9-L271C23
+	NET_FLAG_INCLUDE_VNET_HEADER
+)
+
 type networksProvider struct {
 	nws []network
 }
@@ -44,6 +49,7 @@ type network struct {
 	addr6 netip.Prefix // addr6 is the IPv6 address + subnet mask of the network interface
 	features uint32 // features is a bitmask of virtio-net features enabled on this network endpoint
 	vfkit bool // vfkit is a boolean flag indicating whether libkrun must send the VFKIT magic sequence after connecting to the socket.
+	vnetHdr bool // vnetHdr is a boolean flag indicating whether libkrun must include virtio-net headers along with Ethernet frames.
 }

 const (
@@ -58,6 +64,7 @@ const (
 	addrField = "addr"
 	featuresField = "features" // features is a bitwise-OR separated list of virtio-net features. See https://docs.oasis-open.org/virtio/virtio/v1.3/csd01/virtio-v1.3-csd01.html#x1-2370003
 	vfkitField = "vfkit" // vfkit is a boolean flag indicating whether libkrun must send the VFKIT magic sequence after connecting to the socket.
+	vnetHdrField = "vnet_hdr"

 	nwModeUnixgram = "unixgram"
 	nwModeUnixstream = "unixstream"
@@ -150,6 +157,12 @@ func parseNetwork(annotation string) (network, error) {
 				return network{}, fmt.Errorf("parsing vfkit field: %w", err)
 			}
 			n.vfkit = vfkit
+		case vnetHdrField:
+			vnetHdr, err := strconv.ParseBool(value)
+			if err != nil {
+				return network{}, fmt.Errorf("parsing vnet_hdr field: %w", err)
+			}
+			n.vnetHdr = vnetHdr
 		default:
 			return network{}, fmt.Errorf("unknown network field: %s", key)
 		}
@@ -181,7 +194,10 @@ func (p *networksProvider) SandboxOptions() []sandbox.Opt {

 		var flags uint32
 		if nw.vfkit {
-			flags = 1 // See https://github.com/containers/libkrun/blob/357ec63fee444b973e4fc76d2121fd41631f121e/include/libkrun.h#L271C9-L271C23
+			flags = NET_FLAG_VFKIT
+		}
+		if nw.vnetHdr {
+			flags |= NET_FLAG_INCLUDE_VNET_HEADER
 		}

 		opts = append(opts, sandbox.WithNIC(nw.endpoint, nw.mac, int(nwMode), nw.features, flags))
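The following standalone sketch shows what the `1 << iota` constant block evaluates to and how the two booleans combine into the flags bitmask. The helper `nicFlags` is a hypothetical name that only mirrors the logic in `SandboxOptions`. Whether the resulting value of `NET_FLAG_INCLUDE_VNET_HEADER` matches the definition in the libkrun header is an assumption to check against the libkrun version in use.

```go
package main

import "fmt"

const (
	NET_FLAG_VFKIT               = 1 << iota // iota is 0 here, so this is 1
	NET_FLAG_INCLUDE_VNET_HEADER             // iota is 1 here, so this is 2
)

// nicFlags is a hypothetical helper that combines the two options
// into a single bitmask, as the diff above does inline.
func nicFlags(vfkit, vnetHdr bool) uint32 {
	var flags uint32
	if vfkit {
		flags |= NET_FLAG_VFKIT
	}
	if vnetHdr {
		flags |= NET_FLAG_INCLUDE_VNET_HEADER
	}
	return flags
}

func main() {
	fmt.Println(nicFlags(false, false), nicFlags(true, false), nicFlags(false, true), nicFlags(true, true))
}
```

Because each flag occupies its own bit, the two options can be enabled independently and a receiver can test for either one with a bitwise AND.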
