Shared network for vfkit driver using vmnet-helper #20501

Merged
merged 7 commits into from
May 1, 2025

nirs
Contributor

@nirs nirs commented Mar 7, 2025

Add new network option for vfkit "vmnet-shared", connecting vfkit to the
vmnet shared network. Clusters using this network can access other
clusters in the same network, similar to socket_vmnet with the QEMU driver.

If network is not specified, we default to the "nat" network, keeping
the previous behavior. If network is "vmnet-shared", the vfkit driver
manages 2 processes: vfkit and vmnet-helper.
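
The defaulting rule described above can be sketched in Go. This is only an illustration, not the driver code from this PR: the function name `resolveNetwork` and the error text are hypothetical, and only the option names `nat` and `vmnet-shared` come from the description.

```go
package main

import (
	"fmt"
)

// resolveNetwork is a hypothetical helper (not taken from the minikube
// source). It maps the value of --network to the network the vfkit driver
// would use. An empty value keeps the previous behavior ("nat").
func resolveNetwork(network string) (string, error) {
	switch network {
	case "", "nat":
		return "nat", nil
	case "vmnet-shared":
		return "vmnet-shared", nil
	default:
		return "", fmt.Errorf("unsupported network %q for vfkit", network)
	}
}

func main() {
	// "bridged" is an arbitrary unsupported value used to show the error path.
	for _, n := range []string{"", "nat", "vmnet-shared", "bridged"} {
		resolved, err := resolveNetwork(n)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%q -> %s\n", n, resolved)
	}
}
```

Rejecting unknown values instead of silently falling back to `nat` is a design choice made for this sketch; the PR description does not say how unknown values are handled.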

Like vfkit, vmnet-helper is started in the background, in a new process
group, so it is not terminated if the minikube process group is terminated.

Since vmnet-helper requires root to start the vmnet interface, we start
it with sudo, creating 2 child processes. vmnet-helper drops privileges
immediately after starting the vmnet interface, and runs as the user and
group running minikube.

Stopping the cluster will stop sudo, which will stop the vmnet-helper
process. Deleting the cluster kills both sudo and vmnet-helper by killing
the process group.

This change is not complete, but it is good enough to play with the new
shared network.

Example usage:

  1. Install vmnet-helper:
    https://github.com/nirs/vmnet-helper?tab=readme-ov-file#installation

  2. Set up the vmnet-helper sudoers rule:
    https://github.com/nirs/vmnet-helper?tab=readme-ov-file#granting-permission-to-run-vmnet-helper

  3. Start 2 clusters with vmnet-shared network:

% minikube start -p c1 --driver vfkit --network vmnet-shared
...

% minikube start -p c2 --driver vfkit --network vmnet-shared
...

% minikube ip -p c1
192.168.105.18

% minikube ip -p c2
192.168.105.19

  4. Each cluster can access the other cluster:

% minikube -p c1 ssh -- ping -c 3 192.168.105.19
PING 192.168.105.19 (192.168.105.19): 56 data bytes
64 bytes from 192.168.105.19: seq=0 ttl=64 time=0.621 ms
64 bytes from 192.168.105.19: seq=1 ttl=64 time=0.989 ms
64 bytes from 192.168.105.19: seq=2 ttl=64 time=0.490 ms

--- 192.168.105.19 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.490/0.700/0.989 ms

% minikube -p c2 ssh -- ping -c 3 192.168.105.18
PING 192.168.105.18 (192.168.105.18): 56 data bytes
64 bytes from 192.168.105.18: seq=0 ttl=64 time=0.289 ms
64 bytes from 192.168.105.18: seq=1 ttl=64 time=0.798 ms
64 bytes from 192.168.105.18: seq=2 ttl=64 time=0.993 ms

--- 192.168.105.18 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.289/0.693/0.993 ms

To complete this work we need:

  • Install vmnet-helper on the CI macOS hosts
  • Add test using --network vmnet-shared

Fixes #20557
Fixes #20558

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Mar 7, 2025
@k8s-ci-robot
Contributor

Hi @nirs. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 7, 2025
@minikube-bot
Collaborator

Can one of the admins verify this patch?

@nirs
Contributor Author

nirs commented Mar 7, 2025

@afbjorklund can you review this?

@nirs nirs force-pushed the vmnet-helper branch 3 times, most recently from 86b449f to 6bf1d56 Compare March 9, 2025 22:38
@nirs
Contributor Author

nirs commented Mar 9, 2025

Example machine config when vmnet-shared is used:

{
    "ConfigVersion": 3,
    "Driver": {
        "IPAddress": "192.168.105.21",
        "MachineName": "minikube",
        "SSHUser": "docker",
        "SSHPort": 22,
        "SSHKeyPath": "",
        "StorePath": "/Users/nir/.minikube",
        "SwarmMaster": false,
        "SwarmHost": "",
        "SwarmDiscovery": "",
        "Boot2DockerURL": "file:///Users/nir/.minikube/cache/iso/arm64/minikube-v1.35.0-arm64.iso",
        "DiskSize": 20000,
        "CPU": 2,
        "Memory": 6000,
        "Cmdline": "",
        "ExtraDisks": 0,
        "Network": "vmnet-shared",
        "MACAddress": "de:a2:9c:71:3c:f7",
        "VmnetHelper": {
            "MachineDir": "/Users/nir/.minikube/machines/minikube",
            "InterfaceID": "a0c43efb-2dcc-4abb-a310-568782c5dc7a"
        }
    },
    "DriverName": "vfkit",
    "HostOptions": {
        "Driver": "",
        "Memory": 0,
        "Disk": 0,
        "EngineOptions": {
            "ArbitraryFlags": null,
            "Dns": null,
            "GraphDir": "",
            "Env": null,
            "Ipv6": false,
            "InsecureRegistry": [
                "10.96.0.0/12"
            ],
            "Labels": null,
            "LogLevel": "",
            "StorageDriver": "",
            "SelinuxEnabled": false,
            "TlsVerify": false,
            "RegistryMirror": [],
            "InstallURL": "https://get.docker.com"
        },
        "SwarmOptions": {
            "IsSwarm": false,
            "Address": "",
            "Discovery": "",
            "Agent": false,
            "Master": false,
            "Host": "",
            "Image": "",
            "Strategy": "",
            "Heartbeat": 0,
            "Overcommit": 0,
            "ArbitraryFlags": null,
            "ArbitraryJoinFlags": null,
            "Env": null,
            "IsExperimental": false
        },
        "AuthOptions": {
            "CertDir": "/Users/nir/.minikube",
            "CaCertPath": "/Users/nir/.minikube/certs/ca.pem",
            "CaPrivateKeyPath": "/Users/nir/.minikube/certs/ca-key.pem",
            "CaCertRemotePath": "",
            "ServerCertPath": "/Users/nir/.minikube/machines/server.pem",
            "ServerKeyPath": "/Users/nir/.minikube/machines/server-key.pem",
            "ClientKeyPath": "/Users/nir/.minikube/certs/key.pem",
            "ServerCertRemotePath": "",
            "ServerKeyRemotePath": "",
            "ClientCertPath": "/Users/nir/.minikube/certs/cert.pem",
            "ServerCertSANs": null,
            "StorePath": "/Users/nir/.minikube"
        }
    },
    "Name": "minikube"
}
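
For readers who want to inspect the vmnet-related fields of such a config, here is a minimal Go sketch. The struct types below are hypothetical and cover only a few keys from the JSON above; they are not the driver's real types. The sample input is a trimmed copy of the config shown in this comment.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// vmnetHelper, driverConfig and machineConfig are hypothetical types that
// mirror a subset of the JSON keys shown above.
type vmnetHelper struct {
	MachineDir  string
	InterfaceID string
}

type driverConfig struct {
	IPAddress   string
	Network     string
	MACAddress  string
	VmnetHelper *vmnetHelper // nil when the key is missing from the JSON
}

type machineConfig struct {
	DriverName string
	Driver     driverConfig
}

// parseMachineConfig decodes the subset of fields we care about and
// ignores every other key in the document.
func parseMachineConfig(data []byte) (*machineConfig, error) {
	var cfg machineConfig
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, err
	}
	return &cfg, nil
}

func main() {
	data := []byte(`{
		"Driver": {
			"IPAddress": "192.168.105.21",
			"Network": "vmnet-shared",
			"MACAddress": "de:a2:9c:71:3c:f7",
			"VmnetHelper": {
				"MachineDir": "/Users/nir/.minikube/machines/minikube",
				"InterfaceID": "a0c43efb-2dcc-4abb-a310-568782c5dc7a"
			}
		},
		"DriverName": "vfkit"
	}`)

	cfg, err := parseMachineConfig(data)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Println("driver:", cfg.DriverName)
	fmt.Println("network:", cfg.Driver.Network)
	fmt.Println("mac:", cfg.Driver.MACAddress)
	if cfg.Driver.VmnetHelper != nil {
		fmt.Println("interface id:", cfg.Driver.VmnetHelper.InterfaceID)
	}
}
```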

@nirs
Contributor Author

nirs commented Mar 9, 2025

Example multi-node cluster

% minikube start --network vmnet-shared --nodes 2 --cni auto
😄  minikube v1.35.0 on Darwin 15.3.1 (arm64)
✨  Using the vfkit (experimental) driver based on user configuration
❗  --network flag is only valid with the docker/podman, KVM and Qemu drivers, it will be ignored
👍  Starting "minikube" primary control-plane node in "minikube" cluster
🔥  Creating vfkit VM (CPUs=2, Memory=4050MB, Disk=20000MB) ...
📦  Preparing Kubernetes v1.32.2 on containerd 1.7.23 ...
    ▪ Generating certificates and keys ...
    ▪ Booting up control plane ...
    ▪ Configuring RBAC rules ...
🔗  Configuring CNI (Container Networking Interface) ...
🔎  Verifying Kubernetes components...
    ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🌟  Enabled addons: default-storageclass, storage-provisioner

👍  Starting "minikube-m02" worker node in "minikube" cluster
🔥  Creating vfkit VM (CPUs=2, Memory=4050MB, Disk=20000MB) ...
🌐  Found network options:
    ▪ NO_PROXY=192.168.105.22
📦  Preparing Kubernetes v1.32.2 on containerd 1.7.23 ...
    ▪ env NO_PROXY=192.168.105.22
    > kubelet.sha256:  64 B / 64 B [-------------------------] 100.00% ? p/s 0s
    > kubectl.sha256:  64 B / 64 B [-------------------------] 100.00% ? p/s 0s
    > kubeadm.sha256:  64 B / 64 B [-------------------------] 100.00% ? p/s 0s
    > kubectl:  53.25 MiB / 53.25 MiB [------------] 100.00% 37.13 MiB p/s 1.6s
    > kubelet:  71.75 MiB / 71.75 MiB [------------] 100.00% 12.43 MiB p/s 6.0s
    > kubeadm:  66.81 MiB / 66.81 MiB [------------] 100.00% 10.11 MiB p/s 6.8s
🔎  Verifying Kubernetes components...
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default

% kubectl get node                                                           
NAME           STATUS   ROLES           AGE     VERSION
minikube       Ready    control-plane   5m1s    v1.32.2
minikube-m02   Ready    <none>          4m40s   v1.32.2

% kubectl get node -o jsonpath='{.items[*].status.addresses[0].address}{"\n"}'
192.168.105.22 192.168.105.23

Each node gets its own vmnet-helper process:

% ps au | grep vmnet-helper | grep -v grep 
nir  60260   0.0  0.0 410743984   3792 s020  S     1:07AM   0:00.71 /opt/vmnet-helper/bin/vmnet-helper --fd 21 --interface-id c9ae713b-5937-42de-9018-bbb95a425506
root 60259   0.0  0.0 410737168   6448 s020  S     1:07AM   0:00.01 sudo --non-interactive --close-from 22 /opt/vmnet-helper/bin/vmnet-helper --fd 21 --interface-id c9ae713b-5937-42de-9018-bbb95a425506
nir  60254   0.0  0.0 410735792   3728 s020  S     1:07AM   0:00.73 /opt/vmnet-helper/bin/vmnet-helper --fd 13 --interface-id cf62fb28-2533-4c25-8d35-1eef1736f707
root 60253   0.0  0.0 410754576   6992 s020  S     1:07AM   0:00.01 sudo --non-interactive --close-from 14 /opt/vmnet-helper/bin/vmnet-helper --fd 13 --interface-id cf62fb28-2533-4c25-8d35-1eef1736f707

@afbjorklund
Collaborator

> @afbjorklund can you review this?

I could take a look at it later perhaps, but one of the minikube maintainers will still need to "take over" the vfkit driver.

Probably should have an issue, as well.

@cfergeau cfergeau left a comment

I also noticed several vment typos in commit logs

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 10, 2025
nirs added 4 commits April 9, 2025 01:21
The package manages the vmnet-helper[1] child process, providing
connection to the vmnet network without running the guest as root.

We will use vmnet-helper for the vfkit driver, which has no way to use
a shared network where guests can access other guests in the network.
We can use it later with the qemu driver as an alternative to
socket_vmnet.

[1] https://github.com/nirs/vmnet-helper

Add new network option for vfkit "vmnet-shared", connecting vfkit to the
vmnet shared network. Clusters using this network can access other
clusters in the same network, similar to socket_vmnet with QEMU driver.

If network is not specified, we default to the "nat" network, keeping
the previous behavior. If network is "vmnet-shared", the vfkit driver
manages 2 processes: vfkit and vmnet-helper.

Like vfkit, vmnet-helper is started in the background, in a new process
group, so it is not terminated if the minikube process group is terminated.

Since vmnet-helper requires root to start the vmnet interface, we start
it with sudo, creating 2 child processes. vmnet-helper drops privileges
immediately after starting the vmnet interface, and runs as the user and
group running minikube.

Stopping the cluster will stop sudo, which will stop the vmnet-helper
process. Deleting the cluster kills both sudo and vmnet-helper by killing
the process group.

This change is not complete, but it is good enough to play with the new
shared network.

Example usage:

1. Install vmnet-helper:
   https://github.com/nirs/vmnet-helper?tab=readme-ov-file#installation

2. Set up the vmnet-helper sudoers rule:
   https://github.com/nirs/vmnet-helper?tab=readme-ov-file#granting-permission-to-run-vmnet-helper

3. Start 2 clusters with vmnet-shared network:

    % minikube start -p c1 --driver vfkit --network vmnet-shared
    ...

    % minikube start -p c2 --driver vfkit --network vmnet-shared
    ...

    % minikube ip -p c1
    192.168.105.18

    % minikube ip -p c2
    192.168.105.19

4. Each cluster can access the other cluster:

    % minikube -p c1 ssh -- ping -c 3 192.168.105.19
    PING 192.168.105.19 (192.168.105.19): 56 data bytes
    64 bytes from 192.168.105.19: seq=0 ttl=64 time=0.621 ms
    64 bytes from 192.168.105.19: seq=1 ttl=64 time=0.989 ms
    64 bytes from 192.168.105.19: seq=2 ttl=64 time=0.490 ms

    --- 192.168.105.19 ping statistics ---
    3 packets transmitted, 3 packets received, 0% packet loss
    round-trip min/avg/max = 0.490/0.700/0.989 ms

    % minikube -p c2 ssh -- ping -c 3 192.168.105.18
    PING 192.168.105.18 (192.168.105.18): 56 data bytes
    64 bytes from 192.168.105.18: seq=0 ttl=64 time=0.289 ms
    64 bytes from 192.168.105.18: seq=1 ttl=64 time=0.798 ms
    64 bytes from 192.168.105.18: seq=2 ttl=64 time=0.993 ms

    --- 192.168.105.18 ping statistics ---
    3 packets transmitted, 3 packets received, 0% packet loss
    round-trip min/avg/max = 0.289/0.693/0.993 ms

Trailing whitespace is removed by some editors or displayed as a
warning. Clean up to make this file easier to maintain.

The vfkit driver now supports the `nat` and `vmnet-shared` network options.
The `nat` option provides the best performance and is always available,
so it is the default network option. The `vmnet-shared` option provides
access between machines with lower performance compared to `nat`.

If the `vmnet-shared` option is selected, we verify that vmnet-helper is
available. The check ensures that vmnet-helper is installed and that the
sudoers configuration allows the current user to run vmnet-helper without
a password.

If validating vmnet-helper fails, we return a new NotFoundVmnetHelper
reason pointing to the vmnet-helper installation docs or recommending
`nat`. This is based on how we treat a missing socket_vmnet for the QEMU
driver.
@nirs
Contributor Author

nirs commented Apr 8, 2025

@medyagh thanks for reviewing!

Changes in current version:

  • Fix typo in start_flags.go
  • Update comments for buffer sizes based on recent changes in vmnet-helper

https://github.com/kubernetes/minikube/compare/7125441c47b515bb30e4e5ff1f8323672c7d158f..4595c49781c9e25c283632264448e235cf0fce36

@minikube-pr-bot

Here are the top 10 failed tests in each environment with the lowest flake rate.

| Environment | Test Name | Flake Rate |
|---|---|---|
| KVM_Linux_crio (10 failed) | TestFunctional/serial/KubectlGetPods (gopogh) | 0.00% (chart) |
| KVM_Linux_crio (10 failed) | TestFunctional/serial/MinikubeKubectlCmd (gopogh) | 0.00% (chart) |
| KVM_Linux_crio (10 failed) | TestFunctional/serial/MinikubeKubectlCmdDirectly (gopogh) | 0.00% (chart) |
| KVM_Linux_crio (10 failed) | TestFunctional/serial/SoftStart (gopogh) | 6.52% (chart) |
| KVM_Linux_crio (10 failed) | TestFunctional/serial/ExtraConfig (gopogh) | 6.98% (chart) |
| KVM_Linux_containerd (1 failed) | TestAddons/parallel/Registry (gopogh) | 3.92% (chart) |

Besides the following environments also have failed tests:

To see the flake rates of all tests by environment, click here.

@nirs
Contributor Author

nirs commented Apr 12, 2025

@medyagh anything else to change?

@medyagh
Member

medyagh commented Apr 24, 2025

only needs documentation change and then good to merge

@nirs nirs marked this pull request as draft April 24, 2025 18:09
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 24, 2025
@nirs
Contributor Author

nirs commented Apr 30, 2025

Changes in latest version:

  • Add minikube version requirement
  • Simplify the IMPORTANT alert - github markdown alerts do not work inside tabs
  • Make the text more clear
  • Fix typos and double words (the the)

https://github.com/kubernetes/minikube/compare/4595c49781c9e25c283632264448e235cf0fce36..27b7ead729a89550bc48203674faa0d1fd2dc93a

Preview: https://deploy-preview-20501--kubernetes-sigs-minikube.netlify.app/docs/drivers/vfkit/

@medyagh The PR should be ready now.

@nirs nirs marked this pull request as ready for review April 30, 2025 15:34
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 30, 2025
@k8s-ci-robot k8s-ci-robot requested a review from spowelljr April 30, 2025 15:34
@minikube-pr-bot

kvm2 driver with docker runtime

+----------------+----------+---------------------+
|    COMMAND     | MINIKUBE | MINIKUBE (PR 20501) |
+----------------+----------+---------------------+
| minikube start | 50.6s    | 52.2s               |
| enable ingress | 16.0s    | 14.9s               |
+----------------+----------+---------------------+

Times for minikube start: 49.6s 51.3s 51.4s 51.1s 49.6s
Times for minikube (PR 20501) start: 55.2s 49.8s 50.9s 52.9s 52.3s

Times for minikube ingress: 16.0s 18.5s 14.5s 16.0s 15.0s
Times for minikube (PR 20501) ingress: 15.0s 15.0s 14.5s 15.0s 15.0s

docker driver with docker runtime

+----------------+----------+---------------------+
|    COMMAND     | MINIKUBE | MINIKUBE (PR 20501) |
+----------------+----------+---------------------+
| minikube start | 21.5s    | 21.9s               |
| enable ingress | 12.5s    | 12.6s               |
+----------------+----------+---------------------+

Times for minikube start: 21.2s 21.0s 23.5s 20.1s 21.6s
Times for minikube (PR 20501) start: 20.5s 21.1s 24.2s 24.0s 20.0s

Times for minikube ingress: 12.3s 12.8s 12.8s 12.3s 12.3s
Times for minikube (PR 20501) ingress: 12.8s 12.8s 12.8s 12.3s 12.3s

docker driver with containerd runtime

+----------------+----------+---------------------+
|    COMMAND     | MINIKUBE | MINIKUBE (PR 20501) |
+----------------+----------+---------------------+
| minikube start | 22.5s    | 20.1s               |
| enable ingress | 33.7s    | 32.6s               |
+----------------+----------+---------------------+

Times for minikube (PR 20501) ingress: 22.8s 39.3s 38.7s 23.3s 38.7s
Times for minikube ingress: 38.8s 27.8s 39.2s 23.3s 39.2s

Times for minikube start: 23.6s 22.9s 23.1s 23.2s 20.0s
Times for minikube (PR 20501) start: 20.0s 19.7s 19.4s 21.0s 20.5s

@nirs nirs requested a review from medyagh April 30, 2025 17:04
@medyagh
Member

medyagh commented May 1, 2025

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 1, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: medyagh, nirs

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 1, 2025
@medyagh
Member

medyagh commented May 1, 2025

thank you @nirs I am excited to try this myself

@medyagh medyagh merged commit 55b88a6 into kubernetes:master May 1, 2025
20 of 26 checks passed
@nirs nirs deleted the vmnet-helper branch May 1, 2025 18:25
nirs added a commit to nirs/ramen that referenced this pull request May 9, 2025
With the new vmnet network[1] support in the minikube vfkit driver we can
use minikube on macOS. This change replaces the macOS defaults to use
minikube with the vfkit driver and vmnet-shared network support.

We need to make this configurable, but this change is good enough for
playing with the new option.

[1] kubernetes/minikube#20501

Signed-off-by: Nir Soffer <[email protected]>
nirs added a commit to nirs/ramen that referenced this pull request Jun 29, 2025
With the new vmnet network[1] support in the minikube vfkit driver we can
use minikube on macOS. This change replaces the macOS defaults to use
minikube with the vfkit driver and vmnet-shared network support.

We need to make this configurable, but this change is good enough for
playing with the new option.

[1] kubernetes/minikube#20501

Signed-off-by: Nir Soffer <[email protected]>
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

  • vfkit: unable to access other other clusters from the node
  • vfkit: unable to create multi-node cluster
7 participants