Skip to content

[Draft] KVM: enable no-mac-spoofing on virtual nics #8951

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

weizhouapache
Copy link
Member

@weizhouapache weizhouapache commented Apr 19, 2024

Description

In shared, isolated networks and vpcs in advanced zone, user vms can easily perform ip/arp/mac spoofing by pretending to be another vm in the same network.

  • change mac
ip link set dev eth0 down
ip link set dev eth0 address <MAC of another vm>
ip link set dev eth0 up
  • change IP
ip addr add dev eth0 <IP/cidr of another vm in same network>
ip addr del dev eth0 <old IP/cidr>
  • these can also be done by netplan/network-scripts configurations.

libvirt has a network traffic filtering subsystem which can be used to prevent spoofing. (https://libvirt.org/formatnwfilter.html#concepts)

  • It provides some pre-existing network filters (https://libvirt.org/formatnwfilter.html#pre-existing-network-filters), no-mac-spoofing is missed in the table.
  • Ideally we should use clean-traffic, however, the IP/ARP anti-spoofing does not work in our testing, as the IP is not specified in the libvirt vm definition XML by cloudstack.
  • libvirt can auto-detect the IP but the auto-detected IP is not always correct, for example, when user configures (wrong) static IP inside the user VM.

This PR adds no-mac-spoofing for each nic to prevent mac spoofing.

It could be an improvement PR to support all MAC/IP/ARP spoofing

  • it might lead to some overhead.
  • it is better to be configurable for guest network, network offering or account
  • IP/network information needs to be added to the InterfaceDef for each type of interface (bridge, direct, network, vhostuser, etc).
  • it might be not needed in advanced zone with security groups, as there are already iptables/ebtables/ipset rules to prevent the spoofings
  • it needs some discussion and testing. clean-traffic is good, but it might not be what we want. we need to evaluate the pre-existing network filters and probably consider creating customized filters. refer to https://libvirt.org/firewall.html#the-network-filter-driver

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • build/CI

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

How did you try to break this feature and the system with this change?

@weizhouapache weizhouapache force-pushed the 4.20-kvm-no-mac-spoofing branch from 0347115 to 5e0aa25 Compare April 19, 2024 18:02
@weizhouapache
Copy link
Member Author

@NuxRo
this is probably you want to have. I am not sure if there are downside .

@DaanHoogland and me have tested it. mac anti-spoofing works, but ip anti-spoofing does not work.

@weizhouapache
Copy link
Member Author

@blueorangutan package

@blueorangutan
Copy link

@weizhouapache a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan
Copy link

Packaging result [SF]: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 9339

@weizhouapache
Copy link
Member Author

@blueorangutan test rocky8 kvm-rocky8

@blueorangutan
Copy link

@weizhouapache a [SL] Trillian-Jenkins test job (rocky8 mgmt + kvm-rocky8) has been kicked to run smoke tests

@BryanMLima
Copy link
Contributor

Hey @weizhouapache, can you provide more context to the issue fixed by this PR? Should we add a no arp spoofing as well?

@weizhouapache
Copy link
Member Author

Hey @weizhouapache, can you provide more context to the issue fixed by this PR? Should we add a no arp spoofing as well?

@BryanMLima
Updated the PR description

@weizhouapache weizhouapache added this to the 4.20.0.0 milestone Apr 20, 2024
@blueorangutan
Copy link

[SF] Trillian test result (tid-9937)
Environment: kvm-rocky8 (x2), Advanced Networking with Mgmt server r8
Total time taken: 52334 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr8951-t9937-kvm-rocky8.zip
Smoke tests completed. 127 look OK, 2 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File
test_01_events_resource Error 314.34 test_events_resource.py
test_01_events_resource Error 314.35 test_events_resource.py
test_04_deploy_vm_for_other_user_and_test_vm_operations Failure 100.58 test_network_permissions.py
ContextSuite context=TestNetworkPermissions>:teardown Error 1.52 test_network_permissions.py

@wido
Copy link
Contributor

wido commented Apr 22, 2024

On KVM hypervisors with Security Groups enabled (Advanced + Shared networking) this is already handled by ebtables in security_group.py

Doing this you would do double packet inspection and there might even be a conflict. Have you looked at this?

My suggestion:

  • Enable this in Libvirt
  • Remove functionality from security_group.py

Copy link
Contributor

@DaanHoogland DaanHoogland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clgtm

@DaanHoogland
Copy link
Contributor

My suggestion:

* Enable this in Libvirt

* Remove functionality from security_group.py

meaning, @wido , this is good but we need to add a removal of some of the " -j DROP/ACCEPT" lines from the script? sounds like some precision surgery. Do you know which ones to drop? cc @weizhouapache .

@wido
Copy link
Contributor

wido commented Apr 22, 2024

My suggestion:

* Enable this in Libvirt

* Remove functionality from security_group.py

meaning, @wido , this is good but we need to add a removal of some of the " -j DROP/ACCEPT" lines from the script? sounds like some precision surgery. Do you know which ones to drop? cc @weizhouapache .

It would, I think if you take a look it starts here:

def default_ebtables_rules(vm_name, vm_ip, vm_mac, vif, is_first_nic=False):

  • default_ebtables_rules()
  • destroy_ebtables_rules()

Those would no longer be needed

@weizhouapache
Copy link
Member Author

My suggestion:

* Enable this in Libvirt

* Remove functionality from security_group.py

meaning, @wido , this is good but we need to add a removal of some of the " -j DROP/ACCEPT" lines from the script? sounds like some precision surgery. Do you know which ones to drop? cc @weizhouapache .

It would, I think if you take a look it starts here:

def default_ebtables_rules(vm_name, vm_ip, vm_mac, vif, is_first_nic=False):

  • default_ebtables_rules()
  • destroy_ebtables_rules()

Those would no longer be needed

@wido
actually I am thinking of disabling this change for vms with security groups
the script security_group.py programs iptables/ebtables rules including the mac/ip/arp anti-spoofing, it has been proved to be working well with both ipv4/ipv6 addresses and one/multiple network nics.
this PR only contains no-mac-spoofing which is not good enough to replace the security_group.py. it looks like a precise surgery to remove the ebtables rules, as @DaanHoogland said.
we could drop the methods in security_group.py if all mac/ip/arp anti-spoofing are supported (see the PR description).

other than that, the upgrade could be an issue as the VMs started in old versions (before upgrade) do not have the configuration in their VM XML definition.

@DaanHoogland
Copy link
Contributor

actually I am thinking of disabling this change for vms with security groups

I second that. It will be simpler and the will not cripple the much security groups implementation.

@wido
Copy link
Contributor

wido commented Apr 22, 2024

My suggestion:

* Enable this in Libvirt

* Remove functionality from security_group.py

meaning, @wido , this is good but we need to add a removal of some of the " -j DROP/ACCEPT" lines from the script? sounds like some precision surgery. Do you know which ones to drop? cc @weizhouapache .

It would, I think if you take a look it starts here:

def default_ebtables_rules(vm_name, vm_ip, vm_mac, vif, is_first_nic=False):

  • default_ebtables_rules()
  • destroy_ebtables_rules()

Those would no longer be needed

@wido actually I am thinking of disabling this change for vms with security groups the script security_group.py programs iptables/ebtables rules including the mac/ip/arp anti-spoofing, it has been proved to be working well with both ipv4/ipv6 addresses and one/multiple network nics. this PR only contains no-mac-spoofing which is not good enough to replace the security_group.py. it looks like a precise surgery to remove the ebtables rules, as @DaanHoogland said. we could drop the methods in security_group.py if all mac/ip/arp anti-spoofing are supported (see the PR description).

other than that, the upgrade could be an issue as the VMs started in old versions (before upgrade) do not have the configuration in their VM XML definition.

Sounds good. I would only add this to VMs without any SG. That would get my approval.

Copy link
Member

@rohityadavcloud rohityadavcloud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm didn’t test

@blueorangutan
Copy link

[SF] Trillian Build Failed (tid-10024)

@blueorangutan
Copy link

[SF] Trillian test result (tid-10030)
Environment: kvm-ubuntu22 (x2), Advanced Networking with Mgmt server u22
Total time taken: 70862 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr8951-t10030-kvm-ubuntu22.zip
Smoke tests completed. 126 look OK, 3 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File
test_01_events_resource Error 338.24 test_events_resource.py
test_01_events_resource Error 338.26 test_events_resource.py
test_list_system_vms_metrics_history Failure 0.48 test_metrics_api.py
test_04_deploy_vm_for_other_user_and_test_vm_operations Failure 102.16 test_network_permissions.py
ContextSuite context=TestNetworkPermissions>:teardown Error 1.57 test_network_permissions.py

@NuxRo
Copy link
Contributor

NuxRo commented May 1, 2024

@weizhouapache Good effort.

Like @wido says, the problems this would solve are not an issue in SG zones usually, so indeed we should not apply any of this there.

Otherwise it'd be a nice "win" for operators of regular Advanced Zones to apply anti-spoofing measures. We already have something somewhat similar for VMWare.

I'd be happy to use all reasonable libvirt nwfilter features, make them options in Network Offering:

  • IP anti-spoofing (with or without auto-detect)
  • ARP anti-spoofing
  • MAC anti-spoofing

Would it even be reasonable to allow the operator to specify more nwfilter? Ie load whatever xml file from /usr/share/libvirt/nwfilter/ that they want?

Copy link
Contributor

@NuxRo NuxRo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but taking into consideration the comments, particularly re security groups.

@weizhouapache weizhouapache changed the title KVM: enable no-mac-spoofing on virtual nics [Draft] KVM: enable no-mac-spoofing on virtual nics Jun 15, 2024
@JoaoJandre JoaoJandre removed this from the 4.20.0.0 milestone Sep 10, 2024
@JoaoJandre JoaoJandre added this to the 4.21.0.0 milestone Sep 10, 2024
@rohityadavcloud
Copy link
Member

@weizhouapache when you've bandwidth can you check why the github actions are/were failing (I've rekicked them now)

Copy link

codecov bot commented Feb 28, 2025

Codecov Report

Attention: Patch coverage is 0% with 1 line in your changes missing coverage. Please review.

Project coverage is 17.00%. Comparing base (cadbb56) to head (5e0aa25).
Report is 921 commits behind head on main.

Files with missing lines Patch % Lines
...om/cloud/hypervisor/kvm/resource/LibvirtVMDef.java 0.00% 1 Missing ⚠️
Additional details and impacted files
@@              Coverage Diff              @@
##               main    #8951       +/-   ##
=============================================
+ Coverage     15.13%   17.00%    +1.87%     
+ Complexity    13469    13275      -194     
=============================================
  Files          4863     5270      +407     
  Lines        326031   465547   +139516     
  Branches      45838    54500     +8662     
=============================================
+ Hits          49349    79184    +29835     
- Misses       270066   377497   +107431     
- Partials       6616     8866     +2250     
Flag Coverage Δ
simulator-marvin-tests ?
unittests 17.00% <0.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants