fix nat table by getting the fitting device for an address #9552

Open
wants to merge 2 commits into
base: 4.19

DaanHoogland
Contributor

Description

This PR...

Fixes: #9473

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • build/CI
  • test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

How did you try to break this feature and the system with this change?

codecov bot commented Aug 20, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 15.08%. Comparing base (6e6a276) to head (20c4e4b).
Report is 5 commits behind head on 4.19.

Additional details and impacted files
@@             Coverage Diff              @@
##               4.19    #9552      +/-   ##
============================================
- Coverage     15.08%   15.08%   -0.01%     
- Complexity    11184    11185       +1     
============================================
  Files          5406     5406              
  Lines        472889   472915      +26     
  Branches      57738    57661      -77     
============================================
+ Hits          71352    71354       +2     
- Misses       393593   393617      +24     
  Partials       7944     7944              
Flag Coverage Δ
uitests 4.30% <ø> (ø)
unittests 15.80% <ø> (-0.01%) ⬇️

@weizhouapache
Member

@DaanHoogland
I had a look at issue #8562 which has been fixed by #8599

Assume there are two public IPs in the VPC VR (and isolated network VR):

  • xx.xx.64.x (source nat, default public IP), on eth1
  • xx.xx.96.x (additional public IP range), on ethX

I think the expected behaviour should be

  • all VMs (without static NAT) have the source IP xx.xx.64.x (this is the current behaviour)
  • the VR should be able to connect to the xx.xx.96.x network with source IP xx.xx.96.x (otherwise the gateway check may fail, see #9473, "Failed VR health check gateways_check.py on additional public IP range")
  • the VMs should be able to connect to the xx.xx.96.x network with source IP xx.xx.64.x or xx.xx.96.x (to be discussed)

currently the rules are

-A POSTROUTING -j SNAT -o eth1 --to-source xx.xx.64.x
-A POSTROUTING -j SNAT -o ethX --to-source xx.xx.64.x

seems better to change to

-A POSTROUTING -j SNAT -o eth1 --to-source xx.xx.64.x
-A POSTROUTING -j SNAT -o ethX --to-source xx.xx.96.x

or

-A POSTROUTING -j SNAT -o eth1 --to-source xx.xx.64.x
-A POSTROUTING -j SNAT -o ethX -d xx.xx.96.1 --to-source xx.xx.96.x  (96.1 is gateway)
-A POSTROUTING -j SNAT -o ethX ! -d xx.xx.96.1 --to-source xx.xx.64.x  (96.1 is gateway)

to be discussed
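The two proposed rule sets above can be written out as a small Python sketch. This is only an illustration of the proposal, not the code in this PR: the helper names (`snat_rule`, `rules_per_device`, `rules_gateway_only`) and their parameters are hypothetical, and the rule strings simply follow the format quoted in the comment.

```python
def snat_rule(dev, source_ip, dest=None, negate_dest=False):
    """Format one POSTROUTING SNAT rule in the style quoted above.

    dest and negate_dest are hypothetical parameters that add an
    optional "-d <dest>" or "! -d <dest>" match.
    """
    match = ""
    if dest is not None:
        match = "%s-d %s " % ("! " if negate_dest else "", dest)
    return "-A POSTROUTING -j SNAT -o %s %s--to-source %s" % (dev, match, source_ip)


def rules_per_device(snat_dev, snat_ip, extra_dev, extra_ip):
    """First proposal: each device rewrites to an IP of its own range."""
    return [snat_rule(snat_dev, snat_ip), snat_rule(extra_dev, extra_ip)]


def rules_gateway_only(snat_dev, snat_ip, extra_dev, extra_ip, extra_gateway):
    """Second proposal: only traffic to the extra range's gateway uses the
    extra IP; everything else on that device keeps the source NAT IP."""
    return [
        snat_rule(snat_dev, snat_ip),
        snat_rule(extra_dev, extra_ip, dest=extra_gateway),
        snat_rule(extra_dev, snat_ip, dest=extra_gateway, negate_dest=True),
    ]
```

In the second variant the order of the two rules for the additional device does not matter, because their destination matches are mutually exclusive.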

@DaanHoogland DaanHoogland changed the base branch from main to 4.19 August 21, 2024 06:51

elif cmdline.get_source_nat_ip() and not self.is_private_gateway():
    self.fw.append(
        ["nat", "", "-A POSTROUTING -j SNAT -o %s --to-source %s" % (self.dev, cmdline.get_source_nat_ip())])
    self.fw.append(
Member

If there are multiple public IPs (in multiple ranges), will there be the same number of rules?

Contributor Author

I am not sure I understand the question. I checked this in a lab env and the resulting nat table was exactly as described in the issue, with only the last line being different. Are you considering another configuration here, @weizhouapache?

Member

For each public IP (and private gateway), there will be a rule like the one below, right?

-A POSTROUTING -j SNAT -o ethX --to-source xx.yy.zz.xx

Member

@DaanHoogland
to be clear, we need a rule for each public NIC, for example

-A POSTROUTING -j SNAT -o eth1 --to-source <source nat IP>    # this is for source nat NIC
-A POSTROUTING -j SNAT -o eth5 --to-source <first public IP on eth5>    # this is for additional public NIC

If I understand correctly, with the current changes the rules are, for example,

-A POSTROUTING -j SNAT -o eth1 --to-source <source nat IP>    # this is for source nat NIC
-A POSTROUTING -j SNAT -o eth1 --to-source <second IP on source nat NIC>    # this is for source nat NIC
-A POSTROUTING -j SNAT -o eth1 --to-source <third IP on source nat NIC>    # this is for source nat NIC

-A POSTROUTING -j SNAT -o eth5 --to-source <first public IP on eth5>    # this is for additional public NIC
-A POSTROUTING -j SNAT -o eth5 --to-source <second public IP on eth5>    # this is for additional public NIC
-A POSTROUTING -j SNAT -o eth5 --to-source <third public IP on eth5>    # this is for additional public NIC
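The "one rule per public NIC" selection described above could look roughly like the following sketch. It assumes each address is a dict with the hypothetical keys `device`, `public_ip` and `source_nat`, listed in the order the addresses were configured; the actual structure of `self.address` in the VR scripts may differ.

```python
def one_snat_rule_per_nic(addresses):
    """Pick a single SNAT source IP per device and format the rules.

    The first address seen on a device is used, unless an address on
    that device is flagged as the source NAT IP, which takes precedence.
    The dict keys used here are assumptions for illustration.
    """
    chosen = {}
    for addr in addresses:
        dev = addr["device"]
        if dev not in chosen or addr.get("source_nat"):
            chosen[dev] = addr["public_ip"]
    return [
        "-A POSTROUTING -j SNAT -o %s --to-source %s" % (dev, ip)
        for dev, ip in chosen.items()
    ]
```

With three addresses on each of eth1 and eth5, this yields two rules rather than six: the source NAT IP for eth1 and the first listed IP for eth5.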

Contributor Author

I'll verify that. Do you happen to know what condition to test for? I don't think the self.address object contains information on whether it is the first IP, does it?

Member

I think the original issue does not exist in our lab (I can verify with infra).

we can only verify the iptables rules in the VR

  • create 2 public ip ranges with different vlan
  • acquire 3 public IPs in each public IP range and use them for static NAT/PF/LB
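To compare the resulting rules on the VR after the steps above, a short helper along these lines could group the SNAT source addresses by outgoing device. This is a sketch only: the function name is hypothetical, and it assumes the rules are supplied as text lines in the `-A POSTROUTING ...` form shown in this thread (for example, copied from the nat table dump).

```python
import re
from collections import defaultdict


def snat_sources_by_device(rules):
    """Group --to-source addresses of POSTROUTING SNAT rules by -o device.

    Accepts the options in either order ("-j SNAT -o eth1" or
    "-o eth1 -j SNAT"); lines that are not POSTROUTING SNAT rules
    are ignored.
    """
    grouped = defaultdict(list)
    for rule in rules:
        rule = rule.strip()
        if not rule.startswith("-A POSTROUTING") or "-j SNAT" not in rule:
            continue
        dev = re.search(r"(?<!\S)-o (\S+)", rule)
        src = re.search(r"--to-source (\S+)", rule)
        if dev and src:
            grouped[dev.group(1)].append(src.group(1))
    return dict(grouped)
```

A device that maps to more than one address has several unconditional SNAT rules, which is the situation the review comments above ask about.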

Member

I don't think the self.address object contains information on whether it is the first IP, does it?

it does, but I am not sure if it is 100% correct.

image

Contributor Author

ok, I'll give it a try

@apache apache deleted a comment from blueorangutan Oct 31, 2024
@apache apache deleted a comment from blueorangutan Nov 1, 2024
@DaanHoogland DaanHoogland added this to the 4.19.2 milestone Dec 10, 2024
@apache apache deleted a comment from blueorangutan Jan 8, 2025
@apache apache deleted a comment from blueorangutan Jan 29, 2025
@DaanHoogland
Contributor Author

minor issue, moving forward

@DaanHoogland DaanHoogland modified the milestones: 4.19.2, 4.19.3 Feb 3, 2025
@apache apache deleted a comment from blueorangutan Feb 13, 2025
@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 12440

@Pearl1594
Contributor

@DaanHoogland any update on this one?

@DaanHoogland
Contributor Author

@Pearl1594 , no priority and no definite proof it works yet. I will revisit later. cc @weizhouapache

@Pearl1594 Pearl1594 moved this to In Progress in ACS 4.20.1 Mar 17, 2025
@weizhouapache weizhouapache self-assigned this Apr 10, 2025
@weizhouapache
Member

tested this PR with a VPC

  • source nat: 10.0.53.6
  • other public ips on same subnet: 10.0.53.20, 10.0.53.3
  • public ips on different subnet: 10.0.64.110/111/112

main difference with iptables

 -A POSTROUTING -o eth1 -j SNAT --to-source 10.0.53.6       < ========= same (with Source NAT IP)

--A POSTROUTING -o eth2 -j SNAT --to-source 10.0.53.6       < ========== removed

+-A POSTROUTING -o eth1 -j SNAT --to-source 10.0.53.20    <========= new rules (with other public IPs)
+-A POSTROUTING -o eth1 -j SNAT --to-source 10.0.53.3
+-A POSTROUTING -o eth2 -j SNAT --to-source 10.0.64.111
+-A POSTROUTING -o eth2 -j SNAT --to-source 10.0.64.110
+-A POSTROUTING -o eth2 -j SNAT --to-source 10.0.64.112

The issue #8562, fixed by #8599, will come back.
These are two different cases, and it looks difficult to make both work ...
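One way to read the rule list above: iptables evaluates the rules of a chain in order and SNAT is a terminating target, so when several rules match only on the outgoing device, the first one for that device is the one applied. The sketch below models just that, under the assumption that the rules carry no other match criteria and appear in the order listed; the function name is hypothetical.

```python
def split_effective_and_shadowed(rules):
    """Split (device, source_ip) rules into the first rule per device,
    which is the one applied, and later rules that are never reached.

    Assumes every rule matches only on the outgoing device.
    """
    effective = {}
    shadowed = []
    for dev, source in rules:
        if dev in effective:
            shadowed.append((dev, source))
        else:
            effective[dev] = source
    return effective, shadowed


# Rules from the test above, in the listed order.
rules = [
    ("eth1", "10.0.53.6"),
    ("eth1", "10.0.53.20"),
    ("eth1", "10.0.53.3"),
    ("eth2", "10.0.64.111"),
    ("eth2", "10.0.64.110"),
    ("eth2", "10.0.64.112"),
]
effective, shadowed = split_effective_and_shadowed(rules)
```

Under that assumption, traffic leaving eth2 would be rewritten to 10.0.64.111 instead of the source NAT IP 10.0.53.6, and the additional rules on each device would never match.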

@DaanHoogland
Contributor Author

@weizhouapache, it sounds like that is impossible (putting SNAT for the secondary IP both on its own interface and on the primary interface).

So how about making the health check script accept this situation somehow?

Projects
Status: In Progress
Development

Successfully merging this pull request may close these issues.

Failed VR health check gateways_check.py on additional public IP range
4 participants