feat: allow from snat eip to fip eip #6329

Merged
zbb88888 merged 1 commit into kubeovn:master from zbb88888:eip2eip on Mar 2, 2026

@zbb88888
Collaborator

Pull Request

What type of this PR

Examples of user facing changes:

  • Features

Which issue(s) this PR fixes

Fixes #(issue-number)

@dosubot dosubot bot added the size:L label (This PR changes 100-499 lines, ignoring generated files.) Feb 24, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello @zbb88888, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the VPC NAT gateway functionality by introducing hairpin SNAT capabilities. This feature is crucial for correctly handling traffic originating from within a VPC that is destined for an External IP (EIP) that has been DNAT'd back into the same VPC. Without hairpin SNAT, such traffic could experience issues with connection tracking, leading to dropped packets. The changes include new iptables rules, helper functions for rule management, and comprehensive end-to-end tests to validate the new behavior.

Highlights

  • Hairpin SNAT Implementation: Introduced new hairpin-snat-add and hairpin-snat-del commands and corresponding functions to manage hairpin SNAT rules. This addresses a scenario where VPC-internal traffic destined for a DNAT'd EIP needs proper SNATing to prevent reply packets from bypassing conntrack and being dropped.
  • IPTables Integration: A new HAIRPIN_SNAT iptables chain was added to the nat table and integrated into the SNAT_FILTER chain within the POSTROUTING hook, ensuring that hairpin SNAT rules are applied correctly.
  • Automatic Rule Management: Modified the add_snat and del_snat functions to automatically create and delete hairpin SNAT rules, respectively, when a SNAT rule is added or removed for internal CIDRs. A new is_internal_cidr helper function was added to determine if a CIDR is internal to the VPC.
  • End-to-End Testing: Added new E2E test cases to verify the automatic creation and cleanup of hairpin SNAT rules, ensuring the functionality works as expected in a live environment.

Changelog

  • dist/images/vpcnatgateway/nat-gateway.sh
    • Added hairpin-snat-add and hairpin-snat-del commands to the help message.
    • Created a new HAIRPIN_SNAT iptables chain in the nat table.
    • Appended the HAIRPIN_SNAT chain to the SNAT_FILTER chain in POSTROUTING.
    • Implemented is_internal_cidr function to check if a CIDR is internal to the VPC.
    • Modified add_snat to automatically add hairpin SNAT rules for internal CIDRs.
    • Modified del_snat to automatically remove hairpin SNAT rules for internal CIDRs.
    • Added add_hairpin_snat and del_hairpin_snat functions to manage hairpin SNAT iptables rules.
    • Integrated hairpin-snat-add and hairpin-snat-del into the main script's command handling.
  • test/e2e/iptables-vpc-nat-gw/e2e_test.go
    • Imported e2epodoutput for executing commands in pods.
    • Added checkHairpinSnatRuleExists function to verify the presence of hairpin SNAT rules.
    • Added verifyHairpinSnatRule function to wait for hairpin SNAT rules to reach an expected state.
    • Included a test step to verify hairpin SNAT rule auto-creation after SNAT creation.
    • Included a test step to verify hairpin SNAT rule auto-cleanup after SNAT deletion.
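
As a rough illustration of the rule shape described in the highlights above, the following sketch prints (without executing) an iptables command for one hairpin entry. The helper name `hairpin_rule_cmd`, the example addresses, and the exact match criteria are assumptions for illustration; the rule actually written by nat-gateway.sh may match on different fields.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from nat-gateway.sh): prints the iptables
# command for one hairpin rule instead of running it.
# Assumption: traffic whose source and destination both fall inside the
# internal CIDR has its source rewritten to the EIP.
hairpin_rule_cmd() {
    local action="$1" cidr="$2" eip="$3"   # action: -A to add, -D to delete
    printf 'iptables -t nat %s HAIRPIN_SNAT -s %s -d %s -j SNAT --to-source %s\n' \
        "$action" "$cidr" "$cidr" "$eip"
}

# Example addresses are placeholders.
hairpin_rule_cmd -A 10.10.0.0/24 192.0.2.10
```

Printing the command keeps the sketch runnable without root privileges; a real script would execute it and check the exit status.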

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces hairpin SNAT functionality, which is a valuable addition for enabling traffic from an internal CIDR to an EIP that is DNAT'd back to an internal IP. The changes in nat-gateway.sh correctly set up the necessary iptables chains and rules, and the new e2e tests in e2e_test.go provide good coverage for both creation and deletion of these rules. The suggestions to improve the performance of the shell script by reducing the number of calls to iptables-save within loops are valuable and align with preferring simpler, more efficient solutions.

@coveralls

coveralls commented Feb 24, 2026

Pull Request Test Coverage Report for Build 22534103773

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 23.089%

Totals
Change from base Build 22522889470: 0.0%
Covered Lines: 12561
Relevant Lines: 54402

💛 - Coveralls

@zbb88888 zbb88888 marked this pull request as draft February 24, 2026 10:07
@zbb88888
Collaborator Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces hairpin SNAT functionality to enable internal VMs to access each other via their Floating IPs. The changes in nat-gateway.sh correctly set up the necessary iptables chains and logic to automatically manage hairpin rules alongside standard SNAT rules. The new e2e tests in test/e2e/iptables-vpc-nat-gw/e2e_test.go are well-written and provide good coverage for the new feature, including creation and cleanup verification. My feedback includes a performance optimization for the shell script and a suggestion to refactor the test code to reduce duplication. Overall, this is a solid implementation of a valuable feature.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces hairpin SNAT functionality, enabling internal VMs to communicate with each other using their floating IPs. However, a high-severity command injection risk has been identified in nat-gateway.sh due to the use of unquoted variables in shell commands executed via bash -c. This architectural pattern is insecure and requires hardening through proper quoting and refactoring the command execution mechanism. Additionally, while the iptables logic and e2e_test.go are comprehensive, an improvement is needed in del_snat to ensure consistency and prevent a potential bug.
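
To make the quoting concern concrete, here is a minimal sketch contrasting two ways of handing a value to `bash -c`. The payload string is a hypothetical stand-in for attacker-controlled input, and the snippet is not taken from nat-gateway.sh or the controller.

```shell
#!/usr/bin/env bash
# Hypothetical payload standing in for an attacker-controlled argument.
payload='10.0.0.0/24; echo INJECTED'

# Unsafe pattern: the value is spliced into the command string and re-parsed,
# so the text after ';' runs as a second command.
unsafe_out="$(bash -c "echo cidr=$payload")"

# Safer pattern: the value travels as a positional parameter ($1) and is only
# ever expanded inside double quotes, so it is never re-parsed as code.
safe_out="$(bash -c 'echo "cidr=$1"' _ "$payload")"

printf 'unsafe: %s\n' "$unsafe_out"
printf 'safe:   %s\n' "$safe_out"
```

Passing values as positional parameters reduces the risk but does not replace validating them before use.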

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces hairpin SNAT functionality to allow internal VMs to access other internal services via their Floating IP. A critical command injection vulnerability has been identified in the new nat-gateway.sh script, stemming from unvalidated user input and potentially leading to Remote Code Execution (RCE). Furthermore, there is a critical issue in the shell script's error handling and a high-severity issue in the e2e test logic. Addressing these points, particularly through strict input validation, is crucial to ensure the robustness, reliability, and security of this new feature.

@zbb88888
Collaborator Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces hairpin SNAT functionality, enabling internal VMs to access each other via their Floating IPs, implemented by adding a new HAIRPIN_SNAT iptables chain and logic in nat-gateway.sh. However, a critical security audit revealed three command injection vulnerabilities in the nat-gateway.sh script, specifically within the add_hairpin_snat, del_hairpin_snat, and add_snat functions. These vulnerabilities arise from the direct use of un-sanitized user-provided arguments, allowing for potential arbitrary code execution. Remediation requires implementing strict input validation for IP addresses and CIDR notations. While the e2e tests in e2e_test.go are comprehensive and the overall implementation is solid, there is also a minor suggestion to improve code clarity in the shell script.
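
One possible shape for the strict input validation mentioned above is sketched below. The function names `is_valid_ipv4` and `is_valid_ipv4_cidr` are hypothetical, and the sketch assumes IPv4-only arguments; it does not reflect the validation that was eventually merged.

```shell
#!/usr/bin/env bash
# Hypothetical validators; names are illustrative, not from nat-gateway.sh.

# Succeeds only for a dotted-quad IPv4 address with every octet in 0-255.
is_valid_ipv4() {
    local ip="$1" octet
    [[ "$ip" =~ ^([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})$ ]] || return 1
    for octet in "${BASH_REMATCH[@]:1}"; do
        # 10# forces base 10 so that octets like "08" are not read as octal.
        (( 10#$octet <= 255 )) || return 1
    done
}

# Succeeds only for <ipv4>/<prefix> with a prefix length in 0-32.
is_valid_ipv4_cidr() {
    local cidr="$1"
    [[ "$cidr" =~ ^([^/]+)/([0-9]{1,2})$ ]] || return 1
    local addr="${BASH_REMATCH[1]}" prefix="${BASH_REMATCH[2]}"
    (( 10#$prefix <= 32 )) || return 1
    is_valid_ipv4 "$addr"
}

if is_valid_ipv4_cidr '10.10.0.0/24; echo INJECTED'; then
    echo "accepted"
else
    echo "rejected"
fi
```

Rejecting anything that is not a plain address or CIDR before it reaches an iptables invocation means shell metacharacters never get near a command line.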

@zbb88888
Collaborator Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces hairpin SNAT functionality to allow internal traffic to access services via their floating IPs. The changes in nat-gateway.sh are well-implemented, adding the necessary iptables chains and logic to automatically manage hairpin rules alongside SNAT rules. The E2E tests in e2e_test.go are comprehensive, covering creation, deletion, and shared EIP scenarios, which ensures the new feature is robust. The minor suggestions to improve code consistency and readability remain valid.

@zbb88888
Collaborator Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

The pull request introduces hairpin SNAT functionality to the VPC NAT gateway, enabling internal VMs to access other internal VMs via their Floating IP (FIP) when SNAT is also configured. This is a significant improvement for network connectivity within the VPC. The implementation includes automatic management of hairpin SNAT rules during SNAT creation and deletion, as well as comprehensive E2E tests. I have identified a few opportunities for improvement in the shell script to enhance robustness, particularly around handling multiple matching iptables rules and ensuring safe variable expansion.

@zbb88888 zbb88888 force-pushed the eip2eip branch 5 times, most recently from 03d36bf to 408579f February 28, 2026 12:46
@zbb88888
Collaborator Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces hairpin SNAT functionality to enable internal-to-internal traffic via Floating IPs, with modifications to the nat-gateway.sh script for managing iptables rules and new e2e tests. A critical vulnerability exists in the rule deletion process: the script's use of grep to identify rules for deletion can fail when multiple rules match, potentially leaving stale iptables rules active for both standard and hairpin SNAT. Additionally, feedback suggests improving the robustness of iptables rule matching in both the script and tests for better correctness and performance.
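
The multiple-match concern could be handled by iterating over every matching rule rather than assuming a single grep hit. The sketch below works on `iptables-save`-style text from stdin and only prints delete commands. The helper name `delete_cmds_for_eip` is hypothetical, and it assumes `--to-source <eip>` is the last token on the rule line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `iptables-save -t nat` style lines on stdin and
# prints one delete command per rule in the given chain that SNATs to the EIP.
# Assumption: "--to-source <eip>" is the final token of the rule line.
delete_cmds_for_eip() {
    local chain="$1" eip="$2" line
    while IFS= read -r line; do
        # Anchor on the chain prefix and on the full --to-source value so that
        # 192.0.2.1 does not also match 192.0.2.10.
        if [[ "$line" == "-A $chain "* && "$line" == *" --to-source $eip" ]]; then
            printf 'iptables -t nat -D %s\n' "${line#-A }"
        fi
    done
}

# Example input; the rules and addresses are placeholders.
delete_cmds_for_eip SHARED_SNAT 192.0.2.1 <<'EOF'
-A SHARED_SNAT -s 10.10.0.0/24 -j SNAT --to-source 192.0.2.1
-A SHARED_SNAT -s 10.20.0.0/24 -j SNAT --to-source 192.0.2.1
-A SHARED_SNAT -s 10.30.0.0/24 -j SNAT --to-source 192.0.2.10
EOF
```

Because each matching line becomes its own delete command, two rules sharing an EIP are both removed, and a rule for a longer address with the same prefix is left alone.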

@zbb88888 zbb88888 marked this pull request as ready for review February 28, 2026 13:00
@zbb88888 zbb88888 changed the title from "allow from snat eip to fip eip" to "feat: allow from snat eip to fip eip" Feb 28, 2026
@zbb88888
Collaborator Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces hairpin NAT functionality to allow internal services to access each other using their external IPs (FIPs). The changes involve adding a new HAIRPIN_SNAT iptables chain and logic to automatically add/remove hairpin rules when SNAT rules for internal CIDRs are managed. The implementation in nat-gateway.sh is robust, using idempotent operations and detailed rule matching. The accompanying e2e tests in Go are comprehensive, covering rule creation, data path validation, and cleanup scenarios, ensuring the new feature is well-tested. I've suggested a minor performance optimization in the shell script to cache results of is_internal_cidr within function calls to avoid redundant operations in loops.
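
The caching suggestion amounts to evaluating the check once per function call rather than once per loop iteration. A minimal sketch follows, using a stub `is_internal_cidr` that only counts invocations. The stub's matching logic and the `list_hairpin_targets` function are assumptions for illustration, not the script's real code.

```shell
#!/usr/bin/env bash
calls=0

# Stub standing in for the real check (assumption: the real one is costlier,
# for example because it inspects routes). It counts how often it is invoked.
is_internal_cidr() {
    calls=$((calls + 1))
    [[ "$1" == 10.* ]]
}

# Hypothetical caller: prints one line per EIP when the CIDR is internal.
list_hairpin_targets() {
    local cidr="$1" eip internal=0
    shift
    # Evaluate the check once, before the loop, and reuse the cached result.
    if is_internal_cidr "$cidr"; then internal=1; fi
    for eip in "$@"; do
        if (( internal )); then
            printf '%s -> %s\n' "$cidr" "$eip"
        fi
    done
}

list_hairpin_targets 10.10.0.0/24 192.0.2.10 192.0.2.11 192.0.2.12
printf 'is_internal_cidr calls: %s\n' "$calls"
```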

@zbb88888
Collaborator Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces hairpin SNAT functionality, enabling internal services to access each other via their floating IPs, with changes to the nat-gateway.sh script and new e2e tests. However, the implementation has critical security concerns, including a vulnerable command execution pattern in the controller that allows for RCE in the privileged NAT gateway pod. Additionally, the hairpin SNAT logic is too broad, potentially breaking multi-EIP configurations, and violates several shell scripting best practices, which could lead to globbing issues. For improved maintainability and test robustness, duplicated code in the shell script should be refactored, and the new e2e test needs to be more comprehensive by asserting the correctness of the source IP after NAT.

@zbb88888
Collaborator Author

zbb88888 commented Mar 1, 2026

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces hairpin NAT functionality, allowing internal services to access other internal services via their external IPs (FIPs). The changes include adding new iptables chains and rules in nat-gateway.sh to handle hairpin SNAT, and corresponding E2E tests in Go to validate the new feature. The implementation appears correct and robust, with good use of efficient shell scripting patterns and comprehensive test coverage for rule creation, data path verification, and cleanup. I have one suggestion to improve code maintainability by reducing duplication.

@zbb88888
Collaborator Author

zbb88888 commented Mar 2, 2026

@oilbeater could you please help review this?

@dosubot dosubot bot added the lgtm label (This PR has been approved by a maintainer) Mar 2, 2026
Signed-off-by: zbb88888 <jmdxjsjgcxy@gmail.com>
@zbb88888 zbb88888 merged commit 75b7911 into kubeovn:master Mar 2, 2026
10 checks passed
@zbb88888 zbb88888 deleted the eip2eip branch March 2, 2026 05:01
@SkalaNetworks
Member

SkalaNetworks commented Mar 13, 2026

Hi @zbb88888

I'm testing this patch on my NAT gateways and I'm finding more use cases where we could improve the hairpin logic. I'd like to run them by you:

  • Hairpin only applies if the SNAT targets a full subnet within the VPC (not part of a subnet, and not something bigger, e.g. 0.0.0.0/0)

The hairpin logic looks through routes on the VPC interface of the NAT gateway, but it matches the full route.
Say I've got a subnet 10.10.0.0/24 but I'm SNATing only the 10.10.0.0/25 part, then the hairpin won't match and won't kick in.

Or if I'm SNATing multiple subnets behind a single NAT gateway through a 0.0.0.0/0 SNAT, then hairpin will not work for anyone.
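
To illustrate the difference between exact route matching and containment, here is a minimal sketch of an IPv4 CIDR overlap check in pure bash arithmetic. The function names are hypothetical, the inputs are assumed to be already validated (canonical dotted-quad, no leading zeros), and this is only one possible direction, not what the PR implements.

```shell
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    local a b c d
    IFS=. read -r a b c d <<< "$1"
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Succeeds when CIDR $1 lies entirely inside CIDR $2.
cidr_within() {
    local inner_ip="${1%/*}" inner_len="${1#*/}"
    local outer_ip="${2%/*}" outer_len="${2#*/}"
    (( inner_len >= outer_len )) || return 1
    local mask=$(( outer_len == 0 ? 0 : (0xFFFFFFFF << (32 - outer_len)) & 0xFFFFFFFF ))
    (( ($(ip_to_int "$inner_ip") & mask) == ($(ip_to_int "$outer_ip") & mask) ))
}

# Two CIDR blocks overlap exactly when one contains the other.
cidrs_overlap() {
    cidr_within "$1" "$2" || cidr_within "$2" "$1"
}

# A /25 SNAT inside a /24 subnet, and a 0.0.0.0/0 SNAT covering it.
cidrs_overlap 10.10.0.0/25 10.10.0.0/24 && echo "partial subnet overlaps"
cidrs_overlap 0.0.0.0/0 10.10.0.0/24 && echo "default route overlaps"
```

An overlap test like this would catch both cases described above, whereas comparing the SNAT CIDR to the route string for equality catches neither.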

  • Hairpin doesn’t work for FIPs

Despite the title of the PR, I tested the hairpin rules and they only get created for SNATs, not when creating Floating IPs, but I may be missing something.

  • Hairpin doesn’t work when NATing to LBs

If you have a load balancer with a custom range within your subnet (e.g. 198.18.0.0/24), this LB targets pods within your subnet transparently (it doesn't SNAT to its own address). But no hairpin is created for it.

  • If your NAT gateway is attached to two subnets, hairpin won't work for the "non-primary" one.

Say you have subnet A and B, with the NAT gateway being in subnet A. Redirect your traffic using a static route in subnet B to the NAT gateway. You now have one NAT gateway for 2 subnets.

Do SNAT+DNAT of a pod within subnet A, try to access it through its EIP from subnet B. No hairpin is created for it, as it was only created for subnet A.


Hairpin seems to be pretty hard... I'm examining solutions, and here's what I came up with:

  • If a FIP is created (EXCLUSIVE_SNAT) or a SNAT is created (SHARED_SNAT), resync the NAT gateway hairpins
  • Lookup every Subnet in the VPC (the NAT gateway automatically has routes injected for each subnet in the VPC)
  • Extract the CIDRs for these subnets.
  • For each CIDR, create a hairpin rule back to the subnet in which the FIP/SNAT was created.

Example:

  • Subnet A, B, C (10.10.0.0/24, 10.20.0.0/24, 10.30.0.0/24) in a VPC
  • NAT gateway in Subnet A
  • All 3 subnets use the same NAT gateway from Subnet A
  • A VM in Subnet A with address 10.10.0.10 is created and runs a webserver on port 80
  • EIP + SNAT + DNAT is created so that this VM has internet access + the internet can reach port 80

Without hairpin, none of the 3 subnets can reach the EIP and expect traffic coming back with the correct source IP.

When the SNAT is created, ovn-controller triggers the following reconciliation:

  • Pickup all Subnets in VPC (10.10.0.0/24, 10.20.0.0/24, 10.30.0.0/24)
  • Pickup the "source subnet" (the one targeted by the SNAT) -> here it's subnet A or 10.10.0.0/24
  • Create hairpin rules
    • If source is 10.10.0.0/24 and destination 10.10.0.0/24 then SNAT to EIP
    • If source is 10.20.0.0/24 and destination 10.10.0.0/24 then SNAT to EIP
    • If source is 10.30.0.0/24 and destination 10.10.0.0/24 then SNAT to EIP

That way, if any subnet in the VPC passes through the NAT gateway, hairpin is successful. For FIPs rather than SNATs, replace 10.10.0.0/24 with 10.10.0.10/32, for example.
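
The reconciliation described above can be sketched as a loop that emits one hairpin rule per VPC subnet. The function name `hairpin_rules_for`, the example EIP 192.0.2.10, and the exact iptables match fields are assumptions for illustration. The sketch prints commands rather than executing them.

```shell
#!/usr/bin/env bash
# Hypothetical reconciliation helper: prints one hairpin rule per VPC subnet.
#   $1    = EIP to SNAT to (placeholder address in the example)
#   $2    = NAT target: the subnet CIDR for a SNAT, or a /32 for a FIP
#   $3... = CIDRs of every subnet in the VPC
hairpin_rules_for() {
    local eip="$1" target="$2" src
    shift 2
    for src in "$@"; do
        printf 'iptables -t nat -A HAIRPIN_SNAT -s %s -d %s -j SNAT --to-source %s\n' \
            "$src" "$target" "$eip"
    done
}

# SNAT case from the example: three subnets, target is subnet A.
hairpin_rules_for 192.0.2.10 10.10.0.0/24 10.10.0.0/24 10.20.0.0/24 10.30.0.0/24

# FIP case: same subnets, target narrowed to the VM's /32.
hairpin_rules_for 192.0.2.10 10.10.0.10/32 10.10.0.0/24 10.20.0.0/24 10.30.0.0/24
```

Keeping the target as a parameter lets the same loop serve both the SNAT and the FIP variants of the proposal.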

For the loadbalancer case, if the route is statically injected in the VPC NAT gateway (it is possible through a new option I added to the CRD), we can add an option to hairpin the traffic.

What do you think?

@SkalaNetworks
Member

Testing this #6445

Labels

feature (New network feature), lgtm (This PR has been approved by a maintainer), size:L (This PR changes 100-499 lines, ignoring generated files.), test (automation tests)

4 participants