Container to container networking performance degradation  #293

Open
@brunograz

Description

Moving this issue out of cloudfoundry/cf-networking-release#213 as we have indications that it is related to the Stemcell.

Issue

We observe timeouts in container-to-container (C2C) networking after moving CF from Bionic to Jammy stemcells.
The issue appears only once the Diego cells are migrated from Bionic to Jammy; it cannot be reproduced on Bionic stemcells.
We also tested in different environments, with and without dynamic ASGs.

Steps to Reproduce - See additional information below

  • Install cf-deployment v27.2.0 on a Jammy stemcell
  • Push two apps and add a network policy allowing traffic from app-a to app-b:
    cf add-network-policy app-a app-b --protocol tcp --port 8080
  • SSH into app-a and try to reach app-b
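
The last step can be repeated in a loop to make the sporadic delays easier to spot. The following is a minimal sketch, not the exact commands used for this report: it assumes `curl` and `awk` are available inside the app-a container, and the URL `http://app-b.apps.internal:8080` is a hypothetical internal route for app-b. The helper names `probe`, `count_slow`, and `run_probe` are illustrative.

```shell
#!/bin/sh
# Hypothetical internal route for app-b (an assumption, not from the report);
# substitute whatever address app-a actually uses to reach app-b.
URL="${URL:-http://app-b.apps.internal:8080}"

# Print the total time in seconds for one request (curl's %{time_total}).
# -m 5 caps each request at five seconds so a hung connection does not block the loop.
probe() {
  curl -s -o /dev/null -m 5 -w '%{time_total}\n' "$1"
}

# Read one duration per line on stdin; print how many exceed the threshold (seconds).
count_slow() {
  awk -v t="$1" '($1 + 0) > (t + 0) { n++ } END { print n + 0 }'
}

# Issue N requests against $URL and print how many took longer than the threshold.
run_probe() {
  n="${1:-50}"
  threshold="${2:-0.5}"
  i=0
  while [ "$i" -lt "$n" ]; do
    probe "$URL"
    i=$((i + 1))
  done | count_slow "$threshold"
}
```

After `cf ssh app-a`, running `run_probe 100 0.5` would print how many of 100 requests took longer than half a second. On an unaffected cell that count would be expected to stay at or near zero.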

Expected result

Successful connections from app-a to app-b.

Current result

Sporadic timeouts and slow connections from app-a to app-b.

[backend-wgnnmafs]: Hello!
real    0m2.035s
user    0m0.000s
sys     0m0.007s
[backend-wgnnmafs]: Hello!
real    0m0.018s
user    0m0.000s
sys     0m0.007s
[backend-wgnnmafs]: Hello!
real    0m1.039s
user    0m0.000s
sys     0m0.007s

Workaround

On every Cloud Foundry Diego cell, disable the two UDP tunnel segmentation offload features on the network interface:
ethtool -K eth0 tx-udp_tnl-segmentation off && ethtool -K eth0 tx-udp_tnl-csum-segmentation off

Both features are off by default on Bionic, whereas on Jammy they are on.
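
To confirm the state of the two features before and after applying the workaround, something like the following sketch could be used on a Diego cell. It assumes the interface is `eth0`, as in the command above, and that `ethtool -k` prints one `feature: on|off` line per feature. The helper names `offload_state`, `show_offloads`, and `disable_offloads` are illustrative.

```shell
#!/bin/sh
# Interface name taken from the workaround; adjust if the cell uses another one.
IFACE="${IFACE:-eth0}"
FEATURES="tx-udp_tnl-segmentation tx-udp_tnl-csum-segmentation"

# Read `ethtool -k` output on stdin and print the state (on/off) of one feature.
# Lines look like "tx-udp_tnl-segmentation: on", optionally followed by "[fixed]".
offload_state() {
  awk -v f="$1:" '$1 == f { print $2; exit }'
}

# Print the current state of both features on $IFACE.
show_offloads() {
  for f in $FEATURES; do
    printf '%s: %s\n' "$f" "$(ethtool -k "$IFACE" | offload_state "$f")"
  done
}

# Turn both features off (requires root), then print the resulting state.
disable_offloads() {
  for f in $FEATURES; do
    ethtool -K "$IFACE" "$f" off
  done
  show_offloads
}
```

Running `show_offloads` on a Bionic cell and on a Jammy cell should make the difference in defaults visible; `disable_offloads` applies the same change as the one-liner. Settings changed with `ethtool -K` are generally not persistent, so the workaround would likely need to be reapplied after a reboot or when the cell VM is recreated.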

Further information

Infrastructure: ESXi prepared with NSX-T / NSX-V (tested on both). We do not know whether the issue can be reproduced in other cloud environments.
