Instances lost outbound internet access after changing silo default IP pool #7297

Open
@askfongjojo

@iliana has an instance with an ephemeral IP that lost outbound internet access on rack3 (inbound access works fine). I looked at its OPTE entries and noticed that the router-target in the Outbound Rules section of the nat layer has no internet gateway UUID (it shows only meta: router-target=ig).

BRM42220054 # opteadm dump-layer nat -p opte4
Port opte4 - Layer nat
======================================================================
Inbound Flows
----------------------------------------------------------------------
PROTO  SRC IP          SPORT  DST IP          DPORT  HITS  ACTION
TCP    92.255.85.253   40574  45.154.216.171  22     0     NAT
TCP    92.255.85.253   40576  45.154.216.171  22     0     NAT
[SNIP]

Outbound Flows
----------------------------------------------------------------------
PROTO  SRC IP      SPORT  DST IP          DPORT  HITS  ACTION
TCP    172.30.0.6  22     92.255.85.253   40574  1     NAT
[SNIP]

Inbound Rules
----------------------------------------------------------------------
ID   PRI  HITS    PREDICATES                   ACTION
5    10   155268  inner.ip.dst=45.154.216.171  "Stateful: 172.30.0.6 <=> (external)"
DEF  --   17176   --                           "allow"

Outbound Rules
----------------------------------------------------------------------
ID   PRI  HITS   PREDICATES                    ACTION
15   10   0      inner.ether.ether_type=IPv4   "Stateful: 172.30.0.6 <=> 45.154.216.171"
                 meta: router-target=ig        
                                               
16   100  0      inner.ether.ether_type=IPv4   "Stateful: 45.154.216.124:16384-32767"
                 meta: router-target=ig        
                                               
17   255  28267  meta: router-target-class=ig  "Deny"
DEF  --   100    --                            "allow"

For comparison, this is what the output looks like for another instance with an ephemeral IP in the same IP pool (here we see meta: router-target=ig=46452e5f-1ddc-4b7c-9013-114d1a26d936):

BRM42220054 # opteadm dump-layer nat -p opte6
Port opte6 - Layer nat
======================================================================
Inbound Flows
----------------------------------------------------------------------
PROTO  SRC IP          SPORT  DST IP          DPORT  HITS  ACTION
TCP    3.136.208.236   45761  45.154.216.194  49203  0     NAT
TCP    60.167.165.58   48622  45.154.216.194  22     1     NAT
[SNIP]

Outbound Flows
----------------------------------------------------------------------
PROTO  SRC IP      SPORT  DST IP          DPORT  HITS  ACTION
TCP    172.30.0.5  49203  3.136.208.236   45761  0     NAT
[SNIP]

Inbound Rules
----------------------------------------------------------------------
ID   PRI  HITS    PREDICATES                   ACTION
4    10   197877  inner.ip.dst=45.154.216.194  "Stateful: 172.30.0.5 <=> (external)"
DEF  --   11004   --                           "allow"

Outbound Rules
----------------------------------------------------------------------
ID   PRI  HITS   PREDICATES                                                   ACTION
12   10   13687  inner.ether.ether_type=IPv4                                  "Stateful: 172.30.0.5 <=> 45.154.216.194"
                 meta: router-target=ig=46452e5f-1ddc-4b7c-9013-114d1a26d936  
                                                                              
13   100  0      inner.ether.ether_type=IPv4                                  "Stateful: 45.154.216.87:32768-49151"
                 meta: router-target=ig=46452e5f-1ddc-4b7c-9013-114d1a26d936  
                                                                              
14   255  0      meta: router-target-class=ig                                 "Deny"
DEF  --   5908   --                                                           "allow"
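
The difference between the two ports can be spotted mechanically. The following is a minimal Python sketch, not part of the original report, that scans the text of an `opteadm dump-layer nat` dump and lists the rule IDs whose router-target is a bare `ig` with no gateway UUID. The function name `find_untagged_rules` and both regexes are illustrative assumptions based only on the output format shown here.

```python
import re

# "meta: router-target=ig", optionally followed by "=<uuid>".
# "router-target-class=ig" (the Deny rule) deliberately does not match.
ROUTER_TARGET = re.compile(
    r"meta: router-target=ig"
    r"(?:=(?P<uuid>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}))?"
    r"(?=\s|$)"
)
# First line of a rule: numeric ID, priority, hit count.
RULE_START = re.compile(r"^(\d+)\s+\d+\s+\d+\s+")


def find_untagged_rules(dump_text):
    """Return IDs of rules whose router-target carries no gateway UUID."""
    untagged = []
    current_rule_id = None
    for line in dump_text.splitlines():
        start = RULE_START.match(line)
        if start:
            current_rule_id = start.group(1)
        target = ROUTER_TARGET.search(line)
        if target and target.group("uuid") is None and current_rule_id:
            untagged.append(current_rule_id)
    return untagged


# Outbound Rules excerpt copied from the opte4 dump above.
SAMPLE = """\
ID   PRI  HITS   PREDICATES                    ACTION
15   10   0      inner.ether.ether_type=IPv4   "Stateful: 172.30.0.6 <=> 45.154.216.171"
                 meta: router-target=ig
16   100  0      inner.ether.ether_type=IPv4   "Stateful: 45.154.216.124:16384-32767"
                 meta: router-target=ig
17   255  28267  meta: router-target-class=ig  "Deny"
DEF  --   100    --                            "allow"
"""

print(find_untagged_rules(SAMPLE))  # ['15', '16']
```

Run against the opte6 rules, where every router-target carries a UUID, the same function returns an empty list.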

The port in question does have the correct internet gateway ID captured in the OPTE router layer output:

BRM42220054 # opteadm dump-layer router -p opte4
Port opte4 - Layer router
======================================================================
Inbound Flows
----------------------------------------------------------------------
PROTO  SRC IP  SPORT  DST IP  DPORT  HITS  ACTION

Outbound Flows
----------------------------------------------------------------------
PROTO  SRC IP  SPORT  DST IP  DPORT  HITS  ACTION

Inbound Rules
----------------------------------------------------------------------
ID   PRI  HITS    PREDICATES  ACTION
DEF  --   212410  --          "allow"

Outbound Rules
----------------------------------------------------------------------
ID   PRI  HITS    PREDICATES                         ACTION
1    31   100     inner.ip.dst=172.30.0.0/22         "Meta: Target = Subnet: 172.30.0.0/22"
2    75   110613  inner.ip.dst=0.0.0.0/0             "Meta: Target = IG(Some(00f46642-721c-45aa-b4da-0534ab36b49f))"
0    139  0       inner.ip6.dst=fd37:ff93:8bab::/64  "Meta: Target = Subnet: fd37:ff93:8bab::/64"
3    267  0       inner.ip6.dst=::/0                 "Meta: Target = IG(Some(00f46642-721c-45aa-b4da-0534ab36b49f))"
DEF  --   0       --                                 "deny"
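
To make the mismatch on opte4 explicit, the gateway IDs named by the router layer can be compared with the ones the nat layer rules are tagged with. The sketch below is an illustration only: the helper names are hypothetical, and the regexes assume the `IG(Some(<uuid>))` and `router-target=ig=<uuid>` spellings seen in these dumps. A non-empty result would be consistent with the hit counters above, where the tagged rules on opte4 show 0 hits and the Deny rule shows 28267, though this report does not confirm that interpretation.

```python
import re

UUID = r"[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}"


def router_gateways(router_dump):
    """Gateway IDs that the router layer assigns as targets."""
    return set(re.findall(r"IG\(Some\((" + UUID + r")\)\)", router_dump))


def nat_gateways(nat_dump):
    """Gateway IDs that nat layer rules are tagged with."""
    return set(re.findall(r"router-target=ig=(" + UUID + r")", nat_dump))


def unmatched_gateways(router_dump, nat_dump):
    """Gateways the router layer targets that no nat rule is tagged with."""
    return router_gateways(router_dump) - nat_gateways(nat_dump)


# Excerpts copied from the opte4 dumps in this report.
ROUTER_OPTE4 = """\
2    75   110613  inner.ip.dst=0.0.0.0/0  "Meta: Target = IG(Some(00f46642-721c-45aa-b4da-0534ab36b49f))"
3    267  0       inner.ip6.dst=::/0      "Meta: Target = IG(Some(00f46642-721c-45aa-b4da-0534ab36b49f))"
"""
NAT_OPTE4 = """\
15   10   0      inner.ether.ether_type=IPv4   "Stateful: 172.30.0.6 <=> 45.154.216.171"
                 meta: router-target=ig
17   255  28267  meta: router-target-class=ig  "Deny"
"""

print(unmatched_gateways(ROUTER_OPTE4, NAT_OPTE4))
# {'00f46642-721c-45aa-b4da-0534ab36b49f'}
```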

I wonder if this is because the default IP pool for the silo was changed after the ephemeral IP was allocated but before the migration script schema/crdb/internet-gateway/up13.sql was executed. The migration script auto-created a default gateway attached to the current default IP pool (the pool named eng-vpn), while the instance has its external IP in the original default pool named public. @FelixMcFelix - thoughts?

Labels: customer (for any bug reports or feature requests tied to customer requests), known issue (to include in customer documentation and training)
