Commit d091d19

Add retry logic for tunnel connection attempts using backoff
Introduce backoff-based retry mechanism in `dialer` to handle cases where the target IP isn't ready to accept requests immediately.

Signed-off-by: Thomas Hallgren <thomas@tada.se>
1 parent 6fe17dd commit d091d19

4 files changed, 31 additions and 6 deletions

CHANGELOG.yml

Lines changed: 5 additions & 0 deletions
@@ -60,6 +60,11 @@ items:
     The Traffic Agent's retry interval when it establishes its watcher for intercepts is now configurable using the Helm chart value
     `agent.watchRetryInterval`. The default retry interval was also increased from 2 seconds to 10 seconds to improve resilience when
     connections to the traffic manager are lost.
+- type: bugfix
+  title: Add retry logic for tunnel connection attempts
+  body: >-
+    The tunnel dialer now uses a backoff-based retry mechanism when establishing connections. This ensures that dialing attempts for
+    both TCP and UDP protocols persist if the target IP is not immediately ready to receive requests.
 - type: bugfix
   title: Retry mechanism for client tunnel creation
   body: >-

docs/release-notes.md

Lines changed: 6 additions & 0 deletions
@@ -32,6 +32,12 @@ The watchable map has been refactored into a client/server model that supports d
 The Traffic Agent's retry interval when it establishes its watcher for intercepts is now configurable using the Helm chart value `agent.watchRetryInterval`. The default retry interval was also increased from 2 seconds to 10 seconds to improve resilience when connections to the traffic manager are lost.
 </div>

+## <div style="display:flex;"><img src="images/bugfix.png" alt="bugfix" style="width:30px;height:fit-content;"/><div style="display:flex;margin-left:7px;">Add retry logic for tunnel connection attempts</div></div>
+<div style="margin-left: 15px">
+
+The tunnel dialer now uses a backoff-based retry mechanism when establishing connections. This ensures that dialing attempts for both TCP and UDP protocols persist if the target IP is not immediately ready to receive requests.
+</div>
+
 ## <div style="display:flex;"><img src="images/bugfix.png" alt="bugfix" style="width:30px;height:fit-content;"/><div style="display:flex;margin-left:7px;">Retry mechanism for client tunnel creation</div></div>
 <div style="margin-left: 15px">

docs/release-notes.mdx

Lines changed: 6 additions & 0 deletions
@@ -38,6 +38,12 @@ The watchable map has been refactored into a client/server model that supports d
 The Traffic Agent's retry interval when it establishes its watcher for intercepts is now configurable using the Helm chart value `agent.watchRetryInterval`. The default retry interval was also increased from 2 seconds to 10 seconds to improve resilience when connections to the traffic manager are lost.
 </Body>
 </Note>
+<Note>
+<Title type="bugfix">Add retry logic for tunnel connection attempts</Title>
+<Body>
+The tunnel dialer now uses a backoff-based retry mechanism when establishing connections. This ensures that dialing attempts for both TCP and UDP protocols persist if the target IP is not immediately ready to receive requests.
+</Body>
+</Note>
 <Note>
 <Title type="bugfix">Retry mechanism for client tunnel creation</Title>
 <Body>

pkg/tunnel/dialer.go

Lines changed: 14 additions & 6 deletions
@@ -13,6 +13,7 @@ import (
 	"sync/atomic"
 	"time"

+	"github.com/cenkalti/backoff/v4"
 	"google.golang.org/grpc/codes"
 	"google.golang.org/grpc/status"

@@ -167,12 +168,19 @@ func (h *dialer) Start(ctx context.Context) {
 dtoCtx, cancel := context.WithTimeout(ctx, dto)
 defer cancel()
 var conn net.Conn
-var err error
-if id.Protocol() == types.ProtoUDP {
-	conn, err = d.DialUDP(dtoCtx, netip.AddrPort{}, id.Destination())
-} else {
-	conn, err = d.DialTCP(dtoCtx, id.Destination())
-}
+
+// A retry is needed here because the attempt to establish a Tunnel might arrive before
+// the target IP is ready to receive requests. The target IP might well be intercepted
+// (or in progress of switching to become intercepted).
+err := backoff.Retry(func() error {
+	var err error
+	if id.Protocol() == types.ProtoUDP {
+		conn, err = d.DialUDP(dtoCtx, netip.AddrPort{}, id.Destination())
+	} else {
+		conn, err = d.DialTCP(dtoCtx, id.Destination())
+	}
+	return err
+}, backoff.WithContext(backoff.NewConstantBackOff(time.Second), dtoCtx))
 if err != nil {
 	dlog.Errorf(ctx, "!> %s %s, failed to establish connection: %v", tag, id, err)
 	if err = h.stream.Send(ctx, NewMessage(DialReject, nil)); err != nil {
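
The hunk above retries the dial at a constant one-second interval and passes the same `dtoCtx` to both the dial calls and the retry wrapper, so the dial timeout appears to cap the total time spent retrying. The following is a minimal, standard-library-only sketch of that pattern, not the `backoff` package's implementation: `retryConstant`, `demo`, `timesOut`, and `errNotReady` are hypothetical names introduced for illustration, and the intervals are shortened so the example finishes quickly.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errNotReady stands in for a dial error returned while the target
// is not yet accepting connections (hypothetical, for illustration).
var errNotReady = errors.New("target not ready")

// retryConstant calls op until it succeeds, waiting interval between
// attempts, and gives up once ctx is done. This is an illustrative
// helper, not an API of the backoff library used in the commit.
func retryConstant(ctx context.Context, interval time.Duration, op func() error) error {
	for {
		err := op()
		if err == nil {
			return nil
		}
		timer := time.NewTimer(interval)
		select {
		case <-ctx.Done():
			timer.Stop()
			// Wrap the context error and keep the last attempt's error for logging.
			return fmt.Errorf("%w (last error: %v)", ctx.Err(), err)
		case <-timer.C:
			// Interval elapsed; try again.
		}
	}
}

// demo simulates a target that rejects the first two attempts and
// accepts the third, returning the number of attempts made.
func demo() (attempts int, err error) {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	err = retryConstant(ctx, 10*time.Millisecond, func() error {
		attempts++
		if attempts < 3 {
			return errNotReady
		}
		return nil
	})
	return attempts, err
}

// timesOut reports whether retrying against a target that never becomes
// ready ends with the context's deadline error.
func timesOut() bool {
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	err := retryConstant(ctx, 10*time.Millisecond, func() error { return errNotReady })
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	attempts, err := demo()
	fmt.Println(attempts, err) // prints "3 <nil>"
	fmt.Println(timesOut())    // prints "true"
}
```

In this sketch the context is the only stop condition, which mirrors how the commit bounds its retries with the dial timeout rather than with a maximum attempt count.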
