Skip to content

Conversation

@pappz
Copy link
Contributor

@pappz pappz commented Dec 30, 2025

  • Moving auth functions from client/internal to client/internal/auth package
  • Creating unified auth.Auth client with NewAuth() constructor
  • Replacing direct auth function calls with auth client methods
  • Refactoring device flow and PKCE flow implementations
  • Updating iOS/Android/server code to use new auth client API

Describe your changes

Issue ticket number and link

Stack

Checklist

  • Is it a bug fix
  • Is a typo/documentation fix
  • Is a feature enhancement
  • It is a refactor
  • Created tests that fail without the change (if possible)

By submitting this pull request, you confirm that you have read and agree to the terms of the Contributor License Agreement.

Documentation

Select exactly one:

  • I added/updated documentation for this change
  • Documentation is not needed for this change (explain why)

Docs PR URL (required if "docs added" is checked)

Paste the PR link from https://github.com/netbirdio/docs here:

https://github.com/netbirdio/docs/pull/__

Summary by CodeRabbit

  • New Features

    • Unified authentication client used across platforms for login, SSO checks, and OAuth flows.
    • Public constructor to create auth instances from existing configs; new public auth config types for PKCE/Device flows.
  • Bug Fixes

    • Clearer separation of authentication vs transient errors.
    • Simplified token-wait behavior with timeout handled per-flow.
  • Refactor

    • Centralized PKCE and Device Code handling into the auth client and removed legacy backoff/retry probing.

✏️ Tip: You can customize this high-level summary in your review settings.

- Moving auth functions from client/internal to client/internal/auth package
- Creating unified auth.Auth client with NewAuth() constructor
- Replacing direct auth function calls with auth client methods
- Refactoring device flow and PKCE flow implementations
- Updating iOS/Android/server code to use new auth client API
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 30, 2025

📝 Walkthrough

Walkthrough

Introduces a centralized auth client (package auth) and replaces legacy internal PKCE/device/login initialization and backoff retries with auth.Auth usage across Android, iOS, CLI, embed, and server code; exposes public provider config types and constructors and removes deprecated internal helpers.

Changes

Cohort / File(s) Summary
Core Auth Client
client/internal/auth/auth.go
New persistent auth.Auth managing mgm gRPC connection, reconnection/backoff, IsSSOSupported, GetOAuthFlow, IsLoginRequired, Login, registerPeer, Close, and retry/error-classification helpers.
PKCE & Device Flows (Public API)
client/internal/auth/pkce_flow.go, client/internal/auth/device_flow.go
Introduced public configs PKCEAuthProviderConfig, DeviceAuthProviderConfig, constructors NewPKCEAuthorizationFlow/NewDeviceAuthorizationFlow, SetLoginHint, and timeout-aware WaitToken behavior.
OAuth Integration
client/internal/auth/oauth.go
Delegates flow retrieval to auth.Auth (new client) instead of former internal functions; conditional login-hint handling and preserved error semantics.
Removed Legacy Init Files
client/internal/pkce_auth.go, client/internal/device_auth.go, client/internal/login.go
Deleted legacy functions that previously fetched PKCE/Device flows and performed internal login/probing; replaced by auth.Auth methods.
Android Client
client/android/login.go
Switched to auth.Auth for IsSSOSupported/IsLoginRequired/Login; foregroundGetTokenInfo now accepts *auth.Auth; added NewAuthWithConfig; removed backoff helper usage.
iOS / tvOS SDK
client/ios/NetBirdSDK/login.go, client/ios/NetBirdSDK/client.go
Replaced probing/backoff with auth.Auth usage (IsSSOSupported, IsLoginRequired, Login, GetOAuthFlow); token wait uses provided context and auth client.
CLI
client/cmd/login.go
Replaced backoff-based internal login with auth.Auth flow; simplified token waiting (no explicit timeout ctx) and removed time/backoff imports.
Embedded Client
client/embed/embed.go
Uses auth.NewAuth and authClient.Login for setupKey/jwt login; defers Close and preserves post-login steps.
Server
client/server/server.go
loginAttempt uses auth.Auth; distinguishes auth errors vs other failures; WaitSSOLogin uses cancelable ctx instead of fixed timeout.
Tests Updated
client/internal/auth/pkce_flow_test.go, client/internal/auth/device_flow_test.go, client/internal/auth/pkce_flow_windows_test.go
Tests updated to use public config types and constructors (removed internal. type references).
Deps / Manifest
go.mod, manifest_file
Adjusted imports/modules to support new auth package and removed unused imports (backoff/time/logrus/etc.).

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client\n(Android/iOS/CLI/Server)
    participant Auth as Auth Client\n(auth.Auth)
    participant MGM as Management\nServer (mgm)
    participant OAuth as OAuth\nProvider

    Client->>Auth: NewAuth(ctx, privateKey, mgmURL, config)
    Auth->>MGM: Establish gRPC connection
    MGM-->>Auth: Connection ready

    Client->>Auth: IsSSOSupported(ctx)
    Auth->>MGM: Request PKCE flow
    alt PKCE available
        MGM-->>Auth: PKCE config
        Auth-->>Client: SSO supported (PKCE)
    else try device
        Auth->>MGM: Request Device flow
        MGM-->>Auth: Device config / NotFound
        Auth-->>Client: SSO supported? (device or not)
    end

    Client->>Auth: GetOAuthFlow(ctx, forceDevice)
    Auth->>MGM: Fetch flow config
    Auth-->>Client: OAuthFlow
    Client->>OAuth: RequestAuthInfo / WaitToken
    OAuth-->>Client: TokenInfo (JWT)

    Client->>Auth: Login(ctx, setupKey, jwtToken)
    Auth->>MGM: doMgmLogin(pubSSHKey)
    alt Login success
        MGM-->>Auth: LoginResponse
        Auth-->>Client: success
    else Registration needed
        Auth->>MGM: registerPeer(setupKey, jwtToken)
        MGM-->>Auth: Registration response
        Auth-->>Client: success
    end

    Client->>Auth: Close()
    Auth->>MGM: Disconnect
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • mlsmaycon
  • lixmal
  • pascal-fischer

Poem

🐇
I nibbled code and found the trail,
Built an auth client to tip the scale,
PKCE and Device now share one tail,
Backoff ghosts hopped off the rail,
Tokens arrive on a steady gale.

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title '[client] Consolidate authentication logic' clearly summarizes the main refactoring objective of moving and unifying authentication logic into a new auth client.
Description check ✅ Passed The description provides a bulleted summary of changes and completes the required template with refactor checkbox marked, but lacks detailed explanation of 'Describe your changes' section.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
client/cmd/login.go (1)

324-334: Using context.TODO() removes user cancellation capability.

Both RequestAuthInfo and WaitToken use context.TODO(), which means the user cannot cancel the login flow via Ctrl+C. If the browser login never completes, the CLI will hang indefinitely.

Consider using the passed ctx parameter instead to preserve cancellation semantics:

🔎 Proposed fix
-	flowInfo, err := oAuthFlow.RequestAuthInfo(context.TODO())
+	flowInfo, err := oAuthFlow.RequestAuthInfo(ctx)
 	if err != nil {
 		return nil, fmt.Errorf("getting a request OAuth flow info failed: %v", err)
 	}

 	openURL(cmd, flowInfo.VerificationURIComplete, flowInfo.UserCode, noBrowser)

-	tokenInfo, err := oAuthFlow.WaitToken(context.TODO(), flowInfo)
+	tokenInfo, err := oAuthFlow.WaitToken(ctx, flowInfo)
 	if err != nil {
 		return nil, fmt.Errorf("waiting for browser login failed: %v", err)
 	}
🧹 Nitpick comments (8)
client/internal/auth/auth.go (2)

171-198: Unconventional return signature: (error, bool) instead of (bool, error).

Go convention places error as the last return value. This reversed signature may confuse callers and lead to bugs where the values are swapped.

🔎 Proposed signature change
-func (a *Auth) Login(ctx context.Context, setupKey string, jwtToken string) (error, bool) {
+func (a *Auth) Login(ctx context.Context, setupKey string, jwtToken string) (bool, error) {
 	// ... implementation ...
-	return err, isAuthError
+	return isAuthError, err
 }

This would require updating all call sites to use isAuthError, err := authClient.Login(...) instead.


300-317: Consider extracting duplicate system info setup.

The SetFlags call with identical parameters appears in both doMgmLogin and registerPeer. Consider extracting this to a helper method to reduce duplication and ensure consistency.

🔎 Suggested helper
func (a *Auth) configureSystemInfo(info *system.Info) {
	info.SetFlags(
		a.config.RosenpassEnabled,
		a.config.RosenpassPermissive,
		a.config.ServerSSHAllowed,
		a.config.DisableClientRoutes,
		a.config.DisableServerRoutes,
		a.config.DisableDNS,
		a.config.DisableFirewall,
		a.config.BlockLANAccess,
		a.config.BlockInbound,
		a.config.LazyConnectionEnabled,
		a.config.EnableSSHRoot,
		a.config.EnableSSHSFTP,
		a.config.EnableSSHLocalPortForwarding,
		a.config.EnableSSHRemotePortForwarding,
		a.config.DisableSSHAuth,
	)
}

Also applies to: 337-354

client/internal/auth/device_flow.go (2)

135-138: SetLoginHint mutates internal state - consider thread safety.

This setter modifies providerConfig.LoginHint without synchronization. If DeviceAuthorizationFlow could be used concurrently, this could cause a data race. If single-threaded use is expected, this is fine.


250-260: Double timeout context - clarify the intended behavior.

WaitToken now creates its own timeout context from info.ExpiresIn, ignoring any timeout already set on the passed ctx. The doc comment mentions this, but callers might expect their context's deadline to take precedence.

Looking at test code and iOS/Android callers, they also create their own timeout context before calling WaitToken, resulting in nested timeouts. This works but is redundant.

Consider either:

  1. Removing the internal timeout and letting callers control it (current pattern in tests/callers)
  2. Keeping internal timeout but documenting that caller timeouts are ignored
client/ios/NetBirdSDK/login.go (1)

281-284: Redundant timeout context.

A timeout context is created here with flowInfo.ExpiresIn, but WaitToken also creates its own internal timeout from the same value. The outer timeout is effectively unused.

client/android/login.go (3)

119-122: isAuthError is ignored - inconsistent with iOS handling.

Android ignores the second return value (isAuthError) while iOS uses it to provide differentiated error messages. Consider handling it consistently:

🔎 Proposed fix for consistent error handling
-	err, _ = authClient.Login(ctxWithValues, setupKey, "")
+	err, isAuthError := authClient.Login(ctxWithValues, setupKey, "")
 	if err != nil {
-		return fmt.Errorf("login failed: %v", err)
+		if isAuthError {
+			return fmt.Errorf("authentication error: %v", err)
+		}
+		return fmt.Errorf("login failed: %v", err)
 	}

161-164: Same issue: isAuthError ignored here as well.

Consider consistent error handling with iOS.


177-177: Avoid context.TODO() - use the available context.

context.TODO() is intended as a placeholder during refactoring. Use a.ctx for consistency with the rest of the method.

🔎 Proposed fix
-	flowInfo, err := oAuthFlow.RequestAuthInfo(context.TODO())
+	flowInfo, err := oAuthFlow.RequestAuthInfo(a.ctx)
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 96cdd56 and a1fb1a7.

📒 Files selected for processing (15)
  • client/android/login.go
  • client/cmd/login.go
  • client/embed/embed.go
  • client/internal/auth/auth.go
  • client/internal/auth/device_flow.go
  • client/internal/auth/device_flow_test.go
  • client/internal/auth/oauth.go
  • client/internal/auth/pkce_flow.go
  • client/internal/auth/pkce_flow_test.go
  • client/internal/device_auth.go
  • client/internal/login.go
  • client/internal/pkce_auth.go
  • client/ios/NetBirdSDK/client.go
  • client/ios/NetBirdSDK/login.go
  • client/server/server.go
💤 Files with no reviewable changes (3)
  • client/internal/login.go
  • client/internal/device_auth.go
  • client/internal/pkce_auth.go
🧰 Additional context used
🧬 Code graph analysis (10)
client/embed/embed.go (1)
client/internal/auth/auth.go (1)
  • NewAuth (38-65)
client/ios/NetBirdSDK/client.go (3)
client/android/login.go (1)
  • NewAuth (39-54)
client/internal/auth/auth.go (1)
  • NewAuth (38-65)
client/ios/NetBirdSDK/login.go (1)
  • NewAuth (44-59)
client/internal/auth/pkce_flow_test.go (1)
client/internal/auth/pkce_flow.go (1)
  • PKCEAuthProviderConfig (38-63)
client/cmd/login.go (3)
client/android/login.go (1)
  • NewAuth (39-54)
client/internal/auth/auth.go (1)
  • NewAuth (38-65)
client/ios/NetBirdSDK/login.go (1)
  • NewAuth (44-59)
client/server/server.go (2)
client/internal/auth/auth.go (1)
  • NewAuth (38-65)
client/internal/state.go (2)
  • StatusLoginFailed (16-16)
  • StatusType (8-8)
client/internal/auth/oauth.go (2)
client/internal/auth/auth.go (1)
  • NewAuth (38-65)
client/internal/profilemanager/config.go (1)
  • Config (91-162)
client/internal/auth/auth.go (5)
shared/management/status/error.go (3)
  • Errorf (70-75)
  • Error (54-57)
  • FromError (78-87)
client/internal/auth/oauth.go (1)
  • OAuthFlow (17-21)
client/internal/auth/pkce_flow.go (3)
  • PKCEAuthorizationFlow (89-94)
  • PKCEAuthProviderConfig (38-63)
  • NewPKCEAuthorizationFlow (97-128)
shared/management/client/common/types.go (1)
  • LoginFlag (11-11)
client/internal/auth/device_flow.go (3)
  • DeviceAuthorizationFlow (75-78)
  • DeviceAuthProviderConfig (29-49)
  • NewDeviceAuthorizationFlow (103-128)
client/internal/auth/pkce_flow.go (2)
shared/management/client/common/types.go (1)
  • LoginFlag (11-11)
client/internal/auth/oauth.go (2)
  • AuthFlowInfo (29-36)
  • TokenInfo (44-52)
client/internal/auth/device_flow.go (3)
shared/management/status/error.go (1)
  • Errorf (70-75)
client/internal/auth/oauth.go (3)
  • HTTPClient (24-26)
  • AuthFlowInfo (29-36)
  • TokenInfo (44-52)
management/server/http/testing/testing_tools/tools.go (1)
  • ExpiresIn (43-43)
client/android/login.go (4)
client/internal/auth/auth.go (2)
  • NewAuth (38-65)
  • Auth (26-33)
client/ios/NetBirdSDK/login.go (2)
  • NewAuth (44-59)
  • Auth (37-41)
shared/management/status/error.go (1)
  • Errorf (70-75)
client/internal/auth/oauth.go (1)
  • TokenInfo (44-52)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
  • GitHub Check: Management / Unit (amd64, postgres)
  • GitHub Check: Management / Unit (amd64, mysql)
  • GitHub Check: Relay / Unit (386)
  • GitHub Check: Client / Unit (386)
  • GitHub Check: Management / Unit (amd64, sqlite)
  • GitHub Check: Relay / Unit (amd64, -race)
  • GitHub Check: Client (Docker) / Unit
  • GitHub Check: Client / Unit (amd64)
  • GitHub Check: Darwin
  • GitHub Check: Client / Unit
  • GitHub Check: Windows
  • GitHub Check: Linux
  • GitHub Check: release
  • GitHub Check: FreeBSD Port / Build & Test
  • GitHub Check: Client / Unit
  • GitHub Check: release_ui_darwin
  • GitHub Check: Client / Unit
  • GitHub Check: Android / Build
  • GitHub Check: JS / Lint
🔇 Additional comments (25)
client/internal/auth/pkce_flow.go (3)

37-85: LGTM on PKCEAuthProviderConfig and validation.

The public type definition and validation function are well-structured. The validation covers all essential fields with clear error messages that help administrators troubleshoot configuration issues.


176-179: LGTM on SetLoginHint method.

Simple setter that allows modifying the login hint after flow construction, enabling flexibility in the OAuth flow setup.


184-219: LGTM on the WaitToken timeout refactor.

The internal timeout context based on info.ExpiresIn is a cleaner design - callers no longer need to manage the timeout externally. Using context.Background() for the shutdown context (line 201) is correct since the server shutdown should complete regardless of whether the parent context was cancelled.

client/internal/auth/pkce_flow_test.go (1)

52-62: LGTM on test update.

The test correctly references the new PKCEAuthProviderConfig type that was moved to the current package scope, consistent with the public API refactor.

client/internal/auth/auth.go (1)

401-406: Verify that codes.Internal should trigger reconnection.

Including codes.Internal in connection errors may cause unnecessary reconnection attempts for server-side bugs that won't be fixed by reconnecting. This could mask actual issues.

Consider whether codes.Internal errors should trigger reconnection or be treated as permanent failures that bubble up to the caller.

client/embed/embed.go (1)

172-180: LGTM on auth client integration.

The embedded client correctly creates an auth client, defers its cleanup, and performs login using the new unified auth API. Error handling wraps failures appropriately.

client/ios/NetBirdSDK/client.go (2)

266-281: LGTM on IsLoginRequired refactor.

Correctly creates an auth client, handles creation errors by assuming login is required (safe fallback), and properly closes the client.


323-341: Goroutine uses parent context that may be cancelled.

The goroutine at line 323 uses ctx which is created at line 296 with c.ctxCancel. If the client is stopped (calling c.ctxCancel()), the WaitToken call and subsequent Login will fail. Ensure this is the intended behavior - that stopping the client cancels in-progress logins.

Verify that cancellation of the parent context during SSO login flow is handled gracefully on the Swift side.

client/server/server.go (2)

285-305: LGTM on loginAttempt refactor.

The function correctly integrates the new auth client, properly handles error/auth-error distinction, and closes the client on exit.


620-621: LGTM on context change.

The switch from context.WithTimeout to context.WithCancel is correct since WaitToken now manages its own timeout internally based on flowInfo.ExpiresIn. The cancel function is still stored for explicit cancellation if needed.

client/internal/auth/oauth.go (2)

88-105: LGTM on PKCE flow initialization.

The function correctly creates an auth client, retrieves the PKCE flow configuration, conditionally sets the login hint, and ensures cleanup via defer.


108-135: LGTM on Device Code flow initialization.

Good error handling with specific messages for NotFound and Unimplemented cases, guiding users toward setup keys when SSO is unavailable.

client/cmd/login.go (1)

277-306: LGTM on foregroundLogin refactor.

The function correctly uses the auth client for login checks and token-based authentication. The two-phase approach (check if login needed, then login with credentials) is clear.

client/internal/auth/device_flow_test.go (2)

116-127: LGTM! Clean transition to constructor-based initialization.

The test now properly uses the public NewDeviceAuthorizationFlow constructor with DeviceAuthProviderConfig, and assigns the mock HTTP client post-construction. This pattern is appropriate for testing and aligns with the refactored public API.


282-293: Consistent with the first test function.

Same constructor-based pattern applied correctly here.

client/internal/auth/device_flow.go (4)

28-49: Well-structured configuration type with clear documentation.

The exported DeviceAuthProviderConfig struct is well-documented with comments explaining each field. The deprecation notice on Domain is helpful.


103-128: Constructor looks good but missing validation call.

The constructor properly initializes HTTP transport with certificate handling and reasonable defaults. As noted above, add the validation call at the start.


258-260: Good practice: defer ticker.Stop() added.

Properly stopping the ticker prevents resource leaks.


51-71: No action needed. validateDeviceAuthConfig is already called in the getDeviceFlow() method (lines 280-282 in client/internal/auth/auth.go) before NewDeviceAuthorizationFlow is instantiated. Invalid configurations are validated at the entry point and rejected before the constructor runs—this is the correct design pattern.

client/ios/NetBirdSDK/login.go (3)

88-92: Good: Proper auth client lifecycle management.

Creating the auth client and deferring Close() ensures resources are properly released. This pattern is correctly applied throughout the file.


164-171: Correct error classification handling.

The isAuthError flag is properly used to distinguish authentication errors (like PermissionDenied) from other failures, providing meaningful error messages.


264-268: Clean refactor: foregroundGetTokenInfo now accepts auth client.

The method signature change to accept *auth.Auth instead of creating its own is appropriate since the caller already has the auth client instance.

client/android/login.go (3)

79-83: Good: Consistent auth client lifecycle.

Same proper pattern as iOS with defer authClient.Close().


166-167: OnLoginSuccess called asynchronously - differs from iOS.

Android uses go urlOpener.OnLoginSuccess() while iOS calls it synchronously with a comment explaining why. If synchronous behavior is intentional on iOS for persistence guarantees, consider whether Android should match that pattern.


171-175: Good: Method signature updated to accept auth client.

Consistent with iOS changes, avoiding redundant auth client creation.

- Remove unnecessary internal package reference in PKCE flow test
- Adjust context assignment placement in iOS and Android login methods
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
client/ios/NetBirdSDK/login.go (1)

126-134: Context with device name not passed to auth client creation.

This is the same issue flagged in previous reviews: ctxWithValues is created with the device name on line 133, but auth.NewAuth on line 126 uses a.ctx instead. If the management client or auth client needs the device name during initialization, it won't be available.

🔎 Proposed fix
+	//nolint
+	ctxWithValues := context.WithValue(a.ctx, system.DeviceNameCtxKey, deviceName)
-	authClient, err := auth.NewAuth(a.ctx, a.config.PrivateKey, a.config.ManagementURL, a.config)
+	authClient, err := auth.NewAuth(ctxWithValues, a.config.PrivateKey, a.config.ManagementURL, a.config)
 	if err != nil {
 		return fmt.Errorf("failed to create auth client: %v", err)
 	}
 	defer authClient.Close()
-
-	//nolint
-	ctxWithValues := context.WithValue(a.ctx, system.DeviceNameCtxKey, deviceName)
 	err, _ = authClient.Login(ctxWithValues, setupKey, "")
client/android/login.go (1)

111-119: Context with device name not passed to auth client creation.

Same issue as in the iOS file and flagged in previous reviews: ctxWithValues with the device name is created on line 118, but auth.NewAuth on line 111 uses a.ctx. The device name won't be available during auth client initialization.

🔎 Proposed fix
+	//nolint
+	ctxWithValues := context.WithValue(a.ctx, system.DeviceNameCtxKey, deviceName)
-	authClient, err := auth.NewAuth(a.ctx, a.config.PrivateKey, a.config.ManagementURL, a.config)
+	authClient, err := auth.NewAuth(ctxWithValues, a.config.PrivateKey, a.config.ManagementURL, a.config)
 	if err != nil {
 		return fmt.Errorf("failed to create auth client: %v", err)
 	}
 	defer authClient.Close()
-
-	//nolint
-	ctxWithValues := context.WithValue(a.ctx, system.DeviceNameCtxKey, deviceName)
 	err, _ = authClient.Login(ctxWithValues, setupKey, "")
🧹 Nitpick comments (3)
client/ios/NetBirdSDK/login.go (1)

134-134: Consider handling the isAuthError return value.

Unlike LoginSync (line 166) and login (line 236) which check the isAuthError boolean to distinguish authentication errors from other failures, this line ignores it with _. For consistency and better error diagnostics, consider checking this value.

🔎 Proposed refactor
-	err, _ = authClient.Login(ctxWithValues, setupKey, "")
+	err, isAuthError := authClient.Login(ctxWithValues, setupKey, "")
 	if err != nil {
+		if isAuthError {
+			return fmt.Errorf("authentication error: %v", err)
+		}
 		return fmt.Errorf("login failed: %v", err)
 	}
client/android/login.go (2)

119-119: Consider handling the isAuthError return value.

The second return value from authClient.Login is ignored. While setup key authentication may not need the same error differentiation as OAuth flows, handling this value would provide better error diagnostics and consistency with potential future changes to the Login method.

🔎 Proposed refactor
-	err, _ = authClient.Login(ctxWithValues, setupKey, "")
+	err, isAuthError := authClient.Login(ctxWithValues, setupKey, "")
 	if err != nil {
+		if isAuthError {
+			return fmt.Errorf("authentication error: %v", err)
+		}
 		return fmt.Errorf("login failed: %v", err)
 	}

177-177: Consider using a timeout context instead of context.TODO().

RequestAuthInfo uses context.TODO() while WaitToken (line 184) uses a.ctx. The iOS implementation uses a bounded timeout context for this call (lines 271-272 in iOS file). Using context.TODO() means this request has no cancellation or timeout, which could lead to indefinite hangs.

🔎 Proposed refactor
+	// Use a bounded timeout for the auth info request to prevent indefinite hangs
+	authInfoCtx, authInfoCancel := context.WithTimeout(a.ctx, 30*time.Second)
+	defer authInfoCancel()
+
-	flowInfo, err := oAuthFlow.RequestAuthInfo(context.TODO())
+	flowInfo, err := oAuthFlow.RequestAuthInfo(authInfoCtx)

Note: You'll need to import time if not already imported.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a1fb1a7 and 6960926.

📒 Files selected for processing (3)
  • client/android/login.go
  • client/internal/auth/pkce_flow_windows_test.go
  • client/ios/NetBirdSDK/login.go
🧰 Additional context used
🧬 Code graph analysis (2)
client/internal/auth/pkce_flow_windows_test.go (1)
client/internal/auth/pkce_flow.go (1)
  • PKCEAuthProviderConfig (38-63)
client/android/login.go (4)
client/internal/auth/auth.go (2)
  • NewAuth (38-65)
  • Auth (26-33)
shared/management/status/error.go (1)
  • Errorf (70-75)
client/system/info.go (1)
  • DeviceNameCtxKey (16-16)
client/internal/auth/oauth.go (1)
  • TokenInfo (44-52)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
  • GitHub Check: Relay / Unit (386)
  • GitHub Check: Client / Unit (386)
  • GitHub Check: Management / Unit (amd64, postgres)
  • GitHub Check: Relay / Unit (amd64, -race)
  • GitHub Check: Client / Unit (amd64)
  • GitHub Check: Management / Unit (amd64, sqlite)
  • GitHub Check: Management / Unit (amd64, mysql)
  • GitHub Check: Signal / Unit (amd64)
  • GitHub Check: Client (Docker) / Unit
  • GitHub Check: Linux
  • GitHub Check: Windows
  • GitHub Check: Darwin
  • GitHub Check: Android / Build
  • GitHub Check: iOS / Build
  • GitHub Check: release_ui
  • GitHub Check: release
  • GitHub Check: release_ui_darwin
  • GitHub Check: FreeBSD Port / Build & Test
  • GitHub Check: Client / Unit
  • GitHub Check: Client / Unit
  • GitHub Check: JS / Lint
🔇 Additional comments (7)
client/internal/auth/pkce_flow_windows_test.go (1)

96-107: LGTM! Type reference correctly updated and all references migrated.

The change from internal.PKCEAuthProviderConfig to PKCEAuthProviderConfig is correct. Since the test is in the auth package, the direct reference to the now-exported type is appropriate. All test configuration fields (ClientID, Audience, TokenEndpoint, Scope, AuthorizationEndpoint, RedirectURLs, UseIDToken) match the struct definition. Verification confirms there are no remaining references to internal.PKCEAuthProviderConfig in the codebase, so the migration is complete.

client/ios/NetBirdSDK/login.go (4)

87-107: LGTM!

The auth client is properly created, cleaned up with defer, and error handling is appropriate for the SSO support check flow.


146-174: LGTM!

The synchronous login flow correctly creates the auth client, checks login requirements, and handles authentication errors appropriately.


205-217: Correct context handling pattern.

This function demonstrates the proper pattern: the enriched context with device name is created first (lines 207-211), then passed to auth.NewAuth (line 213), ensuring the device name is available throughout the auth client's lifetime.


264-290: LGTM!

The function signature is correctly updated to accept the auth client, and the OAuth flow with bounded timeout for auth info request is well-implemented.

client/android/login.go (2)

78-96: LGTM!

The auth client is properly created and cleaned up, and the SSO support check flow is correctly implemented.


139-169: Auth client integration looks good.

The auth client is properly created, cleaned up, and the login flow is correctly refactored. Note that unlike the iOS version, this doesn't handle device name context in the interactive login flow.

shuuri-labs
shuuri-labs previously approved these changes Dec 30, 2025
@pappz pappz requested a review from lixmal January 8, 2026 09:25
// Login attempts to log in or register the client with the management server
// Returns error and a boolean indicating if it's an authentication error (permission denied) that should stop retries.
// Automatically retries with backoff and reconnection on connection errors.
func (a *Auth) Login(ctx context.Context, setupKey string, jwtToken string) (error, bool) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The convention is to return error last, let's stick to that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal is to fully remove the second boolean return parameter. It is almost unused and generally ignored. However, this change opens up a new batch of changes that I will address in the next step. This will be the final state:

func (a *Auth) Login(ctx context.Context, setupKey string, jwtToken string) error {

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In @client/internal/auth/auth.go:
- Around line 300-316: Duplicate SetFlags call appears in doMgmLogin and
registerPeer; add a helper method on Auth named setSystemInfoFlags(info
*system.Info) that calls info.SetFlags(...) with the same 15 a.config.* booleans
(RosenpassEnabled, RosenpassPermissive, ServerSSHAllowed, DisableClientRoutes,
DisableServerRoutes, DisableDNS, DisableFirewall, BlockLANAccess, BlockInbound,
LazyConnectionEnabled, EnableSSHRoot, EnableSSHSFTP,
EnableSSHLocalPortForwarding, EnableSSHRemotePortForwarding, DisableSSHAuth),
then replace the inline SetFlags calls in doMgmLogin and registerPeer with
a.setSystemInfoFlags(sysInfo) to remove duplication.
- Around line 26-33: Data race: the Auth.client field is read without
synchronization in getPKCEFlow, getDeviceFlow, doMgmLogin, and registerPeer
while reconnect modifies it under mutex, causing concurrent access races; fix by
protecting all accesses to a.client with the Auth.mutex—either add a getClient()
helper that locks, returns the client (or nil) and is used everywhere those
methods need the client, or wrap explicit mutex.Lock()/Unlock() around reads in
getPKCEFlow, getDeviceFlow, doMgmLogin, and registerPeer; ensure callers check
for nil after obtaining the client and keep reconnect's client mutation under
the existing lock.
- Around line 391-405: isConnectionError currently treats codes.Internal as a
transient connection error; remove codes.Internal from the set that triggers
reconnection in the isConnectionError function so only codes.Unavailable,
codes.DeadlineExceeded, and codes.Canceled are considered connection errors.
Update the boolean return expression in isConnectionError to exclude
codes.Internal (and optionally adjust the function comment to reflect the
narrowed set of connection error codes), mirroring the classification used in
shared/management/client/grpc.go.
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6960926 and c6b13fc.

📒 Files selected for processing (1)
  • client/internal/auth/auth.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: Client / Unit (amd64)
  • GitHub Check: Client / Unit (386)
  • GitHub Check: Management / Unit (amd64, mysql)
  • GitHub Check: Management / Unit (amd64, sqlite)
  • GitHub Check: Management / Unit (amd64, postgres)
  • GitHub Check: Relay / Unit (amd64, -race)
  • GitHub Check: Relay / Unit (386)
  • GitHub Check: Client (Docker) / Unit
  • GitHub Check: Client / Unit
  • GitHub Check: Client / Unit
  • GitHub Check: Client / Unit
  • GitHub Check: JS / Lint
  • GitHub Check: Windows
  • GitHub Check: Darwin
  • GitHub Check: Linux
  • GitHub Check: FreeBSD Port / Build & Test
  • GitHub Check: release
  • GitHub Check: iOS / Build
  • GitHub Check: release_ui_darwin
  • GitHub Check: Android / Build

Comment on lines +391 to +405
// isConnectionError checks if the error is a connection-related error that should trigger reconnection
func isConnectionError(err error) bool {
if err == nil {
return false
}
s, ok := status.FromError(err)
if !ok {
return false
}
// These error codes indicate connection issues
return s.Code() == codes.Unavailable ||
s.Code() == codes.DeadlineExceeded ||
s.Code() == codes.Canceled ||
s.Code() == codes.Internal
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, let's locate and examine the file
find . -type f -name "auth.go" | grep -E "client/internal/auth"

Repository: netbirdio/netbird

Length of output: 91


🏁 Script executed:

# Let's check the full context around the isConnectionError function
cd client/internal/auth && wc -l auth.go

Repository: netbirdio/netbird

Length of output: 72


🏁 Script executed:

# Read the function and its surrounding context
sed -n '385,415p' client/internal/auth/auth.go

Repository: netbirdio/netbird

Length of output: 1048


🏁 Script executed:

# Find all usages of isConnectionError
rg "isConnectionError" --type go -B 2 -A 2

Repository: netbirdio/netbird

Length of output: 789


🏁 Script executed:

# Search for backoff or retry related code
rg "MaxElapsedTime|backoff|retry" --type go -A 2

Repository: netbirdio/netbird

Length of output: 42500


🏁 Script executed:

# Search for any comments or documentation about codes.Internal or error handling
rg "codes\.Internal|Internal error" --type go -B 2 -A 2

Repository: netbirdio/netbird

Length of output: 11850


🏁 Script executed:

# Check where codes.Internal is actually returned from the server
rg "codes\.Internal" --type go management/internals/shared/grpc/server.go | head -20

Repository: netbirdio/netbird

Length of output: 1565


🏁 Script executed:

# Let's see if there are tests for isConnectionError
rg "isConnectionError" --type go -B 5 -A 5

Repository: netbirdio/netbird

Length of output: 1427


🏁 Script executed:

# Check the full withRetry implementation to understand the error handling
sed -n '412,460p' client/internal/auth/auth.go

Repository: netbirdio/netbird

Length of output: 1467


🏁 Script executed:

# Look at error handling patterns in similar code (shared/management/client/grpc.go)
rg "codes\.(Unavailable|DeadlineExceeded|Canceled|Internal)" shared/management/client/grpc.go -B 2 -A 2

Repository: netbirdio/netbird

Length of output: 464


Remove codes.Internal from the list of connection errors that trigger reconnection.

The isConnectionError function incorrectly treats codes.Internal as a connection issue. However, the management server uses this code for various operational failures (encryption errors, message processing failures, settings errors), not transient connection problems. Reconnecting when the server responds with codes.Internal won't resolve these errors. Limit reconnection triggers to codes that definitively indicate connection issues: Unavailable, DeadlineExceeded, and Canceled. See how shared/management/client/grpc.go handles error classification for reference.

🤖 Prompt for AI Agents
In @client/internal/auth/auth.go around lines 391 - 405, isConnectionError
currently treats codes.Internal as a transient connection error; remove
codes.Internal from the set that triggers reconnection in the isConnectionError
function so only codes.Unavailable, codes.DeadlineExceeded, and codes.Canceled
are considered connection errors. Update the boolean return expression in
isConnectionError to exclude codes.Internal (and optionally adjust the function
comment to reflect the narrowed set of connection error codes), mirroring the
classification used in shared/management/client/grpc.go.

@pappz pappz requested a review from lixmal January 9, 2026 10:43
lixmal
lixmal previously approved these changes Jan 13, 2026
@sonarqubecloud
Copy link

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
client/cmd/login.go (1)

326-336: Use provided ctx instead of context.TODO().

Both RequestAuthInfo and WaitToken use context.TODO() when the function receives a valid ctx parameter. This prevents proper cancellation handling (e.g., Ctrl+C won't interrupt these operations).

🐛 Proposed fix
-	flowInfo, err := oAuthFlow.RequestAuthInfo(context.TODO())
+	flowInfo, err := oAuthFlow.RequestAuthInfo(ctx)
 	if err != nil {
 		return nil, fmt.Errorf("getting a request OAuth flow info failed: %v", err)
 	}

 	openURL(cmd, flowInfo.VerificationURIComplete, flowInfo.UserCode, noBrowser)

-	tokenInfo, err := oAuthFlow.WaitToken(context.TODO(), flowInfo)
+	tokenInfo, err := oAuthFlow.WaitToken(ctx, flowInfo)
🧹 Nitpick comments (2)
client/internal/auth/auth.go (2)

173-200: Unconventional return order: (error, bool).

The method signature Login(...) (error, bool) deviates from Go's conventional pattern of (result, error). While this works and is used consistently by callers, it may confuse developers expecting standard Go idioms.

Consider either:

  • Reordering to (bool, error)
  • Using a custom result type
  • Documenting clearly why this order was chosen
♻️ Suggested signature change
-func (a *Auth) Login(ctx context.Context, setupKey string, jwtToken string) (error, bool) {
+func (a *Auth) Login(ctx context.Context, setupKey string, jwtToken string) (isAuthError bool, err error) {

This uses named return values to clarify semantics while keeping the unconventional order, or consider:

-func (a *Auth) Login(ctx context.Context, setupKey string, jwtToken string) (error, bool) {
+func (a *Auth) Login(ctx context.Context, setupKey string, jwtToken string) (bool, error) {

295-306: Context should be the first parameter per Go conventions.

The ctx parameter is conventionally the first parameter in Go functions. Here it appears as the second parameter after client.

♻️ Suggested parameter reordering
-func (a *Auth) doMgmLogin(client *mgm.GrpcClient, ctx context.Context, pubSSHKey []byte) (*wgtypes.Key, *mgmProto.LoginResponse, error) {
+func (a *Auth) doMgmLogin(ctx context.Context, client *mgm.GrpcClient, pubSSHKey []byte) (*wgtypes.Key, *mgmProto.LoginResponse, error) {

Same applies to registerPeer at line 310.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c6b13fc and ad0fc13.

📒 Files selected for processing (5)
  • client/cmd/login.go
  • client/internal/auth/auth.go
  • client/internal/auth/oauth.go
  • client/ios/NetBirdSDK/client.go
  • client/server/server.go
🧰 Additional context used
🧬 Code graph analysis (5)
client/cmd/login.go (1)
client/internal/auth/auth.go (1)
  • NewAuth (38-64)
client/internal/auth/oauth.go (1)
client/internal/auth/auth.go (1)
  • NewAuth (38-64)
client/server/server.go (2)
client/internal/auth/auth.go (1)
  • NewAuth (38-64)
client/internal/state.go (2)
  • StatusLoginFailed (16-16)
  • StatusType (8-8)
client/internal/auth/auth.go (5)
shared/management/status/error.go (2)
  • Errorf (70-75)
  • FromError (78-87)
client/internal/auth/oauth.go (1)
  • OAuthFlow (17-21)
client/internal/auth/pkce_flow.go (3)
  • PKCEAuthorizationFlow (89-94)
  • PKCEAuthProviderConfig (38-63)
  • NewPKCEAuthorizationFlow (97-128)
client/internal/auth/device_flow.go (3)
  • DeviceAuthorizationFlow (75-78)
  • DeviceAuthProviderConfig (29-49)
  • NewDeviceAuthorizationFlow (103-128)
client/system/info.go (1)
  • Info (45-81)
client/ios/NetBirdSDK/client.go (1)
client/internal/auth/auth.go (1)
  • NewAuth (38-64)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
  • GitHub Check: FreeBSD Port / Build & Test
  • GitHub Check: release_ui
  • GitHub Check: release
  • GitHub Check: Client / Unit
  • GitHub Check: Client / Unit
  • GitHub Check: iOS / Build
  • GitHub Check: Client / Unit
  • GitHub Check: Android / Build
  • GitHub Check: JS / Lint
  • GitHub Check: Build Cache
  • GitHub Check: Windows
  • GitHub Check: Linux
  • GitHub Check: Darwin
🔇 Additional comments (13)
client/internal/auth/oauth.go (2)

108-134: Device code flow implementation looks correct.

The error handling appropriately preserves user-friendly messages for NotFound and Unimplemented cases. The conditional SetLoginHint only when hint is non-empty is a good pattern.

Same consideration as PKCE flow regarding the deferred client closure applies here, but DeviceAuthorizationFlow maintains its own HTTPClient for token operations.


89-104: Code is safe—authClient closure does not affect the returned flow.

The PKCEAuthorizationFlow object returned from authClient.getPKCEFlow() is self-contained and requires no further access to the gRPC connection. All necessary configuration is extracted during the gRPC calls within getPKCEFlow(), and subsequent operations in WaitToken() and RequestAuthInfo() use only the embedded oAuthConfig (oauth2.Config) and local HTTP server for the OAuth flow. The deferred authClient.Close() is safe and will not affect the returned flow object.

client/internal/auth/auth.go (6)

38-75: LGTM: Constructor and Close are well-implemented.

The NewAuth constructor properly validates the private key, determines TLS from URL scheme, and establishes the management connection. The Close method is thread-safe with proper mutex usage and nil check.


81-117: SSO detection logic is correct.

The method properly tries PKCE first, falls back to device flow, and correctly interprets NotFound/Unimplemented as "not supported" rather than errors. Other error types are appropriately propagated.


360-392: Reconnection logic is well-designed.

The implementation correctly:

  1. Uses mutex to prevent concurrent reconnections
  2. Double-checks if another thread already reconnected (avoids redundant work)
  3. Creates new connection before closing old one (prevents nil client state)
  4. Gracefully handles close errors with logging

This is a solid pattern for safe client replacement under concurrent access.


412-465: Retry logic with exponential backoff is correctly implemented.

Key observations:

  • Uses RLock appropriately to capture client reference without blocking other readers
  • Correctly returns original error after successful reconnection to trigger retry with new client
  • Auth errors are made permanent via backoff.Permanent() to prevent futile retries
  • 2-minute max elapsed time is reasonable for interactive auth operations

395-408: Connection error classification includes codes.Internal.

Including codes.Internal in connection errors may cause reconnections for server-side errors that aren't connection-related. However, this conservative approach is likely intentional—when in doubt, reconnecting is safer than failing.


247-292: Good backward compatibility handling for device flow scope.

The fallback to "openid" scope when empty maintains compatibility with older management server versions. Both flow retrieval methods properly validate configuration before creating flow instances.

client/ios/NetBirdSDK/client.go (2)

266-281: Proper auth client lifecycle management.

The implementation correctly:

  • Creates the auth client with error handling
  • Defers Close() immediately after successful creation
  • Returns true (login required) as a fail-safe when client creation fails

323-341: Auth client properly isolated within goroutine.

The goroutine creates its own auth client instance and defers Close(), ensuring proper resource cleanup regardless of login success or failure.

Note: The goroutine uses the parent ctx which can be canceled via c.ctxCancel(). Ensure that WaitToken and authClient.Login handle context cancellation gracefully to avoid resource leaks.

client/server/server.go (2)

256-276: Login error handling correctly maps to appropriate status types.

The implementation properly distinguishes between:

  • Authentication errors (permission denied, invalid args) → StatusNeedsLogin
  • Other errors (connection, server issues) → StatusLoginFailed

Returning empty string for success status is unconventional but consistent with how the return value is checked at call sites (e.g., line 477: if _, err := s.loginAttempt(...); err == nil).


591-598: WaitToken implementations handle expiry internally—no indefinite hang risk.

The change removes the explicit timeout at the caller level, but both PKCEAuthorizationFlow.WaitToken and DeviceAuthorizationFlow.WaitToken create their own timeout contexts based on info.ExpiresIn. Both implementations include comments stating "Create timeout context based on flow expiration" and enforce the timeout at the function level via context.WithTimeout(ctx, timeout) where timeout := time.Duration(info.ExpiresIn) * time.Second. Passing a context.WithCancel context is safe because WaitToken will apply its own timeout to the child context it creates.

client/cmd/login.go (1)

278-309: Foreground login flow is correctly structured.

The two-phase approach is sound:

  1. First Login call with empty credentials determines if login is needed
  2. If auth error (needs login), obtains JWT token interactively
  3. Second Login call with actual credentials performs the login

The error handling properly distinguishes between auth errors (triggering SSO) and other errors (failure).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants