-
-
Notifications
You must be signed in to change notification settings - Fork 1k
[client] Consolidate authentication logic #5010
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- Moving auth functions from client/internal to client/internal/auth package - Creating unified auth.Auth client with NewAuth() constructor - Replacing direct auth function calls with auth client methods - Refactoring device flow and PKCE flow implementations - Updating iOS/Android/server code to use new auth client API
📝 WalkthroughWalkthroughIntroduces a centralized auth client (package Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Client\n(Android/iOS/CLI/Server)
participant Auth as Auth Client\n(auth.Auth)
participant MGM as Management\nServer (mgm)
participant OAuth as OAuth\nProvider
Client->>Auth: NewAuth(ctx, privateKey, mgmURL, config)
Auth->>MGM: Establish gRPC connection
MGM-->>Auth: Connection ready
Client->>Auth: IsSSOSupported(ctx)
Auth->>MGM: Request PKCE flow
alt PKCE available
MGM-->>Auth: PKCE config
Auth-->>Client: SSO supported (PKCE)
else try device
Auth->>MGM: Request Device flow
MGM-->>Auth: Device config / NotFound
Auth-->>Client: SSO supported? (device or not)
end
Client->>Auth: GetOAuthFlow(ctx, forceDevice)
Auth->>MGM: Fetch flow config
Auth-->>Client: OAuthFlow
Client->>OAuth: RequestAuthInfo / WaitToken
OAuth-->>Client: TokenInfo (JWT)
Client->>Auth: Login(ctx, setupKey, jwtToken)
Auth->>MGM: doMgmLogin(pubSSHKey)
alt Login success
MGM-->>Auth: LoginResponse
Auth-->>Client: success
else Registration needed
Auth->>MGM: registerPeer(setupKey, jwtToken)
MGM-->>Auth: Registration response
Auth-->>Client: success
end
Client->>Auth: Close()
Auth->>MGM: Disconnect
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
client/cmd/login.go (1)
324-334: Usingcontext.TODO()removes user cancellation capability.Both
RequestAuthInfoandWaitTokenusecontext.TODO(), which means the user cannot cancel the login flow via Ctrl+C. If the browser login never completes, the CLI will hang indefinitely.Consider using the passed
ctxparameter instead to preserve cancellation semantics:🔎 Proposed fix
- flowInfo, err := oAuthFlow.RequestAuthInfo(context.TODO()) + flowInfo, err := oAuthFlow.RequestAuthInfo(ctx) if err != nil { return nil, fmt.Errorf("getting a request OAuth flow info failed: %v", err) } openURL(cmd, flowInfo.VerificationURIComplete, flowInfo.UserCode, noBrowser) - tokenInfo, err := oAuthFlow.WaitToken(context.TODO(), flowInfo) + tokenInfo, err := oAuthFlow.WaitToken(ctx, flowInfo) if err != nil { return nil, fmt.Errorf("waiting for browser login failed: %v", err) }
🧹 Nitpick comments (8)
client/internal/auth/auth.go (2)
171-198: Unconventional return signature:(error, bool)instead of(bool, error).Go convention places error as the last return value. This reversed signature may confuse callers and lead to bugs where the values are swapped.
🔎 Proposed signature change
-func (a *Auth) Login(ctx context.Context, setupKey string, jwtToken string) (error, bool) { +func (a *Auth) Login(ctx context.Context, setupKey string, jwtToken string) (bool, error) { // ... implementation ... - return err, isAuthError + return isAuthError, err }This would require updating all call sites to use
isAuthError, err := authClient.Login(...)instead.
300-317: Consider extracting duplicate system info setup.The
SetFlagscall with identical parameters appears in bothdoMgmLoginandregisterPeer. Consider extracting this to a helper method to reduce duplication and ensure consistency.🔎 Suggested helper
func (a *Auth) configureSystemInfo(info *system.Info) { info.SetFlags( a.config.RosenpassEnabled, a.config.RosenpassPermissive, a.config.ServerSSHAllowed, a.config.DisableClientRoutes, a.config.DisableServerRoutes, a.config.DisableDNS, a.config.DisableFirewall, a.config.BlockLANAccess, a.config.BlockInbound, a.config.LazyConnectionEnabled, a.config.EnableSSHRoot, a.config.EnableSSHSFTP, a.config.EnableSSHLocalPortForwarding, a.config.EnableSSHRemotePortForwarding, a.config.DisableSSHAuth, ) }Also applies to: 337-354
client/internal/auth/device_flow.go (2)
135-138: SetLoginHint mutates internal state - consider thread safety.This setter modifies
providerConfig.LoginHintwithout synchronization. IfDeviceAuthorizationFlowcould be used concurrently, this could cause a data race. If single-threaded use is expected, this is fine.
250-260: Double timeout context - clarify the intended behavior.
WaitTokennow creates its own timeout context frominfo.ExpiresIn, ignoring any timeout already set on the passedctx. The doc comment mentions this, but callers might expect their context's deadline to take precedence.Looking at test code and iOS/Android callers, they also create their own timeout context before calling
WaitToken, resulting in nested timeouts. This works but is redundant.Consider either:
- Removing the internal timeout and letting callers control it (current pattern in tests/callers)
- Keeping internal timeout but documenting that caller timeouts are ignored
client/ios/NetBirdSDK/login.go (1)
281-284: Redundant timeout context.A timeout context is created here with
flowInfo.ExpiresIn, butWaitTokenalso creates its own internal timeout from the same value. The outer timeout is effectively unused.client/android/login.go (3)
119-122:isAuthErroris ignored - inconsistent with iOS handling.Android ignores the second return value (
isAuthError) while iOS uses it to provide differentiated error messages. Consider handling it consistently:🔎 Proposed fix for consistent error handling
- err, _ = authClient.Login(ctxWithValues, setupKey, "") + err, isAuthError := authClient.Login(ctxWithValues, setupKey, "") if err != nil { - return fmt.Errorf("login failed: %v", err) + if isAuthError { + return fmt.Errorf("authentication error: %v", err) + } + return fmt.Errorf("login failed: %v", err) }
161-164: Same issue:isAuthErrorignored here as well.Consider consistent error handling with iOS.
177-177: Avoidcontext.TODO()- use the available context.
context.TODO()is intended as a placeholder during refactoring. Usea.ctxfor consistency with the rest of the method.🔎 Proposed fix
- flowInfo, err := oAuthFlow.RequestAuthInfo(context.TODO()) + flowInfo, err := oAuthFlow.RequestAuthInfo(a.ctx)
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (15)
client/android/login.goclient/cmd/login.goclient/embed/embed.goclient/internal/auth/auth.goclient/internal/auth/device_flow.goclient/internal/auth/device_flow_test.goclient/internal/auth/oauth.goclient/internal/auth/pkce_flow.goclient/internal/auth/pkce_flow_test.goclient/internal/device_auth.goclient/internal/login.goclient/internal/pkce_auth.goclient/ios/NetBirdSDK/client.goclient/ios/NetBirdSDK/login.goclient/server/server.go
💤 Files with no reviewable changes (3)
- client/internal/login.go
- client/internal/device_auth.go
- client/internal/pkce_auth.go
🧰 Additional context used
🧬 Code graph analysis (10)
client/embed/embed.go (1)
client/internal/auth/auth.go (1)
NewAuth(38-65)
client/ios/NetBirdSDK/client.go (3)
client/android/login.go (1)
NewAuth(39-54)client/internal/auth/auth.go (1)
NewAuth(38-65)client/ios/NetBirdSDK/login.go (1)
NewAuth(44-59)
client/internal/auth/pkce_flow_test.go (1)
client/internal/auth/pkce_flow.go (1)
PKCEAuthProviderConfig(38-63)
client/cmd/login.go (3)
client/android/login.go (1)
NewAuth(39-54)client/internal/auth/auth.go (1)
NewAuth(38-65)client/ios/NetBirdSDK/login.go (1)
NewAuth(44-59)
client/server/server.go (2)
client/internal/auth/auth.go (1)
NewAuth(38-65)client/internal/state.go (2)
StatusLoginFailed(16-16)StatusType(8-8)
client/internal/auth/oauth.go (2)
client/internal/auth/auth.go (1)
NewAuth(38-65)client/internal/profilemanager/config.go (1)
Config(91-162)
client/internal/auth/auth.go (5)
shared/management/status/error.go (3)
Errorf(70-75)Error(54-57)FromError(78-87)client/internal/auth/oauth.go (1)
OAuthFlow(17-21)client/internal/auth/pkce_flow.go (3)
PKCEAuthorizationFlow(89-94)PKCEAuthProviderConfig(38-63)NewPKCEAuthorizationFlow(97-128)shared/management/client/common/types.go (1)
LoginFlag(11-11)client/internal/auth/device_flow.go (3)
DeviceAuthorizationFlow(75-78)DeviceAuthProviderConfig(29-49)NewDeviceAuthorizationFlow(103-128)
client/internal/auth/pkce_flow.go (2)
shared/management/client/common/types.go (1)
LoginFlag(11-11)client/internal/auth/oauth.go (2)
AuthFlowInfo(29-36)TokenInfo(44-52)
client/internal/auth/device_flow.go (3)
shared/management/status/error.go (1)
Errorf(70-75)client/internal/auth/oauth.go (3)
HTTPClient(24-26)AuthFlowInfo(29-36)TokenInfo(44-52)management/server/http/testing/testing_tools/tools.go (1)
ExpiresIn(43-43)
client/android/login.go (4)
client/internal/auth/auth.go (2)
NewAuth(38-65)Auth(26-33)client/ios/NetBirdSDK/login.go (2)
NewAuth(44-59)Auth(37-41)shared/management/status/error.go (1)
Errorf(70-75)client/internal/auth/oauth.go (1)
TokenInfo(44-52)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
- GitHub Check: Management / Unit (amd64, postgres)
- GitHub Check: Management / Unit (amd64, mysql)
- GitHub Check: Relay / Unit (386)
- GitHub Check: Client / Unit (386)
- GitHub Check: Management / Unit (amd64, sqlite)
- GitHub Check: Relay / Unit (amd64, -race)
- GitHub Check: Client (Docker) / Unit
- GitHub Check: Client / Unit (amd64)
- GitHub Check: Darwin
- GitHub Check: Client / Unit
- GitHub Check: Windows
- GitHub Check: Linux
- GitHub Check: release
- GitHub Check: FreeBSD Port / Build & Test
- GitHub Check: Client / Unit
- GitHub Check: release_ui_darwin
- GitHub Check: Client / Unit
- GitHub Check: Android / Build
- GitHub Check: JS / Lint
🔇 Additional comments (25)
client/internal/auth/pkce_flow.go (3)
37-85: LGTM on PKCEAuthProviderConfig and validation.The public type definition and validation function are well-structured. The validation covers all essential fields with clear error messages that help administrators troubleshoot configuration issues.
176-179: LGTM on SetLoginHint method.Simple setter that allows modifying the login hint after flow construction, enabling flexibility in the OAuth flow setup.
184-219: LGTM on the WaitToken timeout refactor.The internal timeout context based on
info.ExpiresInis a cleaner design - callers no longer need to manage the timeout externally. Usingcontext.Background()for the shutdown context (line 201) is correct since the server shutdown should complete regardless of whether the parent context was cancelled.client/internal/auth/pkce_flow_test.go (1)
52-62: LGTM on test update.The test correctly references the new
PKCEAuthProviderConfigtype that was moved to the current package scope, consistent with the public API refactor.client/internal/auth/auth.go (1)
401-406: Verify thatcodes.Internalshould trigger reconnection.Including
codes.Internalin connection errors may cause unnecessary reconnection attempts for server-side bugs that won't be fixed by reconnecting. This could mask actual issues.Consider whether
codes.Internalerrors should trigger reconnection or be treated as permanent failures that bubble up to the caller.client/embed/embed.go (1)
172-180: LGTM on auth client integration.The embedded client correctly creates an auth client, defers its cleanup, and performs login using the new unified auth API. Error handling wraps failures appropriately.
client/ios/NetBirdSDK/client.go (2)
266-281: LGTM on IsLoginRequired refactor.Correctly creates an auth client, handles creation errors by assuming login is required (safe fallback), and properly closes the client.
323-341: Goroutine uses parent context that may be cancelled.The goroutine at line 323 uses
ctxwhich is created at line 296 withc.ctxCancel. If the client is stopped (callingc.ctxCancel()), theWaitTokencall and subsequentLoginwill fail. Ensure this is the intended behavior - that stopping the client cancels in-progress logins.Verify that cancellation of the parent context during SSO login flow is handled gracefully on the Swift side.
client/server/server.go (2)
285-305: LGTM on loginAttempt refactor.The function correctly integrates the new auth client, properly handles error/auth-error distinction, and closes the client on exit.
620-621: LGTM on context change.The switch from
context.WithTimeouttocontext.WithCancelis correct sinceWaitTokennow manages its own timeout internally based onflowInfo.ExpiresIn. The cancel function is still stored for explicit cancellation if needed.client/internal/auth/oauth.go (2)
88-105: LGTM on PKCE flow initialization.The function correctly creates an auth client, retrieves the PKCE flow configuration, conditionally sets the login hint, and ensures cleanup via defer.
108-135: LGTM on Device Code flow initialization.Good error handling with specific messages for NotFound and Unimplemented cases, guiding users toward setup keys when SSO is unavailable.
client/cmd/login.go (1)
277-306: LGTM on foregroundLogin refactor.The function correctly uses the auth client for login checks and token-based authentication. The two-phase approach (check if login needed, then login with credentials) is clear.
client/internal/auth/device_flow_test.go (2)
116-127: LGTM! Clean transition to constructor-based initialization.The test now properly uses the public
NewDeviceAuthorizationFlowconstructor withDeviceAuthProviderConfig, and assigns the mock HTTP client post-construction. This pattern is appropriate for testing and aligns with the refactored public API.
282-293: Consistent with the first test function.Same constructor-based pattern applied correctly here.
client/internal/auth/device_flow.go (4)
28-49: Well-structured configuration type with clear documentation.The exported
DeviceAuthProviderConfigstruct is well-documented with comments explaining each field. The deprecation notice onDomainis helpful.
103-128: Constructor looks good but missing validation call.The constructor properly initializes HTTP transport with certificate handling and reasonable defaults. As noted above, add the validation call at the start.
258-260: Good practice: defer ticker.Stop() added.Properly stopping the ticker prevents resource leaks.
51-71: No action needed.validateDeviceAuthConfigis already called in thegetDeviceFlow()method (lines 280-282 inclient/internal/auth/auth.go) beforeNewDeviceAuthorizationFlowis instantiated. Invalid configurations are validated at the entry point and rejected before the constructor runs—this is the correct design pattern.client/ios/NetBirdSDK/login.go (3)
88-92: Good: Proper auth client lifecycle management.Creating the auth client and deferring
Close()ensures resources are properly released. This pattern is correctly applied throughout the file.
164-171: Correct error classification handling.The
isAuthErrorflag is properly used to distinguish authentication errors (like PermissionDenied) from other failures, providing meaningful error messages.
264-268: Clean refactor: foregroundGetTokenInfo now accepts auth client.The method signature change to accept
*auth.Authinstead of creating its own is appropriate since the caller already has the auth client instance.client/android/login.go (3)
79-83: Good: Consistent auth client lifecycle.Same proper pattern as iOS with
defer authClient.Close().
166-167:OnLoginSuccesscalled asynchronously - differs from iOS.Android uses
go urlOpener.OnLoginSuccess()while iOS calls it synchronously with a comment explaining why. If synchronous behavior is intentional on iOS for persistence guarantees, consider whether Android should match that pattern.
171-175: Good: Method signature updated to accept auth client.Consistent with iOS changes, avoiding redundant auth client creation.
- Remove unnecessary internal package reference in PKCE flow test - Adjust context assignment placement in iOS and Android login methods
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
client/ios/NetBirdSDK/login.go (1)
126-134: Context with device name not passed to auth client creation.This is the same issue flagged in previous reviews:
ctxWithValuesis created with the device name on line 133, butauth.NewAuthon line 126 usesa.ctxinstead. If the management client or auth client needs the device name during initialization, it won't be available.🔎 Proposed fix
+ //nolint + ctxWithValues := context.WithValue(a.ctx, system.DeviceNameCtxKey, deviceName) - authClient, err := auth.NewAuth(a.ctx, a.config.PrivateKey, a.config.ManagementURL, a.config) + authClient, err := auth.NewAuth(ctxWithValues, a.config.PrivateKey, a.config.ManagementURL, a.config) if err != nil { return fmt.Errorf("failed to create auth client: %v", err) } defer authClient.Close() - - //nolint - ctxWithValues := context.WithValue(a.ctx, system.DeviceNameCtxKey, deviceName) err, _ = authClient.Login(ctxWithValues, setupKey, "")client/android/login.go (1)
111-119: Context with device name not passed to auth client creation.Same issue as in the iOS file and flagged in previous reviews:
ctxWithValueswith the device name is created on line 118, butauth.NewAuthon line 111 usesa.ctx. The device name won't be available during auth client initialization.🔎 Proposed fix
+ //nolint + ctxWithValues := context.WithValue(a.ctx, system.DeviceNameCtxKey, deviceName) - authClient, err := auth.NewAuth(a.ctx, a.config.PrivateKey, a.config.ManagementURL, a.config) + authClient, err := auth.NewAuth(ctxWithValues, a.config.PrivateKey, a.config.ManagementURL, a.config) if err != nil { return fmt.Errorf("failed to create auth client: %v", err) } defer authClient.Close() - - //nolint - ctxWithValues := context.WithValue(a.ctx, system.DeviceNameCtxKey, deviceName) err, _ = authClient.Login(ctxWithValues, setupKey, "")
🧹 Nitpick comments (3)
client/ios/NetBirdSDK/login.go (1)
134-134: Consider handling theisAuthErrorreturn value.Unlike
LoginSync(line 166) andlogin(line 236) which check theisAuthErrorboolean to distinguish authentication errors from other failures, this line ignores it with_. For consistency and better error diagnostics, consider checking this value.🔎 Proposed refactor
- err, _ = authClient.Login(ctxWithValues, setupKey, "") + err, isAuthError := authClient.Login(ctxWithValues, setupKey, "") if err != nil { + if isAuthError { + return fmt.Errorf("authentication error: %v", err) + } return fmt.Errorf("login failed: %v", err) }client/android/login.go (2)
119-119: Consider handling theisAuthErrorreturn value.The second return value from
authClient.Loginis ignored. While setup key authentication may not need the same error differentiation as OAuth flows, handling this value would provide better error diagnostics and consistency with potential future changes to the Login method.🔎 Proposed refactor
- err, _ = authClient.Login(ctxWithValues, setupKey, "") + err, isAuthError := authClient.Login(ctxWithValues, setupKey, "") if err != nil { + if isAuthError { + return fmt.Errorf("authentication error: %v", err) + } return fmt.Errorf("login failed: %v", err) }
177-177: Consider using a timeout context instead ofcontext.TODO().
RequestAuthInfousescontext.TODO()whileWaitToken(line 184) usesa.ctx. The iOS implementation uses a bounded timeout context for this call (lines 271-272 in iOS file). Usingcontext.TODO()means this request has no cancellation or timeout, which could lead to indefinite hangs.🔎 Proposed refactor
+ // Use a bounded timeout for the auth info request to prevent indefinite hangs + authInfoCtx, authInfoCancel := context.WithTimeout(a.ctx, 30*time.Second) + defer authInfoCancel() + - flowInfo, err := oAuthFlow.RequestAuthInfo(context.TODO()) + flowInfo, err := oAuthFlow.RequestAuthInfo(authInfoCtx)Note: You'll need to import
timeif not already imported.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
client/android/login.goclient/internal/auth/pkce_flow_windows_test.goclient/ios/NetBirdSDK/login.go
🧰 Additional context used
🧬 Code graph analysis (2)
client/internal/auth/pkce_flow_windows_test.go (1)
client/internal/auth/pkce_flow.go (1)
PKCEAuthProviderConfig(38-63)
client/android/login.go (4)
client/internal/auth/auth.go (2)
NewAuth(38-65)Auth(26-33)shared/management/status/error.go (1)
Errorf(70-75)client/system/info.go (1)
DeviceNameCtxKey(16-16)client/internal/auth/oauth.go (1)
TokenInfo(44-52)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
- GitHub Check: Relay / Unit (386)
- GitHub Check: Client / Unit (386)
- GitHub Check: Management / Unit (amd64, postgres)
- GitHub Check: Relay / Unit (amd64, -race)
- GitHub Check: Client / Unit (amd64)
- GitHub Check: Management / Unit (amd64, sqlite)
- GitHub Check: Management / Unit (amd64, mysql)
- GitHub Check: Signal / Unit (amd64)
- GitHub Check: Client (Docker) / Unit
- GitHub Check: Linux
- GitHub Check: Windows
- GitHub Check: Darwin
- GitHub Check: Android / Build
- GitHub Check: iOS / Build
- GitHub Check: release_ui
- GitHub Check: release
- GitHub Check: release_ui_darwin
- GitHub Check: FreeBSD Port / Build & Test
- GitHub Check: Client / Unit
- GitHub Check: Client / Unit
- GitHub Check: JS / Lint
🔇 Additional comments (7)
client/internal/auth/pkce_flow_windows_test.go (1)
96-107: LGTM! Type reference correctly updated and all references migrated.The change from
internal.PKCEAuthProviderConfigtoPKCEAuthProviderConfigis correct. Since the test is in theauthpackage, the direct reference to the now-exported type is appropriate. All test configuration fields (ClientID, Audience, TokenEndpoint, Scope, AuthorizationEndpoint, RedirectURLs, UseIDToken) match the struct definition. Verification confirms there are no remaining references tointernal.PKCEAuthProviderConfigin the codebase, so the migration is complete.client/ios/NetBirdSDK/login.go (4)
87-107: LGTM!The auth client is properly created, cleaned up with defer, and error handling is appropriate for the SSO support check flow.
146-174: LGTM!The synchronous login flow correctly creates the auth client, checks login requirements, and handles authentication errors appropriately.
205-217: Correct context handling pattern.This function demonstrates the proper pattern: the enriched context with device name is created first (lines 207-211), then passed to
auth.NewAuth(line 213), ensuring the device name is available throughout the auth client's lifetime.
264-290: LGTM!The function signature is correctly updated to accept the auth client, and the OAuth flow with bounded timeout for auth info request is well-implemented.
client/android/login.go (2)
78-96: LGTM!The auth client is properly created and cleaned up, and the SSO support check flow is correctly implemented.
139-169: Auth client integration looks good.The auth client is properly created, cleaned up, and the login flow is correctly refactored. Note that unlike the iOS version, this doesn't handle device name context in the interactive login flow.
| // Login attempts to log in or register the client with the management server | ||
| // Returns error and a boolean indicating if it's an authentication error (permission denied) that should stop retries. | ||
| // Automatically retries with backoff and reconnection on connection errors. | ||
| func (a *Auth) Login(ctx context.Context, setupKey string, jwtToken string) (error, bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The convention is to return error last, let's stick to that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The goal is to fully remove the second boolean return parameter. It is almost unused and generally ignored. However, this change opens up a new batch of changes that I will address in the next step. This will be the final state:
netbird/client/internal/auth/auth.go
Line 112 in 4b3e1f1
| func (a *Auth) Login(ctx context.Context, setupKey string, jwtToken string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In @client/internal/auth/auth.go:
- Around line 300-316: Duplicate SetFlags call appears in doMgmLogin and
registerPeer; add a helper method on Auth named setSystemInfoFlags(info
*system.Info) that calls info.SetFlags(...) with the same 15 a.config.* booleans
(RosenpassEnabled, RosenpassPermissive, ServerSSHAllowed, DisableClientRoutes,
DisableServerRoutes, DisableDNS, DisableFirewall, BlockLANAccess, BlockInbound,
LazyConnectionEnabled, EnableSSHRoot, EnableSSHSFTP,
EnableSSHLocalPortForwarding, EnableSSHRemotePortForwarding, DisableSSHAuth),
then replace the inline SetFlags calls in doMgmLogin and registerPeer with
a.setSystemInfoFlags(sysInfo) to remove duplication.
- Around line 26-33: Data race: the Auth.client field is read without
synchronization in getPKCEFlow, getDeviceFlow, doMgmLogin, and registerPeer
while reconnect modifies it under mutex, causing concurrent access races; fix by
protecting all accesses to a.client with the Auth.mutex—either add a getClient()
helper that locks, returns the client (or nil) and is used everywhere those
methods need the client, or wrap explicit mutex.Lock()/Unlock() around reads in
getPKCEFlow, getDeviceFlow, doMgmLogin, and registerPeer; ensure callers check
for nil after obtaining the client and keep reconnect's client mutation under
the existing lock.
- Around line 391-405: isConnectionError currently treats codes.Internal as a
transient connection error; remove codes.Internal from the set that triggers
reconnection in the isConnectionError function so only codes.Unavailable,
codes.DeadlineExceeded, and codes.Canceled are considered connection errors.
Update the boolean return expression in isConnectionError to exclude
codes.Internal (and optionally adjust the function comment to reflect the
narrowed set of connection error codes), mirroring the classification used in
shared/management/client/grpc.go.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
client/internal/auth/auth.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
- GitHub Check: Client / Unit (amd64)
- GitHub Check: Client / Unit (386)
- GitHub Check: Management / Unit (amd64, mysql)
- GitHub Check: Management / Unit (amd64, sqlite)
- GitHub Check: Management / Unit (amd64, postgres)
- GitHub Check: Relay / Unit (amd64, -race)
- GitHub Check: Relay / Unit (386)
- GitHub Check: Client (Docker) / Unit
- GitHub Check: Client / Unit
- GitHub Check: Client / Unit
- GitHub Check: Client / Unit
- GitHub Check: JS / Lint
- GitHub Check: Windows
- GitHub Check: Darwin
- GitHub Check: Linux
- GitHub Check: FreeBSD Port / Build & Test
- GitHub Check: release
- GitHub Check: iOS / Build
- GitHub Check: release_ui_darwin
- GitHub Check: Android / Build
| // isConnectionError checks if the error is a connection-related error that should trigger reconnection | ||
| func isConnectionError(err error) bool { | ||
| if err == nil { | ||
| return false | ||
| } | ||
| s, ok := status.FromError(err) | ||
| if !ok { | ||
| return false | ||
| } | ||
| // These error codes indicate connection issues | ||
| return s.Code() == codes.Unavailable || | ||
| s.Code() == codes.DeadlineExceeded || | ||
| s.Code() == codes.Canceled || | ||
| s.Code() == codes.Internal | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
# First, let's locate and examine the file
find . -type f -name "auth.go" | grep -E "client/internal/auth"Repository: netbirdio/netbird
Length of output: 91
🏁 Script executed:
# Let's check the full context around the isConnectionError function
cd client/internal/auth && wc -l auth.goRepository: netbirdio/netbird
Length of output: 72
🏁 Script executed:
# Read the function and its surrounding context
sed -n '385,415p' client/internal/auth/auth.goRepository: netbirdio/netbird
Length of output: 1048
🏁 Script executed:
# Find all usages of isConnectionError
rg "isConnectionError" --type go -B 2 -A 2Repository: netbirdio/netbird
Length of output: 789
🏁 Script executed:
# Search for backoff or retry related code
rg "MaxElapsedTime|backoff|retry" --type go -A 2Repository: netbirdio/netbird
Length of output: 42500
🏁 Script executed:
# Search for any comments or documentation about codes.Internal or error handling
rg "codes\.Internal|Internal error" --type go -B 2 -A 2Repository: netbirdio/netbird
Length of output: 11850
🏁 Script executed:
# Check where codes.Internal is actually returned from the server
rg "codes\.Internal" --type go management/internals/shared/grpc/server.go | head -20Repository: netbirdio/netbird
Length of output: 1565
🏁 Script executed:
# Let's see if there are tests for isConnectionError
rg "isConnectionError" --type go -B 5 -A 5Repository: netbirdio/netbird
Length of output: 1427
🏁 Script executed:
# Check the full withRetry implementation to understand the error handling
sed -n '412,460p' client/internal/auth/auth.goRepository: netbirdio/netbird
Length of output: 1467
🏁 Script executed:
# Look at error handling patterns in similar code (shared/management/client/grpc.go)
rg "codes\.(Unavailable|DeadlineExceeded|Canceled|Internal)" shared/management/client/grpc.go -B 2 -A 2Repository: netbirdio/netbird
Length of output: 464
Remove codes.Internal from the list of connection errors that trigger reconnection.
The isConnectionError function incorrectly treats codes.Internal as a connection issue. However, the management server uses this code for various operational failures (encryption errors, message processing failures, settings errors), not transient connection problems. Reconnecting when the server responds with codes.Internal won't resolve these errors. Limit reconnection triggers to codes that definitively indicate connection issues: Unavailable, DeadlineExceeded, and Canceled. See how shared/management/client/grpc.go handles error classification for reference.
🤖 Prompt for AI Agents
In @client/internal/auth/auth.go around lines 391 - 405, isConnectionError
currently treats codes.Internal as a transient connection error; remove
codes.Internal from the set that triggers reconnection in the isConnectionError
function so only codes.Unavailable, codes.DeadlineExceeded, and codes.Canceled
are considered connection errors. Update the boolean return expression in
isConnectionError to exclude codes.Internal (and optionally adjust the function
comment to reflect the narrowed set of connection error codes), mirroring the
classification used in shared/management/client/grpc.go.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
client/cmd/login.go (1)
326-336: Use providedctxinstead ofcontext.TODO().Both
RequestAuthInfoandWaitTokenusecontext.TODO()when the function receives a validctxparameter. This prevents proper cancellation handling (e.g., Ctrl+C won't interrupt these operations).🐛 Proposed fix
- flowInfo, err := oAuthFlow.RequestAuthInfo(context.TODO()) + flowInfo, err := oAuthFlow.RequestAuthInfo(ctx) if err != nil { return nil, fmt.Errorf("getting a request OAuth flow info failed: %v", err) } openURL(cmd, flowInfo.VerificationURIComplete, flowInfo.UserCode, noBrowser) - tokenInfo, err := oAuthFlow.WaitToken(context.TODO(), flowInfo) + tokenInfo, err := oAuthFlow.WaitToken(ctx, flowInfo)
🧹 Nitpick comments (2)
client/internal/auth/auth.go (2)
173-200: Unconventional return order:(error, bool).The method signature
Login(...) (error, bool)deviates from Go's conventional pattern of(result, error). While this works and is used consistently by callers, it may confuse developers expecting standard Go idioms.Consider either:
- Reordering to
(bool, error)- Using a custom result type
- Documenting clearly why this order was chosen
♻️ Suggested signature change
-func (a *Auth) Login(ctx context.Context, setupKey string, jwtToken string) (error, bool) { +func (a *Auth) Login(ctx context.Context, setupKey string, jwtToken string) (isAuthError bool, err error) {This uses named return values to clarify semantics while keeping the unconventional order, or consider:
-func (a *Auth) Login(ctx context.Context, setupKey string, jwtToken string) (error, bool) { +func (a *Auth) Login(ctx context.Context, setupKey string, jwtToken string) (bool, error) {
295-306: Context should be the first parameter per Go conventions.The
ctxparameter is conventionally the first parameter in Go functions. Here it appears as the second parameter afterclient.♻️ Suggested parameter reordering
-func (a *Auth) doMgmLogin(client *mgm.GrpcClient, ctx context.Context, pubSSHKey []byte) (*wgtypes.Key, *mgmProto.LoginResponse, error) { +func (a *Auth) doMgmLogin(ctx context.Context, client *mgm.GrpcClient, pubSSHKey []byte) (*wgtypes.Key, *mgmProto.LoginResponse, error) {Same applies to
registerPeerat line 310.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
client/cmd/login.goclient/internal/auth/auth.goclient/internal/auth/oauth.goclient/ios/NetBirdSDK/client.goclient/server/server.go
🧰 Additional context used
🧬 Code graph analysis (5)
client/cmd/login.go (1)
client/internal/auth/auth.go (1)
NewAuth(38-64)
client/internal/auth/oauth.go (1)
client/internal/auth/auth.go (1)
NewAuth(38-64)
client/server/server.go (2)
client/internal/auth/auth.go (1)
NewAuth(38-64)client/internal/state.go (2)
StatusLoginFailed(16-16)StatusType(8-8)
client/internal/auth/auth.go (5)
shared/management/status/error.go (2)
Errorf(70-75)FromError(78-87)client/internal/auth/oauth.go (1)
OAuthFlow(17-21)client/internal/auth/pkce_flow.go (3)
PKCEAuthorizationFlow(89-94)PKCEAuthProviderConfig(38-63)NewPKCEAuthorizationFlow(97-128)client/internal/auth/device_flow.go (3)
DeviceAuthorizationFlow(75-78)DeviceAuthProviderConfig(29-49)NewDeviceAuthorizationFlow(103-128)client/system/info.go (1)
Info(45-81)
client/ios/NetBirdSDK/client.go (1)
client/internal/auth/auth.go (1)
NewAuth(38-64)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
- GitHub Check: FreeBSD Port / Build & Test
- GitHub Check: release_ui
- GitHub Check: release
- GitHub Check: Client / Unit
- GitHub Check: Client / Unit
- GitHub Check: iOS / Build
- GitHub Check: Client / Unit
- GitHub Check: Android / Build
- GitHub Check: JS / Lint
- GitHub Check: Build Cache
- GitHub Check: Windows
- GitHub Check: Linux
- GitHub Check: Darwin
🔇 Additional comments (13)
client/internal/auth/oauth.go (2)
108-134: Device code flow implementation looks correct.The error handling appropriately preserves user-friendly messages for
NotFoundandUnimplementedcases. The conditionalSetLoginHintonly when hint is non-empty is a good pattern.Same consideration as PKCE flow regarding the deferred client closure applies here, but
DeviceAuthorizationFlowmaintains its ownHTTPClientfor token operations.
89-104: Code is safe—authClient closure does not affect the returned flow.The
PKCEAuthorizationFlowobject returned fromauthClient.getPKCEFlow()is self-contained and requires no further access to the gRPC connection. All necessary configuration is extracted during the gRPC calls withingetPKCEFlow(), and subsequent operations inWaitToken()andRequestAuthInfo()use only the embeddedoAuthConfig(oauth2.Config) and local HTTP server for the OAuth flow. The deferredauthClient.Close()is safe and will not affect the returned flow object.client/internal/auth/auth.go (6)
38-75: LGTM: Constructor and Close are well-implemented.The
NewAuthconstructor properly validates the private key, determines TLS from URL scheme, and establishes the management connection. TheClosemethod is thread-safe with proper mutex usage and nil check.
81-117: SSO detection logic is correct.The method properly tries PKCE first, falls back to device flow, and correctly interprets
NotFound/Unimplementedas "not supported" rather than errors. Other error types are appropriately propagated.
360-392: Reconnection logic is well-designed.The implementation correctly:
- Uses mutex to prevent concurrent reconnections
- Double-checks if another thread already reconnected (avoids redundant work)
- Creates new connection before closing old one (prevents nil client state)
- Gracefully handles close errors with logging
This is a solid pattern for safe client replacement under concurrent access.
412-465: Retry logic with exponential backoff is correctly implemented.Key observations:
- Uses
RLockappropriately to capture client reference without blocking other readers- Correctly returns original error after successful reconnection to trigger retry with new client
- Auth errors are made permanent via
backoff.Permanent()to prevent futile retries- 2-minute max elapsed time is reasonable for interactive auth operations
395-408: Connection error classification includescodes.Internal.Including
codes.Internalin connection errors may cause reconnections for server-side errors that aren't connection-related. However, this conservative approach is likely intentional—when in doubt, reconnecting is safer than failing.
247-292: Good backward compatibility handling for device flow scope.The fallback to
"openid"scope when empty maintains compatibility with older management server versions. Both flow retrieval methods properly validate configuration before creating flow instances.client/ios/NetBirdSDK/client.go (2)
266-281: Proper auth client lifecycle management.The implementation correctly:
- Creates the auth client with error handling
- Defers
Close()immediately after successful creation- Returns
true(login required) as a fail-safe when client creation fails
323-341: Auth client properly isolated within goroutine.The goroutine creates its own auth client instance and defers
Close(), ensuring proper resource cleanup regardless of login success or failure.Note: The goroutine uses the parent
ctxwhich can be canceled viac.ctxCancel(). Ensure thatWaitTokenandauthClient.Loginhandle context cancellation gracefully to avoid resource leaks.client/server/server.go (2)
256-276: Login error handling correctly maps to appropriate status types.The implementation properly distinguishes between:
- Authentication errors (permission denied, invalid args) →
StatusNeedsLogin- Other errors (connection, server issues) →
StatusLoginFailedReturning empty string for success status is unconventional but consistent with how the return value is checked at call sites (e.g., line 477:
if _, err := s.loginAttempt(...); err == nil).
591-598: WaitToken implementations handle expiry internally—no indefinite hang risk.The change removes the explicit timeout at the caller level, but both
PKCEAuthorizationFlow.WaitTokenandDeviceAuthorizationFlow.WaitTokencreate their own timeout contexts based oninfo.ExpiresIn. Both implementations include comments stating "Create timeout context based on flow expiration" and enforce the timeout at the function level viacontext.WithTimeout(ctx, timeout)wheretimeout := time.Duration(info.ExpiresIn) * time.Second. Passing acontext.WithCancelcontext is safe becauseWaitTokenwill apply its own timeout to the child context it creates.client/cmd/login.go (1)
278-309: Foreground login flow is correctly structured.The two-phase approach is sound:
- First
Logincall with empty credentials determines if login is needed- If auth error (needs login), obtains JWT token interactively
- Second
Logincall with actual credentials performs the loginThe error handling properly distinguishes between auth errors (triggering SSO) and other errors (failure).



Describe your changes
Issue ticket number and link
Stack
Checklist
Documentation
Select exactly one:
Docs PR URL (required if "docs added" is checked)
Paste the PR link from https://github.com/netbirdio/docs here:
https://github.com/netbirdio/docs/pull/__
Summary by CodeRabbit
New Features
Bug Fixes
Refactor
✏️ Tip: You can customize this high-level summary in your review settings.