-
Notifications
You must be signed in to change notification settings - Fork 1
Implement token authentication #118
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Add support for token-based authentication with both static tokens and dynamic token handlers. Authentication tokens are included in the CONNECT protocol message and authentication failures are properly detected and reported. Features: - ConnectionOptions.token for static token authentication - ConnectionOptions.token_handler for dynamic token generation - Enhanced error detection for authentication failures in -ERR responses - Test infrastructure with token-authenticated NATS server
WalkthroughAdds token-based authentication support to the client (ConnectionOptions.token / token_handler), includes resolved auth_token in the CONNECT handshake, and maps server Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
tests/utils.zig (1)
84-86: Wait for all 4 services to be healthy to avoid auth test flakiness.We now run 4 services (including nats-token-auth). Waiting for only 3 can race the auth tests. Bump the threshold.
Apply:
- if (healthy_count >= 3) { + if (healthy_count >= 4) { return; }src/connection.zig (1)
1314-1339: Broaden auth error detection to be case-insensitive and include common variants.Current checks miss messages like “Authentication Timeout” (capital A). Make matching robust.
Apply:
- if (std.mem.containsAtLeast(u8, err_msg, 1, "Authorization Violation") or - std.mem.containsAtLeast(u8, err_msg, 1, "authorization violation") or - std.mem.containsAtLeast(u8, err_msg, 1, "authentication")) + if (std.mem.containsAtLeast(u8, err_msg, 1, "Authorization Violation") or + std.mem.containsAtLeast(u8, err_msg, 1, "authorization violation") or + std.mem.containsAtLeast(u8, err_msg, 1, "Authorization Timeout") or + std.mem.containsAtLeast(u8, err_msg, 1, "authentication") or + std.mem.containsAtLeast(u8, err_msg, 1, "Authentication")) { self.handshake_error = ConnectionError.AuthFailed; } else { self.handshake_error = ConnectionError.ConnectionFailed; }
🧹 Nitpick comments (4)
docker-compose.test.yml (1)
81-100: Service definition for token-auth is good; consider pinning image to a patch tag.
nats:2.10-alpinecan drift under you. For reproducible CI, pin to a known-good patch (e.g.,2.10.x-alpine) that you already validate locally.src/connection.zig (2)
197-200: Public API: token and token_handlerAPI shape is reasonable. Two small improvements to consider:
- Accept an error-able token handler:
?*const fn () anyerror![]const u8to allow retrieval failures.- If you expect handlers needing state, add an optional
token_handler_ctx: ?*anyopaqueand a signaturefn (*anyopaque) []const u8.
1208-1213: Avoid emitting "auth_token": null — set emit_null_optional_fields = falseConfirmed: std.json.StringifyOptions exposes emit_null_optional_fields (bool).
File: src/connection.zig lines 1208-1213 (also apply at 1224-1235)
Apply:
- try std.json.stringify(connect_obj, .{}, buffer.writer()); + try std.json.stringify(connect_obj, .{ .emit_null_optional_fields = false }, buffer.writer());tests/auth_test.zig (1)
64-83: Tighten expected error once auth classification is broadened.After making auth detection case-insensitive, this should deterministically be
AuthFailed.Apply:
- // Accept either AuthFailed or ConnectionFailed for now - try std.testing.expect(err == nats.ConnectionError.AuthFailed or err == nats.ConnectionError.ConnectionFailed); + try std.testing.expect(err == nats.ConnectionError.AuthFailed);
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
docker-compose.test.yml(1 hunks)src/connection.zig(7 hunks)tests/all_tests.zig(1 hunks)tests/auth_test.zig(1 hunks)tests/utils.zig(1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-08-25T14:51:33.570Z
Learnt from: lalinsky
PR: lalinsky/nats.zig#31
File: tests/jetstream_pull_test.zig:21-23
Timestamp: 2025-08-25T14:51:33.570Z
Learning: The nats.zig project uses a "tests:beforeEach" test in tests/all_tests.zig that runs before each test to clean up all JetStream streams and consumers, ensuring test isolation. This eliminates the need for unique test resource names as each test starts with a clean JetStream state.
Applied to files:
tests/all_tests.zigtests/auth_test.zig
📚 Learning: 2025-08-30T18:48:28.333Z
Learnt from: lalinsky
PR: lalinsky/nats.zig#66
File: src/connection.zig:317-323
Timestamp: 2025-08-30T18:48:28.333Z
Learning: In src/connection.zig, the establishConnection() call in connectToServer() is during initial connection setup where no race conditions exist - the connection starts disconnected and other threads would see the connecting status and wait appropriately.
Applied to files:
src/connection.zig
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test
🔇 Additional comments (7)
tests/utils.zig (1)
10-11: Enum addition looks correct and consistent with docker-compose.
token_auth = 14225matches the new service port, and bumpingunknownavoids collision. No issues.src/connection.zig (1)
378-409: Good: automatic cleanup on initial connect failures.
errdefer self.close()plus waiting for handshake under the mutex is solid and matches the prior “no races during initial connect” guidance.tests/all_tests.zig (2)
8-8: Adding auth tests to the suite: good.In combination with bumping the healthy-count to 4, this should be stable.
40-41: Minor: align health wait with 4 services.Once utils waits for 4 “healthy”, this stays as-is; otherwise, consider increasing the timeout slightly if CI is slow.
tests/auth_test.zig (3)
7-20: Static token success path looks correct.Covers the handshake and basic I/O; good smoke test.
42-62: Precedence test reads well.Validates handler > static token, which matches the client behavior.
86-99: No-auth against auth serverThis is the right negative test and will remain stable with the CONNECT changes above.
Implement detailed error categorization for all NATS server -ERR messages matching the nats.go client approach. This allows client code to handle specific protocol errors instead of generic connection failures. Features: - ProtocolError enum with 15+ specific error types for authentication, limits, protocol violations, and connection issues - parseProtocolError() function to map -ERR messages to specific errors - Enhanced processErr() to propagate specific errors to client code - Updated tests to expect AuthorizationViolation instead of generic AuthFailed - Export ProtocolError for client access Client code can now distinguish between different failure types: - AuthorizationViolation vs AuthExpired vs AuthRevoked - MaxConnectionsExceeded vs SecureConnectionRequired - PermissionViolation vs MaxPayloadViolation - UnknownServerError fallback for unrecognized errors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (5)
src/root.zig (1)
21-21: Add a brief doc comment and changelog entry for new public APIDocument ProtocolError at the root to help users discover it and note the addition in the release notes.
src/connection.zig (4)
121-148: Cover a few common NATS -ERR variants as distinct ProtocolError casesConsider adding specific variants for messages that appear frequently in the wild to improve diagnostics and branching:
- authorization timeout
- permissions violation for publish/subscription (more precise than a generic permissions violation)
- parser error
Apply:
pub const ProtocolError = error{ // Authentication/Authorization errors AuthorizationViolation, // "Authorization Violation" + AuthorizationTimeout, // "Authorization Timeout" AuthExpired, // "User Authentication Expired" AuthRevoked, // "User Authentication Revoked" AccountAuthExpired, // "Account Authentication Expired" - PermissionViolation, // "Permissions Violation" + PermissionViolation, // generic fallback + PermissionViolationPublish, // "Permissions Violation for Publish" + PermissionViolationSubscription, // "Permissions Violation for Subscription" // Connection/Limit errors MaxConnectionsExceeded, // "maximum connections exceeded" ConnectionThrottling, // "Connection throttling is active" MaxPayloadViolation, // "Maximum Payload Violation" MaxSubscriptionsExceeded, // "maximum subscriptions exceeded" // Protocol errors SecureConnectionRequired, // "Secure Connection - TLS Required" InvalidClientProtocol, // "invalid client protocol" + ParserError, // "Parser Error" UnknownProtocolOperation, // "Unknown Protocol Operation" InvalidPublishSubject, // "Invalid Publish Subject" NoRespondersRequiresHeaders, // "no responders requires headers support" // Account errors FailedAccountRegistration, // "Failed Account Registration" // Generic fallback UnknownServerError, // For unrecognized -ERR messages };And extend the matcher:
- } else if (std.mem.containsAtLeast(u8, lower_err, 1, "permissions violation")) { + } else if (std.mem.containsAtLeast(u8, lower_err, 1, "permissions violation for publish")) { + return ProtocolError.PermissionViolationPublish; + } else if (std.mem.containsAtLeast(u8, lower_err, 1, "permissions violation for subscription")) { + return ProtocolError.PermissionViolationSubscription; + } else if (std.mem.containsAtLeast(u8, lower_err, 1, "permissions violation")) { return ProtocolError.PermissionViolation; } + // ... + else if (std.mem.containsAtLeast(u8, lower_err, 1, "authorization timeout")) { + return ProtocolError.AuthorizationTimeout; + } else if (std.mem.containsAtLeast(u8, lower_err, 1, "parser error")) { + return ProtocolError.ParserError; + }
226-229: Token handler API lacks error and allocation semanticsThe zero-arg, infallible token_handler returning a borrowed slice constrains real-world use (e.g., fetching/refreshing tokens, handling failures) and is ambiguous about ownership. Recommend making it fallible and allocator-aware.
Apply:
pub const ConnectionOptions = struct { // Authentication token: ?[]const u8 = null, - token_handler: ?*const fn () []const u8 = null, + token_handler: ?*const fn (allocator: Allocator) ![]const u8 = null, };Then call with allocator and free after use (see suggestion on Lines 1237-1242).
1237-1242: Prefer an allocator-aware, fallible token handler and ensure freeingIf you adopt the fallible handler, capture ownership and free it once CONNECT is built.
Apply:
- // Get authentication token (dynamic handler takes precedence) - const auth_token = if (self.options.token_handler) |handler| - handler() - else - self.options.token; + // Get authentication token (dynamic handler takes precedence) + var auth_token: ?[]const u8 = null; + var owned_token: ?[]const u8 = null; + if (self.options.token_handler) |handler| { + const tok = try handler(self.allocator); + auth_token = tok; + owned_token = tok; // owned; will free before returning + } else { + auth_token = self.options.token; // borrowed from user; do not free + } + defer if (owned_token) |t| self.allocator.free(t);
1343-1389: Parsing -ERR: good coverage; minor improvements
- No allocation would be even better, but OK for rare -ERRs.
- Consider matching specific “permissions violation for publish/subscription”, “authorization timeout”, and “parser error” (see earlier diff).
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/connection.zig(9 hunks)src/root.zig(1 hunks)tests/auth_test.zig(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/auth_test.zig
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-30T18:48:28.333Z
Learnt from: lalinsky
PR: lalinsky/nats.zig#66
File: src/connection.zig:317-323
Timestamp: 2025-08-30T18:48:28.333Z
Learning: In src/connection.zig, the establishConnection() call in connectToServer() is during initial connection setup where no race conditions exist - the connection starts disconnected and other threads would see the connecting status and wait appropriately.
Applied to files:
src/connection.zig
🔇 Additional comments (7)
src/root.zig (1)
21-21: Public re-export of ProtocolError — looks goodThis surfaces protocol errors cleanly at the root. No functional risk.
src/connection.zig (6)
119-120: Plumbing ProtocolError into ConnectionError is correctThis lets callers handle specific protocol failures (e.g., AuthorizationViolation) distinctly. Good call.
407-408: Good: errdefer close() on connect failuresThis ensures cleanup if establishConnection/handshake fails later. The unlock defer runs before close(), so no lock inversion.
425-425: Propagating establishConnection failure with tryRight choice; avoids swallowing socket setup failures.
437-437: Waiting for handshake here is fineCondition wait releases the mutex, allowing reader to drive state transitions.
1402-1406: Solid: parse once, log mapped errorKeeps logs meaningful and avoids re-parsing downstream.
1409-1415: Correct: propagate ProtocolError during handshakeThis gives precise failure reasons to callers and integrates cleanly with ConnectionError.
| .lang = build_options.lang, | ||
| .version = build_options.version, | ||
| .protocol = 1, | ||
| .auth_token = auth_token, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
Ensure we don’t emit "auth_token": null
If auth_token is null and stringify emits nulls, some servers reject the CONNECT. Set the option explicitly.
- try std.json.stringify(connect_obj, .{}, buffer.writer());
+ try std.json.stringify(connect_obj, .{ .emit_null_optional_fields = false }, buffer.writer());🌐 Web query:
In current Zig std (0.12.x), what is the default for std.json.stringify option `emit_null_optional_fields`? Is it false by default?
💡 Result:
No — in Zig std 0.12.x the default for StringifyOptions.emit_null_optional_fields is true. [1][2]
Sources:
[1] std/json/stringify.zig (browseable) — shows emit_null_optional_fields: bool = true.
[2] std/lib source copy (Fossies) — documents Options.emit_null_optional_fields and the behavior.
Don't emit "auth_token": null — explicitly disable emitting null optional fields.
Zig std 0.12.x defaults emit_null_optional_fields = true, so stringify will emit nulls; set it to false.
- try std.json.stringify(connect_obj, .{}, buffer.writer());
+ try std.json.stringify(connect_obj, .{ .emit_null_optional_fields = false }, buffer.writer());📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| .auth_token = auth_token, | |
| .auth_token = auth_token, | |
| }; | |
| try std.json.stringify(connect_obj, .{ .emit_null_optional_fields = false }, buffer.writer()); |
🤖 Prompt for AI Agents
In src/connection.zig around line 1253, the JSON output is emitting
"auth_token": null because Zig 0.12.x defaults emit_null_optional_fields = true;
update the stringify call or the StringifyOptions used there to set
emit_null_optional_fields = false so optional fields that are null are omitted
from output (i.e., explicitly disable emitting null optional fields in the
options passed to std.json.stringify/format).
Add support for token-based authentication with both static tokens and dynamic token handlers. Authentication tokens are included in the CONNECT protocol message and authentication failures are properly detected and reported.
Features: