-
Notifications
You must be signed in to change notification settings - Fork 161
feat (jans-cedarling): Load trusted issuers on startup on the background #13125
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…ucture Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
…ized iss value Using this structure easier to understand where actual iss value is used. Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
…SSUER_LOADER_ASYNC` Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
…ng trusted issuers. But only sync version Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
…issuers` Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
on command `cargo clippy --tests` Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
…ypes now user can use `CEDARLING_TRUSTED_ISSUER_LOADER_TYPE` and `CEDARLING_TRUSTED_ISSUER_LOADER_WORKERS` parameters Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
…ing fields and defaults Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
|
Caution Review failedAn error occurred during the review process. Please try again later. 📝 WalkthroughWalkthroughAdds an IssClaim newtype and migrates issuer handling to it, makes multiple JWT/key/status-list caches thread-safe with Arc/RwLock, introduces a TrustedIssuerLoader (sync/async) configurable via bootstrap, tightens policy-store visibility, and updates many modules and tests to use IssClaim and Arc-wrapped keys. Changes
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related issues
Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Error: Hi @olehbozhok, You did not reference an open issue in your PR. I attempted to create an issue for you. |
Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
✅ Snyk checks have passed. No issues have been found so far.
💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
jans-cedarling/AGENTS.md (1)
91-93:⚠️ Potential issue | 🟡 MinorFix spelling in JWT status validation description.
“retreive” → “retrieve”.✅ Proposed fix
- On startup, the Cedarling should fetch and retreive the latest Status List JWT from the `.well-known/openid-configuration` via the `status_list_endpoint` claim and cache it. + On startup, the Cedarling should fetch and retrieve the latest Status List JWT from the `.well-known/openid-configuration` via the `status_list_endpoint` claim and cache it.jans-cedarling/cedarling/src/jwt/http_utils.rs (1)
140-143:⚠️ Potential issue | 🟡 MinorFix typos in error messages: "respose" → "response".
Lines 140 and 142 contain the typo "respose" which should be "response".
✏️ Proposed fix
- #[error("failed to deserialize respose from JSON: {0}")] + #[error("failed to deserialize response from JSON: {0}")] JsonDeserializeResponse(#[source] reqwest::Error), - #[error("failed to read the respose text: {0}")] + #[error("failed to read the response text: {0}")] ReadTextResponse(#[source] reqwest::Error),jans-cedarling/cedarling/src/jwt/validation/validator_cache.rs (1)
119-139: 🧹 Nitpick | 🔵 TrivialAvoid panicking on poisoned RwLock in cache accessors.
expectwill crash the service on lock poisoning. Consider returning aResult(or logging and skipping) so the caller can decide how to recover.
As per coding guidelines, use thiserror for custom error types and prefer Result<T, E> over panics with context in error messages.
🤖 Fix all issues with AI agents
In `@docs/cedarling/reference/cedarling-properties.md`:
- Around line 91-93: Correct the spelling of "retreive" to "retrieve" in the
documentation entry for CEDARLING_JWT_STATUS_VALIDATION; update the sentence
that reads "fetch and retreive the latest Status List JWT..." to "fetch and
retrieve the latest Status List JWT..." and scan the same file for other
occurrences of "retreive" to fix them as well.
In `@jans-cedarling/cedarling/src/bootstrap_config/decode.rs`:
- Line 13: Remove the unused import NonZeroU16 from the top-level use statement
in decode.rs: keep NonZeroUsize only (used for trusted_issuer_workers) and
delete NonZeroU16 to avoid the unused-import warning; locate the line importing
std::num::{NonZeroU16, NonZeroUsize} and change it to import only NonZeroUsize.
In `@jans-cedarling/cedarling/src/common/policy_store.rs`:
- Around line 314-316: The current iss_claim implementation uses Url::origin()
which strips path segments and breaks path-based issuers; change iss_claim to
feed the full OIDC endpoint (including path) into IssClaim::new by using the
OIDC endpoint's full string representation (e.g., oidc_endpoint.as_str() or
equivalent) instead of origin(). Update the iss_claim function so it constructs
IssClaim with the complete oidc_endpoint value to preserve path-based issuers.
In `@jans-cedarling/cedarling/src/jwt/issuer_index.rs`:
- Around line 21-32: The comments on the public methods of IssuerIndex should
use Rust doc comments; change the line comments above the functions new() and
insert() to use triple-slash documentation (///) so they consistently generate
API docs like the other methods (update the comment before pub(super) fn new()
-> Self and before pub(super) fn insert(&self, iss: IssClaim, config:
IssuerConfig)).
- Around line 42-58: The function signature for find_token_metadata_key uses an
unnecessary lifetime 'a on &self and the &str parameter; change its signature to
fn find_token_metadata_key(&self, entity_type_name: &str) -> Option<String> (no
explicit lifetimes) and update any references to match; the body can remain the
same since the return is an owned String and no additional lifetime annotations
are required.
- Around line 1-6: Add the required Apache 2.0 license header at the very top of
the issuer_index.rs file (above the first use statements); ensure you insert the
project's standard Apache-2.0 header/SPDX line with the appropriate year and
copyright/owner information so every source file begins with the license; you
can locate the file by the presence of symbols like IssuerConfig, IssClaim, and
TrustedIssuer in the imports.
In `@jans-cedarling/cedarling/src/jwt/mod.rs`:
- Around line 177-186: The local JWKS loading is gated by
loader.jwt_config.trusted_issuer_loader.is_sync(), causing JWKS to be skipped
when the loader is async; remove that sync-only guard or add an async-path
equivalent so jwks are inserted regardless of loader mode. Specifically, ensure
the logic around loader.jwt_config.trusted_issuer_loader.is_sync() is changed so
that when jwt_config.jwt_sig_validation is true and jwt_config.jwks.is_some()
and !key_service.has_keys(), you call key_service.insert_keys_from_str(jwks) (or
call it from the async loader branch) so local JWKS are loaded in both sync and
async loader modes.
In `@jans-cedarling/cedarling/src/jwt/trusted_issuers_loader.rs`:
- Line 99: The code uses errors_clone.lock().unwrap() which will panic if the
mutex is poisoned; change the locking call in the error-collection path to
recover from poisoning by using errors_clone.lock().unwrap_or_else(|e|
e.into_inner()), so the inner Vec can still be accessed and errors pushed even
after a previous panic; update the site where
errors_clone.lock().unwrap().push(error) is used to call unwrap_or_else(|e|
e.into_inner()) before pushing.
In `@jans-cedarling/cedarling/src/jwt/validation/trusted_issuer_validator.rs`:
- Around line 73-84: The loop in TrustedIssuerValidator::new binds an unused
_id; remove it and simplify the iteration by mapping or discarding the id
directly—e.g., use trusted_issuers.into_iter().map(|(_, ti)| Arc::new(ti)) or
trusted_issuers.into_values().map(Arc::new) and iterate over each issuer, then
insert issuer.iss_claim() -> issuer into url_to_issuer; keep references to the
new function (new), the url_to_issuer variable, the TrustedIssuer type and the
iss_claim() method to locate the change.
In `@jans-cedarling/cedarling/src/jwt/validation/validator_cache.rs`:
- Around line 279-285: The issuer comparison in ValidatorInfo::is_equal_to
currently unwraps and compares string values, treating Some/None mismatches as
equal; change the check to compare the Option values directly (i.e., compare
self.iss with other.iss) so that Some(x) != None is detected and the function
returns false on mismatch; update the is_equal_to implementation to use direct
Option equality for the iss field (and analogous Option fields if present) to
avoid false positives from hash collisions.
Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In `@docs/cedarling/reference/cedarling-properties.md`:
- Line 63: The phrasing in the documentation for the
CEDARLING_TRUSTED_ISSUER_LOADER_WORKERS property contains a split infinitive
("used to asynchronously load"); update the sentence to a more traditional style
by moving the adverb, e.g., change "used to asynchronously load trusted issuers"
to "used to load trusted issuers asynchronously" while preserving the rest of
the description and the property name CEDARLING_TRUSTED_ISSUER_LOADER_WORKERS.
- Around line 61-64: Update the CEDARLING_TRUSTED_ISSUER_LOADER_WORKERS property
description to state it applies to both CEDARLING_TRUSTED_ISSUER_LOADER_TYPE
modes (SYNC and ASYNC): explain that in SYNC mode the worker count parallelizes
loading of trusted issuers during initialization (blocking startup), and in
ASYNC mode it parallelizes background loading (non-blocking); keep the
default/zero behavior note and mention default of 1.
In `@jans-cedarling/cedarling/src/bootstrap_config/decode.rs`:
- Around line 117-118: Replace the unsafe construction for
trusted_issuer_workers: instead of using NonZeroUsize::new_unchecked(1) in the
unwrap_or_else closure, call the safe constructor NonZeroUsize::new(1).expect("1
is non-zero") so the fallback remains a guaranteed non-zero value without unsafe
code; update the closure where trusted_issuer_workers is assigned (the
expression using raw.trusted_issuer_loader_workers and
NonZeroUsize::new_unchecked).
In `@jans-cedarling/cedarling/src/jwt/issuer_index.rs`:
- Around line 45-59: find_token_metadata_key currently scans self.index
(accessed via index.read()) and will return the first token_key whose
policy.token_metadata.entity_type_name matches, which can hide duplicate
entity_type_name across issuers; add a validation step in the policy store
loader (the code that populates self.index/issuer_config) to reject or error on
duplicate entity_type_name values across different issuers by tracking seen
entity_type_names while iterating issuer configs and failing the load if a
duplicate is detected, or alternatively log a clear, documented error including
the conflicting issuer identifiers so the invariant is enforced rather than
silently relying on namespace conventions.
| let trusted_issuer_workers = NonZeroUsize::new(raw.trusted_issuer_loader_workers) | ||
| .unwrap_or_else(|| unsafe { NonZeroUsize::new_unchecked(1) }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick | 🔵 Trivial
🧩 Analysis chain
🌐 Web query:
Rust NonZeroUsize::new safe alternative to new_unchecked best practice
💡 Result:
The safe alternative to NonZeroUsize::new_unchecked(x) is almost always:
use std::num::NonZeroUsize;
let nz = NonZeroUsize::new(x); // -> Option<NonZeroUsize>new_unchecked is unsafe because passing 0 violates NonZero*’s “never zero” validity invariant and is undefined behavior (UB). The standard library explicitly documents this: new returns Option after checking, while new_unchecked “results in undefined behavior if the value is zero.” [1]
Best-practice patterns
1) Prefer new + handle Option
let nz = NonZeroUsize::new(x).ok_or(MyError::Zero)?;(or if let Some(nz) = ... / match).
2) If “zero is a bug” at a boundary, use expect/unwrap
let nz = NonZeroUsize::new(x).expect("x must be non-zero");This is also usable in const contexts because Option::expect/unwrap is const (shown in the std docs for NonZero*). [2]
3) Only use new_unchecked when you prove non-zero and need it
Typical justification: you already performed a check and want to avoid a redundant branch in a hot path:
let nz = match NonZeroUsize::new(x) {
Some(v) => v,
None => return Err(MyError::Zero),
};
// later, if you must reconstruct and can prove non-zero:
let y = nz.get(); // guaranteed non-zero by the type invariant [3]
let nz2 = unsafe { NonZeroUsize::new_unchecked(y) }; // usually unnecessaryIf you do use new_unchecked, put a clear // SAFETY: x != 0 because ... comment, and consider a debug_assert!(x != 0);.
Why the invariant matters: NonZero* enables layout optimizations (e.g., Option<NonZero*> can be the same size as the integer type, using 0 as the niche for None). [3], [4]
Sources:
[1] Rust std docs: NonZero::new and NonZero::new_unchecked safety/UB notes https://doc.rust-lang.org/beta/std/num/struct.NonZero.html
[2] Rust std docs: compile-time creation via NonZero*::new(...).expect(...) (example shown for NonZeroIsize) https://doc.rust-lang.org/std/num/type.NonZeroIsize.html
[3] Rust std docs: NonZeroIsize layout/validity + Option<NonZeroIsize> niche optimization explanation https://doc.rust-lang.org/std/num/type.NonZeroIsize.html
[4] Rust forum explanation of why new_unchecked must be unsafe (niche optimization and invariant) https://users.rust-lang.org/t/why-is-std-new-unchecked-unsafe/23151
Remove unnecessary unsafe when defaulting worker count.
The unsafe { NonZeroUsize::new_unchecked(1) } is unnecessary since 1 is compile-time guaranteed to be non-zero. Use the safe constructor instead:
NonZeroUsize::new(1).expect("1 is non-zero")This follows Rust best practices and avoids unsafe code when the invariant is trivially satisfied.
🤖 Prompt for AI Agents
In `@jans-cedarling/cedarling/src/bootstrap_config/decode.rs` around lines 117 -
118, Replace the unsafe construction for trusted_issuer_workers: instead of
using NonZeroUsize::new_unchecked(1) in the unwrap_or_else closure, call the
safe constructor NonZeroUsize::new(1).expect("1 is non-zero") so the fallback
remains a guaranteed non-zero value without unsafe code; update the closure
where trusted_issuer_workers is assigned (the expression using
raw.trusted_issuer_loader_workers and NonZeroUsize::new_unchecked).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is OK in this case. In this case it is safe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@olehbozhok, understood! While the safe constructor would avoid unsafe with zero runtime cost, I respect your decision to keep it as is.
🧠 Learnings used
Learnt from: olehbozhok
Repo: JanssenProject/jans PR: 13125
File: jans-cedarling/cedarling/src/jwt/mod.rs:177-186
Timestamp: 2026-02-02T18:28:01.853Z
Learning: In jans-cedarling JWT service (file: jans-cedarling/cedarling/src/jwt/mod.rs), the local JWKS loading is intentionally gated by `loader.jwt_config.trusted_issuer_loader.is_sync()`. This means local JWKS are only loaded in synchronous loader mode, not in async mode. This is by design.
Learnt from: olehbozhok
Repo: JanssenProject/jans PR: 12896
File: jans-cedarling/cedarling/src/entity_builder/trusted_issuer_index.rs:23-30
Timestamp: 2025-12-24T14:16:17.748Z
Learning: In jans-cedarling Rust code targeting WASM, avoid using `eprintln!` for warnings in constructors or initialization functions. Instead, accept an `Option<&Arc<LogStrategy>>` parameter and use `logger.log_any(LogEntry::new_with_data(LogType::System, None).set_message(...).set_log_level(LogLevel::WARN))` to emit warnings that work in WASM environments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we should refactor this i dont see the point of using unnecessary unsafe when it could be done safely and i think we can also use .unwrap_or(NonZeroUsize::MIN) // MIN is always 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we should also add input validation for worker count NonZeroUsize prevents zero, but no upper bound check exists. A misconfigured value (e.g., 10000) could spawn excessive tasks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we should refactor this i dont see the point of using unnecessary
unsafewhen it could be done safely and i think we can also use.unwrap_or(NonZeroUsize::MIN) // MIN is always 1
I agree with @haileyesus2433 with this unless there is an explicit reason to use unsafe we should avoid it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using NonZeroUsize::MIN is a really good idea. Thanks!
Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
jans-cedarling/cedarling/src/jwt/validation/validator_cache.rs (1)
303-330:⚠️ Potential issue | 🟡 MinorAdd a negative lookup case and use
matches!with an explicit message.The test only asserts the positive path. Please add a negative case and use
assert!(matches!(...), "...")for clarity.As per coding guidelines: Include both positive and negative test cases using assert!(matches!(...), "explicit comment") instead of assert!(result.is_err()).✅ Suggested test update
store.insert(&info, validator.clone()); - assert!(store.get(&info).is_some()); + assert!( + matches!(store.get(&info), Some(_)), + "retrieves inserted validator" + ); + + let missing_info = ValidatorInfo { + iss: Some(&test_iss), + token_kind: TokenKind::AuthzRequestInput("missing_tkn"), + algorithm: jsonwebtoken::Algorithm::HS256, + }; + assert!( + matches!(store.get(&missing_info), None), + "returns None for unknown validator" + );
🤖 Fix all issues with AI agents
In `@jans-cedarling/cedarling/src/jwt/validation/validator_cache.rs`:
- Around line 21-23: The code currently panics on poisoned RwLock via
EXPECT(MUTEX_POISONED_ERR); define a thiserror enum ValidatorCacheError (with
LockPoisoned variant) and change JwtValidatorCache methods init_for_iss, insert,
and get to return Result types (init_for_iss -> Result<(), ValidatorCacheError>,
insert -> Result<(), ValidatorCacheError>, get ->
Result<Option<Arc<RwLock<JwtValidator>>>, ValidatorCacheError>); replace
.expect(MUTEX_POISONED_ERR) with .map_err(|_|
ValidatorCacheError::LockPoisoned)? to convert poison into an error and
propagate Ok(()) from init_for_iss; update the callsite in load_trusted_issuer
to propagate the new error into JwtServiceInitError as required.
…R_LOADER_WORKERS` property Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@jans-cedarling/cedarling/src/bootstrap_config/raw_config/config.rs`:
- Around line 309-330: Add unit tests in the config.rs test module covering the
new fields: verify trusted_issuer_loader_type (TrustedIssuerLoaderTypeRaw) and
trusted_issuer_loader_workers (usize) default values when unset, that
environment variables CEDARLING_TRUSTED_ISSUER_LOADER_TYPE and
CEDARLING_TRUSTED_ISSUER_LOADER_WORKERS are parsed correctly (via serde with
deserialize_or_parse_string_as_json), and that JSON deserialization accepts
valid values and rejects invalid ones; specifically add tests that (1) construct
the config with no env/JSON and assert defaults, (2) set the env vars and assert
the resulting trusted_issuer_loader_type and trusted_issuer_loader_workers, and
(3) pass JSON strings for those fields including invalid inputs to assert
deserialization errors.
…DER` parameters Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@jans-cedarling/cedarling/src/bootstrap_config/raw_config/config.rs`:
- Around line 309-330: The docstring for trusted_issuer_loader_workers is
ambiguous about the default: update the comment for the pub
trusted_issuer_loader_workers: usize field to clearly state that although the
serde/default value may be zero when the env var is unset or parsed as 0, the
effective runtime default used by the loader logic is 1 (i.e., "Default value is
1; when unset or parsed as 0 it will be treated as 1"), so readers immediately
understand zero is treated as the default without needing to read the next
sentence; keep the serde attributes and type unchanged and only modify the
comment for clarity.
|
@coderabbitai looks like all comments resolved. Maybe you can approve the PR? |
|
🧠 Learnings used✅ Actions performedComments resolved and changes approved. |
haileyesus2433
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correct me if am wrong @olehbozhok but this PR implements async/sync loading modes but NOT true lazy loading as described in the issue. The issue states:
"Implement lazy loading of the JWKS endpoint until the Cedarling sees a token with the respective issuer"
but current implementation still loads ALL issuers at startup (either synchronously or in the background).
True lazy loading would:
- NOT fetch JWKS until a token from that issuer is encountered
- Trigger on-demand fetching when validating a token from an unloaded issuer
there is a misalignment with the issue and the PR
| @@ -0,0 +1,792 @@ | |||
| // This software is available under the Apache-2.0 license. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think using .expect() in concurrent code can cause panics. If the semaphore fails or mutex is poisoned, the entire service crashes. we should replace with proper error handling
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by "proper error handling`?
Just return an error and propagate it? Or try to run the process of fetching data again?
Or maybe anything other?
From my point of view, if the mutex is poisoned or the semaphore is already closed, this is a critical error.
Because we made a logical error in how to work with workers. Probably, we can log this, but nothing more. We have no guarantee that the data that we have is valid or consistent.
|
|
||
| loader.check_keys_loaded(); | ||
|
|
||
| let errors = Arc::into_inner(errors) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think Arc::into_inner() fails silently and If other references exist, Arc::into_inner() returns None and will panic. i think we should use Arc::try_unwrap() with proper error handling.
| ); | ||
| errors_clone | ||
| .lock() | ||
| .expect("failed to lock errors mutex while recording issuer loading failure - mutex may be poisoned due to panic in another thread") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a thread panics while holding the mutex, subsequent operations will fail. i think we should find ways recover from the poisoning
| let errors = Arc::into_inner(errors) | ||
| .expect("failed to extract errors Arc - other references may still exist indicating concurrent access during issuer loading") | ||
| .into_inner() | ||
| .expect("failed to extract errors from Mutex - mutex may be poisoned due to panic while holding lock"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same issue we should find ways to recover from poison
| let trusted_issuer_workers = NonZeroUsize::new(raw.trusted_issuer_loader_workers) | ||
| .unwrap_or_else(|| unsafe { NonZeroUsize::new_unchecked(1) }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we should refactor this i dont see the point of using unnecessary unsafe when it could be done safely and i think we can also use .unwrap_or(NonZeroUsize::MIN) // MIN is always 1
| let trusted_issuer_workers = NonZeroUsize::new(raw.trusted_issuer_loader_workers) | ||
| .unwrap_or_else(|| unsafe { NonZeroUsize::new_unchecked(1) }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we should also add input validation for worker count NonZeroUsize prevents zero, but no upper bound check exists. A misconfigured value (e.g., 10000) could spawn excessive tasks.
| Ok(()) | ||
| } | ||
|
|
||
| #[cfg(test)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there are critical test cases missing like
-
No test for true lazy loading
- Missing: Test that JWKS is fetched when token with unloaded issuer arrives
- Missing: Test that validation waits for lazy load to complete
-
No test for race conditions
- Missing: Test concurrent token validation during async loading
- Missing: Test multiple tokens from same issuer during loading
-
No test for WASM compatibility
- Missing: WASM-specific test cases
- Missing: Test spawn_task behavior in WASM
-
No test for error recovery
- Missing: Test behavior when JWKS fetch fails after retry
- Missing: Test mutex poisoning recovery
-
No test for worker limit
- Missing: Test behavior with maximum workers
- Missing: Test behavior with single worker and many issuers
-
No integration test for async mode timing
- Missing: End-to-end test showing async loading doesn't block startup
- Missing: Test that service accepts requests while loading in background
| let mut issuer_configs = HashMap::default(); | ||
| let mut validators = JwtValidatorCache::default(); | ||
| let mut key_service = KeyService::new(); | ||
| let status_lists = Arc::new(StatusListCache::default()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question here why wrap StatusListCache in Arc when the field in this struct is already wrapped in Arc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, fixed 7ede7b2
| /// Verifies that when `jwt_status_validation` is true, the status list cache is initialized | ||
| /// for the issuer and contains the expected status list. | ||
| #[tokio::test] | ||
| #[ignore = "Mock server issue with status list endpoint"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we ignore this test then what use does it have?
I was thinking a lot about that. The main point of authorisation is that it should be incredibly fast. Any delay may cause a freeze of the UI application, which we would like to avoid. In a usual use case, we have a small amount of trusted issuers. (it's words of Michael) |
ok well in that case i think the issue should be updated then so that it aligns with the PR dont u think so @olehbozhok? |
Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
Signed-off-by: Oleh Bozhok <6554798+olehbozhok@users.noreply.github.com>
Prepare
Description
Added loading trusted issuers on basctround after startup. If user specify bootstrap parameter.
Target issue
closes #10004
Implementation Details
IssClaim(store normalized value of iss).Test and Document the changes
Please check the below before submitting your PR. The PR will not be merged if there are no commits that start with
docs:to indicate documentation changes or if the below checklist is not selected.Summary by CodeRabbit
New Features
Documentation
Improvements