Use MSAL's recent UserFIC API for agentic flows #3842

Open
Avery-Dunn wants to merge 8 commits into master from avdunn/agentic-fic-scenario-fix

@Avery-Dunn Avery-Dunn commented Jun 4, 2026

Contributor

Replaces the internal ROPC piggybacking mechanism for agentic User FIC token acquisition with MSAL .NET's native AcquireTokenByUserFederatedIdentityCredential API. This modernizes the agentic flow to use purpose-built MSAL APIs, simplifies the implementation, and resolves a customer-reported caching bug (#3840).

Background

ID Web's agentic User FIC flow previously worked by hijacking the ROPC (AcquireTokenByUsernamePassword) code path. An internal add-in (AgentUserIdentityMsalAddIn) registered an OnBeforeTokenRequestHandler callback that rewrote the HTTP request body at the last moment — changing grant_type to user_fic, injecting token assertions, and removing the dummy password. This approach had several drawbacks:

  • No proper cache support: The ROPC silent-flow guard requires a non-null ClaimsPrincipal with oid/tid claims. In agentic scenarios (bots, services), ClaimsPrincipal is typically null or request-scoped, so the cache was always bypassed — causing 2–4 unnecessary network round-trips per call (#3840).
  • Fragile HTTP body manipulation: Rewriting request bodies inside a callback is opaque, hard to debug, and bypasses MSAL's validation and telemetry.
  • No type safety: The flow used string manipulation of grant types and body parameters rather than MSAL's typed API surface.

MSAL .NET has since introduced AcquireTokenByUserFederatedIdentityCredential — a first-class API for exactly this scenario, with built-in cache support and proper protocol handling. MSAL .NET 4.84.2 added the Guid userObjectId overload, enabling both UPN-based and OID-based flows.

Approach

Multi-CCA Pattern

The new implementation uses two CCA instances working together:

| Component | Role | Cache |
| --- | --- | --- |
| Blueprint CCA | ID Web's existing CCA (the app's own identity). Handles Leg 1: acquires the FMI token (T1) via AcquireTokenForClient + WithFmiPath(agentAppId). | Shared app token cache: T1 is cached and reused across agents. |
| Agent CCA | New per-agent CCA whose client credential is a WithClientAssertion callback that chains to the blueprint for Leg 1. Handles Leg 2 (AcquireTokenForClient → T2) and Leg 3 (AcquireTokenByUserFederatedIdentityCredential → user token). | Own in-memory cache: user tokens are cached per agent, per user. |

Three-Leg Flow

1. Caller → CreateAuthorizationHeaderForUserAsync(scopes, options, claimsPrincipal: null)
2. TryGetAuthenticationResultForAgentUserFicAsync detects agentic flow (UPN or OID)
3. Silent lookup via stored account identifier → cache HIT? Return cached token.
4. Cache MISS:
   Leg 1: Blueprint CCA → AcquireTokenForClient + WithFmiPath → T1 (FMI token)
   Leg 2: Agent CCA → AcquireTokenForClient (T1 as client assertion) → T2 (instance token)
   Leg 3: Agent CCA → AcquireTokenByUserFederatedIdentityCredential(scopes, UPN/OID, T2) → user token
5. Store account identifier for future silent lookups
6. Return user token

On subsequent calls for the same user, step 3 returns the cached token with zero network calls.
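
The control flow above can be sketched without MSAL by replacing each leg with a stub. This is a minimal illustration, not the PR's implementation: every name in it (`AcquireUserTokenAsync`, `Leg1Async`, and so on) is hypothetical, and a plain dictionary stands in for MSAL's token cache. It only shows why a second call for the same user needs no network calls. In the real flow T1 and T2 are cached as well, which the sketch ignores.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Stand-ins for MSAL: a counter for simulated network calls and a dictionary as the user-token cache.
int networkCalls = 0;
var userTokenCache = new ConcurrentDictionary<string, string>();

// Each stub represents one leg of the flow and counts as one network round-trip.
Task<string> Leg1Async(string agentAppId) { networkCalls++; return Task.FromResult($"T1({agentAppId})"); }
Task<string> Leg2Async(string t1) { networkCalls++; return Task.FromResult($"T2({t1})"); }
Task<string> Leg3Async(string t2, string user) { networkCalls++; return Task.FromResult($"user-token({user},{t2})"); }

async Task<string> AcquireUserTokenAsync(string agentAppId, string user, string tenantId)
{
    string key = $"{agentAppId}:{user}:{tenantId.ToUpperInvariant()}";

    // Step 3: silent lookup; a hit returns without touching the network.
    if (userTokenCache.TryGetValue(key, out var cached))
    {
        return cached;
    }

    // Step 4: cache miss, run the three legs in order.
    string t1 = await Leg1Async(agentAppId);
    string t2 = await Leg2Async(t1);
    string userToken = await Leg3Async(t2, user);

    // Step 5: remember the result for future silent lookups.
    userTokenCache[key] = userToken;
    return userToken;
}

string first = await AcquireUserTokenAsync("agent-app", "user@contoso.com", "tenant-1");
int callsAfterFirst = networkCalls;
string second = await AcquireUserTokenAsync("agent-app", "user@contoso.com", "tenant-1");

// prints "calls after first: 3, after second: 3"
Console.WriteLine($"calls after first: {callsAfterFirst}, after second: {networkCalls}");
```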

Account Identifier Storage

In all other ID Web flows, the MSAL account identifier (needed for AcquireTokenSilent) is stored in the ClaimsPrincipal via oid/tid claims. In the agentic scenario, ClaimsPrincipal is typically null or request-scoped, so a ConcurrentDictionary<string, string> maps "{agentAppId}:{USER_IDENTIFIER}:{TENANTID}" → MSAL account identifier. Entries are cleaned up when MSAL evicts the corresponding account from its cache.
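
A small sketch of that mapping, assuming the key is built the way the diff hunks quoted later in this conversation suggest (tenant upper-cased, empty string when absent). `BuildAccountLookupKey` and the sample values are hypothetical, and whether the user identifier itself is normalized is not shown in this PR, so the sketch leaves it untouched.

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;

// Maps "{agentAppId}:{userIdentifier}:{TENANTID}" to an MSAL account identifier.
var accountIds = new ConcurrentDictionary<string, string>();

// The tenant is upper-cased so casing differences do not split entries.
string BuildAccountLookupKey(string agentAppId, string userIdentifier, string? tenantId)
{
    string normalizedTenant = tenantId?.ToUpperInvariant() ?? string.Empty;
    return $"{agentAppId}:{userIdentifier}:{normalizedTenant}";
}

string key = BuildAccountLookupKey("agent-app-id", "user@contoso.com", "contoso-tenant");
accountIds[key] = "hypothetical-account-id"; // stored after a successful Leg 3

// A later call with different tenant casing resolves to the same entry.
bool hit = accountIds.TryGetValue(
    BuildAccountLookupKey("agent-app-id", "user@contoso.com", "CONTOSO-TENANT"),
    out var accountId);

// When MSAL no longer holds the account, the stale mapping is dropped.
accountIds.TryRemove(key, out _);
```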

Changes

Directory.Build.props

  • Bump MSAL .NET from 4.84.1 → 4.84.2 (adds Guid userObjectId overload)

TokenAcquisition.cs

  • TryGetAuthenticationResultForAgentUserFicAsync (new): Detects UPN or OID agentic flows, performs silent retrieval or the 3-leg flow via native MSAL APIs
  • GetOrBuildAgentUserFicCcaAsync (new): Builds and caches agent CCAs per (agentAppId, authenticationScheme) with assertion callbacks
  • ResolveFicScopes (new): Resolves the FIC token exchange scope from ExtraParameters, falling back to the public cloud default (api://AzureADTokenExchange/.default). Callers can override for national clouds via ExtraParameters[Constants.TokenExchangeUrlKey], matching the pattern used by OidcIdpSignedAssertionProvider and GetFicTokenAsync.
  • Early return in TryGetAuthenticationResultForConfidentialClientUsingRopcAsync: Intercepts agentic flows before the ROPC path
  • Dead code removal: Agent identity extraction blocks from the ROPC method
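
As a rough illustration of the per-(agentAppId, authenticationScheme) caching, the sketch below uses `object` in place of `IConfidentialClientApplication` so it has no MSAL dependency. `GetOrBuildAgentCca` is a hypothetical, synchronous stand-in for the async method named above; only the key format is taken from the PR.

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;

// object stands in for IConfidentialClientApplication in this sketch.
var agentCcas = new ConcurrentDictionary<string, object>();
int built = 0;

object GetOrBuildAgentCca(string agentAppId, string? authenticationScheme)
{
    // Key format from the PR: "{agentAppId}:{authenticationScheme}".
    string ccaCacheKey = $"{agentAppId}:{authenticationScheme ?? string.Empty}";
    return agentCcas.GetOrAdd(ccaCacheKey, _ => { built++; return new object(); });
}

object a = GetOrBuildAgentCca("agent-1", "Bearer");
object b = GetOrBuildAgentCca("agent-1", "Bearer");      // same key, reuses a
object c = GetOrBuildAgentCca("agent-1", "OtherScheme"); // different scheme, separate instance

// prints "True False 2"
Console.WriteLine($"{ReferenceEquals(a, b)} {ReferenceEquals(a, c)} {built}");
```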

Constants.cs

  • TokenExchangeUrlKey (new): Key for overriding the FIC token exchange URL via ExtraParameters for national cloud support
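
The override could look roughly like the sketch below. Several things here are assumptions rather than facts from the PR: the string value of the key is a placeholder (the actual value of `Constants.TokenExchangeUrlKey` is not shown), the extra-parameters bag is modeled as `IDictionary<string, object>`, and the override is treated as a base URL to which `/.default` is appended when missing.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Placeholder key name; the real value of Constants.TokenExchangeUrlKey is not shown in this PR.
const string TokenExchangeUrlKey = "hypothetical-token-exchange-url-key";
const string DefaultFicScope = "api://AzureADTokenExchange/.default";

string[] ResolveFicScopes(IDictionary<string, object>? extraParameters)
{
    if (extraParameters != null
        && extraParameters.TryGetValue(TokenExchangeUrlKey, out object? value)
        && value is string url
        && !string.IsNullOrWhiteSpace(url))
    {
        // Assumption: the override is a base URL, so "/.default" is appended when missing.
        string scope = url.EndsWith("/.default", StringComparison.OrdinalIgnoreCase)
            ? url
            : url.TrimEnd('/') + "/.default";
        return new[] { scope };
    }

    // No override supplied: fall back to the public cloud default.
    return new[] { DefaultFicScope };
}

string[] defaults = ResolveFicScopes(null);
string[] overridden = ResolveFicScopes(new Dictionary<string, object>
{
    [TokenExchangeUrlKey] = "api://example-national-cloud-exchange", // placeholder value
});
Console.WriteLine($"{defaults[0]} | {overridden[0]}");
```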

AgentIdentitiesExtension.cs

  • Removed add-in callback registration from AddAgentIdentities() (AddOidcFic() preserved)

AgentUserIdentityMsalAddIn.cs

  • Deleted. The internal class that rewrote ROPC request bodies is fully superseded.

TokenAcquisitionTests.cs

  • 6 new tests:
    • UPN: UsesCacheOnSecondCall, WorksWithNonNullClaimsPrincipal, CacheWorksWithNewClaimsPrincipalPerCall
    • OID: OidUsesCacheOnSecondCall, OidCacheWorksWithNewClaimsPrincipalPerCall, UpnAndOidCachesAreIsolated

Testing

  • 40 TokenAcquisitionTests pass (34 existing + 6 new) across net8.0, net9.0, net10.0
  • 5 TokenAcquisitionAddInTests pass (general add-in infrastructure unaffected)
  • No regressions

Breaking Changes

None. All public APIs are unchanged:

  • WithAgentUserIdentity(options, agentAppId, username) — now uses native UPN path internally
  • WithAgentUserIdentity(options, agentAppId, userId) — now uses native OID path internally
  • AddAgentIdentities() — still registers OidcFic; no longer registers the (internal) add-in callback

The deleted AgentUserIdentityMsalAddIn was internal static with no external consumers.

Known Limitations

  • National cloud FIC scope: The FIC scope defaults to api://AzureADTokenExchange/.default (public cloud). National cloud variants (China, France, Germany) can be specified via ExtraParameters[Constants.TokenExchangeUrlKey]. This is consistent with how OidcIdpSignedAssertionProvider and GetFicTokenAsync handle the override. A future enhancement could add automatic cloud-aware resolution.
  • Agent CCA dictionary is unbounded: The _agentUserFicCcas dictionary has no size cap, consistent with all other CCA dictionaries in TokenAcquisition (_applicationsByAuthorityClientId, _managedIdentityApplicationsByClientId). In practice, cardinality is very low (typically 1 agent app per deployment). Adding a cross-cutting size cap could be a follow-up.

Resolves

Replace ROPC piggybacking with MSAL's native
AcquireTokenByUserFederatedIdentityCredential API using the multi-CCA
pattern (blueprint + per-agent CCAs with assertion callbacks).

This enables proper token caching for agentic User FIC flows when
ClaimsPrincipal is null, eliminating 2-4 unnecessary network round-trips
per bot message.

Phase 1: UPN-based flows only. OID-based flows remain on the existing
ROPC+add-in path pending MSAL .NET support for the OID overload.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@Avery-Dunn Avery-Dunn requested a review from a team as a code owner June 4, 2026 16:56
/// Entries are cleaned up when MSAL evicts the corresponding account from its cache.
/// </summary>
private readonly ConcurrentDictionary<string, string> _agentUserFicAccountIds = new();
private static readonly string[] s_ficScopes = new[] { "api://AzureADTokenExchange/.default" };

Contributor

These are always the public token exchange?

Contributor Author

The existing behavior in ID Web for these agent scenarios hardcoded the public cloud endpoint, so my PR originally tried to match that.

However, in the latest commit I refactored it to be customizable: now the public cloud endpoint is only the default, and can be manually overridden by setting ExtraParameters[Constants.TokenExchangeUrlKey] (this is how national cloud versions of AzureADTokenExchange are handled in other parts of ID Web today, such as in OidcIdpSignedAssertionProvider and GetFicTokenAsync)

Member

We need to fix this holistically. I assigned you some other bugs related to this.

@Avery-Dunn Avery-Dunn marked this pull request as draft June 4, 2026 20:04
/// <summary>
/// Caches agent CCAs for the native User FIC flow. Each agent CCA uses an assertion
/// callback that chains to the blueprint CCA for Leg 1 (FMI token acquisition).
/// Key format: "{agentAppId}:{authenticationScheme}".

Contributor

What is the reason to have the authenticationScheme included in the key for CCA?

Contributor Author

Expanded the comment to explain why authenticationScheme is part of the key: different schemes may resolve to different blueprint credentials (e.g. certificates), and the agent CCA's assertion callback captures the scheme so that it chains back to the correct blueprint CCA.

})
.WithAuthority(authority)
.WithHttpClientFactory(_httpClientFactory)
.WithInstanceDiscovery(false) // Blueprint already validated the authority.

Contributor

If the authority is already validated, then the result is cached. We should not explicitly disable instance discovery.

Contributor Author

Removed WithInstanceDiscovery(false) in the latest commit.

/// callback that chains to the blueprint CCA for Leg 1 (FMI token acquisition).
/// Key format: "{agentAppId}:{authenticationScheme}".
/// </summary>
private readonly ConcurrentDictionary<string, IConfidentialClientApplication> _agentUserFicCcas = new();

Contributor

Do you know how many agentic CCA apps can exist? Should there be an upper limit on this?

Contributor Author

The basic structure is that one "blueprint" Entra app can have some number of "agent" Entra apps, and each agent app has some number of users it's acting on behalf of, but I'm not sure what a realistic number of agent apps would actually be.

There are a few places in ID Web with similarly unbounded collections of CCA instances:

- private readonly ConcurrentDictionary<string, IConfidentialClientApplication?> _applicationsByAuthorityClientId = new();
- private readonly ConcurrentDictionary<string, IManagedIdentityApplication> _managedIdentityApplicationsByClientId = new();

So on the one hand an unbounded dictionary matches the existing style, but on the other hand there are a lot more reasons to create "agent" apps than normal CCA apps so this list will likely be much larger than the others.

However, we haven't really discussed this issue, and any solution we come up with would also be helpful guidance for our customers, so that might be out of scope for this PR.

Avery-Dunn and others added 2 commits June 5, 2026 06:29
- Bump MSAL .NET from 4.84.1 to 4.84.2 (adds Guid userObjectId overload
  for AcquireTokenByUserFederatedIdentityCredential)
- Extend TryGetAuthenticationResultForAgentUserFicAsync to handle both
  UPN-based and OID-based agentic flows via native MSAL APIs
- Remove AgentUserIdentityMsalAddIn (ROPC body-rewriting workaround) and
  its registration in AddAgentIdentities — no longer needed
- Remove dead agent identity extraction code from ROPC path
- Add 3 OID-specific tests: cache on second call, fresh ClaimsPrincipal
  per call, and UPN/OID cache isolation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI left a comment

Contributor

Pull request overview

This PR modernizes the agentic “User FIC” token acquisition flow by replacing the prior ROPC piggybacking/request-rewrite add-in with MSAL’s native AcquireTokenByUserFederatedIdentityCredential API, aiming to restore proper MSAL cache usage and fix the cache-bypass reported in #3840.

Changes:

  • Bumps MSAL .NET to 4.84.2 and switches agentic user token acquisition to native UserFIC (multi-CCA / 3-leg flow).
  • Adds internal caching structures for per-agent CCA instances and MSAL account identifiers to enable silent token acquisition even when ClaimsPrincipal is null.
  • Removes the internal MSAL add-in that rewrote ROPC requests and adds new unit tests covering UPN/OID cache behavior.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

Show a summary per file
| File | Description |
| --- | --- |
| Directory.Build.props | Updates MSAL .NET version to enable the needed UserFIC overload. |
| src/Microsoft.Identity.Web.TokenAcquisition/TokenAcquisition.cs | Implements native UserFIC flow, agent CCA caching, and account-id mapping for silent cache hits. |
| src/Microsoft.Identity.Web.TokenAcquisition/Constants.cs | Adds an internal key for overriding the token-exchange audience via ExtraParameters. |
| src/Microsoft.Identity.Web.AgentIdentities/AgentIdentitiesExtension.cs | Stops registering the old ROPC-rewrite callback in AddAgentIdentities(). |
| src/Microsoft.Identity.Web.AgentIdentities/AgentUserIdentityMsalAddIn.cs | Deletes the internal add-in that rewrote token requests. |
| tests/Microsoft.Identity.Web.Test/TokenAcquisitionTests.cs | Adds new tests for agentic UserFIC caching behavior (UPN + OID). |

Comment on lines +729 to +732
// Include authenticationScheme in the CCA cache key so different schemes
// (pointing to different blueprint credentials) get separate CCAs.
string ccaCacheKey = $"{agentAppId}:{authenticationScheme ?? string.Empty}";

Comment on lines +634 to +638
// Try silent retrieval first using a stored account identifier from a prior call.
// Include tenantId in the key so cross-tenant calls don't collide.
string normalizedTenant = tenantId?.ToUpperInvariant() ?? string.Empty;
string accountLookupKey = $"{agentAppId}:{userIdentifierForCacheKey}:{normalizedTenant}";
if (!forceRefresh

Comment on lines +762 to +767
var leg1 = await blueprintCca
.AcquireTokenForClient(capturedFicScopes)
.WithFmiPath(agentAppId)
.WithSendX5C(blueprintOptions.SendX5C)
.ExecuteAsync(options.CancellationToken)
.ConfigureAwait(false);

Comment on lines +760 to +767
// UPN flow handlers (3 legs)
AddAgentUserFicMockHandlers(mockHttpClient!, userAccessToken: "upn-user-token");
// OID flow handlers (3 legs — Leg 1 may be cached from UPN flow, but Leg 2 + Leg 3 are needed)
// Note: Leg 1 (blueprint FMI) is cached in the blueprint CCA, but Leg 2 uses the agent CCA
// which also caches T2. Since OID and UPN go to the same agent CCA, Leg 2 may be cached.
// We still need a Leg 3 handler for the OID grant type.
mockHttpClient!.AddMockHandler(CreateUserFicTokenHandler(accessToken: "oid-user-token"));

/// different blueprint credentials (e.g. certificates), and the agent CCA's assertion
/// callback captures the scheme to chain to the correct blueprint CCA.
/// </summary>
private readonly ConcurrentDictionary<string, IConfidentialClientApplication> _agentUserFicCcas = new();

@bgavrilMS bgavrilMS Jun 5, 2026

Member

Can we try to reuse the existing CCAs? I am concerned because there is a lot of logic / hooks around the creation of these CCA objects, and we'd need to replicate them? For example, translating from the Identity.Web object model for configuration to the MSAL one (force refresh, auth scheme etc. etc.); It would be good to reuse as much as possible.

@Avery-Dunn Avery-Dunn Jun 5, 2026

Contributor Author

To clarify, what do you mean by "reuse the existing CCAs"?

We cannot reuse these CCA instances for different agents. As shown in this POC, there is a fundamental incompatibility between MSAL's existing cache key design and the "1+N" client IDs that the agent ID flow requires: AzureAD/microsoft-authentication-library-for-dotnet#6008

That POC also shows that a relatively small code change could reduce this to a single CCA instance. However, in our previous discussions about this you said we should not need to go down that route and should instead focus on providing guidance on how to use the existing FIC APIs.

Now that we've started seeing real-world usage like this, should we start exploring our options to resolve this again?

Member

I see. I'm OK with adding an extensibility API in MSAL.NET to address this. This code duplication will cause problems later on.

@Avery-Dunn Avery-Dunn marked this pull request as ready for review June 5, 2026 15:30
@Avery-Dunn Avery-Dunn changed the title Fix #3840: Use MSAL's UserFIC API for agentic UPN flows Use MSAL's UserFIC API for agentic flows Jun 5, 2026
@Avery-Dunn Avery-Dunn changed the title Use MSAL's UserFIC API for agentic flows Use MSAL's recent UserFIC API for agentic flows Jun 5, 2026
Copilot AI and others added 5 commits June 5, 2026 09:28
* Initial plan

* Bump MSAL to 4.84.2

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
5 participants