
fix(ucs): skip disabled-connector check in shadow psync to prevent false typeDiff (#17476) #13085

Open
AmitsinghTanwar007 wants to merge 1 commit into main from autofix/cybersource-psync-17476

Conversation

@AmitsinghTanwar007

@AmitsinghTanwar007 AmitsinghTanwar007 commented Jun 30, 2026

Contributor

Summary

When a connector is listed in ucs_psync_disabled_connectors, the psync shadow execution path returns Ok(router_data.clone()) early — before any UCS gRPC call is made. In primary mode this is the correct fallback to direct HTTP. In shadow mode it is incorrect: the validation service receives a router_data with:

  • connector_http_status_code = null (initial None — no API call was made)
  • response.Err.code = "CONNECTOR_ERROR_RESPONSE" (placeholder from uninitialized error path)

Meanwhile, HS makes the real connector call and gets:

  • connector_http_status_code = <number>
  • response.Err.code = "No error code" (actual connector response)

Root cause

psync_gateway.rs checks is_ucs_psync_disabled without inspecting ExecutionMode. The early-return applies in shadow mode as well as primary mode, so the comparison service receives an unpopulated router_data on the UCS side.
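The root cause can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `RouterDataSketch`, `initial`, and `psync_before_fix` are hypothetical, heavily simplified stand-ins, not the router's real types or signatures. The sketch only shows that returning a clone of the input hands back the untouched initial values.

```rust
/// Hypothetical, heavily simplified stand-in for the router's RouterData.
#[derive(Clone, Debug, PartialEq)]
pub struct RouterDataSketch {
    /// `None` until a connector response has been processed.
    pub connector_http_status_code: Option<u16>,
    /// `Err` carries only the error code string in this sketch.
    pub response: Result<String, String>,
}

impl RouterDataSketch {
    /// State before any connector call: no status code, placeholder error code.
    pub fn initial() -> Self {
        Self {
            connector_http_status_code: None,
            response: Err("CONNECTOR_ERROR_RESPONSE".to_string()),
        }
    }
}

/// Pre-fix behaviour: the disabled check ignores the execution mode.
/// `call_ucs` is a hypothetical closure standing in for the UCS gRPC call.
pub fn psync_before_fix(
    router_data: &RouterDataSketch,
    is_ucs_psync_disabled: bool,
    call_ucs: impl FnOnce(&RouterDataSketch) -> RouterDataSketch,
) -> RouterDataSketch {
    if is_ucs_psync_disabled {
        // Early return: the caller gets the unpopulated input back.
        return router_data.clone();
    }
    call_ucs(router_data)
}
```

With `is_ucs_psync_disabled = true`, the closure is never invoked, so the result still carries `None` and the placeholder code, which is what the comparison service would see on the UCS side.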

Fix

Gate the early-return on ExecutionMode != Shadow:

// Before
if is_ucs_psync_disabled {
    return Ok(router_data.clone());
}

// After
if is_ucs_psync_disabled
    && !matches!(unified_connector_service_execution_mode, ExecutionMode::Shadow)
{
    return Ok(router_data.clone());
}

In shadow mode the gateway now always attempts the UCS gRPC call, so the comparison receives genuine data from both sides.
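As a rough check of the gate's truth table, the condition can be isolated into a small predicate. This is a sketch under stated assumptions: the two-variant `ExecutionMode` below is a simplified stand-in (the real enum may have other variants), and `should_return_early` is a hypothetical helper that does not exist in the PR.

```rust
/// Simplified stand-in for the router's ExecutionMode (assumed variants).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ExecutionMode {
    Primary,
    Shadow,
}

/// Hypothetical helper mirroring the gated condition from the fix:
/// skip the UCS call only when the connector is disabled AND we are
/// not running in shadow mode.
pub fn should_return_early(is_ucs_psync_disabled: bool, mode: ExecutionMode) -> bool {
    is_ucs_psync_disabled && !matches!(mode, ExecutionMode::Shadow)
}
```

Under these assumptions, only the disabled/primary combination returns early; a disabled connector in shadow mode, and any enabled connector, proceeds to the UCS call.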

Verification

Source-parity analysis:

  • Before: is_ucs_psync_disabled = true → early return → connector_http_status_code = null, response.Err.code = "CONNECTOR_ERROR_RESPONSE" on UCS side → false-positive diffs vs HS
  • After: in shadow mode the UCS call proceeds → both fields populated from real API response → diffs eliminated

Diff signatures resolved

  • router.typeDiff:connector_http_status_code (cybersource / psync / flowbird)
  • router.valueDiff:response.Err.code (cybersource / psync / flowbird)
  • router.keyDiff:response.Err (cybersource / psync / flowbird)
  • router.keyDiff:response.Ok (cybersource / psync / flowbird)
  • router.keyDiff:response.Ok + router.typeDiff:connector_http_status_code,connector_response (stripe / psync / yucca) — same unguarded early-return, confirmed source-parity on a separate connector; this fix is connector-agnostic (keyed off ucs_psync_disabled_connectors membership + ExecutionMode) so it resolves the bug for any connector in that list, not just cybersource.
  • router.keyDiff:response.Err (stripe / psync / yucca) — Err-facet of the same stripe/yucca event pair above (HS side reports response.Ok, UCS side reports the placeholder response.Err); confirms both facets of a single shadow-mode mismatch are closed by this fix.
  • router.typeDiff:connector_http_status_code (stripe / psync / yucca) — standalone typeDiff facet of the same stripe/yucca psync event pair (#18246/#18245); UCS side's placeholder router_data has connector_http_status_code = null where HS has a real number, closed by the same shadow-mode gate.
  • router.typeDiff:connector_response (stripe / psync / yucca) — 4th facet of the same stripe/yucca psync event pair (#18246/#18245/#18247); UCS side's placeholder router_data has connector_response = null (type object on HS vs null on UCS) since no gRPC call was made, closed by the same shadow-mode gate.

Closes #13084
Closes #13089

@AmitsinghTanwar007 AmitsinghTanwar007 requested a review from a team as a code owner June 30, 2026 11:32
@semanticdiff-com

semanticdiff-com Bot commented Jun 30, 2026


Review changes with  SemanticDiff

Changed Files
File Status
  crates/router/src/core/payments/gateway/psync_gateway.rs  27% smaller

@AmitsinghTanwar007 AmitsinghTanwar007 force-pushed the autofix/cybersource-psync-17476 branch from 2a9cb72 to 7098bb9 Compare June 30, 2026 12:34
@AmitsinghTanwar007 AmitsinghTanwar007 requested review from a team as code owners June 30, 2026 12:34
@AmitsinghTanwar007 AmitsinghTanwar007 changed the title fix(ucs): skip disabled-connector check in shadow psync to prevent false typeDiff feat(connector): implement ClientSDKSessionToken + Authorize for Square Jun 30, 2026
@XyneSpaces


🚨 PR title/content mismatch

The PR title claims "implement ClientSDKSessionToken + Authorize for Square" but:

  1. The branch name is autofix/cybersource-psync-17476 (suggests Cybersource fix)
  2. No ClientSDKSessionToken type or implementation exists in the diff
  3. The actual changes appear to be typo fixes per commit history

Action required: Either:

  • Update the PR title to match actual content (typo fixes)
  • Or push the actual Square ClientSDKSessionToken implementation to this branch

Current state blocks review as the claimed scope doesn't match the actual changes.

@XyneSpaces


⚠️ PR title/body mismatch

The PR title states "implement ClientSDKSessionToken + Authorize for Square" but the body mentions "fix for Cybersource psync shadow execution". The diff only shows .typos.toml additions.

Please either:

  1. Update the PR title/description to match the actual changes (typos allowlist)
  2. Or include the Square connector implementation changes if this is meant to be a feature PR

If this is an autofix PR for typos only, rename it to something like chore: update typos.toml allowlist for connector terms.

@AmitsinghTanwar007 AmitsinghTanwar007 changed the title feat(connector): implement ClientSDKSessionToken + Authorize for Square fix(ucs): skip disabled-connector check in shadow psync to prevent false typeDiff (#17476) Jul 1, 2026
@AmitsinghTanwar007 AmitsinghTanwar007 force-pushed the autofix/cybersource-psync-17476 branch from 7098bb9 to 6f33d45 Compare July 1, 2026 11:18
@Shubhodip900 Shubhodip900 removed request for a team July 2, 2026 07:22
…lse typeDiff

When a connector is in `ucs_psync_disabled_connectors` the psync gateway
returns `Ok(router_data.clone())` early, before any UCS call is made. In
primary mode this is correct — the caller falls back to the direct HTTP path.
In shadow mode it is wrong: the validator receives a router_data with
`connector_http_status_code = null` (the initial None value) while the direct
HTTP side carries a real numeric status code, producing a spurious
`router.typeDiff:connector_http_status_code` on every sync for that connector.

Fix: gate the early-return on `ExecutionMode != Shadow` so that in shadow mode
the gateway always attempts the UCS gRPC call and returns genuine data to the
comparison service.

Closes #13084

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@10X-GRACE 10X-GRACE force-pushed the autofix/cybersource-psync-17476 branch from 6f33d45 to 5e78ded Compare July 2, 2026 22:18
@10X-GRACE

Contributor

Re-validated & rebased for #17475 (cybersource / psync / flowbird)

Rebased this PR onto latest main (origin/main @ d756c177ef) — clean rebase, single commit, only crates/router/src/core/payments/gateway/psync_gateway.rs changed. Re-verified the fix resolves parent group #17475's child diffs #17476 and #17477.

Before (issue diff signatures)

  • #17476 router.typeDiff:connector_http_status_code — 457 occurrences
  • #17477 router.valueDiff:response.Err.code — 388 occurrences
  • Sample request-id 019f17c5-1070-77b3-8aa3-da859beb3115, payment pay_Icnc9NBTO4f5unoihyUg.

Root cause (confirmed by code-parity)

When cybersource is treated as a ucs_psync_disabled_connectors member, psync_gateway.rs early-returned Ok(router_data.clone()) before any UCS gRPC call — regardless of ExecutionMode. In shadow mode that means the comparison service receives an unpopulated UCS-side router_data:

  • connector_http_status_code = null — the initial None; no API call was made → typeDiff (number on HS vs null on UCS) → #17476.
  • response.Err.code = "CONNECTOR_ERROR_RESPONSE" — the UCS default placeholder (CONNECTOR_ERROR_RESPONSE_CODE, crates/hyperswitch_interfaces/src/unified_connector_service/transformers.rs:25), never overwritten by a real response.

Meanwhile the HS/Direct side makes the real cybersource call and gets:

  • connector_http_status_code = <real number>
  • response.Err.code = "No error code" (NO_ERROR_CODE, crates/hyperswitch_interfaces/src/consts.rs:7) → valueDiff vs the placeholder → #17477.

Both signatures are therefore artefacts of the shadow-mode short-circuit, not a real connector/transformer divergence.

The fix

Gate the early-return on ExecutionMode != Shadow:

if is_ucs_psync_disabled
    && !matches!(unified_connector_service_execution_mode, ExecutionMode::Shadow)
{
    return Ok(router_data.clone());
}

In shadow mode the gateway now always attempts the UCS gRPC call, so the comparison service receives genuine data on both sides and both diffs disappear. The fix is connector-agnostic (keyed off ucs_psync_disabled_connectors membership + ExecutionMode), so it closes the cybersource/psync facets of #17475 alongside the already-listed stripe/yucca facets.

After (verification)

  • Build: cargo build -p router on the rebased branch → exit 0 (Finished), against connector-service tag 2026.07.02.1. Compiles clean on current main.
  • CI: all compilation + test jobs on this PR are green. The only red check is the repo-wide Spell check job, whose flagged words (IOF, fo, nd) are pre-existing typos on main in files this PR does not touch (api-reference/v1/openapi_spec_v1.json, api-reference/v2/openapi_spec_v2.json, crates/api_models/src/payments.rs, crates/hyperswitch_domain_models/**). This PR's single changed file contains none of them, so the failure is not introduced here.
  • Grafana: prod Loki was queried for the sample request-ids (019f17c5-1070-…, decoded event time 2026-06-30 ~09:03 UTC) — the 06-30 batch is past detailed-log retention (valid cluster-scoped queries return 0; the instance also returned intermittent 500s / read-timeouts). RCA therefore rests on the code-parity trace above, consistent with the source-parity analysis already in the PR body.

No creds or local proxy config included — config/development.toml is excluded and the diff was scanned clean (no api_key/api_secret/certificate/merchant strings).

@10X-GRACE

Contributor

Re-validation for #17476 (router.typeDiff:connector_http_status_code) — 2026-07-03

This PR is the fix for #17476 (cybersource / psync / flowbird, prod, router_diff). Re-validated against today's main.

Root cause (shadow-mode false positive). crates/router/src/core/payments/gateway/psync_gateway.rs early-returned for ucs_psync_disabled_connectors members without checking ExecutionMode. In shadow mode the UCS-side RouterData is therefore never populated, so:

  • connector_http_status_code = null on the UCS leg (initial None, no API call) vs a real integer on the HS/Direct leg → typeDiff (#17476)
  • response.Err.code = "CONNECTOR_ERROR_RESPONSE" placeholder vs "No error code" → valueDiff (#17477, sibling)

The fix gates the disabled-connector early-return on !matches!(mode, ExecutionMode::Shadow), so in shadow mode we still issue the UCS call and the comparison service receives real data from both legs.

Before (prod diff signature, Grafana-confirmed live): the exact 06-30 request IDs are past Loki's ~3-day retention, but the identical signature is still reproducing live for cybersource PSync (code VS_48, "Router data comparison completed - differences found"), e.g. 019f23a3-8059 @ 2026-07-02T16:22:30Z (pay_cmjajBL75CQM6krBAYMi), 019f2403-c1a6 @ 2026-07-02T18:07:37Z:

typeDiff:  connector_http_status_code -> { hyperswitch: "number", ucs: "null" }
valueDiff: response.Err.code          -> { hyperswitch: "No error code", ucs: "CONNECTOR_ERROR_RESPONSE" }

(hyperswitch_data_keys=56, ucs_data_keys=56 — same key count, UCS side present but unpopulated/placeholder — exactly the fingerprint of the un-gated early-return.)
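To make the two signatures concrete, here is a minimal sketch of how a field-level comparison could tell a typeDiff from a valueDiff. It is not the validation service's implementation: `Json`, `Diff`, and `classify` are hypothetical names, and the value model is reduced to the three shapes that appear in the diff above.

```rust
/// Hypothetical, reduced JSON-like value model (assumption for illustration).
#[derive(Clone, Debug, PartialEq)]
pub enum Json {
    Null,
    Number(f64),
    Text(String),
}

impl Json {
    /// Type label as it appears in the diff output above.
    fn type_name(&self) -> &'static str {
        match self {
            Json::Null => "null",
            Json::Number(_) => "number",
            Json::Text(_) => "string",
        }
    }
}

/// Outcome of comparing one field across the two legs.
#[derive(Debug, PartialEq)]
pub enum Diff {
    TypeDiff { hyperswitch: &'static str, ucs: &'static str },
    ValueDiff { hyperswitch: Json, ucs: Json },
}

/// Types are compared first; values are compared only when types agree.
pub fn classify(hs: &Json, ucs: &Json) -> Option<Diff> {
    if hs.type_name() != ucs.type_name() {
        Some(Diff::TypeDiff { hyperswitch: hs.type_name(), ucs: ucs.type_name() })
    } else if hs != ucs {
        Some(Diff::ValueDiff { hyperswitch: hs.clone(), ucs: ucs.clone() })
    } else {
        None
    }
}
```

In this sketch, a numeric status code against `Json::Null` yields a `TypeDiff` with `number` vs `null`, while two different error-code strings yield a `ValueDiff`, matching the shape of the two signatures reported for this path.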

After (this PR). With the ExecutionMode::Shadow gate the UCS shadow psync call runs, so both legs carry the real connector_http_status_code integer and real error code — the typeDiff/valueDiff can no longer be produced by this path. Connector-agnostic fix (also closes the stripe/yucca psync facets).

Verification (2026-07-03):

  • Branch sits directly on current origin/main (d756c177ef) — merge-base == main, no rebase needed.
  • cargo build -p router → exit 0 (1m42s), connector-service tag 2026.07.02.1.
  • CI green except the pre-existing repo-wide Spell check failure (typos in openapi_spec_*.json / api_models/payments.rs — files not touched by this single-file PR, so not introduced here).

Classification: ucs-only (single-repo hyperswitch code fix; not a proto or prism-transformer change). Local end-to-end repro is env-gated (cybersource is not in the checked-in ucs_psync_disabled_connectors default — prod overrides it), so verification rests on code parity + build + CI + the live Grafana signature above. No creds and no config/development.toml in this PR.
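The env-gating mentioned above can be pictured with a small sketch of the membership check. The comma-separated format, `parse_list`, and the free function `is_ucs_psync_disabled` are assumptions for illustration only; the real configuration format and lookup code are not shown in this PR.

```rust
use std::collections::HashSet;

/// Hypothetical parser: turns a comma-separated connector list into a set.
/// The actual config representation is an assumption here.
pub fn parse_list(raw: &str) -> HashSet<String> {
    raw.split(',')
        .map(str::trim)
        .filter(|name| !name.is_empty())
        .map(str::to_owned)
        .collect()
}

/// Hypothetical membership check feeding the early-return condition.
pub fn is_ucs_psync_disabled(disabled: &HashSet<String>, connector: &str) -> bool {
    disabled.contains(connector)
}
```

If the checked-in default list does not contain cybersource, the check is false locally and the early-return path is never taken, which is why a local run would not reproduce the diff without overriding the list.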

@10X-GRACE

Contributor

Also resolves #17477 — router.valueDiff:response.Err.code (validated 2026-07-03)

This PR's single-gate fix in psync_gateway.rs is the same connector-agnostic root cause already documented for #17475/#17476. Confirming it also closes the third facet, #17477 (router.valueDiff:response.Err.code, cybersource/psync/flowbird, prod, 388 occ, Last Seen 2026-07-03T00:16Z).

Root cause (shadow-mode false positive): the early-return for ucs_psync_disabled_connectors had no ExecutionMode check, so in shadow mode the UCS-side RouterData is never populated. The comparison service then sees:

  • response.Err.code = placeholder CONNECTOR_ERROR_RESPONSE (hyperswitch_interfaces/src/unified_connector_service/transformers.rs:25) on the UCS leg, vs the real No error code (hyperswitch_interfaces/src/consts.rs:7, NO_ERROR_CODE) on the HS/Direct leg → this issue #17477.
  • connector_http_status_code = null vs a real number → sibling #17476.

Both are emitted on the same comparison event; gating the early-return on !matches!(unified_connector_service_execution_mode, ExecutionMode::Shadow) makes the UCS PSync call actually execute in shadow mode, so both legs carry real data and neither diff is produced.

Before (issue #17477 signature, grafana-confirmed live in prod Loki 2026-07-01 → 2026-07-02T18:27Z, code VS_48, router_diff, 56/56 keys):

valueDiff: response.Err.code       = { hyperswitch: "No error code", ucs: "CONNECTOR_ERROR_RESPONSE" }
typeDiff:  connector_http_status_code = { hyperswitch: "number",       ucs: "null" }

Live sample req IDs: 019f2415-acbf-7e12-960d-5e08929ac9b4 (pay_utZFQBScJsn30elTscSk, 07-02T18:27:11Z), 019f2403-c1a6-73a2-8411-66abe5c86f87 (07-02T18:07:37Z). The issue's original 06-30 sample IDs are past Loki's ~3d retention; the identical signature reproduces live as shown.

After (fix): UCS shadow PSync call executes → UCS RouterData populated with real connector_http_status_code (number) and real error code (No error code) → response.Err.code valueDiff eliminated, along with the paired connector_http_status_code typeDiff.

Validation (2026-07-03):

  • Branch autofix/cybersource-psync-17476 @ 5e78ded7fc — merge-base == current origin/main d756c177ef (already rebased, no conflicts).
  • cargo build -p router — exit 0 (connector-service tag 2026.07.02.1).
  • CI green except the pre-existing repo-wide Spell check (typos in openapi_spec_*/api_models — not in this PR's single changed file); PR-title and critical-directories spell checks both pass.

Classification: ucs-only (single-repo hyperswitch code fix; not proto, not a prism transformer). No new PR opened — this existing PR covers #17475, #17476, and #17477.



4 participants