
fix(router-core): preserve percent-encoded URL-unsafe chars in decodeSegment #7695

Open
CDillinger wants to merge 1 commit into TanStack:main from CDillinger:fix/decode-path-preserve-unsafe-chars

Conversation

@CDillinger

@CDillinger CDillinger commented Jun 25, 2026


What

decodeSegment (called by decodePath) previously used decodeURI() which decoded all percent-encoded characters — including those that are unsafe in URL paths per the WHATWG spec. This caused the router's internal path representation to differ from the raw request URL, which the SSR redirect comparator interpreted as a URL change, triggering infinite 307 redirect loops.

This PR replaces the decodeURI()-based approach with per-character decoding that preserves:

  • ASCII control characters (0x00-0x1F, 0x7F)
  • The WHATWG URL "path percent-encode set": space, ", <, >, `, {, }

Reproduction

Any TanStack Start app with a path param route will infinite-loop on URLs containing encoded curly braces, angle brackets, etc:

http://localhost:3000/some-route/%7B%7Btemplate%7D%7D
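The mismatch can be reproduced outside the router. The following is a minimal sketch, not router code: it compares the raw pathname with what `decodeURI` returns and with what the WHATWG `URL` parser serializes. The variable names are illustrative only.

```typescript
// Raw pathname as it arrives in the request.
const rawPath = '/some-route/%7B%7Btemplate%7D%7D'

// decodeURI turns %7B and %7D into literal braces.
const decoded = decodeURI(rawPath)

// The WHATWG URL parser percent-encodes braces again when it serializes a path,
// so the decoded form never matches what the browser or server sees.
const reserialized = new URL(decoded, 'http://localhost:3000').pathname

console.log(decoded) // "/some-route/{{template}}"
console.log(decoded === rawPath) // false
console.log(reserialized === rawPath) // true
```

A comparator that checks the decoded form against the raw request path would treat these as different URLs, which is consistent with the redirect loop described here.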

Why this approach

The previous implementation decoded everything in decodeSegment and then tried to fix problems after the fact (sanitizePathSegment stripped control chars, encodePathLikeUrl was supposed to re-encode). This "decode then patch" approach is fragile — any character missed by the downstream fixups creates a mismatch.

The cleaner fix is to not decode these characters in the first place. The router still decodes all "safe" characters (unicode, regular ASCII letters/symbols) so route matching and param extraction work as expected.

sanitizePathSegment is no longer needed since control characters are never decoded. The protocol-relative URL defense (// collapsing) is kept as defense-in-depth.
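For context on the `//` collapsing mentioned above, here is a small sketch of why a leading double slash is dangerous and what collapsing it achieves. `collapseLeadingSlashes` is a hypothetical helper written for this illustration, not the router's actual implementation.

```typescript
// Hypothetical helper: collapse a run of leading slashes into a single slash.
function collapseLeadingSlashes(path: string): string {
  return path.replace(/^\/{2,}/, '/')
}

const base = 'http://localhost:3000'

// A path starting with "//" is parsed as protocol-relative, so the host changes.
const unsafeOrigin = new URL('//evil.com/', base).origin

// After collapsing, the same input resolves against the original origin.
const safeOrigin = new URL(collapseLeadingSlashes('//evil.com/'), base).origin

console.log(unsafeOrigin) // "http://evil.com"
console.log(safeOrigin) // "http://localhost:3000"
```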

Fixes #7587.

Summary by CodeRabbit

  • Bug Fixes

    • Improved handling of encoded URL path characters so special symbols like {}, <>, quotes, and control characters stay safely encoded.
    • Prevented redirect loops and unexpected path changes for encoded routes, including cases like /%7B%7Btemplate%7D%7D.
    • Preserved same-origin navigation while avoiding protocol-relative URL confusion for %0d and similar inputs.
  • Tests

    • Added and updated end-to-end coverage for special characters and open-redirect prevention.

@coderabbitai

coderabbitai Bot commented Jun 25, 2026

Contributor

Walkthrough

sanitizePathSegment is rewritten to percent-encode URL-unsafe characters (ASCII control chars and < > " ` { }) instead of stripping them, preventing infinite 307 redirect loops. Unit tests and e2e specs are updated to match the new behavior where these characters remain encoded rather than being decoded or collapsed.

Changes

decodeSegment unsafe-char preservation fix

| Layer / File(s) | Summary |
| --- | --- |
| sanitizePathSegment re-encoding logic: packages/router-core/src/utils.ts | Replaces the control-char-stripping regex with PATH_UNSAFE_RE, which re-encodes unsafe chars as %XX; updates decodePath comments to defense-in-depth framing. |
| Unit tests for updated decodePath: packages/router-core/tests/utils.test.ts | Updates open-redirect expectations so %0d/%0a/%00 remain encoded with handledProtocolRelativeURL: false; adds a new describe block asserting WHATWG unsafe chars stay encoded while safe sequences still decode. |
| E2e open-redirect and special-char tests: e2e/react-router/.../open-redirect-prevention.spec.ts, e2e/react-start/.../open-redirect-prevention.spec.ts, e2e/react-start/.../special-characters.spec.ts | Removes collapsed-pathname assertions from open-redirect tests; adds a curly-brace ({{app_name}}) infinite-redirect regression test. |
| Changeset: .changeset/fix-decode-path-preserve-unsafe.md | Adds patch changeset for @tanstack/router-core documenting the fix. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

package: router-core

Poem

🐇 Hop hop, the curly braces came,
Percent-encoded, playing their game.
No more redirect loop in sight,
%7B%7D held encoded tight.
The rabbit cheers—no ERR_TOO_MANY tonight! 🎉

🚥 Pre-merge checks | ✅ 5 passed

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main fix in router-core: preserving percent-encoded unsafe path characters in decodeSegment. |
| Linked Issues check | ✅ Passed | The changes address #7587 by preventing redirect loops for encoded unsafe pathname characters and adding coverage for the affected cases. |
| Out of Scope Changes check | ✅ Passed | The changes are scoped to the redirect-loop fix, supporting tests, and a changeset entry with no unrelated functionality added. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

@nx-cloud

nx-cloud Bot commented Jun 25, 2026

Contributor

View your CI Pipeline Execution ↗ for commit 3f90992

| Command | Status | Duration | Result |
| --- | --- | --- | --- |
| nx affected --targets=test:eslint,test:unit,tes... | ✅ Succeeded | 5m 4s | View ↗ |
| nx run-many --target=build --exclude=examples/*... | ✅ Succeeded | 24s | View ↗ |

☁️ Nx Cloud last updated this comment at 2026-06-26 21:40:25 UTC

@pkg-pr-new

pkg-pr-new Bot commented Jun 25, 2026

More templates

@tanstack/arktype-adapter

npm i https://pkg.pr.new/@tanstack/arktype-adapter@7695

@tanstack/eslint-plugin-router

npm i https://pkg.pr.new/@tanstack/eslint-plugin-router@7695

@tanstack/eslint-plugin-start

npm i https://pkg.pr.new/@tanstack/eslint-plugin-start@7695

@tanstack/history

npm i https://pkg.pr.new/@tanstack/history@7695

@tanstack/nitro-v2-vite-plugin

npm i https://pkg.pr.new/@tanstack/nitro-v2-vite-plugin@7695

@tanstack/react-router

npm i https://pkg.pr.new/@tanstack/react-router@7695

@tanstack/react-router-devtools

npm i https://pkg.pr.new/@tanstack/react-router-devtools@7695

@tanstack/react-router-ssr-query

npm i https://pkg.pr.new/@tanstack/react-router-ssr-query@7695

@tanstack/react-start

npm i https://pkg.pr.new/@tanstack/react-start@7695

@tanstack/react-start-client

npm i https://pkg.pr.new/@tanstack/react-start-client@7695

@tanstack/react-start-rsc

npm i https://pkg.pr.new/@tanstack/react-start-rsc@7695

@tanstack/react-start-server

npm i https://pkg.pr.new/@tanstack/react-start-server@7695

@tanstack/router-cli

npm i https://pkg.pr.new/@tanstack/router-cli@7695

@tanstack/router-core

npm i https://pkg.pr.new/@tanstack/router-core@7695

@tanstack/router-devtools

npm i https://pkg.pr.new/@tanstack/router-devtools@7695

@tanstack/router-devtools-core

npm i https://pkg.pr.new/@tanstack/router-devtools-core@7695

@tanstack/router-generator

npm i https://pkg.pr.new/@tanstack/router-generator@7695

@tanstack/router-plugin

npm i https://pkg.pr.new/@tanstack/router-plugin@7695

@tanstack/router-ssr-query-core

npm i https://pkg.pr.new/@tanstack/router-ssr-query-core@7695

@tanstack/router-utils

npm i https://pkg.pr.new/@tanstack/router-utils@7695

@tanstack/router-vite-plugin

npm i https://pkg.pr.new/@tanstack/router-vite-plugin@7695

@tanstack/solid-router

npm i https://pkg.pr.new/@tanstack/solid-router@7695

@tanstack/solid-router-devtools

npm i https://pkg.pr.new/@tanstack/solid-router-devtools@7695

@tanstack/solid-router-ssr-query

npm i https://pkg.pr.new/@tanstack/solid-router-ssr-query@7695

@tanstack/solid-start

npm i https://pkg.pr.new/@tanstack/solid-start@7695

@tanstack/solid-start-client

npm i https://pkg.pr.new/@tanstack/solid-start-client@7695

@tanstack/solid-start-server

npm i https://pkg.pr.new/@tanstack/solid-start-server@7695

@tanstack/start-client-core

npm i https://pkg.pr.new/@tanstack/start-client-core@7695

@tanstack/start-fn-stubs

npm i https://pkg.pr.new/@tanstack/start-fn-stubs@7695

@tanstack/start-plugin-core

npm i https://pkg.pr.new/@tanstack/start-plugin-core@7695

@tanstack/start-server-core

npm i https://pkg.pr.new/@tanstack/start-server-core@7695

@tanstack/start-static-server-functions

npm i https://pkg.pr.new/@tanstack/start-static-server-functions@7695

@tanstack/start-storage-context

npm i https://pkg.pr.new/@tanstack/start-storage-context@7695

@tanstack/valibot-adapter

npm i https://pkg.pr.new/@tanstack/valibot-adapter@7695

@tanstack/virtual-file-routes

npm i https://pkg.pr.new/@tanstack/virtual-file-routes@7695

@tanstack/vue-router

npm i https://pkg.pr.new/@tanstack/vue-router@7695

@tanstack/vue-router-devtools

npm i https://pkg.pr.new/@tanstack/vue-router-devtools@7695

@tanstack/vue-router-ssr-query

npm i https://pkg.pr.new/@tanstack/vue-router-ssr-query@7695

@tanstack/vue-start

npm i https://pkg.pr.new/@tanstack/vue-start@7695

@tanstack/vue-start-client

npm i https://pkg.pr.new/@tanstack/vue-start-client@7695

@tanstack/vue-start-server

npm i https://pkg.pr.new/@tanstack/vue-start-server@7695

@tanstack/zod-adapter

npm i https://pkg.pr.new/@tanstack/zod-adapter@7695

commit: 3f90992

@codspeed-hq

codspeed-hq Bot commented Jun 25, 2026


Merging this PR will degrade performance by 8.91%

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 2 improved benchmarks
❌ 5 regressed benchmarks
✅ 137 untouched benchmarks

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
| --- | --- | --- | --- | --- |
| Memory | mem serialization-payload (vue) | 6.8 MB | 9.5 MB | -28.36% |
| Memory | mem serialization-payload (solid) | 6.8 MB | 9.1 MB | -24.84% |
| Memory | mem aborted-requests (solid) | 2.4 MB | 2.9 MB | -17.38% |
| Memory | mem serialization-payload (react) | 31.8 MB | 33.4 MB | -4.8% |
| Memory | mem request-churn (solid) | 1.1 MB | 1.2 MB | -3.21% |
| Memory | mem peak-large-page (solid) | 3.9 MB | 3.4 MB | +14.5% |
| Memory | mem aborted-requests (vue) | 1,021 KB | 920.7 KB | +10.88% |

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.


Comparing CDillinger:fix/decode-path-preserve-unsafe-chars (3f90992) with main (ba52d2b) [1]

Open in CodSpeed

Footnotes

  1. No successful run was found on main (bb2daa6) during the generation of this report, so ba52d2b was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

nx-cloud[bot]

This comment was marked as outdated.

@CDillinger CDillinger force-pushed the fix/decode-path-preserve-unsafe-chars branch from 782242e to 7123395 Compare June 26, 2026 02:12
@nlynzaad

nlynzaad commented Jun 26, 2026

Contributor

There are two issues with the current approach.

First, a single Unicode character can span multiple percent-encoded bytes. For example:

'ш' // %D1%88
'🚀' // %F0%9F%9A%80

Decoding each %XX group independently will not handle these correctly. We need to process adjacent percent-encoded bytes as a run, decode the smallest valid UTF-8 sequence, then continue with the remainder.

This also needs to work when safe and unsafe characters are adjacent. For example:

'🚀@' // %F0%9F%9A%80%40

Here, 🚀 should decode, while @ must remain %40.
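The run-based decoding described above could look roughly like the following. This is a sketch under assumptions, not the code proposed for the PR: `KEEP_ENCODED` is a deliberately small illustrative set, and `utf8Length`, `decodeRun`, and `decodePathRuns` are hypothetical names. It uses the standard `TextDecoder` in fatal mode to validate each candidate UTF-8 sequence.

```typescript
// Assumption: a small illustrative keep-encoded set, not the router's real one.
// eslint-disable-next-line no-control-regex
const KEEP_ENCODED = /^[\x00-\x1F\x7F "#%<>?@`{}]$/

// fatal: throw on invalid UTF-8; ignoreBOM: do not silently drop U+FEFF.
const strictUtf8 = new TextDecoder('utf-8', { fatal: true, ignoreBOM: true })

// Length of the UTF-8 sequence that starts with `lead`, or 0 if it cannot start one.
function utf8Length(lead: number): number {
  if (lead < 0x80) return 1
  if (lead >= 0xc2 && lead <= 0xdf) return 2
  if (lead >= 0xe0 && lead <= 0xef) return 3
  if (lead >= 0xf0 && lead <= 0xf4) return 4
  return 0
}

// Decode one contiguous run of %XX groups, one UTF-8 sequence at a time.
function decodeRun(run: string): string {
  const bytes = run.slice(1).split('%').map((h) => parseInt(h, 16))
  let out = ''
  let i = 0
  while (i < bytes.length) {
    const len = utf8Length(bytes[i])
    let char: string | undefined = undefined
    if (len > 0 && i + len <= bytes.length) {
      try {
        char = strictUtf8.decode(new Uint8Array(bytes.slice(i, i + len)))
      } catch {
        char = undefined // invalid sequence, handled below
      }
    }
    if (char === undefined) {
      // Malformed byte: keep its original %XX text and continue with the next byte.
      out += run.slice(i * 3, (i + 1) * 3)
      i += 1
    } else if (KEEP_ENCODED.test(char)) {
      // Only single-byte ASCII characters are in the set, so len is 1 here.
      out += run.slice(i * 3, (i + 1) * 3)
      i += len
    } else {
      out += char
      i += len
    }
  }
  return out
}

function decodePathRuns(path: string): string {
  return path.replace(/(?:%[0-9A-Fa-f]{2})+/g, decodeRun)
}

console.log(decodePathRuns('/%F0%9F%9A%80%40')) // "/🚀%40"
```

Kept and malformed bytes are copied from the input rather than re-generated, so their original hex casing is preserved.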

decodeURI largely handled router-reserved characters for us by leaving them encoded. With decodeURIComponent, those characters are decoded, so we need to explicitly preserve any characters that are unsafe or meaningful to routing. Processing each decoded unit individually is also important to avoid URL-poisoning behaviour.

Based on the current test suite, I compiled the following exclusion set:

const PATH_KEEP_ENCODED = /^[\x00-\x1F\x7F\x20"#$%&+,/:;<=>?@`^\\{}]$/

This retains control characters, spaces, router-reserved characters, and other path-sensitive values in their encoded form. It intentionally does not exclude the full component percent-encode set, as doing so would also preserve characters such as [, ], and |, which would be a breaking change.
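To make the boundaries of that set concrete, the sketch below runs the quoted regex against a handful of single characters. The regex is copied from the comment above; `staysEncoded` and the sample list are illustrative additions.

```typescript
// Exclusion set copied verbatim from the comment above.
// eslint-disable-next-line no-control-regex
const PATH_KEEP_ENCODED = /^[\x00-\x1F\x7F\x20"#$%&+,/:;<=>?@`^\\{}]$/

// Hypothetical helper: true when a decoded character must remain percent-encoded.
function staysEncoded(ch: string): boolean {
  return PATH_KEEP_ENCODED.test(ch)
}

const samples = ['@', '/', ' ', '{', '\r', '[', ']', '|', 'ш']
for (const ch of samples) {
  console.log(JSON.stringify(ch), staysEncoded(ch))
}
// "@", "/", " ", "{" and "\r" report true; "[", "]", "|" and "ш" report false.
```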

The remaining test differences are expected:

  • %20 now remains encoded rather than becoming a literal space.
  • Control characters are retained as encoded values rather than dropped.

I have an update to the PR ready to handle this. Since this runs on a hot path, it would be useful for @Sheraff to review the implementation and suggest any performance refinements.

@CDillinger CDillinger force-pushed the fix/decode-path-preserve-unsafe-chars branch 2 times, most recently from 4fca828 to 00296a3 Compare June 26, 2026 02:48
nx-cloud[bot]

This comment was marked as outdated.

@Sheraff

Sheraff commented Jun 26, 2026

Collaborator

This is the fastest I got (so far) that passes the new tests and the old ones:

function decodeSegment(segment: string): string {
  if (segment.indexOf('%') !== -1) {
    try {
      return decodeURI(segment)
    } catch {}
  }
  return segment
}

// ...

  // Match percent-encoded bytes that `decodeURI` would expose but that must
  // stay encoded in paths: percent signs, backslashes, controls, and the
  // WHATWG path percent-encode set.
  const re = /%(?:[01][\dA-F]|2[025]|3[CE]|5C|60|7[BDF])/gi
  let cursor = 0
  let result = ''
  let match
  while (null !== (match = re.exec(path))) {
    result += decodeSegment(path.slice(cursor, match.index)) + match[0]
    cursor = re.lastIndex
  }
  if (cursor) {
    result += decodeSegment(path.slice(cursor))
    // eslint-disable-next-line no-control-regex
    if (/[\x00-\x1f\x7f]/.test(path)) {
      result = sanitizePathSegment(result)
    }
  } else {
    result = sanitizePathSegment(decodeSegment(path))
  }

But I think correctness matters more here, so @nlynzaad and @CDillinger, please make sure the tests cover what we need and that we have a working version; we can merge as soon as that is ready. I can work on perf afterwards.

BTW: the Bundle Size and Labeler workflows are expected to break on forks, but PR/Test should pass.
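For readers who want to run the split-and-decode idea in isolation, here is a self-contained sketch built around the regex from the snippet above. `decodeKeepingUnsafe` is a hypothetical wrapper name, and the `sanitizePathSegment` step from the original snippet is intentionally left out, so this is not a drop-in replacement.

```typescript
// Regex copied from the snippet above: escapes that must stay encoded in paths.
const KEEP_ENCODED_RE = /%(?:[01][\dA-F]|2[025]|3[CE]|5C|60|7[BDF])/gi

function decodeSegment(segment: string): string {
  if (segment.indexOf('%') !== -1) {
    try {
      return decodeURI(segment)
    } catch {
      // Malformed escape: fall through and return the segment unchanged.
    }
  }
  return segment
}

// Hypothetical wrapper: decode everything between protected escapes and copy
// the protected escapes through untouched.
function decodeKeepingUnsafe(path: string): string {
  const re = new RegExp(KEEP_ENCODED_RE.source, 'gi') // fresh lastIndex per call
  let cursor = 0
  let result = ''
  let match: RegExpExecArray | null
  while ((match = re.exec(path)) !== null) {
    result += decodeSegment(path.slice(cursor, match.index)) + match[0]
    cursor = re.lastIndex
  }
  return result + decodeSegment(path.slice(cursor))
}

console.log(decodeKeepingUnsafe('/a%7Bb%7D/%D1%88')) // "/a%7Bb%7D/ш"
```

Every escape the regex protects is below %80, so the split points never fall inside a multi-byte UTF-8 sequence, and `decodeURI` still sees each multi-byte run whole.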

@CDillinger CDillinger force-pushed the fix/decode-path-preserve-unsafe-chars branch from 00296a3 to dd05e5d Compare June 26, 2026 16:22

@nx-cloud nx-cloud Bot left a comment

Contributor


Important

At least one additional CI pipeline execution has run since the conclusion below was written and it may no longer be applicable.

Nx Cloud is proposing a fix for your failed CI:

We updated the open-redirect e2e test in react-router/basic-file-based to align with the new decodeSegment behavior introduced by this PR. The stale assertion (expect(url.pathname).toMatch(/^\/test-path\/?$/)) assumed the old "strip CR then collapse //" approach, but %0d is now kept encoded so the path resolves to /%0D/test-path rather than /test-path. This mirrors the identical fix already applied to the equivalent react-start/basic test in the PR.

Tip

We verified this fix by re-running tanstack-router-e2e-react-basic-file-based:test:e2e, tanstack-react-start-e2e-basic:test:e2e--rsbuild-prerender.

diff --git a/e2e/react-router/basic-file-based/tests/open-redirect-prevention.spec.ts b/e2e/react-router/basic-file-based/tests/open-redirect-prevention.spec.ts
index 3ad83fb4..2f0fe256 100644
--- a/e2e/react-router/basic-file-based/tests/open-redirect-prevention.spec.ts
+++ b/e2e/react-router/basic-file-based/tests/open-redirect-prevention.spec.ts
@@ -69,10 +69,7 @@ test.describe('Open redirect prevention', () => {
       page,
       baseURL,
     }) => {
-      // When control characters are stripped from paths like /%0d/evil.com/
-      // the result could be //evil.com/ which is a protocol-relative URL
-      // Our fix collapses these to /evil.com/ to prevent external redirects
-      // This is already tested above, but we verify the collapsed path works
+      // %0d is kept encoded, so /%0d/test-path/ stays as-is and won't become //test-path/
       await page.goto('/%0d/test-path/')
       await page.waitForLoadState('networkidle')
 
@@ -80,8 +77,6 @@ test.describe('Open redirect prevention', () => {
       expect(page.url().startsWith(baseURL!)).toBe(true)
       const url = new URL(page.url())
       expect(url.origin).toBe(new URL(baseURL!).origin)
-      // Path should be collapsed to /test-path (not //test-path/)
-      expect(url.pathname).toMatch(/^\/test-path\/?$/)
     })
   })
 

Because this branch comes from a fork, it is not possible for us to apply fixes directly, but you can apply the changes locally using the available options below.

Apply changes locally with:

npx nx-cloud apply-locally N4iC-eU6c


…Segment

Replace sanitizePathSegment (which stripped control characters) with a
re-encode step that keeps WHATWG path percent-encode set characters and
control characters in their encoded form after decodeURI.

This preserves the existing decodeURI-based approach which correctly
handles multi-byte UTF-8 sequences, while fixing the mismatch between
the original request URL and the router's internal representation that
caused infinite 307 redirect loops on paths containing these characters.

Fixes TanStack#7587.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@CDillinger CDillinger force-pushed the fix/decode-path-preserve-unsafe-chars branch from dd05e5d to 3f90992 Compare June 26, 2026 21:01
@CDillinger CDillinger marked this pull request as ready for review June 29, 2026 16:41

@coderabbitai coderabbitai Bot left a comment

Contributor


Actionable comments posted: 3

🧹 Nitpick comments (1)
e2e/react-start/basic/tests/open-redirect-prevention.spec.ts (1)

80-87: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Assert the encoded pathname here as well.

The new comment says /%0d/test-path/ “stays as-is”, but this test now only checks same-origin. A same-origin rewrite to /test-path/ would still pass and miss the regression this PR is trying to lock down. Please add an exact page.url() or pathname assertion for the preserved encoded path.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@e2e/react-start/basic/tests/open-redirect-prevention.spec.ts` around lines 80
- 87, The open-redirect prevention check in the test around page.goto and
page.url() only verifies same-origin, which would still allow a rewrite from
/%0d/test-path/ to /test-path/. Add an exact assertion on the preserved encoded
pathname using the existing page.url() or the URL pathname so the test
explicitly confirms the encoded path remains unchanged after navigation.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.changeset/fix-decode-path-preserve-unsafe.md:
- Around line 5-7: The changeset summary for `decodeSegment` is inaccurate and
should be revised to match the implementation. Update the description to say
that `decodeSegment` still primarily uses `decodeURI` with a per-character
fallback, and that the actual fix is re-encoding URL-unsafe characters via
`sanitizePathSegment` instead of stripping control characters. Keep the rest of
the explanation aligned with the infinite redirect loop issue and preserve the
references to `decodeSegment` and `sanitizePathSegment`.

In `@packages/router-core/src/utils.ts`:
- Around line 530-543: The path sanitization in utils.ts still leaves spaces
decoded, so internal router paths can diverge from the raw request URL and cause
SSR/router mismatches. Update sanitizePathSegment and the PATH_UNSAFE_RE
contract to also re-encode space characters (along with the existing unsafe
bytes), then adjust the related unit expectation in the path handling tests so
"%20" is preserved consistently.
- Around line 541-544: The fallback in sanitizePathSegment still fails on mixed
malformed and valid UTF-8 because it decodes %XX byte-by-byte after decodeURI
throws, leaving later multibyte runs incorrectly encoded. Update the decoding
logic in sanitizePathSegment (and its PATH_UNSAFE_RE-based fallback path) to
process contiguous valid percent-encoded runs as a unit, preserve valid decoded
bytes, and continue past malformed bytes instead of falling back to per-byte
behavior.

---

Nitpick comments:
In `@e2e/react-start/basic/tests/open-redirect-prevention.spec.ts`:
- Around line 80-87: The open-redirect prevention check in the test around
page.goto and page.url() only verifies same-origin, which would still allow a
rewrite from /%0d/test-path/ to /test-path/. Add an exact assertion on the
preserved encoded pathname using the existing page.url() or the URL pathname so
the test explicitly confirms the encoded path remains unchanged after
navigation.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 581645e2-8486-411b-90eb-d785e8eb775c

📥 Commits

Reviewing files that changed from the base of the PR and between bb2daa6 and 3f90992.

📒 Files selected for processing (6)
  • .changeset/fix-decode-path-preserve-unsafe.md
  • e2e/react-router/basic-file-based/tests/open-redirect-prevention.spec.ts
  • e2e/react-start/basic/tests/open-redirect-prevention.spec.ts
  • e2e/react-start/basic/tests/special-characters.spec.ts
  • packages/router-core/src/utils.ts
  • packages/router-core/tests/utils.test.ts

Comment on lines +5 to +7
fix(router-core): preserve percent-encoded URL-unsafe characters in `decodeSegment` to prevent infinite redirect loops

`decodeSegment` now uses per-character decoding instead of `decodeURI`, preserving characters in the WHATWG URL "path percent-encode set" (`<`, `>`, `"`, `` ` ``, `{`, `}`) and ASCII control characters in their percent-encoded form. This prevents mismatches between the original URL and the router's internal representation that previously caused infinite 307 redirect loops on paths containing these characters (e.g. `/%7B%7Btemplate%7D%7D`).

Contributor


📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Correct the changeset description to match the actual implementation.

The description states "decodeSegment now uses per-character decoding instead of decodeURI", but the implementation still uses decodeURI as the primary decoder (with fallback to per-character decoding on failure), then re-encodes unsafe characters via sanitizePathSegment. The key change is the replacement of control-char stripping with re-encoding of URL-unsafe characters, not the removal of decodeURI.

Please revise to accurately describe the fix, e.g.:

decodeSegment now re-encodes URL-unsafe characters in the WHATWG URL "path percent-encode set" (<, >, ", `, {, }) and ASCII control characters after decoding, keeping them in percent-encoded form. This prevents mismatches...
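The "decode, then re-encode" flow that this suggested wording describes can be sketched as follows. The character set mirrors `PATH_UNSAFE_RE` from the diff under review; `reencodeUnsafe` and `decodeThenReencode` are hypothetical names, and the error handling is simplified to "keep the raw path" rather than the per-character fallback the PR reportedly uses.

```typescript
// Same character class as PATH_UNSAFE_RE in the diff under review.
// eslint-disable-next-line no-control-regex
const UNSAFE_RE = /[\x00-\x1f\x7f"<>`{}]/g

// Re-encode each unsafe character as an uppercase %XX escape.
function reencodeUnsafe(segment: string): string {
  return segment.replace(
    UNSAFE_RE,
    (ch) => '%' + ch.charCodeAt(0).toString(16).toUpperCase().padStart(2, '0'),
  )
}

// Hypothetical wrapper: decode first, then restore the unsafe characters.
function decodeThenReencode(path: string): string {
  let decoded = path
  try {
    decoded = decodeURI(path)
  } catch {
    // Simplification for this sketch: on malformed input keep the raw path.
  }
  return reencodeUnsafe(decoded)
}

console.log(decodeThenReencode('/%7B%7Btemplate%7D%7D')) // "/%7B%7Btemplate%7D%7D"
console.log(decodeThenReencode('/%D1%88')) // "/ш"
console.log(decodeThenReencode('/%0d/test-path')) // "/%0D/test-path"
```

Because the escapes are regenerated, lowercase input such as `%0d` comes back as `%0D`, which matches the `/%0D/test-path` result mentioned in the Nx Cloud comment.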

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.changeset/fix-decode-path-preserve-unsafe.md around lines 5 - 7, The
changeset summary for `decodeSegment` is inaccurate and should be revised to
match the implementation. Update the description to say that `decodeSegment`
still primarily uses `decodeURI` with a per-character fallback, and that the
actual fix is re-encoding URL-unsafe characters via `sanitizePathSegment`
instead of stripping control characters. Keep the rest of the explanation
aligned with the infinite redirect loop issue and preserve the references to
`decodeSegment` and `sanitizePathSegment`.

Comment on lines +530 to +543
+ * Space (0x20) is intentionally excluded — decodeURI decodes %20 to space
+ * and the router stores decoded spaces in location.pathname. The existing
+ * encodePathLikeUrl already handles re-encoding spaces for outgoing URLs.
+ *
+ * These characters are decoded by decodeURI but must remain percent-encoded
+ * in paths to match how upstream layers (CDNs, edge middleware, browsers)
+ * interpret the URL, preventing infinite redirect loops and path mismatches.
+ */
+// eslint-disable-next-line no-control-regex
+const PATH_UNSAFE_RE = /[\x00-\x1f\x7f"<>`{}]/g
+
 function sanitizePathSegment(segment: string): string {
-  // Remove ASCII control characters (0x00-0x1F) and DEL (0x7F)
-  // These include CR (\r = 0x0D), LF (\n = 0x0A), and other potentially dangerous characters
-  // eslint-disable-next-line no-control-regex
-  return segment.replace(/[\x00-\x1f\x7f]/g, '')
+  return segment.replace(PATH_UNSAFE_RE, (ch) =>
+    '%' + ch.charCodeAt(0).toString(16).toUpperCase().padStart(2, '0'),

Contributor


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Preserve %20 in the internal path too.

Line 530 intentionally keeps %20 decoded to a literal space, but the stated contract for this fix is to keep encoded path-unsafe bytes aligned with the raw request URL. That means paths like /file%20name can still diverge during SSR/router comparisons, which is the same mismatch class this patch is trying to remove. Please include space in the re-encode set and update the unit expectation at Line 630 with it.
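The effect of this suggestion can be shown with two variants of the re-encode set. This is an illustrative sketch only: `reencodeWith` is a hypothetical helper, it assumes well-formed input (it does not catch `decodeURI` errors), and whether adding space is the right trade-off is for the maintainers to decide.

```typescript
// Hypothetical helper: decode, then re-encode whatever `unsafe` matches.
function reencodeWith(unsafe: RegExp, path: string): string {
  return decodeURI(path).replace(
    unsafe,
    (ch) => '%' + ch.charCodeAt(0).toString(16).toUpperCase().padStart(2, '0'),
  )
}

// Set from the diff: controls, DEL, and " < > ` { } (space excluded).
// eslint-disable-next-line no-control-regex
const WITHOUT_SPACE = /[\x00-\x1f\x7f"<>`{}]/g

// Variant suggested in this comment: the range is extended to include 0x20.
// eslint-disable-next-line no-control-regex
const WITH_SPACE = /[\x00-\x20\x7f"<>`{}]/g

console.log(reencodeWith(WITHOUT_SPACE, '/file%20name')) // "/file name"
console.log(reencodeWith(WITH_SPACE, '/file%20name')) // "/file%20name"
```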

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/router-core/src/utils.ts` around lines 530 - 543, The path
sanitization in utils.ts still leaves spaces decoded, so internal router paths
can diverge from the raw request URL and cause SSR/router mismatches. Update
sanitizePathSegment and the PATH_UNSAFE_RE contract to also re-encode space
characters (along with the existing unsafe bytes), then adjust the related unit
expectation in the path handling tests so "%20" is preserved consistently.

Comment on lines 541 to +544
 function sanitizePathSegment(segment: string): string {
-  // Remove ASCII control characters (0x00-0x1F) and DEL (0x7F)
-  // These include CR (\r = 0x0D), LF (\n = 0x0A), and other potentially dangerous characters
-  // eslint-disable-next-line no-control-regex
-  return segment.replace(/[\x00-\x1f\x7f]/g, '')
+  return segment.replace(PATH_UNSAFE_RE, (ch) =>
+    '%' + ch.charCodeAt(0).toString(16).toUpperCase().padStart(2, '0'),
+  )

Contributor


🎯 Functional Correctness | 🟠 Major | 🏗️ Heavy lift

Mixed malformed + valid UTF-8 sequences are still broken.

This re-encode step only helps after decodeURI(segment) succeeds. If one malformed escape makes that throw, the fallback still decodes %XX byte-by-byte, so a later valid multibyte run in the same segment stays incorrectly encoded instead of being decoded and preserved. That misses the “decode contiguous runs and continue” requirement from the review thread and can still change route matching/param extraction on mixed-validity paths.
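One simple way to get the "continue past malformed bytes" behavior is to retry `decodeURI` on each contiguous run of escapes when the whole-segment decode throws. The sketch below is an assumption about how that could look, not the PR's code; `decodeTolerant` is a hypothetical name. It is coarser than a full byte-level decoder: a malformed byte that sits directly next to valid bytes keeps that entire run encoded.

```typescript
// Hypothetical tolerant decoder: fast path first, per-run retry on failure.
function decodeTolerant(segment: string): string {
  try {
    return decodeURI(segment)
  } catch {
    // Decode each contiguous run of %XX escapes independently so that one
    // malformed run does not prevent later valid runs from decoding.
    return segment.replace(/(?:%[0-9A-Fa-f]{2})+/g, (run) => {
      try {
        return decodeURI(run)
      } catch {
        return run // malformed run: leave it exactly as received
      }
    })
  }
}

console.log(decodeTolerant('%FF-%D1%88')) // "%FF-ш"
```

This sketch only covers decoding; re-encoding of unsafe characters would still need to run on the result.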

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/router-core/src/utils.ts` around lines 541 - 544, The fallback in
sanitizePathSegment still fails on mixed malformed and valid UTF-8 because it
decodes %XX byte-by-byte after decodeURI throws, leaving later multibyte runs
incorrectly encoded. Update the decoding logic in sanitizePathSegment (and its
PATH_UNSAFE_RE-based fallback path) to process contiguous valid percent-encoded
runs as a unit, preserve valid decoded bytes, and continue past malformed bytes
instead of falling back to per-byte behavior.


Development

Successfully merging this pull request may close these issues.

Infinite redirect loop ("ERR_TOO_MANY_REDIRECTS") caused by encoded unsafe characters in URL pathname

3 participants