http: percent-decode should follow WHATWG for invalid % sequences

## Problem

`url_decode` / `path_decode` treat any `%` followed by two characters as a hex escape and decode it with `hex_to_byte()`. That has two failure modes:

1. **Historical bug:** every letter in `a`–`z` / `A`–`Z` was accepted as a hex “digit”, so e.g. `g` was mapped to 16 (`g - 'a' + 10`), which is outside the valid nibble range 0–15. Invalid sequences such as `%2g` were therefore decoded into **wrong bytes**, as if they were deliberate encodings rather than garbage.

2. **After narrowing to `a`–`f`:** non-hex characters (e.g. `z`) fall through to `c - '0'`, which still does **not** yield a hex nibble; it yields an arbitrary value (e.g. `'z' - '0' = 74`). So invalid input still decodes to **wrong bytes**, just different ones.

Neither matches the URL Standard.
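To make the arithmetic in the two failure modes concrete, here is a small Python sketch. Both functions are hypothetical reconstructions of the nibble logic described above, not the project's actual `hex_to_byte()` source; the names `nibble_historical` and `nibble_narrowed` are invented for illustration.

```python
def nibble_historical(c: str) -> int:
    """Hypothetical reconstruction: every ASCII letter is accepted as a hex digit."""
    if "a" <= c <= "z":
        return ord(c) - ord("a") + 10
    if "A" <= c <= "Z":
        return ord(c) - ord("A") + 10
    return ord(c) - ord("0")


def nibble_narrowed(c: str) -> int:
    """Hypothetical reconstruction: letters narrowed to a-f, no validation otherwise."""
    if "a" <= c <= "f":
        return ord(c) - ord("a") + 10
    if "A" <= c <= "F":
        return ord(c) - ord("A") + 10
    # Reached for any other character, not only for '0'-'9'.
    return ord(c) - ord("0")


print(nibble_historical("g"))  # 16, outside the valid nibble range 0-15
print(nibble_narrowed("z"))    # 74, also outside the valid nibble range
print(nibble_narrowed("f"))    # 15, a valid nibble
```

In both variants the function returns a number for every input, so the caller has no way to tell a valid hex digit from garbage.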

## Expected behavior

[WHATWG URL: *Percent-encoded bytes*](https://url.spec.whatwg.org/#percent-encoded-bytes): a percent-encoded byte is `%` followed by two **ASCII hex digits**. When a `%` is not followed by two ASCII hex digits, the decoder **appends only that `%`** and continues with the next byte, so the following characters are copied through unchanged (see the spec’s example: `%25%s%1G` → `%%s%1G`).

## Suggested fix

- Only decode when the next two code points are ASCII hex digits.
- Otherwise emit a literal `%` (and do not consume the following characters as hex).
- A trailing `%` or `%X` (fewer than two bytes after the `%`) should not fail the whole decode; handle it as the spec does and emit the `%` literally.
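The three rules above can be sketched as a single loop. This is a minimal Python illustration of the behavior, assuming byte-string input and output; `percent_decode` and `HEX_DIGITS` are names chosen for this example and are not the project's API.

```python
HEX_DIGITS = b"0123456789abcdefABCDEF"


def percent_decode(data: bytes) -> bytes:
    """Decode %XX escapes; copy any '%' not followed by two hex digits as-is."""
    out = bytearray()
    i = 0
    n = len(data)
    while i < n:
        byte = data[i]
        if (
            byte == 0x25  # '%'
            and i + 2 < n  # two more bytes are available
            and data[i + 1] in HEX_DIGITS
            and data[i + 2] in HEX_DIGITS
        ):
            # Valid escape: emit the decoded byte and skip the two hex digits.
            out.append(int(data[i + 1 : i + 3].decode("ascii"), 16))
            i += 3
        else:
            # Ordinary byte, or a '%' that does not start a valid escape.
            # Only this one byte is consumed, so what follows is re-examined.
            out.append(byte)
            i += 1
    return bytes(out)


print(percent_decode(b"%25%s%1G"))  # b'%%s%1G'
print(percent_decode(b"%2g"))       # b'%2g'
print(percent_decode(b"abc%"))      # b'abc%'
```

Because validation happens before conversion, the hex conversion never sees a non-hex character, and a truncated escape at the end of the input takes the same path as any other invalid sequence.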

Regression tests should lock the WHATWG example and cases like `%2g` → literal `%2g` (not a bogus byte).
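One possible shape for those regression tests is a table of input and expected-output pairs. The Python sketch below is illustrative only: `url_decode_stand_in` is a hypothetical regex-based stand-in for the real `url_decode`, included so the snippet runs on its own, and the cases beyond the WHATWG example and `%2g` are suggestions rather than an agreed list.

```python
import re

# Matches '%' followed by exactly two ASCII hex digits.
_ESCAPE = re.compile(rb"%([0-9A-Fa-f]{2})")


def url_decode_stand_in(data: bytes) -> bytes:
    """Hypothetical stand-in: decode valid escapes, leave everything else untouched."""
    return _ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), data)


CASES = [
    (b"%25%s%1G", b"%%s%1G"),  # example from the WHATWG URL Standard
    (b"%2g", b"%2g"),          # 'g' is not a hex digit
    (b"%zz", b"%zz"),          # neither character is a hex digit
    (b"%", b"%"),              # trailing '%'
    (b"%4", b"%4"),            # only one byte after '%'
    (b"%41", b"A"),            # valid escape still decodes
    (b"%2F%2f", b"//"),        # upper- and lower-case hex digits
]


def failing_cases(decode):
    """Return (input, expected, actual) for every case the decoder gets wrong."""
    return [
        (raw, expected, decode(raw))
        for raw, expected in CASES
        if decode(raw) != expected
    ]


print(failing_cases(url_decode_stand_in))  # []
```

Passing the decoder in as a parameter lets the same table be run against both `url_decode` and `path_decode` once bindings for them exist.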

http: percent-decode should follow WHATWG for invalid % sequences #3304
