Fix UTF-8 header corruption when fallback charset differs by OctopusET · Pull Request #10078 · roundcube/roundcubemail

OctopusET · 2026-01-27T20:09:47Z

Problem

Korean email subjects display as garbled text (í¬ìŠ¤í¬ì˜¬ instead of 포스포올) when:

An email client (e.g., Outlook) sends raw UTF-8 headers without MIME (RFC 2047) encoding
The message body declares a different charset (e.g., charset=ISO-8859-1)
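
To make the two header forms concrete, here is a minimal sketch contrasting a raw UTF-8 subject with its RFC 2047 encoded-word equivalent. The variable names are illustrative only and do not appear in Roundcube.

```php
<?php
// Illustrative subject value taken from the example above.
$subject = "포스포올";

// RFC 2047 "B" encoded-word: ASCII-only on the wire, with the charset stated
// explicitly, so the receiver never has to guess.
$encodedWord = '=?UTF-8?B?' . base64_encode($subject) . '?=';
$mimeHeader  = 'Subject: ' . $encodedWord;

// Raw form: the UTF-8 bytes are placed in the header as-is, with no charset
// label. This is the case the receiver has to guess about.
$rawHeader = 'Subject: ' . $subject;

echo $mimeHeader, "\n";
echo bin2hex($subject), "\n"; // the raw bytes a receiver would see
```

In the encoded-word form the charset travels with the text; in the raw form the only charset hint available is the one declared for the body, which may not match.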

Root Cause

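In rcube_mime::decode_mime_string(), header text that is not MIME-encoded is converted from the fallback charset, which comes from the message body. Raw UTF-8 bytes are therefore read as Latin-1 characters and encoded to UTF-8 a second time, which produces the garbled subject.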
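
The double-encoding can be reproduced in isolation. The sketch below uses mb_convert_encoding() as a stand-in for Roundcube's own conversion routine, which is an assumption; the fallback charset value is likewise hypothetical.

```php
<?php
// Raw UTF-8 bytes as they would arrive in an unencoded Subject header.
$raw = "포스포올";

// Assumed fallback, as if taken from the body's Content-Type (hypothetical).
$fallbackCharset = 'ISO-8859-1';

// Each byte is treated as one Latin-1 character and re-encoded as UTF-8,
// so every non-ASCII byte grows into a two-byte sequence.
$garbled = mb_convert_encoding($raw, 'UTF-8', $fallbackCharset);

printf("%d bytes before, %d bytes after\n", strlen($raw), strlen($garbled));
```

Because every byte of the Korean text is non-ASCII, the converted string is twice as long as the original and no longer decodes to Hangul.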

Solution

Add UTF-8 detection before fallback conversion:

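// Valid UTF-8 containing at least one non-ASCII byte: return it unchanged
// instead of converting it from the (possibly wrong) fallback charset.
if (mb_check_encoding($input, 'UTF-8') && preg_match('/[\x80-\xFF]/', $input)) {
    return $input;
}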

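This validates the UTF-8 byte sequences before applying a potentially wrong charset conversion. The preg_match() check restricts the shortcut to input that contains non-ASCII bytes, so pure ASCII input still takes the existing path. Random Latin-1 bytes have only a ~1/15 chance of forming valid UTF-8 sequences.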
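
As a self-contained illustration of the guard, the sketch below wraps the same check in a hypothetical helper, decode_raw_header(), which is not part of Roundcube. mb_convert_encoding() again stands in for the real fallback conversion.

```php
<?php
/**
 * Hypothetical helper for illustration only, not Roundcube's actual code.
 * Returns the header text as UTF-8.
 */
function decode_raw_header(string $input, string $fallbackCharset): string
{
    // Valid UTF-8 with at least one non-ASCII byte: keep the bytes as-is.
    if (mb_check_encoding($input, 'UTF-8') && preg_match('/[\x80-\xFF]/', $input)) {
        return $input;
    }

    // Otherwise (pure ASCII, or bytes that are not valid UTF-8) fall back
    // to converting from the charset declared elsewhere in the message.
    return mb_convert_encoding($input, 'UTF-8', $fallbackCharset);
}

$korean = decode_raw_header("포스포올", 'ISO-8859-1'); // valid UTF-8, left unchanged
$latin1 = decode_raw_header("caf\xE9", 'ISO-8859-1');  // lone 0xE9 is not valid UTF-8, converted
$ascii  = decode_raw_header("Hello", 'ISO-8859-1');    // ASCII only, takes the fallback path
```

The check is a heuristic. A Latin-1 string whose bytes happen to form valid UTF-8, such as 0xC3 0xA9, would be kept as UTF-8 rather than converted, which is the ambiguity the probability estimate above refers to.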

Standards

RFC 6532 (2012) permits raw UTF-8 in email header fields for messages transported with the SMTPUTF8 extension (RFC 6531).

References

VLC: IsUTF8() - validates UTF-8 before falling back to other encodings
Thunderbird: mozilla::EncodingDetector - unified charset detection
ICU: CharsetDetector - checks byte sequences for legal UTF-8 patterns
uchardet: nsUTF8Prober - Mozilla-derived UTF-8 state machine prober

Add UTF-8 detection before charset conversion to prevent double-encoding of raw UTF-8 headers (e.g., Korean from Outlook).

Fix UTF-8 header corruption when fallback charset differs

ebab1c3

OctopusET wants to merge 1 commit into roundcube:master from OctopusET:fix-utf8-header
