AsciiJSON silently corrupts non-BMP characters (emoji) by emitting malformed \u escapes #4688

@xd-sarthak

Description

render.AsciiJSON corrupts any Unicode code point above U+FFFF (non-BMP characters such as emoji). It escapes every non-ASCII rune with fmt.Appendf(buf, "\\u%04x", r), but the 4 in %04x is a minimum width, not a fixed width: it pads short values and never truncates long ones. A non-BMP rune needs 5 or 6 hex digits, so a single over-long token such as \u1f600 is written. A JSON \u escape is exactly 4 hex digits, so a decoder reads the first 4 as one character and the trailing digit(s) as literal text.

The output is still syntactically valid JSON (so this is not a parse error and is easy to miss), but the value is silently wrong — it does not round-trip.
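The mechanism can be shown without Gin at all. The following is a minimal sketch using only the standard library; formatEscape and decodeJSONString are hypothetical helper names introduced here for illustration and do not exist in Gin. It formats a few runes with the same verb and decodes the result with encoding/json.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// formatEscape (hypothetical helper) uses the same verb as the reported code:
// %04x pads to at least four hex digits but never truncates wider values.
func formatEscape(r rune) string {
	return fmt.Sprintf("\\u%04x", r)
}

// decodeJSONString (hypothetical helper) wraps the escape in quotes and
// decodes it as a JSON string literal.
func decodeJSONString(escape string) (string, error) {
	var s string
	err := json.Unmarshal([]byte(`"`+escape+`"`), &s)
	return s, err
}

func main() {
	// Two BMP characters and one non-BMP character.
	for _, r := range []rune{'é', '€', '😀'} {
		esc := formatEscape(r)
		decoded, err := decodeJSONString(esc)
		if err != nil {
			fmt.Println("decode error:", err)
			continue
		}
		fmt.Printf("%q -> %s -> %q (round-trips: %t)\n",
			string(r), esc, decoded, decoded == string(r))
	}
}
```

Only the emoji fails to round-trip: it decodes without error, but to U+1F60 followed by a literal 0.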

Per RFC 8259 §7, code points outside the BMP must be encoded as a UTF-16 surrogate pair (e.g. U+1F600 → \ud83d\ude00).

Example — 😀 (U+1F600):

| | Value |
|---|---|
| Input | 😀 |
| AsciiJSON output | {"msg":"\u1f600"} |
| After json.Unmarshal | ὠ0 (U+1F60 + literal 0) |
| Expected after round-trip | 😀 |

Offending code — render/json.go (AsciiJSON.Render):

for _, r := range bytesconv.BytesToString(ret) {
    if r > unicode.MaxASCII {
        escapeBuf = fmt.Appendf(escapeBuf[:0], "\\u%04x", r) // 5+ digits for non-BMP → invalid escape
        buffer.Write(escapeBuf)
    } else {
        buffer.WriteByte(byte(r))
    }
}

Related but distinct: #2546, which is about naming, content type, and deprecation rather than this corruption.

Gin Version

v1.12.0 (master, commit d83f866)

Can you reproduce the bug?

Yes.

Steps:

  1. Render any payload containing a non-BMP character via c.AsciiJSON(...) (or render.AsciiJSON).
  2. Observe the response body contains a malformed \uXXXXX escape.
  3. json.Unmarshal the body — the decoded value differs from the original.

Source Code

Minimal reproducible example:

package main

import (
	"encoding/json"
	"fmt"
	"net/http/httptest"

	"github.com/gin-gonic/gin/render"
)

func main() {
	w := httptest.NewRecorder()
	_ = render.AsciiJSON{Data: map[string]string{"msg": "😀"}}.Render(w)

	out := w.Body.String()
	fmt.Println("rendered:", out) // {"msg":"\u1f600"}

	var decoded map[string]string
	_ = json.Unmarshal([]byte(out), &decoded)
	fmt.Printf("decoded: %q\n", decoded["msg"]) // "ὠ0", expected "😀"
}

Go Version

go1.25.0

Operating System

Linux 6.17.0 x86_64 (codec-independent — the defect is in Gin's ASCII-escaping loop, not the JSON marshaler).
