Description
Background and motivation
There is no built-in encoder that keeps characters from all languages readable and avoids unnecessary escaping when the caller knows that recipients parse JSON correctly.
For additional context, see:
#42847
#86800
#87138
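To illustrate the gap, here is a minimal sketch that uses only encoders that ship today. It does not use the proposed encoder, and the sample string is an arbitrary choice for illustration. It shows the default escaping of non-ASCII text next to the closest existing opt-out, `JavaScriptEncoder.Create(UnicodeRanges.All)`:

```csharp
using System;
using System.Text.Encodings.Web;
using System.Text.Json;
using System.Text.Unicode;

// Arbitrary sample text; any non-ASCII string would do.
string text = "жарко";

// Default options: every character outside Basic Latin is written as \uXXXX.
string escaped = JsonSerializer.Serialize(text);

// Existing opt-out: allow all ranges that UnicodeRanges defines.
var options = new JsonSerializerOptions
{
    Encoder = JavaScriptEncoder.Create(UnicodeRanges.All)
};
string readable = JsonSerializer.Serialize(text, options);

Console.WriteLine(escaped);   // "\u0436\u0430\u0440\u043A\u043E"
Console.WriteLine(readable);  // "жарко"
```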
API Proposal
namespace System.Text.Encodings.Web
{
    internal sealed class UnicodeJsonEncoder : JavaScriptEncoder
    {
        internal static readonly UnicodeJsonEncoder Singleton = new UnicodeJsonEncoder();
        private readonly bool _preferHexEscape;
        private readonly bool _preferUppercase;
        public UnicodeJsonEncoder()
            : this(preferHexEscape: false, preferUppercase: false)
        {
        }
        public UnicodeJsonEncoder(bool preferHexEscape, bool preferUppercase)
        {
            _preferHexEscape = preferHexEscape;
            _preferUppercase = preferUppercase;
        }
        // Implementations of base class members.
    }
}
namespace System.Text.Encodings.Web
{
    public abstract class JavaScriptEncoder : TextEncoder
    {
        // Existing members
        public static JavaScriptEncoder Unicode => UnicodeJsonEncoder.Singleton;
    }
}
PR #87147 has additional implementation details.
API Usage
// "data" is a typed variable holding the object to serialize.
string json = JsonSerializer.Serialize(data, new JsonSerializerOptions { Encoder = JavaScriptEncoder.Unicode });
Or, to force hex escapes (\uxxxx) rather than two-character escapes (for example, \"):
// "data" is a typed variable holding the object to serialize.
string json = JsonSerializer.Serialize(data, new JsonSerializerOptions { Encoder = new UnicodeJsonEncoder(preferHexEscape: true, preferUppercase: false) }); // or other values for those bools
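The proposed encoder is not part of any released package, so the following sketch uses only existing `System.Text.Json` calls. It is meant to show that the two escape forms mentioned above, a hex escape and a two-character escape, decode to the same value in a conforming parser, which is why choosing between them is a matter of preference. The variable names are illustrative only.

```csharp
using System;
using System.Text.Json;

// The same one-character string (a double quote) written with both escape forms.
string hexForm = "\"\\u0022\"";   // JSON text: "\u0022"
string shortForm = "\"\\\"\"";    // JSON text: "\""

var fromHex = JsonSerializer.Deserialize<string>(hexForm);
var fromShort = JsonSerializer.Deserialize<string>(shortForm);

// Both forms decode to the same string.
Console.WriteLine(fromHex == fromShort);
```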
Alternative Designs
No response
Risks
The risks are similar to those of JavaScriptEncoder.UnsafeRelaxedJsonEscaping, but see #87138.
Callers need to ensure two things:
- If this JSON output is embedded inside another language (HTML, SQL, C#, etc.), the text is correctly escaped according to that language's requirements.
- The recipient and any intermediaries follow the JSON spec correctly, that is, they handle unescaped Unicode characters the same way they handle the escaped forms of those characters.
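Both obligations can be sketched with APIs that exist today. The example below assumes `JavaScriptEncoder.UnsafeRelaxedJsonEscaping` as a stand-in for a relaxed encoder, since the proposed one is not available, and uses HTML as the host language. The sample value is hypothetical.

```csharp
using System;
using System.Text.Encodings.Web;
using System.Text.Json;

// Stand-in for a relaxed encoder; not the encoder proposed in this issue.
var relaxed = new JsonSerializerOptions
{
    Encoder = JavaScriptEncoder.UnsafeRelaxedJsonEscaping
};
string value = "<b>жарко</b>";

string escapedJson = JsonSerializer.Serialize(value);           // default escaping
string relaxedJson = JsonSerializer.Serialize(value, relaxed);  // '<' and Cyrillic left as-is

// First point: escape for the host language (HTML here) before embedding.
string htmlSafe = HtmlEncoder.Default.Encode(relaxedJson);

// Second point: a conforming parser reads escaped and unescaped forms identically.
var fromEscaped = JsonSerializer.Deserialize<string>(escapedJson);
var fromRelaxed = JsonSerializer.Deserialize<string>(relaxedJson);

Console.WriteLine(fromEscaped == fromRelaxed);
Console.WriteLine(htmlSafe.Contains("<"));
```

If a recipient in the pipeline is not known to be conforming, keeping the default encoder is the safer choice.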