[API Proposal]: UnicodeJsonEncoder #87153

Open
@davidmatson

Background and motivation

No built-in encoder keeps characters from all languages readable while also avoiding unnecessary escaping when the caller knows that recipients parse JSON correctly.

For additional context, see:
#42847
#86800
#87138
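
To make the gap concrete, here is a minimal sketch that compares the encoders available today on one sample string. It uses only existing APIs (`JsonSerializer`, `JavaScriptEncoder.Create`, `JavaScriptEncoder.UnsafeRelaxedJsonEscaping`); the sample value and variable names are illustrative assumptions. My understanding is that supplementary-plane characters such as emoji are still escaped by all three, so the sketch prints the results rather than asserting that part.

```csharp
using System;
using System.Text.Encodings.Web;
using System.Text.Json;
using System.Text.Unicode;

// Illustrative sample: a Latin-1 letter, a quotation mark, and an emoji (U+1F600).
string sample = "\u00E9 \" \U0001F600";

// Default options: characters outside Basic Latin are written as \uXXXX escapes.
string withDefault = JsonSerializer.Serialize(sample);

// Allow-list covering every Unicode range; the quotation mark is still written as \u0022.
var allRanges = new JsonSerializerOptions { Encoder = JavaScriptEncoder.Create(UnicodeRanges.All) };
string withAllRanges = JsonSerializer.Serialize(sample, allRanges);

// Relaxed encoder: keeps the letter readable and uses the two-character escape \" for the quote.
var relaxed = new JsonSerializerOptions { Encoder = JavaScriptEncoder.UnsafeRelaxedJsonEscaping };
string withRelaxed = JsonSerializer.Serialize(sample, relaxed);

Console.WriteLine(withDefault);
Console.WriteLine(withAllRanges);
Console.WriteLine(withRelaxed);
```

Comparing the three printed lines shows which characters each existing option leaves readable and which it still escapes.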

API Proposal

namespace System.Text.Encodings.Web
{
    internal sealed class UnicodeJsonEncoder : JavaScriptEncoder
    {
        internal static readonly UnicodeJsonEncoder Singleton = new UnicodeJsonEncoder();

        private readonly bool _preferHexEscape;
        private readonly bool _preferUppercase;

        public UnicodeJsonEncoder()
            : this(preferHexEscape: false, preferUppercase: false)
        {
        }

        public UnicodeJsonEncoder(bool preferHexEscape, bool preferUppercase)
        {
            _preferHexEscape = preferHexEscape;
            _preferUppercase = preferUppercase;
        }

        // Implementations of base class members.
    }
}

namespace System.Text.Encodings.Web
{
    public abstract class JavaScriptEncoder : TextEncoder
    {
        // Existing members

        public static JavaScriptEncoder Unicode => UnicodeJsonEncoder.Singleton;
    }
}

PR #87147 has additional implementation details.

API Usage

// "data" is a typed variable holding the object to serialize
string json = JsonSerializer.Serialize(data, new JsonSerializerOptions { Encoder = JavaScriptEncoder.Unicode });

Or, to force hex escapes (\uxxxx) rather than two-character escapes (for example, \"):

// "data" is a typed variable holding the object to serialize
string json = JsonSerializer.Serialize(data, new JsonSerializerOptions { Encoder = new UnicodeJsonEncoder(preferHexEscape: true, preferUppercase: false) }); // or other values for those bools

Alternative Designs

No response

Risks

The risks are similar to those of the existing relaxed encoder (JavaScriptEncoder.UnsafeRelaxedJsonEscaping), but see #87138.

Callers need to ensure two things:

  1. If this JSON output is embedded inside another language (HTML, SQL, C#, etc.), the text is correctly escaped according to that language's requirements.
  2. The recipient and any intermediaries follow the JSON spec correctly, that is, they handle unescaped Unicode characters exactly as they handle the escaped forms.
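
Both conditions can be illustrated with APIs that exist today. The sketch below is hedged: it uses `JavaScriptEncoder.UnsafeRelaxedJsonEscaping` as a stand-in because the proposed encoder is not available, and the sample strings and the `<pre>` wrapper are assumptions chosen for illustration.

```csharp
using System;
using System.Text.Encodings.Web;
using System.Text.Json;

// Condition 2: a conforming parser yields the same string for escaped and unescaped input.
string escapedJson = "\"caf\\u00E9\"";    // JSON text: "caf\u00E9"
string unescapedJson = "\"caf\u00E9\"";   // JSON text with the letter written directly
var fromEscaped = JsonSerializer.Deserialize<string>(escapedJson);
var fromUnescaped = JsonSerializer.Deserialize<string>(unescapedJson);
bool sameValue = fromEscaped == fromUnescaped;
Console.WriteLine(sameValue);

// Condition 1: JSON written with a relaxed encoder can contain HTML-sensitive characters,
// so it must be encoded for the host language before it is embedded there.
var relaxed = new JsonSerializerOptions { Encoder = JavaScriptEncoder.UnsafeRelaxedJsonEscaping };
string json = JsonSerializer.Serialize("<b>caf\u00E9</b>", relaxed);
string html = "<pre>" + HtmlEncoder.Default.Encode(json) + "</pre>";
Console.WriteLine(json);
Console.WriteLine(html);
```

The same pattern applies to other host languages: the JSON encoder only guarantees valid JSON, and the embedding layer remains responsible for its own escaping rules.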

Labels: api-suggestion, area-System.Text.Encodings.Web, wishlist
