API Proposal: JavaScriptEncoder.JsonDefault #73329

Open
@GrabYourPitchforks

The API JavaScriptEncoder.Default is intended to be the single entry point for escaping all things ECMAScript and ECMAScript-adjacent, but it's doing a bit too much right now. For example, since it's trying to handle both ECMA-262 (ECMAScript) and ECMA-404 (JSON), it can only emit escape sequences that are legal under both specs.

This results in JavaScriptEncoder.Default encoding the U+000B VERTICAL TABULATION character as "\u000B" instead of "\v", for instance, since that's the only way to encode it that's legal under both ECMA specs. It also results in JavaScriptEncoder.Default escaping more characters than needed (like U+0060 GRAVE ACCENT) for JSON scenarios.
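
A minimal sketch of how to observe the behavior described above, using only the existing `Default` and `UnsafeRelaxedJsonEscaping` accelerators. The exact output depends on the runtime version, so the comments only restate what this issue claims.

```csharp
using System;
using System.Text.Encodings.Web;

class Program
{
    static void Main()
    {
        // U+000B VERTICAL TABULATION: per the description above, Default
        // emits the six-character form \u000B rather than the short form \v.
        Console.WriteLine(JavaScriptEncoder.Default.Encode("\v"));

        // U+0060 GRAVE ACCENT: per the description above, Default escapes it
        // even though JSON does not require that.
        Console.WriteLine(JavaScriptEncoder.Default.Encode("`"));

        // The relaxed encoder, shown for comparison only.
        Console.WriteLine(JavaScriptEncoder.UnsafeRelaxedJsonEscaping.Encode("`"));
    }
}
```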

Proposal

namespace System.Text.Encodings.Web
{
    public partial class JavaScriptEncoder
    {
        // (existing accelerators)
        public static JavaScriptEncoder Default { get; }
        public static JavaScriptEncoder UnsafeRelaxedJsonEscaping { get; }

        // NEW PROPOSED ACCELERATOR
        public static JavaScriptEncoder JsonDefault { get; }
    }
}

Discussion

This new static property would be the default for JSON escaping scenarios. We would expect System.Text.Json to use it instead of JavaScriptEncoder.Default as the default encoder.
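
For context, here is a sketch of how callers choose an encoder in System.Text.Json today, via the existing `JsonSerializerOptions.Encoder` property. The commented-out line shows how the proposed property might be selected explicitly. `JsonDefault` is hypothetical and does not exist unless this proposal ships.

```csharp
using System;
using System.Text.Encodings.Web;
using System.Text.Json;

class Program
{
    static void Main()
    {
        var value = new { Text = "a`b" };

        // Encoder left unset: System.Text.Json falls back to
        // JavaScriptEncoder.Default today.
        Console.WriteLine(JsonSerializer.Serialize(value));

        // Explicitly choosing one of the existing accelerators.
        var relaxed = new JsonSerializerOptions
        {
            Encoder = JavaScriptEncoder.UnsafeRelaxedJsonEscaping
        };
        Console.WriteLine(JsonSerializer.Serialize(value, relaxed));

        // Hypothetical: compiles only if the proposed property is added.
        // var proposed = new JsonSerializerOptions { Encoder = JavaScriptEncoder.JsonDefault };
    }
}
```

Under the proposal, the first call would presumably pick up `JsonDefault` without any code change, which is where the output differences discussed under Risks would come from.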

This allows the two scenarios to change with some amount of isolation. Any work we do to support ECMA-262 can stay contained within the Default encoder, and any work we do to support ECMA-404 can stay contained within the JsonDefault encoder.

Note: We would expect some cross-pollination of logic between the two. Additionally, if some future update to ECMA-262 or ECMA-404 brings the two specs into closer alignment, the two encoders might eventually have identical behavior. And as the specs get further revisions, the behaviors might later diverge again.

Risks

Little risk of confusion to existing customers, since very few people call these APIs directly.

Biggest risk is likely that switching between the two encoders will alter the output payload, and customers who depend on these payloads being byte-for-byte identical across versions may see test failures if they're not properly unescaping the payloads while reading them back in. But this isn't a new risk: every released version of the encoding APIs has tweaked the list of allowed-unescaped characters, so applications which depended on bitwise identical outputs were already broken.
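
A rough illustration of why comparing raw payloads is fragile while comparing round-tripped values is not. It uses two existing encoders as stand-ins for "before" and "after" a default switch. The sample string is an arbitrary assumption.

```csharp
using System;
using System.Text.Encodings.Web;
using System.Text.Json;

class Program
{
    static void Main()
    {
        string original = "tab\there, vt\u000Bhere, tick`here";

        // Same value, two encoders: the escaped text is allowed to differ.
        string strict = JsonSerializer.Serialize(original);
        string relaxed = JsonSerializer.Serialize(
            original,
            new JsonSerializerOptions { Encoder = JavaScriptEncoder.UnsafeRelaxedJsonEscaping });

        // Fragile check: compares the escaped payload text directly.
        Console.WriteLine(strict == relaxed);

        // Robust check: unescape by deserializing, then compare values.
        Console.WriteLine(JsonSerializer.Deserialize<string>(strict) == original);
        Console.WriteLine(JsonSerializer.Deserialize<string>(relaxed) == original);
    }
}
```

Tests that assert on the deserialized value, as in the last two lines, should keep passing no matter which encoder produced the payload.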

/cc @dotnet/area-system-text-json
