Skip to content

Potentially Faster Literal Escaping #103

@filip-sakel

Description

@filip-sakel

I was experimenting with the literal escape code and I might have a way of making it more efficient. Putting the contents of JSON.EscapeCode into Godbolt seemed to generate a lot of branch instructions which isn't optimal. I came up with the following (admittedly complicated) code that does some smart bit shifting to quickly calculate is a code unit is an escape and map it to its non-escaped counterpart. This way, we only really get one branch for whether to add an escaping code unit or add the codeunit directly. Here's a snippet of the proposed function and its aarch64 assembly on Godbolt.

func mapEscape(_ codeunit: UInt8) -> UInt8 {
    // Create hash to match with potential byte. The resulting
    // hash may only have the 3 LSBs set.
    //
    // Index code unit with 3 lower bits, and OR with 3 if the upper nibble is set.
    //   0x08 & 0x7 | 0 = 0b000 = 0
    //   0x09 & 0x7 | 0 = 0b001 = 1
    //   0x0A & 0x7 | 0 = 0b010 = 2
    //   0x22 & 0x7 | (0x22 > 0xf * 3) = 0b010 | 3 
    //                  = 0b011 = 3
    //   0x0C & 0x7 | 0 = 0b100 = 4
    //   0x0D & 0x7 | 0 = 0b101 = 5
    //   ---------------------- = 6 
    //   0x5C & 0x7 | (0x5C > 0xf * 3) = 0b100 | 3 = 
    //                  = 0b111 = 7
    let highNibbleAdjustment: UInt8 = codeunit > 0xf ? 3 : 0 // 3 if upper nibble set, 0 otherwise
    let byteHash = (codeunit & 0x7) | highNibbleAdjustment // The 3-bit hash
    let bitShift = byteHash << 3 // The bit shift to index a 64-bit integer.

    // Add the expected byte to each index.
    //
    // Note that for the case where the idx is 6 
    // (unspecified), we get that either:
    //   codeunit >> 0x7 == 5, or 
    //   codeunit >> 0x7 == 6
    // In both cases, setting the expected byte to 0 will
    // result in a false equality comparison.
    let equalityMask: UInt64 = 0x5C_00_0D_0C_22_0A_09_08
    let map: UInt64          = 0x5C_00_72_66_22_6E_74_62
    
    // Whether we need to escape this code unit.
    //
    // To compute, we index the UInt64 at the right byte using our 
    // hash-based bit shift.
    let isEscape = ((equalityMask >> bitShift) & 0xFF) == codeunit
    // Map the code unit to its escaped counterpart using the same indexing.
    let mapunit = isEscape ? (map >> bitShift) : 0x00
  
    // Cast back to UInt8 truncating the upper bits.
    return UInt8(truncatingIfNeeded: mapunit)
}

I understand the code is complicated and may not be worth it for eliminating just a couple of branch instructions. However, if you're interested I could open a PR with the modified code. After some basic testing, the proposed function seems to produce the same results as the current code with the exception of returning a UInt8 and not an Optional.

Edit: Another way is to use SIMD but that seems to generate a few more instructions (for aarch64):

func mapEscape2(unit: UInt8) -> UInt8 {
    let unitVector = SIMD8(repeating: unit)
    let escapeVector = SIMD8<UInt8>(
        0x08, 0x09, 0x0A, 0x22, 0x0C, 0x0D, 0x5C, 0x00)
    let mapVector = SIMD8<UInt8>(
        0x62, 0x74, 0x6E, 0x22, 0x66, 0x72, 0x5C, 0x00)

    let mapped = SIMD8<UInt8>().replacing(with: mapVector, where: unitVector .== escapeVector)

    return mapped.wrappedSum()
}

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions