Skip to content

DeflaterSink may use the same buffer for both input and output to Java's Deflater.deflate() call #1608

Closed
@boyuzhang-msft

Description

@boyuzhang-msft

In DeflaterSink class in jvmMain, its write() method may recycle head Segment, which is used as input buffer in deflater.setInput() call. Later on this head Segment might be returned by buffer.writableSegment(1) in DeflaterSink.deflate() method, and it's being used as output buffer in deflater.deflate() call.

In some workflow (see example below), the same region of this recycled head Segment is being used as both input and output buffer to Java's Deflater calls, which may result in undefined behavior.

Flow with issue: when I use okhttp to send a compressed message using webSocket.send("Hello, WebSocket! test"), okhttp's WebSocketWriter calls into MessageDeflater.deflate(), which calls into okio's DeflaterSink.write() and DeflaterSink.flush(). In this flow, the same region of head Segment is being used as both input and output buffer to Java's Deflater methods.

I added logs to print out the buffer object (head.data and s.data respectively) used in Java's deflater.setInput (after L39 of https://github.com/square/okio/blob/master/okio/src/jvmMain/kotlin/okio/DeflaterSink.kt) and deflater.deflate (after L70 of the same file) calls.

2025-03-20 10:46:37.276 25536-25657 okhttp3.internal.ws.Mes com.sample.app         I  Hello, WebSocket! test
// This is  DeflaterSink.write() call
2025-03-20 10:46:37.283 25536-25658 okio.DeflaterSink       com.sample.app         I  deflater set input array object: [B@9cf71d5, pos: 0, todeflate: 22}, input data: [72, 101, 108, 108, 111, 44, 32, 87, 101, 98, 83, 111, 99, 107, 101, 116, 33, 32, 116, 101, 115, 116, 0, 0, 0,
2025-03-20 10:46:37.317 25536-25658 okio.DeflaterSink       com.sample.app         I  syncFlush: false, output buffer object: [B@1c08dea, deflated: 0, pos: 0, len: 8192,  s: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 

// This is DeflaterSink.flush() call
2025-03-20 10:46:37.329 25536-25658 okio.DeflaterSink       com.sample.app         I  syncFlush: true, output buffer object: [B@9cf71d5, deflated: 28, pos: 0, len: 8192,  s: [72, 101, 108, 108, 111, 44, 32, 87, 101, 98, 83, 111, 99, 107, 101, 116, 33, 32, 116, 101, 115, 116, 0, 
2025-03-20 10:46:37.336 25536-25658 okio.DeflaterSink       com.sample.app         I  syncFlush: true, output buffer object: [B@9cf71d5, deflated: 0, pos: 28, len: 8164,  s: [72, 101, 108, 108, 111, 44, 32, 87, 101, 98, 83, 111, 99, 107, 101, 116, 33, 32, 116, 101, 115, 116, 0, 

You can see that in DeflaterSink.flush() call, the output buffer re-used the same region of the input buffer. On some Android devices, this may cause an issue due to implementation details on java side. Java's Deflater.deflate method is implemented in a C/C++ method Deflater_deflateBytesBytes. The C/C++ implementation calls JNI's GetPrimitiveArrayCritical(input), GetPrimitiveArrayCritical(output), ReleasePrimitiveArrayCritical(output), ReleasePrimitiveArrayCritical(input) in this sequence. This pair of JNI methods are documented to return either a pointer to the primitive jarray or to make a copy of the jarray on native side and return the pointer to the native memory. The sequence in the native implementation of java goes:

  1. GetPrimitiveArrayCritical(input), gets a pointer to the primitive jarray (input), or JVM copies the content of jarray (input) to native side, and gets the pointer to the native memory. Native memory gets the content of the primitive jarray (input).
  2. GetPrimitiveArrayCritical(output), same op, just for output array.
  3. ReleasePrimitiveArrayCritical(output), releases native memory (content of this native memory is the deflated bits), native memory content is reflected on java side (output jarray)
  4. ReleasePrimitiveArrayCritical(input), releases native memory (content of this native memory is the non-deflated input bits), native memory content is reflected on java side (input jarray)

When a pointer is returned in step 1 and 2 (i.e. no memory copy is made), everything works. But when copies are made, and step 1 and 2 return pointers that point to separate native memory addresses with copies of input and output data, it doesn't work as intended. Since 3 will copy back the deflated bits to output buffer on java side, and 4 will copy back the original input bits to input buffer on java side. On java side, when input and output buffer are the same, it would appear that the deflate didn't happen, but in reality, it happened, just the second ReleaseXXX call overrides the java side buffer with input data on native memory.

I didn't find explicit callouts on Android's documentation on Deflater class on re-using buffer for input and output. But in the example it provided, two separate buffers are used, so I'm leaning towards two separate buffers should be used.

A fix would be in DeflaterSink.write method, we do not recycle head, so we don't risk using the same Segment for both input and output. Let me know if this makes sense.

      if (head.pos == head.limit) {
        source.head = head.pop()
      }

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions