Skip to content

MsgPackReader silently returns empty Array/Map for Array32/Map32 with length >= 2^31 #702

@haskiindahouse

Description

@haskiindahouse

In upack.MsgPackReader, parseUInt32 returns a signed Int. MsgPack 32-bit lengths are unsigned, so values in [2^31, 2^32-1] wrap to negative. The consuming parseArray/parseMap (and parseStr/parseBin) use while (i < n) with signed comparison, so a negative n makes the loop body run zero times — and the parser hands back an empty container, silently swallowing whatever payload follows.

Minimal example:

//> using dep com.lihaoyi::upack::4.4.3

@main def test(): Unit = {
  // Map32 with length 0x80000000 (2^31)
  val bigMap = Array[Byte](
    0xdf.toByte,
    0x80.toByte, 0x00, 0x00, 0x00
  )
  println(upack.read(bigMap))
  // => Obj(Map())  -- empty, no error

  // Array32 with length 0x80000001
  val bigArr = Array[Byte](
    0xdd.toByte,
    0x80.toByte, 0x00, 0x00, 0x01
  )
  println(upack.read(bigArr))
  // => Arr(ArrayBuffer())  -- empty, no error
}

Got (upickle 4.4.3, Scala 3.8.3)

Obj(Map())
Arr(ArrayBuffer())

Expected

Either the declared number of entries are read (when present), or the parser raises — anything but silently returning an empty container with the byte stream still containing the payload.

Source

def parseUInt32(i: Int) = {
index = i + 4
requestUntil(i + 3)
(getByteUnsafe(i) & 0xff) << 24 | (getByteUnsafe(i + 1) & 0xff) << 16 |
(getByteUnsafe(i + 2) & 0xff) << 8 | getByteUnsafe(i + 3) & 0xff
}

def parseUInt32(i: Int) = {
  index = i + 4
  requestUntil(i + 3)
  (getByteUnsafe(i) & 0xff) << 24 | (getByteUnsafe(i + 1) & 0xff) << 16 |
  (getByteUnsafe(i + 2) & 0xff) << 8 | getByteUnsafe(i + 3) & 0xff
}

The << 24 shift sets bit 31 for any value >= 2^31, so the result becomes a negative Int. The dispatch on lines 73 / 76:

case MPK.Array32 => parseArray(parseUInt32(index + 1), visitor)
case MPK.Map32   => parseMap(parseUInt32(index + 1), visitor)

passes that negative value straight into parseArray/parseMap, where while(i < n) is immediately false.

There's also a security angle: a hostile MsgPack stream can declare a Map32/Array32 of length 0x80000000 followed by real entries; the parser returns an empty container and the reader is left in an inconsistent state with respect to the underlying stream.

Suggested fix

Either widen parseUInt32 to return Long and propagate the wider type to the loops, or — at minimum — reject lengths >= 2^31 with an explicit error in parseArray/parseMap/parseStr/parseBin. There's already a comment in the code about "Normalize index to prevent overflow when parsing files >2GB" but it doesn't cover this case.

Versions

upickle / upack 4.4.3, Scala 3.8.3.

Happy to PR.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions