Fix misaligned memory access in LossyDctDecoder_execute HALF→FLOAT expansion

cary-ilm · cary-ilm · commit b506d27d7478 · 2026-03-26T08:27:58.000-07:00
After DCT decoding, `LossyDctDecoder_execute()` expands FLOAT-type channels
from their intermediate HALF (16-bit) XDR representation back to FLOAT (32-bit)
XDR in place.  The expansion was done by casting `_rows[y]` (a `uint8_t *`)
directly to `float *` and `uint16_t *`, then reading and writing through those
typed pointers.

Because row buffers are assigned by advancing a byte pointer with no alignment
padding (`outBufferEnd += chan->width * chan->bytes_per_element` in
`internal_dwa_compressor.h`), a FLOAT channel that follows a HALF channel of
odd width receives a `_rows[y]` pointer that is 2-byte aligned but not 4-byte
aligned.  Dereferencing a `float *` cast from such a pointer is undefined
behavior under the C standard:

- On strict-alignment targets (some ARM, RISC-V, and MIPS configurations) the
  access can trap, typically with SIGBUS.
- On x86 scalar accesses are tolerated at the hardware level but remain UB:
  an auto-vectorizing compiler may assume alignment and emit aligned SSE/AVX
  instructions that fault or otherwise misbehave.
- UBSan reports: `store to misaligned address ... for type 'float', which
  requires 4 byte alignment` at `internal_dwa_decoder.h:749`.
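
The offset arithmetic behind the misalignment can be sketched as follows. This
is a minimal illustration, not the OpenEXRCore code: `sketch_channel` and
`sketch_row_offset` are hypothetical names, and the buffer base is assumed to
be at least 4-byte aligned.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified channel description (not the OpenEXRCore struct). */
typedef struct
{
    int width;             /* pixels per row */
    int bytes_per_element; /* 2 for HALF, 4 for FLOAT */
} sketch_channel;

/* Byte offset at which the row of channel `index` starts when rows are packed
 * back to back with no padding, mirroring `outBufferEnd += width * bytes`. */
static size_t
sketch_row_offset (const sketch_channel* chans, int index)
{
    size_t offset = 0;

    assert (chans != NULL && index >= 0);
    for (int i = 0; i < index; ++i)
        offset += (size_t) chans[i].width * (size_t) chans[i].bytes_per_element;
    return offset;
}
```

For a HALF channel of width 3 followed by a FLOAT channel, the FLOAT row starts
at byte offset 6: a multiple of 2 but not of 4.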

Fix: replace the cast-and-dereference pattern with the `unaligned_load16` /
`memcpy` / `unaligned_store32` helpers already used throughout the rest of
OpenEXRCore (`internal_xdr.h`, `unpack.c`, `pack.c`, `internal_pxr24.c`).
These helpers use `memcpy` internally, which the C standard guarantees is safe
for unaligned addresses and which optimizing compilers typically lower to a
single load/store instruction on architectures that support unaligned access.
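
A minimal sketch of the `memcpy`-based access pattern is shown here. The names
`sketch_load16` and `sketch_store32` are hypothetical stand-ins, not the
OpenEXRCore helpers, and this version deliberately omits the XDR byte-order
conversion that the real helpers apply.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 16-bit value from an address with no alignment requirement.
 * Native byte order only; no XDR conversion is applied in this sketch. */
static uint16_t
sketch_load16 (const void* p)
{
    uint16_t v;

    assert (p != NULL);
    memcpy (&v, p, sizeof (v));
    return v;
}

/* Write a 32-bit value to an address with no alignment requirement. */
static void
sketch_store32 (void* p, uint32_t v)
{
    assert (p != NULL);
    memcpy (p, &v, sizeof (v));
}
```

Because the copy goes through `memcpy` on byte pointers, no typed pointer to a
misaligned object is ever dereferenced.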

The byte-order handling is preserved correctly:
- `unaligned_load16` reads 2 bytes via `memcpy` and applies `one_to_native16`
  (XDR → native), returning a native-endian HALF value.
- `half_to_float` converts native HALF → native float.
- `memcpy(&bits, &f, 4)` reinterprets the float's bit pattern as `uint32_t`
  without numeric conversion (the correct type-pun idiom in C).
- `unaligned_store32` applies `one_from_native32` (native → XDR) and writes
  4 bytes via `memcpy`, storing the result in XDR float format.
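
The whole load / convert / type-pun / store sequence, applied in place from
right to left, might look like the sketch below. It is an illustration under
stated assumptions: `sketch_half_to_float` and `sketch_expand_row` are
hypothetical names, the conversion routine is a plain reference version rather
than the library's `half_to_float`, and byte-order conversion is left out
(values are treated as native-endian).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative half -> float conversion; stands in for half_to_float. */
static float
sketch_half_to_float (uint16_t h)
{
    uint32_t sign = (uint32_t) (h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1fu;
    uint32_t man  = h & 0x3ffu;
    uint32_t bits;
    float    f;

    if (exp == 0x1fu) /* infinity or NaN */
        bits = sign | 0x7f800000u | (man << 13);
    else if (exp != 0) /* normal: rebias exponent from 15 to 127 */
        bits = sign | ((exp + 112u) << 23) | (man << 13);
    else if (man == 0) /* signed zero */
        bits = sign;
    else /* subnormal half: normalize the mantissa */
    {
        exp = 113u;
        while ((man & 0x400u) == 0)
        {
            man <<= 1;
            --exp;
        }
        bits = sign | (exp << 23) | ((man & 0x3ffu) << 13);
    }
    memcpy (&f, &bits, sizeof (f)); /* type-pun without numeric conversion */
    return f;
}

/* Expand `width` packed 16-bit values at the start of `row` into `width`
 * 32-bit floats in the same buffer. `row` must hold width * 4 bytes and may
 * have any alignment. */
static void
sketch_expand_row (uint8_t* row, int width)
{
    assert (row != NULL && width >= 0);
    /* Right to left: float x lands on bytes [4x, 4x + 4), which hold halfs
     * 2x and 2x + 1. Those were already consumed, or are half x itself,
     * which is loaded before the store. */
    for (int x = width - 1; x >= 0; --x)
    {
        uint16_t h;
        uint32_t bits;
        float    f;

        memcpy (&h, row + (size_t) x * sizeof (h), sizeof (h));
        f = sketch_half_to_float (h);
        memcpy (&bits, &f, sizeof (bits));
        memcpy (row + (size_t) x * sizeof (bits), &bits, sizeof (bits));
    }
}
```

Iterating left to right instead would overwrite halfs 2 and 3 while storing
float 1, before they had been read, which is why the loop runs in reverse.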

Made-with: Cursor
Signed-off-by: Cary Phillips <cary@ilm.com>
diff --git a/src/lib/OpenEXRCore/internal_dwa_decoder.h b/src/lib/OpenEXRCore/internal_dwa_decoder.h
@@ -741,13 +741,15 @@ LossyDctDecoder_execute (
         /* process in place in reverse to avoid temporary buffer */
         for (int y = 0; y < d->_height; ++y)
         {
-            float*    floatXdrPtr = (float*) chanData[chan]->_rows[y];
-            uint16_t* halfXdr     = (uint16_t*) floatXdrPtr;
+            uint8_t* rowBytes = chanData[chan]->_rows[y];
 
             for (int x = d->_width - 1; x >= 0; --x)
             {
-                floatXdrPtr[x] = one_from_native_float (
-                    half_to_float (one_to_native16 (halfXdr[x])));
+                uint16_t h = unaligned_load16 (rowBytes + x * sizeof (uint16_t));
+                float    f = half_to_float (h);
+                uint32_t bits;
+                memcpy (&bits, &f, sizeof (bits));
+                unaligned_store32 (rowBytes + x * sizeof (float), bits);
             }
         }
     }
