
Proposal: Change PhotonSeries dimension ordering from (width, height) to (height, width) #660

@h-mayorquin


In #649 I wanted to make the documentation clear for the current schema. Here, I am proposing to change the dimension ordering in OnePhotonSeries and TwoPhotonSeries from the current data[time, width, height] to data[time, height, width]. To clarify, here "width" refers to the horizontal extent (columns, x-axis), while "height" refers to the vertical extent (rows, y-axis):

             → x (columns, width)

           0 ╔═══╦═══╦═══╦═══╦═══╗
             ║   ║   ║   ║   ║   ║
           1 ╠═══╬═══╬═══╬═══╬═══╣
y (rows,     ║   ║   ║   ║   ║   ║
 height)   2 ╠═══╬═══╬═══╬═══╬═══╣
             ║   ║   ║   ║   ║   ║
    ↓      3 ╚═══╩═══╩═══╩═══╩═══╝
             0   1   2   3   4

First, this change aligns NWB with the standard matrix indexing convention [row, column] = [height, width] used in the image processing ecosystem:

  • scikit-image: Image processing toolkit for Python, arrays have shape (height, width) with indexing array[row, col]
  • OpenCV (Mat.at, array indexing): Computer vision library, matrices indexed as mat.at<type>(row, col)
  • imageio: Python library for reading and writing images, returns arrays with shape (height, width)
  • tifffile: Python library for reading microscopy TIFF files (including ScanImage), returns arrays indexed as data[y, x] where y=rows/height, x=columns/width
  • BioIO: Microscopy file format reader, standardizes all data to 'TCZYX' ordering with spatial dimensions as YX

All of these are analysis and processing libraries where users interact with data through array indexing. Notable exceptions are Pillow (PIL) and ImageJ, which use (width, height) ordering, reflecting their heritage as graphics/plotting-centered tools rather than array-processing libraries (see final paragraph). Switching to the (height, width) convention would reduce friction when NWB data is used with the libraries listed above.
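As a small illustration of the matrix convention, here is a NumPy-only sketch of a single frame matching the grid in the description (3 rows, 5 columns). The variable names are made up for this example and are not part of any NWB API:

```python
import numpy as np

# Frame matching the grid in the description: 3 rows (height), 5 columns (width).
height, width = 3, 5
frame = np.arange(height * width).reshape(height, width)

# Matrix convention: shape is (height, width), indexing is [row, col] = [y, x].
row, col = 2, 4  # bottom-right cell of the grid
pixel = frame[row, col]

print(frame.shape)  # (3, 5)
print(pixel)        # 14
```

Under the current PhotonSeries ordering, the same frame would have shape (5, 3) and the same pixel would be addressed as frame[col, row].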

Second, this change should improve performance for raster-scanning microscopy, which represents a significant portion of optical physiology data. In raster-based systems, the width (horizontal direction) is the fastest-changing dimension during acquisition. When width becomes the last dimension and data is stored in C-order (row-major, the default for HDF5, Zarr, and NumPy), the proposed convention aligns the memory layout with the natural acquisition order.
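The layout argument can be checked with a rough NumPy sketch. It assumes an idealized raster scan (pixels arrive left to right within a line, lines top to bottom within a frame) and a tiny made-up frame size. It only shows memory order and is not a benchmark of HDF5 or Zarr I/O:

```python
import numpy as np

n_frames, height, width = 2, 3, 5

# Simulated raster acquisition: one flat stream of pixels in acquisition order.
stream = np.arange(n_frames * height * width, dtype=np.uint16)

# Proposed layout (time, height, width): a plain C-order reshape, no data movement.
proposed = stream.reshape(n_frames, height, width)

# Current layout (time, width, height): same pixels with the spatial axes swapped,
# copied so that the array is again C-contiguous, as it would be written to disk.
current = np.ascontiguousarray(proposed.transpose(0, 2, 1))

# Strides are in bytes (uint16 = 2 bytes per pixel).
print(proposed.strides)  # (30, 10, 2): one step along width moves 2 bytes
print(current.strides)   # (30, 6, 2): one step along width (axis 1) moves 6 bytes

# Only the proposed layout stores pixels in the order they were acquired.
same_order_proposed = np.array_equal(proposed.ravel(), stream)
same_order_current = np.array_equal(current.ravel(), stream)
print(same_order_proposed, same_order_current)  # True False
```

In the proposed layout a scan line is one contiguous run in memory, so appending lines or frames as they arrive needs no reordering.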

Notably, the ndx-microscopy extension, which looks like the future of NWB's core microscopy handling, already implements the data[time, height, width] convention. @alessandratrapani might be in a better position to add the motivation here.

This proposal represents a shift from a Cartesian/plotting-centric indexing convention to a matrix/image-processing-centric convention. The Cartesian convention writes coordinates as (x, y) with x=width first and is common in graphics and plotting contexts (see OpenCV Point discussion). In contrast, the matrix convention indexes arrays as [row, column] = [height, width] and dominates in image processing and array manipulation libraries. Since OnePhotonSeries and TwoPhotonSeries store raw imaging data that users will primarily interact with through array indexing rather than plotting, adopting the matrix convention better serves the typical NWB analysis workflow.
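To make the two conventions concrete, here is a minimal sketch of how a Cartesian point (x, y) is addressed under each ordering, and how existing data could be reordered. The helper `width_height_to_height_width` is hypothetical (it is not part of PyNWB or the schema), and the sketch assumes 3D data shaped (time, width, height):

```python
import numpy as np


def width_height_to_height_width(data):
    """Hypothetical helper: reorder (time, width, height) to (time, height, width)."""
    data = np.asarray(data)
    if data.ndim != 3:
        raise ValueError("expected a 3D array shaped (time, width, height)")
    # swapaxes returns a view; no pixel values are copied or changed.
    return np.swapaxes(data, 1, 2)


n_frames, width, height = 4, 5, 3
legacy = np.zeros((n_frames, width, height), dtype=np.uint16)

# Mark one pixel by its Cartesian coordinates (x, y).
x, y = 4, 1
legacy[0, x, y] = 7  # current convention: data[time, x, y]

converted = width_height_to_height_width(legacy)
print(converted.shape)     # (4, 3, 5)
print(converted[0, y, x])  # 7, proposed convention: data[time, row, col]
```

Because the result is a view, writing it out in C-order would still require a contiguous copy, for example with np.ascontiguousarray.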

c.c. @bendichter @alessandratrapani

Links:
Indexing terminology: https://blogs.mathworks.com/loren/2007/06/21/indexing-terminology/
