[Python] PyCapsule interface for Image/Raster Data #43831

@wiredfool

Describe the usage question you have. Please include as many useful details as possible.

I'm implementing support for the Arrow PyCapsule Protocol in Pillow, as referenced here: python-pillow/Pillow#8329, implementation here: python-pillow/Pillow#8330

There are a couple of implementation questions that arise from it:

Internally, we store images as binary chunks of full raster lines, each up to 16 MB. Larger images overflow into the next chunk, and there's a variable amount of dead space between the end of the last scan line in a chunk and the 16 MB boundary. So for the simple, small image case, we can just point at this memory as the array buffer.
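To make the chunking concrete, here is a rough sketch of the arithmetic described above. It is only an illustration, not Pillow's actual allocator: the function name `block_layout`, the default of 4 bytes per pixel, and the assumption that every chunk is exactly 16 MB are all hypothetical.

```python
# Illustrative only: a model of the layout described above,
# not Pillow's real memory allocator.
BLOCK_SIZE = 16 * 1024 * 1024  # assumed 16 MB chunk size


def block_layout(width, height, bytes_per_pixel=4, block_size=BLOCK_SIZE):
    """Return (lines_per_block, n_blocks, dead_bytes_per_full_block)."""
    line_bytes = width * bytes_per_pixel
    if line_bytes > block_size:
        raise ValueError("a single scan line does not fit in one block")
    # Only whole scan lines are stored in a block.
    lines_per_block = block_size // line_bytes
    # Ceiling division: how many blocks are needed for all lines.
    n_blocks = -(-height // lines_per_block)
    # Space left between the last whole line and the block boundary.
    dead_bytes = block_size - lines_per_block * line_bytes
    return lines_per_block, n_blocks, dead_bytes


small = block_layout(1024, 768)    # fits in a single block
large = block_layout(5000, 5000)   # overflows into several blocks
```

Under these assumptions, the small image needs one block and could be exposed as a single array buffer, while the large one spans several blocks, each with some unused tail.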

Is an __arrow_c_stream__ the best way to implement what would effectively be chunked arrays? Is there a way in the protocol to fall back from __arrow_c_array__ to the stream when the array call errors or returns null? For our purposes, a stream is likely as lightweight to provide as an array.

Is there a preferred array representation of image raster data? There are a few possibilities, but I'd like to provide something that looks vaguely like a standard. FWIW, at the moment, the numpy array interface does return a shaped array, so the dimensions of the image are available.

  • Flat array arr[(y*(width)+x)*4 + channel]
  • or Fixed Pixel array arr[y*(width)+x][channel]?
  • Would it make sense to embed this into a set of FixedArrays that are a line length, arr[y][x][channel]?
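For comparison, the three candidate layouts can be sketched in plain Python. This uses nested lists as a stand-in for Arrow fixed-size lists, and the helper names `flat_index`, `as_pixels`, and `as_rows` are hypothetical; it only shows that the three indexing schemes address the same value, assuming 4 channels per pixel.

```python
def flat_index(x, y, channel, width, channels=4):
    """Index into the flat layout: arr[(y*width + x)*channels + channel]."""
    return (y * width + x) * channels + channel


def as_pixels(flat, channels=4):
    """Group a flat sequence into fixed-size pixels: arr[y*width + x][channel]."""
    return [flat[i:i + channels] for i in range(0, len(flat), channels)]


def as_rows(flat, width, channels=4):
    """Group pixels into fixed-length scan lines: arr[y][x][channel]."""
    pixels = as_pixels(flat, channels)
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


# A tiny 3x2 RGBA "image" whose values are just their flat offsets.
width, height = 3, 2
flat = list(range(width * height * 4))
pixels = as_pixels(flat)
rows = as_rows(flat, width)

x, y, c = 2, 1, 3
same_value = (
    flat[flat_index(x, y, c, width)]
    == pixels[y * width + x][c]
    == rows[y][x][c]
)
```

All three views cover the same contiguous values; they differ only in how much of the image shape is carried by the type rather than by side-channel metadata such as the width.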

Component(s)

Python
