Move from PIL to torchvision.io.decode_image

Instead of using `PIL.Image.open`, @NicolasHug has pointed out that we can move to `torchvision.io.decode_image` to speed up image processing by doing everything on pure tensors (see also the [image decoding docs](https://pytorch.org/vision/main/io.html#image-decoding)). This would also allow us to drop our explicit [PIL requirement](https://github.com/pytorch/torchtune/blob/main/pyproject.toml#L34). This should entail:

1) changing [load_image](https://github.com/pytorch/torchtune/blob/23896c30dc10c97f8baaed8841e61bc3b8f2a61c/torchtune/data/_utils.py#L47) to use [torchvision.io.decode_image](https://pytorch.org/vision/main/generated/torchvision.io.decode_image.html#torchvision.io.decode_image)
2) updating [CLIPImageTransform](https://github.com/pytorch/torchtune/blob/23896c30dc10c97f8baaed8841e61bc3b8f2a61c/torchtune/models/clip/_transform.py#L26) to accept tensors instead of `PIL.Image` (in the short term we can keep `PIL.Image` support for backwards compatibility)

Move from PIL to torchvision.io.decode_image #2303

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Move from PIL to torchvision.io.decode_image #2303

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions