CVAT cloud storage ingestion fails on JPEGs with large RECONYX MakerNote EXIF, despite valid image decoding #10570

@joshwilson-dev

Description

Actions before raising this issue

  • I searched the existing issues and did not find anything similar.
  • I read/searched the docs

Steps to Reproduce

Environment setup
CVAT running with Azure Blob Storage cloud storage attached
Images stored in Azure Blob container and referenced via CVAT cloud storage

Dataset characteristics
Images originate from a RECONYX HYPERFIRE HF4K camera trap system.

Failing behaviour
Attach Azure Blob Storage as CVAT cloud storage
Add dataset containing JPEG images directly from camera
Create a task from cloud storage
CVAT fails during ingestion with:
utils.dataset_manifest.errors.InvalidImageError:
failed to parse image file '.../RCNX0001.JPG'

Important observation
The same images:

  • open successfully in Pillow
  • open successfully in OpenCV (cv2.imread)
  • are valid baseline JPEGs (RGB, non-progressive)

They fail only inside the CVAT ingestion pipeline.

Minimal reproduction case
Affected images contain a large EXIF block. In the failing files, either:

  • the MakerNote is ≈ 33 KB (RECONYX proprietary metadata), or
  • the embedded thumbnail is large.

The GPS section is present but small, and the JPEG encoding is otherwise valid.
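
To quantify "large EXIF block" for a given file, the JPEG header can be walked with the standard library alone. The following is a minimal sketch, not part of CVAT: the function name `summarize_jpeg_header` and the returned keys are hypothetical. It reports the total size of the Exif APP1 payload and the byte offset of the first start-of-frame (SOF) segment, i.e. how much metadata precedes the image dimensions.

```python
import struct

# Markers that carry no length field and no payload.
STANDALONE = {0x01, 0xD8, 0xD9} | set(range(0xD0, 0xD8))
# Start-of-frame markers (0xC4, 0xC8 and 0xCC are not frame headers).
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}


def summarize_jpeg_header(data: bytes) -> dict:
    """Hypothetical helper: measure Exif size and locate the first SOF segment."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    exif_payload_bytes = 0
    sof_offset = None
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker
            pos += 1
            continue
        if marker in STANDALONE:
            pos += 2
            continue
        # The 16-bit length counts its own two bytes plus the payload.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker == 0xE1 and data[pos + 4:pos + 10] == b"Exif\x00\x00":
            exif_payload_bytes += length - 2
        elif marker in SOF_MARKERS and sof_offset is None:
            sof_offset = pos
        elif marker == 0xDA:  # start of scan: compressed data follows
            break
        pos += 2 + length
    return {"exif_payload_bytes": exif_payload_bytes, "sof_offset": sof_offset}
```

For example, `summarize_jpeg_header(open("input.jpg", "rb").read())` can be run on a failing and a stripped copy of the same image to compare the two values. Whether either value is what trips the CVAT ingestion pipeline is not established here.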

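Removing only the MakerNote (or the thumbnail, if that is the large part) resolves the issue:

from PIL import Image
import piexif

img = Image.open("input.jpg")
exif_dict = piexif.load(img.info["exif"])
# Drop the MakerNote tag; pop() avoids a KeyError on images that lack it.
exif_dict["Exif"].pop(piexif.ExifIFD.MakerNote, None)
# If the embedded thumbnail is the large part, drop it as well:
# exif_dict["thumbnail"] = None
exif_bytes = piexif.dump(exif_dict)
# Note: saving through Pillow re-encodes the JPEG.
img.save("output.jpg", exif=exif_bytes)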

Expected Behavior

CVAT should successfully ingest and index valid JPEG images regardless of EXIF metadata size or vendor-specific MakerNote content.

EXIF metadata (including large or proprietary MakerNote blocks) should not prevent image decoding or manifest generation.

Possible Solution

No response

Context

I am working with large-scale camera trap datasets stored in Azure Blob Storage and using CVAT for annotation.

These datasets contain images from RECONYX HF4K trail cameras which embed extensive proprietary EXIF metadata (especially MakerNote blocks).

This issue blocks ingestion of otherwise valid imagery into CVAT cloud storage workflows.

To work around this issue we strip the MakerNote blocks before upload, but we would prefer not to have to do this and lose that metadata.

Ideally, only problematic metadata fields should be ignored during ingestion.
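
As an illustration of ignoring metadata during ingestion, image dimensions can be read from the frame header while every other segment (including APP1/Exif and its MakerNote) is skipped unparsed. This is a minimal standard-library sketch under that assumption, not CVAT's actual implementation; `jpeg_dimensions` is a hypothetical name.

```python
import struct

# Markers that carry no length field and no payload.
STANDALONE = {0x01, 0xD8, 0xD9} | set(range(0xD0, 0xD8))
# Start-of-frame markers (0xC4, 0xC8 and 0xCC are not frame headers).
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}


def jpeg_dimensions(data: bytes) -> tuple:
    """Hypothetical helper: return (width, height) without parsing any metadata."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker
            pos += 1
            continue
        if marker in STANDALONE:
            pos += 2
            continue
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker in SOF_MARKERS:
            if pos + 9 > len(data):
                raise ValueError("frame header is truncated")
            # Layout: marker(2) length(2) precision(1) height(2) width(2)
            height, width = struct.unpack(">HH", data[pos + 5:pos + 9])
            return width, height
        if marker == 0xDA:  # start of scan reached without a frame header
            break
        pos += 2 + length  # skip the segment, whatever it contains
    raise ValueError("no SOF segment found in the supplied bytes")
```

The function raises `ValueError` when the supplied bytes end before the frame header, which is what any reader given only a leading slice of a file with a large EXIF block would run into. Whether that is the cause of the failure reported in this issue is an assumption and has not been verified.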

Environment

CVAT version: CVAT Online
Storage backend: Azure Blob Storage
Client OS: Windows 11
Browser: Edge
Camera source: RECONYX HYPERFIRE HF4K
Image format: JPEG (RGB, non-progressive)
