Handle PIL UnidentifiedImageError exception when running cleanvision on local image folder dataset #222

Open
@sanjanag

Description

Stack trace

Traceback (most recent call last):
  File "/home/sanjana/code/sandbox/zero_shot_image_issues/nsfw_cleanvision.py", line 5, in <module>
    lab.find_issues(issue_types={"odd_size":{}})
  File "/home/sanjana/code/sandbox/.venv/lib/python3.10/site-packages/cleanvision/imagelab.py", line 273, in find_issues
    issue_manager.find_issues(
  File "/home/sanjana/code/sandbox/.venv/lib/python3.10/site-packages/cleanvision/issue_managers/image_property_issue_manager.py", line 162, in find_issues
    results = list(
  File "/home/sanjana/code/sandbox/.venv/lib/python3.10/site-packages/tqdm/std.py", line 1178, in __iter__
    for obj in iterable:
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 448, in <genexpr>
    return (item for chunk in result for item in chunk)
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 870, in next
    raise value
PIL.UnidentifiedImageError: cannot identify image file <fsspec.implementations.local.LocalFileOpener object at 0x7f1c3dbebd90>

Steps to reproduce

Run cleanvision on a local image folder that contains at least one corrupted file, i.e. a file that cannot be loaded with Image.open(path).
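A minimal sketch of such a dataset, for anyone trying to reproduce this. The helper names `make_dataset_with_bad_file` and `run_cleanvision` are hypothetical, and the `Imagelab(data_path=...)` constructor call is an assumption about how `lab` was created; only the `find_issues` call is taken from the traceback above.

```python
import tempfile
from pathlib import Path

from PIL import Image, UnidentifiedImageError


def make_dataset_with_bad_file(folder: Path) -> Path:
    """Write one valid PNG and one corrupted ".jpg" into folder; return the bad path."""
    folder.mkdir(parents=True, exist_ok=True)
    # A small valid image so the dataset is not made up only of bad files.
    Image.new("RGB", (8, 8), color="white").save(folder / "good.png")
    # A file with an image extension whose bytes are not image data.
    bad_path = folder / "bad.jpg"
    bad_path.write_bytes(b"this is not image data")
    return bad_path


def run_cleanvision(folder: Path) -> None:
    """Run the call from the traceback on folder (requires cleanvision)."""
    # Imported lazily so the helper above works without cleanvision installed.
    from cleanvision.imagelab import Imagelab

    # Assumption: the Imagelab was built from a local folder path.
    lab = Imagelab(data_path=str(folder))
    lab.find_issues(issue_types={"odd_size": {}})
```

Calling `make_dataset_with_bad_file(Path(tempfile.mkdtemp()))` and then `run_cleanvision` on the same folder should hit the error, since `Image.open` on `bad.jpg` raises `UnidentifiedImageError`.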

Intended workflow

The code should skip any file that raises an error in Image.open() and continue processing the rest of the dataset. The skipped files should be reported in the logs.
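One possible shape for this behaviour, as a sketch only. It is not cleanvision's code: `collect_image_sizes` is a hypothetical helper that stands in for the per-image property computation, and it uses the standard `logging` module for the report.

```python
import logging
from pathlib import Path
from typing import Dict, List, Tuple

from PIL import Image, UnidentifiedImageError

logger = logging.getLogger(__name__)


def collect_image_sizes(
    paths: List[Path],
) -> Tuple[Dict[str, Tuple[int, int]], List[str]]:
    """Return (sizes of readable images, paths of files that were skipped)."""
    sizes: Dict[str, Tuple[int, int]] = {}
    skipped: List[str] = []
    for path in paths:
        try:
            with Image.open(path) as image:
                sizes[str(path)] = image.size
        # UnidentifiedImageError is a subclass of OSError; it is listed
        # explicitly only to make the intent clear.
        except (UnidentifiedImageError, OSError) as err:
            logger.warning("Skipping unreadable image %s: %s", path, err)
            skipped.append(str(path))
    return sizes, skipped
```

Because the traceback shows the exception being re-raised from a `multiprocessing` pool, the `try`/`except` would presumably need to live inside the worker function, with the skipped paths returned to the parent process for logging.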


Labels: bug, help wanted
