
pyarrow.lib.ArrowMemoryError: realloc of size 2172846080 failed #443

@FFFFFhh

Hi, when I use img2dataset to download the dataset, it raises the following error:

Traceback (most recent call last):
  File "/hdd/u202320081001041/CapsFusion/download_file.py", line 22, in <module>
    number_sample_per_shard=1000,
  File "/home/u202320081001041/miniconda3/envs/XVLM/lib/python3.7/site-packages/img2dataset/main.py", line 267, in download
    max_shard_retry,
  File "/home/u202320081001041/miniconda3/envs/XVLM/lib/python3.7/site-packages/img2dataset/distributor.py", line 36, in multiprocessing_distributor
    failed_shards = run(reader)
  File "/home/u202320081001041/miniconda3/envs/XVLM/lib/python3.7/site-packages/img2dataset/distributor.py", line 31, in run
    for status, row in tqdm(process_pool.imap_unordered(downloader, gen)):
  File "/home/u202320081001041/miniconda3/envs/XVLM/lib/python3.7/site-packages/tqdm/std.py", line 1181, in __iter__
    for obj in iterable:
  File "/home/u202320081001041/miniconda3/envs/XVLM/lib/python3.7/multiprocessing/pool.py", line 748, in next
    raise value
pyarrow.lib.ArrowMemoryError: realloc of size 2172846080 failed

Could you tell me where the problem is?
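
For scale, here is a quick sanity check on the number in the error message. This is only a sketch: the helper name `describe_failed_allocation` is made up for illustration. The per-sample figure assumes the failed buffer held one full shard of 1000 samples (the `number_sample_per_shard` value in the traceback), which has not been verified.

```python
# Size copied from the error message: "realloc of size 2172846080 failed".
FAILED_REALLOC_BYTES = 2_172_846_080
# Shard size copied from the call shown in the traceback.
SAMPLES_PER_SHARD = 1000

GIB = 1024 ** 3


def describe_failed_allocation(n_bytes, samples_per_shard):
    """Hypothetical helper: put the failed allocation size into perspective."""
    return {
        # Size of the single contiguous buffer that realloc could not provide.
        "gib": n_bytes / GIB,
        # Whether the request is larger than 2**31 bytes (2 GiB).
        "exceeds_2_gib": n_bytes > 2 ** 31,
        # ASSUMPTION: the buffer covers exactly one shard of samples.
        "avg_bytes_per_sample": n_bytes / samples_per_shard,
    }


if __name__ == "__main__":
    info = describe_failed_allocation(FAILED_REALLOC_BYTES, SAMPLES_PER_SHARD)
    print(f"requested buffer: {info['gib']:.2f} GiB")
    print(f"larger than 2 GiB: {info['exceeds_2_gib']}")
    print(f"average per sample (if one shard): {info['avg_bytes_per_sample']:.0f} bytes")
```

So the request is for a single buffer slightly over 2 GiB. If the one-shard assumption holds, that is roughly 2 MB per sample.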
