
Multi-task processing creates the same tmp file multiple times and processes the same raster multiple times #70

@justRishi


Problem

If RASTER_USE_CELERY = True and RASTER_PARSE_SINGLE_TASK is False or not set, then
a temp file is created multiple times in def open_raster_file of RasterLayerParser in parser.py.
Also, when the raster is not in the right projection, the reprojection is done multiple times.

Why this is a problem

Big raster files are, in my case, copied 4 times, processed by GDAL 4 times, and sometimes (when not in the right projection) reprojected 4 times.

How tested

By adding self.log calls to print out each tmp file creation (the resulting log output was attached as an image).

How to mitigate

Set RASTER_PARSE_SINGLE_TASK = True in settings. This means the raster file will not be processed concurrently.
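
For reference, the mitigation amounts to a settings fragment like the one below. Only the two setting names come from this issue; where the fragment lives (a Django settings.py) is an assumption.

```python
# settings.py (assumed location)

# Parse rasters asynchronously through Celery.
RASTER_USE_CELERY = True

# Run the whole parse in a single task, so the source file is copied,
# opened and reprojected only once. Trade-off: no concurrency within
# the processing of one raster file.
RASTER_PARSE_SINGLE_TASK = True
```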

Possible solution to process in parallel without duplicating work

  1. Make sure only 1 tmp folder is created per raster file:
    this line in parser.py should change: self.tmpdir = tempfile.mkdtemp(dir=raster_workdir) (mkdtemp always creates a new, unique folder, so every task gets its own copy)
  2. self.dataset in parser.py (in class RasterLayerParser) should be shared by all parallel tasks for the same raster file
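
A minimal sketch of point 1, not the actual parser.py code: instead of tempfile.mkdtemp, each task derives the same folder name from the raster layer id and copies the source file only if it is not there yet. The helper names shared_tmpdir and copy_once, the folder naming scheme and the layer_id argument are hypothetical, and the sketch assumes all tasks see the same raster_workdir on a shared filesystem.

```python
import os
import shutil
import tempfile


def shared_tmpdir(raster_workdir, layer_id):
    """Return one deterministic tmp folder per raster layer (hypothetical helper)."""
    path = os.path.join(raster_workdir, "rasterlayer_{}".format(layer_id))
    # exist_ok=True makes this safe to call from every parallel task.
    os.makedirs(path, exist_ok=True)
    return path


def copy_once(source_path, tmpdir):
    """Copy source_path into tmpdir unless an earlier task already did.

    Returns (target_path, copied), where copied is False when the file
    was already present and no copy was made.
    """
    target = os.path.join(tmpdir, os.path.basename(source_path))
    if os.path.exists(target):
        return target, False
    # Copy to a uniquely named partial file first, then rename it into
    # place, so other tasks never see a half-written raster.
    fd, partial = tempfile.mkstemp(dir=tmpdir, suffix=".part")
    os.close(fd)
    shutil.copyfile(source_path, partial)
    os.replace(partial, target)
    return target, True
```

Two tasks that start at exactly the same moment can still both copy the file; the result stays correct because the rename is atomic, but a lock or a dedicated preparation task would be needed to rule out the duplicate work entirely. The same idea could be applied to the reprojected file, which is what point 2 is after, since an open GDAL dataset object cannot simply be handed from one worker process to another.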
