`to_numpy` can be optimized #984

@kraaijenbrink reported that reading a single cell's value from a partitioned array is faster with a `from_numpy`, `where`, `sum` combination than with `to_numpy`. If that is correct, `to_numpy` has room for optimization.

Example code:

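```python
import lue.framework as lfr
import numpy as np

# Assumes `array` is an existing partitioned array, and that `row`, `col`,
# `array_shape` and `partition_shape` are defined elsewhere

# Slow: full raster materialization per point/sample
value = lfr.to_numpy(array)[row, col]

# Faster workaround with current Python API: mask out all cells but one
mask_np = np.zeros(array_shape, dtype=np.uint8)
mask_np[row, col] = 1
mask = lfr.from_numpy(mask_np, partition_shape=partition_shape)

# The sum over the masked array equals the selected cell's value
value_future = lfr.sum(lfr.where(mask, array, 0.0)).future
value = value_future.get()  # defer this as late as possible
```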
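The workaround depends on a simple identity: if every cell except one is replaced by zero, the sum of the result is that cell's value. A minimal NumPy-only sketch of the identity follows. It does not use LUE, and the helper name `cell_value_via_mask` and the sample data are made up for illustration.

```python
import numpy as np


def cell_value_via_mask(array: np.ndarray, row: int, col: int) -> float:
    """Return array[row, col] using the mask / where / sum pattern."""
    # Mask that is 1 at the requested cell and 0 everywhere else
    mask = np.zeros(array.shape, dtype=np.uint8)
    mask[row, col] = 1

    # Every other cell contributes 0.0, so the sum is the selected value
    return float(np.sum(np.where(mask, array, 0.0)))


array = np.arange(12, dtype=np.float64).reshape(3, 4)
assert cell_value_via_mask(array, 1, 2) == array[1, 2]
```

In the partitioned case the reduction yields a single scalar, which presumably explains why it avoids the cost of materializing the whole raster.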

Options:
- `to_numpy` currently waits for each partition in turn. This can be improved by handling partitions as soon as they become ready.
- Write partition data into the NumPy buffer in parallel.
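The first option can be sketched in plain Python. This is only an illustration of the idea, not how `to_numpy` is implemented: `concurrent.futures` stands in for the framework's own futures, and `compute_partition` and `gather` are hypothetical helpers that simulate partitions becoming ready at different times.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

import numpy as np


def compute_partition(shape, value):
    """Hypothetical stand-in for a partition that is ready after some delay."""
    time.sleep(random.uniform(0.0, 0.05))
    return np.full(shape, value, dtype=np.float64)


def gather(array_shape, partition_shape):
    """Assemble partitions into one buffer, in order of completion."""
    nr_rows, nr_cols = array_shape
    p_rows, p_cols = partition_shape
    result = np.empty(array_shape, dtype=np.float64)

    with ThreadPoolExecutor() as pool:
        # Map each future to the offset of its partition in the buffer
        futures = {}
        for r in range(0, nr_rows, p_rows):
            for c in range(0, nr_cols, p_cols):
                shape = (min(p_rows, nr_rows - r), min(p_cols, nr_cols - c))
                value = float(r * nr_cols + c)
                futures[pool.submit(compute_partition, shape, value)] = (r, c)

        # Copy each partition once it is ready, not in submission order
        for future in as_completed(futures):
            r, c = futures[future]
            data = future.result()
            result[r:r + data.shape[0], c:c + data.shape[1]] = data

    return result


result = gather((5, 7), (2, 3))
```

Because partitions cover disjoint slices of the buffer, the copy step could also be moved into the tasks themselves, which is roughly what the second option amounts to.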
