Description
What I need help with / What I was wondering
I need to build a large dataset of imagery with more than 3 channels (multi-spectral satellite imagery), so I'm relying on the `tfds.features.Tensor` feature connector. Since writing the data uncompressed is highly inefficient, I'm using `tfds.features.Encoding.ZLIB` for compression.
However, this compression step becomes the bottleneck of my dataset build because it is single-threaded, and as a result the build takes longer than a month.
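To illustrate the pattern I mean, here is a minimal standalone sketch that uses Python's standard `zlib` module rather than TFDS itself. The tile layout (13 bands of 64x64 16-bit pixels) and the helper names `make_tile` and `compress_serial` are hypothetical and only serve the illustration; the point is that every example is compressed one after another on a single core.

```python
import zlib

# Hypothetical tile layout (an assumption, not my real data):
# 13 spectral bands of 64x64 pixels, 2 bytes per pixel.
BANDS, HEIGHT, WIDTH, BYTES_PER_PIXEL = 13, 64, 64, 2


def make_tile(seed: int) -> bytes:
    """Build a synthetic, compressible tile as raw bytes."""
    row = bytes((seed + i) % 256 for i in range(WIDTH * BYTES_PER_PIXEL))
    return row * (BANDS * HEIGHT)


def compress_serial(tiles: list[bytes]) -> list[bytes]:
    """Compress each tile in turn, so only one core is busy at a time."""
    return [zlib.compress(tile) for tile in tiles]


tiles = [make_tile(seed) for seed in range(8)]
compressed = compress_serial(tiles)
```

With millions of real tiles, a loop like `compress_serial` is where all the build time goes in my case.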
What I've tried so far
I read through the docs and also checked the `tf.io` namespace for possible workarounds.
It would be nice if...
- Is there any way of speeding up the encoding/compression of the examples by using multiple cores?
- Are there plans to support a faster compression method than `ZLIB` for generic Tensor features?