Description
Hello!
I just wanted to share some data and thoughts that stem from my experiments with using different table sizes in `CompressedBlock`. This is very exploratory/preliminary and I hope that we can discuss various ideas and directions before we commit to any particular solution (a random internet blogpost tells me that this may result in a better outcome :-P).
One experiment I've done is using the default table sizes (4096/512) when the estimated image size in bytes (width x height x samples-per-pixel x bits-per-sample / 8) is above a certain threshold, but using smaller tables (512/128) otherwise (see the Chromium CL here). The results I've got (see section "Measurements 2024-12-30 - 2025-01-02" in my doc) have been positive, but the magnitude of the improvement has been disappointing to me.
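For concreteness, the heuristic boils down to something like the sketch below (a minimal illustration only - the function names and the threshold constant are made up here, and the actual cutoff value is the one in the CL):

```rust
/// Estimated decompressed image size in bytes:
/// width x height x samples-per-pixel x bits-per-sample / 8.
fn estimated_image_bytes(width: u64, height: u64, samples_per_pixel: u64, bits_per_sample: u64) -> u64 {
    width * height * samples_per_pixel * bits_per_sample / 8
}

/// Hypothetical cutoff - the experiment only cares about "above" vs "below"
/// a certain threshold; this concrete value is illustrative, not the CL's.
const TABLE_SIZE_THRESHOLD: u64 = 64 * 1024;

/// Returns the pair of Huffman table sizes to use for decoding.
fn pick_table_sizes(estimated_bytes: u64) -> (usize, usize) {
    if estimated_bytes > TABLE_SIZE_THRESHOLD {
        (4096, 512) // default table sizes
    } else {
        (512, 128) // smaller tables for (estimated) small images
    }
}
```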
The results above were also surprisingly flat - I had expected that small images would significantly benefit from small tables (and big images from big/default tables). One hypothesis that could explain this is that image size is not a good predictor of the size of zlib-compressed blocks - e.g. maybe some big images use lots of relatively short compressed zlib blocks. So I tried another experiment to gather this kind of data on my old 2023 corpus of ~1650 PNG images from the top 500 websites (see also the tool bits here and here) - the results can be found in a spreadsheet here. I think the following bits of data are interesting:
- There is quite a wide range of compressed block sizes. Even when looking at the 100 biggest images, the block sizes range from ~3kB (at the 10%-ile) to ~44kB (at the 90%-ile).
- Some images use a mix of compressed blocks with 1) fixed/default-symbol-encoding and 2) custom Huffman trees.
- Some images use uncompressed blocks.
I also think that it is a bit icky that in my experiments the public API of `fdeflate` "leaks" the implementation detail of Huffman table sizes. One idea to avoid this is to:
- Decouple `CompressedBlock` and `fn read_compressed` from `Decompressor`, so that `Decompressor` can internally choose to use small or big table sizes (with dynamic dispatch via something like `Box<dyn CompressedBlockRead[er]>`). I think that moving `fn read_compressed` to `impl ... CompressedBlock` can be made easier by packaging/encapsulating bits of `Decompressor` (to make it easier to pass them as `&mut` references to `fn read_compressed`) - for example, maybe `buffer` + `nbits` can become fields of a `BitBuffer` struct, and `queued_rle` + `queued_backref` can become variants of an `enum QueuedOutput` (see the sketch after this list).
- Add `fdeflate::Decompressor::set_output_size_estimate(estimate: usize)`, which can be used to decide the initial table sizes. (Note that `png::ZlibStream` already has such an estimate available - it calls it `max_total_output`.)
- Track the size of the last compressed block and switch the table sizes if that size is below/above a certain threshold.
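To make these ideas a bit more concrete, here is a minimal Rust sketch of what the decoupling could look like. Everything below is an illustrative guess rather than fdeflate's actual internals: the field layouts, `TableReader`, `BIG_TABLE_THRESHOLD`, and the helper names are made up, and `set_output_size_estimate` is the *proposed* API from the list above, not something fdeflate has today.

```rust
/// Stand-in for fdeflate's error type; only here so the sketch compiles.
struct DecompressionError;

/// `buffer` + `nbits` packaged up so the bit-reading state can be passed
/// around as a single `&mut` reference.
struct BitBuffer {
    buffer: u64,
    nbits: u8,
}

/// `queued_rle` + `queued_backref` packaged up as variants of one enum.
enum QueuedOutput {
    None,
    Rle { data: u8, length: usize },
    Backref { distance: usize, length: usize },
}

/// Counterpart of today's `fn read_compressed`, decoupled from `Decompressor`
/// so that readers with different table sizes can live behind one trait object.
trait CompressedBlockReader {
    fn read_compressed(
        &mut self,
        bits: &mut BitBuffer,
        queued: &mut QueuedOutput,
        output: &mut [u8],
    ) -> Result<usize, DecompressionError>;
}

/// Placeholder for `CompressedBlock` parameterized over its two Huffman table
/// sizes (e.g. 4096/512 vs 512/128).
struct TableReader<const A: usize, const B: usize>;

impl<const A: usize, const B: usize> CompressedBlockReader for TableReader<A, B> {
    fn read_compressed(
        &mut self,
        _bits: &mut BitBuffer,
        _queued: &mut QueuedOutput,
        _output: &mut [u8],
    ) -> Result<usize, DecompressionError> {
        unimplemented!("sketch only - the real body would decode one block")
    }
}

/// Made-up cutoff; the right value would have to come from measurements.
const BIG_TABLE_THRESHOLD: usize = 64 * 1024;

struct Decompressor {
    bits: BitBuffer,
    queued: QueuedOutput,
    // Dynamic dispatch keeps the table-size choice an internal detail rather
    // than something that leaks into the public API.
    block_reader: Box<dyn CompressedBlockReader>,
    // ... other fields ...
}

impl Decompressor {
    /// Idea 2: let the caller hint at the expected total output size so the
    /// initial table sizes can be chosen accordingly. `png::ZlibStream` could
    /// pass its `max_total_output` here.
    pub fn set_output_size_estimate(&mut self, estimate: usize) {
        self.block_reader = Self::pick_reader(estimate);
    }

    /// Idea 3: track the size of the last compressed block and switch the
    /// table sizes if it ends up below/above the threshold.
    fn after_block(&mut self, last_block_size: usize) {
        self.block_reader = Self::pick_reader(last_block_size);
    }

    fn pick_reader(size_hint: usize) -> Box<dyn CompressedBlockReader> {
        if size_hint >= BIG_TABLE_THRESHOLD {
            Box::new(TableReader::<4096, 512>)
        } else {
            Box::new(TableReader::<512, 128>)
        }
    }
}
```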