Decoding animated WebP is 4x slower than libwebp-sys + webp-animation #119
Description
In image v0.25.4 and image-webp v0.2.0, decoding the attached animated WebP is 4x slower than using libwebp-sys + webp-animation: sample.zip
image
use std::error::Error;

use image::{codecs::webp::WebPDecoder, AnimationDecoder, ImageReader};

fn main() -> Result<(), Box<dyn Error>> {
    // Path to the animated WebP file, passed as the first argument.
    let input = std::env::args().nth(1).unwrap();
    let reader = ImageReader::open(input)?.into_inner();
    let decoder = WebPDecoder::new(reader)?;
    // Decode every frame and discard the result.
    let mut iter = decoder.into_frames();
    while let Some(_frame) = iter.next() {}
    Ok(())
}
hyperfine results:
Time (mean ± σ): 411.9 ms ± 3.9 ms [User: 372.2 ms, System: 38.8 ms]
Range (min … max): 407.6 ms … 416.8 ms 10 runs
libwebp-sys + webp-animation
use std::error::Error;

use webp_animation::prelude::*;

fn main() -> Result<(), Box<dyn Error>> {
    // Path to the animated WebP file, passed as the first argument.
    let input = std::env::args().nth(1).unwrap();
    let buffer = std::fs::read(input).unwrap();
    let decoder = Decoder::new(&buffer).unwrap();
    // Decode every frame and discard the result.
    let mut iter = decoder.into_iter();
    while let Some(_frame) = iter.next() {}
    Ok(())
}
hyperfine results:
Time (mean ± σ): 95.9 ms ± 0.4 ms [User: 128.7 ms, System: 7.4 ms]
Range (min … max): 95.3 ms … 96.7 ms 30 runs
Analysis
The webp-animation profile shows a bit of multi-threading, with user time being longer than the total execution time, but even accounting for that, image-webp is 3x slower.
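To make the comparison concrete, the ratios can be recomputed from the two hyperfine outputs above. The sketch below is only arithmetic on the reported mean values; the `Timing` struct and the function names are made up for illustration and are not part of hyperfine or of either crate.

```rust
/// Mean timings in milliseconds, copied from the hyperfine output above.
/// `Timing` is a hypothetical helper type used only for this calculation.
#[derive(Clone, Copy)]
pub struct Timing {
    pub wall_ms: f64,
    pub user_ms: f64,
    pub system_ms: f64,
}

impl Timing {
    /// Total CPU time (user + system), which counts work done on all threads.
    pub fn cpu_ms(&self) -> f64 {
        self.user_ms + self.system_ms
    }
}

/// image + image-webp run.
pub const IMAGE: Timing = Timing { wall_ms: 411.9, user_ms: 372.2, system_ms: 38.8 };
/// libwebp-sys + webp-animation run.
pub const LIBWEBP: Timing = Timing { wall_ms: 95.9, user_ms: 128.7, system_ms: 7.4 };

/// Ratio of wall-clock time: what the user actually waits for.
pub fn wall_ratio(slow: Timing, fast: Timing) -> f64 {
    slow.wall_ms / fast.wall_ms
}

/// Ratio of CPU time: removes the advantage gained from multi-threading.
pub fn cpu_ratio(slow: Timing, fast: Timing) -> f64 {
    slow.cpu_ms() / fast.cpu_ms()
}
```

With these inputs, `wall_ratio(IMAGE, LIBWEBP)` comes out at roughly 4.3 and `cpu_ratio(IMAGE, LIBWEBP)` at roughly 3.0, which matches the "4x" and "3x" figures quoted in this issue.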
Breakdown of where the time is spent in image, recorded by samply: https://share.firefox.dev/4fc3utg
The greatest contributors seem to be image_webp::vp8::Vp8Decoder::decode_frame (48%), image_webp::extended::do_alpha_blending (20%), and image_webp::vp8::Frame::fill_rgba (16%).
Within decode_frame, the biggest contributor is image_webp::vp8::Vp8Decoder::read_coefficients (12% self time, 32% total time), and the code of that function looks like it could be optimized further, for example by reducing bounds checks. #71 is also relevant, but only accounts for 20% of the total time.
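As a rough illustration of what "reducing bounds checks" can look like in Rust, here is a minimal sketch. It is not the code from image-webp: the function names and the `ZIGZAG` table layout are assumptions chosen for the example, and whether the optimizer actually removes the checks has to be confirmed by inspecting the generated assembly or by profiling.

```rust
/// Scan order for a 4x4 coefficient block (illustrative table, not taken
/// from image-webp).
pub const ZIGZAG: [u8; 16] = [0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15];

/// Baseline: both slices have a length unknown at compile time, so each
/// indexing operation may carry a runtime bounds check.
pub fn place_checked(coeffs: &[i32], block: &mut [i32]) {
    for i in 0..coeffs.len() {
        block[ZIGZAG[i] as usize] = coeffs[i];
    }
}

/// Variant that gives the compiler more information:
/// - fixed-size arrays make the lengths part of the type,
/// - iterating with `zip` avoids indexing `ZIGZAG` and `coeffs` at all,
/// - masking the position with 15 keeps the index provably below 16.
pub fn place_fixed(coeffs: &[i32; 16], block: &mut [i32; 16]) {
    for (&pos, &c) in ZIGZAG.iter().zip(coeffs.iter()) {
        block[(pos & 15) as usize] = c;
    }
}
```

Both functions produce the same output for a full 16-coefficient block. The second form trades flexibility (it only accepts exactly 16 coefficients) for indexing that the optimizer can more easily prove to be in range.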