-
-
Notifications
You must be signed in to change notification settings - Fork 865
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
V3: Gracefully handle LZW overflows #2880
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 3 out of 8 changed files in this pull request and generated no comments.
Files not reviewed (5)
- tests/Images/External/ReferenceOutput/GifDecoderTests/Issue2859_LZWPixelStackOverflow_Rgba32_issue_2859_A.gif/00.png: Language not supported
- tests/Images/External/ReferenceOutput/GifDecoderTests/Issue2859_LZWPixelStackOverflow_Rgba32_issue_2859_B.gif/00.png: Language not supported
- tests/Images/External/ReferenceOutput/GifDecoderTests/Issue2859_LZWPixelStackOverflow_Rgba32_issue_2859_B.gif/01.png: Language not supported
- tests/Images/Input/Gif/issues/issue_2859_A.gif: Language not supported
- tests/Images/Input/Gif/issues/issue_2859_B.gif: Language not supported
@@ -204,7 +205,7 @@ public void DecodePixelRow(Span<byte> indices) | |||
this.code = this.oldCode; | |||
} | |||
|
|||
while (this.code > this.clearCode) | |||
while (this.code > this.clearCode && this.top < maxTop) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we are here, can we also un-Unsafe the methods touched?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only if we can get the same performance out. We're hitting two separate spans inside a loop which is will definitely incur bounds checks costs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can benchmark that. IMO it was a big mistake to start using Unsafe
as default way of indexing things, since it opens us up to security risks and it should be our goal to significantly reduce the amount of Unsafe usage across the codebase. A gain of few percents is not worth the risk, it's better to find algorithmic optimizations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@antonfirsov I've updated the LzwDecoder
with the best I could do performance-wise using safe code.
I'm OK with the performance hit given the much-improved security. Given that System.Drawing doesn't actually fully decode the image I'm still happy with the the performance of the decoder.
Target Branch
BenchmarkDotNet=v0.13.0, OS=Windows 10.0.26100
11th Gen Intel Core i7-11370H 3.30GHz, 1 CPU, 8 logical and 4 physical cores
.NET SDK=9.0.200-preview.0.25057.12
[Host] : .NET 7.0.20 (7.0.2024.26716), X64 RyuJIT
Job-AUXYJW : .NET 6.0.36 (6.0.3624.51421), X64 RyuJIT
Runtime=.NET 6.0 Arguments=/p:DebugType=portable IterationCount=3
LaunchCount=1 WarmupCount=3
Method | TestImage | Mean | Error | StdDev | Ratio | RatioSD | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|---|
'System.Drawing Gif' | Gif/rings.gif | 294.7 us | 100.98 us | 5.53 us | 1.00 | 0.00 | - | - | - | 168 B |
'ImageSharp Gif' | Gif/rings.gif | 410.6 us | 87.95 us | 4.82 us | 1.39 | 0.03 | 0.9766 | - | - | 7,557 B |
This PR
Method | TestImage | Mean | Error | StdDev | Ratio | RatioSD | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|---|
'System.Drawing Gif' | Gif/rings.gif | 299.0 us | 189.8 us | 10.41 us | 1.00 | 0.00 | - | - | - | 168 B |
'ImageSharp Gif' | Gif/rings.gif | 486.9 us | 153.3 us | 8.40 us | 1.63 | 0.06 | 0.9766 | - | - | 7,715 B |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. The ~15-20% regression seems somewhat worse than what I expected, but I think we can improve on that by micro-optimizations.
Mentioning a few suggestions I found, but they might be not the most efficient ideas, I'm OK adding them in another PR if you prefer so.
The ideal approach would be to profile the code anyways. If that detects some indexing bottlenecks which can't be eliminated by other means, I wouldn't be against limited surgical applications of Unsafe, but those should come with a very mindful, well-documented analysis of the code arguing it's safe.
while (code > clearCode && top < maxTop) | ||
{ | ||
Unsafe.Add(ref pixelStackRef, (uint)this.top++) = Unsafe.Add(ref suffixRef, (uint)this.code); | ||
this.code = Unsafe.Add(ref prefixRef, (uint)this.code); | ||
pixelStack[top++] = suffix[code]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pixelStack
can be sliced down here to turn this into a for
loop that runs until pixelStackSlice.Length
.
prefix[availableCode] = oldCode; | ||
suffix[availableCode] = first; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can slice down one of these spans.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I understand you and had a look. It seems like this would make the code a little more difficult to follow so have left it.
xyz++; | ||
Unsafe.Add(ref pixelsRowRef, (uint)x++) = (byte)Unsafe.Add(ref pixelStackRef, (uint)this.top); | ||
// Clear missing pixels. | ||
indices[xyz++] = (byte)pixelStack[top]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since xyz seems to run till indices.Length
this can be a straightforward for-loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't always increment on each iteration so that's not possible.
} | ||
|
||
// Pop a pixel off the pixel stack. | ||
this.top--; | ||
top--; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a some logical guarantee that top
doesn't run below zero?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As soon as we hit zero we enter the condition above. In there I believe there's at least one condition that increments top
by one. The worst thing that could ever happen if top
fals below zero is that we skip to the end of the pixel indices.
int max = Math.Min(this.clearCode, suffix.Length); | ||
for (this.code = 0; this.code < max; this.code++) | ||
{ | ||
Unsafe.Add(ref suffixRef, (uint)this.code) = (byte)this.code; | ||
suffix[this.code] = (byte)this.code; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could run on a slice of suffix
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or if there is an upper limit to Math.Min(this.clearCode, suffix.Length)
, it can be even copied out from a statically initialized buffer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clear code maxes out at 256. We can certainly slice here.
@antonfirsov Managed to squeeze a little more performance out of this. Still haven't done any profiling though. Will merge once built.
|
Prerequisites
Description
The issue is caused by using the full capacity of the pixel stack for intermediate pushes without reserving a slot for the final push, which leads to an overflow. By capping out the max index we can skip corrupt blocks.
Fixes #2859