- 
                Notifications
    You must be signed in to change notification settings 
- Fork 2.3k
Add Controls to Constrain Window Size to Protocol Limits #4261
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
Add Controls to Constrain Window Size to Protocol Limits #4261
Conversation
ca836a3    to
    82fcdce      
    Compare
  
    This commit adds a new CCtxParam which lets a user specify a protocol whose requirements they want the compressed frame to comply with.
82fcdce    to
    8366bb7      
    Compare
  
    8366bb7    to
    48014ab      
    Compare
  
    | cc @pmeenan, @nidhijaju. We'd talked about this being desirable for adoption of  | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just the one nit.
We also talked about an option to add the dictionary hash in the skippable frame automatically. I'm assuming that will come separately, maybe after the RFC publishes.
FWIW, the draft was approved for publication and is just held up by the shared brotli normative reference. The shared brotli draft was also just approved for publication but has a normative reference to highway hash which will probably drag out the actual publication a bit.
|  | ||
| /* Constrains the window size to comply with the limits imposed on the | ||
| * `dcz` Content-Encoding, as specified by the Compression Dictionary | ||
| * Transport protocol. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth mentioning what the actual limit is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, thanks for putting this together!
| */ | ||
| typedef enum { | ||
| /* Equivalent to disabling. */ | ||
| ZSTD_ConstrainWindow_auto = ZSTD_ps_auto, | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious, what is the difference between "auto" and "disable", they seem basically the same? Do you see a situation where they aren't the same in the future?
The command line option also only has "none".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They're just here by analogy to the ZSTD_paramSwitch_e enum, which is used for most cctx param options. They behave identically.
48014ab    to
    7c1caba      
    Compare
  
    | wlog = 27; /* 128 MB */ | ||
| } else { | ||
| dictSize += dictSize >> 2; /* dictSize *= 1.25 */ | ||
| wlog = ZSTD_highbit32((U32)dictSize); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will need something more accurate than a power of 2.
The Zstandard specification makes it possible to specify a window size which is a power of 2 + n * 1/8th of the power of 2. This is the reason why the *1.25 rule was selected, because it guarantees that such a window size, both larger the dictSize, but not by too much, necessarily exists.
If we stick to pure power of 2, such a window size may not exist.
For example, 12 MB window size will become 15 MB max, which will be round down to 8 MB.
As a consequence, the dictionary will only remain valid up to 8 MB.
In contrast, using the 1/8th increment, it would be possible to match the 15 MB maximum window size.
One of the issues though is that the current code base is unprepared to handle window sizes which are not power of 2. Even if it can, this information cannot be returned as wlog.
So we probably need some in-depth tech talk on this topic.
This PR adds convenience options that ensure Zstd produces compressed outputs compatible with protocols that place limits on the window size. Currently, it supports two protocols:
zstdHTTP Content-Encoding (RFC 8878 and RFC 9659).dczHTTP Content-Encoding (draft-ietf-httpbis-compression-dictionary-19).This option is exposed as a new CCtx param
ZSTD_c_constrainWindowForProtocoland a corresponding CLI argument--constrain-window.To-Do: