-
Notifications
You must be signed in to change notification settings - Fork 6.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Format compatibility test cover compressions, including mixed #13414
Conversation
3fddd5e
to
dbbeba2
Compare
dbbeba2
to
e89b8b2
Compare
@pdillinger has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@pdillinger has updated the pull request. You must reimport the pull request before landing. |
@pdillinger has updated the pull request. You must reimport the pull request before landing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
@pdillinger has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@pdillinger has updated the pull request. You must reimport the pull request before landing. |
@pdillinger has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@pdillinger merged this pull request in 3af905a. |
Summary: The existing format compatibility test had limited coverage of compression options, particularly newer algorithms with and without dictionary compression. There are some subtleties that need to remain consistent, such as index blocks potentially being compressed but not using the file's dictionary if they are. This involves detecting (with a rough approximation) builds with the appropriate capabilities.
The other motivation for this change is testing some potentially useful reader-side functionality that has been in place for a long time but has not been exercised until now: mixing compressions in a single SST file. The block-based SST schema puts a compression marker on each block; arguably this is for distinguishing blocks compressed using the algorithm stored in compression_name table property from blocks left uncompressed, e.g. because they did not reach the threshold of useful compression ratio, but the marker can also distinguish compression algorithms / decompressors.
As we work toward customizable compression, it seems worth unlocking the capability to leverage the existing schema and SST reader-side support for mixing compression algorithms among the blocks of a file. Yes, a custom compression could implement its own dynamic algorithm chooser with its own tag on the compressed data (e.g. first byte), but that is slightly less storage efficient and doesn't support "vanilla" RocksDB builds reading files using a mix of built-in algorithms. As a hypothetical example, we might want to switch to lz4 on a machine that is under heavy CPU load and back to zstd when load is more normal. I dug up some data indicating ~30 seconds per output file in compaction, suggesting that file-level responsiveness might be too slow. This agility is perhaps more useful with disaggregated storage, where there is more flexibility in DB storage footprint and potentially more payoff in optimizing the average footprint.
In support of this direction, I have added a backdoor capability for debug builds of
ldb
to generate files with a mix of compression algorithms and incorporated this into the format compatibility test. All of the existing "forward compatible" versions (currently back to 8.6) are able to read the files generated with "mixed" compression. (NOTE: there's no easy way to patch a bunch of old versions to have them support generating mixed compression files, but going forward we can auto-detect builds with this "mixed" capability.) A subtle aspect of this support that is that for proper handling of decompression contexts and digested dictionaries, we need to set thecompression_name
table property tozstd
if any blocks are zstd compressed. I'm expecting to add better info to SST files in follow-up, but this approach here gives us forward compatibility back to 8.6.However, in the spirit of opening things up with what makes sense under the existing schema, we only support one compression dictionary per file. It will be used by any/all algorithms that support dictionary compression. This is not outrageous because it seems standard that a dictionary is or can be arbitrary data representative of what will be compressed. This means we would need a schema change to add dictionary compression support to an existing built-in compression algorithm (because otherwise old versions and new versions would disagree on whether the data dictionary is needed with that algorithm; this could take the form of a new built-in compression type, e.g.
kSnappyCompressionWithDict
; only snappy, bzip2, and windows-only xpress compression lack dictionary support currently).Looking ahead to supporting custom compression, exposing a sizeable set of CompressionTypes to the user for custom handling essentially guarantees a path for the user to put versioning on their compression even if they neglect that initially, and without resorting to managing a bunch of distinct named entities. (I'm envisioning perhaps 64 or 127 CompressionTypes open to customization, enough for ~weekly new releases with more than a year of horizon on recycling.)
More details:
ldb
fail if the specified compression type is not supported by the build.Other future work needed:
Test Plan:
Manual runs with some temporary instrumentation, also a recent revision of this change included a GitHub Actions run of the updated format compatible test: https://github.com/facebook/rocksdb/actions/runs/13463551149/job/37624205915?pr=13414