KHR_meshopt_compression proposal #2517
Conversation
Also use "delta bytes" instead of "byte deltas" when describing control bits to reduce confusion with 1-byte vs 2-byte vs 4-byte deltas.
Add Alexey and Don based on prior contributions to EXT_ (missed during finalization) and advice for KHR_; update all links and profiles to github.com for consistency.
requirement for Web delivery.
KHR vs EXT prefix choice generally means different treatment of the extension's IP, besides the perceived "ecosystem support". In particular, the following details should be clarified upfront (not legal advice, though):
Understood. I'm not a lawyer, but none of these seem like blockers to me (pending closer review). meshopt is not a registered trademark.
Some updates!
It would be great to understand what the next steps could be here, as from my perspective all blockers have been resolved. I'm quite confident the actual implementations are going to be quick to finalize if there is agreement from Khronos in principle that the extension should be included. I'm obviously also happy to adjust the proposed text if corrections or clarifications are necessary.
The three.js update suggests that the decoder does not need to know whether the glTF asset uses the original EXT or the new KHR extension name. This implies that existing EXT (v0) files could be upgraded to KHR (v1) without re-encoding anything at all. Since v0 support is not going anywhere (because the EXT extension has been ratified and tools are expected to accept such files indefinitely), I'd suggest allowing v0 in KHR as well (with a note about v1 benefits), given that the intention is to keep using the same extension name.
That sounds good to me. I originally specified KHR as accepting just v1 in order to have the smallest/simplest possible specification. But indeed, this restriction is not strictly necessary - v0 support is never going away, and the format version is encoded in the first byte of the stream, so the meshoptimizer library can decode both. The reference decoder decodes both as well. I can change this; it would just require a more complicated bitstream description where specific additions are called out as only being present when the version is 1.
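For illustration, here is a minimal sketch of how a decoder could distinguish the two versions from that first byte. It assumes meshoptimizer's convention of packing a magic value in the high bits of the attribute stream header and the codec version in the low bits; treat the exact masks as assumptions rather than spec text.

```ts
// Sketch: detect the attribute (vertex) stream codec version from the header byte.
function attributeStreamVersion(data: Uint8Array): number {
  if (data.length === 0) throw new Error('empty stream');
  const header = data[0];
  // 0xa0 is assumed to be the attribute stream magic in the high nibble.
  if ((header & 0xf0) !== 0xa0) throw new Error('not an attribute stream');
  return header & 0x0f; // 0 for EXT-era (v0) streams, 1 for the new encoding
}
```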
@javagl In the context of this extension, there are two quite independent pieces of "an implementation":
The first piece is engine-specific, and therefore, for example, glTF-Transform, three.js, Babylon.js, and the Khronos glTF-Sample-Viewer would be four separate implementations should they support the extension. The second piece is engine-agnostic, so different viewers can technically use the same decoding library. That said, the bitstream spec must be complete and unambiguous enough to enable anyone to implement a decoder from scratch without referring to any source code besides spec-inlined snippets. Given that this bitstream spec has two implementations written in two different languages, I think they count as two.
The bitstream specification here looks very detailed - it should be possible to implement a decoder (and maybe even an encoder?) from it, so there actually are things that can count as "two implementations". To avoid further polluting this PR with off-topic discussions, I moved that to #2542
- Use uintN more consistently as outputs and two of four inputs are unsigned
- Define findMSB
- Clarify that the output is unsigned normalized
- Clarify top bits when K is smaller than maximum
- Clarify that the range for signed chrominance values is symmetric
- Clarify that the decoding using 32-bit integer math must result in a K-bit value
@zeux We'd need sample assets covering all JSON properties of this extension to ensure that engines correctly pass them to decoders. In particular:
It would also be great to eventually have sample streams exhaustively covering the bitstream spec to ensure that decoders handle it correctly. Note that exhaustive bitstream coverage is mostly a "nice-to-have" for now, but it's required to ensure long-term sustainability (think potential inclusion of meshopt in the ISO version of glTF). Some of the test cases may be merged together when that makes sense. Logistically, they may be organized as glTF assets containing both compressed and uncompressed data, asserting that decompressing the compressed blocks yields the uncompressed blocks exactly (when filters aren't used) or within a reasonable epsilon. Here's a list of test cases based on my understanding of the spec.
My understanding is that this would be independent of this PR, and would belong in glTF-Sample-Assets? That repository has a couple of EXT_ assets, and it should be easy to replicate these for KHR (see also #2517 (comment)) - that is not combinatoric, of course.

In addition, I think perhaps just two assets should be enough to cover the JSON variations: a single indexed sphere mesh with normals and vertex colors can be used to test all 4 encoding modes (attributes could be encoded using v0 & v1 for different spheres; indices can be encoded using TRIANGLES or INDICES for different spheres) and 3 of the 4 filters (e.g. exponential for float3 positions, octahedral for quantized normals, color for quantized colors), with varying byte strides. To test the quaternion filter you need an animation or EXT_mesh_gpu_instancing; the BrainStem asset has that, but a basic animation could also be added to the same test asset, e.g. a small grid of spheres plus a rotating cube. I guess we could use instancing instead and rely on the vertex color pattern rotating visually, as the sphere geometry is symmetric, or use a more complex base mesh instead of a sphere. Then the entire asset could be duplicated with a version with a fallback buffer, in which the extension would be optional, and the expectation would be that rendering matches between:
The bitstream test assets are more difficult to produce, just because of the sheer number, and the requirement to have custom code that produces them, as these include suboptimal decisions that my current encoder never makes. It's definitely possible to do in the future; I'm also not sure where this asset would belong, as it would ideally need to be part of some programmatic framework instead of a renderer (e.g. the quaternion filter testing would need to produce a mostly incoherent quaternion stream; attempts to use that for animations would be difficult to validate visually).
Right, but that's one of the conditions for marking an extension a Release Candidate.
That would create a dependency on the instancing extension, which would be suboptimal for a meshopt-focused test asset. There are some early ideas on creating dedicated collections of low-level test assets; bitstream tests would eventually go there.
The sequencing here isn't super clear to me; what status is this extension in right now?

A simpler pair of test assets for BrainStem & the flower vase (linked in the comment above) tests all 4 filters (although not all of them with a full set of byte strides) and two modes out of 3 (no INDICES coverage), using v1. The flower asset is a point cloud, though, so including that in glTF-Sample-Assets is probably complicated; this is from the issue I filed a few years ago, KhronosGroup/glTF-Sample-Assets#31. Neither uses the fallback, though; these are good for basic testing of the existing implementations, but not as comprehensive long term. So probably the most direct route is a composite asset along the lines I've sketched above... maybe it's easier to make an asset with several cubes here, as that can then include animations for rotations, which would be easy to visually distinguish.

For the existing EXT_meshopt assets in the glTF-Sample-Assets repository, we could either "upgrade" them, or come up with a folder naming scheme that incorporates both variants, as they are currently placed in glTF-Meshopt folders.
Admittedly, the formal extension development lifecycle is not fully defined yet. This extension should be a Review Draft as of now. To move it forward (to RC, which implies merging the PR), we'd need sample assets and at least one glTF viewer/loader implementation. It seems that the latter is trivial and almost complete (although not merged).
I "wrote" a script that generates a synthetic test asset that might be helpful here ("wrote" is quoted because the code is generated by GPT Codex, I've lightly reviewed the code and the generated asset, and I also confirmed that "breaking" individual parts of the decoder manually breaks the asset in "correct" ways). The asset is a 5x5 cube grid, where columns of the grid vary the geometry encoding from the point of view of glTF spec (different bit counts for normals/colors/indices, interleaved vs not, and last column is an animated uncolored cube): https://gist.github.com/zeux/d26340c53dd70d19ae18045e79d065df#file-layout-md The generated file normally renders like this - subject to lighting conditions etc., the file itself doesn't carry a light setup:
The rows of the grid use different ways to encode the data. The top row is uncompressed; then we have v0 compression with INDICES & TRIANGLES compression for the index buffer without filters, then v0 with filters & TRIANGLES, and the last row is v1 with filters & TRIANGLES. Note that interleaved data (first column) always has no filters applied, and as such v1 with no filters is implicitly tested here too. For example, if I break octahedral filter decoding by returning 0 instead of decoding the data, I get this (affects the last two rows, which use filters; doesn't affect animated or interleaved cubes because these do not use filters):
If I break INDICES decoding, I get this (doesn't affect the animated cube because its geometry is not compressed; doesn't affect other rows because the first row is not compressed and subsequent rows use TRIANGLES):
The script can also be used to generate data with a fallback buffer, which can then be loaded in a viewer that doesn't support this extension to begin with. The linked Gist also has a small three.js viewer that uses three.js GLTF loader extensibility to substitute an implementation for the KHR extension, but this should not be necessary for three.js once the "real" PR gets merged. It was waiting for the extension to get to Review Draft, so it sounds like that could be merged now - I'll ping folks in that PR separately. If this implementation style is acceptable, then I can generate the two variants (with/without fallback buffer), and that could become a new MeshoptCubeGrid test asset or thereabouts. The "normal" test assets are still valuable, I think, but they are much easier to generate, as it's a matter of re-running gltfpack on the source assets, pending the note above about which folder to place them in.
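As a rough illustration of that loader-extensibility approach, here is a sketch of a GLTFLoader plugin modeled on three.js' built-in EXT_meshopt_compression support. The hook names follow the GLTFLoader plugin API, but the plugin class itself and its details are illustrative, not a drop-in implementation:

```ts
import { GLTFLoader } from 'three/examples/jsm/loaders/GLTFLoader.js';
import { MeshoptDecoder } from 'meshoptimizer';

class KHRMeshoptCompressionPlugin {
  name = 'KHR_meshopt_compression'; // extension name proposed in this PR

  constructor(private parser: any) {}

  // GLTFLoader consults plugins before loading each bufferView.
  loadBufferView(index: number): Promise<ArrayBuffer> | null {
    const bufferView = this.parser.json.bufferViews[index];
    const ext = bufferView.extensions && bufferView.extensions[this.name];
    if (!ext) return null; // not compressed: fall through to the default loader

    const buffer = this.parser.getDependency('buffer', ext.buffer);
    return Promise.all([buffer, MeshoptDecoder.ready]).then(([buf]) => {
      const source = new Uint8Array(buf, ext.byteOffset || 0, ext.byteLength);
      // The same decoder call handles v0 and v1 attribute payloads, which is
      // why an existing EXT_ code path needs little more than the name change.
      return MeshoptDecoder.decodeGltfBufferAsync(
        ext.count, ext.byteStride, source, ext.mode, ext.filter
      ).then((decoded: Uint8Array) => decoded.buffer as ArrayBuffer);
    });
  }
}

const loader = new GLTFLoader();
loader.register((parser) => new KHRMeshoptCompressionPlugin(parser));
```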
The asset looks fine; one request for adding it to the Sample-Assets repo: consider adding in-scene labels using textured planes to the rows and columns (see, for example,



This PR proposes a new extension, KHR_meshopt_compression, which is a successor to the existing extension EXT_meshopt_compression.

Motivation
KHR_meshopt_compression provides general functionality to compress bufferViews. This compression is tailored to the common types of data seen in glTF buffers, but it's specified independently and is transparent to the rest of the loading process - implementations only need to decompress compressed bufferViews, and the accessors behave as usual after that. The compression is designed to make it possible for optimized implementations to reach decompression throughput of multiple gigabytes per second on modern desktop hardware (through native or WebAssembly code), to ensure that the compression is not a bottleneck even when the transmission throughput is high.

As a result, it's possible to compress mesh vertex and index data (including morph targets), point clouds, animation keyframes, instancing transforms, Gaussian splat data and other types of binary data. Compression can be lossless, taking advantage of preprocessing to optimally order and optionally pre-quantize the data, or lossy with the use of filters, which allows an arbitrary tradeoff between transmission size and quality for multiple types of common glTF data.
The compression process is versatile, and different data processing tools may make different tradeoffs in how to preprocess the data before compression, whether to use lossy compression and of what kind, etc. - by comparison, the decompression process is straightforward and fast. This is by design, and means that it's comparatively easier to implement this extension in renderers than in processing pipelines, which removes as many barriers to adoption as possible.
Compared to EXT_meshopt_compression, this extension uses an improved method of compressing attribute data, which delivers better compression ratios at the same decompression speeds for a variety of use cases, and incorporates special support for lossy color encoding, which makes it possible to reduce the size further when color streams are a significant fraction of the asset (some 3D scanned meshes, point clouds, Gaussian splats). All existing use cases of EXT_meshopt_compression are supported well, and no significant performance compromises are made - as such, all existing users of EXT_meshopt_compression should be able to upgrade (pending ecosystem adoption) to KHR_meshopt_compression.

Why a new extension?
EXT_meshopt_compression was completed ~4 years ago; it serves as a good and versatile compression scheme that transparently supports geometry, animation and instancing data, and makes it possible to maintain maximum rendering efficiency and in-memory size while using additional compression during transfer, with decompression throughput in gigabytes/second on commodity hardware.

Since then, the underlying compression implementation for attributes in meshoptimizer has been revised to version 1 (from version 0, which is used in the EXT extension) for better compression; this is currently used outside of the glTF ecosystem by some native and web applications. Additionally, some use cases, like point clouds and 3D scans, have emerged since the extension was initially standardized that benefit from better color compression (which was considered for EXT but not included, since at the time the focus was more on "traditional" 3D geometry).
Because of the change in the underlying compression bitstream, the bitstream specification needs to be revised, and implementations of EXT_meshopt_compression may not be able to decode the new format - thus, a new extension name is necessary.

What changed?
KHR_meshopt_compression uses the same JSON structure as the EXT_ extension (see the JSON sketch below), keeps the same three filters and keeps the two existing schemes for index compression as is. It upgrades attribute compression to use a more versatile encoding (v1), which supports enhanced bit specification for deltas and customizable per-channel delta modes that improve compression further for 16-bit as well as some types of 32-bit data. For compatibility, v0 encoding is still supported. It also adds one new filter, COLOR, which applies an additional lossy transform to quantized RGBA color data to decorrelate input channels (similarly to OCTAHEDRAL encoding for vectors, this improves compression and provides more optionality wrt using a variable number of bits, without any changes needed to the renderer, as the filter unpacks data back into quantized normalized RGBA).

On typical geometry data, the enhanced attribute compression provides approximately a 10% geomean reduction in vertex data size; the gains obviously depend highly on the content. On point clouds, as well as some 3D scanned models that use vertex colors, the new color filter together with the new attribute compression results in a ~15% geomean reduction, with even stronger (20-25%) gains when non-aligned bit counts are used (e.g. 6-bit or 10-bit colors).
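To make the "same JSON structure" point concrete, here is a sketch of what a compressed bufferView could look like under this proposal. The field values are illustrative; the layout mirrors the EXT_meshopt_compression schema, with COLOR being the only new filter value:

```json
{
  "bufferViews": [
    {
      "buffer": 0,
      "byteLength": 2048,
      "byteStride": 4,
      "extensions": {
        "KHR_meshopt_compression": {
          "buffer": 1,
          "byteOffset": 0,
          "byteLength": 936,
          "byteStride": 4,
          "count": 512,
          "mode": "ATTRIBUTES",
          "filter": "COLOR"
        }
      }
    }
  ]
}
```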
Why KHR?
These improvements require a new extension, as they update the data format / bitstream as well as the JSON enum for the color filter. While it's possible to specify this as a new EXT extension, like EXT_meshopt_compression2, it seemed better to promote to KHR:

- Since a lot of different parts of the glTF ecosystem can be supported by meshopt compression, including core functionality and extensions like Gaussian splats or mesh instancing, specifying a KHR version provides a more comprehensive/coherent story wrt compression for the glTF ecosystem.
The minimal "upgrade" path from EXT to KHR would involve just changing the extension name, as the original bitstream should be fully compatible with the new bitstream. The ideal upgrade path would involve re-encoding the original data (helpful if the COLOR filter is useful), or at least losslessly re-encoding the attribute data from v0 to v1 - this doesn't require parsing any accessor data, and merely requires decompressing buffer views and re-compressing them with the new encoding (see the sketch below).
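A minimal sketch of that lossless re-encoding step for a single attribute bufferView, using the meshoptimizer JS API; `setVertexVersion` here is a hypothetical stand-in for however the JS encoder selects the v1 codec (the C++ API exposes meshopt_encodeVertexVersion):

```ts
import { MeshoptDecoder, MeshoptEncoder } from 'meshoptimizer';

async function upgradeAttributeView(
  compressed: Uint8Array, count: number, byteStride: number
): Promise<Uint8Array> {
  await Promise.all([MeshoptDecoder.ready, MeshoptEncoder.ready]);

  // 1. Decompress the v0 stream. Filters are unaffected: they apply after
  //    decoding and are unchanged between v0 and v1.
  const decoded = new Uint8Array(count * byteStride);
  MeshoptDecoder.decodeVertexBuffer(decoded, count, byteStride, compressed);

  // 2. Re-encode with the v1 attribute codec; the JSON then only needs the
  //    extension name swapped from EXT_ to KHR_meshopt_compression.
  (MeshoptEncoder as any).setVertexVersion?.(1); // hypothetical knob, see lead-in
  return MeshoptEncoder.encodeVertexBuffer(decoded, count, byteStride);
}
```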
Implementations
Since this is a proposal that was just created, this extension obviously does not have implementations yet :) Having said that, because the JSON structure is exactly the same except for the addition of a COLOR filter, and most implementations of EXT_meshopt_compression use the meshoptimizer library, which supports the new additions (the color filter was released in 0.25), I'd expect that existing implementations for EXT_ can be made compatible with KHR_ with minimal effort:

- For loaders/viewers that support EXT_ (three.js, Babylon.JS), updating the meshoptimizer module and tweaking the JSON parsing code to recognize the KHR_ extension should be only a few lines of changes and should be sufficient (for reference, full three.js implementation of the EXT variant);
- For processing tools that produce EXT_ (gltfpack, glTF-Transform), updating the meshoptimizer module and exposing a user option that would serialize the KHR_ extension and encode attribute data using the new attribute encoding, plus optionally support the color filter, should be easy.