
Conversation

@zeux (Contributor) commented Aug 6, 2025

This PR proposes a new extension, KHR_meshopt_compression, which is a successor to the existing extension EXT_meshopt_compression.

Motivation

KHR_meshopt_compression provides general functionality to compress bufferViews. This compression is tailored to the common types of data seen in glTF buffers, but it is specified independently and is transparent to the rest of the loading process - implementations only need to decompress compressed bufferViews, and the accessors behave as usual after that. The compression is designed to make it possible for optimized implementations to reach decompression throughput of multiple gigabytes per second on modern desktop hardware (through native or WebAssembly code), to ensure that the compression is not a bottleneck even when the transmission throughput is high.
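To illustrate what "transparent to the rest of the loading process" means in practice, here is a minimal, non-normative sketch of the decompression step a loader could perform, assuming the meshoptimizer JS module's decodeGltfBuffer entry point (as used by existing EXT_meshopt_compression loaders); the surrounding names (`gltf`, `binChunks`, `loadBufferView`) are illustrative, not part of any actual implementation:

```js
import { MeshoptDecoder } from 'meshoptimizer';

// Minimal sketch: decompress one compressed bufferView, then hand the plain
// bytes to the regular accessor machinery. `gltf` is the parsed JSON and
// `binChunks` is an array of ArrayBuffers, one per glTF buffer (both are
// illustrative assumptions about the surrounding loader).
async function loadBufferView(gltf, binChunks, index) {
  const bufferView = gltf.bufferViews[index];
  const ext = bufferView.extensions && bufferView.extensions['KHR_meshopt_compression'];

  if (!ext) {
    // Uncompressed path: slice the referenced buffer as usual.
    const begin = bufferView.byteOffset || 0;
    return binChunks[bufferView.buffer].slice(begin, begin + bufferView.byteLength);
  }

  await MeshoptDecoder.ready;

  const source = new Uint8Array(binChunks[ext.buffer], ext.byteOffset || 0, ext.byteLength);
  const target = new Uint8Array(ext.count * ext.byteStride);

  // decodeGltfBuffer dispatches on mode (ATTRIBUTES/TRIANGLES/INDICES) and
  // applies the optional filter; accessors then read `target` like any other data.
  MeshoptDecoder.decodeGltfBuffer(target, ext.count, ext.byteStride, source, ext.mode, ext.filter);
  return target.buffer;
}
```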

As a result, it's possible to compress mesh vertex and index data (including morph targets), point clouds, animation keyframes, instancing transforms, Gaussian splat data, and other types of binary data. Compression can be lossless, taking advantage of preprocessing to optimally order and optionally pre-quantize the data, or lossy through the use of filters, which allow an arbitrary tradeoff between transmission size and quality for multiple types of common glTF data.

The compression process is versatile: different data processing tools may make different tradeoffs in how to preprocess the data before compression, whether to use lossy compression and of what kind, etc. By comparison, the decompression process is straightforward and fast. This is by design, and means that the extension is easier to implement in renderers than in processing pipelines, which removes as many barriers to adoption as possible.

Compared to EXT_meshopt_compression, this extension uses an improved method of compressing attribute data, which delivers better compression ratios at the same decompression speeds for a variety of use cases, and incorporates special support for lossy color encoding, which makes it possible to reduce the size further for cases where color streams are a significant fraction of the asset (some 3D scanned meshes, point clouds, Gaussian splats). All existing use cases of EXT_meshopt_compression are supported well, and no significant performance compromises are made - as such, all existing users of EXT_meshopt_compression should be able to upgrade (pending ecosystem adoption) to KHR_meshopt_compression.

Why a new extension?

EXT_meshopt_compression was completed ~4 years ago; it serves as a good and versatile compression scheme that transparently supports geometry, animation, and instancing data, and makes it possible to maintain maximum rendering efficiency and in-memory size while using additional compression during transfer, with decompression throughput in gigabytes/second on commodity hardware.

Since then, the underlying compression implementation for attributes in meshoptimizer has been revised to version 1 (from version 0, which is used by the EXT extension) for better compression; this is currently used outside of the glTF ecosystem by some native and web applications. Additionally, some use cases like point clouds and 3D scans have emerged since the extension was initially standardized that benefit from better color compression (which was considered for EXT but not included, since at the time the focus was more on "traditional" 3D geometry).

Because the underlying compression bitstream has changed, the bitstream specification needs to be revised, and existing implementations of EXT_meshopt_compression may not be able to decode the new format - thus, a new extension name is necessary.

What changed?

KHR_meshopt_compression uses the same JSON structure as the EXT_ extension, keeps the same three filters, and keeps the two existing schemes for index compression as is. It upgrades attribute compression to use a more versatile encoding (v1), which supports enhanced bit specification for deltas and customizable per-channel delta modes that improve compression further for 16-bit as well as some types of 32-bit data. For compatibility, the v0 encoding is still supported. It also adds one new filter, COLOR, which applies an additional lossy transform to quantized RGBA color data to decorrelate the input channels (similarly to the OCTAHEDRAL encoding for vectors, this improves compression and provides more flexibility with respect to using a variable number of bits, without any changes needed in the renderer, as the filter unpacks the data back into quantized normalized RGBA).
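For illustration, here is a hedged sketch of what a compressed bufferView could look like under this extension, assuming the JSON layout stays identical to EXT_meshopt_compression with COLOR added to the filter enum; all concrete numbers are placeholders:

```js
// Hedged sketch of a compressed bufferView under this extension, assuming the
// JSON layout is identical to EXT_meshopt_compression with COLOR added to the
// filter enum; all concrete numbers are placeholders.
const colorBufferView = {
  buffer: 0,          // fallback (possibly data-less) buffer, as in the EXT extension
  byteLength: 2048,   // uncompressed size: count * byteStride
  byteStride: 4,
  extensions: {
    KHR_meshopt_compression: {
      buffer: 1,        // buffer holding the compressed bitstream
      byteOffset: 0,
      byteLength: 873,  // compressed size
      byteStride: 4,    // the filter unpacks back into quantized normalized RGBA8
      count: 512,
      mode: 'ATTRIBUTES',
      filter: 'COLOR',  // new; NONE/OCTAHEDRAL/QUATERNION/EXPONENTIAL carry over from EXT
    },
  },
};
```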

On typical geometry data, the enhanced attribute compression provides approximately a 10% geomean reduction in vertex data size; the gains, of course, depend heavily on the content. On point clouds, as well as some 3D scanned models that use vertex colors, the new color filter together with the new attribute compression results in a ~15% geomean reduction, with even stronger (20-25%) gains when non-aligned bit counts are used (e.g. 6-bit or 10-bit colors).

Why KHR?

These improvements require a new extension, as they update the data format/bitstream as well as the JSON enum for the color filter. While it would be possible to specify this as a new EXT extension, like EXT_meshopt_compression2, it seemed better to promote it to KHR:

  • This matches existing compression formats, like Draco and Basis Universal, as well as some formats pending discussion like SPZ
  • This makes the compression scheme, which has proven to be versatile and useful since it was standardized, more of a first-class citizen, with associated ecosystem benefits in the coming years (more support)

Since many different parts of the glTF ecosystem can be supported by meshopt compression, including core functionality and extensions like Gaussian splats or mesh instancing, specifying a KHR version provides a more comprehensive and coherent story with respect to compression for the glTF ecosystem.

The minimal "upgrade" path from EXT to KHR would involve just changing the extension name, as the original bitstream should be fully compatible with the new bitstream. The ideal upgrade path would involve re-encoding the original data (helpful if COLOR filter is useful), or at least losslessly re-encoding the attribute data from v0 to v1 - this doesn't require parsing any accessor data, and merely requires decompressing buffer views and re-compressing them with new encoding.

Implementations

Since this is a proposal that just got created, this extension obviously does not have implementations yet :) Having said that, because the JSON structure is exactly the same except for the addition of a COLOR filter, and most implementations of EXT_meshopt_compression use the meshoptimizer library, which supports the new additions (the color filter was released in 0.25), I'd expect that existing implementations of EXT_ can be made compatible with KHR_ with minimal effort:

  • For loaders that currently implement support for EXT_ (three.js, Babylon.js), updating the meshoptimizer module and tweaking the JSON parsing code to recognize the KHR_ extension should be only a few lines of changes (for reference, see the full three.js implementation of the EXT variant, and the sketch after this list);
  • For data processors that currently implement support for EXT_ (gltfpack, glTF-Transform), updating the meshoptimizer module and exposing a user option that serializes the KHR_ extension and encodes attribute data using the new attribute encoding, plus optionally supporting the color filter, should be easy;
  • For any loader that wants to implement this without relying on the meshoptimizer library for some reason, I've updated the reference decoder by following the updates made to this specification, so it should be comprehensive; while it's more changes than you'd need if you were just using the library, it's a manageable amount of extra complexity.
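For illustration, a minimal sketch of the "few lines of JSON parsing changes" mentioned above; the helper name and surrounding structure are illustrative rather than any loader's actual code:

```js
// Hedged sketch of the "few lines of JSON parsing changes": accept either
// extension name when looking up compression metadata on a bufferView. The
// helper name and surrounding structure are illustrative, not three.js' code.
function getMeshoptExtension(bufferViewDef) {
  const extensions = bufferViewDef.extensions || {};
  return extensions['KHR_meshopt_compression'] || extensions['EXT_meshopt_compression'] || null;
}
// The decompression call itself stays the same, since the updated meshoptimizer
// decoder accepts both the v0 and v1 bitstreams.
```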

@lexaknyazev (Member)

it seemed better to promote to KHR

KHR vs EXT prefix choice generally means different treatment of the extension's IP, beyond perceived "ecosystem support". In particular, the following details should be clarified upfront (not legal advice, though):

  • Khronos would be the extension text copyright holder
  • The extension should not include 3rd-party trademarks, or their use should be explicitly allowed (is "meshopt" registered in some way?)
  • Any technology needed by the extension may become included in the Khronos IP framework (with some caveats)

@zeux (Contributor, Author) commented Aug 6, 2025

Understood. I'm not a lawyer, but none of these seem like blockers to me (pending closer review). meshopt is not a registered trademark.

@zeux (Contributor, Author) commented Sep 2, 2025

Some updates!

  1. There should not be any copyright/trademark/IP issues as far as I'm concerned.
  2. meshoptimizer 0.25, released last week, includes support for the color filter and the v1 vertex codec in the JS module, as well as an update to the reference decoder that also supports both.
  3. While no tools exist yet that can produce files with this extension (I intentionally deferred official support in gltfpack), I have an implementation of the encoding that requires just a few small tweaks to the source code (to use the correct extension name), so it should be very easy to produce test assets if necessary
  4. I took a stab at seeing what it takes to upgrade an implementation of EXT_meshopt_compression to KHR; since the heavy lifting is done by the meshoptimizer library, the actual change only requires a few lines of JSON parsing changes. See mrdoob/three.js@dev...zeux:three.js:khr-meshopt - the first commit there just updates the meshopt_decoder to the latest version.

It would be great to understand what the next steps could be here, as from my perspective all blockers have been resolved. I'm quite confident the actual implementations are going to be quick to finalize if there is agreement from Khronos in principle that the extension should be included. I'm obviously also happy to adjust the proposed text if corrections or clarifications are necessary.

@lexaknyazev (Member)

The three.js update suggests that the decoder does not need to know whether the glTF asset uses the original EXT or the new KHR extension name. This implies that existing EXT (v0) files could be upgraded to KHR (v1) without re-encoding anything at all.

Since v0 support is not going anywhere (because the EXT extension has been ratified and tools are expected to accept such files indefinitely), I'd suggest allowing v0 in KHR as well (with a note about v1 benefits), given that the intention is to keep using the same extension name.

@zeux (Contributor, Author) commented Sep 4, 2025

Since v0 support is not going anywhere (because the EXT extension has been ratified and tools are expected to accept such files indefinitely), I'd suggest allowing v0 in KHR as well (with a note about v1 benefits), given that the intention is to keep using the same extension name.

That sounds good to me. I originally specified KHR as accepting just v1 in order to have the smallest/simplest possible specification. But indeed, this restriction is not strictly necessary - v0 support is never going away, and the format version is encoded in the first byte of the stream, so the meshoptimizer library can decode both. The reference decoder decodes both as well. I can change this; it would just require a more complicated bitstream description where specific additions are called out as only being present when the version is 1.
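For illustration, a minimal sketch of that version check, assuming the meshoptimizer vertex codec convention where the first byte of an ATTRIBUTES stream carries a magic high nibble and the codec version in the low nibble; the exact constants should come from the final bitstream text, not from this sketch:

```js
// Hedged sketch of the version check, assuming the meshoptimizer vertex codec
// convention: the first byte of an ATTRIBUTES stream is a header whose high
// nibble is a magic value (0xA) and whose low nibble is the codec version.
// Take the exact constants from the final bitstream text, not from this sketch.
function attributeStreamVersion(stream /* Uint8Array */) {
  if (stream.length === 0 || (stream[0] & 0xf0) !== 0xa0) {
    throw new Error('not a meshopt ATTRIBUTES stream');
  }
  return stream[0] & 0x0f; // 0 for the EXT-era codec, 1 for the new codec
}
```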

@lexaknyazev (Member)

@javagl In the context of this extension, there are two quite independent pieces of "an implementation":

  1. Parsing JSON properties, understanding how to use meshopt-compressed data in the context of a glTF asset.
  2. Decoding the compressed bitstreams.

The first piece is engine-specific and therefore, for example, glTF-Transform, three.js, Babylon.js, and Khronos glTF-Sample-Viewer would be four separate implementations should they support the extension.

The second piece is engine-agnostic, so different viewers can technically use the same decoding library. That said, the bitstream spec must be complete and unambiguous enough to enable anyone to implement a decoder from scratch without referring to any source code besides spec-inlined snippets. Given that this bitstream spec has two implementations written in two different languages, I think they count as two.

@javagl (Contributor) commented Dec 8, 2025

The bitstream specification here looks very detailed; it should be possible to implement a decoder (and maybe even an encoder?) from it, and there do seem to be things that can count as "two implementations". So, to not further pollute this PR with off-topic discussions, I moved that to #2542

zeux added 3 commits December 10, 2025 10:04
- Use uintN more consistently, as outputs and two of four inputs are unsigned
- Define findMSB
- Clarify that the output is unsigned normalized
- Clarify top bits when K is smaller than maximum
- Clarify that the range for signed chrominance values is symmetric
- Clarify that the decoding using 32-bit integer math must result in a K-bit value
@lexaknyazev (Member) commented Dec 15, 2025

@zeux We'd need sample assets covering all JSON properties of this extension to ensure that engines correctly pass them to decoders. In particular:

  • An asset with a non-zero byteOffset
  • An asset with a fallback buffer
  • Assets covering all modes
    • For the attributes mode
      • All filters, including NONE and undefined
      • Both v0 and v1 streams
      • As a special case, a v0 stream used with the color filter
    • For the triangles mode
      • Both byteStride options
    • For the indices mode
      • Both byteStride options

It would also be great to eventually have sample streams exhaustively covering the bitstream spec to ensure that decoders handle it correctly. Note that exhaustive bitstream coverage is mostly a "nice-to-have" thing for now, but it's required to ensure long-term sustainability (think potential inclusion of meshopt in the ISO version of glTF). Some of the test cases may be merged together when that makes sense. Logistically, they may be organized as glTF assets containing both compressed and uncompressed data, asserting that decompressing the compressed blocks yields the uncompressed blocks exactly (when filters aren't used) or within a reasonable epsilon.
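As a rough illustration of that organization, a sketch of a harness that checks that one filterless test block round-trips exactly; the field names on the test case object are illustrative assumptions, not an existing format:

```js
import { MeshoptDecoder } from 'meshoptimizer';

// Rough sketch of that organization for a filterless block: the test case
// carries both the compressed stream and the expected uncompressed bytes, and
// the harness asserts an exact round-trip. Field names on `testCase` are
// illustrative assumptions; filtered blocks would instead be compared
// element-wise (as the accessor's component type) within an epsilon.
async function checkLosslessCase(testCase) {
  await MeshoptDecoder.ready;

  const { count, byteStride, mode, compressed, expected } = testCase;
  const decoded = new Uint8Array(count * byteStride);
  MeshoptDecoder.decodeGltfBuffer(decoded, count, byteStride, compressed, mode, 'NONE');

  for (let i = 0; i < decoded.length; ++i) {
    if (decoded[i] !== expected[i]) {
      throw new Error(`mismatch at byte ${i}: ${decoded[i]} != ${expected[i]}`);
    }
  }
}
```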

Here's a list of test cases based on my understanding of the spec.
  • For the attributes mode:
    • different byte stride values resulting in different maxBlockElements values, not sure if the full set of block sizes (256, 224, 192, 176, 160, 144, 128, 112, 96, 80, 64, 48, 32) needs to be covered;
    • a number of elements not divisible by 16;
    • all control modes, all delta encoding modes, and all channel modes;
    • for the octahedral filter:
      • both byteStride options;
      • negative x and y values;
      • boundary K values for each valid byte stride, additionally the case when K=8 and N=16;
      • values that need clamping for negative hemisphere;
      • fourth component with bit width exceeding K (to ensure that it's not clamped/masked);
    • for the quaternion filter:
      • boundary K values (4 and 16);
      • negative x, y, and z values;
      • clamping of negative values during w calculation;
    • for the exponential filter:
      • positive and negative exponent values;
      • positive and negative mantissa values;
      • boundary exponent values;
      • exactness of the decoding;
    • for the color filter:
      • both byte stride options;
      • boundary K values for each valid byte stride, additionally the case when K=8 and N=16;
      • valid y, co, and cg values that cause the intermediate RGB values to not fit into the original N-bit representation.
  • For the triangles mode:
    • 32-bit values used with byteStride: 2;
    • data relying on next and last wraparounds;
    • data using all 16 elements of the FIFOs;
    • data containing varint-7 values above 4294967295;
    • data covering all 0xXY branches including sub-branches (Z = 0 and W = 0);
    • data using all 16 elements of the codeaux block.
  • For the indices mode:
    • 32-bit values used with byteStride: 2;
    • data relying on last wraparound;
    • data using both baseline index values;
    • data containing varint-7 values above 4294967295.

@zeux (Contributor, Author) commented Dec 15, 2025

My understanding is that this would be independent of this PR, and would belong in glTF-Sample-Assets?

That repository has a couple of EXT_ assets, and it should be easy to replicate those for KHR (see also #2517 (comment)) - that is, of course, not combinatoric coverage.

In addition, I think perhaps just two assets should be enough to cover the JSON variations; a single indexed sphere mesh with normals and vertex colors can be used to test all 4 encoding modes (attributes could be encoded using v0 & v1 for different spheres; indices can be encoded using triangles or indices for different spheres) and 3 of the 4 filters (e.g. exponential for float3 positions, octahedral for quantized normals, color for quantized colors), with varying byte strides. To test the quaternion filter you need an animation or EXT_mesh_gpu_instancing; the BrainStem asset has that, but a basic animation could also be added to the same test asset, e.g. a small grid of spheres plus a rotating cube. We could use instancing instead and rely on the vertex color pattern rotating visually, as the sphere geometry is symmetric, or use a more complex base mesh instead of a sphere. Then the entire asset could be duplicated with a version that has a fallback buffer, in which the extension would be optional (a sketch of the fallback wiring follows the list), and the expectation would be that rendering matches between:

  • asset with fallback, for renderers that don't support the extension;
  • asset with fallback, for renderers that support the extension;
  • asset without fallback, for renderers that support the extension.
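For reference, a hedged sketch of the fallback wiring, assuming it mirrors EXT_meshopt_compression (the bufferView points at the fallback buffer, while the extension points at the compressed data); numbers are placeholders:

```js
// Hedged sketch of the fallback wiring, assuming it mirrors
// EXT_meshopt_compression: the bufferView's `buffer` points at the fallback
// buffer while the extension points at the compressed data. In this variant
// the fallback buffer has no uri, so the extension goes into extensionsRequired;
// the "with fallback" variant would instead ship real uncompressed data there
// and list the extension as optional. Numbers are placeholders.
const assetFragment = {
  extensionsRequired: ['KHR_meshopt_compression'],
  buffers: [
    { byteLength: 2048, extensions: { KHR_meshopt_compression: { fallback: true } } },
    { byteLength: 873, uri: 'compressed.bin' },
  ],
  bufferViews: [
    {
      buffer: 0, byteLength: 2048, byteStride: 4,
      extensions: {
        KHR_meshopt_compression: {
          buffer: 1, byteLength: 873, byteStride: 4, count: 512, mode: 'ATTRIBUTES',
        },
      },
    },
  ],
};
```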

The bitstream test assets are more difficult to produce, both because of the sheer number and because they require custom code to generate, as they include suboptimal decisions that my current encoder never performs. It's definitely possible to do in the future; I'm also not sure where such an asset would belong, as it would ideally need to be part of some programmatic framework instead of a renderer (e.g. testing the quaternion filter would need to produce a mostly incoherent quaternion stream; attempts to use that for animations would be difficult to validate visually).

@lexaknyazev (Member)

My understanding is that this would be independent of this PR, and would belong in glTF-Sample-Assets?

Right but that's one of the conditions for marking an extension a Release Candidate.

we could use instancing instead

That would create a dependency on the instancing extension, which would be suboptimal for a meshopt-focused test asset.


There are some early ideas on creating dedicated collections of low-level test assets; bitstream tests would eventually go there.

@zeux (Contributor, Author) commented Dec 15, 2025

Right but that's one of the conditions for marking an extension a Release Candidate.

The sequencing here isn't super clear to me; what status is this extension in right now?

A simpler pair of test assets for BrainStem & the flower vase (linked in the comment above) tests all 4 filters (although not all of them with a full set of byte strides) and two modes out of 3 (no INDICES coverage), using v1. The flower asset is a point cloud, though, so including it in glTF-Sample-Assets is probably complicated; this is from the issue I filed a few years ago, KhronosGroup/glTF-Sample-Assets#31. Neither uses the fallback, though; these are good for basic testing of the existing implementations, but not as comprehensive long term. So probably the most direct route is a composite asset along the lines I've sketched above... maybe it's easier to make an asset with several cubes here, as that can then include animations for rotations, which would be easy to visually distinguish.

For the existing EXT_meshopt assets in the glTF-Sample-Assets repository, we could either "upgrade" them or come up with a folder naming scheme that incorporates both variants, as they are currently placed in glTF-Meshopt folders.

@lexaknyazev (Member)

The sequencing here isn't super clear to me; what status is this extension in right now?

Admittedly, the formal extension development lifecycle is not fully defined yet. This extension should be a Review Draft as of now.

To move it forward (to RC, which implies merging the PR), we'd need sample assets and at least one glTF viewer/loader implementation. It seems that the latter is trivial and almost complete (although not merged).

@zeux (Contributor, Author) commented Dec 15, 2025

I "wrote" a script that generates a synthetic test asset that might be helpful here ("wrote" is quoted because the code is generated by GPT Codex, I've lightly reviewed the code and the generated asset, and I also confirmed that "breaking" individual parts of the decoder manually breaks the asset in "correct" ways).

The asset is a 5x5 cube grid, where the columns of the grid vary the geometry encoding from the point of view of the glTF spec (different bit counts for normals/colors/indices, interleaved vs. not, and the last column is an animated uncolored cube):

https://gist.github.com/zeux/d26340c53dd70d19ae18045e79d065df#file-layout-md

The generated file normally renders like this (subject to lighting conditions etc., as the file itself doesn't carry a light setup):

[screenshot: the full 5x5 cube grid rendering correctly]

The rows of the grid use different ways to encode the data. The top row is uncompressed; then we have v0 compression with INDICES & TRIANGLES compression for the index buffer without filters, then v0 with filters & TRIANGLES, and the last row is v1 with filters & TRIANGLES. Note that interleaved data (first column) always has no filters applied, so v1 with no filters is implicitly tested here too.

For example, if I break octahedral filter decoding by returning 0 instead of decoding the data, I get this (it affects the last two rows, which use filters; it doesn't affect the animated or interleaved cubes because those don't use filters):

[screenshot: the grid with the filtered rows visibly broken]

If I break INDICES decoding, I get this (it doesn't affect the animated cube because its geometry isn't compressed, and it doesn't affect other rows because the first row is not compressed and the subsequent rows use TRIANGLES):

[screenshot: the grid with the INDICES-compressed cubes visibly broken]

The script can also be used to generate data with a fallback buffer, which can then be loaded in a viewer that doesn't support this extension to begin with.

The linked Gist also has a small three.js viewer that uses the three.js GLTF loader extensibility to substitute an implementation for the KHR extension, but this should not be necessary for three.js once the "real" PR gets merged. That PR was waiting for the extension to get to Review Draft, so it sounds like it could be merged now - I'll ping folks in that PR separately.
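For illustration, the substitution amounts to something like the following, assuming three.js' GLTFLoader plugin hook; the plugin class and its body are illustrative placeholders, not the actual viewer code:

```js
import { GLTFLoader } from 'three/addons/loaders/GLTFLoader.js';

// Hedged sketch of that substitution: three.js' GLTFLoader exposes register()
// for extension plugins. The plugin class here is illustrative; a real
// implementation would mirror the built-in EXT_meshopt_compression plugin and
// decode compressed bufferViews via the meshoptimizer decoder.
class KHRMeshoptCompressionPlugin {
  constructor(parser) {
    this.name = 'KHR_meshopt_compression';
    this.parser = parser;
  }

  // In the GLTFLoader plugin system, a loadBufferView(index) hook can return a
  // Promise of an ArrayBuffer, or null to fall back to the default path.
  loadBufferView(/* index */) {
    return null; // placeholder; the real plugin decodes here
  }
}

const loader = new GLTFLoader();
loader.register((parser) => new KHRMeshoptCompressionPlugin(parser));
```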

If this implementation style is acceptable, I can generate the two variants (with/without fallback buffer), and that could become a new MeshoptCubeGrid test asset or thereabouts. The "normal" test assets are still valuable, I think, but they are much easier to generate, as it's a matter of re-running gltfpack on the source assets (pending the note about which folder to place those in).

meshopt-cubes.zip

@lexaknyazev (Member)

The asset looks fine; one request before adding it to the Sample-Assets repo: consider adding in-scene labels to the rows and columns using textured planes (see, for example, DispersionTest).
