Add UASTC HDR. #216

Draft
wants to merge 27 commits into
base: main

MarkCallow
Contributor

Draft of language to add support for UASTC HDR. KDFS and all KDFS references need updating before this can be considered ready for merge.

Draft says to use colorModel KHR_DF_MODEL_UASTC_HDR (= 167) together with vkFormat
VK_FORMAT_ASTC_4x4_SFLOAT_BLOCK (= 1000066000). Any issues with that?
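To make the question concrete, here is a minimal sketch of where these two values would sit in a file. It assumes the published KTX 2.0 header layout (12-byte identifier, little-endian uint32 fields, index starting at byte 48) and the basic Data Format Descriptor layout, where `colorModel` is the first byte after the two descriptor-block header words. The helper name `read_format_ids` is hypothetical, and 167 is only the value proposed in this draft.

```python
import struct

VK_FORMAT_ASTC_4x4_SFLOAT_BLOCK = 1000066000  # value quoted in this PR
KHR_DF_MODEL_UASTC_HDR = 167                  # draft value from this PR


def read_format_ids(ktx2: bytes) -> tuple[int, int]:
    """Return (vkFormat, colorModel) from an in-memory KTX2 file (sketch)."""
    # vkFormat is the first uint32 after the 12-byte file identifier.
    (vk_format,) = struct.unpack_from("<I", ktx2, 12)
    # dfdByteOffset is the first field of the index, at byte 48.
    (dfd_offset,) = struct.unpack_from("<I", ktx2, 48)
    # The DFD starts with dfdTotalSize (4 bytes), followed by two
    # descriptor-block header words (8 bytes); colorModel is the next byte.
    color_model = ktx2[dfd_offset + 12]
    return vk_format, color_model


def is_draft_uastc_hdr_4x4(ktx2: bytes) -> bool:
    """True if the file carries the pairing proposed in this draft."""
    return read_format_ids(ktx2) == (
        VK_FORMAT_ASTC_4x4_SFLOAT_BLOCK,
        KHR_DF_MODEL_UASTC_HDR,
    )
```

A loader that only understands plain ASTC would look at `vkFormat` alone, while a transcoder would additionally check `colorModel` to know that the fast BC6H path applies.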

@MarkCallow MarkCallow requested a review from lexaknyazev August 24, 2024 04:24
@MarkCallow MarkCallow marked this pull request as draft August 24, 2024 04:24
@MarkCallow
Contributor Author

Draft says to use colorModel KHR_DF_MODEL_UASTC_HDR (= 167) together with vkFormat
VK_FORMAT_ASTC_4x4_SFLOAT_BLOCK (= 1000066000). Any issues with that?

@lexaknyazev please let me know your opinion on this as soon as possible. Rich plans to release his code very soon. We need to decide the representation in a .ktx2 file before then.

@MarkCallow
Contributor Author

A version of https://github.com/BinomialLLC/basis_universal/wiki/UASTC-HDR-Texture-Specification-v1.0 needs to be incorporated into the Data Format Specification. That document has two parts: a description of the UASTC HDR subset of ASTC and a description of transcoding to BC6H. For simplicity and ease of use, I think the whole thing should be incorporated into KDFS, but an argument could be made for putting the subset description in KDFS and the transcoding description into the KTX spec.

@lexaknyazev
Member

the transcoding description into the KTX spec

The unreleased KDFS version of the original UASTC spec contains both the bitstream and the transcoding steps. I think UASTC HDR should follow the same path.

@MarkCallow
Contributor Author

@richgel999 says we need to save in the file the scale used to scale the maximum value from .exr/.hdr down to what BC6H/ASTC can handle, "roughly ~65k". We'll need to add a new standard metadata item for this. He promised me a reference to a similar field in the .exr standard. Once I have that to refer to I'll draft a description of the item.

@lexaknyazev
Member

lexaknyazev commented Sep 26, 2024

The range for BC6H/ASTC HDR is [0.0 … 65504.0], same as non-negative 16-bit floats.
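The upper bound can be checked directly from the binary16 bit layout. The snippet below is only an illustration using Python's `struct` half-float format code; the helper `fits_hdr_range` is a hypothetical name, not part of any KTX tool.

```python
import struct

# 0x7BFF is the largest finite binary16 bit pattern:
# sign 0, exponent 11110b, mantissa all ones.
(max_half,) = struct.unpack("<e", (0x7BFF).to_bytes(2, "little"))

# The next bit pattern up, 0x7C00, is +infinity.
(inf_half,) = struct.unpack("<e", (0x7C00).to_bytes(2, "little"))


def fits_hdr_range(value: float) -> bool:
    """True if value lies in the non-negative finite half-float range."""
    return 0.0 <= value <= max_half
```

Any source value for which `fits_hdr_range` is false would need to be remapped or clamped before encoding.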

EXR supports three data types: float16, float32, and uint32. Leaving uint32 aside, the other two types are signed. It would make sense to store two values, a scale and an offset, to be able to map any source range to UASTC HDR.

HDR (Radiance) range is [0.0 … 1.698e+38]. Although these values are always non-negative, it would still be useful to support both a scale and an offset for more efficient use of the UASTC HDR range.

@lexaknyazev
Member

So my proposal would be to add a metadata entry with an 8-byte payload containing two float32 values: scale and offset, with the following usage:

  • Allowed for all floating-point Vulkan formats.
  • When present, the effective texture values are sampled * scale + offset.

It's important to multiply first so that decoders can use a single FMA instruction to apply the scale and the offset at once.
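A minimal sketch of how a reader might consume such an entry follows. It assumes the two float32 values are stored little-endian in the order scale, offset (the proposal above does not spell out byte order, so that part is an assumption based on the rest of KTX2 being little-endian). Both function names are hypothetical.

```python
import struct


def parse_map_range(payload: bytes) -> tuple[float, float]:
    """Unpack the proposed 8-byte payload into (scale, offset)."""
    if len(payload) != 8:
        raise ValueError("expected two float32 values (8 bytes)")
    # Assumption: little-endian, scale first, then offset.
    scale, offset = struct.unpack("<2f", payload)
    return scale, offset


def effective_value(sampled: float, scale: float, offset: float) -> float:
    """Multiply first, then add: the operand order an FMA would use.

    Plain Python does not fuse the two operations; a shader or a C
    decoder could issue this as one fused multiply-add.
    """
    return sampled * scale + offset
```

For example, a payload holding scale 2.0 and offset -1.0 turns a sampled value of 3.0 into an effective value of 5.0.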

@MarkCallow
Contributor Author

  • When present, the effective texture values are sampled * scale + offset

Let me make sure I have this correct. During encoding the "sampled" value will be calculated as sampled = (tv - offset) / scale, where tv is the original texture value. That value is then restored during sampling by tv = sampled * scale + offset. Correct? I realize that in the second case sampled is the filtered value from the texture, so "sampled" in my two equations is not the same quantity. I'm trying to keep things simple.

@lexaknyazev
Member

Yes, that's correct. Exact choice of scale and offset is up to the encoder but they must be finite values, i.e., neither NaN nor Inf.
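Since the choice is left to the encoder, the following is just one possible policy, sketched under the assumption that the encoder knows the minimum and maximum of the source data and wants to spread it over the whole [0.0 … 65504.0] range. The names `choose_map_range`, `encode` and `decode` are hypothetical.

```python
import math

HALF_MAX = 65504.0  # upper bound of the BC6H/ASTC HDR range


def choose_map_range(lo: float, hi: float) -> tuple[float, float]:
    """Pick (scale, offset) so that [lo, hi] maps onto [0, HALF_MAX]."""
    if not (math.isfinite(lo) and math.isfinite(hi)) or hi <= lo:
        raise ValueError("source range must be finite and non-empty")
    scale = (hi - lo) / HALF_MAX
    offset = lo
    # Both values must be finite: neither NaN nor Inf. A zero scale
    # would also make the encode step divide by zero.
    if not (math.isfinite(scale) and math.isfinite(offset)) or scale == 0.0:
        raise ValueError("scale and offset must be finite and usable")
    return scale, offset


def encode(tv: float, scale: float, offset: float) -> float:
    """Encoder side: value to store in the texture."""
    return (tv - offset) / scale


def decode(sampled: float, scale: float, offset: float) -> float:
    """Decoder side: restore the original texture value."""
    return sampled * scale + offset
```

The second finiteness check matters in practice: a source range such as [-1e308, 1e308] is itself finite, but its width overflows to infinity, so the resulting scale would be rejected.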

Whether sampled is post- or pre-filtered is an implementation detail. Assuming that original floating-point values are linear and the filtering is linear or nearest (e.g., not cubic) the results would be the same modulo FP precision.
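The reason is that an affine remap commutes with linear interpolation. A small numeric illustration, with a two-texel linear filter standing in for GPU filtering and all names hypothetical:

```python
def lerp(a: float, b: float, t: float) -> float:
    """Linear filter between two neighbouring texel values."""
    return a + (b - a) * t


def remap(v: float, scale: float, offset: float) -> float:
    """Apply the proposed scale and offset to one value."""
    return v * scale + offset


def filter_then_remap(a, b, t, scale, offset):
    """Post-filter application: sample first, then scale and offset."""
    return remap(lerp(a, b, t), scale, offset)


def remap_then_filter(a, b, t, scale, offset):
    """Pre-filter application: scale and offset each texel, then sample."""
    return lerp(remap(a, scale, offset), remap(b, scale, offset), t)
```

With texels 1.0 and 3.0, a filter weight of 0.25, scale 2.0 and offset 0.5, both orders give 3.5. With less convenient inputs the two results can differ in the last bits, which is the "modulo FP precision" caveat.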

@MarkCallow
Contributor Author

So my proposal would be to add a metadata entry with an 8-byte payload containing two float32 values: scale and offset, with the following usage:

How about KTXmapRange as the key? It has the benefit of being a multiple of 4 bytes long (when the terminating NUL is included).
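To illustrate the alignment point, here is a sketch of serializing one KTX2 key/value entry: a uint32 `keyAndValueByteLength`, the NUL-terminated UTF-8 key, the value, then padding to a 4-byte boundary. `KTXmapRange` is only the name floated in this thread, and `build_kv_entry` is a hypothetical helper.

```python
import struct


def build_kv_entry(key: str, value: bytes) -> bytes:
    """Serialize one key/value entry with 4-byte alignment padding."""
    key_bytes = key.encode("utf-8") + b"\x00"  # keys are NUL-terminated
    body = key_bytes + value
    padding = -len(body) % 4                   # bytes needed to reach alignment
    # keyAndValueByteLength counts key and value but not the padding.
    return struct.pack("<I", len(body)) + body + b"\x00" * padding


# Proposed entry: 12-byte key (with NUL) plus two float32 values.
map_range_entry = build_kv_entry("KTXmapRange", struct.pack("<2f", 1.0, 0.0))
```

With a 12-byte key and an 8-byte value the entry body is 20 bytes, so no padding is needed and the float32 pair starts on a 4-byte boundary within the entry.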

@fluppeteer

the transcoding description into the KTX spec

The unreleased KDFS version of the original UASTC spec contains both the bitstream and the transcoding steps. I think UASTC HDR should follow the same path.

I was going to email this, but this is as good a place as any...

I agree that describing both is ideal. I've been making decent progress in merging everything for a 1.3.2 release and should have something reviewable over the weekend, but we had stalled at the point of folding in UASTC before, and I wanted clarification on the thing that had previously made me hesitate:

There was a comment that "to get 32-bpp data from a UASTC texture, it's recommended to first transcode to ASTC, then decode that by following the ASTC specification and any applicable extensions", and I don't think I ever had clarification on whether this meant to remove the entire "Decoding process" section (which describes individual pixel colour values). Is that the intent, or should I preserve both, with the note that they're supposed to be identical and that decoding via ASTC is still recommended?

@richgel999

There was a comment (BinomialLLC/basis_universal#162) that "to get 32-bpp data from a UASTC texture, it's recommended to first transcode to ASTC, then decode that by following the ASTC specification and any applicable extensions", and I don't think I ever had clarification on whether this meant to remove the entire "Decoding process" section (which describes individual pixel colour values). Is that the intent, or should I preserve both, with the note that they're supposed to be identical and that decoding via ASTC is still recommended?

If you're using BasisU's transcoder, you can transcode UASTC LDR to basist::cTFRGBA32 (or another uncompressed LDR format, like cTFRGB565 etc.). Internally this would skip the ASTC block packing step, and be faster. It will decode the UASTC blocks to a logical format and then immediately decode these block pixels to 32bpp.

The Universal ASTC HDR image format (UASTC HDR) is indicated by
`colorModel` `KHR_DF_MODEL_UASTC_HDR` (= 167). This format supports
two compression ratios defined by the texel block size in the DFD:
4x4 and 6x6. Images in this format are a strict RGB-only ASTC HDR
Member

What's the reference for UASTC HDR 6x6?

Contributor Author

Not available yet. @richgel999 is working on it.

ktxspec.adoc Outdated
@@ -633,7 +635,8 @@ supercompression.
| 1 | BasisLZ | <<etc1s_slice,ETC1S Slice Decoding>> | <<basislz_gd,BasisLZ Global Data>>
| 2 | Zstandard | <<RFC8478>> | n/a
| 3 | ZLIB | <<RFC1950>> | n/a
-| 4・・・0xffff | Reserved^1^ | |
+| 4 | Basis GPU Photo 6x6 | https://github.com/BinomialLLC/basis_universal/wiki/ASTC-HDR-6x6-Intermediate-File-Format-(Basis-GPU-Photo-6x6) | n/a
Contributor Author

Need to figure out what to call this. Note that this is just the supercompressed bitstream, not a "file format" as the linked document's name implies. @richgel999 any suggestions?

Member

That scheme is a bit similar to BasisLZ but does not use Huffman coding. It's a mix of RLE, VLC, and logical repackaging of ASTC fields.

@richgel999 richgel999 Feb 22, 2025

Yes - it's command based. The current format is easy for the RDO coder to target (it's trivial to accurately estimate command bit prices, and all bit prices are integer and stable). Adding entropy coding and eliminating other redundancies would probably boost compression significantly (10-30%?).

The high-level format is documented here:
https://github.com/BinomialLLC/basis_universal/wiki/UASTC-HDR-6x6-Intermediate-File-Format-(Basis-GPU-Photo-6x6)

Adding some sort of entropy coding to the format would be a relatively easy next step.

Member

If entropy coding is coming, we shouldn't standardize it now.

@richgel999 richgel999 Feb 25, 2025

Entropy coding is coming in the next major HDR release, but LDR 4x4-8x8 RDO+intermediate and real-time encoders for JPEG/JPEG XL etc. are looking like higher priorities for our corporate customers (so it might take a while).
Entropy coding is a tradeoff, and we're still going to support the current files. Most of the compression value comes from ASTC itself and the encoder's RDO, so it'll remain optional.

Contributor Author

@MarkCallow MarkCallow Feb 25, 2025

Okay. I have moved GPU Photo supercompression to the vendor list. See 291aba1 and 7c62b62. (Sorry for two. I failed to copy the format link the first time.) This means Rich can make legal KTX2 files now. When entropy coding is finalized we can add it to the standardized list with the name minus the BINOMIAL suffix.

@richgel999 what contact info would you like to include in the vendor table?

Member

@lexaknyazev lexaknyazev Feb 25, 2025

I have moved GPU Photo supercompression to the vendor list.

@MarkCallow I think this is unnecessary:

  1. The scheme is now called "UASTC HDR 6x6 Intermediate" in Basis Universal readme.
  2. We should be able to standardize it without entropy coding for now and add one more enum later if needed.

Contributor Author

Here is a screenshot of the formatted vendor scheme table.

Screenshot 2025-02-25 at 21 47 00

Contributor Author

I have moved GPU Photo supercompression to the vendor list.

@MarkCallow I think this is unnecessary:

I was responding to your earlier comment

If entropy coding is coming, we shouldn't standardize it now.

I had somehow missed Rich's comment about the software continuing to support current files and entropy coding being optional. I'll revert the change tomorrow.

The scheme is now called "UASTC HDR 6x6 Intermediate"

Yes. I find it confusing to use the same name for both a supercompression scheme and a block-compressed format. Plus it is a very long name.

@richgel999 what contact info would you like to include in the vendor table?

[email protected] is good.

16-bit floating-point values before being uploaded to and sampled
by a GPU that does not support ASTC HDR. UASTC HDR images can be
supercompressed with any scheme except BasisLZ (`supercompressionScheme`
= 1) and Basis GPU Photo 6x6 (`supercompressionScheme` = 4). If
Contributor Author

If I understand correctly, the ASTC 6x6 part of "UASTC HDR 6x6 Intermediate" is the same as UASTC HDR 6x6, since @richgel999 has stated that the same encoder is used. Therefore what I would like to say here is:

"UASTC HDR 4x4 images can be supercompressed with any scheme except BasisLZ and Basis GPU Photo 6x6. UASTC HDR 6x6 images can be supercompressed with any scheme except BasisLZ. Best results are obtained with Basis GPU Photo 6x6."

@richgel999 is this okay?
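The proposed wording can be expressed as a small validity check. This is a sketch of the rule as worded in this comment only: scheme 4 is the draft assignment from the diff above, vendor and reserved scheme values are left out of scope, and the function name is hypothetical.

```python
# supercompressionScheme values: 0 to 3 from the KTX2 table, 4 from this draft.
NONE, BASISLZ, ZSTD, ZLIB = 0, 1, 2, 3
GPU_PHOTO_6X6 = 4  # draft assignment, still under discussion in this PR


def scheme_allowed(block_dims: tuple[int, int], scheme: int) -> bool:
    """Check a scheme against the proposed UASTC HDR wording."""
    if block_dims not in ((4, 4), (6, 6)):
        raise ValueError("UASTC HDR is proposed for 4x4 and 6x6 blocks only")
    if scheme == BASISLZ:
        return False                    # excluded for both block sizes
    if scheme == GPU_PHOTO_6X6:
        return block_dims == (6, 6)     # only meaningful for 6x6 data
    return scheme in (NONE, ZSTD, ZLIB)
```

Under this reading, a 4x4 file with scheme 4 is invalid, while a 6x6 file may use no supercompression, Zstandard, ZLIB or scheme 4.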

@richgel999 richgel999 Feb 22, 2025

Here you go:

  1. UASTC HDR 4x4: the KTX2 data is 100% standard ASTC HDR 4x4. Can be stored either uncompressed or with Zstd. The current encoder doesn't support RDO yet, just highest quality. The allowed subset of ASTC (which permits very fast pure transcoding to BC6H) is described here:
    https://github.com/BinomialLLC/basis_universal/wiki/UASTC-HDR-4x4-Texture-Specification-v1.0

  2. UASTC HDR 6x6: the KTX2 data is standard ASTC HDR 6x6. Can be stored either uncompressed or with Zstd. The encoder supports highest quality or RDO encoding. The current encoder only exploits up to 75 modes (but this could potentially change). The transcoder can unpack the ASTC HDR 6x6 tex data using our general purpose ASTC decoder and re-encode to BC6H.

The plan for this is to further optimize BC6H transcoding from this 75 mode subset, but it wasn't necessary for the initial release because the scalar BC6H encoder is quite optimized. The BC6H encoder supports 1 or 2 subsets.

  3. "UASTC HDR 6x6 Intermediate" (the beginnings of "GPU Photo") uses a custom compression format that is internally based on ASTC HDR 6x6, so it's simple to transcode to ASTC HDR 6x6. The transcoder can unpack this to ASTC HDR 6x6 and re-encode to BC6H. The custom format is described in our wiki here:
    https://github.com/BinomialLLC/basis_universal/wiki/UASTC-HDR-6x6-Intermediate-File-Format-(Basis-GPU-Photo-6x6)

The current custom format for this does not use entropy coding, just phase-in codes. The BC6H encoder is the same one used for UASTC HDR 6x6.
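For readers unfamiliar with the term, phase-in codes (also called truncated binary codes) assign whole-bit codewords to an alphabet whose size is not a power of two, which fits the earlier remark that all bit prices are integer and stable. The sketch below shows the generic technique only. It is not the Basis bitstream layout, which is defined by the wiki page linked above, and the function name is hypothetical.

```python
def phase_in_encode(x: int, n: int) -> str:
    """Return the phase-in codeword for symbol x from an alphabet of size n."""
    if not 0 <= x < n:
        raise ValueError("symbol out of range")
    if n == 1:
        return ""                       # a single symbol needs no bits
    k = n.bit_length() - 1              # floor(log2(n))
    short_count = (1 << (k + 1)) - n    # symbols that get the shorter code
    if x < short_count:
        return format(x, f"0{k}b")      # k-bit codeword
    # Remaining symbols use k+1 bits, shifted so no short code is a prefix.
    return format(x + short_count, f"0{k + 1}b")
```

With five symbols the codewords are 00, 01, 10, 110 and 111: every codeword has an integer length known in advance, so an encoder can price a command by simple lookup.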

Contributor Author

It sounds like we do need a "UASTC HDR 6x6 Intermediate" section separate from the "UASTC HDR" section.

@richgel999

Note Windows ARM support is checked into the Basis Universal repo. Tested on Snapdragon X.

@MarkCallow
Contributor Author

MarkCallow commented Apr 2, 2025

Sorry for my long delay on this. I ended up in a much deeper hole than expected getting the KTX-Software 4.4.0 release out. It is done now and has cleared the decks for the next release to focus on HDR work.

About removing the vendor GPU Photo supercompression scheme and reverting it to a standard scheme, I have a question for @richgel999. When entropy encoding is introduced, will decoders need to be aware of it, or is it purely on the encoder side? If decoders need to be aware, then I am reluctant to make it a standard scheme at this point.

Also @richgel999 any further thoughts on naming for the intermediate format? As I wrote elsewhere, I find using the same name for both the supercompression scheme and the block-compressed format confusing, plus "UASTC HDR 6x6 Intermediate" is long and, I feel, unintuitive.

The bitstream specs still seem to be in flux. Do the decoders need to be aware of any of the planned changes, @richgel999?

I'd like to get this wrapped up as quickly as possible so we can get approval from the WG and submit the spec for ratification, which I think is necessary given the scope of the additions.

4 participants