BENTO4: AP4_CencSampleInfoTable::Create rejects valid IV-only sample aux info (breaks CENC audio playback) #2015

@drewvs

Bug report

Summary

AP4_CencSampleInfoTable::Create in Source/C++/Core/Ap4CommonEncryption.cpp rejects sample auxiliary information that is exactly iv_size bytes long (IV only, no subsample map) with AP4_ERROR_INVALID_FORMAT. This is a valid CENC layout for protected samples that do not use subsample encryption — e.g. audio tracks, where the whole sample is encrypted in one piece and there is no need for a subsample count / subsample map.

Content that uses this layout (YouTube TV / YouTube Music VOD, and likely other services) silently fails to decrypt downstream. The caller in inputstream.adaptive, CFragmentedSampleReader::ProcessMoof, catches the error and treats the fragment as unencrypted (goto SUCCESS with the comment "we assume unencrypted fragment here"), so ciphertext is passed straight to the decoder. For AAC audio this shows up as an endless stream of ffmpeg errors (Reserved bit set, Number of bands exceeds limit, Prediction is not allowed in AAC-LC, ms_present = 3 is reserved), and the player eventually crashes once the audio pipeline state is corrupted.

Reproduction

  • Player: Kodi 21 Omega with inputstream.adaptive 21.5.18 (stock)
  • Content: any YouTube TV / YouTube Unplugged VOD movie with encrypted audio, e.g. an MGM+ title
  • DRM: Widevine L3
  • Result: for a movie whose first audio subsegment is cleartext, audio plays for ~10 seconds, then garbage starts as soon as the first encrypted subsegment is read; otherwise audio is garbage from the first sample.

The raw saiz/saio box layout from the affected files:

saiz: flags=0x000000 default_sample_info_size=8 sample_count=N
saio: version=0 entry_count=1 offset=<offset into mdat payload>

default_sample_info_size = 8 exactly matches tenc.default_Per_Sample_IV_Size = 8. There is no trailing subsample_count, no subsample map. The IVs are stored inline at the start of each moof's mdat.
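To make the layout above concrete, here is a minimal sketch that reads the saiz fields as laid out in ISO/IEC 14496-12 and checks whether every sample carries IV-only aux info. SaizInfo, ParseSaizPayload and IsIvOnlyLayout are hypothetical names invented for this illustration, not Bento4 APIs, and the input is assumed to be the box payload that follows the 8-byte size/type header.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for the saiz fields (not a Bento4 type).
struct SaizInfo {
    uint32_t flags;
    uint8_t  default_sample_info_size;
    uint32_t sample_count;
    std::vector<uint8_t> sample_info_sizes; // only filled when the default size is 0
};

static uint32_t ReadU32BE(const uint8_t* p) {
    return (uint32_t(p[0]) << 24) | (uint32_t(p[1]) << 16) |
           (uint32_t(p[2]) << 8)  |  uint32_t(p[3]);
}

// Parses a saiz payload: version/flags, optional aux_info_type fields,
// default_sample_info_size, sample_count, and the optional per-sample sizes.
bool ParseSaizPayload(const uint8_t* data, size_t size, SaizInfo& out) {
    if (size < 4) return false;
    out.flags = ReadU32BE(data) & 0x00FFFFFFu; // drop the version byte
    size_t pos = 4;
    if (out.flags & 1u) {
        // aux_info_type (4 bytes) + aux_info_type_parameter (4 bytes), skipped here
        if (size - pos < 8) return false;
        pos += 8;
    }
    if (size - pos < 5) return false;
    out.default_sample_info_size = data[pos];
    out.sample_count = ReadU32BE(data + pos + 1);
    pos += 5;
    out.sample_info_sizes.clear();
    if (out.default_sample_info_size == 0) {
        // One size byte per sample follows when there is no default size.
        if (size - pos < out.sample_count) return false;
        out.sample_info_sizes.assign(data + pos, data + pos + out.sample_count);
    }
    return true;
}

// True when every sample's aux info is exactly one IV long (no subsample map).
bool IsIvOnlyLayout(const SaizInfo& saiz, uint8_t per_sample_iv_size) {
    if (per_sample_iv_size == 0) return false;
    if (saiz.default_sample_info_size != 0) {
        return saiz.default_sample_info_size == per_sample_iv_size;
    }
    if (saiz.sample_info_sizes.empty()) return false;
    for (size_t i = 0; i < saiz.sample_info_sizes.size(); ++i) {
        if (saiz.sample_info_sizes[i] != per_sample_iv_size) return false;
    }
    return true;
}
```

For the affected files, a payload with flags=0 and default_sample_info_size=8 combined with tenc.default_Per_Sample_IV_Size = 8 would be reported as IV-only by this check.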

Root cause

Source/C++/Core/Ap4CommonEncryption.cpp around line 2950 (v1.6.0-641-3-Omega):

if (per_sample_iv_size) {
    if (per_sample_iv_size > info_size) {
        result = AP4_ERROR_INVALID_FORMAT;
        goto end;
    }
    table->SetIv(saiz_index, info_data);
} else {
    table->SetIv(saiz_index, constant_iv);
}
if (info_size < per_sample_iv_size+2) {     // <-- bug
    result = AP4_ERROR_INVALID_FORMAT;
    goto end;
}
AP4_UI16 subsample_count = AP4_BytesToUInt16BE(info_data+per_sample_iv_size);
if (info_size < per_sample_iv_size+2+subsample_count*6) {
    result = AP4_ERROR_INVALID_FORMAT;
    goto end;
}
table->AddSubSampleData(subsample_count, info_data+per_sample_iv_size+2);

The second top-level check (info_size < per_sample_iv_size+2) unconditionally requires two more bytes after the IV for a subsample_count field. Per ISO/IEC 23001-7 (Common Encryption), the subsample map is optional: it is only required when the UseSubsampleEncryption flag is set in the sample encryption box (senc) or when the sample has cleartext portions that must be distinguished from encrypted ones. Audio samples in which every byte is encrypted need only the IV.
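The size rules described here can be sketched as a small standalone classifier. This is an illustration only: AuxInfoLayout and ClassifyAuxInfo are hypothetical names, not part of Bento4, and the sketch assumes each subsample entry is 6 bytes (2 bytes of clear data length plus 4 bytes of protected data length), matching the *6 in the code above.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical classification of one sample's auxiliary information.
enum class AuxInfoLayout { kIvOnly, kIvWithSubsamples, kInvalid };

// Classifies a single aux info entry of info_size bytes.
// subsample_count_out receives the parsed count, or 0 when there is no map.
AuxInfoLayout ClassifyAuxInfo(const uint8_t* info,
                              size_t         info_size,
                              size_t         iv_size,
                              uint16_t*      subsample_count_out) {
    *subsample_count_out = 0;
    if (info_size < iv_size) return AuxInfoLayout::kInvalid;   // truncated IV
    if (info_size == iv_size) return AuxInfoLayout::kIvOnly;   // whole-sample encryption
    if (info_size - iv_size < 2) return AuxInfoLayout::kInvalid; // dangling byte after the IV

    // Big-endian 16-bit subsample_count directly after the IV.
    const uint16_t count =
        static_cast<uint16_t>((info[iv_size] << 8) | info[iv_size + 1]);

    // The map must fit: 6 bytes per subsample entry.
    if (info_size - iv_size - 2 < static_cast<size_t>(count) * 6) {
        return AuxInfoLayout::kInvalid;
    }
    *subsample_count_out = count;
    return AuxInfoLayout::kIvWithSubsamples;
}
```

With an 8-byte IV, an 8-byte entry classifies as IV-only, a 16-byte entry with subsample_count = 1 classifies as IV plus subsample map, and a 9-byte entry is rejected.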

Proposed fix

Treat aux info of exactly iv_size bytes as "IV only, zero subsamples":

--- a/Source/C++/Core/Ap4CommonEncryption.cpp
+++ b/Source/C++/Core/Ap4CommonEncryption.cpp
@@ -2956,17 +2956,30 @@
                 } else {
                     table->SetIv(saiz_index, constant_iv);
                 }
-                if (info_size < per_sample_iv_size+2) {
-                    result = AP4_ERROR_INVALID_FORMAT;
-                    goto end;
-                }
-                AP4_UI16 subsample_count = AP4_BytesToUInt16BE(info_data+per_sample_iv_size);
-                if (info_size < per_sample_iv_size+2+subsample_count*6) {
-                    // not enough data
+                // Aux info layouts we must accept:
+                //   1) [IV][subsample_count:2][subsample_map:subsample_count*6]
+                //      (subsample encryption, e.g. video with cleartext slice
+                //      headers mixed with encrypted payload)
+                //   2) [IV]                                  (no subsample info)
+                //      (whole-sample encryption, common for audio where every
+                //      byte of the sample is encrypted and there is nothing
+                //      to describe — valid per ISO/IEC 23001-7)
+                AP4_UI16 subsample_count = 0;
+                if (info_size >= per_sample_iv_size+2) {
+                    subsample_count = AP4_BytesToUInt16BE(info_data+per_sample_iv_size);
+                    if (info_size < per_sample_iv_size+2+subsample_count*6) {
+                        // not enough data
+                        result = AP4_ERROR_INVALID_FORMAT;
+                        goto end;
+                    }
+                    table->AddSubSampleData(subsample_count, info_data+per_sample_iv_size+2);
+                } else if (info_size == per_sample_iv_size) {
+                    // IV-only aux info: whole sample is encrypted, no subsamples.
+                    table->AddSubSampleData(0, NULL);
+                } else {
                     result = AP4_ERROR_INVALID_FORMAT;
                     goto end;
                 }
-                table->AddSubSampleData(subsample_count, info_data+per_sample_iv_size+2);
                 saiz_index++;
             }
         }

The change is backward compatible: content that uses the existing [IV][subsample_count:2][subsample_map] layout still takes the first branch unchanged. Only aux info that is exactly iv_size bytes (and would have failed before) is now accepted.
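One way to sanity-check the compatibility claim is to model the old and patched size checks as two standalone predicates and compare them over a range of sizes. The sketch below does that; OldCheckAccepts and PatchedCheckAccepts are hypothetical helpers that mirror only the size validation quoted in this report, not the surrounding Bento4 code.

```cpp
#include <cstddef>
#include <cstdint>

static uint16_t ReadU16BE(const uint8_t* p) {
    return static_cast<uint16_t>((p[0] << 8) | p[1]);
}

// Mirrors the size checks before the patch: a subsample_count is always required.
bool OldCheckAccepts(const uint8_t* info, size_t info_size, size_t iv_size) {
    if (iv_size && iv_size > info_size) return false;
    if (info_size < iv_size + 2) return false;
    const size_t count = ReadU16BE(info + iv_size);
    return info_size >= iv_size + 2 + count * 6;
}

// Mirrors the size checks after the patch: IV-only aux info is also accepted.
bool PatchedCheckAccepts(const uint8_t* info, size_t info_size, size_t iv_size) {
    if (iv_size && iv_size > info_size) return false;
    if (info_size >= iv_size + 2) {
        // Same path as before: subsample_count followed by the map.
        const size_t count = ReadU16BE(info + iv_size);
        return info_size >= iv_size + 2 + count * 6;
    }
    // New: exactly iv_size bytes means "IV only, zero subsamples".
    return info_size == iv_size;
}
```

Looping over info_size from 0 to 32 with an 8-byte IV, everything the old predicate accepts is still accepted by the patched one, and the only size that changes from rejected to accepted is info_size == 8.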

Testing

Verified on the YouTube TV VOD path in plugin.video.youtubetv (sabr branch). Without the patch: audio decodes as AAC garbage starting at the first encrypted subsegment, ffmpeg produces continuous [aac] Reserved bit set / Number of bands exceeds limit errors, Kodi crashes. With the patch: audio decrypts correctly through the full file, no ffmpeg errors, no crashes. Video playback and existing CENC video decryption are unaffected (tested on the same content).

Cross-verified that the license server returns keys with KIDs matching the audio init's tenc box, and that pywidevine with the same PSSH can decrypt the same subsegments byte-perfectly using those keys — proving the issue was not in the license exchange or the key set, but in how Bento4 consumed the sample auxiliary info.

References

  • ISO/IEC 23001-7:2016 §7.3, "Sample Auxiliary Information format"
  • xbmc/inputstream.adaptive#2013 — the secure audio path discussion that surfaced the underlying flow
  • Patch file in xbmc/inputstream.adaptive fork branch omega-audio-secpath: depends/common/bento4/0001-accept-iv-only-aux-info-for-audio.patch

Debuglog

The debuglog can be found here:

Your Environment

Used Operating system:

  • Android

  • iOS

  • tvOS

  • [X] Linux

  • macOS

  • Windows

  • Windows UWP

  • Operating system version/name:

  • Kodi version: 21/22

Labels

  • Issue Type: Bug (issue has reported a bug)
  • Resolution: External (problem not related to this repo or caused by external factors)
