Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix a bug where ISVCDecoder::DecodeFrameNoDelay() could fail when decoding an H.264 stream encoded by the Apple HWA encoder. #3787

Merged
merged 3 commits into from
Oct 10, 2024

Conversation

sile
Copy link
Contributor

@sile sile commented Sep 18, 2024

Let me report a decoding issue and propose a patch to fix it.

I recorded H.264 streams published by Safari and transmitted through a WebRTC SFU server.
When I attempted to decode these streams using openh264 (commit: 3668daf), the method ISVCDecoder::DecodeFrameNoDelay() failed with the error code -4 for some of the streams, whereas popular players like ffplay were able to decode the streams without any errors.

The following file contains an H.264 elementary stream of a failed attempt.
input.h264.gz

While I cannot provide the exact steps to reproduce the issue, it has been confirmed that such a stream can be generated by Safari on macOS 15.0 and iPad 17.6.1.

After inserting additional debug prints and examining the openh264 source code, it was discovered that the error originated from the following code:

if (iMaxLongTermFrameIdx > int32_t (pSps->uiLog2MaxFrameNum)) {
//ISO/IEC 14496-10:2009(E) 7.4.3.3 Decoded reference picture marking semantics page 96
return GENERATE_ERROR_NO (ERR_LEVEL_SLICE_HEADER, ERR_INFO_INVALID_REF_MARKING);
}

In the attached .h264 file, the iMaxLongTermFrameIdx and pSps->uiLog2MaxFrameNum had values of 13 and 12, respectively. Therefore, the above condition was met, which resulted in the error being returned.

However, the H.264 specification says as follows:

max_long_term_frame_idx_plus1 minus 1 specifies the maximum value of long-term frame index allowed for long-term reference pictures (until receipt of another value of max_long_term_frame_idx_plus1). The value of max_long_term_frame_idx_plus1 shall be in the range of 0 to max_num_ref_frames, inclusive.

From: [Rec. ITU-T H.264 (08/2021)] 7.4.3.3 Decoded reference picture marking semantics (p.97)

Although I am not well-versed in the H.264 specification, it seems that pSps->iNumRefFrames should be used in the code above instead of pSps->uiLog2MaxFrameNum.
In the attached file, the value of pSps->iNumRefFrames was 15. After modifying the code to use pSps->iNumRefFrames (i.e., the code in this PR branch), the decoding process succeeded without any errors.

@BenzhengZhang
Copy link
Collaborator

Hi @sile, please merge the master branch as there is a new commit that fixes the fuzzing test.

@sile
Copy link
Contributor Author

sile commented Oct 9, 2024

@BenzhengZhang Merged!

@BenzhengZhang BenzhengZhang merged commit f70c905 into cisco:master Oct 10, 2024
8 checks passed
@sile sile deleted the fix-parse-dec-ref-pic-marking-error branch October 10, 2024 06:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants