Harden ASF parser against infinite loop and malformed headers#2624
Harden ASF parser against infinite loop and malformed headers#2624
Conversation
- Prevent parser rewind by disallowing negative skip lengths in tokenizer - Validate ASF object sizes to ensure minimum header length - Add sanity limit (10k) for numberOfHeaderObjects to avoid excessive iteration
755f01c to
8929861
Compare
There was a problem hiding this comment.
Pull request overview
This PR hardens media parsing against crafted ASF/WMA (and adds an AIFF regression case) that could previously lead to excessive iteration or infinite-loop–like behavior, improving resilience against malformed headers.
Changes:
- Adds ASF header sanity checks (cap
numberOfHeaderObjects) and strengthens ASF object-size validation. - Exposes
UnexpectedFileContentErrorpublicly and adds an AIFF test for a zero-lengthCOMMchunk. - Adds new malicious/corrupt sample fixtures and corresponding security-focused tests.
Reviewed changes
Copilot reviewed 5 out of 8 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
lib/asf/AsfParser.ts |
Adds a header-object count sanity limit; parsing flow adjusted (requires a couple of fixes noted in comments). |
lib/asf/AsfObject.ts |
Refactors top-level header parsing to reuse the common header token and adds a minimum object-size validation. |
lib/ParseError.ts |
Exports UnexpectedFileContentError for external/type-safe consumption. |
test/test-file-asf.ts |
Adds security-hardening test coverage for ASF header object count and loop prevention. |
test/test-file-aiff.ts |
Adds a regression test for AIFF/AIFC malformed COMM chunk size (CWE-835 style). |
test/samples/asf/max-numberOfObjectHeaders.wma |
New fixture for extreme header-object count. |
test/samples/asf/CWE-835-03.wma |
New ASF fixture (currently not referenced by tests). |
test/samples/aiff/CWE-835-01.aiff |
New AIFF/AIFC malformed fixture for zero COMM chunk length. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| public async parse() { | ||
| const header = await this.tokenizer.readToken<AsfObject.IAsfTopLevelObjectHeader>(AsfObject.TopLevelHeaderObjectToken); | ||
| if (!header.objectId.equals(AsfGuid.HeaderObject)) { | ||
| throw new AsfContentParseError(`expected asf header; but was not found; got: ${header.objectId.str}`); | ||
| if (header.numberOfHeaderObjects > 10000) { | ||
| throw new AsfContentParseError( | ||
| `Unrealistic number of ASF header objects: ${header.numberOfHeaderObjects}` | ||
| ); | ||
| } | ||
| await this.parseObjectHeader(header.numberOfHeaderObjects); |
There was a problem hiding this comment.
parse() no longer verifies the top-level ASF GUID is AsfGuid.HeaderObject (the previous expected asf header check was removed). This can allow non-ASF input to be parsed as ASF, changing failure mode and potentially masking file-type issues. Consider restoring the GUID check (in addition to the new numberOfHeaderObjects sanity limit).
| private async parseExtensionObject(extensionSize: number): Promise<void> { | ||
| do { | ||
| // Parse common header of the ASF Object (3.1) | ||
| const header = await this.tokenizer.readToken<AsfObject.IAsfObjectHeader>(AsfObject.HeaderObjectToken); | ||
| const remaining = header.objectSize - AsfObject.HeaderObjectToken.len; | ||
| if (remaining < 0) { | ||
| throw new AsfContentParseError(`Invalid ASF header object size: ${header.objectSize}`); | ||
| } | ||
| // Parse data part of the ASF Object |
There was a problem hiding this comment.
parseExtensionObject uses a do { ... } while (extensionSize > 0) loop, which will still parse one object even when extensionSize is 0. If an ASF Header Extension Object legitimately has extensionDataSize = 0 (or a crafted file sets it to 0), this will read unexpected data. Consider switching to a while (extensionSize > 0) loop (pre-check) so 0 bytes means no iteration.
| get: (buf, off): IAsfTopLevelObjectHeader => { | ||
| const base = HeaderObjectToken.get(buf, off); | ||
|
|
||
| return { | ||
| objectId: AsfGuid.fromBin(buf, off), | ||
| objectSize: Number(Token.UINT64_LE.get(buf, off + 16)), | ||
| ...base, | ||
| numberOfHeaderObjects: Token.UINT32_LE.get(buf, off + 24) | ||
| // Reserved: 2 bytes | ||
| // reserved: 2 bytes | ||
| }; |
There was a problem hiding this comment.
TopLevelHeaderObjectToken now reuses HeaderObjectToken.get, but it only enforces objectSize >= 24. The ASF top-level Header Object requires at least 30 bytes (GUID+size+numberOfHeaderObjects+reserved). Add a check that base.objectSize >= TopLevelHeaderObjectToken.len (30) to properly reject malformed headers in the 24..29 range.
| ); | ||
| } | ||
| await this.parseObjectHeader(header.numberOfHeaderObjects); | ||
| } |
There was a problem hiding this comment.
The parseObjectHeader(header.numberOfHeaderObjects) call inherits a subtle edge case from parseObjectHeader’s do { ... } while (--numberOfObjectHeaders) loop: if numberOfHeaderObjects is 0, the loop decrements into negative values (truthy) and becomes unbounded until EOF/error. Consider validating numberOfHeaderObjects >= 1 here (or rewriting the loop) so a 0 value cannot trigger excessive parsing/DoS behavior.
Summary
This PR hardens ASF parsing against malformed input that could previously trigger an infinite loop and cause a denial of service.
Changes
Additional context
An additional layer of protection is introduced in strtok3 via Borewit/strtok3#1290, which disallows negative ignore() values. This PR enforces that safeguard by updating the dependency (see #2615).
Security
Fixes an infinite-loop denial of service when parsing crafted ASF/WMA files.
Resolves: https://github.com/Borewit/music-metadata/security/advisories/GHSA-v7g3-hcgj-f95g