Skip to content

Harden ASF parser against infinite loop and malformed headers#2624

Open
Borewit wants to merge 1 commit intomasterfrom
harden-asf-parser
Open

Harden ASF parser against infinite loop and malformed headers#2624
Borewit wants to merge 1 commit intomasterfrom
harden-asf-parser

Conversation

@Borewit
Copy link
Copy Markdown
Owner

@Borewit Borewit commented Apr 1, 2026

Summary

This PR hardens ASF parsing against malformed input that could previously trigger an infinite loop and cause a denial of service.

Changes

  • Prevent parser rewind by disallowing negative skip lengths in the tokenizer
  • Validate ASF object sizes to enforce a minimum header length
  • Add a sanity limit of 10,000 for numberOfHeaderObjects to avoid excessive iteration

Additional context

An additional layer of protection is introduced in strtok3 via Borewit/strtok3#1290, which disallows negative ignore() values. This PR enforces that safeguard by updating the dependency (see #2615).

Security

Fixes an infinite-loop denial of service when parsing crafted ASF/WMA files.

Resolves: https://github.com/Borewit/music-metadata/security/advisories/GHSA-v7g3-hcgj-f95g

@Borewit Borewit self-assigned this Apr 1, 2026
@Borewit Borewit added the security Pull requests that address a security vulnerability label Apr 1, 2026
- Prevent parser rewind by disallowing negative skip lengths in tokenizer
- Validate ASF object sizes to ensure minimum header length
- Add sanity limit (10k) for numberOfHeaderObjects to avoid excessive iteration
@Borewit Borewit force-pushed the harden-asf-parser branch from 755f01c to 8929861 Compare April 1, 2026 19:23
@coveralls
Copy link
Copy Markdown

Coverage Status

coverage: 95.772% (+0.06%) from 95.712%
when pulling 8929861 on harden-asf-parser
into 1a057b4 on master.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens media parsing against crafted ASF/WMA (and adds an AIFF regression case) that could previously lead to excessive iteration or infinite-loop–like behavior, improving resilience against malformed headers.

Changes:

  • Adds ASF header sanity checks (cap numberOfHeaderObjects) and strengthens ASF object-size validation.
  • Exposes UnexpectedFileContentError publicly and adds an AIFF test for a zero-length COMM chunk.
  • Adds new malicious/corrupt sample fixtures and corresponding security-focused tests.

Reviewed changes

Copilot reviewed 5 out of 8 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
lib/asf/AsfParser.ts Adds a header-object count sanity limit; parsing flow adjusted (requires a couple of fixes noted in comments).
lib/asf/AsfObject.ts Refactors top-level header parsing to reuse the common header token and adds a minimum object-size validation.
lib/ParseError.ts Exports UnexpectedFileContentError for external/type-safe consumption.
test/test-file-asf.ts Adds security-hardening test coverage for ASF header object count and loop prevention.
test/test-file-aiff.ts Adds a regression test for AIFF/AIFC malformed COMM chunk size (CWE-835 style).
test/samples/asf/max-numberOfObjectHeaders.wma New fixture for extreme header-object count.
test/samples/asf/CWE-835-03.wma New ASF fixture (currently not referenced by tests).
test/samples/aiff/CWE-835-01.aiff New AIFF/AIFC malformed fixture for zero COMM chunk length.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 24 to 31
public async parse() {
const header = await this.tokenizer.readToken<AsfObject.IAsfTopLevelObjectHeader>(AsfObject.TopLevelHeaderObjectToken);
if (!header.objectId.equals(AsfGuid.HeaderObject)) {
throw new AsfContentParseError(`expected asf header; but was not found; got: ${header.objectId.str}`);
if (header.numberOfHeaderObjects > 10000) {
throw new AsfContentParseError(
`Unrealistic number of ASF header objects: ${header.numberOfHeaderObjects}`
);
}
await this.parseObjectHeader(header.numberOfHeaderObjects);
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parse() no longer verifies the top-level ASF GUID is AsfGuid.HeaderObject (the previous expected asf header check was removed). This can allow non-ASF input to be parsed as ASF, changing failure mode and potentially masking file-type issues. Consider restoring the GUID check (in addition to the new numberOfHeaderObjects sanity limit).

Copilot uses AI. Check for mistakes.
Comment on lines 109 to 114
private async parseExtensionObject(extensionSize: number): Promise<void> {
do {
// Parse common header of the ASF Object (3.1)
const header = await this.tokenizer.readToken<AsfObject.IAsfObjectHeader>(AsfObject.HeaderObjectToken);
const remaining = header.objectSize - AsfObject.HeaderObjectToken.len;
if (remaining < 0) {
throw new AsfContentParseError(`Invalid ASF header object size: ${header.objectSize}`);
}
// Parse data part of the ASF Object
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parseExtensionObject uses a do { ... } while (extensionSize > 0) loop, which will still parse one object even when extensionSize is 0. If an ASF Header Extension Object legitimately has extensionDataSize = 0 (or a crafted file sets it to 0), this will read unexpected data. Consider switching to a while (extensionSize > 0) loop (pre-check) so 0 bytes means no iteration.

Copilot uses AI. Check for mistakes.
Comment on lines 79 to 86
get: (buf, off): IAsfTopLevelObjectHeader => {
const base = HeaderObjectToken.get(buf, off);

return {
objectId: AsfGuid.fromBin(buf, off),
objectSize: Number(Token.UINT64_LE.get(buf, off + 16)),
...base,
numberOfHeaderObjects: Token.UINT32_LE.get(buf, off + 24)
// Reserved: 2 bytes
// reserved: 2 bytes
};
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TopLevelHeaderObjectToken now reuses HeaderObjectToken.get, but it only enforces objectSize >= 24. The ASF top-level Header Object requires at least 30 bytes (GUID+size+numberOfHeaderObjects+reserved). Add a check that base.objectSize >= TopLevelHeaderObjectToken.len (30) to properly reject malformed headers in the 24..29 range.

Copilot uses AI. Check for mistakes.
Comment on lines +29 to 32
);
}
await this.parseObjectHeader(header.numberOfHeaderObjects);
}
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parseObjectHeader(header.numberOfHeaderObjects) call inherits a subtle edge case from parseObjectHeader’s do { ... } while (--numberOfObjectHeaders) loop: if numberOfHeaderObjects is 0, the loop decrements into negative values (truthy) and becomes unbounded until EOF/error. Consider validating numberOfHeaderObjects >= 1 here (or rewriting the loop) so a 0 value cannot trigger excessive parsing/DoS behavior.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

security Pull requests that address a security vulnerability

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants