Skip to content

Commit 77be1c2

Browse files
Throw errors from HTML parsing
Replace some TODOs with actual exception throwing. The exceptions aren’t great, but at least they’re better than just ignoring the errors. The first case doesn’t have a test because, as far as I can tell, it’s basically impossible to trigger. There are apparently very few situations where loadHTML() will return false [1] rather than trying to fix up the markup somehow, and we’re already guaranteeing that the HTML isn’t empty and that the options are valid. The second case can raise a variety of errors; just test one of them. Turning the array of LibXMLError instances (which aren’t Throwable) into a “tree” / “stack” of Exception instances isn’t super nice, but better than doing nothing or only throwing an Exception for the first error. [1]: https://stackoverflow.com/a/79055330/1420237
1 parent 158a0ff commit 77be1c2

File tree

2 files changed

+13
-3
lines changed

2 files changed

+13
-3
lines changed

src/HtmlParser.php

Lines changed: 6 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -28,7 +28,7 @@ public function parseHtml( string $html ): DOMDocument {
2828
// LIBXML_NOBLANKS Constant excludes "ghost nodes" to avoid violating
2929
// vue's single root node constraint
3030
if ( !$document->loadHTML( '<?xml encoding="utf-8" ?>' . $html, LIBXML_NOBLANKS ) ) {
31-
//TODO Test failure
31+
throw new Exception( 'Failed to parse HTML' );
3232
}
3333

3434
/** @var LibXMLError[] $errors */
@@ -41,9 +41,12 @@ public function parseHtml( string $html ): DOMDocument {
4141
libxml_disable_entity_loader( $entityLoaderDisabled );
4242
}
4343

44+
$exception = null;
4445
foreach ( $errors as $error ) {
45-
//TODO html5 tags can fail parsing
46-
//TODO Throw an exception
46+
$exception = new Exception( $error->message, $error->code, $exception );
47+
}
48+
if ( $exception !== null ) {
49+
throw $exception;
4750
}
4851

4952
return $document;

tests/php/HtmlParserTest.php

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -63,4 +63,11 @@ public function testTwoRootNodes() {
6363
$this->parseAndGetRootNode( '<p></p><p></p>' );
6464
}
6565

66+
public function testMalformedHtml(): void {
67+
$htmlParser = new HtmlParser();
68+
$this->expectException( Exception::class );
69+
$this->expectExceptionMessage( 'Unexpected end tag' );
70+
$htmlParser->parseHtml( '</p>' );
71+
}
72+
6673
}

0 commit comments

Comments
 (0)