From ba3f8e0befea74429430b1ae988ab4d573883ed8 Mon Sep 17 00:00:00 2001
From: Lucas Werkmeister
Date: Fri, 6 Jun 2025 15:15:12 +0200
Subject: [PATCH] Throw errors from HTML parsing
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Replace some TODOs with actual exception throwing. The exceptions aren’t
great, but at least they’re better than just ignoring the errors.

The first case doesn’t have a test because, as far as I can tell, it’s
basically impossible to trigger. There are apparently very few
situations where loadHTML() will return false [1] rather than trying to
fix up the markup somehow, and we’re already guaranteeing that the HTML
isn’t empty and that the options are valid.

The second case can raise a variety of errors; just test one of them.
Turning the array of LibXMLError instances (which aren’t Throwable) into
a “tree” / “stack” of Exception instances isn’t super nice, but better
than doing nothing or only throwing an Exception for the first error.

Note that we have to ignore / skip one error: libxml may complain about
“invalid”