fix(attributes): preserve non-standard boolean attribute values like hidden=until-found #5283
…hidden="until-found"
The HTML spec allows hidden="until-found" as a special value for the hidden attribute. Cheerio was normalizing this back to just "hidden" because the boolean-attribute getter always returned the attribute name. Now the getter checks the actual attribute value: if it has a non-standard value (neither empty nor matching the attribute name), the real value is returned instead of being normalized.
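The getter logic described in this commit message can be sketched roughly as follows. This is a simplified model, not the actual Cheerio source: the `ElementLike` type, the `getAttrFixed` name, and the abbreviated `rboolean` list are assumptions made for illustration.

```typescript
// Hypothetical, abbreviated stand-in for the boolean-attribute regex.
const rboolean = /^(?:checked|disabled|hidden|readonly|required|selected)$/i;

// Minimal element model: only the attribute map matters for this sketch.
interface ElementLike {
  attribs: Record<string, string>;
}

// Sketch of the getter after the fix (illustrative name, not Cheerio's API).
function getAttrFixed(
  elem: ElementLike,
  name: string,
  xmlMode = false
): string | undefined {
  if (!Object.prototype.hasOwnProperty.call(elem.attribs, name)) {
    return undefined;
  }
  const value = elem.attribs[name];
  if (!xmlMode && rboolean.test(name)) {
    // Standard boolean syntax (empty value, or value equal to the name):
    // keep the normalized result.
    if (value === '' || value === name) {
      return name;
    }
    // Non-standard value such as "until-found": return it unchanged.
    return value;
  }
  return value;
}

console.log(getAttrFixed({ attribs: { hidden: 'until-found' } }, 'hidden')); // "until-found"
console.log(getAttrFixed({ attribs: { hidden: '' } }, 'hidden')); // "hidden"
```

Non-boolean attributes and XML mode fall through to the raw value in this sketch, which matches the description above only as far as the commit message states it.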
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b482443ed8
    // return the actual value.
    if (!xmlMode && rboolean.test(name)) {
      const value = elem.attribs[name];
      if (value === '' || value === name) {
Normalize boolean values case-insensitively
For parsed HTML such as <input checked="CHECKED">, boolean attributes whose value is the attribute name in a different case are still standard boolean syntax, but this strict comparison now treats them as custom values and returns "CHECKED" instead of preserving Cheerio's normalized "checked" behavior. Since rboolean already matches names case-insensitively, this should compare the value to the attribute name case-insensitively before falling through to return the raw value.
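One way the case-insensitive comparison suggested in this review comment might look is sketched below. The function name `getAttrCaseInsensitive` and the abbreviated `rboolean` list are hypothetical, and the sketch leaves out XML mode; it only illustrates comparing the value to the attribute name without regard to case.

```typescript
// Hypothetical, abbreviated stand-in for the boolean-attribute regex.
const rboolean = /^(?:checked|disabled|hidden|readonly|required|selected)$/i;

// Sketch: treat value === name as standard boolean syntax regardless of case.
function getAttrCaseInsensitive(
  attribs: Record<string, string>,
  name: string
): string | undefined {
  if (!Object.prototype.hasOwnProperty.call(attribs, name)) {
    return undefined;
  }
  const value = attribs[name];
  if (rboolean.test(name)) {
    // checked="CHECKED" is still standard boolean syntax, so normalize it.
    if (value === '' || value.toLowerCase() === name.toLowerCase()) {
      return name;
    }
  }
  // Anything else, including hidden="until-found", is returned as written.
  return value;
}

console.log(getAttrCaseInsensitive({ checked: 'CHECKED' }, 'checked')); // "checked"
console.log(getAttrCaseInsensitive({ hidden: 'until-found' }, 'hidden')); // "until-found"
```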
Problem
When using hidden="until-found" (an HTML Living Standard feature), Cheerio normalizes the attribute value back to just "hidden". This affects any boolean HTML attribute that receives a non-standard value.
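The normalization described here can be reproduced with a small model of the pre-fix getter. This is a sketch under assumptions: `getAttrBeforeFix` and the abbreviated `rboolean` list are illustrative names, not the Cheerio source, and XML mode is omitted.

```typescript
// Hypothetical, abbreviated stand-in for the boolean-attribute regex.
const rboolean = /^(?:checked|disabled|hidden|readonly|required|selected)$/i;

// Model of the pre-fix behavior: a boolean attribute name always yields
// the name itself, whatever value was written in the markup.
function getAttrBeforeFix(
  attribs: Record<string, string>,
  name: string
): string | undefined {
  if (!Object.prototype.hasOwnProperty.call(attribs, name)) {
    return undefined;
  }
  if (rboolean.test(name)) {
    return name; // the actual value is never consulted
  }
  return attribs[name];
}

// The "until-found" value is lost.
console.log(getAttrBeforeFix({ hidden: 'until-found' }, 'hidden')); // "hidden"
```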
Root cause
In src/api/attributes.ts, the attribute getter checks whether the attribute name matches a boolean-attribute regex and, if so, always returns the attribute name instead of the actual value.
Changes
Modified the getter to check the actual attribute value. If the value is empty or matches the attribute name, the boolean normalization is preserved. If the attribute has a non-standard value (e.g., "until-found" for hidden), the actual value is returned.
Closes
Closes #5073