HTML parser collapses consecutive spaces during setHtml()/toHtml() round-trip
When using setHtml() with content containing consecutive spaces, the editor/parser collapses all repeated spaces into a single space during parsing/serialization.
Example:
Input HTML (multiple spaces between "hello" and "world"):
<p><span>hello   world</span></p>
After setHtml() + toHtml():
<p><span>hello world</span></p>
Expected behavior:
Consecutive spaces should be preserved during HTML round-trip, especially when using:
- &nbsp; entities
- white-space: pre-wrap
- Unicode non-breaking spaces
Actual behavior:
The parser normalizes whitespace and removes repeated spaces from text nodes.
This causes formatting/layout issues for:
- code-like text
- aligned text
- educational content
- whitespace-sensitive rendering
Environment:
- Android
- Jetpack Compose
- RichTextEditor
Reproduction:
1. Call setHtml() with multiple spaces between words
2. Serialize back using toHtml()
3. Observe that spaces are collapsed into a single space
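The steps above can be sketched as a small self-contained check. HtmlState is a hypothetical stand-in for the editor state (only setHtml() and toHtml() come from this report), and CollapsingState merely simulates the reported symptom so the snippet runs on its own. It is not the library's implementation; swap in the real state object to reproduce against the editor.

```kotlin
// Hypothetical stand-in for the editor state; only setHtml()/toHtml() are taken from this report.
interface HtmlState {
    fun setHtml(html: String)
    fun toHtml(): String
}

// Simulates the reported behavior: runs of spaces collapse to one space.
// This is a model of the symptom, NOT the library's actual parser.
class CollapsingState : HtmlState {
    private var html: String = ""
    override fun setHtml(html: String) {
        this.html = html.replace(Regex(" {2,}"), " ")
    }
    override fun toHtml(): String = html
}

// Length of the longest run of consecutive spaces in a string (0 if none).
fun longestSpaceRun(s: String): Int =
    Regex(" +").findAll(s).maxOfOrNull { it.value.length } ?: 0

// True if the setHtml() + toHtml() round-trip shortened the longest run of spaces.
fun losesSpaces(state: HtmlState, input: String): Boolean {
    state.setHtml(input)
    return longestSpaceRun(state.toHtml()) < longestSpaceRun(input)
}

fun main() {
    val input = "<p><span>hello   world</span></p>"
    println(losesSpaces(CollapsingState(), input)) // true for the simulated state
}
```

A check like this could go into a unit test, so that a fix (or a new opt-out flag) is verified by the result flipping to false.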
It appears the internal HTML parser normalizes whitespace during DOM reconstruction.
Would it be possible to:
- preserve consecutive spaces
- support white-space: pre-wrap
- preserve &nbsp; entities
- or provide an option to disable whitespace normalization?
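To make the last request concrete, here is a minimal sketch of what an opt-out could look like at the text-node level. WhitespaceMode and normalizeTextNode are hypothetical names invented for illustration. Nothing here reflects the library's actual parser or API.

```kotlin
// Hypothetical option; not an existing API of the library.
enum class WhitespaceMode { COLLAPSE, PRESERVE }

// Sketch of text-node handling with the requested switch.
// COLLAPSE models the behavior reported in this issue: runs of ASCII whitespace
// become a single space (U+00A0 is not matched, so it is left alone).
// PRESERVE returns the text untouched.
fun normalizeTextNode(text: String, mode: WhitespaceMode): String =
    when (mode) {
        WhitespaceMode.COLLAPSE -> text.replace(Regex("[ \\t\\n\\r]+"), " ")
        WhitespaceMode.PRESERVE -> text
    }

fun main() {
    val text = "hello   world"
    println(normalizeTextNode(text, WhitespaceMode.COLLAPSE)) // hello world
    println(normalizeTextNode(text, WhitespaceMode.PRESERVE)) // hello   world
}
```

Keeping collapsing as the default would leave existing behavior unchanged, and only callers with whitespace-sensitive content would need to opt in to PRESERVE.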