Fix partial paragraph highlighting disappearing in {{content}} (Fixes #446) by namcusamlc · Pull Request #854 · obsidianmd/obsidian-clipper

namcusamlc · 2026-05-20T03:44:15Z

This PR fixes partially selected highlights failing to render, or partially disappearing, when the template's {{content}} variable is evaluated on pages with dense <p> tags (such as the Gemini app and Investopedia). It addresses GitHub Issue #446 ("BUG: {{content}} no longer adding highlights properly").

The Problem

Saved highlights are stored with precise XPaths, text offsets, and character lengths relative to the original page's DOM (e.g. fullHtml). However, the template extraction pipeline previously:

Stripped Highlight Metadata: The getPageContent response payload dropped all rich highlight metadata (including xpath, startOffset, endOffset, and id), leaving only a flat array of plain text strings.
Discarded DOM Context: Because XPaths and offsets were stripped, the template content extractor had to rely purely on complex and fragile regex-like fallback text searches matching against the defuddled HTML.
Dense <p> Mismatches: On pages with dense <p> elements, partial selections (selecting only a few words in a paragraph rather than the full block) could not be matched accurately by plain text searches, so highlights silently failed or disappeared during markdown generation.
Fragile Range Wrapping: The fallback text search used range.surroundContents to inject <mark> tags. That method throws an exception when a selection only partially covers a non-text node, that is, when it crosses an element boundary (e.g., an inline tag such as <a>).
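
The difference between text-only lookup and offset-based lookup can be illustrated with a small, DOM-free sketch. It is illustrative only: HighlightRecord, locateByText, and sliceByOffsets are hypothetical names, the XPath value and sample paragraph are invented, and only the field names xpath, startOffset, and endOffset come from the description above.

```typescript
// Hypothetical record shape. The field names follow the PR text;
// the type name is illustrative and is not the project's AnyHighlightData.
interface HighlightRecord {
  xpath: string;
  startOffset: number; // character offset into the element's text content
  endOffset: number;   // exclusive end offset
}

// Text-only lookup: returns the index of the first match in the source,
// or -1 when inline markup splits the phrase.
function locateByText(source: string, phrase: string): number {
  return source.indexOf(phrase);
}

// Offset-based lookup: slices the element's text using the stored offsets.
function sliceByOffsets(elementText: string, h: HighlightRecord): string {
  return elementText.slice(h.startOffset, h.endOffset);
}

// Invented sample paragraph: the same content as HTML and as plain text.
const paragraphHtml = 'Rates <em>rose</em> in May. Rates rose again in June.';
const paragraphText = 'Rates rose in May. Rates rose again in June.';

// Failure mode 1: the phrase spans an inline tag, so the search misses it.
const missed = locateByText(paragraphHtml, 'rose in May'); // -1

// Failure mode 2: the user selected the first "Rates rose" (text offset 0),
// but the search lands on the second occurrence in the HTML.
const wrongSpot = locateByText(paragraphHtml, 'Rates rose'); // 28

// With stored offsets the selection is recovered regardless of markup.
const stored: HighlightRecord = {
  xpath: '/html/body/p[1]', // made-up value for illustration
  startOffset: 6,
  endOffset: 17,
};
const recovered = sliceByOffsets(paragraphText, stored); // 'rose in May'
```

In this sample the text-only search either misses the phrase or lands on the wrong occurrence, while the stored offsets recover the selection exactly.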

The Solution

Preserved Highlight Payload: Updated the page extraction message signatures (in src/content.ts and src/utils/content-extractor.ts) to return the full AnyHighlightData objects containing XPaths and offsets rather than simple text strings.
XPath-Preserving Pipeline: Passed fullHtml directly to processHighlights. We now parse the original document DOM first, evaluate the exact stored xpath coordinates to find the target element, and apply the highlight there.
Robust DOM-Range Extraction: Replaced range.surroundContents with a range.extractContents() pattern. The content within the highlighted range is extracted, appended to a new <mark> node, and inserted back at the range position, which avoids the exception when selections cross inline HTML boundaries.
Offset-Based Text Wrapping: Introduced a findTextNodeAtOffset helper that traverses text nodes using a TreeWalker to pinpoint the exact starting and ending offsets for partial selections.
Seamless Defuddle Extraction: After applying highlights to the raw page DOM, we pass the highlighted document directly through DefuddleClass.parse(). This lets Defuddle extract the article structure with the <mark> tags intact, so the highlights carry through to the generated markdown.
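
The offset lookup and the extract-and-wrap steps above can be sketched roughly as follows. This is not the PR's code: the signatures, the bias parameter, the wrapOffsets helper, and the *Like interfaces are assumptions made so the sketch runs outside a browser, and the 'mark' tag name is assumed. In a page, the text-node array would come from a TreeWalker restricted to text nodes, and real Range and Document objects provide the members used here.

```typescript
// Minimal structural stand-ins (assumptions) for DOM Text, Range,
// Element, and Document, so the sketch can run without a browser.
interface TextLike { data: string }
interface ElementLike { appendChild(child: unknown): unknown }
interface RangeLike {
  setStart(node: TextLike, offset: number): void;
  setEnd(node: TextLike, offset: number): void;
  extractContents(): unknown;
  insertNode(node: ElementLike): void;
}
interface DocumentLike {
  createRange(): RangeLike;
  createElement(tag: string): ElementLike;
}
interface TextPosition<T extends TextLike> { node: T; offset: number }

// Convert an element-relative character offset into a (text node, local
// offset) pair. textNodes must be in document order. With bias 'start' an
// offset on a node boundary resolves to the next node; with 'end' it
// resolves to the previous one, so the range hugs the selected text.
function findTextNodeAtOffset<T extends TextLike>(
  textNodes: ReadonlyArray<T>,
  targetOffset: number,
  bias: 'start' | 'end',
): TextPosition<T> | null {
  if (targetOffset < 0) return null;
  let consumed = 0;
  for (let i = 0; i < textNodes.length; i++) {
    const node = textNodes[i];
    const end = consumed + node.data.length;
    const inside = bias === 'start' ? targetOffset < end : targetOffset <= end;
    if (inside) return { node, offset: targetOffset - consumed };
    consumed = end;
  }
  return null; // offset lies beyond the element's text
}

// Wrap the text between two offsets in a new element without calling
// surroundContents: extract the range, append it to the wrapper, then
// insert the wrapper at the (now collapsed) range position.
function wrapOffsets(
  doc: DocumentLike,
  textNodes: ReadonlyArray<TextLike>,
  startOffset: number,
  endOffset: number,
): ElementLike | null {
  const start = findTextNodeAtOffset(textNodes, startOffset, 'start');
  const end = findTextNodeAtOffset(textNodes, endOffset, 'end');
  if (!start || !end) return null;
  const range = doc.createRange();
  range.setStart(start.node, start.offset);
  range.setEnd(end.node, end.offset);
  const mark = doc.createElement('mark'); // assumed wrapper tag
  mark.appendChild(range.extractContents());
  range.insertNode(mark);
  return mark;
}

// Text nodes of an invented paragraph: Rates <em>rose</em> in May.
const demoNodes: TextLike[] = [{ data: 'Rates ' }, { data: 'rose' }, { data: ' in May.' }];
const demoStart = findTextNodeAtOffset(demoNodes, 6, 'start'); // second node, offset 0
const demoEnd = findTextNodeAtOffset(demoNodes, 17, 'end');    // third node, offset 7
```

The boundary bias is a design choice of this sketch: resolving a start offset to the following node keeps the range from partially selecting a preceding inline element.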

Verification Plan

Manual Verification

Verified highlight embedding on pages with dense <p> tags (Investopedia, Gemini Web App).
Verified partial paragraph selections render consistently in the final template {{content}} and "Clip to Obsidian" output.

Fix highlighting on tag partially disappears in content variable

a347df4

namcusamlc wants to merge 1 commit into obsidianmd:main from namcusamlc:fix/partial-p-highlights
