Summary
When SplitAfterLine (added in PR #253) is called on a paragraph with SetHyphens("auto"), and both parts of a hyphenated word land in the same half (head or tail), the reconstructed paragraph's run text contains a literal "-" artifact.
Example: original word "linguistic" hyphenates to part="linguis-" + rest="tic" between rendered lines 5 and 6. If the user splits at line 8, both halves of the pair go into the head. cloneWithWords joins them as "linguis-tic" (with literal hyphen) instead of recombining to "linguistic".
The user-visible artifact is a hyphen mid-word in the rendered output of the head, in a position that wasn't in the original source text.
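To make the failure mode concrete, here is a minimal Go sketch of how the pair ends up in one half. It is a simplified model, not the library's code: the Word struct, its Line field, and the splitAfterLine and joinRaw helpers are hypothetical stand-ins, and the rule "no space after a fragment ending in '-'" is only an assumption about how the join produces "linguis-tic".

```go
package main

import (
	"fmt"
	"strings"
)

// Word is a hypothetical stand-in for a laid-out word.
// Line is the 1-based rendered line the word landed on.
type Word struct {
	Text string
	Line int
}

// splitAfterLine partitions words into head (lines 1..n) and tail (the rest).
func splitAfterLine(words []Word, n int) (head, tail []Word) {
	for _, w := range words {
		if w.Line <= n {
			head = append(head, w)
		} else {
			tail = append(tail, w)
		}
	}
	return head, tail
}

// joinRaw models the assumed current join: fragments are concatenated as-is,
// so the hyphen inserted at layout time leaks into the run text.
func joinRaw(words []Word) string {
	var b strings.Builder
	for i, w := range words {
		// Assumption: no space is written after a fragment ending in "-".
		if i > 0 && !strings.HasSuffix(words[i-1].Text, "-") {
			b.WriteByte(' ')
		}
		b.WriteString(w.Text)
	}
	return b.String()
}

func main() {
	words := []Word{
		{"applied", 5}, {"linguis-", 5}, // part ends line 5
		{"tic", 6}, {"theory", 6}, // rest starts line 6
		{"today", 9},
	}

	head, tail := splitAfterLine(words, 8)
	fmt.Printf("%q | %q\n", joinRaw(head), joinRaw(tail))
	// "applied linguis-tic theory" | "today"

	head, tail = splitAfterLine(words, 5)
	fmt.Printf("%q | %q\n", joinRaw(head), joinRaw(tail))
	// "applied linguis-" | "tic theory today"
}
```

Splitting after line 8 puts both fragments in the head and exposes the artifact. Splitting after line 5 separates them, which is the case that already renders correctly.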
Why it doesn't bite the page-split path
Hyphenation runs only inside Paragraph.Layout() (paragraph.go ~line 390), not in wrapWords, which is what PlanLayout (the page-split path) uses. PR #250 added a guard test (TestNoHyphenInPageSplitOverflow) that locks this in. So today only SplitAfterLine is affected, because it goes through Layout.
Documented behavior
TestSplitAfterLineHyphenationInternalToHead in layout/split_test.go documents the current state: the head re-lays at the same width with a stable line count, but the hyphen artifact persists in the run text.
Fix sketch
- Add a Word.HyphenatedBoundary bool field.
- hyphenateWord (paragraph.go ~line 868) sets it on both part and rest.
- cloneWithWords (paragraph.go ~line 1646), when joining two consecutive words that BOTH have HyphenatedBoundary == true, strips the trailing "-" from the prev text and joins without space, recovering the original word.
- wordToRun does NOT propagate the flag (it's a measurement-time signal, not user data).
When the pair is split across halves (part in head, rest in tail), each half's words slice contains only one of them, so neither half's join logic triggers. Each renders correctly: head ends with "linguis-", tail starts with "tic".
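The steps above can be sketched in Go as follows. This is an illustration under stated assumptions, not the actual paragraph.go code: the Word struct here is a reduced stand-in, and hyphenate and joinWords are hypothetical helpers playing the roles of hyphenateWord and the join inside cloneWithWords. The extra check that the previous fragment ends in "-" is my own addition, so that the rest of one hyphenated word is not merged with the part of the next one.

```go
package main

import (
	"fmt"
	"strings"
)

// Word is a reduced, hypothetical stand-in for the layout word type.
type Word struct {
	Text               string
	HyphenatedBoundary bool // set on both fragments of a hyphenation break
}

// hyphenate splits word at byte offset at and flags both fragments.
func hyphenate(word string, at int) (part, rest Word) {
	return Word{Text: word[:at] + "-", HyphenatedBoundary: true},
		Word{Text: word[at:], HyphenatedBoundary: true}
}

// isBreak reports whether a and b are the two sides of one hyphenation break.
// The HasSuffix guard is an assumption added on top of the flag check.
func isBreak(a, b Word) bool {
	return a.HyphenatedBoundary && b.HyphenatedBoundary &&
		strings.HasSuffix(a.Text, "-")
}

// joinWords rebuilds run text, recombining fragments that sit side by side.
func joinWords(words []Word) string {
	var b strings.Builder
	for i, w := range words {
		// Ordinary words get a separating space; rejoined fragments do not.
		if i > 0 && !isBreak(words[i-1], w) {
			b.WriteByte(' ')
		}
		text := w.Text
		// Drop the layout-time hyphen only when the matching rest follows.
		if i+1 < len(words) && isBreak(w, words[i+1]) {
			text = strings.TrimSuffix(text, "-")
		}
		b.WriteString(text)
	}
	return b.String()
}

func main() {
	part, rest := hyphenate("linguistic", 7) // "linguis-" + "tic"
	plain := func(s string) Word { return Word{Text: s} }

	// Both fragments in the head: the original word is recovered.
	fmt.Println(joinWords([]Word{plain("applied"), part, rest, plain("theory")}))
	// applied linguistic theory

	// Pair split across halves: each half keeps its fragment untouched.
	fmt.Println(joinWords([]Word{plain("applied"), part})) // applied linguis-
	fmt.Println(joinWords([]Word{rest, plain("theory")}))  // tic theory
}
```

Keeping the flag on Word rather than on the run matches the note above that it is a measurement-time signal: it only needs to survive long enough for the join to consult it.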
Scope
Narrow: only affects callers that use both SetHyphens("auto") and SplitAfterLine. Not user-visible for the HubSpot-PDF clamp/appendix flow unless that flow opts into hyphenation.
Related
The PR that added SplitAfterLine documented this limitation in its description and test.