fix: refine paragraph boundaries for inline contents #36
Pull request overview
This PR refines how Typst inline/script content is paragraphized in the generated textlint AST, addressing the paragraph-boundary bugs surfaced by the integration tests added in #31 (comments, nested list items, term lists, and inline content blocks).
Changes:
- Reworked paragraphization to better split/merge inline vs non-inline nodes (including hash statements, term list bodies, and content blocks) and updated node conversions for labels/refs and linebreak escapes.
- Updated integration tests that were previously marked as expected failures, and added/updated fixture-based unit coverage for the new paragraph boundary behavior.
- Regenerated/adjusted expected AST fixture outputs to match the refined paragraphization.
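
As a rough illustration of the inline vs non-inline split described above, here is a minimal sketch. It is not the code in `src/typstToTextlintAst.ts`: the node shapes, the `INLINE_TYPES` set, and the `paragraphize` helper are all hypothetical simplifications, and the real logic also has to handle hash statements, term list bodies, and content blocks.

```typescript
// Hypothetical, simplified node shapes; the real textlint AST carries more fields.
type AstNode = { type: string; raw: string };
type ParagraphNode = { type: "Paragraph"; raw: string; children: AstNode[] };

// Assumption: this set of inline node types is illustrative only.
const INLINE_TYPES = new Set(["Str", "Emphasis", "Strong", "Code", "Link"]);

const isInline = (n: AstNode): boolean => INLINE_TYPES.has(n.type);

// Group each run of consecutive inline nodes into one Paragraph and
// pass non-inline nodes through unchanged, preserving order.
function paragraphize(nodes: AstNode[]): (AstNode | ParagraphNode)[] {
  const out: (AstNode | ParagraphNode)[] = [];
  let run: AstNode[] = [];

  // Close the current inline run, if any, as a single Paragraph.
  const flush = (): void => {
    if (run.length > 0) {
      out.push({
        type: "Paragraph",
        raw: run.map((n) => n.raw).join(""),
        children: run,
      });
      run = [];
    }
  };

  for (const n of nodes) {
    if (isInline(n)) {
      run.push(n);
    } else {
      flush();
      out.push(n);
    }
  }
  flush();
  return out;
}
```

In this sketch a non-inline node such as a table always ends the current paragraph, so inline text on either side of it ends up in two separate `Paragraph` nodes.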
Reviewed changes
Copilot reviewed 32 out of 32 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/typstToTextlintAst.ts | Core changes: refine paragraph boundary detection/splitting, content-block handling, list/term normalization, and inline label/ref conversions. |
| test/integration/linting.test.ts | Enables previously failing integration tests (comments immediately followed by elements, nested lists, term list violations). |
| test/unit/paragraphizedTextlintAstObject.json | Updates expected paragraphized AST snapshot to reflect trimming/refined paragraph boundaries. |
| test/unit/fixtures/unsupported-node-paragraph-split/input.typ | Adds fixture input for splitting paragraphs around unsupported nodes in content blocks. |
| test/unit/fixtures/unsupported-node-paragraph-split/output.json | Expected AST output for unsupported-node paragraph splitting behavior. |
| test/unit/fixtures/paragraph-between-term-list/output.json | Updates expected term list handling so term bodies are paragraphized consistently. |
| test/unit/fixtures/list-item-split-hash-statement/input.typ | Adds fixture input for list item splitting around hash statements. |
| test/unit/fixtures/list-item-split-hash-statement/output.json | Expected AST output for list-item hash-statement splitting. |
| test/unit/fixtures/inline-text-space-linebreak/input.typ | Adds fixture input for forced line break (\\ + newline) staying inline within a paragraph. |
| test/unit/fixtures/inline-text-space-linebreak/output.json | Expected AST output for inline forced line break handling. |
| test/unit/fixtures/inline-raw-and-equation/input.typ | Adds fixture input for inline raw/code and inline equations staying within one paragraph. |
| test/unit/fixtures/inline-raw-and-equation/output.json | Expected AST output for inline raw/equation paragraphization. |
| test/unit/fixtures/inline-label-ref/input.typ | Adds fixture input for inline label + reference use in same paragraph. |
| test/unit/fixtures/inline-label-ref/output.json | Expected AST output for label/ref nodes represented inline (as code nodes). |
| test/unit/fixtures/inline-escape-shorthand-smartquote/input.typ | Adds fixture input covering escapes, shorthand, and smart quotes in a single paragraph. |
| test/unit/fixtures/inline-escape-shorthand-smartquote/output.json | Expected AST output for escape/shorthand/smartquote inline behavior. |
| test/unit/fixtures/inline-empty-content-block-inline/input.typ | Adds fixture input for empty inline content block #[] behavior. |
| test/unit/fixtures/inline-empty-content-block-inline/output.json | Expected AST output for empty inline content block behavior. |
| test/unit/fixtures/inline-content-block-with-non-inline/input.typ | Adds fixture input for inline content block that contains non-inline nodes (e.g., table). |
| test/unit/fixtures/inline-content-block-with-non-inline/output.json | Expected AST output for inline content block splitting around non-inline content. |
| test/unit/fixtures/inline-content-block-inline/input.typ | Adds fixture input for a purely inline content block #[inline content]. |
| test/unit/fixtures/inline-content-block-inline/output.json | Expected AST output for purely inline content block paragraphization. |
| test/unit/fixtures/figure-table-block/input.typ | Adds fixture input for figure/table block behavior and boundaries. |
| test/unit/fixtures/figure-table-block/output.json | Expected AST output for figure/table block structure under new paragraph rules. |
| test/unit/fixtures/content-blocks/output.json | Updates expected AST for #{ ... } content blocks (avoids wrapping code blocks in Paragraph). |
| test/unit/fixtures/comment-followed-hash-statement/input.typ | Adds fixture input for comment immediately followed by a hash statement. |
| test/unit/fixtures/comment-followed-hash-statement/output.json | Expected AST output for comment + following hash-statement boundary handling. |
| test/unit/fixtures/Strong/output.json | Updates expected output for strong/emphasis + forced linebreak behavior within paragraphs. |
| test/unit/fixtures/Emphasis/output.json | Updates expected output for emphasis + forced linebreak behavior within paragraphs. |
| test/unit/fixtures/Break/output.json | Updates expected output to keep forced line breaks inline rather than separate Break nodes. |
| test/unit/fixtures/Equation/output.json | Updates expected output for equation-adjacent paragraph boundaries. |
| test/unit/fixtures/List/output.json | Updates expected list-related outputs impacted by paragraphization changes. |
Pull request overview
Copilot reviewed 33 out of 33 changed files in this pull request and generated 1 comment.
b76172b to 94b5b14
Pull request overview
Copilot reviewed 33 out of 33 changed files in this pull request and generated no new comments.
Comments suppressed due to low confidence (1)
src/typstToTextlintAst.ts:1078
`includesLineBreak` treats any `Str` containing `\n` as a statement boundary. After this PR, forced inline line breaks are represented as `Str` nodes (e.g. `"\\" + "\n"`), so a `#...` expression immediately after a forced line break may be mis-detected as a new hash statement and split out of the paragraph. Consider teaching `isStatementBoundaryBefore` to ignore the `"\\" + "\n"` inline-linebreak pattern (or use a dedicated node/type/flag for inline line breaks) so only real statement boundaries trigger hash-statement collection/splitting.
```ts
const includesLineBreak = (n: Content): boolean => {
  if (n.type === ASTNodeTypes.Str) {
    return n.raw.includes("\n");
  }
  if (n.type === ASTNodeTypes.Break) {
    return n.raw.includes("\n");
  }
  return false;
};
const isStatementBoundaryBefore = (
  arr: Content[],
  index: number,
): boolean => {
  if (index === 0) {
    return true;
  }
  return includesLineBreak(arr[index - 1]);
};
```
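
One way the first suggestion could look is sketched below. This is not the change applied in the PR: `isInlineLineBreak` is a hypothetical helper, `ASTNodeTypes` and `Content` are local stand-ins for the real types so the snippet is self-contained, and it assumes a forced inline line break is a `Str` node whose `raw` is exactly a backslash followed by a newline.

```typescript
// Hypothetical stand-ins for the real `ASTNodeTypes` and `Content` types.
const ASTNodeTypes = { Str: "Str", Break: "Break", Code: "Code" } as const;
type Content = { type: string; raw: string };

// Assumption: a forced inline line break is a Str whose raw text is exactly
// a backslash followed by a newline ("\\" + "\n").
const isInlineLineBreak = (n: Content): boolean =>
  n.type === ASTNodeTypes.Str && n.raw === "\\\n";

// Same behavior as the original helper: Str and Break nodes containing "\n".
const includesLineBreak = (n: Content): boolean =>
  (n.type === ASTNodeTypes.Str || n.type === ASTNodeTypes.Break) &&
  n.raw.includes("\n");

const isStatementBoundaryBefore = (
  arr: Content[],
  index: number,
): boolean => {
  // The start of the sequence always counts as a boundary.
  if (index === 0) {
    return true;
  }
  const prev = arr[index - 1];
  if (prev === undefined) {
    return false;
  }
  // A newline only counts when it is not part of a forced inline line break.
  return includesLineBreak(prev) && !isInlineLineBreak(prev);
};
```

The alternative mentioned in the comment, a dedicated node type or flag for inline line breaks, would avoid matching on `raw` at all, at the cost of touching more of the conversion code.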
Pull request overview
Copilot reviewed 35 out of 35 changed files in this pull request and generated 1 comment.
Could you please review this? @r4ai
LGTM!
cf. #31 (comment)