fix: section after table incorrectly added as child of table header #3142

Open
qianchongyang wants to merge 2 commits into docling-project:main from qianchongyang:fix/issue-2668-table-section-child

@qianchongyang

Summary

Fix for Issue #2668: Section after table being incorrectly added as child of table header

Root Cause

When processing rich table cells (cells with multiple elements), the _walk_linear function was called to process cell content. This modified the self.parents dictionary, and these changes persisted after the table was processed. Subsequent elements (like section headers after the table) would then incorrectly become children of elements inside the table cell.

Fix

Save the parent state before processing rich table cell content and restore it afterward, similar to how textbox content is handled in the same file (lines 829-832 and 878-879).
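
To make the root cause and the fix concrete, here is a minimal, self-contained sketch of the save/restore pattern. It is a toy model, not the docling code: the class `WalkerSketch`, its `parents` layout, and the tuple-based element format are all hypothetical and only mimic the idea of a level-to-parent mapping that a nested walk can overwrite.

```python
class WalkerSketch:
    """Toy stand-in for a document walker (hypothetical, not docling's class)."""

    def __init__(self):
        # Maps heading level -> name of the node acting as parent at that level.
        self.parents = {0: "body"}
        # (parent, child) pairs recorded in document order.
        self.edges = []

    def _current_parent(self):
        # The deepest registered level is the parent for new content.
        return self.parents[max(self.parents)]

    def walk(self, elements, restore_after_cells):
        for elem in elements:
            kind, name = elem[0], elem[1]
            if kind == "heading":
                level = elem[2]
                # A heading closes every level at or below its own.
                self.parents = {k: v for k, v in self.parents.items() if k < level}
                self.edges.append((self._current_parent(), name))
                self.parents[level] = name
            elif kind == "table":
                self.edges.append((self._current_parent(), name))
                # Save the parent state before walking rich cell content.
                saved = dict(self.parents)
                self.walk(elem[2], restore_after_cells)
                if restore_after_cells:
                    # Restore it so cell content cannot leak into later elements.
                    self.parents = saved
            else:
                self.edges.append((self._current_parent(), name))


def run(restore_after_cells):
    """Walk a small sample document and return a child -> parent mapping."""
    doc = [
        ("heading", "Intro", 1),
        ("table", "Table", [("heading", "Cell header", 1), ("text", "Cell text")]),
        ("heading", "Section", 2),
    ]
    walker = WalkerSketch()
    walker.walk(doc, restore_after_cells)
    return {child: parent for parent, child in walker.edges}


print(run(restore_after_cells=False)["Section"])  # Cell header (the bug)
print(run(restore_after_cells=True)["Section"])   # Intro (expected parent)
```

Copying the mapping with `dict(...)` is enough here because the values are plain strings. Whether a shallow copy suffices in the real backend depends on how `self.parents` is mutated there.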

Changes

  • docling/backend/msword_backend.py: Added parent state save/restore around the _walk_linear call for rich table cells

Testing

The fix has only been checked for valid syntax. The issue can be reproduced using the sample file from the issue.


Fixes #2668

qianchongyang added 2 commits March 17, 2026 23:46
When a paragraph contains multiple oMath elements, previously they were
concatenated into a single display block. Now each equation is processed
separately and creates its own FORMULA item.

Fixes docling-project#3121
… as child of table header

When processing rich table cells (cells with multiple elements), the
_walk_linear function was called to process cell content. This modified
the self.parents dictionary, and these changes persisted after the table
was processed. Subsequent elements (like section headers after the table)
would then incorrectly become children of elements inside the table cell.

This fix saves the parent state before processing rich table cell content
and restores it afterward, similar to how textbox content is handled.

Fixes: docling-project#2668
@github-actions
Contributor

DCO Check Failed

Hi @qianchongyang, your pull request has failed the Developer Certificate of Origin (DCO) check.

This repository supports remediation commits, so you can fix this without rewriting history — but you must follow the required message format.


🛠 Quick Fix: Add a remediation commit

Run this command:

git commit --allow-empty -s -m "DCO Remediation Commit for qianchongyang <qianchongyang>

I, qianchongyang <qianchongyang>, hereby add my Signed-off-by to this commit: 371cf4d6190d7b0f42128efea6a614cc13a7fb3e
I, qianchongyang <qianchongyang>, hereby add my Signed-off-by to this commit: 23472f15327d68207cd40403730968424ea74d2b"
git push

🔧 Advanced: Sign off each commit directly

For the latest commit:

git commit --amend --signoff
git push --force-with-lease

For multiple commits:

git rebase --signoff origin/main
git push --force-with-lease

More info: DCO check report

@mergify

mergify bot commented Mar 17, 2026

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:
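
As a quick local check, the title pattern from the rule above can be tried against the old and new PR titles recorded in this thread. This sketch assumes the `~=` operator behaves like an unanchored, case-sensitive regular-expression search (as Python's `re.search` does); the helper name `is_conventional` is made up for illustration.

```python
import re

# Pattern copied verbatim from the merge-protection rule.
TITLE_RE = re.compile(
    r"^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:"
)


def is_conventional(title: str) -> bool:
    """Return True if the title starts with a conventional-commit prefix."""
    return TITLE_RE.search(title) is not None


old_title = "Fix issue #2668: Section after table incorrectly added as child of table header"
new_title = "fix: section after table incorrectly added as child of table header"

print(is_conventional(old_title))  # False: capital "Fix" and no colon after the type
print(is_conventional(new_title))  # True
```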

@ceberam changed the title from "Fix issue #2668: Section after table incorrectly added as child of table header" to "fix: section after table incorrectly added as child of table header" on Mar 18, 2026
@codecov

codecov bot commented Mar 18, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

Member

@ceberam left a comment

Thanks @qianchongyang for your contribution!
Note that issue #2668 is already addressed by PR #3047.

However, I see that you are trying to solve another issue in the docx backend parser. As mentioned in my other comment, could you please create a separate issue for that? If you like, you could reuse this PR by changing its name and linking it to the new issue.
Don't forget to add a new test or edit an existing one to cover your code changes.

Comment on lines +1035 to +1046
+ # Standalone equation(s) - create separate formula items for each equation
  level = self._get_level()
- t1 = doc.add_text(
-     label=DocItemLabel.FORMULA,
-     parent=self.parents[level - 1],
-     text=text.replace("<eq>", "").replace("</eq>", ""),
-     content_layer=self.content_layer,
- )
- elem_ref.append(t1.get_ref())
+ for eq in equations:
+     eq_text = eq.replace("<eq>", "").replace("</eq>", "").strip()
+     if eq_text:
+         t1 = doc.add_text(
+             label=DocItemLabel.FORMULA,
+             parent=self.parents[level - 1],
+             text=eq_text,
+             content_layer=self.content_layer,
+         )
+         elem_ref.append(t1.get_ref())
Member

It looks like you are trying to solve another issue, different from the one you linked this PR to.
Could you please create a separate issue describing it?
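
For context, the behavior change in the commented lines can be sketched in isolation. The PR does not show how `equations` is built, so the regular expression below is an assumption. The function names are hypothetical, and only the `<eq>`/`</eq>` markers come from the snippet itself.

```python
import re

# Assumption: each equation is wrapped in <eq>...</eq>, as in the reviewed code.
EQ_RE = re.compile(r"<eq>(.*?)</eq>", re.DOTALL)


def combined_formula(text: str) -> str:
    """Old behavior: strip the markers and keep one concatenated string."""
    return text.replace("<eq>", "").replace("</eq>", "")


def separate_formulas(text: str) -> list[str]:
    """New behavior: one entry per non-empty equation."""
    return [eq.strip() for eq in EQ_RE.findall(text) if eq.strip()]


paragraph = "<eq>a^2+b^2=c^2</eq><eq> E=mc^2 </eq><eq></eq>"

print(combined_formula(paragraph))   # a^2+b^2=c^2 E=mc^2 
print(separate_formulas(paragraph))  # ['a^2+b^2=c^2', 'E=mc^2']
```

In the separate form, each list entry would map to its own formula item, and empty equations are skipped rather than emitted as blank items.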

Development

Successfully merging this pull request may close these issues.

Section after table being incorrectly added as child of table header

2 participants