
[HTML] "Error while making rich table" error #2408

@tysonite

Bug

Following up on #2360 (comment), this report covers the "Error while making rich table" error. The HTML page and the Docling command that reproduce the error are shown below.

Steps to reproduce

Given the HTML:

<html>

<head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <title>Page Title</title>
</head>

<body class="mceContentBody aui-theme-default wiki-content fullsize">


    <h3>Header 3</h3>
    <table class="wrapped confluenceTable">
        <colgroup>
            <col>
            <col>
            <col>
            <col>
        </colgroup>
        <tbody>
            <tr>
                <th scope="col" class="confluenceTh"><br></th>
                <th scope="col" class="confluenceTh">
                    <h3>Column 2</h3>
                </th>
                <th scope="col" class="confluenceTh">Column 3</th>
                <th scope="col" class="confluenceTh"><br></th>
            </tr>
            <tr>
                <td class="confluenceTd">...</td>
                <td class="confluenceTd">
                    <ol>
                        <li>...&nbsp;<span style="color: rgb(0,0,255);"><br><span
                                    style="color: rgb(0,0,0);">..&nbsp;</span><br></span></li>
                        <li>...<ol>
                                <li>...</li>
                                <li>...<span style="color: rgb(36,36,36);">...<span>&nbsp;</span></span>
                                    &nbsp;<span style="color: rgb(0,0,255);"><strong>&nbsp;</strong></span></li>
                                <li>...</li>
                            </ol>
                        </li>
                    </ol>
                </td>
                <td class="confluenceTd">
                    <div class="content-wrapper">
                        <p><img class="editor-inline-macro" src="./bad_2_files/macro" data-macro-name="jira"
                                data-macro-id="..." role="button" tabindex="0" aria-haspopup="true"
                                aria-label="jira macro" data-macro-parameters="..." data-macro-schema-version="1"></p>
                    </div>
                </td>
                <td class="confluenceTd"><br></td>
            </tr>
            <tr>
                <td class="confluenceTd">...</td>
                <td class="confluenceTd">
                    <ol>
                        <li>...&nbsp;<br>...<span style="color: rgb(0,0,255);"><br><span
                                    style="color: rgb(0,0,0);">...&nbsp;</span></span></li>
                        <li>...<ol>
                                <li>...</li>
                                <li>...</li>
                                <li>...</li>
                                <li>...&nbsp;</li>
                            </ol>
                        </li>
                    </ol>
                </td>
                <td class="confluenceTd">
                    <div class="content-wrapper">
                        <p><img class="editor-inline-macro" src="./bad_2_files/macro(1)" data-macro-name="jira"
                                data-macro-id="..." role="button" tabindex="0" aria-haspopup="true"
                                aria-label="jira macro" data-macro-parameters="..." data-macro-schema-version="1"></p>
                    </div>
                </td>
                <td class="confluenceTd"><br></td>
            </tr>
            <tr>
                <td class="confluenceTd"><br></td>
                <td class="confluenceTd">...</td>
                <td class="confluenceTd">
                    <div class="content-wrapper">
                        <p><img class="editor-inline-macro" src="./bad_2_files/macro(2)" data-macro-name="jira"
                                data-macro-id="..." role="button" tabindex="0" aria-haspopup="true"
                                aria-label="jira macro" data-macro-parameters="..." data-macro-schema-version="1"></p>
                    </div>
                </td>
                <td class="confluenceTd"><br></td>
            </tr>
            <tr>
                <td class="confluenceTd"><br></td>
                <td class="confluenceTd">...&nbsp;</td>
                <td class="confluenceTd"><br></td>
                <td class="confluenceTd"><br></td>
            </tr>
            <tr>
                <td class="confluenceTd"><br></td>
                <td class="confluenceTd">...</td>
                <td class="confluenceTd"><br></td>
                <td class="confluenceTd"><br></td>
            </tr>
            <tr>
                <td class="confluenceTd"><br></td>
                <td class="confluenceTd">...</td>
                <td class="confluenceTd"><br></td>
                <td class="confluenceTd"><br></td>
            </tr>
        </tbody>
    </table>

</body>

</html>
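
To narrow the page down further, it may help to list which table cells hold nested content (headings, lists, images) rather than plain text. The report does not say which cells Docling treats as "rich", so the tag set below is an assumption, and `RichCellFinder` and `find_rich_cells` are hypothetical helper names. This is a minimal standard-library sketch that does not handle nested tables:

```python
from html.parser import HTMLParser

# Tags counted as "rich" cell content. This set is an assumption for
# illustration, not Docling's own definition.
RICH_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6", "ol", "ul", "p", "img", "div"}


class RichCellFinder(HTMLParser):
    """Collect (row, column, tags) for cells that contain any RICH_TAGS element."""

    def __init__(self):
        super().__init__()
        self.row = -1
        self.col = -1
        self.in_cell = False
        self.current = []
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            # New row: advance the row index and reset the column index.
            self.row += 1
            self.col = -1
        elif tag in ("td", "th"):
            self.col += 1
            self.in_cell = True
            self.current = []
        elif self.in_cell and tag in RICH_TAGS and tag not in self.current:
            self.current.append(tag)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            if self.current:
                self.found.append((self.row, self.col, self.current))
            self.in_cell = False


def find_rich_cells(html_text):
    """Return a list of (row, column, tags) tuples, zero-indexed."""
    finder = RichCellFinder()
    finder.feed(html_text)
    return finder.found


sample = (
    "<table><tr><th><br></th><th><h3>Column 2</h3></th></tr>"
    "<tr><td>...</td><td><ol><li>...</li></ol></td></tr></table>"
)
print(find_rich_cells(sample))  # [(0, 1, ['h3']), (1, 1, ['ol'])]
```

Running `find_rich_cells` on the full page and removing the listed cells one at a time is one way to look for the smallest table that still triggers the error.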

Docling reports the errors below. The page is still converted, but the resulting Markdown differs from the HTML content:

$ docling --from html --to md ./bad_2.html 
2025-10-08 05:07:10,335 - INFO - Loading plugin 'docling_defaults'
2025-10-08 05:07:10,336 - INFO - Registered ocr engines: ['easyocr', 'ocrmac', 'rapidocr', 'tesserocr', 'tesseract']
2025-10-08 05:07:10,341 - INFO - paths: [PosixPath('/tmp/tmpo7j6jom2/bad_2.html')]
2025-10-08 05:07:10,341 - INFO - detected formats: [<InputFormat.HTML: 'html'>]
2025-10-08 05:07:10,343 - INFO - Going to convert document batch...
2025-10-08 05:07:10,343 - INFO - Initializing pipeline for SimplePipeline with options hash 995a146ad601044538e6a923bea22f4e
2025-10-08 05:07:10,346 - INFO - Loading plugin 'docling_defaults'
2025-10-08 05:07:10,346 - INFO - Registered picture descriptions: ['vlm', 'api']
2025-10-08 05:07:10,346 - INFO - Processing document bad_2.html
2025-10-08 05:07:10,348 - INFO - deleted item in tree at stack: (1, 0, 1) => #/texts/2
2025-10-08 05:07:10,349 - ERROR - Error while making rich table: Cannot find all provided RefItems in doc: ['#/texts/2'].
2025-10-08 05:07:10,349 - ERROR - Error while making rich table: Cannot find all provided RefItems in doc: ['#/texts/3'].
2025-10-08 05:07:10,350 - ERROR - Error while making rich table: Cannot find all provided RefItems in doc: ['#/texts/9'].
2025-10-08 05:07:10,351 - ERROR - Error while making rich table: Cannot find all provided RefItems in doc: ['#/texts/16'].
2025-10-08 05:07:10,351 - ERROR - Error while making rich table: Cannot find all provided RefItems in doc: ['#/texts/17'].
2025-10-08 05:07:10,352 - ERROR - Error while making rich table: Cannot find all provided RefItems in doc: ['#/texts/18'].
2025-10-08 05:07:10,352 - ERROR - Error while making rich table: Cannot find all provided RefItems in doc: ['#/texts/19'].
2025-10-08 05:07:10,352 - INFO - Finished converting document bad_2.html in 0.01 sec.
2025-10-08 05:07:10,352 - INFO - writing Markdown output to bad_2.md
2025-10-08 05:07:10,359 - INFO - Processed 1 docs, of which 0 failed
2025-10-08 05:07:10,359 - INFO - All documents were converted in 0.02 seconds.
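
For longer logs, the missing references can be collected with a small script. The sketch below assumes the error lines keep exactly the format shown in the log above; `missing_refs` is a hypothetical helper name, not part of Docling:

```python
import re

# Matches the "rich table" error lines in the format shown in the log above.
ERROR_RE = re.compile(
    r"ERROR - Error while making rich table: "
    r"Cannot find all provided RefItems in doc: \[(?P<refs>[^\]]*)\]"
)


def missing_refs(log_text):
    """Return the RefItem paths named in 'rich table' error lines, in log order."""
    refs = []
    for line in log_text.splitlines():
        match = ERROR_RE.search(line)
        if match:
            # Each path is single-quoted inside the bracketed list.
            refs.extend(re.findall(r"'([^']+)'", match.group("refs")))
    return refs


log = """\
2025-10-08 05:07:10,348 - INFO - deleted item in tree at stack: (1, 0, 1) => #/texts/2
2025-10-08 05:07:10,349 - ERROR - Error while making rich table: Cannot find all provided RefItems in doc: ['#/texts/2'].
2025-10-08 05:07:10,349 - ERROR - Error while making rich table: Cannot find all provided RefItems in doc: ['#/texts/3'].
"""
print(missing_refs(log))  # ['#/texts/2', '#/texts/3']
```

In the log above, the first missing reference, #/texts/2, is also the item named in the preceding "deleted item in tree" INFO line.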

Docling version

Output of docling --version:

$ docling --version
2025-10-08 05:05:22,769 - INFO - Loading plugin 'docling_defaults'
2025-10-08 05:05:22,770 - INFO - Registered ocr engines: ['easyocr', 'ocrmac', 'rapidocr', 'tesserocr', 'tesseract']
Docling version: 2.55.1
Docling Core version: 2.48.4
Docling IBM Models version: 3.9.1
Docling Parse version: 4.5.0
Python: cpython-310 (3.10.12)
Platform: Linux-6.8.0-79-generic-x86_64-with-glibc2.35

Note: the Docling build installed in my environment is actually from revision 9705f40.

Python version

$ python -V
Python 3.10.12

Labels: bug (Something isn't working), html (issue related to html backend)
