Commit 662571a

fix: ValueError when converting cells to html (#359)
This PR will address #357 and #358.

### Summary

- add input validation to the `fill_cells()` function. The function now returns an empty list when `cells` is empty, before any processing, instead of raising a `ValueError`.
- correct type hint for parameter `cells` in `table_cells_to_dataframe()`
1 parent 0911892 commit 662571a

4 files changed: 12 additions and 2 deletions


CHANGELOG.md

Lines changed: 5 additions & 0 deletions
@@ -1,4 +1,9 @@
+## 0.7.36
+
+fix: add input parameter validation to `fill_cells()` when converting cells to html
+
 ## 0.7.35
+
 Fix syntax for generated HTML tables
 
 ## 0.7.34

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-__version__ = "0.7.35" # pragma: no cover
+__version__ = "0.7.36" # pragma: no cover

unstructured_inference/inference/layoutelement.py

Lines changed: 3 additions & 1 deletion
@@ -214,7 +214,9 @@ def reduce(keep: Rectangle, reduce: Rectangle):
         reduce(keep=region_b, reduce=region_a)
 
 
-def table_cells_to_dataframe(cells: dict, nrows: int = 1, ncols: int = 1, header=None) -> DataFrame:
+def table_cells_to_dataframe(
+    cells: List[dict], nrows: int = 1, ncols: int = 1, header=None
+) -> DataFrame:
     """convert table-transformer's cells data into a pandas dataframe"""
     arr = np.empty((nrows, ncols), dtype=object)
     for cell in cells:
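
The corrected hint matches how the body consumes its argument: `for cell in cells` expects each item to be one cell dictionary. A small sketch of the difference, using made-up cell records whose keys are assumptions chosen for illustration:

```python
from typing import List

# Illustrative cell records; the keys and values are assumptions for this sketch.
cells_as_list: List[dict] = [
    {"row_nums": [0], "column_nums": [0]},
    {"row_nums": [0], "column_nums": [1]},
]

# Iterating a list of dicts yields one cell dict per step.
list_items = [type(cell).__name__ for cell in cells_as_list]

# Iterating a single dict yields its keys (strings), not cells,
# so the old hint `cells: dict` did not describe what the loop needs.
dict_items = [type(item).__name__ for item in cells_as_list[0]]

print(list_items)
print(dict_items)
```

The hint has no runtime effect; it only tells readers and type checkers what callers should pass.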

unstructured_inference/models/tables.py

Lines changed: 3 additions & 0 deletions
@@ -664,6 +664,9 @@ def fill_cells(cells: List[dict]) -> List[dict]:
             whether this cell is a column header
 
     """
+    if not cells:
+        return []
+
     table_rows_no = max({row for cell in cells for row in cell["row_nums"]})
     table_cols_no = max({col for cell in cells for col in cell["column_nums"]})
     filled = np.zeros((table_rows_no + 1, table_cols_no + 1), dtype=bool)
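
The guard matters because Python's built-in `max()` raises `ValueError` on an empty iterable, which is what the two set comprehensions produce when `cells` is empty. A minimal sketch of that behaviour follows. The helpers `table_extent` and `unguarded_max_row` are hypothetical names for this illustration and only mirror the `max()` lines in the hunk; they are not the library's `fill_cells()`.

```python
from typing import List, Tuple


def unguarded_max_row(cells: List[dict]) -> int:
    """Pre-fix pattern: max() over a set built from cells, with no empty check."""
    return max({row for cell in cells for row in cell["row_nums"]})


def table_extent(cells: List[dict]) -> Tuple[int, int]:
    """Hypothetical helper: return (largest row index, largest column index)."""
    if not cells:
        # Same idea as the commit's guard: return early before max() sees an empty set.
        # The (-1, -1) sentinel is an assumption of this sketch.
        return (-1, -1)
    max_row = max({row for cell in cells for row in cell["row_nums"]})
    max_col = max({col for cell in cells for col in cell["column_nums"]})
    return (max_row, max_col)


sample_cells = [
    {"row_nums": [0], "column_nums": [0, 1]},
    {"row_nums": [1, 2], "column_nums": [0]},
]
extent = table_extent(sample_cells)
empty_extent = table_extent([])

try:
    unguarded_max_row([])
    raised = False
except ValueError:
    # max() of an empty set raises ValueError.
    raised = True

print(extent, empty_extent, raised)
```

An alternative would be `max(..., default=...)`, but an early return also skips the later array allocation, which appears to be the approach taken in the diff.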
