Commit 326f180
Feat: add more output format for table inference (#263)
This PR addresses
[CORE-2307](https://unstructured-ai.atlassian.net/browse/CORE-2307)
- add a new kwarg to `UnstructuredTableTransformerModel.run_prediction`:
`output_format`
- default `output_format` is `html`, which is current behavior: output
html string representation of the table
- another options available is `dataframe`, which returns a pandas
dataframe representation of the table
- if not specified or any other string value for `output_format` it
returns a list of dictionaries: table cell format, the original output
format from table transformer
- `unstructured.model.tables.recognize` no longer accepts `out_html`
kwarg and it now only returns table cell format
[CORE-2307]:
https://unstructured-ai.atlassian.net/browse/CORE-2307?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
---------
Co-authored-by: qued <[email protected]>1 parent 2ee38e6 commit 326f180
File tree
5 files changed
+107
-16
lines changed- test_unstructured_inference
- models
- unstructured_inference
- models
5 files changed
+107
-16
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | | - | |
| 1 | + | |
2 | 2 | | |
3 | 3 | | |
4 | 4 | | |
5 | 5 | | |
| 6 | + | |
| 7 | + | |
6 | 8 | | |
7 | 9 | | |
8 | 10 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
2 | 2 | | |
3 | 3 | | |
4 | 4 | | |
5 | | - | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
6 | 10 | | |
7 | 11 | | |
8 | 12 | | |
| |||
122 | 126 | | |
123 | 127 | | |
124 | 128 | | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
| 133 | + | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
| 137 | + | |
| 138 | + | |
| 139 | + | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
| 145 | + | |
| 146 | + | |
| 147 | + | |
| 148 | + | |
| 149 | + | |
| 150 | + | |
| 151 | + | |
| 152 | + | |
| 153 | + | |
| 154 | + | |
| 155 | + | |
| 156 | + | |
| 157 | + | |
| 158 | + | |
| 159 | + | |
| 160 | + | |
| 161 | + | |
| 162 | + | |
| 163 | + | |
| 164 | + | |
| 165 | + | |
| 166 | + | |
| 167 | + | |
| 168 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
361 | 361 | | |
362 | 362 | | |
363 | 363 | | |
| 364 | + | |
| 365 | + | |
| 366 | + | |
| 367 | + | |
| 368 | + | |
| 369 | + | |
| 370 | + | |
| 371 | + | |
| 372 | + | |
| 373 | + | |
| 374 | + | |
| 375 | + | |
| 376 | + | |
| 377 | + | |
| 378 | + | |
| 379 | + | |
| 380 | + | |
| 381 | + | |
| 382 | + | |
| 383 | + | |
| 384 | + | |
| 385 | + | |
| 386 | + | |
| 387 | + | |
| 388 | + | |
| 389 | + | |
| 390 | + | |
| 391 | + | |
| 392 | + | |
| 393 | + | |
| 394 | + | |
| 395 | + | |
| 396 | + | |
| 397 | + | |
| 398 | + | |
| 399 | + | |
| 400 | + | |
| 401 | + | |
| 402 | + | |
| 403 | + | |
| 404 | + | |
| 405 | + | |
| 406 | + | |
| 407 | + | |
| 408 | + | |
| 409 | + | |
| 410 | + | |
| 411 | + | |
364 | 412 | | |
365 | 413 | | |
366 | 414 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | | - | |
| 1 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
19 | 19 | | |
20 | 20 | | |
21 | 21 | | |
| 22 | + | |
22 | 23 | | |
23 | 24 | | |
24 | 25 | | |
| |||
176 | 177 | | |
177 | 178 | | |
178 | 179 | | |
| 180 | + | |
179 | 181 | | |
180 | 182 | | |
181 | 183 | | |
| |||
186 | 188 | | |
187 | 189 | | |
188 | 190 | | |
189 | | - | |
190 | | - | |
| 191 | + | |
| 192 | + | |
| 193 | + | |
| 194 | + | |
| 195 | + | |
| 196 | + | |
191 | 197 | | |
192 | 198 | | |
193 | 199 | | |
| |||
234 | 240 | | |
235 | 241 | | |
236 | 242 | | |
237 | | - | |
| 243 | + | |
238 | 244 | | |
239 | | - | |
240 | | - | |
241 | 245 | | |
242 | 246 | | |
243 | 247 | | |
| |||
248 | 252 | | |
249 | 253 | | |
250 | 254 | | |
251 | | - | |
252 | | - | |
253 | | - | |
254 | | - | |
255 | | - | |
256 | | - | |
257 | | - | |
258 | | - | |
| 255 | + | |
259 | 256 | | |
260 | 257 | | |
261 | 258 | | |
| |||
0 commit comments