When converting a PDF document that contains tables, I noticed that the content of one cell is missing from the output.
Here is my PDF document: pdf_10.pdf. The cell containing "护发类" (hair care) is lost. For reference, I'm using the following parameters:
-H 'accept: application/json'
-H 'Content-Type: multipart/form-data'
-F 'files=@pdf_10.pdf;type=application/pdf'
-F 'from_formats=pdf'
-F 'to_formats=md'
-F 'ocr_engine=rapidocr'
-F 'force_ocr=true'
-F 'table_mode=accurate'
-F 'pdf_backend=dlparse_v4'
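To confirm the loss without inspecting the output by eye, a check along these lines could be run on the returned Markdown. This is only a sketch: the helper names `table_cells` and `missing_cells` are hypothetical, and the sample table is invented for illustration rather than taken from pdf_10.pdf or from the service's actual response.

```python
import re


def table_cells(markdown: str) -> list[str]:
    """Collect the non-empty cell texts from Markdown pipe tables."""
    cells = []
    for line in markdown.splitlines():
        line = line.strip()
        # Only consider rows of the form "| a | b |".
        if not (line.startswith("|") and line.endswith("|")):
            continue
        parts = [p.strip() for p in line.strip("|").split("|")]
        # Skip header separator rows such as "|---|:---:|".
        if all(re.fullmatch(r":?-+:?", p) for p in parts):
            continue
        cells.extend(p for p in parts if p)
    return cells


def missing_cells(markdown: str, expected: list[str]) -> list[str]:
    """Return the expected cell texts that do not appear in any table cell."""
    found = set(table_cells(markdown))
    return [cell for cell in expected if cell not in found]


if __name__ == "__main__":
    # Invented sample output with an empty cell where "护发类" should be.
    sample = "| 类别 | 产品 |\n|---|---|\n|  | 洗发水 |\n"
    print(missing_cells(sample, ["护发类", "洗发水"]))
```

Running the same check against the output of a different table mode or model would show whether the cell is recovered.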
Tracing through the code, I found that it uses tablemodel04_rs.py(url) to process the tables, so I suspect the problem is with the model. Would it be possible to fix this, for example by using a different model?
Thanks in advance!