Commit ecf0267
This PR addresses issue #3659 by adding an optional `language` parameter
to the `OCRAgentGoogleVision` class constructor.
This parameter serves as a "language hint" for the
`document_text_detection` method in the `ImageAnnotatorClient`. For more
information on language hints, refer to the [Google Cloud Vision
documentation](https://cloud.google.com/vision/docs/languages).
**Default Behavior**:
The language parameter defaults to None, allowing Google Cloud Vision to
auto-detect the language, as recommended in their documentation.
**Purpose**:
This change is necessary because the `OCRAgent`'s `get_instance` method
expects all `OCRAgent`s to include a language parameter in their
constructors.
**Context on Issue:**
When trying to parse a PDF with
`OCR_AGENT=unstructured.partition.utils.ocr_models.google_vision_ocr.OCRAgentGoogleVision`,
an error occurs in the `get_instance` method. The method expects a
`language` parameter, which the current `OCRAgentGoogleVision`
constructor does not support, leading to a positional argument error.
---------
Co-authored-by: Christine Straub <[email protected]>
1 parent 6ba376a commit ecf0267
File tree
4 files changed
+32
-19
lines changed- test_unstructured/partition/pdf_image
- unstructured
- partition/utils/ocr_models
4 files changed
+32
-19
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
1 | 11 | | |
2 | 12 | | |
3 | 13 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | 1 | | |
| 2 | + | |
2 | 3 | | |
3 | 4 | | |
4 | 5 | | |
| |||
226 | 227 | | |
227 | 228 | | |
228 | 229 | | |
229 | | - | |
| 230 | + | |
230 | 231 | | |
231 | 232 | | |
232 | 233 | | |
233 | | - | |
| 234 | + | |
234 | 235 | | |
| 236 | + | |
235 | 237 | | |
236 | 238 | | |
237 | 239 | | |
| |||
249 | 251 | | |
250 | 252 | | |
251 | 253 | | |
252 | | - | |
| 254 | + | |
253 | 255 | | |
254 | 256 | | |
255 | 257 | | |
| |||
263 | 265 | | |
264 | 266 | | |
265 | 267 | | |
266 | | - | |
| 268 | + | |
267 | 269 | | |
268 | 270 | | |
269 | 271 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | | - | |
| 1 | + | |
Lines changed: 15 additions & 14 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | 1 | | |
2 | 2 | | |
3 | 3 | | |
4 | | - | |
| 4 | + | |
5 | 5 | | |
6 | | - | |
| 6 | + | |
7 | 7 | | |
8 | 8 | | |
9 | 9 | | |
| |||
19 | 19 | | |
20 | 20 | | |
21 | 21 | | |
22 | | - | |
| 22 | + | |
| 23 | + | |
23 | 24 | | |
24 | 25 | | |
25 | 26 | | |
| |||
32 | 33 | | |
33 | 34 | | |
34 | 35 | | |
35 | | - | |
| 36 | + | |
| 37 | + | |
36 | 38 | | |
37 | 39 | | |
38 | | - | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
39 | 43 | | |
40 | 44 | | |
41 | 45 | | |
42 | 46 | | |
43 | | - | |
44 | | - | |
45 | | - | |
| 47 | + | |
46 | 48 | | |
| 49 | + | |
47 | 50 | | |
48 | 51 | | |
49 | | - | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
50 | 55 | | |
51 | 56 | | |
52 | 57 | | |
53 | 58 | | |
54 | 59 | | |
55 | | - | |
56 | | - | |
57 | | - | |
| 60 | + | |
58 | 61 | | |
59 | 62 | | |
60 | 63 | | |
61 | 64 | | |
62 | 65 | | |
63 | 66 | | |
64 | | - | |
65 | 67 | | |
66 | 68 | | |
67 | 69 | | |
68 | | - | |
69 | 70 | | |
70 | 71 | | |
71 | 72 | | |
| |||
0 commit comments