chore: pass ocr_mode in partition_pdf_or_image (#1154)
Set `ocr_mode` to `individual_blocks` for now to work around [this
bug](Unstructured-IO/unstructured-inference#179).
I verified this by printing the current `ocr_mode` in inference: the
`entire_page` default is overridden.
---------
Co-authored-by: ryannikolaidis <[email protected]>
Co-authored-by: awalker4 <[email protected]>
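As a rough illustration of the change described in the commit message, the sketch below shows a caller forwarding `ocr_mode` explicitly so that a callee's `entire_page` default is overridden. Both function names and their signatures are hypothetical stand-ins, not the real `unstructured` or `unstructured-inference` APIs; only the two mode strings come from the commit message.

```python
# Hypothetical sketch: these functions are illustrative stand-ins,
# not the actual unstructured / unstructured-inference code.

ALLOWED_OCR_MODES = {"entire_page", "individual_blocks"}


def run_inference_stub(filename: str, ocr_mode: str = "entire_page") -> str:
    """Stand-in for the inference layer; reports which OCR mode it received."""
    if ocr_mode not in ALLOWED_OCR_MODES:
        raise ValueError(f"unsupported ocr_mode: {ocr_mode!r}")
    # Mirrors the verification step in the commit message: print the mode in use.
    print(f"ocr_mode in inference: {ocr_mode}")
    return ocr_mode


def partition_pdf_or_image_sketch(
    filename: str, ocr_mode: str = "individual_blocks"
) -> str:
    """Stand-in for the partitioning entry point; forwards ocr_mode explicitly."""
    # Without this keyword argument the callee would fall back to "entire_page".
    return run_inference_stub(filename, ocr_mode=ocr_mode)


if __name__ == "__main__":
    partition_pdf_or_image_sketch("IRS-form-1987.png")
```

Calling the stub directly without the argument would still use `entire_page`, which is why the value has to be passed through at the call site.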
test_unstructured_ingest/expected-structured-output/azure/IRS-form-1987.png.json
+57 -47
@@ -1,17 +1,17 @@
 [
 {
 "type": "Title",
-"element_id": "88591a76b54e47215c0827ae8838ec13",
+"element_id": "0c4e18d78e721c8179f3946b75b17d15",
 "metadata": {
 "data_source": {},
 "filetype": "image/png",
 "page_number": 1
 },
-"text": "Instructions for Form 3115 (Rev. November 1987)"
+"text": "Instructions for Form 3115 (Rev. November 1987) Annlicatinn far Chance in Accounting Mathond"
 },
 {
 "type": "NarrativeText",
-"element_id": "766cf1d1243ef2cdbb0db5ad32d7f9c9",
+"element_id": "41f3d9c83b2b4679195c9796134fd8f5",
 "metadata": {
 "data_source": {},
 "filetype": "image/png",
@@ -21,7 +21,7 @@
 },
 {
 "type": "ListItem",
-"element_id": "36a565493a214d3f7e7f24794c1dc7f4",
+"element_id": "97968e4ba14bd2d082a70ec61ef2d9b1",
 "metadata": {
 "data_source": {},
 "filetype": "image/png",
@@ -111,7 +111,7 @@
 },
 {
 "type": "ListItem",
-"element_id": "59bc2945a7f606bd5078bac3bc1199d4",
+"element_id": "f0d2beb7f43493694a91137e8e65b5f3",
 "metadata": {
 "data_source": {},
 "filetype": "image/png",
@@ -121,7 +121,7 @@
 },
 {
 "type": "ListItem",
-"element_id": "5157d731aa6a97c9b166799db2295bce",
+"element_id": "13f2a282f705590fbe7b6ce15b08862a",
 "metadata": {
 "data_source": {},
 "filetype": "image/png",
@@ -141,7 +141,7 @@
 },
 {
 "type": "ListItem",
-"element_id": "34b66452ca63c465c69d849e4acf6d46",
+"element_id": "9820f79275e683f5afe3f2f1283de4ca",
 "metadata": {
 "data_source": {},
 "filetype": "image/png",
@@ -161,7 +161,7 @@
 },
 {
 "type": "ListItem",
-"element_id": "b0fa5aaff0cee8574822dd8ac6537c06",
+"element_id": "a98378f4a88db65dff42b7d8bd75be92",
 "metadata": {
 "data_source": {},
 "filetype": "image/png",
@@ -181,7 +181,7 @@
 },
 {
 "type": "ListItem",
-"element_id": "13f155c0754434406190f3cf49c82c3c",
+"element_id": "3cb57c50002187a715e1c5048e643c65",
 "metadata": {
 "data_source": {},
 "filetype": "image/png",
@@ -201,33 +201,33 @@
 },
 {
 "type": "ListItem",
-"element_id": "178d6933ed193747b1c4aa1c048e7f94",
+"element_id": "beeb50db70ce1aa76813cce98e46bd56",
 "metadata": {
 "data_source": {},
 "filetype": "image/png",
 "page_number": 1
 },
-"text": "for these changes."
+"text": "for these changes. Tb od Db bee Cl"
 },
 {
 "type": "NarrativeText",
-"element_id": "7685df2334a5f6c8c8099dea61a8f1b4",
+"element_id": "640a100da1a3bee6f1f134c51a2c8648",
 "metadata": {
 "data_source": {},
 "filetype": "image/png",
 "page_number": 1
 },
-"text": "Long-term contracts.—If you are required to change your method of accounting for long-term contracts under section 460, see Notice 87-61 (9/21/87), 1987-38 IRB 40, for the notification procedures that must be followed."
+"text": "Long-term contracts.—If you are required to change your method of accounting for long-term contracts under section 460, see Notice 87-61 (9/21/87), 1987-38 IRB 40, for the notification procedures that must be followed"
 },
 {
 "type": "Title",
-"element_id": "61ed58fa51293f429f87e8cf1896c9e4",
+"element_id": "a232d246e22a4f6bb8dcab62cffb2567",
 "metadata": {
 "data_source": {},
 "filetype": "image/png",
 "page_number": 1
 },
-"text": "Paperwork Reduction Act Notice"
+"text": "Paperwork Reduction Act Notice We ack for thic infarenatinn te marry mye the."
 },
 {
 "type": "Title",
@@ -241,27 +241,37 @@
 },
 {
 "type": "ListItem",
-"element_id": "5f8051f8010896bab02aaf784c04ae02",
+"element_id": "58f1649a32eda8b8c513e51a209666a6",
 "metadata": {
 "data_source": {},
 "filetype": "image/png",
 "page_number": 1
 },
-"text": "Individuals.—An individual desiring the change should sign the application. Ifthe application pertains to a husband and wife filing a joint Income tax return, the names of both should appear in the heading and both should sign Partnerships.—The form should be signed with the partnership name followed by the signature of one of the general partners and the words “General Partner.” Corporations, cooperatives, and insurance companies.—The form should show the name of the corporation, cooperative, or insurance Company and the signature of the president, vice president, treasurer, assistant treasurer, or chief accounting officer (such as tax officer) authorized tosign, and his or her official title. Receivers, trustees, or assignees must sign any application they are required to file, For a subsidiary corporation filing a consolidated return with its parent, the form should be signed by an officer of the parent corporation, Fiduciaries.—The-form should show the name of the estate or trust and be signed by the fiduciary, personal representative, executor, executrix, administrator, administratrx, etc’, having legal authority to'sign, and his or her ttle. Preparer other than partner, officer, etc.—The signature of the individual preparing the application should appear in the space provided on page"
+"text": "Signature Individuals.—An individual desiring the change should sign the application. Ifthe application pertains to a husband and wife filing a joint Income tax return, the names of both should appear in the heading and both should sign Partnerships.—The form should be signed with the partnership name followed by the signature of one of the general partners and the words “General Partner.” Corporations, cooperatives, and insurance companies.—The form should show the name of the corporation, cooperative, or insurance Company and the signature of the president, vice president, treasurer, assistant treasurer, or chief accounting officer (such as tax officer) authorized tosign, and his or her official title. Receivers, trustees, or assignees must sign any application they are required to file, For a subsidiary corporation filing a consolidated return with its parent, the form should be signed by an officer of the parent corporation, Fiduciaries.—The-form should show the name of the estate or trust and be signed by the fiduciary, personal representative, executor, executrix, administrator, administratrx, etc’, having legal authority to'sign, and his or her ttle. Preparer other than partner, officer, etc.—The signature of the individual preparing the application should appear in the space provided on page"
+},
+{
+"type": "ListItem",
+"element_id": "586e989b479e4362ebe28a6954c1427b",
+"metadata": {
+"data_source": {},
+"filetype": "image/png",
+"page_number": 1
+},
+"text": "If the individual or firm is also authorized to"
 },
 {
 "type": "NarrativeText",
-"element_id": "4660422c06dddc914ab634c5e4045dec",
+"element_id": "446ccb7d96fea659d50aef8a6dd670df",
 "metadata": {
 "data_source": {},
 "filetype": "image/png",
 "page_number": 1
 },
-"text": "We ask for this information to carry out the Internal Revenue laws of the United States. We need it to ensure that taxpayers are complying with these laws an¢ to allow us to figure and collect the nght amount of tax. You are required to give us this information."
+"text": "We ask for this information to carry out the Internal Revenue laws of the United States. We need it to ensure that taxpayers are complying with these laws an¢ to allow us to figure and collect the right amount of tax. You are required to give us this information,"
 },
 {
 "type": "Title",
-"element_id": "a1547a4ed1611eee44b15e99120fb978",
+"element_id": "226fa83297914d5195e002508d61fb1d",
 "metadata": {
 "data_source": {},
 "filetype": "image/png",
@@ -271,77 +281,77 @@
 },
 {
 "type": "Title",
-"element_id": "68a3289177b49b285e133a5267eb355f",
+"element_id": "f0e951e5bcb4a6070fa6672b37822348",
 "metadata": {
 "data_source": {},
 "filetype": "image/png",
 "page_number": 1
 },
-"text": "Purpose of Form"
+"text": "Purpose of Form Cin bce Secon te cece cget."
 },
 {
 "type": "NarrativeText",
-"element_id": "f9b8e17da7a31507773f78959378e09c",
+"element_id": "5e5451e052baf894b2bdad4132f6cd2f",
 "metadata": {
 "data_source": {},
 "filetype": "image/png",
 "page_number": 1
 },
-"text": "File this form to request a change in your accounting method, including the accounting treatment of any item. if you are requesting 2 change in accounting period, use Form 1128, Application for Change in Accounting Period. For more information, see Publication 538, Accounting Periods and Methods,"
+"text": "ee File this form to request a change in your accounting method, including the accounting treatment of any item. if you are requesting 2 change in accounting period, use Form 1128, Application for Change in Accounting Period. For more information, see Publication 538, Accounting Periods and Methods,"
 },
 {
 "type": "NarrativeText",
-"element_id": "b3859f2f29884b1d3ba0892e52859a99",
+"element_id": "cc1701e3ce9347e344b3df80d426bd21",
 "metadata": {
 "data_source": {},
 "filetype": "image/png",
 "page_number": 1
 },
-"text": "When filing Form 3115, taxpayers are reminded to determine if IRS has published a ruling or procedure dealing with the specific type of change since November 1987 (the current. revision date of Form 3115)"
+"text": "Seti aes When filing Form 3115, taxpayers are reminded to determine if IRS has published a ruling or procedure dealing with the specific type of change since November 1987 (the current. revision date of Form 3115)"
 },
 {
 "type": "NarrativeText",
-"element_id": "e5a95dc10d4071983b70898a21f11175",
+"element_id": "b81dc18d0f8666f9bf7400a00657dc72",
 "metadata": {
 "data_source": {},
 "filetype": "image/png",
 "page_number": 1
 },
-"text": "Generally, applicants must complete Section ‘A. In addition, complete the appropriate sections (B:1 through H) for which a change is desired."
+"text": "POMS SANE OPFOR DA 29). Generally, applicants must complete Section ‘A. In addition, complete the appropriate sections (B:1 through H) for which a change is desired. You must give alll relevant facts, including a"
 },
 {
 "type": "Title",
-"element_id": "5756fb398995bb6518a87637f24f426e",
+"element_id": "c7502aa5b000d6446f3eca882518a260",
 "metadata": {
 "data_source": {},
 "filetype": "image/png",
 "page_number": 1
 },
-"text": "Time and Place for Filing"
+"text": "Time and Place for Filing amarall, ammlimeete maet file snete"
 },
 {
 "type": "NarrativeText",
-"element_id": "25f830e7c39c115c9937eb9d11cfb1f2",
+"element_id": "8b35e7c212710b1099b675ce9394fb47",
 "metadata": {
 "data_source": {},
 "filetype": "image/png",
 "page_number": 1
 },
-"text": "State whether you desire a conference in the National Office if the Service proposes to disapprove your application"
+"text": "Se NB ON State whether you desire a conference in the National Office if the Service proposes to disapprove your application."
 },
 {
 "type": "Title",
-"element_id": "8b06cd6e2bf7fc15130d5d9ed7e66283",
+"element_id": "0a16a0fea889be77576c0fd88575554a",
 "metadata": {
 "data_source": {},
 "filetype": "image/png",
 "page_number": 1
 },
-"text": "Affiliated Groups"
+"text": "Affiliated Groups Tavmayare that ara mam)"
 },
 {
 "type": "Title",
-"element_id": "242a9dba10a04654d4adef9c58ff96f6",
+"element_id": "68b58298cabd9069c975b192a7183139",
 "metadata": {
 "data_source": {},
 "filetype": "image/png",
@@ -351,62 +361,62 @@
 },
 {
 "type": "Title",
-"element_id": "11c98a9cbd6a200fbc5b93fed15007ac",
+"element_id": "6a8881a6e87021b2362243f7df3e4b1d",
 "metadata": {
 "data_source": {},
 "filetype": "image/png",
 "page_number": 1
 },
-"text": "Uniform capitalization rules and limitation on"
+"text": "Uniform capitalization rules and limitation on cash method.—If you are required to char"
 },
 {
 "type": "Title",
-"element_id": "58703de56debc34a1d68e6ed6f8fd067",
+"element_id": "8daeb8b48fb666f1dd54e2af283d0c22",
 "metadata": {
 "data_source": {},
 "filetype": "image/png",
 "page_number": 1
 },
-"text": "Specific Instructions Section A"
+"text": "Specific Instructions Section A Neem Ea mama 1 !Taeahle inemes"
 },
 {
 "type": "Title",
-"element_id": "a4316c02df07840f1beb56609cb09735",
+"element_id": "09203a0c6955f64ca8eb52cd6ea47034",
 "metadata": {
 "data_source": {},
 "filetype": "image/png",
 "page_number": 1
 },
-"text": "Late Applications"
+"text": "Late Applications Me coup armlimatinm te ler"
 },
 {
 "type": "NarrativeText",
-"element_id": "39458f370b98a606db29ac6dee975e07",
+"element_id": "962e3f0ceb1f0b1b08a1c19adde8d962",
 "metadata": {
 "data_source": {},
 "filetype": "image/png",
 "page_number": 1
 },
-"text": "Disregard the instructions under Time and Place for Filing and Late Applications. instead, attach Form 3115 to your income tax return for the year of change; do not file it separately. Also include on a separate statement accompanying the Form 3115 the period over which the section 481(2) adjustment will be taken into account and"
+"text": "lethal elaine bela Disregard the instructions under Time and Place for Filing and Late Applications. instead, attach Form 3115 to your income tax return for the year of change; do not file it separately. Also include on a separate statement accompanying the Form 3115 the period over which the section 481(2) adjustment will be taken into account and the basis for that conclusion. Identify the"
 },
 {
 "type": "Title",
-"element_id": "025a65465b6fd9635316e92633b24c7e",
+"element_id": "bfe98eb672d95c15a11ed3e618928b4e",
 "metadata": {
 "data_source": {},
 "filetype": "image/png",
 "page_number": 1
 },
-"text": "Identifying Number"
+"text": "Identifying Number Ndiuidesale Am omptisoehesal"
 },
 {
 "type": "NarrativeText",
-"element_id": "9240bfa889b87dc2fb3fa746ca4eeeb4",
+"element_id": "87f8128b03a72c616ee1a1bb91e11c56",
 "metadata": {
 "data_source": {},
 "filetype": "image/png",
 "page_number": 1
 },
-"text": "Others.-—The employer identification number of an applicant other than an individual should be entered in this block,"
+"text": "—e—e—— eee Others.-—The employer identification number of an applicant other than an individual should be entered in this block,"