sdk/formrecognizer/azure-ai-formrecognizer/MIGRATION_GUIDE.md
+72-76
@@ -52,8 +52,8 @@ Please refer to the [README][readme] for more information on these new clients.
 Some terminology has changed to reflect the enhanced capabilities of the newest Form Recognizer service APIs. While the service is still called `Form Recognizer`, it is capable of much more than simple recognition and is not limited to documents that are `forms`. As a result, we've made the following broad changes to the terminology used throughout the SDK:
 
-- The word `Document` has broadly replaced the word `Form.` The service supports a wide variety of documents and data-extraction scenarios, not merely limited to `forms.`
-- The word `Analyze` has broadly replaced the word `Recognize.` The document analysis operation executes a data extraction pipeline that supports more than just recognition.
+- The word `Document` has broadly replaced the word `Form.` The service supports a wide variety of documents and data-extraction scenarios, not merely limited to `forms`.
+- The word `Analyze` has broadly replaced the word `Recognize`. The document analysis operation executes a data extraction pipeline that supports more than just recognition.
 - Distinctions between `custom` and `prebuilt` models have broadly been eliminated. Prebuilt models are simply models that were created by the Form Recognizer service team and that exist within every Form Recognizer resource.
 - The concept of `model training` has broadly been replaced with `model creation`, `building a model`, or `model administration` (whatever is most appropriate in context), as not all model creation operations involve `training` a model from a data set. When referring to a model schema trained from a data set, we will use the term `document type` instead.
 - `begin_analyze_document` and `begin_analyze_document_from_url` accept a string with the desired model ID for analysis. The model ID can be any of the prebuilt model IDs or a custom model ID.
 - Along with more consolidated analysis methods in the `DocumentAnalysisClient`, the return types have also been improved and remove the hierarchical dependencies between elements. An instance of the `AnalyzeResult` model is now returned which showcases important document elements, such as key-value pairs, tables, and document fields and values, among others, at the top level of the returned model. This can be contrasted with `RecognizedForm` which included more hierarchical relationships, for instance tables were an element of a `FormPage` and not a top-level element.
-- In the new version of the library, the functionality of `begin_recognize_content` has been added as a prebuilt model and can be called in library version `azure-ai-formrecognizer (3.2.x)` with `begin_analyze_document` by passing in the `prebuilt-layout` model ID. Similarly, to get general document information, such as key-value pairs and text layout, the `prebuilt-document` model ID can be used with `begin_analyze_document`. Additionally, passing in the `prebuilt-read` model was added to read information about pages and detected languages.
+- In the new version of the library, the functionality of `begin_recognize_content` has been added as a prebuilt model and can be called in library version `azure-ai-formrecognizer (3.2.x)` with `begin_analyze_document` by passing in the `prebuilt-layout` model ID. Similarly, to get general document information, such as key-value pairs and text layout, the `prebuilt-document` model ID can be used with `begin_analyze_document`. Additionally, the `prebuilt-read` model was added to read information about pages and detected languages.
 - When calling `begin_analyze_document` and `begin_analyze_document_from_url` the returned type is an `AnalyzeResult` object, while the various methods used with `FormRecognizerClient` return a list of `RecognizedForm`.
 - The `pages` keyword argument is a string with library version `azure-ai-formrecognizer (3.2.x)`. In `azure-ai-formrecognizer (3.1.x)`, `pages` was a list of strings.
 - The `include_field_elements` keyword argument is not supported with the `DocumentAnalysisClient`, text details are automatically included with API version `2022-08-31` and later.
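Reviewer note: the model-ID and `pages` changes described in the bullets above can be sketched as a small migration helper. This is a minimal sketch under stated assumptions, not code from the guide: `pages_to_string` and `analyze_with_layout` are hypothetical names, `client` is assumed to be a 3.2.x `DocumentAnalysisClient`, and joining the 3.1.x list with commas assumes the service accepts a comma-separated page string such as `"1-3,5"`.

```python
from typing import Any, BinaryIO, Optional, Sequence, Union


def pages_to_string(pages: Optional[Sequence[str]]) -> Optional[str]:
    """Convert a 3.1.x-style `pages` list into a 3.2.x-style string.

    Assumption: pages and ranges are comma-separated, e.g. "1-3,5".
    Returns None when no pages were requested.
    """
    if not pages:
        return None
    return ",".join(p.strip() for p in pages)


def analyze_with_layout(
    client: Any,
    document: Union[bytes, BinaryIO],
    pages: Optional[Sequence[str]] = None,
):
    """Hypothetical stand-in for a 3.1.x `begin_recognize_content` call.

    `client` is assumed to be a DocumentAnalysisClient (3.2.x); the
    `prebuilt-layout` model ID selects layout analysis.
    """
    kwargs = {}
    pages_str = pages_to_string(pages)
    if pages_str is not None:
        # 3.2.x expects a single string rather than a list of strings.
        kwargs["pages"] = pages_str
    poller = client.begin_analyze_document("prebuilt-layout", document, **kwargs)
    # With the real client this is an AnalyzeResult.
    return poller.result()
```

Swapping `"prebuilt-layout"` for `"prebuilt-document"` or `"prebuilt-read"` would select the other prebuilt models named in the bullets; the helper takes the client as a parameter so it can be exercised without a live resource.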
@@ -169,14 +169,8 @@ with open(path_to_sample_documents, "rb") as f:
     print("Document has type {}".format(document.doc_type))
-    print("Document has document type confidence {}".format(document.confidence))
-    print("Document was analyzed with model with ID {}".format(result.model_id))
+    print("Document has confidence {}".format(document.confidence))
+    print("Document was analyzed by model with ID {}".format(result.model_id))
     for name, field in document.fields.items():
         field_value = field.value if field.value else field.content
         print("......found field of type '{}' with value '{}' and with confidence {}".format(field.value_type, field_value, field.confidence))
@@ -566,22 +563,20 @@ for page in result.pages:
                 word.content, word.confidence
             )
         )
-    if page.selection_marks:
-        print("\nSelection marks found on page {}".format(page.page_number))
-        for selection_mark in page.selection_marks:
-            print(
-                "...Selection mark is '{}' and has a confidence of {}".format(
-                    selection_mark.state, selection_mark.confidence
-                )
+    for selection_mark in page.selection_marks:
+        print(
+            "...Selection mark is '{}' and has a confidence of {}".format(
+                selection_mark.state, selection_mark.confidence
             )
+        )
 
 for i, table in enumerate(result.tables):
     print("\nTable {} can be found on page:".format(i + 1))
     for region in table.bounding_regions:
         print("...{}".format(i + 1, region.page_number))
     for cell in table.cells:
         print(
-            "...Cell[{}][{}] has text '{}'".format(
+            "...Cell[{}][{}] has content '{}'".format(
                 cell.row_index, cell.column_index, cell.content
             )
         )
@@ -632,6 +627,7 @@ for doc in model.training_documents:
```
 
 Train a custom model with `3.2.x`:
+
 Use `begin_build_document_model()` to build a custom document model. Please note that this method has a required `build_mode` parameter. See https://aka.ms/azsdk/formrecognizer/buildmode for more information about build modes. Additionally, `blob_container_url` is a required keyword-only parameter.
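Reviewer note: the call shape described in the line above (a required `build_mode` plus a keyword-only `blob_container_url`) can be sketched as follows. This is a minimal sketch, not the guide's own sample: `build_custom_model` is a hypothetical helper, `client` is assumed to be a 3.2.x `DocumentModelAdministrationClient`, and the accepted mode strings `"template"` and `"neural"` as well as the optional `description` keyword are assumptions to check against the build-mode documentation linked above.

```python
from typing import Any, Optional

# Assumed build-mode values; confirm against the build-mode documentation.
_ASSUMED_BUILD_MODES = ("template", "neural")


def build_custom_model(
    client: Any,
    container_sas_url: str,
    build_mode: str = "template",
    description: Optional[str] = None,
):
    """Build a custom document model with a 3.2.x administration client.

    `client` is assumed to be a DocumentModelAdministrationClient.
    `build_mode` is passed positionally; `blob_container_url` must be
    passed by keyword because it is keyword-only.
    """
    if build_mode not in _ASSUMED_BUILD_MODES:
        raise ValueError("unsupported build_mode: {!r}".format(build_mode))
    kwargs = {"blob_container_url": container_sas_url}
    if description is not None:
        kwargs["description"] = description
    poller = client.begin_build_document_model(build_mode, **kwargs)
    # Blocks until the long-running build operation completes.
    return poller.result()
```

Passing the client in as an argument keeps the helper free of credentials and lets the keyword-only call shape be checked with a stub.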
 These code samples show common scenario operations with the Azure Form Recognizer client library.
 
-These sample programs show common scenarios for the Form Recognizer client's offerings.
-
 All of these samples need the endpoint to your Form Recognizer resource ([instructions on how to get endpoint][get-endpoint-instructions]), and your Form Recognizer API key ([instructions on how to get key][get-key-instructions]).
 
 ## Samples for client library versions 3.2.0 and later
sdk/formrecognizer/azure-ai-formrecognizer/samples/v3.2/async_samples/sample_analyze_identity_documents_async.py