The confidence score

**Describe the bug**
After parsing a PDF, how can I validate the parsing results?

**To Reproduce**
The `detection_class_prob` metadata key is inconsistent: it is not available for all extracted elements.

**Expected behavior**
I am parsing a PDF that contains images, text, tables rendered as images, and so on, using `partition_pdf()` with the `hi_res` strategy. I expect every element's metadata to include the `detection_class_prob` key, which gives the confidence score. However, the key is missing for some elements: for example, it is present for a Table element but not for an Image element, and it is similarly missing for other element types. The expected behavior is for this key to be present on all elements.

**Screenshots**

<img width="532" alt="Image" src="https://github.com/user-attachments/assets/19e65744-62a1-4e38-8c4b-89fb928bce0e" />

![Image](https://github.com/user-attachments/assets/5559e250-188c-49f2-8347-e0c8be378258)

**Environment Info**
please use 👍 
unstructured version: 0.16.23
```py
from unstructured.partition.pdf import partition_pdf

raw_pdf_elements = partition_pdf(
    filename="/content/data/Cocktails_Spirits.pdf",
    strategy="hi_res",
    infer_table_structure=True,  # Infers table structures from content
    extract_images_in_pdf=True,  # Extract images from the PDF
    extract_image_block_types=["Image", "Table"],  # Image and Table extraction
    extract_image_block_to_payload=True,  # Return images in the response
    output_format="application/json",  # JSON output format
    extract_image_block_output_dir="extracted_data_test",
)
```
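
To show which element types are missing the key, here is a minimal sketch that counts, per element type, how many elements carry `detection_class_prob`. The helper name `summarize_detection_probs` and the sample data are hypothetical; the sample only imitates the shape I would expect from `[el.to_dict() for el in raw_pdf_elements]`, and the values are not real output from the PDF above.

```python
from collections import defaultdict


def summarize_detection_probs(element_dicts):
    """Count, per element type, how many elements have detection_class_prob."""
    summary = defaultdict(lambda: {"with_prob": 0, "without_prob": 0})
    for el in element_dicts:
        # A missing key and an explicit None are both treated as "no score".
        prob = el.get("metadata", {}).get("detection_class_prob")
        bucket = "with_prob" if prob is not None else "without_prob"
        summary[el.get("type", "Unknown")][bucket] += 1
    return dict(summary)


# Hypothetical sample data (assumed shape, made-up values).
sample = [
    {"type": "Table", "metadata": {"detection_class_prob": 0.93}},
    {"type": "Image", "metadata": {}},
    {"type": "NarrativeText", "metadata": {"page_number": 1}},
]

report = summarize_detection_probs(sample)
for element_type, counts in report.items():
    print(element_type, counts)
```

With real output, the same function could be called as `summarize_detection_probs([el.to_dict() for el in raw_pdf_elements])`, assuming the elements expose `to_dict()`.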

**Additional context**
A probability value should be returned for every element.


The confidence score #3903
