Unexpectedly Deep Section Nesting on Second Page – Bug or Expected Behavior in Azure Document Intelligence Multi-page Processing? #41107

Open
@kazunaritakeichi

Description

When processing a multi-page file (see attachment) with Azure Document Intelligence, the resulting layout shows that the section on the second page is nested much deeper than expected.

Is this simply a detection failure?

Or is this expected behavior when handling multi-page documents—meaning that post-processing is required to adjust layout consistency across pages?

I'd appreciate any clarification on whether this is a bug or something that needs to be handled on the client side.

$ # Print only the Markdown heading lines from the analyzed content
$ for line in result.content.splitlines():
$   if line.startswith('#'):
$     print(line)
# This is title
## 1. Text
## 2. Page Objects
### 2.1 Table
### 2.2. Figure
## 3. Others
## This is title
### 1. Text
### 2. Page Objects
#### 2.1 Table
#### 2.2. Figure
### 3. Others
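If this turns out to need client-side handling, one possible workaround is to re-base the heading levels whenever the document title reappears at a deeper level. The sketch below is only an illustration under stated assumptions: `rebase_headings` is a hypothetical helper (not part of the SDK), it assumes the first level-1 heading is the document title, and it assumes a later heading with identical text marks the start of a repeated structure. It works on the Markdown string only and does not call the service.

```python
import re

# A Markdown ATX heading: 1-6 '#' characters, whitespace, then the heading text.
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def rebase_headings(markdown: str) -> str:
    """Shift heading levels so a repeated title is treated as level 1 again.

    Hypothetical helper. Assumption: the first level-1 heading is the
    document title, and any later heading with the same text restarts
    the outline.
    """
    title_text = None  # text of the first level-1 heading seen
    offset = 0         # number of levels to subtract from later headings
    out = []
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if not match:
            out.append(line)
            continue
        level, text = len(match.group(1)), match.group(2).strip()
        if title_text is None and level == 1:
            title_text = text
        elif text == title_text:
            # The title reappeared: everything after it is shifted by the same amount.
            offset = level - 1
        out.append("#" * max(1, level - offset) + " " + text)
    return "\n".join(out)


sample = "\n".join([
    "# This is title",
    "## 1. Text",
    "## This is title",
    "### 1. Text",
    "#### 2.1 Table",
])
print(rebase_headings(sample))
```

This heuristic only helps when each page repeats the same title text. Documents where the title legitimately differs per page would need a different signal, and lines inside code blocks that happen to look like headings would also be rewritten.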

$ # Print the role and text of every title / section heading paragraph
$ for paragraph in result.paragraphs:
$   if paragraph.get('role') in ['title', 'sectionHeading']:
$     print(paragraph['role'], paragraph['content'])
title This is title
sectionHeading 1. Text
sectionHeading 2. Page Objects
sectionHeading 2.1 Table
sectionHeading 2.2. Figure
sectionHeading 3. Others
title This is title
sectionHeading 1. Text
sectionHeading 2. Page Objects
sectionHeading 2.1 Table
sectionHeading 2.2. Figure
sectionHeading 3. Others
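The paragraph roles above carry no depth information, so a second possible client-side approach is to ignore the Markdown heading levels and derive depth from the section numbering instead. The following is a rough sketch under assumptions, not the service's own logic: `outline_from_paragraphs` is a hypothetical helper, paragraphs are assumed to be dict-like with `role` and `content` keys (as in the loop above), and headings are assumed to use dotted numbering such as `2.1`. Unnumbered section headings fall back to level 2.

```python
import re

# Leading dotted section number such as "1.", "2.1" or "2.2." followed by whitespace.
NUMBERING = re.compile(r"^(\d+(?:\.\d+)*)\.?\s")


def outline_from_paragraphs(paragraphs):
    """Return (level, content) pairs for title and section heading paragraphs.

    Hypothetical helper. Assumption: a title is level 1 and a section
    heading numbered with N dotted components is level 1 + N.
    """
    outline = []
    for paragraph in paragraphs:
        role = paragraph.get("role")
        content = paragraph.get("content", "")
        if role == "title":
            outline.append((1, content))
        elif role == "sectionHeading":
            match = NUMBERING.match(content)
            # "2" has depth 1, "2.1" has depth 2; unnumbered headings default to depth 1.
            depth = match.group(1).count(".") + 1 if match else 1
            outline.append((1 + depth, content))
    return outline


paragraphs = [
    {"role": "title", "content": "This is title"},
    {"role": "sectionHeading", "content": "2. Page Objects"},
    {"role": "sectionHeading", "content": "2.1 Table"},
    {"role": "sectionHeading", "content": "2.2. Figure"},
]
for level, content in outline_from_paragraphs(paragraphs):
    print("#" * level, content)
```

Because the level depends only on each heading's own numbering, the result is the same on every page regardless of how the Markdown output nests the second title.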

Labels

Client, Document Intelligence, Service Attention, customer-reported, needs-team-attention, question