'extract_text' text matrix seems to be sometimes broken with v5.1.0

Text extraction with a `visitor_text` callback used to return every word inside the bounding box; with v5.1.0, at least one word is missing.

## Environment

Reproduced on both Linux and Windows with pypdf v5.1.0.
pypdf v5.0.1 has been tested and works correctly.

## Code + PDF

With this PDF: 
[EMSR718_AOI02_DEL_PRODUCT_18000_map_v1.pdf](https://github.com/user-attachments/files/17620595/EMSR718_AOI02_DEL_PRODUCT_18000_map_v1.pdf)

```python
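from pypdf import PageObject


def extract_map_text(
    page: PageObject,
    x_min: float = 0,
    x_max: float = 1,
    y_min: float = 0,
    y_max: float = 1,
    sep=";",
):
    """
    Extract the text located inside a window of the given PDF page.

    Args:
        page (PageObject): PDF page
        x_min (float): Lower x bound, as a fraction of the cropbox right edge
        x_max (float): Upper x bound, as a fraction of the cropbox right edge
        y_min (float): Lower y bound, as a fraction of the cropbox top edge
        y_max (float): Upper y bound, as a fraction of the cropbox top edge
        sep (str): Separator used to join the extracted text parts

    Returns:
        str: Extracted text

    """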
    parts = []

    def visitor_right(text, cm, tm, font_dict, font_size):
        x = tm[4]
        y = tm[5]
        in_window = (
            float(x_max * float(page.cropbox.right))
            > x
            > float(x_min * float(page.cropbox.right))
        ) and (
            float(y_max * float(page.cropbox.top))
            > y
            > float(y_min * float(page.cropbox.top))
        )
        if in_window and text not in ["!", "", " "]:
            parts.append(text)

    page.extract_text(orientations=0, visitor_text=visitor_right)
    page_txt = (
        sep.join([p for p in parts if p not in ["\n"]])
        .replace("\n", " ")
        .replace("\x00", "")
        .replace("\xa0", " ")
    )
    return page_txt
```

Running this snippet: 
```python
extract_map_text(
    page, x_min=0.8, y_min=0.6, y_max=0.8, sep=" "
).replace("  ", " ")
```

With pypdf v5.1.0, the output is:
```
'3.5 km Potentially Affected Built-up and Transportations Built-Up 1 No. 0.9 km Flooded area 33.1 ha Potentially affected population ~ 200'
```

With pypdf v5.0.1, the output is:
```
'3.5 km Potentially Affected Built-up and Transportations Built-Up 1 No. Road 0.9 km Flooded area 33.1 ha Potentially affected population ~ 200'
```

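The word "Road" is missing. After some checks, I see that in the new version the position (`tm[4]`, `tm[5]`) passed to the visitor for "Road" is (0, 0), which is unexpected. With that position, the word falls outside the requested window and is filtered out.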
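To make the check above easier to reproduce, here is a minimal sketch of a visitor that records every text fragment together with the position it is reported at, plus a filter for fragments reported at (0, 0). `make_recording_visitor` and `at_origin` are hypothetical helper names, not part of pypdf; the visitor signature is the same one used in the snippet above.

```python
def make_recording_visitor():
    """Return a (visitor, records) pair; records collects (text, x, y) tuples."""
    records = []

    def visitor(text, cm, tm, font_dict, font_size):
        # tm[4] and tm[5] are the translation components of the text matrix,
        # the same values the snippet above reads as x and y.
        if text.strip():  # skip empty and whitespace-only fragments
            records.append((text, float(tm[4]), float(tm[5])))

    return visitor, records


def at_origin(records, tol=1e-6):
    """Return the records whose reported position is (0, 0) within tol."""
    return [r for r in records if abs(r[1]) <= tol and abs(r[2]) <= tol]


# Assumed usage, with `page` being the same PageObject as above:
# visitor, records = make_recording_visitor()
# page.extract_text(orientations=0, visitor_text=visitor)
# for text, x, y in at_origin(records):
#     print(repr(text), x, y)
```

Running this against both versions and comparing the output of `at_origin` should show which fragments changed position between v5.0.1 and v5.1.0.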
