No reliable signal for TextLayer completion / stability #2048

@rano667

Before you start - checklist

  • I understand that React-PDF does not aim to be a fully-fledged PDF viewer and is only a tool to make one
  • I have checked if this feature request is not already reported

Description

🐞 No reliable signal for TextLayer completion / stability

Summary

react-pdf does not expose any reliable signal to determine when the TextLayer for all rendered pages is complete and stable.
This makes it extremely difficult to safely run downstream DOM-dependent logic (e.g. text-based highlighting, annotations, selection overlays) without relying on fragile heuristics.


What I am trying to do

I am building a PDF viewer on top of react-pdf that:

  • Relies on the TextLayer DOM (span elements)
  • Needs to run post-processing logic exactly once, after all text spans are present
  • Must support:
    • Multi-page PDFs
    • Dynamic scale changes
    • Both text-based PDFs and OCR-generated artificial text layers

Example use cases:

  • Auto-highlighting extracted values
  • Text anchoring / annotation systems
  • Selection overlays synced with text

The core problem

There is no explicit signal indicating that:

“All text spans for the rendered pages have been inserted and no further DOM mutations will occur.”

Available callbacks/events today:

  • onLoadSuccess → PDF document loaded (nothing rendered yet)
  • onRenderSuccess → Page canvas rendered (says nothing about the text layer)
  • onRenderTextLayerSuccess → Fires per page, but:
    • Does not guarantee all spans are inserted
    • Does not indicate global completion across pages
    • Late DOM mutations (fonts, layout, async inserts) may still occur

As a result, consumers are forced to infer TextLayer readiness, typically using heuristics.


Current workaround (fragile)

The only viable workaround today is something like:

  • Observe the container DOM using MutationObserver
  • Watch for TextLayer span insertions
  • Debounce mutations (e.g. “300ms without changes”)
  • Assume the TextLayer is “ready”
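
For reference, the steps above can be sketched roughly as follows. This is a minimal illustration of the heuristic, not react-pdf API: `createQuietDetector`, `watchTextLayer`, and `TimerApi` are hypothetical names, the 300 ms window is the arbitrary value mentioned above, and the timer is injectable only so the debounce logic can be exercised without a real clock.

```typescript
// Sketch of the debounce heuristic. All names below are hypothetical
// helpers for illustration; none of them are part of react-pdf.

type TimerApi = {
  set: (fn: () => void, ms: number) => unknown;
  clear: (handle: unknown) => void;
};

const defaultTimer: TimerApi = {
  set: (fn, ms) => setTimeout(fn, ms),
  clear: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
};

// Calls onQuiet once no activity has been reported for quietMs.
function createQuietDetector(
  quietMs: number,
  onQuiet: () => void,
  timer: TimerApi = defaultTimer,
) {
  let handle: unknown = null;

  const cancel = (): void => {
    if (handle !== null) {
      timer.clear(handle);
      handle = null;
    }
  };

  const notify = (): void => {
    cancel(); // every new mutation restarts the quiet window
    handle = timer.set(() => {
      handle = null;
      onQuiet();
    }, quietMs);
  };

  return { notify, cancel };
}

// Browser-only wiring: watch a container for node insertions and assume
// the text layer is "ready" after quietMs without any. Returns a cleanup function.
function watchTextLayer(
  container: Node,
  onReady: () => void,
  quietMs = 300,
): () => void {
  const detector = createQuietDetector(quietMs, onReady);
  const observer = new MutationObserver((records) => {
    // Only react to batches that actually added nodes.
    if (records.some((r) => r.addedNodes.length > 0)) detector.notify();
  });
  observer.observe(container, { childList: true, subtree: true });
  return () => {
    observer.disconnect();
    detector.cancel();
  };
}
```

Note that `onReady` fires again after every later burst of insertions (for example after a scale change), so the caller still has to decide which firing counts as "final".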

This approach:

  • Is timing-based, not state-based
  • Can fire too early or too late
  • Breaks under scale changes, large PDFs, or delayed font/layout recalculations
  • Becomes brittle in real-world usage

This is not a bug in react-pdf — it is a missing lifecycle signal.


Why this matters

Many advanced features depend on knowing when text is stable:

  • Auto-highlighting / value matching
  • Anchored annotations
  • Text-driven overlays
  • OCR text synchronization
  • Accurate scroll-to-text behavior

Without a reliable completion signal:

  • Downstream logic that runs once cannot be safely undone or re-run if it fired too early
  • Consumers must build fragile MutationObserver logic
  • Bugs appear nondeterministic and hard to reproduce

Minimal reproduction

Repository demonstrating the issue:

👉 https://github.com/rano667/pdf-text-extraction

The example shows that:

  • TextLayer spans are inserted incrementally
  • There is no reliable point at which all spans can be considered “final”
  • Any DOM-dependent post-processing must rely on heuristics

What would help (suggestions, not demands)

Any of the following would significantly improve reliability:

  1. An explicit TextLayer completion signal (per page or global), for example:

    • onTextLayerRenderComplete({ pageNumber })
    • onAllTextLayersRendered()
  2. A documented contract clarifying:

    • When onRenderTextLayerSuccess fires relative to span insertion
    • Whether late DOM mutations are expected
  3. An internal lifecycle hook exposed to consumers indicating:

    • “All pages’ text layers have finished rendering”

Even a best-effort signal would be better than none.
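
To make the request concrete, here is a rough userland approximation of what an `onAllTextLayersRendered`-style signal could look like when built on top of the existing per-page `onRenderTextLayerSuccess` callback. `TextLayerTracker`, `reset`, and `markRendered` are hypothetical names, not react-pdf API, and the sketch inherits the open question above: it is only as reliable as `onRenderTextLayerSuccess` itself, so it does not rule out late DOM mutations.

```typescript
// Userland aggregation of per-page text layer callbacks into one signal.
// TextLayerTracker and its methods are hypothetical, not react-pdf API.

class TextLayerTracker {
  private expected = 0;
  private done = new Set<number>();
  private generation = 0;
  private onAllRendered: (generation: number) => void;

  constructor(onAllRendered: (generation: number) => void) {
    this.onAllRendered = onAllRendered;
  }

  // Call when the set of rendered pages or the scale changes.
  // Returns a generation token to pass along with each page callback.
  reset(expectedPages: number): number {
    this.expected = expectedPages;
    this.done.clear();
    this.generation += 1;
    return this.generation;
  }

  // Call from each page's onRenderTextLayerSuccess handler.
  markRendered(pageNumber: number, generation: number): void {
    if (generation !== this.generation) return; // stale callback from an earlier render pass
    if (this.done.has(pageNumber)) return; // ignore duplicate reports for the same page
    this.done.add(pageNumber);
    if (this.expected > 0 && this.done.size === this.expected) {
      this.onAllRendered(generation);
    }
  }
}
```

In a component, one would presumably call `reset(numPages)` after the document loads and again on every scale change, then pass `() => tracker.markRendered(pageNumber, generation)` as each page's `onRenderTextLayerSuccess`. The generation token is there so that callbacks from a previous scale cannot complete the current pass.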


Environment

{
  "react": "^19.2.0",
  "react-pdf": "^10.2.0"
}


Labels: enhancement (New feature or request)
