Progressive loading is a feature that allows PDF documents to be loaded page-by-page instead of loading all pages at once. This is particularly useful for large PDF files, as it significantly reduces initial load time and memory usage.
When you open a PDF document, pdfrx can operate in two modes:
- Standard Loading (default): All pages are loaded immediately when the document is opened
- Progressive Loading: Only the first page is loaded initially, and additional pages are loaded on demand
Progressive loading is especially beneficial when:
- Working with large PDF files (hundreds of pages)
- Memory is constrained
- You want faster initial document load times
- Users typically don't view all pages in a session
Note: PdfViewer uses progressive loading by default, while PdfDocument requires explicit opt-in.
When progressive loading is enabled, pages can be in one of two states:
Pages that are fully loaded have complete information:
- Accurate width, height, and rotation values
- Can be rendered properly
- Text extraction works (loadText(), loadStructuredText())
- Link extraction works (loadLinks())
- isLoaded property returns true
Pages that haven't been loaded yet have limited functionality:
- width, height, and rotation are estimated values (may be incorrect)
- Rendering produces an empty page with the specified background color
- Text extraction returns null
- Link extraction returns an empty list
- isLoaded property returns false
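As a small illustration of the two states, the sketch below formats a one-line summary of a page from the properties listed above (pageNumber, isLoaded, width, height). The helper names describePageState and describePage are hypothetical and not part of pdfrx; the only point is that dimensions read from an unloaded page should be labeled as estimates.

```dart
import 'package:pdfrx_engine/pdfrx_engine.dart';

/// Hypothetical helper: builds a summary from plain values so the logic
/// can be exercised without opening a document.
String describePageState({
  required int pageNumber,
  required bool isLoaded,
  required double width,
  required double height,
}) {
  final size = '${width.toStringAsFixed(1)} x ${height.toStringAsFixed(1)}';
  // Dimensions of an unloaded page are estimates, so say so in the output.
  return isLoaded
      ? 'Page $pageNumber: loaded, size $size'
      : 'Page $pageNumber: not loaded, estimated size $size';
}

/// Hypothetical wrapper that reads the values from a PdfPage.
String describePage(PdfPage page) => describePageState(
      pageNumber: page.pageNumber,
      isLoaded: page.isLoaded,
      width: page.width,
      height: page.height,
    );
```

Calling describePage(document.pages[1]) right after opening a document with progressive loading would be expected to report the page as not loaded.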
PdfViewer uses progressive loading by default (useProgressiveLoading: true). This means PDF documents are loaded page-by-page automatically as you scroll, providing optimal performance for large files.
All PdfViewer constructors support the useProgressiveLoading parameter:
// Uses progressive loading by default
PdfViewer.file('path/to/document.pdf')
// Explicitly enable progressive loading (same as default)
PdfViewer.asset(
  'assets/large-document.pdf',
  useProgressiveLoading: true,
)
// Disable progressive loading (load all pages at once)
PdfViewer.uri(
  Uri.parse('https://example.com/document.pdf'),
  useProgressiveLoading: false,
)
When using PdfViewer, progressive loading happens automatically in the background as you scroll through the document. You don't need to manually call loadPagesProgressively().
When using PdfDocument directly (without PdfViewer), progressive loading is disabled by default (useProgressiveLoading: false). You need to explicitly enable it:
import 'package:pdfrx_engine/pdfrx_engine.dart';
// Open a document with progressive loading enabled
final document = await PdfDocument.openFile(
  'path/to/document.pdf',
  useProgressiveLoading: true,
);
// At this point, only the first page is loaded
print('First page loaded: ${document.pages[0].isLoaded}'); // true
print('Second page loaded: ${document.pages[1].isLoaded}'); // false
Use the isLoaded property to check if a page is fully loaded:
final page = document.pages[5];
if (page.isLoaded) {
  // Page is fully loaded - all operations work normally
  final text = await page.loadText();
  print(text?.text);
} else {
  // Page is not loaded yet - dimensions may be estimates
  print('Page not loaded yet');
}
Use the ensureLoaded() extension method to wait for a specific page to load. This method waits indefinitely and always returns a loaded page:
final page = document.pages[10];
// Wait for the page to load (waits indefinitely, never returns null)
final loadedPage = await page.ensureLoaded();
final text = await loadedPage.loadText();
print(text?.text);
If you need to set a timeout, use waitForLoaded() instead. This method returns null if the timeout occurs:
final page = document.pages[10];
// Wait for the page to load with a timeout
final loadedPage = await page.waitForLoaded(
  timeout: Duration(seconds: 5),
);
if (loadedPage != null) {
  // Page loaded successfully
  final text = await loadedPage.loadText();
  print(text?.text);
} else {
  // Timeout occurred
  print('Page failed to load within timeout');
}
Important: The ensureLoaded() method may return a different instance of PdfPage than the original. Always use the returned instance:
// ❌ WRONG - using the old page instance
final page = document.pages[10];
await page.ensureLoaded();
final text = await page.loadText(); // May not work as expected
// ✅ CORRECT - using the returned loaded page instance
final page = document.pages[10];
final loadedPage = await page.ensureLoaded();
final text = await loadedPage.loadText(); // Works correctly
You can listen to page status changes using the events stream. The event provides the latest page instance directly via the page property:
final page = document.pages[5];
// Listen for status changes on this specific page
page.events.listen((change) {
  // The change.page property provides the newest page instance
  final updatedPage = change.page;
  print('Page ${updatedPage.pageNumber} status changed');
  print('Is loaded: ${updatedPage.isLoaded}');
  if (updatedPage.isLoaded) {
    print('Page dimensions: ${updatedPage.width} x ${updatedPage.height}');
  }
});
The latestPageStream provides the most recent page instance whenever the page status changes:
final page = document.pages[10];
page.latestPageStream.listen((latestPage) {
  print('Page updated, isLoaded: ${latestPage.isLoaded}');
  if (latestPage.isLoaded) {
    // Use the latest loaded instance
  }
});
When using PdfDocument directly (not PdfViewer), you need to manually trigger progressive loading using loadPagesProgressively():
final document = await PdfDocument.openFile(
  'path/to/document.pdf',
  useProgressiveLoading: true,
);
// Load pages progressively with progress callback
await document.loadPagesProgressively(
  onPageLoadProgress: (data, loadedPageCount, totalPageCount) {
    print('Loaded $loadedPageCount of $totalPageCount pages');
    // Return true to continue loading, false to stop
    return true;
  },
  loadUnitDuration: Duration(milliseconds: 250),
);
The callback is invoked periodically (every loadUnitDuration) as pages are loaded. Return false from the callback to stop the loading process early.
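One way to use the return value is to stop loading once a time budget is spent. The sketch below is an illustration under assumptions rather than a pdfrx recipe: loadWithTimeBudget and shouldContinueLoading are hypothetical names, and the callback deliberately ignores its arguments so that only the returned boolean matters.

```dart
import 'package:pdfrx_engine/pdfrx_engine.dart';

/// Hypothetical helper: true while [elapsed] is still under [budget].
bool shouldContinueLoading(Duration elapsed, Duration budget) =>
    elapsed < budget;

/// Hypothetical helper: loads pages until [budget] is used up, then stops.
Future<void> loadWithTimeBudget(PdfDocument document, Duration budget) async {
  final stopwatch = Stopwatch()..start();
  await document.loadPagesProgressively(
    // Returning false from the callback stops loading early.
    onPageLoadProgress: (_, __, ___) =>
        shouldContinueLoading(stopwatch.elapsed, budget),
  );
  // Count how many pages ended up loaded before the budget ran out.
  final loaded = document.pages.where((page) => page.isLoaded).length;
  print('Loaded $loaded of ${document.pages.length} pages');
}
```

Because the check runs only when the callback is invoked, loading can overshoot the budget by up to one load unit.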
Problem: Trying to extract text from an unloaded page returns null.
final page = document.pages[50];
final text = await page.loadText(); // Returns null if page not loaded
Solution: Always wait for the page to load first:
final page = document.pages[50];
final loadedPage = await page.ensureLoaded();
final text = await loadedPage.loadText();
print(text?.text);
Problem: Using width, height, or rotation values from unloaded pages gives estimated values that may be wrong.
final page = document.pages[20];
print('Width: ${page.width}'); // May be an estimate if page is not loaded
Solution: Check isLoaded or use ensureLoaded():
final page = document.pages[20];
final loadedPage = await page.ensureLoaded();
print('Actual width: ${loadedPage.width}');
print('Actual height: ${loadedPage.height}');
Problem: Iterating through all pages without waiting for them to load:
// ❌ WRONG - pages may not be loaded yet
for (final page in document.pages) {
  final text = await page.loadText(); // May return null
  processText(text?.text); // Silently skips unloaded pages
}
Solution: Ensure pages are loaded before processing:
// ✅ CORRECT - ensure each page is loaded
for (final page in document.pages) {
  final loadedPage = await page.ensureLoaded();
  final text = await loadedPage.loadText();
  processText(text?.text);
}
// Alternative: Load all pages first using loadPagesProgressively()
await document.loadPagesProgressively();
for (final page in document.pages) {
  final text = await page.loadText();
  processText(text?.text);
}
Use progressive loading when:
- Working with large PDFs (100+ pages)
- Initial load time is critical
- Users typically view only a few pages
- Memory usage is a concern
- Loading from network (reduces initial bandwidth)
Avoid progressive loading when:
- Working with small PDFs (< 20 pages)
- You need to process all pages immediately
- All pages will be accessed anyway
- Simplicity is preferred over optimization
Progressive loading reduces initial memory usage but doesn't automatically unload pages. Once a page is loaded, it stays in memory until the document is disposed. For very large documents, consider:
- Loading and processing pages in batches
- Disposing and reopening the document periodically if processing thousands of pages
- Using PdfViewer, which handles page lifecycle automatically
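The first two suggestions can be combined: split the page range into batches and reopen the document for each batch, so that pages loaded for earlier batches are released when the old document is disposed. The sketch below is a minimal illustration with hypothetical names (batchRanges, processInBatches) and an arbitrary default batch size of 50. It follows the ensureLoaded() loop shown earlier and assumes, as that loop does, that ensureLoaded() completes for each page. Whether reopening actually lowers peak memory depends on the platform and is not measured here.

```dart
import 'package:pdfrx_engine/pdfrx_engine.dart';

/// Hypothetical helper: splits [pageCount] pages into zero-based ranges
/// (start inclusive, end exclusive) of at most [batchSize] pages each.
List<(int, int)> batchRanges(int pageCount, int batchSize) {
  if (batchSize <= 0) {
    throw ArgumentError.value(batchSize, 'batchSize', 'must be positive');
  }
  return [
    for (var start = 0; start < pageCount; start += batchSize)
      (start, start + batchSize < pageCount ? start + batchSize : pageCount),
  ];
}

/// Hypothetical helper: processes a document batch by batch, reopening it
/// for every batch and disposing it afterwards.
Future<void> processInBatches(String filePath, {int batchSize = 50}) async {
  // Open once just to learn the total page count.
  final probe = await PdfDocument.openFile(
    filePath,
    useProgressiveLoading: true,
  );
  final pageCount = probe.pages.length;
  await probe.dispose();

  for (final (start, end) in batchRanges(pageCount, batchSize)) {
    final document = await PdfDocument.openFile(
      filePath,
      useProgressiveLoading: true,
    );
    try {
      for (var i = start; i < end; i++) {
        // Always work with the instance returned by ensureLoaded().
        final loadedPage = await document.pages[i].ensureLoaded();
        final text = await loadedPage.loadText();
        print('Page ${i + 1}: ${text?.text.length ?? 0} characters');
      }
    } finally {
      await document.dispose();
    }
  }
}
```

Reopening the file has its own cost, so a larger batch size trades memory for fewer reopen cycles.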
Here's a complete example showing how to process all pages in a large PDF with a progress indicator:
import 'package:pdfrx_engine/pdfrx_engine.dart';
Future<void> processPdfPages(String filePath) async {
  // Open document with progressive loading
  final document = await PdfDocument.openFile(
    filePath,
    useProgressiveLoading: true,
  );
  try {
    // Load pages progressively with progress reporting
    await document.loadPagesProgressively(
      onPageLoadProgress: (_, loadedCount, totalCount) {
        final progress = (loadedCount / totalCount * 100).toStringAsFixed(1);
        print('Loading pages: $progress% ($loadedCount/$totalCount)');
        return true; // Continue loading
      },
    );
    // Now all pages are loaded, safe to process
    for (int i = 0; i < document.pages.length; i++) {
      final page = document.pages[i];
      // Extract text from page
      final text = await page.loadText();
      // Preview at most 100 characters; substring(0, 100) throws on shorter text
      final content = text?.text ?? '';
      final preview = content.length > 100 ? content.substring(0, 100) : content;
      print('Page ${i + 1}: $preview...');
      // Extract links
      final links = await page.loadLinks();
      print('Page ${i + 1} has ${links.length} links');
    }
  } finally {
    await document.dispose();
  }
}
- Document Loading Indicator - Show loading progress in UI
- Low-Level PDFium Bindings Access - Advanced PDFium usage
- pdfrx Initialization - Setting up pdfrx