Web Client

Home > Docs > Web Client

The Web client runs OCR entirely in the browser using the IBM Granite Docling 258M model via WebGPU or WASM. No server is required.

Prerequisites

Install the required peer dependencies:

npm install @huggingface/transformers onnxruntime-web

For PDF support, also install:

npm install unpdf

Browser support

WebGPU is the recommended backend for best performance. If WebGPU is unavailable, the client falls back to WASM.

Backend	Performance	Compatibility
`webgpu`	Fast (GPU-accelerated)	Chrome 113+, Edge 113+
`wasm`	Slower (CPU-based)	All modern browsers
`auto`	Tries WebGPU first, falls back to WASM	All modern browsers

Creating the Client

import { createWebClient } from "docling-sdk/web";

const client = createWebClient({ device: "webgpu" });

Or through the main factory:

import { Docling } from "docling-sdk";

const client = new Docling({
  web: {
    device: "webgpu",
    maxNewTokens: 4096,
  },
});

Lifecycle

Initialize

Download and load the model. The first call downloads the model (~500 MB), which is then cached in IndexedDB:

client.on("loading", ({ progress, status }) => {
  console.log(`Loading: ${Math.round(progress * 100)}% - ${status}`);
});

await client.initialize();
// client.ready === true

Destroy

Terminate the Web Worker and release resources:

client.destroy();
// client.ready === false

State properties

Property	Type	Description
`client.ready`	`boolean`	`true` after `initialize()` completes
`client.processing`	`boolean`	`true` while `processImage()` is running

Processing Images

Input types

processImage accepts several input types:

const result = await client.processImage(input);

Input type	Description
`File`	File from `<input type="file">`
`Blob`	Binary data blob
`string`	Data URL or base64 string
`HTMLCanvasElement`	Canvas element
`HTMLImageElement`	Image element
`ImageBitmap`	Decoded image bitmap
`OffscreenCanvas`	Offscreen canvas

Options

const result = await client.processImage(file, {
  maxNewTokens: 8192,  // Override max tokens for this call
});

WebOCRResult

The result object contains all output formats:

const result = await client.processImage(file);

result.raw;        // Raw DocTags markup
result.markdown;   // Converted Markdown
result.html;       // Converted HTML
result.plainText;  // Plain text
result.json;       // Structured WebOCRDocument
result.tables;     // Extracted tables: { headers: string[], rows: string[][] }[]
result.overlays;   // Bounding boxes: { tagType: string, bbox: {...} }[]

Document Conversion

The Web client also implements DoclingClientBase, so you can convert PDFs (requires unpdf peer dependency):

const result = await client.convert(pdfBuffer, "document.pdf", {
  to_formats: ["md"],
});

console.log(result.document.md_content);

Convenience methods:

const md = await client.toMarkdown(pdfBuffer, "doc.pdf");
const html = await client.toHtml(pdfBuffer, "doc.pdf");
const text = await client.extractText(pdfBuffer, "doc.pdf");

Events

Subscribe to events with on and unsubscribe with off:

client.on("loading", ({ progress, status }) => {
  // Model download progress (0-1)
});

client.on("ready", () => {
  // Model loaded and ready
});

client.on("status", ({ status }) => {
  // Processing status updates
});

client.on("stream", ({ chunk, progress }) => {
  // Streaming tokens during inference
  process.stdout.write(chunk);
});

client.on("complete", (result) => {
  // Processing complete, result is WebOCRResult
});

client.on("error", ({ message, code }) => {
  // Error during initialization or processing
});

Event	Data	When
`loading`	`{ progress: number, status: string }`	During model download
`ready`	`undefined`	Model loaded
`status`	`{ status: string }`	Processing status change
`stream`	`{ chunk: string, progress: number }`	Token generated during inference
`complete`	`WebOCRResult`	Processing finished
`error`	`{ message: string, code?: string }`	Error occurred

Cache Management

The model is cached in IndexedDB after the first download:

// Get cache size in bytes
const size = await client.getCacheSize();
console.log(`Cache: ${(size / 1024 / 1024).toFixed(1)} MB`);

// Clear the cache (forces re-download on next initialize())
const cleared = await client.clearCache();

Standalone Utilities

The docling-sdk/web entry point exports converter and extractor functions that work without the full client:

Converters

Convert a WebOCRDocument (the .json field of WebOCRResult) to different formats:

import {
  doclingToHtml,
  doclingToMarkdown,
  doclingToPlainText,
  doclingToJson,
} from "docling-sdk/web";

const html = doclingToHtml(result.json);
const markdown = doclingToMarkdown(result.json);
const text = doclingToPlainText(result.json);
const json = doclingToJson(result.json);

Extractors

Extract structured data from the document:

import { extractTables, tableToCSV, extractOverlays } from "docling-sdk/web";

const tables = extractTables(result.json);
for (const table of tables) {
  console.log(table.headers);
  console.log(table.rows);
  console.log(tableToCSV(table));
}

const overlays = extractOverlays(result.raw);
for (const overlay of overlays) {
  console.log(overlay.tagType, overlay.bbox);
}

PDF renderer

Render PDF pages to images (requires unpdf):

import { renderPdfToImages } from "docling-sdk/web";

const pages = await renderPdfToImages(pdfBuffer);
for (const page of pages) {
  console.log(page.pageNumber, page.width, page.height);
  // page.dataUrl is a data URL you can pass to processImage()
}

Cache utilities

import { clearModelCache, getModelCacheSize } from "docling-sdk/web";

const size = await getModelCacheSize();
await clearModelCache();

Demo Application

The repository includes a full demo application:

npm run examples:web-ocr

This starts a Vite dev server at http://localhost:5173 with a drag-and-drop interface for testing the Web OCR client.

Getting Started -- first Web OCR conversion
Configuration -- Web config options
Cross-Runtime -- browser entry points
API Reference -- full method reference

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Web Client

Prerequisites

Browser support

Creating the Client

Lifecycle

Initialize

Destroy

State properties

Processing Images

Input types

Options

WebOCRResult

Document Conversion

Events

Cache Management

Standalone Utilities

Converters

Extractors

PDF renderer

Cache utilities

Demo Application

Related

FilesExpand file tree

web-client.md

Latest commit

History

web-client.md

File metadata and controls

Web Client

Prerequisites

Browser support

Creating the Client

Lifecycle

Initialize

Destroy

State properties

Processing Images

Input types

Options

WebOCRResult

Document Conversion

Events

Cache Management

Standalone Utilities

Converters

Extractors

PDF renderer

Cache utilities

Demo Application

Related