OfficeIMO.Pdf Support Matrix

This matrix describes the current first-party OfficeIMO.Pdf state. It is intentionally blunt so PSWriteOffice can wrap what exists and avoid promising PSWritePDF/iText parity too early.

OfficeIMO.Drawing is a valid first-party dependency for shared drawing concepts such as colors, font metadata, image metadata, text measurement, and reusable drawing primitives. Prefer that reusable layer when a primitive belongs across OfficeIMO packages, while keeping PDF syntax, page objects, and PDF-specific layout inside OfficeIMO.Pdf.

Status values:

Supported: public API exists and has tests.
Partial: useful capability exists, but scope is deliberately limited.
Planned: roadmap item, no dependable public API yet.
External bridge: exists elsewhere in OfficeIMO but not in the dependency-free PDF engine.

Area	Capability	Status	Current API / Notes
Create	Build a PDF from fluent blocks	Partial	`PdfDoc.Create()`, headings with Word-like spacing-before suppression at fresh page/column starts, paragraphs, rich text with scoped per-run standard font family, font-size, and background/highlight changes, Word-compatible default half-inch paragraph tab stops with `PdfParagraphStyle.DefaultTabStopWidth` overrides, explicit paragraph tab runs with dotted, hyphen, or underscore leaders and left/center/right/decimal alignment through `PdfParagraphBuilder.Tab(PdfTabLeaderStyle.Dots, PdfTabAlignment.DecimalSeparator)`, Word-like flow-object spacing-before suppression at fresh page/column starts, invisible `Spacer(...)` flow gaps, simple bullets/numbering plus rich list item runs through `PdfListItem`, `RichBullets(...)`, and `RichNumbered(...)`, panels, rows/columns, simple tables, JPEG/PNG images, headers, footers, page numbers; page-scoped content compose supports direct `Item(...)` groups, nested element groups, `Spacer(...)` rhythm blocks, and `PageBreak()` page transitions alongside columns and rows
Create	Save to bytes/path/stream workflow	Supported	`ToBytes`, `Save(string)`, `Save(Stream)`, `SaveAsync(string)`, and `SaveAsync(Stream)`
Create	Metadata	Supported	`PdfDoc.Meta(title, author, subject, keywords)`
Create	Page setup	Partial	`PdfOptions.PageSize`, `PdfOptions.Margins`, `PdfOptions.BackgroundColor`, `PageSize.FromInches(...)`, `PageSize.FromCentimeters(...)`, `PageMargins.UniformInches(...)`, `PageMargins.FromInches(...)`, `PageMargins.UniformCentimeters(...)`, `PageMargins.FromCentimeters(...)`, `PdfDoc.Size(...)`, `Margin(...)`, `Margin(PageMargins)`, `Orientation(...)`, `Portrait()`, `Landscape()`, `Background(...)`, top-level `PdfDoc.Page(...)` / `Section(...)`, `PdfDoc.Compose(...Page...)` / `Compose(...Section...)`, and matching `PdfPageCompose` methods provide Word-like size, orientation, margin, page-background color, and scoped flow setup with immediate intrinsic scalar validation and reusable Word-compatible `PageMargins` presets; richer section inheritance, mid-page section breaks, and image/background-shape page fills remain roadmap work
Create	Tables	Partial	Basic styling, proportional standard-font wrapping for cells and captions, rich `PdfTableCell` text runs with scoped color, bold/italic, underline/strike, font size, background/highlight, baseline, tabs, and links rendered through the shared rich text engine, table-cell images through `PdfTableCell.WithImages(...)`, report-friendly `TableStyles.Light()` defaults, Word-like table presets with neutral header/footer separators, including `TableNormal`, `TableGrid`, `TableGridLight`, `PlainTable1`, `GridTable1Light`, and Accent1-6 variants with Word default theme border, separator, and soft band colors for the existing light grid/list styles, canonical style normalization through `TableStyles.GetCanonicalWordStyleName(...)` / `TryGetCanonicalWordStyleName(...)`, canonical display names through `TableStyles.CanonicalWordStyleNames`, accepted aliases through `TableStyles.SupportedWordStyleNames`, row/header/footer separators, side-specific per-cell border overrides with independent side colors, widths, solid/dashed/dotted/dash-dot strokes, two-line borders, and diagonal-up/diagonal-down cell lines, body column fills, per-cell fills, proportional per-cell data bars through `PdfCellDataBar`, per-cell vector icons through `PdfCellIcon`, per-cell padding overrides, column and per-cell horizontal/vertical cell alignment, configurable cell spacing, configurable visual header/footer row counts with render-time bounds validation, optional repeated-header row count through `PdfTableStyle.RepeatHeaderRowCount`, table-wide and per-row minimum heights, table-wide and per-row row-break policies, table left indentation and max-width caps with left/center/right placement, spacing before/after with Word-like spacing-before suppression at fresh page/column starts, keep-together and keep-with-next page flow with matching first-row preflight diagnostics that honor configured column widths, fixed/min/max column widths including proportional fitting for oversized fixed-width tables in top-level and row/column frames, relative column width weights, column-scoped style bounds validation for sizing/fills/horizontal and vertical alignment, OfficeIMO.Drawing-backed auto-fit column sizing with token minimums, initial `PdfTableCell` column spans, row spans, rectangular merged cells with combined-box alignment, overlong row-span validation, row-spanned-cell header/footer boundary validation, row-spanned explicit cell fills/borders, explicit cell fill/data-bar/icon/border/padding/alignment coordinate bounds validation plus row-span and column-span continuation-slot skips, row/header/footer separators, body-column background fills that skip merged-cell continuation columns, row/background fills, and default table border grids that skip row-spanned and rectangular merged-cell interiors, cell-owned URI or named-destination links including linked column/row-spanned cell annotations over the merged text frame in top-level and row/column flows, and cell-owned named-destination anchors through `PdfTableCell.NamedDestinationName` / `WithNamedDestination(...)`, row-by-row pagination, oversized-row line splitting, repeated header rows, caption-plus-first-row overflow diagnostics, generic line-item visual rhythm gates, and PDF-level clipping when cell text would escape its cell rectangle with a small antialiasing tolerance exist; richer merged-cell conflict behavior and report tables are still roadmap work
Create	Rows and columns	Partial	`PdfRowCompose` supports percentage columns with explicit gutters plus reusable `PdfRowStyle` defaults/overrides for Word-like column gutters, optional vertical column separators, row-level spacing, keep-together, and keep-with-next page flow through `PdfOptions.DefaultRowStyle`, `PdfDoc.DefaultRowStyle(...)`, `PdfPageCompose.DefaultRowStyle(...)`, `PdfTheme.RowStyle`, or per-row `Style(...)` / `ColumnSeparator(...)`; column flows can use `Item(...)` groups and `Spacer(...)` for invisible vertical rhythm without fake blank text; the native Word exporter maps Word section columns with explicit column breaks, inline paragraph column breaks, explicit unequal section column widths from Word section properties, Word section column separator lines, and a heading/keep-with-next-aware automatic distribution fallback for multi-column sections without explicit breaks through this same row/column flow; kept rows that exceed the available page content height fail with a clear diagnostic, and richer balanced newspaper-style section flow remains roadmap work
Create	Images	Partial	JPEG and simple non-interlaced 8-bit grayscale/grayscale-alpha/RGB/RGBA PNG placement, including PNG alpha soft masks; image payload validation uses `OfficeIMO.Drawing.OfficeImageReader` and rejects unsupported recognized formats clearly; flow images can use shared `OfficeImageFit` stretch/contain/cover placement, shared `OfficeClipPath` rectangle/rounded/freeform clipping, and optional URI link annotations with contents metadata in top-level, row/column, and table-cell flows
Create	Drawing primitives	Partial	Flow lines, rectangles, rounded rectangles, ellipses, polygons, paths, and simple grouped drawing scenes render from shared `OfficeIMO.Drawing` descriptors with solid fill, two-stop linear gradient fill, simple offset shadow, stroke/width/dash style/line cap/line join/fill and stroke opacity/affine transform/clipping path/alignment/spacing/keep-with-next flow plus optional URI link annotations with contents metadata on generic shape and drawing scene blocks and on vector convenience helpers; richer gradients and richer shape effects remain roadmap work
Create	Headers and footers	Partial	Simple generated page headers and footers support `PdfOptions`, document-level `PdfDoc.Header(...)` / `PdfDoc.Footer(...)`, and page-scoped `PdfPageCompose.Header(...)` / `Footer(...)` configuration with literal text formats, `{page}` / `{pages}` tokens, composed text/token segment builders, Word-like left/center/right text zones through `PdfHeaderCompose.Zones(...)` and `PdfFooterCompose.Zones(...)`, simple images through `Image(...)`, simple shared drawing shapes through `Shape(...)`, plus first-page/even-page text, image, and shape variants through `FirstPageZones(...)`, `EvenPagesZones(...)`, `FirstPageImage(...)`, `EvenPagesImage(...)`, `FirstPageShape(...)`, and `EvenPagesShape(...)`, font, size, text color, alignment, margin-relative offsets with placement validation, first-page overrides, and odd/even page overrides; zone text is measured and rejected when it would overflow or overlap. First/even/odd selection is scoped to the current document or section flow, while visible page tokens continue by default across flows for Word-like section numbering. `PdfOptions.PageNumberStart`, `PdfDoc.PageNumberStart(...)`, and `PdfPageCompose.PageNumberStart(...)` can explicitly restart the visible page number without breaking first/even/odd variant selection, and `PdfOptions.PageNumberStyle`, `PdfDoc.PageNumberStyle(...)`, and `PdfPageCompose.PageNumberStyle(...)` can render decimal, roman, or alphabetic page tokens. The native Word-to-PDF path maps simple default/first/even header and footer text/images/shapes into this model, including left/center/right paragraph alignment, simple text-box text routed through header/footer zones, and simple two-/three-cell header/footer table text/images/shapes through first-party zones; richer table header and footer fidelity remains roadmap work
Create	Themes and styles	Partial	`PdfTheme` bundles default text, paragraph, heading, list, table, panel, rule, image, drawing, and row styles for `PdfOptions.ApplyTheme(...)`, `PdfDoc.Theme(...)`, and `PdfPageCompose.Theme(...)`; `PdfTheme.WordLike()` provides a generic opt-in document theme with neutral typography, readable paragraph/list/table rhythm, heading hierarchy, table footer separators for summary rows, and flow-object spacing without introducing invoice/report-specific engine concepts
Create	Fonts	Partial	Standard PDF fonts only; document defaults, header/footer fonts, default text styles, and rich text runs can select Helvetica, Times, or Courier family variants without embedding. Helvetica and Times family measurement uses built-in glyph-width tables, including common WinAnsi punctuation and accented Latin letters, for generated layout and standard-font readback; TrueType/OpenType embedding is planned
Create	Outlines/bookmarks	Partial	`PdfOptions.CreateOutlineFromHeadings` writes nested PDF outlines from H1/H2/H3 blocks, generic `PdfDoc.Bookmark(...)` / compose `Bookmark(...)` helpers write simple PDF named destinations from the current top-level or row/column flow position with duplicate-name validation, rich `PdfListItem` can anchor per-item named destinations in top-level and row/column list flows, and paragraph `LinkToBookmark(...)` runs plus bookmark-targeted H1/H2/H3 links write internal GoTo annotations targeting those named destinations with missing-target validation
Create	Forms	Partial	`PdfDoc.TextField(...)`, `PdfDoc.CheckBox(...)`, `PdfDoc.ChoiceField(...)`, `PdfDoc.MultiSelectChoiceField(...)`, and `PdfDoc.RadioButtonGroup(...)` write initial simple AcroForm text fields, check boxes, scalar choice fields, multi-select choice fields, and vertical radio button groups in top-level page flow, compose item/element flow, and row/column flow; `PdfTableCell.WithCheckBoxes(...)` writes simple check boxes inside table cells and `PdfTableCell.WithFormFields(...)` writes simple table-cell text and scalar choice fields. Generated fields include visible normal appearance streams, `/Widget` annotations, catalog `/AcroForm`, generated `/NeedAppearances false`, Helvetica default resource registration, button Off/selected appearance states, choice field flags/options, and radio parent/kid widgets; `PdfFormFieldStyle` can set generated background, border, text, and button mark colors plus border width; generated fields are immediately readable by `PdfInspector` / `PdfLogicalDocument` and can be filled or filled-and-flattened by `PdfFormFiller`. The native Word exporter maps simple body-level and table-cell dropdown, combo box, and date picker content controls to these first-party form primitives. Richer field widgets and broader Word/Excel/PowerPoint mapped form export remain roadmap work
Shared drawing	Color interop	Supported	`PdfColor.FromOfficeColor`, `PdfColor.FromOfficeColorOrNull`, `PdfColor.ToOfficeColor`, and implicit `OfficeColor` to `PdfColor` conversion
Shared drawing	Image metadata	Supported	PDF image validation/rendering stores `OfficeImageInfo` on internal image blocks and uses `OfficeImageReader` for format detection
Shared drawing	Font metadata, text measurement, image fitting, vector descriptors	Partial	Generated table auto-fit sizing uses `OfficeIMO.Drawing.OfficeTextMeasurer`; flow images use `OfficeIMO.Drawing.OfficeImageFit`; flow lines, rectangles, rounded rectangles, ellipses, polygons, paths, and grouped scenes use `OfficeIMO.Drawing.OfficeShape` / `OfficeDrawing`, including shared stroke dash/cap/join, two-stop linear gradient fill, simple offset shadow, fill/stroke opacity, affine transform, and clipping path descriptors; keep expanding this shared layer instead of duplicating reusable primitives inside `OfficeIMO.Pdf`
Read	Load PDF object model	Partial	`PdfReadDocument.Load(byte[]/path/stream)` handles the current pragmatic parser scope and prefers the trailer `/Root` catalog when stale catalog objects remain in the file
Read	Lightweight document probe	Supported	`PdfInspector.Probe(byte[]/path/stream)` returns `PdfDocumentProbe.HeaderVersion`, `HasEncryption`, `HasSignatures`, `HasForms`, `HasAnnotations`, `HasOutlines`, `HasCatalogViewSettings`, `HasPageLabels`, `HasCatalogNameTrees`, `HasNamedDestinations`, `HasOpenActions`, `HasViewerPreferences`, `HasTaggedContent`, `HasXmpMetadata`, `HasCatalogUri`, `HasOutputIntents`, `HasEmbeddedFiles`, `HasOptionalContent`, and `HasActiveContent` without full parsing so wrappers can choose safe read/manipulation paths
Read	Wrapper validation/preflight	Supported	`PdfValidator.Validate(byte[]/path/stream)` and `PdfInspector.Preflight(byte[]/path/stream)` return `IsValid`/`CanRead`, `CanExtractText`, `CanExtractImages`, `CanReadLogicalObjects`, `CanRewrite`, `CanManipulatePages`, `CanFillSimpleFormFields`, `CanFlattenSimpleFormFields`, `CanFillAndFlattenSimpleFormFields`, `Can(PdfPreflightCapability)`, `GetCapabilityDiagnostics(PdfPreflightCapability)`, parsed `DocumentInfo` when available, structured `ReadBlockers` / `RewriteBlockers`, `HasReadBlocker(...)` / `HasRewriteBlocker(...)` helpers, and diagnostics for encrypted, signed, form-bearing, complex-outline-bearing, complex-page-label-bearing, unsupported-catalog-name-tree-bearing, complex-named-destination-name-tree-bearing, complex-open-action-dictionary-bearing, complex-viewer-preference-bearing, complex-XMP-metadata-bearing, complex-catalog-URI-bearing, tagged, complex-output-intent-bearing, complex-embedded-file-bearing, complex-optional-content-bearing, active-content-bearing, invalid rewrite object references, unsupported page content stream filters, invalid, empty, or parser-unsupported inputs; simple direct catalog view settings, simple outlines including simple GoTo action outline entries, simple direct page labels, direct named destinations, simple destination name trees including leaf `/Kids`, destination-array open actions, simple GoTo open-action dictionaries, simple viewer preferences, simple catalog XMP metadata streams, simple catalog URI base dictionaries, simple output intents, simple embedded-file attachment trees, and simple optional-content metadata are detected but no longer block rewrite; image extraction can still be allowed when document inspection succeeds but content-stream filters block text or logical-object extraction, and simple AcroForm fill/flatten gates are reported separately because form PDFs still block generic page-rewrite helpers while dedicated form helpers can support a narrower safe path
Read	Page count, page sizes, and rotation	Supported	`PdfInspector.Inspect(byte[]/path/stream)` and `InspectPageRanges(byte[]/path/stream, PdfPageRange...)` return `PdfDocumentInfo.PageCount`, `HeaderVersion`, `PdfPageInfo` geometry, `RotationDegrees`, page-level link annotations, page-level AcroForm widget annotations when readable, signature marker state, form marker state, annotation marker state, outline marker state, catalog-view-setting marker state, page-label marker state, catalog-name-tree marker state, named-destination marker state, open-action marker state, viewer-preference marker state, tagged-structure marker state, XMP metadata marker state, catalog URI marker state, output-intent marker state, embedded-file marker state, optional-content marker state, and active-content marker state; page-range inspection preserves caller order and overlaps while narrowing page labels, page-resolved outlines, named destinations, open actions, AcroForm fields, and form widgets to selected source pages
Read	Catalog view and identity	Partial	`PdfReadDocument` and `PdfInspector.Inspect(...)` expose simple catalog `CatalogPageMode`, `CatalogPageLayout`, `CatalogVersion`, and `CatalogLanguage`
Read	Page labels	Partial	`PdfReadDocument.PageLabels` and `PdfInspector.Inspect(...).PageLabels` read simple direct catalog `/PageLabels` number trees as generic page-label rules with `StartPageIndex`, `StartPageNumber`, `Style`, `Prefix`, `StartNumber`, `PageLabelCount`, and `HasReadablePageLabels`; rewrite-style copied-page label reindexing follows the trailer-root page tree so stale catalog objects do not skew selected-page labels; complex page-label trees remain marker/blocker-only until richer number-tree support exists
Read	Viewer preferences	Partial	`PdfReadDocument.ViewerPreferences` and `PdfInspector.Inspect(...).ViewerPreferences` read simple catalog `/ViewerPreferences` dictionaries as generic key/value entries with `Count`, `GetValue(...)`, `GetBoolean(...)`, and `HasReadableViewerPreferences`; complex viewer preference graphs remain readable only as markers/blockers until richer typed models exist
Read	Metadata	Supported	`PdfReadDocument.Metadata`, `PdfInspector.Inspect(...).Metadata`, `PdfTextExtractor.GetMetadata(byte[]/path/stream)`
Read	Outlines/bookmarks	Partial	`PdfReadDocument.Outlines` and `PdfInspector.Inspect(...).Outlines` read simple outline trees, indirect destinations, direct/name-tree named-destination targets, and simple GoTo action destinations from the trailer-root catalog; rewrite-style manipulation preserves simple outline trees, including simple GoTo action outline entries, whose destinations point only at copied pages, drops outline trees when a selected-page operation would leave stale outline destinations, and still blocks complex non-GoTo or additional-action outline trees
Read	Named destinations	Partial	`PdfReadDocument.NamedDestinations` and `PdfInspector.Inspect(...).NamedDestinations` read simple direct catalog `/Dests` dictionaries and simple catalog `/Names` `/Dests` name trees, including leaf `/Kids`, exposing `Name`, `PageNumber`, `DestinationTop`, `NamedDestinationCount`, and `NamedDestinationNames`; `PdfReadPage.GetLinkAnnotations()` / `PdfInspector.Inspect(...)` also read simple URI links plus GoTo/direct named-destination page links with distinct `LinkUris` and `LinkDestinationNames`; malformed or unsupported name-tree destinations remain blocked for rewrite
Read	Document open action	Partial	`PdfReadDocument.OpenAction` and `PdfInspector.Inspect(...).OpenAction` read simple destination-array and simple GoTo dictionary `/OpenAction` entries, exposing `ActionType`, `PageNumber`, `DestinationTop`, and `HasReadableOpenAction`; complex open actions remain readable only as markers/blockers until richer action support exists
Read	Text extraction	Partial	`PdfReadDocument.ExtractText`, `PdfReadPage.ExtractText`, `PdfTextExtractor.ExtractAllText(byte[]/path/stream)`, `PdfTextExtractor.ExtractAllTextByPageRanges(byte[]/path/stream, PdfPageRange...)`, `PdfTextExtractor.ExtractTextByPage(byte[]/path/stream)`, and `PdfTextExtractor.ExtractTextByPageRanges(byte[]/path/stream, PdfPageRange...)`; byte/path/stream whole-document extraction can write UTF-8 text to output paths or caller-owned streams, selected range extraction can return one concatenated text result or write one text file/stream for wrapper-style `Convert-PdfToText -Pages`, byte-array/path/stream page extraction can write deterministic `source-page-0001.txt` files, and range-list text extraction preserves caller order plus repeated or overlapping selections while writing selected source-page-numbered files with or without layout options
Read	Text positions/spans	Partial	`PdfReadPage.GetTextSpans()` returns generated standard-font spans with glyph-width-based advances when `/Widths` is omitted, including common WinAnsi punctuation and accented Latin letters
Read	Image extraction	Partial	`PdfImageExtractor.ExtractImages(byte[]/path/stream/document)`, `ExtractImagesByPageRanges(byte[]/path/stream/document, PdfPageRange...)`, and `PdfReadDocument.ExtractImages()` return page image XObjects; byte-array, path, and stream extraction can also write deterministic `source-page-0001-image-0001.png` files for all pages or selected source-page ranges, while range-list image extraction preserves caller order and deduplicates overlapping selections; JPEG images are returned as JPEG files and simple PNG-predictor Flate images as PNG files, including compatible grayscale/RGB Flate images with grayscale `/SMask` alpha as gray-alpha/RGBA PNGs
Read	Logical object model	Partial	`PdfLogicalDocument.Load(byte[]/path/stream, PdfTextLayoutOptions?)`, `LoadPageRanges(byte[]/path/stream, options, PdfPageRange...)`, `From(PdfReadDocument, ...)`, and `FromPageRanges(PdfReadDocument, options, PdfPageRange...)` expose one wrapper-friendly read surface with metadata, selected source pages in caller order when ranges are used, `PagesBySourcePageNumber`, `HasSourcePage(...)`, and `GetPages(...)` helpers that preserve range-selection duplicates, document-level `TextBlocks`, `Headings`, `Paragraphs`, `ListItems`, `Tables`, and `Images`, flattened logical `Elements`, `ElementsByKind`, `ElementsByPageNumber`, `HasElementKind(...)`, and `GetElements(...)` helpers on both documents and pages, line-level text blocks, heuristic headings, list item objects with marker/level/text hints, heuristic paragraph groups, leader rows, detected tables with row/column/cell objects, image XObjects, URI/named-destination link annotation objects with document-level `Links`, `LinksByUri`, `LinksByDestinationName`, `GetLinksByUri(...)`, and `GetLinksByDestinationName(...)`, page-level AcroForm widget objects with current `/AS` and named `/AP /N` normal appearance states, catalog view settings, outlines/bookmarks, page-label rules, named destinations, open actions, viewer preferences, AcroForm `/NeedAppearances`, `/SigFlags`, named signature flag helpers, and `/DA` metadata, simple AcroForm fields with typed `PdfFormFieldKind`, inherited common `/Ff` flag helpers, scalar and array current/default values, selected/default-selected choice-option matching, inherited text `/MaxLen`, inherited AcroForm/field-tree `/DA` default appearance strings, inherited `/Q` text alignment, inherited simple choice `/Opt` options, distinct field page-number helpers, field-local widget page lookups, named, kind-based, and page-number form-field lookup helpers, document-level `FormWidgets`, `FormWidgetsByFieldName`, `FormWidgetsByPageNumber`, `GetFormWidgets(string)`, `GetFormWidgets(int)`, and simple form-widget page/rectangle objects. `PdfLogicalDocument.ToMarkdown(...)`, `PdfLogicalPage.ToMarkdown(...)`, and `PdfTextExtractor.ExtractMarkdown(...)` / `ExtractMarkdownByPage(...)` / `ExtractMarkdownByPageRanges(...)` / `ExtractMarkdownByPageRangesAsDocument(...)` render the same logical model as Markdown with headings, paragraphs, lists, detected tables, image placeholders, optional link/form annotations, UTF-8 output-path/stream helpers, and deterministic per-page `.md` files for wrapper pipelines. Range-based logical loads filter page labels, page-resolved outlines, named destinations, open actions, AcroForm fields, and form widgets to selected source pages while preserving duplicate selected page widgets in caller order. The two-page line-item statement fixture now guards source-page ordering, table readback, totals readback, and selected range ordering through the logical model. This is the first AST-style surface for PSWriteOffice-style workflows, but heading/paragraph/table/list detection remains heuristic rather than a full tagged-PDF or Word-like semantic reconstruction
Read	Simple structure extraction	Partial	`PdfReadPage.ExtractStructured(...)`, `PdfReadDocument.ExtractStructuredPages(...)`, `PdfReadDocument.ExtractHeadingsByPage(...)`, `PdfReadDocument.ExtractListItemsByPage(...)`, `PdfReadDocument.ExtractParagraphsByPage(...)`, `PdfTextExtractor.ExtractStructuredByPage(byte[]/path/stream, options)`, `ExtractStructuredByPageRanges(byte[]/path/stream, options, PdfPageRange...)`, `ExtractHeadingsByPage(byte[]/path/stream, options)`, `ExtractHeadingsByPageRanges(byte[]/path/stream, options, PdfPageRange...)`, `ExtractListItemsByPage(byte[]/path/stream, options)`, `ExtractListItemsByPageRanges(byte[]/path/stream, options, PdfPageRange...)`, `ExtractParagraphsByPage(byte[]/path/stream, options)`, `ExtractParagraphsByPageRanges(byte[]/path/stream, options, PdfPageRange...)`, `PdfTextExtractor.ExtractTablesByPage(byte[]/path/stream, options)`, and `ExtractTablesByPageRanges(byte[]/path/stream, options, PdfPageRange...)` expose column-aware text, heuristic headings, heuristic paragraph groups, list item marker/level hints, dot/hyphen/underscore leader rows that preserve decimal/currency value punctuation, and heuristic table rows/geometry for wrapper-friendly readback while preserving selected source page numbers for heading/list-item/paragraph/table results; `PdfTextExtractor.ExtractTablesByPage(pdfBytes, outputDirectory, baseName, options)`, `ExtractTablesByPage(inputPath, outputDirectory, options)`, `ExtractTablesByPage(stream, outputDirectory, baseName, options)`, and matching `ExtractTablesByPageRanges(...)` overloads write deterministic escaped CSV files per detected table for all pages or selected source-page ranges, including the two-page line-item statement fixture with selected source-page order, line-item rows, and totals guarded for wrapper use
Manipulate	Split by page range	Partial	`PdfPageExtractor.ExtractPageRange(byte[]/path/stream, firstPage, lastPage)`, `ExtractPageRange(..., PdfPageRange)`, `ExtractPageRanges(..., PdfPageRange...)`, `SplitPages(byte[]/path/stream)`, and `SplitPageRanges(..., PdfPageRange...)` return bytes for wrapper pipelines; `PdfPageRange.Parse(...)`, `TryParse(...)`, `ParseMany("1-3,5")`, and `TryParseMany(...)` parse one-based single pages plus inclusive `first-last` / `first..last` range lists while preserving caller order; path and stream split helpers can also write deterministic `source-page-0001.pdf` and `source-pages-0001-0003.pdf` files; simple direct catalog `/PageMode`, `/PageLayout`, `/Version`, `/Lang`, simple direct `/PageLabels` number trees, simple outlines including simple GoTo action outline entries whose destinations point only at copied pages, direct `/Dests` dictionaries, simple `/Names` `/Dests` name trees including leaf `/Kids`, destination-array `/OpenAction` entries, simple GoTo open-action dictionaries, simple `/ViewerPreferences` dictionaries, simple catalog `/Metadata` XMP XML streams, simple catalog `/URI` base dictionaries, simple `/OutputIntents` metadata graphs, simple `/Names` `/EmbeddedFiles` attachment trees, simple catalog `/AF` associated-file arrays, and simple `/OCProperties` optional-content metadata are preserved, with copied-page labels reindexed, stale destinations/open actions pruned, stale outline trees/name-tree destinations dropped, and stale named-destination link annotations removed when their target pages are not copied; the two-page line-item statement fixture now guards split/extract readback through the logical model; currently scoped to PDFs handled by the OfficeIMO parser
Manipulate	Merge PDFs	Partial	`PdfMerger.Merge(byte[]/stream inputs)` and `PdfMerger.MergeFilesToBytes(path inputs)` can return bytes or write to output streams, while `PdfMerger.MergeFiles(...)` writes merged files from `params` paths or enumerable path lists and can write enumerable file-list inputs to output streams for wrapper pipelines; simple direct catalog `/PageMode`, `/PageLayout`, `/Version`, `/Lang`, simple direct `/PageLabels` number trees, simple outline trees including simple GoTo action outline entries, direct `/Dests` dictionaries, simple `/Names` `/Dests` name trees, destination-array `/OpenAction` entries, simple GoTo open-action dictionaries, simple `/ViewerPreferences` dictionaries, simple catalog `/Metadata` XMP XML streams, simple catalog `/URI` base dictionaries, simple `/OutputIntents` metadata graphs, simple `/Names` `/EmbeddedFiles` attachment trees, simple catalog `/AF` associated-file arrays, and simple `/OCProperties` optional-content metadata are preserved from the first source; the two-page line-item statement fixture now guards merge-after-split readback through the logical model; currently scoped to parser-supported PDFs
Manipulate	Extract pages	Partial	`PdfPageExtractor.ExtractPages(byte[]/path/stream, pageNumbers)`, `ExtractPageRange(...)`, and `ExtractPageRanges(..., PdfPageRange...)` create a new PDF from selected pages/ranges in requested order, including repeated selections and overlapping ranges as cloned page objects, and preserve simple reachable URI and named-destination link annotations plus simple direct catalog `/PageMode`, `/PageLayout`, `/Version`, `/Lang`, simple direct `/PageLabels` number trees, simple outlines including simple GoTo action outline entries whose destinations point only at copied pages, direct `/Dests` dictionaries, simple `/Names` `/Dests` name trees including leaf `/Kids`, destination-array `/OpenAction` entries, simple GoTo open-action dictionaries, simple `/ViewerPreferences` dictionaries, simple catalog `/Metadata` XMP XML streams, simple catalog `/URI` base dictionaries, simple `/OutputIntents` metadata graphs, simple `/Names` `/EmbeddedFiles` attachment trees, simple catalog `/AF` associated-file arrays, and simple `/OCProperties` optional-content metadata, with copied-page labels reindexed, stale destinations/open actions pruned, stale outline trees/name-tree destinations dropped, and stale named-destination link annotations removed when their target pages are not copied; helpers can return bytes from path inputs and write byte, stream, or path inputs to caller-owned output streams
Manipulate	Import pages	Partial	`PdfPageImporter.AppendPages`, `PrependPages`, `InsertPages`, `InsertPageRange`, `AppendPageRanges`, `PrependPageRanges`, and `InsertPageRanges` import selected one-based source pages, inclusive source ranges from `firstPage` / `lastPage` pairs or `PdfPageRange`, parsed range lists, repeated selections/ranges as cloned pages, or all source pages when no selection is supplied, before, after, or inside a target PDF using byte-array, path, or stream inputs; `InsertPages`, `InsertPageRange`, and `InsertPageRanges` keep the target document as the primary catalog/metadata source even when inserting at page 1; helpers return bytes, write to paths, or write byte, stream, or path inputs to caller-owned output streams for wrapper pipelines and reuse parser-supported extraction plus merge object-copy behavior
Manipulate	Duplicate pages	Partial	`PdfPageEditor.DuplicatePages(byte[]/path/stream, pageNumbers)`, `DuplicatePageRange(byte[]/path/stream, firstPage, lastPage or PdfPageRange)`, and `DuplicatePageRanges(..., PdfPageRange...)` keep original document order and insert cloned copies immediately after each selected source page, including repeated page selections or repeated/overlapping parsed ranges as repeated clones, with byte-returning path helpers and output stream/path helpers for byte, stream, or path inputs in wrapper pipelines
Manipulate	Move pages	Partial	`PdfPageEditor.MovePages(byte[]/path/stream, insertBeforePageNumber, pageNumbers)`, `MovePageRange(byte[]/path/stream, insertBeforePageNumber, firstPage, lastPage or PdfPageRange)`, and `MovePageRanges(..., PdfPageRange...)` move selected one-based source pages, inclusive page ranges, or parsed range lists as a group in original relative order before another source page, or to the end with `pageCount + 1`; range-list movement treats overlaps as one moved page set and helpers include byte-returning path helpers plus output stream/path helpers for byte, stream, or path inputs in wrapper pipelines
Manipulate	Reorder pages	Partial	`PdfPageEditor.ReorderPages(byte[]/path/stream, pageNumbers)` and `ReorderPageRanges(byte[]/path/stream, PdfPageRange...)` create a new PDF containing every page exactly once in the requested order; range-list reorder can reuse `PdfPageRange.ParseMany("3,1-2")` for wrapper grammar, return bytes from file paths, or write byte, stream, or path inputs to output streams
Manipulate	Delete pages	Partial	`PdfPageEditor.DeletePages(byte[]/path/stream, pageNumbers)`, `DeletePageRange(byte[]/path/stream, firstPage, lastPage or PdfPageRange)`, and `DeletePageRanges(..., PdfPageRange...)` create a new PDF without selected pages, one inclusive page range, or a parsed range list; overlapping delete ranges are treated as one deletion set; helpers can return bytes from file paths or write byte, stream, or path inputs to output streams, and deleting every page is rejected
Manipulate	Rotate pages	Partial	`PdfPageEditor.RotatePages(byte[]/path/stream, degrees, pageNumbers)`, `RotatePageRange(byte[]/path/stream, degrees, firstPage, lastPage or PdfPageRange)`, and `RotatePageRanges(..., PdfPageRange...)` set `/Rotate` for selected pages, inclusive page ranges, parsed range lists, or all pages when no selection is supplied; range-list rotation treats overlaps as one selected page set and can return bytes from file paths or write byte, stream, or path inputs to output streams
Manipulate	Update metadata	Partial	`PdfMetadataEditor.UpdateMetadata(byte[]/stream/path, ...)` and `UpdateMetadataToBytes(path, ...)` preserve unspecified fields, while `ReplaceMetadata(byte[]/stream/path, ...)`, `ReplaceMetadataToBytes(path, ...)`, and path output helpers replace the Info dictionary fields; helpers can write byte, stream, or path inputs to caller-owned output streams, and path helpers can also return bytes
Manipulate	Text/image stamp/watermark	Partial	`PdfStamper.StampText(byte[]/stream/path, ...)`, `StampTextToBytes(path, ...)`, `WatermarkText(byte[]/stream/path, ...)`, `WatermarkTextToBytes(path, ...)`, `StampImage(byte[]/stream/path PDF, byte[]/stream image, ...)`, `StampImageToBytes(path PDF, byte[]/stream image, ...)`, `WatermarkImage(byte[]/stream/path PDF, byte[]/stream image, ...)`, and `WatermarkImageToBytes(path PDF, byte[]/stream image, ...)` append content streams to selected pages, return bytes for wrapper pipelines, and can write byte, stream, or path PDF inputs to paths or caller-owned output streams; `PdfTextStampOptions.UsePageRange(...)` / `UsePageRanges(...)` and `PdfImageStampOptions.UsePageRange(...)` / `UsePageRanges(...)` select inclusive one-based page ranges or parsed range lists from `firstPage` / `lastPage` pairs or `PdfPageRange` without wrappers materializing page arrays, with overlapping range-list selections treated as one page selection set; simple PNG alpha soft masks are preserved for image stamps/watermarks
Forms	Inspect fields	Partial	`PdfInspector.Inspect(...)` and `Preflight(...).DocumentInfo` can list simple AcroForm fields through `PdfDocumentInfo.FormFields`, including document-level `/NeedAppearances`, `/SigFlags`, named `/SigFlags` helpers for signatures-exist and append-only, and `/DA`, fully qualified names, raw field types, typed `PdfFormFieldKind`, simple display `Value`, scalar or array `Values`, simple default display `DefaultValue`, scalar or array `DefaultValues`, selected/default-selected choice-option matching, alternate/mapping names, inherited common `/Ff` flag helpers such as read-only/required/no-export/text/button/choice/signature/button-kind/choice-kind hints, inherited text `/MaxLen`, inherited AcroForm/field-tree `/DA` default appearance strings, inherited `/Q` text alignment, inherited simple choice `/Opt` options with export/display text, distinct widget page numbers per field, field-local `WidgetsByPageNumber` and `GetWidgets(int)` helpers, and simple widget annotation field-name/page/rectangle/current-appearance/normal-appearance-state metadata plus named annotation `/F` flag helpers when readable; `PdfDocumentInfo` and `PdfLogicalDocument` expose `FormFieldsByName`, `FormFieldsByKind`, `FormFieldsByPageNumber`, `FormFieldNames`, `TryGetFormField(...)`, `GetFormFields(PdfFormFieldKind)`, and `GetFormFields(int)` so wrappers can query the same simple fields without hand-scanning raw lists, plus document-level and page-level `FormWidgets`, `FormWidgetsByFieldName`, `FormWidgetsByPageNumber`, `GetFormWidgets(string)`, and `GetFormWidgets(int)` lookup helpers for widget geometry and appearance state; rewrite-style page manipulation remains blocked for form PDFs until broader preservation exists
Forms	Fill fields	Partial	`PdfFormFiller.FillFields(...)` can update simple AcroForm text/choice-style string values and button name values by fully qualified field name from bytes, paths, or streams, accepts choice values as export values or `/Opt` display text when available while storing the export value and painting display text, supports multi-select choice arrays through `PdfFormFieldValue.FromValues(...)`, updates radio button groups by switching only the matching child widget appearance state on, generates simple text-widget normal appearance streams and simple button-widget Off/selected appearance states for widgets with `/Rect`, marks `/NeedAppearances true`, returns bytes from path inputs, writes path inputs to paths or caller-owned output streams, and rejects signed or active-content PDFs; rich widgets, JavaScript actions, and full appearance regeneration remain roadmap work
Forms	Flatten forms	Partial	`PdfFormFiller.FlattenFields(...)` and `FillAndFlattenFields(...)` can paint simple text-widget appearances, simple choice-widget text appearances with `/Opt` display text when available for scalar or array selected values, and simple button-widget normal appearance states into page content, generating minimal button appearances when needed, remove those page annotations, and remove the AcroForm tree for parser-supported PDFs from bytes, paths, or streams; helpers return bytes from path inputs and write path inputs to paths or caller-owned output streams; rich/custom appearances, JavaScript actions, and safe complex form preservation remain roadmap work
Security	Encryption/signatures/redaction	Partial	`PdfInspector.Probe` reports encryption/signature/form/outline/catalog-view-setting/page-label/catalog-name-tree/named-destination/open-action/viewer-preference/tagged-structure/XMP-metadata/catalog-URI/output-intent/embedded-file/optional-content/active-content markers and `PdfInspector.Preflight` turns unsupported markers into read/rewrite decisions with diagnostics plus structured `PdfReadBlockerKind` and `PdfRewriteBlockerKind` entries; encrypted PDFs fail with a clear unsupported diagnostic for parser-supported read/manipulation flows; signed PDFs, form PDFs, complex outline PDFs, complex page-label PDFs, unsupported catalog name-tree PDFs, malformed or unsupported named-destination name-tree PDFs, complex open-action dictionary PDFs, complex viewer-preference PDFs, complex XMP metadata PDFs, complex catalog URI PDFs, tagged PDFs, complex output-intent PDFs, complex embedded-file/associated-file PDFs, complex optional-content PDFs, and active-content PDFs are blocked for rewrite-style manipulation. Simple direct catalog view settings, simple outlines including simple GoTo action outline entries, simple direct page labels, direct named destinations, simple destination name trees including leaf `/Kids`, destination-array open actions, simple GoTo open-action dictionaries, simple viewer preferences, simple catalog XMP metadata streams, simple catalog URI base dictionaries, simple output intents, simple embedded-file attachment trees, simple associated-file arrays, and simple optional-content metadata are preserved. Creation, validation, redaction, and encrypted reading remain planned
Convert	Word to PDF without QuestPDF	Partial	`OfficeIMO.Word.Pdf` now defaults to the first-party engine; `PdfSaveOptions.PageSize` and `Margins` provide a QuestPDF-free page setup surface using first-party `OfficeIMO.Pdf` geometry types, with explicit `PageSize` geometry preserved unless `PdfSaveOptions.Orientation` is set; the current native path maps basic Word sections, page setup, Word document background color, Word section columns with explicit and inline paragraph column breaks, explicit unequal section column widths, Word section column separator lines, and heading/keep-with-next-aware automatic distribution for multi-column sections without explicit breaks, page breaks, headings including linked headings, paragraphs/runs with common Word/PDF font family requests mapped to standard Helvetica, Times, and Courier PDF families, isolated run color, font-size, superscript/subscript baseline, justified paragraph alignment, text-wrapping breaks, and highlight/background state, paragraph spacing/indents, simple tab stops with leaders/alignment, keep-with-next/keep-lines/widow-control flags, simple shaded and uniform/non-uniform bordered paragraphs, Word horizontal lines and paragraph top/bottom border rules, simple level-0 bullet/decimal lists with rich list-item runs, list-item bookmarks, links/bookmarks with tooltip metadata, generated table-of-contents entries with internal links to heading destinations, heading-based PDF outlines, footnote/endnote markers, simple tables with supported Word table style presets, rich text runs inside table cells, default and per-cell table margins, table cell spacing, table-level borders, uniform/non-uniform, double, and diagonal cell borders, uniform and non-uniform row heights, row-level break policies, preferred DXA table widths that fit into narrower native PDF column frames, explicit autofit-to-contents tables, cell fills, left/center/right table placement, uniform column and non-uniform per-cell horizontal/vertical 
alignment, simple merged cells, separated first-row visual table styling and repeated leading table header rows, and linked cells including linked merged cells, paragraph-aligned images, simple VML shapes plus the DrawingML preset flow shapes exposed by `WordShape`, simple body text boxes rendered through first-party panel paragraphs, simple body, table-cell, header, and footer picture content controls rendered as first-party PDF images, simple body repeating-section text items rendered as ordinary first-party PDF paragraphs, simple table-cell repeating-section text items rendered as first-party rich table-cell text, simple header/footer repeating-section text items rendered as first-party zone text, simple header/footer text boxes with extractable text routed through first-party zones, simple inline body/table/header/footer text content controls, simple body-level and table-cell Word check boxes as inspectable PDF AcroForm check boxes with readback and Poppler raster-baseline coverage in the native Word report fixture, simple body-level and table-cell Word dropdown, combo box, and date picker content controls as inspectable PDF AcroForm choice/text fields, simple header/footer Word check boxes, dropdowns, combo boxes, and date pickers as static first-party zone text, simple default/first/even header and footer text/images/shapes with left/center/right paragraph alignment, Word PAGE/NUMPAGES header/footer fields and their simple numeric format switches, and simple header/footer table-cell text/images/shapes mapped to first-party zones, simple footnote/endnote markers with end-of-section note text, metadata, and page-number footer settings including Word section page-number starts/styles into `OfficeIMO.Pdf`; the Poppler lane now includes a daily-layout Word fixture covering TOC, margins, page background color, columns including inline column breaks, separator lines, fonts, colors, lists, links, images, headers/footers, and a table inside the column flow. 
`PdfSaveOptions.Warnings` records unsupported native header/footer visual content such as shapes without supported geometry, text boxes without extractable text, SmartArt, equations, unsupported content controls, and embedded documents, plus unsupported body SmartArt, equations, unsupported header/footer content controls, embedded documents, and unhandled body elements that are not yet faithfully mapped. The old QuestPDF/SkiaSharp engine path has been removed from `OfficeIMO.Word.Pdf`; remaining work is fidelity and coverage in the first-party exporter
Convert	Excel to PDF	Partial	`OfficeIMO.Excel.Pdf` provides the first Excel-to-PDF package surface. The exporter maps selected or all visible workbook worksheets into first-party `OfficeIMO.Pdf` headings and tables, honors worksheet print areas, worksheet orientation, worksheet margins, hidden workbook worksheet filtering for default all-sheet exports, hidden worksheet rows and columns, repeated print-title rows through the PDF table header model, manual worksheet row and column page breaks as explicit PDF page breaks while preserving repeated header/title rows across split table chunks, simple worksheet header/footer text zones with first-page and even-page text variants plus page-number, page-count, sheet-name, date, time, workbook file-name, and workbook path tokens, simple line-level header/footer font family/style, font size, and RGB text color when representable as one first-party PDF header/footer line style, and supported header/footer images, worksheet merged cells through PDF table column/row spans, supported worksheet drawing images anchored into exported PDF table cells when the anchor cell is exported and otherwise emitted as PDF flow images in anchor order, supported column/bar/line/area/scatter/radar/pie/doughnut worksheet chart families as first-party vector drawing snapshots when chart data can be read, and common number formats plus basic explicit cell font emphasis, font color, fill color, two-color conditional color-scale fills, conditional data bars, conditional icon-set indicators, horizontal/vertical alignment, simple cell borders including dashed, dotted, dash-dot, double, and diagonal strokes, external cell hyperlinks, internal workbook links as sheet-level PDF named destinations, explicit worksheet column widths, explicit worksheet row heights, manual worksheet print scale, and fit-to-width table sizing through first-party table/rich-text/image primitives; supports explicit page size/margin options through reusable PDF geometry types; can 
return bytes or write to paths/streams; and now has a Poppler raster baseline for a daily two-sheet workbook covering worksheet header/footer text/images, orientation/margins, merged title cells, fills/borders, number formats, explicit row/column sizing, hidden row/column filtering, anchored worksheet images, chart snapshots, and internal/external links. `ExcelPdfSaveOptions.Warnings` records unsupported or simplified export features such as mixed or rich per-run worksheet header/footer formatting, unsupported or unreadable worksheet/header/footer images, unsupported or unreadable chart snapshots, and row truncation from `MaxRowsPerSheet`. Richer worksheet header/footer formatting beyond the current line-level style mapping, cell-specific internal workbook-link destinations, fit-to-height and automatic multi-page pagination/scaling, richer worksheet image placement fidelity beyond exported table-cell anchors, richer chart fidelity beyond initial column/bar/line/area/scatter/radar/pie/doughnut snapshots, richer cell style fidelity such as additional conditional formats and locale-specific formats, richer merged-cell edge cases, and broader unsupported-feature diagnostics remain roadmap work
Convert	PowerPoint to PDF	Planned	Later phases after the PDF layout engine matures

Word-to-PDF equation note: simple OMML equations with extractable math text are mapped as static first-party PDF text in body paragraphs, table cells, headers, and footers. Equation warnings in the convert row refer to equations without extractable text.
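
The page-order rules in the Manipulate rows (range lists such as `3,1-2`, moving a group before a page or to the end with `pageCount + 1`, and inserting clones right after each selected page) can be checked against a small model. The sketch below is an illustration only: `PageOrderModel` and its methods are hypothetical names, it works on one-based page numbers rather than PDF objects, and rejecting descending ranges or out-of-range selections is an assumption of the model, not documented `PdfPageRange.ParseMany` or `PdfPageEditor` behavior.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model of the documented page-order semantics.
// It is not the OfficeIMO.Pdf implementation and touches no PDF objects.
public static class PageOrderModel
{
    // Expands a range list such as "3,1-2" into one-based page numbers in requested order.
    public static List<int> ExpandRanges(string text)
    {
        var pages = new List<int>();
        foreach (var part in text.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries))
        {
            var bounds = part.Trim().Split('-');
            int first = int.Parse(bounds[0]);
            int last = bounds.Length > 1 ? int.Parse(bounds[1]) : first;
            // Model assumption: ranges are one-based, inclusive, and ascending.
            if (first < 1 || last < first)
                throw new FormatException("Invalid page range: " + part);
            for (int page = first; page <= last; page++) pages.Add(page);
        }
        return pages;
    }

    // Moves the selected pages as one group, in original relative order, before
    // insertBefore. insertBefore == pageCount + 1 means "move to the end".
    // Overlapping or repeated selections collapse into one moved page set.
    public static List<int> Move(int pageCount, int insertBefore, IEnumerable<int> selection)
    {
        if (insertBefore < 1 || insertBefore > pageCount + 1)
            throw new ArgumentOutOfRangeException(nameof(insertBefore));
        var selected = new HashSet<int>(selection);
        if (selected.Any(p => p < 1 || p > pageCount))
            throw new ArgumentOutOfRangeException(nameof(selection));

        var moved = Enumerable.Range(1, pageCount).Where(p => selected.Contains(p)).ToList();
        var result = new List<int>();
        for (int page = 1; page <= pageCount; page++)
        {
            if (page == insertBefore) result.AddRange(moved);
            if (!selected.Contains(page)) result.Add(page);
        }
        if (insertBefore == pageCount + 1) result.AddRange(moved);
        return result;
    }

    // Keeps original order and inserts one clone after a page for every time
    // that page appears in the selection.
    public static List<int> Duplicate(int pageCount, IEnumerable<int> selection)
    {
        var copies = selection.GroupBy(p => p).ToDictionary(g => g.Key, g => g.Count());
        var result = new List<int>();
        for (int page = 1; page <= pageCount; page++)
        {
            result.Add(page);
            if (copies.TryGetValue(page, out int count))
                for (int i = 0; i < count; i++) result.Add(page);
        }
        return result;
    }
}
```

In this model, `Move(5, 6, ExpandRanges("2-3,3"))` yields the page order 1, 4, 5, 2, 3, and `Duplicate(3, new[] { 2, 2 })` yields 1, 2, 2, 2, 3. A wrapper test suite could compare such expected orders against the page order produced by the real helpers.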

Wrapper Guidance

For PSWriteOffice parity work, call `PdfInspector.Preflight` before read or rewrite-style operations. Wrap rows marked Supported directly, and expose Partial rows only under names that make their limited scope clear. Prefer the direct capability gates for command dispatch: `CanExtractText` for text/structured text readback, `CanExtractImages` for image extraction, `CanReadLogicalObjects` for `PdfLogicalDocument` PDF-to-object conversion, `CanManipulatePages` for extract/split/merge/import/edit/stamp/metadata rewrite, `CanFillSimpleFormFields` for simple AcroForm value updates, and `CanFlattenSimpleFormFields` or `CanFillAndFlattenSimpleFormFields` for simple text/choice/button-widget flattening. Use `Can(PdfPreflightCapability)` and `GetCapabilityDiagnostics(PdfPreflightCapability)` when writing generic wrappers, and keep `HasReadBlocker(...)`, `HasRewriteBlocker(...)`, `ReadBlockers`, and `RewriteBlockers` for advanced user-facing explanation, with `Diagnostics` as the readable log/error text. Creation wrappers should expose Word-like primitives such as document defaults, sections, paragraphs, tables, drawings, images, headers, footers, and page setup rather than template nouns such as invoices or statements. For Word-like table style pickers, show `TableStyles.CanonicalWordStyleNames`, accept `TableStyles.SupportedWordStyleNames`, and normalize caller input with `TableStyles.GetCanonicalWordStyleName(...)` or `TryGetCanonicalWordStyleName(...)` before storing wrapper configuration.
The following can now be wrapped as early capabilities: page extraction, range extraction, splitting, basic merge, duplicate, move, delete, reorder, rotate, metadata editing, and text/image stamp/watermark operations; simple form-field inventory (typed field kind, common flag helpers, scalar or array current/default values, selected/default-selected choice options, text max length, and choice options); simple AcroForm value fill with basic text-widget appearances; and simple text/choice/button-widget flattening. Richer image transparency cases, full appearance regeneration, complex form flattening, and advanced page editing should stay behind feature work until the import/edit pipeline is stronger.
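
The gate-then-dispatch pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions: `PreflightSnapshot` is a hypothetical stand-in whose property names mirror the gates named in this guidance, and `WrapperGate.RunIfAllowed` is a hypothetical helper. The actual type returned by `PdfInspector.Preflight`, and whether its gates are properties or methods, should be checked against the library.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a preflight result. Only the member names come
// from the guidance; the real OfficeIMO.Pdf type may be shaped differently.
public sealed class PreflightSnapshot
{
    public bool CanManipulatePages { get; set; }
    public bool CanFillSimpleFormFields { get; set; }
    public IReadOnlyList<string> Diagnostics { get; set; } = Array.Empty<string>();
}

public static class WrapperGate
{
    // Runs the operation only when its capability gate is open. When the gate
    // is closed, the diagnostics become the readable error text for the caller.
    public static byte[] RunIfAllowed(
        bool gate,
        string operationName,
        PreflightSnapshot preflight,
        Func<byte[]> operation)
    {
        if (!gate)
        {
            string reason = preflight.Diagnostics.Count > 0
                ? string.Join("; ", preflight.Diagnostics)
                : "no diagnostics reported";
            throw new NotSupportedException(
                operationName + " is blocked by preflight: " + reason);
        }
        return operation();
    }
}
```

A wrapper command would pass the gate that matches its row in the matrix, for example `preflight.CanManipulatePages` for a rotate command, so a blocked PDF fails before any rewrite is attempted and the user sees the diagnostics rather than a parser error.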
