
view outline fails on WPS-created .docx with numeric style IDs, and ignores explicit outlineLvl #163

Description

@JinCheng666

officecli view outline fails on WPS-created .docx documents with numeric style IDs, and ignores explicit outlineLvl

Environment

  • officecli version: 1.0.117
  • OS: Linux
  • Document source: Created and edited in WPS Office 12.1.2 (Chinese version, .docx format, template: Normal)

Summary

officecli view outline returns zero headings for .docx files created by WPS Office (Chinese version), for two reasons:

  1. WPS uses numeric style IDs (2, 3, 4) for heading styles, instead of the standard English names (Heading1, Heading2, Heading3) that officecli expects. The /styles part is empty — WPS does not emit explicit <w:style> definitions.
  2. view outline completely ignores outlineLvl. Even when paragraphs have an explicit OOXML outline level (e.g., outlineLvl=2 on a Normal paragraph), they are not included in the outline tree.

Steps to Reproduce

  1. Create a .docx in WPS Office (Chinese version) with heading styles applied to multiple paragraphs
  2. Run officecli view test.docx outline:
$ officecli view test.docx outline
File: test.docx | 1956 paragraphs | 79 tables | 86 images | 19 OLE objects
Footer: "5-[PAGE]"
# ← No outline tree — only file stats and footers
  3. JSON confirms the headings array is empty:
$ officecli view test.docx outline --json
{
  "success": true,
  "data": {
    "paragraphs": 1956,
    "headings": []    // ← always empty for WPS documents
  }
}

Actual vs Expected

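| Check                       | Actual             | Expected                             |
|-----------------------------|--------------------|--------------------------------------|
| view outline (WPS doc)      | Empty tree         | Full outline with all heading levels |
| view outline --json         | "headings": []     | Array of heading nodes               |
| Paragraph with outlineLvl=2 | Ignored by outline | Included in outline                  |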

Technical Analysis

WPS numeric style IDs

In this 1956-paragraph WPS document, the heading styles are represented by numeric IDs:

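| w:pStyle w:val | Count | Heading role | Text example                              |
|----------------|-------|--------------|-------------------------------------------|
| 2              | 1     | H1           | "5 工程布置及建筑物"                      |
| 3              | 13    | H2           | "5.1 设计依据", "5.2 工程等级和标准"      |
| 4              | 38    | H3           | "5.1.1 各主管部门...", "5.2.1 工程等别..." |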

officecli get correctly reports styleId=2, styleId=3, styleId=4 on these paragraphs, but view outline does not recognize them as headings.
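
For anyone who wants to check another document for the same pattern without officecli, the tally above can be approximated with a short Python sketch. It assumes the main document part lives at word/document.xml (the usual location, though the package relationships are what formally define it). The function name count_pstyles is hypothetical and is not part of officecli.

```python
import zipfile
from collections import Counter
import xml.etree.ElementTree as ET

# WordprocessingML main namespace.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_pstyles(docx_file):
    """Count w:pStyle values over every w:p in the main document part.

    Assumption: the main part is stored as word/document.xml.
    Paragraphs without a w:pStyle are counted under "(none)".
    """
    with zipfile.ZipFile(docx_file) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    counts = Counter()
    for p in root.iter(f"{{{W}}}p"):
        style = p.find(f"{{{W}}}pPr/{{{W}}}pStyle")
        key = style.get(f"{{{W}}}val") if style is not None else "(none)"
        counts[key] += 1
    return counts
```

Calling count_pstyles("test.docx") on a document like the one described here should show purely numeric keys such as "2", "3", and "4" alongside the body-text styles.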

outlineLvl is ignored

A paragraph was manually edited to style=Normal, outlineLvl=2 (explicit outline level without heading style):

$ officecli get doc.docx '/body/p[69]' --depth 0
/body/p[@paraId=4FA1DC66] (paragraph) "5.1.1 ..." style=Normal ... outlineLvl=2 ...

get and query correctly return outlineLvl=2, and the property is documented in the schema:

$ officecli help docx paragraph | grep outlineLvl
  outlineLvl   number   [add/set/get]
    description: outline level (0-9). Used by Word's TOC and document map.

However, view outline does not include this paragraph in the outline tree.

Empty styles part

$ officecli raw doc.docx /styles
(no styles)

WPS does not write explicit <w:style> definitions for the standard styles, so there is no w:name attribute to cross-reference the numeric IDs.

Proposed Fix

  1. Use outlineLvl as the primary signal for view outline. In OOXML, every paragraph that contributes to the document outline has an outline level — whether inherited implicitly from a Heading style or set explicitly via w:outlineLvl. This is the robust, standards-based approach and would fix both the WPS compatibility issue and the manual outline level case.

  2. Fallback: If outlineLvl is absent, attempt to infer it from the paragraph style. This fallback should handle:

    • Standard English style names (Heading1, Heading2, …)
    • Numeric style IDs that reference known heading styles (via w:style definitions in /styles, when the part contains any)
    • Alternate heading style patterns used by non-English Office suites
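
The resolution order proposed above could look roughly like the following Python sketch. It is an illustration of the idea, not officecli's implementation. All function names are hypothetical, and it assumes that w:outlineLvl values are 0-based with 9 meaning body text. A paragraph-level outlineLvl wins. Otherwise the style ID is looked up in a map built from /styles (empty for the WPS file in this report). Otherwise the style ID itself is matched against the English "HeadingN" pattern.

```python
import re
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}
# Matches "Heading1" (style ID) as well as "heading 1" (style name).
HEADING_NAME = re.compile(r"^heading\s*([1-9])$", re.IGNORECASE)


def _val(elem):
    """Return the w:val attribute of an element, or None."""
    return None if elem is None else elem.get(f"{{{W}}}val")


def style_levels_from_styles(styles_root):
    """Map styleId -> 0-based outline level from <w:style> definitions."""
    levels = {}
    if styles_root is None:  # e.g. no usable styles part
        return levels
    for style in styles_root.findall("w:style", NS):
        style_id = style.get(f"{{{W}}}styleId")
        lvl = _val(style.find("w:pPr/w:outlineLvl", NS))
        name = (_val(style.find("w:name", NS)) or "").strip()
        match = HEADING_NAME.match(name)
        if lvl is not None and lvl.isdigit():
            levels[style_id] = int(lvl)
        elif match:
            levels[style_id] = int(match.group(1)) - 1
    return levels


def resolve_outline_level(p, style_levels):
    """Return the 0-based outline level of a w:p, or None for body text."""
    # 1. Primary signal: outlineLvl set directly on the paragraph.
    lvl = _val(p.find("w:pPr/w:outlineLvl", NS))
    if lvl is not None and lvl.isdigit():
        n = int(lvl)
        return n if n <= 8 else None  # assumption: 9 means body text
    # 2. Fallback: level inherited from the referenced style definition.
    style_id = _val(p.find("w:pPr/w:pStyle", NS))
    if style_id is None:
        return None
    if style_id in style_levels:
        n = style_levels[style_id]
        return n if n <= 8 else None
    # 3. Last resort: the style ID itself looks like "Heading2".
    match = HEADING_NAME.match(style_id)
    return int(match.group(1)) - 1 if match else None
```

With an empty style map, a paragraph whose only marker is a numeric w:pStyle such as 3 still resolves to None in this sketch. Handling that case would need an additional mapping for WPS numeric IDs, which this report only observes for a single document (2, 3, 4 acting as H1, H2, H3) and which should not be assumed to hold in general.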
