Add support for PP-DocLayoutV2 by alex-dinh · Pull Request #1619 · PaddlePaddle/Paddle2ONNX

alex-dinh · 2026-01-05T18:48:52Z

Credit to user predict-woo, but their code does not work out of the box. I needed to make more changes to get it working. Here are detailed instructions on how to setup and run the conversion. I am working on macOS, but if anyone else is able to test it out on Windows or Linux, I welcome your input.

Setup steps (macOS):

brew install protobuf (if not installed already)
git clone https://github.com/alex-dinh/Paddle2ONNX
cd Paddle2ONNX
git submodule init
git submodule update
pip install .

(Note: I tried using pip install -e . , but this results in an incorrect working directory when running paddle2onnx from the terminal)

My environment package versions:

paddleocr==3.3.2
paddlepaddle==3.0.0                  
paddlex==3.1.3                  
onnx==1.17.0                                 
onnxoptimizer==0.3.13                 
onnxruntime==1.22.1

Test DocLayoutV2 conversion:

cd <path to DocLayoutV2 folder>
paddle2onnx --model_dir PP-DocLayoutV2 \
            --model_filename inference.json \
            --params_filename inference.pdiparams \
            --save_file PP-DocLayoutV2/inference.onnx

Test ONNX model:

import onnxruntime as ort
import cv2
import numpy as np

def preprocess_image_doclayout(image, target_input_size=(800, 800)):
    """
    Preprocessing for DocLayoutV2 with 800x800 input
    """
    # Get original dimensions
    orig_h, orig_w = image.shape[:2]

    # Resize, do not maintain aspect ratio
    target_h, target_w = target_input_size
    scale_h = target_h / orig_h
    scale_w = target_w / orig_w

    new_h, new_w = int(orig_h * scale_h), int(orig_w * scale_w)
    resized = cv2.resize(image, (new_w, new_h), interpolation=cv2.INTER_LINEAR)

    # Convert to RGB and normalize
    padded = cv2.cvtColor(resized, cv2.COLOR_BGR2RGB)

    # Normalize: scale to [0, 1] then apply ImageNet normalization
    input_blob = padded.astype(np.float32) / 255.0

    # ImageNet normalization
    mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
    std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
    input_blob = (input_blob - mean) / std

    # Transpose to CHW format and add batch dimension
    input_blob = input_blob.transpose(2, 0, 1)[np.newaxis, ...]

    return input_blob, scale_h, scale_w

def paddle_onnx_doclayout():
    model = ort.InferenceSession('./PP-DocLayoutV2/inference.onnx')  # Specify onnx path
    input_names = [i.name for i in model.get_inputs()]
    output_names = [o.name for o in model.get_outputs()]

    image = cv2.imread('test_input.png')  # Specify test input image
    input_blob, scale_h, scale_w = preprocess_image_doclayout(image)
    print(scale_h, scale_w)
    preprocess_shape = [np.array([800, 800], dtype=np.float32)]
    input_feed = {input_names[0]: preprocess_shape,
                  input_names[1]: input_blob,
                  input_names[2]: [[scale_h, scale_w]]}

    output = model.run(output_names, input_feed)[0]
    # First 6 values are [label_index, score, xmin, ymin, xmax, ymax]
    print(output[0])  # (300, 8)

    # Filter out low-confidence boxes
    boxes = output[output[:, 1] > 0.5]
    print(boxes)

if __name__ == '__main__':
    paddle_onnx_doclayout()

On macOS, the outputted model has a lot of unused weight initializers, so I also recommend optimizing the model via onnxoptimizer to avoid warnings and reduce model size. This issue is not present on Linux for some reason (I tested on Zorin OS).

python -m onnxoptimizer inference.onnx inference_optimized.onnx

On macOS, the paddle2onnx conversion produces these warnings, but the model is still functional:

**2026-01-06 15:52:59 [WARNING]	Fail to fold onnx model with error: [ShapeInferenceError] (op_type:BatchNormalization, node name: BatchNormalization.1): [TypeInferenceError] Input 0 expected to have type but instead is null. Skip folding.**
2026-01-06 15:53:00 [INFO]	ONNX model saved in PP-DocLayoutV2/inference.onnx.

paddle-bot · 2026-01-05T18:48:58Z

Thanks for your contribution!

CLAassistant · 2026-01-05T18:48:59Z

All committers have signed the CLA.

alex-dinh · 2026-01-06T18:38:36Z

Looks like even though converting DocLayoutV2 to onnx works, the changes break some tests with the strided_slice and stack ops. Will mark as a draft for now.

alex-dinh · 2026-01-06T20:08:38Z

paddle2onnx/mapper/tensor/slice.cc

Modifying this file from the version prior to this PR (3e77ec7) was unnecessary, reverted it.

alex-dinh · 2026-01-07T02:15:41Z

@zhangbo9674 Do you know why the Windows build is failing? There seems to be some missing DLL in the test environment: https://github.com/PaddlePaddle/Paddle2ONNX/actions/runs/20767660339/job/59637230828?pr=1619#step:10:224

GreatV · 2026-01-07T03:15:26Z

paddle2onnx --model_dir ../PaddleOCR-VL/PP-DocLayoutV2/ --model_filename inference.pdmodel --params_filename inference.pdiparams --save_file pp-doclayoutv2.ooonnx
2026-01-07 03:14:53 [WARNING]   The .pdmodel file is deprecated in paddlepaddle 3.0 and will be removed in the future. Try to convert from .pdmodel file to json file.
I0107 03:14:53.903254 2728575 program_interpreter.cc:257] New Executor is Running.
[Paddle2ONNX] Start parsing the Paddle model file...
[Paddle2ONNX] Use opset_version = 17 for ONNX export.


--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
0   paddle2onnx::Export(char const*, char const*, char**, int*, int, bool, bool, bool, bool, bool, paddle2onnx::CustomOp*, int, char const*, char**, int*, char const*, bool*, bool, char**, int)

----------------------
Error Message Summary:
----------------------
FatalError: `Segmentation fault` is detected by the operating system.
  [TimeInfo: *** Aborted at 1767755694 (unix time) try "date -d @1767755694" if you are using GNU date ***]
  [SignalInfo: *** SIGSEGV (@0x0) received by PID 2728575 (TID 0x7655a5d0d140) from PID 0 ***]

Segmentation fault (core dumped)

paddle2onnx --model_dir PP-DocLayoutV2_infer/ --model_filename inference.json --params_filename inference.pdiparams --save_file pp-doclayoutv2.onnx
[Paddle2ONNX] Start parsing the Paddle model file...
[Paddle2ONNX] Use opset_version = 17 for ONNX export.
[Paddle2ONNX] PaddlePaddle model is exported as ONNX format now.
2026-01-07 04:57:16 [INFO]      Try to perform constant folding on the ONNX model with Polygraphy.
[W] 'colored' module is not installed, will not use colors when logging. To enable colors, please install the 'colored' module: python3 -m pip install colored
[I] Folding Constants | Pass 1
2026-01-07 04:57:16.696936887 [W:onnxruntime:, unsqueeze_elimination.cc:20 Apply] UnsqueezeElimination cannot remove node Unsqueeze.277
[W] colored module is not installed, will not use colors when logging. To enable colors, please install the colored module: python3 -m pip install colored
[W] It looks like this model contains foldable nodes that produce large outputs.
In order to avoid bloating the model, you may want to set a constant-folding size threshold.
Note: Large tensors and their corresponding sizes were: {'Mul.204': '1 MiB'}
[W] Falling back to `onnx.shape_inference` because `onnxruntime.tools.symbolic_shape_infer` either could not be loaded or did not run successfully.
    Note that using ONNX-Runtime for shape inference may be faster and require less memory.
    Consider installing ONNX-Runtime or setting POLYGRAPHY_AUTOINSTALL_DEPS=1 in your environment variables to allow Polygraphy to do so automatically.
[I]     Total Nodes | Original:  8512, After Folding:  3767 |  4745 Nodes Folded
[I] Folding Constants | Pass 2
[I]     Total Nodes | Original:  3767, After Folding:  3753 |    14 Nodes Folded
[I] Folding Constants | Pass 3
[I]     Total Nodes | Original:  3753, After Folding:  3753 |     0 Nodes Folded
2026-01-07 04:57:31 [INFO]      ONNX model saved in pp-doclayoutv2.onnx.

zhaohb · 2026-01-07T06:25:03Z

@GreatV Hi can you share your onnx model?
Thank you.

GreatV · 2026-01-07T06:52:41Z

Why is there such a large discrepancy between the output and Paddle Inference?

#!/usr/bin/env python3
from __future__ import annotations

import argparse
from pathlib import Path

import numpy as np


def _parse_int_list(csv: str) -> list[int]:
    return [int(x.strip()) for x in csv.split(",") if x.strip()]


def _add_bool_arg(
    parser: argparse.ArgumentParser, name: str, default: bool, help_text: str
) -> None:
    dest = name.lstrip("-").replace("-", "_")
    group = parser.add_mutually_exclusive_group(required=False)
    group.add_argument(name, dest=dest, action="store_true", help=help_text)
    group.add_argument(
        f"--no_{dest}", dest=dest, action="store_false", help=f"Disable: {help_text}"
    )
    parser.set_defaults(**{dest: default})


def export_onnx(
    model_dir: Path,
    model_filename: str,
    params_filename: str,
    onnx_path: Path,
    opset_version: int,
    auto_update_opset: bool,
    enable_onnx_checker: bool,
    optimize_tool: str,
    verbose: bool,
) -> None:
    import paddle2onnx

    model_file = model_dir / model_filename
    params_file = model_dir / params_filename
    if not model_file.exists():
        raise FileNotFoundError(f"model file not found: {model_file}")
    if not params_file.exists():
        raise FileNotFoundError(f"params file not found: {params_file}")

    onnx_path.parent.mkdir(parents=True, exist_ok=True)
    paddle2onnx.export(
        str(model_file),
        str(params_file),
        str(onnx_path),
        opset_version=opset_version,
        auto_upgrade_opset=auto_update_opset,
        verbose=verbose,
        enable_onnx_checker=enable_onnx_checker,
        optimize_tool=optimize_tool,
    )


def build_paddle_predictor(
    model_dir: Path,
    model_filename: str,
    params_filename: str,
    disable_mkldnn: bool,
    disable_ir_optim: bool,
):
    import paddle.inference as paddle_infer

    model_file = model_dir / model_filename
    params_file = model_dir / params_filename
    config = paddle_infer.Config(str(model_file), str(params_file))
    config.disable_gpu()
    if disable_mkldnn:
        config.disable_mkldnn()
    if disable_ir_optim:
        config.switch_ir_optim(False)
    return paddle_infer.create_predictor(config)


def build_ort_session(onnx_path: Path, disable_ort_optim: bool):
    import onnxruntime as ort

    sess_options = ort.SessionOptions()
    if disable_ort_optim:
        sess_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL
    return ort.InferenceSession(
        str(onnx_path),
        sess_options=sess_options,
        providers=["CPUExecutionProvider"],
    )


def generate_inputs(
    batch: int,
    seed: int,
    height: int,
    width: int,
    repeat_first: bool,
) -> dict[str, np.ndarray]:
    rng = np.random.default_rng(seed)
    if repeat_first:
        base = rng.standard_normal((1, 3, height, width)).astype(np.float32)
        image = np.repeat(base, batch, axis=0)
    else:
        image = rng.standard_normal((batch, 3, height, width)).astype(np.float32)

    im_shape = np.tile(np.array([[float(height), float(width)]], dtype=np.float32), (batch, 1))
    scale_factor = np.tile(np.array([[1.0, 1.0]], dtype=np.float32), (batch, 1))
    return {"image": image, "im_shape": im_shape, "scale_factor": scale_factor}


def main() -> int:
    parser = argparse.ArgumentParser(
        description="Compare Paddle Inference vs ONNXRuntime for PP-DocLayoutV2."
    )
    parser.add_argument("--model_dir", type=Path, required=True)
    parser.add_argument("--model_filename", type=str, default="inference.json")
    parser.add_argument("--params_filename", type=str, default="inference.pdiparams")
    parser.add_argument("--onnx_path", type=Path, required=True)
    parser.add_argument("--export_onnx", action="store_true")
    parser.add_argument("--opset_version", type=int, default=17)
    _add_bool_arg(
        parser,
        "--auto_update_opset",
        default=True,
        help_text="Auto update ONNX opset",
    )
    _add_bool_arg(
        parser,
        "--enable_onnx_checker",
        default=False,
        help_text="Run ONNX checker",
    )
    parser.add_argument("--optimize_tool", type=str, default="None")
    parser.add_argument("--verbose", action="store_true")

    parser.add_argument("--batches", type=_parse_int_list, default=[1, 2, 4, 8])
    parser.add_argument("--seed", type=int, default=20260107)
    parser.add_argument("--height", type=int, default=800)
    parser.add_argument("--width", type=int, default=800)
    parser.add_argument("--repeat_first", action="store_true")
    parser.add_argument("--atol", type=float, default=1e-4)
    parser.add_argument("--rtol", type=float, default=1e-4)
    parser.add_argument("--show_max_loc", action="store_true")

    _add_bool_arg(
        parser,
        "--disable_mkldnn",
        default=True,
        help_text="Disable Paddle MKLDNN",
    )
    _add_bool_arg(
        parser,
        "--disable_ir_optim",
        default=True,
        help_text="Disable Paddle IR optim",
    )
    _add_bool_arg(
        parser,
        "--disable_ort_optim",
        default=True,
        help_text="Disable ORT graph optim",
    )

    args = parser.parse_args()

    if args.export_onnx or not args.onnx_path.exists():
        export_onnx(
            model_dir=args.model_dir,
            model_filename=args.model_filename,
            params_filename=args.params_filename,
            onnx_path=args.onnx_path,
            opset_version=args.opset_version,
            auto_update_opset=args.auto_update_opset,
            enable_onnx_checker=args.enable_onnx_checker,
            optimize_tool=args.optimize_tool,
            verbose=args.verbose,
        )

    predictor = build_paddle_predictor(
        model_dir=args.model_dir,
        model_filename=args.model_filename,
        params_filename=args.params_filename,
        disable_mkldnn=args.disable_mkldnn,
        disable_ir_optim=args.disable_ir_optim,
    )
    sess = build_ort_session(args.onnx_path, disable_ort_optim=args.disable_ort_optim)

    ort_output_names = [o.name for o in sess.get_outputs()]
    pd_output_names = predictor.get_output_names()

    common_outputs = [name for name in ort_output_names if name in set(pd_output_names)]
    if not common_outputs:
        raise RuntimeError(
            f"No common outputs between Paddle({pd_output_names}) and ORT({ort_output_names})"
        )

    print("Paddle inputs:", predictor.get_input_names())
    print("Paddle outputs:", pd_output_names)
    print("ORT inputs:", [i.name for i in sess.get_inputs()])
    print("ORT outputs:", ort_output_names)
    print("Compare outputs:", common_outputs)
    print()

    for batch in args.batches:
        inputs = generate_inputs(
            batch=batch,
            seed=args.seed + batch,
            height=args.height,
            width=args.width,
            repeat_first=args.repeat_first,
        )

        # Paddle
        for name in predictor.get_input_names():
            if name not in inputs:
                raise RuntimeError(f"Missing input '{name}' in generated inputs: {list(inputs)}")
            arr = inputs[name]
            h = predictor.get_input_handle(name)
            h.reshape(arr.shape)
            h.copy_from_cpu(arr)
        predictor.run()
        pd_outputs = {name: predictor.get_output_handle(name).copy_to_cpu() for name in pd_output_names}

        # ORT
        ort_outputs_list = sess.run(None, inputs)
        if len(ort_output_names) != len(ort_outputs_list):
            raise RuntimeError(
                f"ORT outputs mismatch: names={len(ort_output_names)} values={len(ort_outputs_list)}"
            )
        ort_outputs = dict(zip(ort_output_names, ort_outputs_list))

        print(f"batch={batch} repeat_first={args.repeat_first} seed={args.seed + batch}")
        for name in common_outputs:
            pd = pd_outputs[name]
            ort = ort_outputs[name]
            same_shape = pd.shape == ort.shape
            same_dtype = pd.dtype == ort.dtype
            print(f"  {name}: shape {pd.shape} vs {ort.shape} match={same_shape} dtype {pd.dtype} vs {ort.dtype} match={same_dtype}")
            if not same_shape:
                continue
            if np.issubdtype(pd.dtype, np.floating) and np.issubdtype(ort.dtype, np.floating):
                diff = pd - ort
                absdiff = np.abs(diff)
                max_abs = float(absdiff.max())
                mean_abs = float(absdiff.mean())
                allclose = bool(np.allclose(pd, ort, atol=args.atol, rtol=args.rtol))
                print(f"    max_abs={max_abs} mean_abs={mean_abs} allclose(atol={args.atol},rtol={args.rtol})={allclose}")
                if pd.ndim == 2 and pd.shape[1] <= 64:
                    print(f"    per_col_max={absdiff.max(axis=0)}")
                if args.show_max_loc:
                    max_idx = np.unravel_index(np.argmax(absdiff), absdiff.shape)
                    print(f"    max_loc={max_idx} pd={pd[max_idx]} ort={ort[max_idx]}")
            else:
                equal = bool(np.array_equal(pd, ort))
                print(f"    equal={equal}")
        print()

    return 0


if __name__ == "__main__":
    raise SystemExit(main())

 python debug/compare_paddle_ort_ppdoclayoutv2.py --model_dir PP-DocLayoutV2_infer --onnx_path pp-doclayoutv2.onnx
--- Running PIR pass [add_shadow_output_after_dead_parameter_pass]
--- Running PIR pass [delete_quant_dequant_linear_op_pass]
--- Running PIR pass [delete_weight_dequant_linear_op_pass]
--- Running PIR pass [transfer_layout_pass]
--- Running PIR pass [common_subexpression_elimination_pass]
I0107 06:40:56.880336 2811109 print_statistics.cc:50] --- detected [870] subgraphs!
--- Running PIR pass [constant_folding_pass]
I0107 06:40:56.881546 2811109 pir_interpreter.cc:1601] New Executor is Running ...
I0107 06:40:56.881672 2811109 pir_interpreter.cc:1625] pir interpreter is running by multi-thread mode ...
I0107 06:40:56.932209 2811109 print_statistics.cc:44] --- detected [165, 2907] subgraphs!
--- Running PIR pass [dead_code_elimination_pass]
I0107 06:40:56.933084 2811109 print_statistics.cc:50] --- detected [54] subgraphs!
--- Running PIR pass [replace_fetch_with_shadow_output_pass]
I0107 06:40:56.933688 2811109 print_statistics.cc:50] --- detected [2] subgraphs!
--- Running PIR pass [remove_shadow_feed_pass]
--- Running PIR pass [inplace_pass]
I0107 06:40:57.141657 2811109 print_statistics.cc:50] --- detected [676] subgraphs!
I0107 06:40:57.142059 2811109 analysis_predictor.cc:1217] ======= pir optimization completed =======
Paddle inputs: ['im_shape', 'image', 'scale_factor']
Paddle outputs: ['fetch_name_0', 'fetch_name_1']
ORT inputs: ['im_shape', 'image', 'scale_factor']
ORT outputs: ['fetch_name_0', 'fetch_name_1']
Compare outputs: ['fetch_name_0', 'fetch_name_1']

I0107 06:40:57.943626 2811109 pir_interpreter.cc:1622] pir interpreter is running by trace mode ...
batch=1 repeat_first=False seed=20260108
  fetch_name_0: shape (300, 8) vs (300, 8) match=True dtype float32 vs float32 match=True
    max_abs=291.0 mean_abs=39.715877532958984 allclose(atol=0.0001,rtol=0.0001)=False
    per_col_max=[0.0000000e+00 1.1026859e-06 4.3411255e-03 1.2893677e-03 7.9345703e-04
 1.6479492e-03 2.9100000e+02 2.9100000e+02]
  fetch_name_1: shape (1,) vs (1,) match=True dtype int32 vs int32 match=True
    equal=True

batch=2 repeat_first=False seed=20260109
  fetch_name_0: shape (600, 8) vs (600, 8) match=True dtype float32 vs float32 match=True
    max_abs=291.0 mean_abs=39.5771369934082 allclose(atol=0.0001,rtol=0.0001)=False
    per_col_max=[0.0000000e+00 2.3014843e-05 1.4595032e-02 1.6174316e-03 7.9040527e-03
 1.5258789e-03 2.9100000e+02 2.9100000e+02]
  fetch_name_1: shape (2,) vs (2,) match=True dtype int32 vs int32 match=True
    equal=True

batch=4 repeat_first=False seed=20260111
  fetch_name_0: shape (1200, 8) vs (1200, 8) match=True dtype float32 vs float32 match=True
    max_abs=291.0 mean_abs=39.563392639160156 allclose(atol=0.0001,rtol=0.0001)=False
    per_col_max=[0.0000000e+00 6.0498714e-06 1.7265320e-02 2.3117065e-03 3.7841797e-03
 5.0048828e-03 2.9100000e+02 2.9100000e+02]
  fetch_name_1: shape (4,) vs (4,) match=True dtype int32 vs int32 match=True
    equal=True

batch=8 repeat_first=False seed=20260115
  fetch_name_0: shape (2400, 8) vs (2400, 8) match=True dtype float32 vs float32 match=True
    max_abs=291.0 mean_abs=39.15203094482422 allclose(atol=0.0001,rtol=0.0001)=False
    per_col_max=[0.0000000e+00 1.8328428e-06 1.6151428e-02 4.8522949e-03 8.9721680e-03
 6.7138672e-03 2.9100000e+02 2.9100000e+02]
  fetch_name_1: shape (8,) vs (8,) match=True dtype int32 vs int32 match=True
    equal=True

zhaohb · 2026-01-07T06:58:28Z

I'm also encountering this problem. The ONNX model outputs zero coordinates, while the classification results and scores seem accurate.

GreatV · 2026-01-07T07:05:32Z

Hi, @zhaohb You might want to try the version I exported myself, which differs slightly from the current PR implementation. pp-doclayoutv2.onnx

alex-dinh · 2026-01-07T07:22:57Z

Hi, @zhaohb You might want to try the version I exported myself, which differs slightly from the current PR implementation. pp-doclayoutv2.onnx

Hi, what changes did you need to make, and what OS are you using?

Also, here is the onnx model I exported using the changes in this PR: PP-DocLayoutV2.onnx

GreatV · 2026-01-07T07:50:20Z

Hi @alex-dinh I add tie-break logic for argsort

zhaohb · 2026-01-07T08:15:53Z

Hi, @zhaohb You might want to try the version I exported myself, which differs slightly from the current PR implementation. pp-doclayoutv2.onnx

Hi, what changes did you need to make, and what OS are you using?

Also, here is the onnx model I exported using the changes in this PR: PP-DocLayoutV2.onnx

Hi @alex-dinh
I've verified the ONNX model you provided, and the results are as expected. Thank you very much.

zhaohb · 2026-01-07T10:09:47Z

Hi, @zhaohb You might want to try the version I exported myself, which differs slightly from the current PR implementation. pp-doclayoutv2.onnx

Hi, what changes did you need to make, and what OS are you using?
Also, here is the onnx model I exported using the changes in this PR: PP-DocLayoutV2.onnx

Hi @alex-dinh I've verified the ONNX model you provided, and the results are as expected. Thank you very much.

Just to clarify: we're still seeing some discrepancies in the coordinate outputs between the ONNX and Paddle inference results. Have you noticed any differences in your tests? @alex-dinh

alex-dinh · 2026-01-07T16:52:02Z

Just to clarify: we're still seeing some discrepancies in the coordinate outputs between the ONNX and Paddle inference results. Have you noticed any differences in your tests? @alex-dinh

Hi @zhaohb, I see some discrepancies but they are very minor. The PaddleX library also does some extra postprocessing (such as unclipping, merging) that I do not apply to the ONNX model output, which would explain the difference. However, the results for my local testing are very similar. I am only filtering out low-confidence boxes from the ONNX model.

# ONNX inference results
cls_id	score	xmin	ymin	xmax	ymax
22		0.988	129.06	980.56	1008.02	1389.72
22		0.981	129.53	676.27	1005.06	758.30
22		0.985	129.69	760.43	1006.29	869.83
22		0.986	129.77	461.66	1007.36	595.17
22		0.988	129.90	215.78	1006.80	459.33
22		0.967	130.72	923.87	1004.14	977.11
17		0.955	132.66	157.30	910.44	193.03
15		0.845	134.74	242.61	219.08	268.67
16		0.905	134.82	87.82	172.78	108.21
5		0.970	186.75	602.91	432.31	666.96
5		0.946	187.07	880.80	411.39	910.06
15		0.600	198.66	678.32	316.77	700.77
15		0.759	205.82	352.43	272.44	376.67
15		0.920	327.63	297.45	481.57	321.51
15		0.926	357.26	923.21	492.15	949.41
15		0.842	502.21	269.64	584.47	294.52
15		0.918	536.70	294.97	614.41	319.93
12		0.950	655.48	86.78	1004.64	110.35
15		0.930	781.65	488.08	916.63	513.95
15		0.921	835.40	323.57	996.94	348.95
15		0.920	850.75	242.72	954.85	267.82
11		0.934	941.74	882.36	1001.35	909.73
11		0.934	941.85	621.43	1001.59	648.19

# Paddle inference results
cls_id	score	xmin	ymin	xmax	ymax
22		0.988	129.33	981.08	1007.87	1390.29
22		0.982	129.86	676.31	1005.09	758.65
22		0.985	130.17	761.11	1006.52	870.00
22		0.987	130.20	461.91	1007.33	596.17
22		0.989	130.42	216.08	1006.34	459.67
22		0.968	130.83	923.80	1005.02	978.05
17		0.954	132.14	157.31	911.11	193.37
16		0.890	134.67	87.14	172.41	108.45
15		0.910	134.95	242.78	218.56	269.02
5		0.945	187.28	880.58	411.67	910.64
5		0.969	187.43	603.18	432.71	667.15
15		0.559	198.57	678.35	316.43	700.74
15		0.774	206.47	352.40	271.67	377.55
15		0.872	328.40	296.68	481.14	322.20
15		0.934	356.49	922.29	492.36	950.71
15		0.866	501.80	269.53	583.92	294.96
15		0.922	537.86	294.66	613.58	320.30
12		0.952	656.88	86.38	1004.87	110.47
15		0.936	782.03	488.54	916.38	514.59
15		0.936	835.46	323.44	995.74	349.93
15		0.927	851.37	242.47	954.60	268.37
11		0.935	941.90	882.26	1001.40	910.11
11		0.936	941.94	621.05	1001.64	649.50

Here is the image I ran my test on:

sb_policy_approx.png

Edit: here is my doclayout test script:
doclayout.py

GreatV

We may need to incorporate additional unit tests for the index_put function, add the registration in paddle2onnx/mappers_registry.h.in, and update the copyright year in the newly appended files to 2026.

GreatV · 2026-01-07T23:04:25Z

paddle2onnx/mapper/tensor/index_put.cc

@@ -0,0 +1,169 @@
+// Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.


Suggested change

// Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.

// Copyright (c) 2026 PaddlePaddle Authors. All Rights Reserved.

GreatV · 2026-01-07T23:04:39Z

paddle2onnx/mapper/tensor/index_put.h

@@ -0,0 +1,46 @@
+// Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.


Suggested change

// Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.

// Copyright (c) 2026 PaddlePaddle Authors. All Rights Reserved.

Copilot

Pull request overview

This pull request adds support for converting PaddlePaddle's PP-DocLayoutV2 model to ONNX format. The implementation builds upon work from predict-woo but includes significant modifications and additional fixes needed for macOS compatibility.

Key changes:

Updated PaddlePaddle dependency from development version to stable 3.0.0
Added new index_put operation mapper to handle tensor indexing operations
Enhanced existing tensor operation mappers (stack, squeeze2, slice, set_value) to properly handle PIR mode and edge cases

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

Show a summary per file

File	Description
pyproject.toml	Updated paddlepaddle to stable 3.0.0 release and bumped minimum onnx version to 1.16.1
paddle2onnx/mapper/tensor/stack.cc	Added logic to normalize mixed-rank inputs (scalars and single-element tensors) before stacking
paddle2onnx/mapper/tensor/squeeze2.cc	Added optimization to skip squeeze operation when target dimensions are not squeezable
paddle2onnx/mapper/tensor/slice.cc	Added early return for PIR mode when decrease_axis is present to avoid shape comparison failures
paddle2onnx/mapper/tensor/set_value.cc	Added support for PIR mode set_value_with_tensor operations with empty axes and alternative input names
paddle2onnx/mapper/tensor/index_put.h	New header file defining the IndexPutMapper class for handling index_put operations
paddle2onnx/mapper/tensor/index_put.cc	New implementation supporting both boolean mask and integer indexing with optional accumulation

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-01-07T23:16:55Z

paddle2onnx/mapper/tensor/stack.cc

+      continue;
+    } else if (x_info[i].Rank() == 1) {
+      // Check if it's exactly [1] not [4] or other sizes
+      if (x_info[i].shape.size() > 0 && x_info[i].shape[0] == 1) {


The logic for detecting single-element tensors has a potential issue. At line 47, the code checks x_info[i].shape.size() > 0, but for a rank-1 tensor, shape.size() is always equal to 1 (the rank), not necessarily > 0. This condition will always be true for rank-1 tensors. The intent seems correct but the check is redundant since we already know Rank() == 1 at this point.

Suggested change

if (x_info[i].shape.size() > 0 && x_info[i].shape[0] == 1) {

if (x_info[i].shape[0] == 1) {

hzkitty · 2026-01-08T02:10:59Z

Hi, @zhaohb You might want to try the version I exported myself, which differs slightly from the current PR implementation. pp-doclayoutv2.onnx

Hi, what changes did you need to make, and what OS are you using?

Also, here is the onnx model I exported using the changes in this PR: PP-DocLayoutV2.onnx

PP-DocLayoutV2.onnx will encounter an error when loaded with openvino, while pp-doclayoutv2.onnx works normally

Edit: here is my doclayout-openvino test script:
doclayout-openvino.py

@alex-dinh

zhaohb · 2026-01-08T02:15:43Z

Hi, @zhaohb You might want to try the version I exported myself, which differs slightly from the current PR implementation. pp-doclayoutv2.onnx

Hi, what changes did you need to make, and what OS are you using?
Also, here is the onnx model I exported using the changes in this PR: PP-DocLayoutV2.onnx

PP-DocLayoutV2.onnx will encounter an error when loaded with openvino, while pp-doclayoutv2.onnx works normally

Edit: here is my doclayout-openvino test script: doclayout-openvino.py

@alex-dinh

Hi @hzkitty ,
I've just finished preparing the OpenVINO IR model. Please feel free to test it.
https://www.modelscope.cn/models/zhaohb/PP-DocLayoutV2-ov/summary

alex-dinh · 2026-01-08T02:38:41Z

PP-DocLayoutV2.onnx will encounter an error when loaded with openvino, while pp-doclayoutv2.onnx works normally

Hi @hzkitty, this revealed an issue with macOS and paddle2onnx. I tried exporting the onnx model on my linux machine and your script runs successfully.

Try this model: PP-DocLayoutV2.onnx (Exported in a linux environment)

Output of doclayout-openvino.py using this new PP-DocLayoutV2.onnx:

=== Model Inputs ===
im_shape [?,2]
image [?,3,800,800]
scale_factor [?,2]
=== Model Outputs ===
fetch_name_0 [?,8]
fetch_name_1 [?]
--- DocLayoutV2 OpenVINO Output ---
cls_id	score	xmin	ymin	xmax	ymax
22	0.987	128.88	980.50	1008.50	1389.00
22	0.986	129.75	462.00	1007.00	595.50
22	0.988	130.00	215.62	1006.50	459.50
22	0.984	130.00	760.50	1006.50	869.50
22	0.981	130.00	676.50	1004.00	758.50
22	0.966	130.75	924.00	1003.00	977.00
17	0.955	132.75	157.38	909.50	193.12
15	0.851	134.50	242.62	219.00	268.75
16	0.906	135.00	87.69	172.88	108.12
5	0.970	186.88	603.00	432.50	667.50
5	0.945	187.62	881.00	412.00	909.50
15	0.607	198.50	678.50	316.75	700.50
15	0.764	206.12	352.50	272.50	376.75
15	0.918	327.75	297.00	481.50	321.00
15	0.926	358.00	923.00	492.75	949.00
15	0.844	502.25	269.75	584.50	294.75
15	0.921	536.50	295.50	614.50	320.50
12	0.950	655.50	86.75	1004.00	110.19
15	0.930	781.50	488.25	917.50	514.00
15	0.919	835.00	323.25	997.50	348.75
15	0.919	850.50	242.50	955.00	267.50
11	0.934	942.00	882.50	1001.00	909.50
11	0.932	942.00	621.50	1002.00	648.00

alex-dinh · 2026-01-08T16:50:10Z

pyproject.toml

    "cmake>=3.16",
    "setuptools-scm",
-    "paddlepaddle==3.0.0",
+    "paddlepaddle==3.0.0.dev20250426",


Not sure why, but the Windows build requires "paddlepaddle==3.0.0.dev20250426" in the PR check workflow. For local development on macOS and linux, changing to 3.0.0 or 3.1.0 are fine for pip install -e . to run successfully.

alex-dinh · 2026-01-09T17:22:29Z

tests/run.bat

 %PY_CMD% -m pip install tqdm filelock
 %PY_CMD% -m pip install onnx==1.16.0 onnxruntime==1.19.0
 %PY_CMD% -m pip install six hypothesis
-%PY_CMD% -m pip install --pre paddlepaddle -i https://www.paddlepaddle.org.cn/packages/nightly/cpu/


My guess is that this used to work when paddlepaddle==3.0.0.dev20250426 was the newest dev version, but broke when newer versions of paddlepaddle were released.

alex-dinh · 2026-01-10T00:13:55Z

We may need to incorporate additional unit tests for the index_put function, add the registration in paddle2onnx/mappers_registry.h.in, and update the copyright year in the newly appended files to 2026.

Hi @GreatV, I made edits according to your suggestions. Please see the latest commits!

GreatV

LGTM

jzhang533 · 2026-01-12T02:43:36Z

I have created a new release which includes this PR. please give it a try:

https://pypi.org/project/paddle2onnx/2.1.0/

paddle-bot bot added the contributor label Jan 5, 2026

Add support for PP-DocLayoutV2

6b39e05

alex-dinh force-pushed the develop branch from eb63b6f to 6b39e05 Compare January 5, 2026 18:55

alex-dinh mentioned this pull request Jan 5, 2026

无法转换PP-DocLayoutV2 #1608

Closed

jzhang533 assigned zhangbo9674 Jan 6, 2026

alex-dinh marked this pull request as draft January 6, 2026 18:38

alex-dinh marked this pull request as ready for review January 6, 2026 19:58

alex-dinh commented Jan 6, 2026

View reviewed changes

alex-dinh force-pushed the develop branch from 0215f55 to b302d7f Compare January 7, 2026 01:32

Fixed slice and stack op tests + code style check errors

2cdf82b

alex-dinh force-pushed the develop branch from b302d7f to 2cdf82b Compare January 7, 2026 21:04

GreatV reviewed Jan 7, 2026

View reviewed changes

GreatV requested a review from Copilot January 7, 2026 23:11

Copilot started reviewing on behalf of GreatV January 7, 2026 23:11 View session

Copilot AI reviewed Jan 7, 2026

View reviewed changes

Add tests for index_put + fixed docstrings

6f1fb64

alex-dinh commented Jan 8, 2026

View reviewed changes

Fix run.bat paddle version for Windows build

d6e2932

alex-dinh force-pushed the develop branch from cfbb9b7 to d6e2932 Compare January 9, 2026 17:20

alex-dinh commented Jan 9, 2026

View reviewed changes

GreatV approved these changes Jan 10, 2026

View reviewed changes

GreatV requested a review from jzhang533 January 10, 2026 01:13

jzhang533 merged commit 820b83d into PaddlePaddle:develop Jan 12, 2026
5 checks passed

		@@ -0,0 +1,169 @@
		// Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.

	// Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.
	// Copyright (c) 2026 PaddlePaddle Authors. All Rights Reserved.

		@@ -0,0 +1,46 @@
		// Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.

	if (x_info[i].shape.size() > 0 && x_info[i].shape[0] == 1) {
	if (x_info[i].shape[0] == 1) {

Conversation

alex-dinh commented Jan 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

paddle-bot bot commented Jan 5, 2026

Uh oh!

CLAassistant commented Jan 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

alex-dinh commented Jan 6, 2026

Uh oh!

alex-dinh Jan 6, 2026

Choose a reason for hiding this comment

Uh oh!

alex-dinh commented Jan 7, 2026

Uh oh!

GreatV commented Jan 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

zhaohb commented Jan 7, 2026

Uh oh!

GreatV commented Jan 7, 2026

Uh oh!

zhaohb commented Jan 7, 2026

Uh oh!

GreatV commented Jan 7, 2026

Uh oh!

alex-dinh commented Jan 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

GreatV commented Jan 7, 2026

Uh oh!

zhaohb commented Jan 7, 2026

Uh oh!

zhaohb commented Jan 7, 2026

Uh oh!

alex-dinh commented Jan 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

GreatV left a comment

Choose a reason for hiding this comment

Uh oh!

GreatV Jan 7, 2026

Choose a reason for hiding this comment

Uh oh!

GreatV Jan 7, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Jan 7, 2026

Choose a reason for hiding this comment

Uh oh!

hzkitty commented Jan 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

zhaohb commented Jan 8, 2026

Uh oh!

alex-dinh commented Jan 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

alex-dinh Jan 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

alex-dinh Jan 9, 2026

Choose a reason for hiding this comment

Uh oh!

alex-dinh commented Jan 10, 2026

Uh oh!

GreatV left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

jzhang533 commented Jan 12, 2026

Uh oh!

Reviewers

alex-dinh commented Jan 5, 2026 •

edited

Loading

CLAassistant commented Jan 5, 2026 •

edited

Loading

GreatV commented Jan 7, 2026 •

edited

Loading

alex-dinh commented Jan 7, 2026 •

edited

Loading

alex-dinh commented Jan 7, 2026 •

edited

Loading

hzkitty commented Jan 8, 2026 •

edited

Loading

alex-dinh commented Jan 8, 2026 •

edited

Loading

alex-dinh Jan 8, 2026 •

edited

Loading