add PaddleOCR_VL model#3263

Open
zhaohb wants to merge 17 commits into openvinotoolkit:latest from zhaohb:latest

Conversation

@zhaohb zhaohb commented Jan 20, 2026

No description provided.

@review-notebook-app

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@zhaohb zhaohb requested a review from brmarkus January 21, 2026 06:24
@openvino-dev-samples
Collaborator

Hi @zhaohb, thanks for your contribution. Please address my feedback below.


This notebook shows an end-to-end workflow for **PaddleOCR-VL-1.5 → OpenVINO**:

- Download the pretrained PaddleOCR-VL-1.5/PaddleOCR-VL model.

Is this notebook for both models or only for 1.5? If it is for both, please add a model selector that includes 1.0.
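
A selector along these lines might be enough, as a minimal sketch. The two Hugging Face model IDs and the `resolve_model_id` helper are assumptions for illustration, not taken from this PR; in the notebook the choice would more likely come from an `ipywidgets.Dropdown` than from a plain string.

```python
# Hypothetical mapping from display name to model ID.
# The IDs below are assumptions and should be checked against the model hub.
MODEL_IDS = {
    "PaddleOCR-VL-1.5": "PaddlePaddle/PaddleOCR-VL-1.5",
    "PaddleOCR-VL": "PaddlePaddle/PaddleOCR-VL",
}
DEFAULT_MODEL = "PaddleOCR-VL-1.5"


def resolve_model_id(choice: str = DEFAULT_MODEL) -> str:
    """Return the model ID for the selected display name."""
    try:
        return MODEL_IDS[choice]
    except KeyError:
        raise ValueError(
            f"Unknown model {choice!r}; expected one of {list(MODEL_IDS)}"
        ) from None
```

The rest of the notebook could then read the selected ID from one place instead of hard-coding a model name in several cells.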

"\n",
"# Install the rest dependencies from PyPI (keep existing versions/constraints)\n",
"pip_install(\n",
" \"-U\",\n",

Suggested change
- " \"-U\",\n",
+ " \"-q\",\n",
+ " \"-U\",\n",


For silent installation.

"output_type": "stream",
"name": "stdout",
"text": [
"stateful model inputs: [<Output: names[attention_mask, 211] shape[?,?] type: f32>, <Output: names[position_ids, 237, 238] shape[?,?,?] type: i64>, <Output: names[24] shape[?,?,?,128] type: f32>, <Output: names[25] shape[?,?,?,128] type: f32>, <Output: names[26] shape[?,?,?,128] type: f32>, <Output: names[27] shape[?,?,?,128] type: f32>, <Output: names[28] shape[?,?,?,128] type: f32>, <Output: names[29] shape[?,?,?,128] type: f32>, <Output: names[30] shape[?,?,?,128] type: f32>, <Output: names[31] shape[?,?,?,128] type: f32>, <Output: names[32] shape[?,?,?,128] type: f32>, <Output: names[33] shape[?,?,?,128] type: f32>, <Output: names[34] shape[?,?,?,128] type: f32>, <Output: names[35] shape[?,?,?,128] type: f32>, <Output: names[36] shape[?,?,?,128] type: f32>, <Output: names[37] shape[?,?,?,128] type: f32>, <Output: names[38] shape[?,?,?,128] type: f32>, <Output: names[39] shape[?,?,?,128] type: f32>, <Output: names[40] shape[?,?,?,128] type: f32>, <Output: names[41] shape[?,?,?,128] type: f32>, <Output: names[42] shape[?,?,?,128] type: f32>, <Output: names[43] shape[?,?,?,128] type: f32>, <Output: names[44] shape[?,?,?,128] type: f32>, <Output: names[45] shape[?,?,?,128] type: f32>, <Output: names[46] shape[?,?,?,128] type: f32>, <Output: names[47] shape[?,?,?,128] type: f32>, <Output: names[48] shape[?,?,?,128] type: f32>, <Output: names[49] shape[?,?,?,128] type: f32>, <Output: names[50] shape[?,?,?,128] type: f32>, <Output: names[51] shape[?,?,?,128] type: f32>, <Output: names[52] shape[?,?,?,128] type: f32>, <Output: names[53] shape[?,?,?,128] type: f32>, <Output: names[54] shape[?,?,?,128] type: f32>, <Output: names[55] shape[?,?,?,128] type: f32>, <Output: names[56] shape[?,?,?,128] type: f32>, <Output: names[57] shape[?,?,?,128] type: f32>, <Output: names[58] shape[?,?,?,128] type: f32>, <Output: names[59] shape[?,?,?,128] type: f32>, <Output: names[273, inputs_embeds, 262, hidden_states.1] shape[?,?,?] type: f32>]\n",

Should we print out this layer information?
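
If the information is still useful for debugging, one option is to route it through `logging` so that it only appears on request. This is a sketch only: the logger name and the `report_model_inputs` helper are hypothetical and not part of the PR.

```python
import logging

# Hypothetical logger name for the conversion helpers.
logger = logging.getLogger("paddleocr_vl_convert")


def report_model_inputs(inputs) -> None:
    """Log a one-line summary at INFO and the full port list only at DEBUG."""
    logger.info("stateful model has %d inputs", len(inputs))
    for port in inputs:
        logger.debug("input: %s", port)
```

With the logger left at the INFO level, the notebook output would contain a single summary line instead of the full list of ports.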

"core = ov.Core()\n",
"\n",
"paddleocr_vl_model = OVPaddleOCRVLForCausalLM(\n",
" core=core,\n",

What's the purpose of this param? Can we initialize OpenVINO inside the wrapper?
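
For example, `core` could become optional and be created on demand. The class below is a hypothetical stand-in for the notebook's wrapper, not its actual signature; the only OpenVINO call it relies on is `ov.Core()`.

```python
from typing import Any, Optional


class OVModelWrapper:
    """Hypothetical stand-in for the notebook's wrapper class."""

    def __init__(self, model_dir: str, device: str = "CPU", core: Optional[Any] = None):
        if core is None:
            # Import lazily so a Core is only created when the caller did not pass one.
            import openvino as ov

            core = ov.Core()
        self.core = core
        self.model_dir = model_dir
        self.device = device
```

Callers that already hold a shared `ov.Core` can still pass it in, while the notebook cell no longer needs to create one itself.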

"✅ INT8 compressed model saved to c:\\hongbo\\paddle_ocr_vl\\openvino_notebooks\\notebooks\\paddleocr_vl\\ov_paddleocr_vl_model/llm_stateful_int8.xml\n",
"✅ PaddleOCR-VL model has been successfully converted to OpenVINO format.\n",
"✅ Conversion complete.\n",
"✅ Resources released.\n"

Can we skip the model conversion step if the OV model already exists?
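
A guard like the following would do, as a rough sketch. `convert_if_missing` and the `convert` callback are hypothetical names; the list of expected files would have to match what the conversion step actually writes (for instance `llm_stateful_int8.xml` and its `.bin` companion, going by the log above).

```python
from pathlib import Path
from typing import Callable, Iterable


def convert_if_missing(
    model_dir: Path,
    required_files: Iterable[str],
    convert: Callable[[Path], None],
) -> bool:
    """Run `convert` only when an expected file is missing.

    Returns True if the conversion ran, False if it was skipped.
    """
    model_dir = Path(model_dir)
    missing = [name for name in required_files if not (model_dir / name).exists()]
    if not missing:
        print(f"Found existing OpenVINO model in {model_dir}, skipping conversion.")
        return False
    convert(model_dir)
    return True
```

Checking every expected file, rather than only the directory, avoids skipping conversion after an interrupted earlier run.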

"try:\n",
" demo.launch(debug=True)\n",
"except Exception:\n",
" demo.launch(debug=True, share=True)\n",

The text in the example picture is too small for a demonstration. Could you improve it?

Image

" d = ImageDraw.Draw(im)\n",
" d.text(\n",
" (40, 40),\n",
" \"PaddleOCR-VL OpenVINO test\\nOCR: Hello 123\\nTable: A | B | C\",\n",

I think it's better to use a real picture for the demonstration rather than a manually created one.
Also, please show the original picture alongside the predictions.
