@yangsu2022 yangsu2022 commented Dec 17, 2025

Description

CVS-174065
This PR collects fresh logs for CVS-174065, because the original logs for that ticket are no longer available.

Copilot AI review requested due to automatic review settings December 17, 2025 05:04
Copilot AI left a comment

Pull request overview

This PR changes test markers from skipif to xfail for Windows platform tests related to ticket CVS-174065. This allows tests to run on Windows but marks them as expected to fail, providing better visibility into the actual test behavior compared to skipping them entirely.

Key changes:

  • Replaced @pytest.mark.skipif with @pytest.mark.xfail for Windows-specific test conditions
  • Standardized the condition syntax to use condition=(sys.platform == "win32")
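The marker change summarized above can be sketched as follows. This is a minimal illustration, not the PR's actual diff: the test name `test_windows_only_example` and the `reason` text are assumptions made for the example; only the `condition=(sys.platform == "win32")` form and the ticket number come from this thread.

```python
import sys

import pytest


# Hypothetical test, for illustration only. With skipif the body would never
# run on Windows; with xfail it still runs there, and a failure is reported
# as "xfailed" instead of breaking the run.
@pytest.mark.xfail(
    condition=(sys.platform == "win32"),
    reason="CVS-174065",  # assumed reason string; the PR may word it differently
)
def test_windows_only_example():
    assert 1 + 1 == 2
```

On non-Windows platforms the condition is false, so the marker has no effect and the test is reported normally.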


@github-actions github-actions bot added the category: GGUF GGUF file reader label Dec 17, 2025
@yangsu2022
Collaborator Author

Adding this PR to collect logs, because the logs for CVS-174065 are no longer available.

@yangsu2022 yangsu2022 mentioned this pull request Dec 17, 2025
Copilot AI review requested due to automatic review settings December 23, 2025 03:23
@yangsu2022 yangsu2022 requested a review from sgonorov as a code owner December 23, 2025 03:23

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.



Comment on lines +175 to +185
print(f"\n{'='*60}")
print(f"[DEBUG] Model: {gguf_model_id}")
print(f"[DEBUG] Prompt: {repr(prompt)}")
print(f"[DEBUG] Pipeline: {pipeline_type}")
print(f"[DEBUG] HF input_ids: {input_ids.tolist()}")
ov_tokenized = ov_pipe_gguf.get_tokenizer().encode(prompt)
print(f"[DEBUG] OV input_ids: {ov_tokenized.input_ids.data.tolist()}")
print(f"[DEBUG] HF output: {repr(res_string_input_1)}")
print(f"[DEBUG] OV output: {repr(res_string_input_2)}")
print(f"[DEBUG] Match: {res_string_input_1 == res_string_input_2}")
print(f"{'='*60}\n")

Copilot AI Dec 23, 2025

Debug print statements should use a proper logging framework instead of print(). Consider using Python's logging module or pytest's logging capabilities to make debug output configurable and avoid polluting test output in normal runs.
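One way to act on this suggestion is sketched below using Python's standard `logging` module. The logger name `gguf_debug` and the helper `log_comparison` are hypothetical names chosen for the example; they are not part of the test file under review.

```python
import logging

# Hypothetical logger name, used only in this sketch.
logger = logging.getLogger("gguf_debug")


def log_comparison(model_id, prompt, hf_output, ov_output):
    """Log both outputs at DEBUG level and return whether they match."""
    match = hf_output == ov_output
    logger.debug("Model: %s", model_id)
    logger.debug("Prompt: %r", prompt)
    logger.debug("HF output: %r", hf_output)
    logger.debug("OV output: %r", ov_output)
    logger.debug("Match: %s", match)
    return match
```

With this shape, the messages stay hidden in normal runs and can be shown on demand, for example with pytest's `--log-cli-level=DEBUG` option.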

@yangsu2022
Collaborator Author

I have reproduced the failure locally with qwen2.5-0.5b-instruct-q4_0.gguf.

Log of the 4 failures:
FAILED tests\python_tests\test_gguf_reader.py::test_full_gguf_pipeline[model_gguf1-special_tokens_with_text-enable_save_ov_model=False-PipelineType.STATEFUL] - AssertionError: assert '\nThe sky is...tter light in' == '\nThe sky is...s through it,'
FAILED tests\python_tests\test_gguf_reader.py::test_full_gguf_pipeline[model_gguf1-special_tokens_with_text-enable_save_ov_model=False-PipelineType.PAGED_ATTENTION] - AssertionError: assert '\nThe sky is...tter light in' == '\nThe sky is...tor of light,'
FAILED tests\python_tests\test_gguf_reader.py::test_full_gguf_pipeline[model_gguf1-special_tokens_with_text-enable_save_ov_model=True-PipelineType.STATEFUL] - AssertionError: assert '\nThe sky is...tter light in' == '\nThe sky is...s through it,'
FAILED tests\python_tests\test_gguf_reader.py::test_full_gguf_pipeline[model_gguf1-special_tokens_with_text-enable_save_ov_model=True-PipelineType.PAGED_ATTENTION] - AssertionError: assert '\nThe sky is...tter light in' == '\nThe sky is...tor of light,'

@yangsu2022
Collaborator Author

yangsu2022 commented Dec 23, 2025

I added debug printing and reran the failing tests one by one, for example: python -m pytest -s -v "tests/python_tests/test_gguf_reader.py::test_full_gguf_pipeline[model_gguf1-special_tokens_with_text-enable_save_ov_model=False-PipelineType.STATEFUL]"

Log of the first failure:
[DEBUG] Model: Qwen/Qwen2.5-0.5B-Instruct-GGUF
[DEBUG] Prompt: '<|endoftext|> Why the Sky is Blue? <|im_end|>'
[DEBUG] Pipeline: PipelineType.STATEFUL
[DEBUG] HF input_ids: [[151643, 8429, 279, 14722, 374, 8697, 30, 220, 151645]]
[DEBUG] OV input_ids: [[151643, 8429, 279, 14722, 374, 8697, 30, 220, 151645]]
[DEBUG] HF output: '\nThe sky is blue because of the way light is refracted by the atmosphere. The atmosphere is a mixture of gases and particles that scatter light in'
[DEBUG] OV output: '\nThe sky is blue because of the way light is refracted by the atmosphere. The atmosphere is a medium that allows light to pass through it,'
[DEBUG] Match: False

This shows that model_gguf1 is qwen2.5-0.5b-instruct-q4_0.gguf and that tokenization is correct: the HF and OV input_ids are identical.
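To make the comparison in the log above easier to read, here is a small sketch that checks the token IDs and locates where the two generated strings start to differ. The values are copied from the log; the helper `common_prefix` is a hypothetical name, and the comparison is character-level, which is only an approximation of where the token sequences diverge.

```python
# Values copied from the debug log above.
hf_ids = [[151643, 8429, 279, 14722, 374, 8697, 30, 220, 151645]]
ov_ids = [[151643, 8429, 279, 14722, 374, 8697, 30, 220, 151645]]

hf_output = (
    "\nThe sky is blue because of the way light is refracted by the "
    "atmosphere. The atmosphere is a mixture of gases and particles "
    "that scatter light in"
)
ov_output = (
    "\nThe sky is blue because of the way light is refracted by the "
    "atmosphere. The atmosphere is a medium that allows light to pass "
    "through it,"
)


def common_prefix(a, b):
    """Return the longest string that both a and b start with."""
    n = 0
    for ch_a, ch_b in zip(a, b):
        if ch_a != ch_b:
            break
        n += 1
    return a[:n]


tokenization_matches = hf_ids == ov_ids
prefix = common_prefix(hf_output, ov_output)

print("Tokenization matches:", tokenization_matches)
print("Shared prefix:", repr(prefix))
print("HF continues with:", repr(hf_output[len(prefix):]))
print("OV continues with:", repr(ov_output[len(prefix):]))
```

Both outputs agree for the whole first sentence and only split partway through the second, which is consistent with identical inputs and a difference arising during generation rather than tokenization.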

@yangsu2022
Collaborator Author

Since tokenization is correct, I replaced qwen2.5-0.5b-instruct-q4_0.gguf with qwen2.5-1.5b-instruct-q4_0.gguf for further investigation.
The 1.5B model does not fail on special_tokens_with_text, but it produces 4 new failures on multiple_special_tokens:

FAILED tests\python_tests\test_gguf_reader.py::test_full_gguf_pipeline[model_gguf1-multiple_special_tokens-enable_save_ov_model=False-PipelineType.STATEFUL] - assert '\n <div cla...\n <h1' == 'çççççççççççç...ççççççççççççç'
FAILED tests\python_tests\test_gguf_reader.py::test_full_gguf_pipeline[model_gguf1-multiple_special_tokens-enable_save_ov_model=False-PipelineType.PAGED_ATTENTION] - assert '\n <div cla...\n <h1' == 'çççççççççççç...ççççççççççççç'
FAILED tests\python_tests\test_gguf_reader.py::test_full_gguf_pipeline[model_gguf1-multiple_special_tokens-enable_save_ov_model=True-PipelineType.STATEFUL] - assert '\n <div cla...\n <h1' == 'çççççççççççç...ççççççççççççç'
FAILED tests\python_tests\test_gguf_reader.py::test_full_gguf_pipeline[model_gguf1-multiple_special_tokens-enable_save_ov_model=True-PipelineType.PAGED_ATTENTION] - assert '\n <div cla...\n <h1' == 'çççççççççççç...ççççççççççççç'

@yangsu2022
Collaborator Author

Debug run for multiple_special_tokens with the 1.5B model:
python -m pytest -s -v "tests/python_tests/test_gguf_reader.py::test_full_gguf_pipeline[model_gguf1-multiple_special_tokens-enable_save_ov_model=False-PipelineType.STATEFUL]"

Log:
[DEBUG] Model: Qwen/Qwen2.5-1.5B-Instruct-GGUF
[DEBUG] Prompt: '<|endoftext|><|endoftext|><|im_end|>'
[DEBUG] Pipeline: PipelineType.STATEFUL
[DEBUG] HF input_ids: [[151643, 151643, 151645]]
[DEBUG] OV input_ids: [[151643, 151643, 151645]]
[DEBUG] HF output: '\n

\n
\n
\n <h1'
[DEBUG] OV output: 'çççççççççççççççççççççççççççççç'
[DEBUG] Match: False
