Conversation

Contributor

@openvino-dev-samples openvino-dev-samples commented Aug 20, 2025

What does this PR do?

Conversion command line for tencent/Hunyuan-7B-Instruct:

optimum-cli export openvino --model tencent/Hunyuan-7B-Instruct Hunyuan-7B-Instruct-ov --weight-format fp16 --task text-generation-with-past

Inference of Hunyuan-7B-Instruct using OpenVINO backend:

from optimum.intel.openvino import OVModelForCausalLM
from transformers import AutoTokenizer
import re

# Path to the model exported by the optimum-cli command above
model_path = "Hunyuan-7B-Instruct-ov"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = OVModelForCausalLM.from_pretrained(model_path)  # pass device="GPU" to run on an Intel GPU
messages = [
    {"role": "user", "content": "Write a short summary of the benefits of regular exercise"},
]
tokenized_chat = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
    enable_thinking=True,  # toggle thinking mode (default: True)
)

outputs = model.generate(tokenized_chat, max_new_tokens=2048)

output_text = tokenizer.decode(outputs[0])
print("output_text=", output_text)

think_pattern = r"<think>(.*?)</think>"
think_matches = re.findall(think_pattern, output_text, re.DOTALL)

answer_pattern = r"<answer>(.*?)</answer>"
answer_matches = re.findall(answer_pattern, output_text, re.DOTALL)

# Guard against missing tags instead of indexing an empty match list
think_content = think_matches[0].strip() if think_matches else ""
answer_content = answer_matches[0].strip() if answer_matches else ""
print(f"thinking_content:{think_content}\n\n")
print(f"answer_content:{answer_content}\n\n")
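The regex extraction above can be wrapped in a small helper that degrades gracefully when the model emits no `<think>` block. This is a sketch; `extract_tagged` is a name invented for illustration, not part of any library:

```python
import re

def extract_tagged(text, tag):
    """Return the stripped contents of the first <tag>...</tag> span, or '' if absent."""
    matches = re.findall(rf"<{tag}>(.*?)</{tag}>", text, re.DOTALL)
    return matches[0].strip() if matches else ""

# Example with a synthetic model output
sample = "<think> pondering </think><answer> 42 </answer>"
```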

Before submitting

  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

return dummy_inputs


class HunyuanDummyPastKeyValuesGenerator(DummyPastKeyValuesGenerator):
Collaborator

Why not use MistralDummyPastKeyValuesGenerator instead and set normalized_config.head_dim?
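For context, the model-specific piece such a dummy generator contributes is the shape of the past-key-values tensors, which is why reusing the Mistral generator with `head_dim` set on the normalized config can suffice. A minimal sketch of the shapes involved (the function name and arguments below are invented for illustration, not the actual optimum internals):

```python
# One (key, value) pair per decoder layer, each of shape
# (batch, num_kv_heads, seq_len, head_dim) for a decoder-only model.
def past_kv_shapes(batch_size, num_kv_heads, seq_len, head_dim, num_layers):
    shape = (batch_size, num_kv_heads, seq_len, head_dim)
    return [(shape, shape) for _ in range(num_layers)]

shapes = past_kv_shapes(batch_size=2, num_kv_heads=8, seq_len=16, head_dim=128, num_layers=4)
```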

self.random_float_tensor(shape, framework=framework, dtype=float_dtype),
)
for _ in range(self.num_layers)
]
Collaborator

Would you mind adding a test as well?

Contributor Author

Yes, I will add it once a release version of transformers supports this model.

Collaborator

@rkazants rkazants left a comment

@openvino-dev-samples
Contributor Author

please add tests for inference: https://github.com/huggingface/optimum-intel/blob/main/tests/openvino/test_modeling.py

I will add it after this PR

@rkazants
Collaborator

I will add it after this PR

Let us wait for that PR to be merged; then you can add tests to this PR. There is no need to have several PRs that separate implementation and tests. We need to make sure that inference works.

Best regards,
Roman

Comment on lines 230 to 234
<<<<<<< HEAD
"ernie4_5": 2,
"hunyuan_v1_dense": 2,
=======
>>>>>>> upstream/main
Collaborator

These are merge artifacts, please fix.


@register_in_tasks_manager("hunyuan_v1_dense", *["text-generation", "text-generation-with-past"], library_name="transformers")
class HunyuanOpenVINOConfig(TextDecoderWithPositionIdsOnnxConfig):
MIN_TRANSFORMERS_VERSION = "4.55.0.dev0"
Collaborator

Not sure that we need the dev0 suffix.

Contributor Author

Since optimum-intel does not support Transformers 4.56, this PR can only work with this Transformers 4.55 commit:

git+https://github.com/huggingface/transformers@4970b23cedaf745f963779b4eae68da281e8c6ca

Collaborator

Tests for modeling that cover the generate() method are needed as well.

Contributor Author

tests for modelling to test generate() method is needed as well

It's already covered in test_compare_to_transformers, I think.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@echarlaix
Collaborator

Hi @openvino-dev-samples, #1529 adds support for transformers v4.56, so we can merge this PR soon. It looks like some tests are currently failing; would you mind taking a look?

@openvino-dev-samples openvino-dev-samples changed the title [OpenVINO][Draft]support Hunyuan LLM [OpenVINO]support Hunyuan LLM Dec 3, 2025
Collaborator

@echarlaix echarlaix left a comment


Thanks a lot @openvino-dev-samples ! Waiting for #1541 to be merged before we can merge this PR, hopefully this can be done soon cc @rkazants

Comment on lines 137 to 139
if is_transformers_version("<", "4.56.0"):
SUPPORTED_ARCHITECTURES += ("qwen", "chatglm", "chatglm4")

Collaborator

Why is it needed? It looks to have been fixed during the transition to the latest transformers.

Contributor Author

It's just copied from the legacy version; let me remove it.

@rkazants rkazants requested a review from popovaan December 15, 2025 19:21