
ChatHuggingFace cannot generate responses with functions bound by "bind_tools" #30453

Open
@Ying-Kang


Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

My minimal reproduction code is here:

from langchain_huggingface.llms import HuggingFacePipeline
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from langchain_huggingface import ChatHuggingFace
from langchain_core.tools import tool
import langchain
langchain.debug = True


@tool
def add(a: int, b: int) -> int:
    """Adds a and b."""
    return a + b


@tool
def multiply(a: int, b: int) -> int:
    """Multiplies a and b."""
    return a * b



def init_chat(model_path="pretrained_models/THUDM-glm-4-9b-chat"):
    # Load the GLM-4 chat model from a local checkpoint and wrap it in a
    # LangChain chat model via a transformers text-generation pipeline.
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True)
    pipe = pipeline(
        "text-generation",
        model=model,
        tokenizer=tokenizer,
        max_new_tokens=1280,
        temperature=0.1,
    )
    return ChatHuggingFace(llm=HuggingFacePipeline(pipeline=pipe), tokenizer=tokenizer)


llm = init_chat()

# Bind both tool schemas to the chat model.
llm_with_tools = llm.bind_tools([multiply, add])

print(llm_with_tools)

query = "What is 3 * 12? Also, what is 11 + 49?"

# Expected: an AIMessage whose tool_calls lists multiply/add invocations.
print(llm_with_tools.invoke(query))
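
For context, the follow-up step I expected to work afterwards (the standard tool-execution loop from the LangChain tool-calling tutorial; not part of the minimal repro above) is:

from langchain_core.messages import HumanMessage

messages = [HumanMessage(query)]
ai_msg = llm_with_tools.invoke(messages)
messages.append(ai_msg)
# Execute each requested tool and feed the ToolMessage results back.
for tool_call in ai_msg.tool_calls:
    selected_tool = {"add": add, "multiply": multiply}[tool_call["name"].lower()]
    messages.append(selected_tool.invoke(tool_call))

This loop never executes anything here, because ai_msg.tool_calls comes back empty.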

Error Message and Stack Trace (if applicable)

The code above produces the following output:

Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:01<00:00,  9.08it/s]
Device set to use cuda:0
bound=ChatHuggingFace(llm=HuggingFacePipeline(pipeline=<transformers.pipelines.text_generation.TextGenerationPipeline object at 0x7f7532e0c220>, model_id='pretrained_models/THUDM-glm-4-9b-chat'), tokenizer=ChatGLM4Tokenizer(name_or_path='pretrained_models/THUDM-glm-4-9b-chat', vocab_size=151329, model_max_length=128000, is_fast=False, padding_side='left', truncation_side='right', special_tokens={'eos_token': '<|endoftext|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|endoftext|>', '[MASK]', '[gMASK]', '[sMASK]', '<sop>', '<eop>', '<|system|>', '<|user|>', '<|assistant|>', '<|observation|>', '<|begin_of_image|>', '<|end_of_image|>', '<|begin_of_video|>', '<|end_of_video|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
        151329: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151330: AddedToken("[MASK]", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151331: AddedToken("[gMASK]", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151332: AddedToken("[sMASK]", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151333: AddedToken("<sop>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151334: AddedToken("<eop>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151335: AddedToken("<|system|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151336: AddedToken("<|user|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151337: AddedToken("<|assistant|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151338: AddedToken("<|observation|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151339: AddedToken("<|begin_of_image|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151340: AddedToken("<|end_of_image|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151341: AddedToken("<|begin_of_video|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151342: AddedToken("<|end_of_video|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
}
), model_id='pretrained_models/THUDM-glm-4-9b-chat') kwargs={'tools': [{'type': 'function', 'function': {'name': 'multiply', 'description': 'Multiplies a and b.', 'parameters': {'properties': {'a': {'type': 'integer'}, 'b': {'type': 'integer'}}, 'required': ['a', 'b'], 'type': 'object'}}}, {'type': 'function', 'function': {'name': 'add', 'description': 'Adds a and b.', 'parameters': {'properties': {'a': {'type': 'integer'}, 'b': {'type': 'integer'}}, 'required': ['a', 'b'], 'type': 'object'}}}]} config={} config_factories=[]

[llm/start] [llm:ChatHuggingFace] Entering LLM run with input:
{
  "prompts": [
    "Human: What is 3 * 12? Also, what is 11 + 49?"
  ]
}
[llm/end] [llm:ChatHuggingFace] [3.10s] Exiting LLM run with output:
{
  "generations": [
    [
      {
        "text": "[gMASK]<sop><|user|>\nWhat is 3 * 12? Also, what is 11 + 49?<|assistant|>\n3 multiplied by 12 equals 36.\n\n11 plus 49 equals 60.",
        "generation_info": null,
        "type": "ChatGeneration",
        "message": {
          "lc": 1,
          "type": "constructor",
          "id": [
            "langchain",
            "schema",
            "messages",
            "AIMessage"
          ],
          "kwargs": {
            "content": "[gMASK]<sop><|user|>\nWhat is 3 * 12? Also, what is 11 + 49?<|assistant|>\n3 multiplied by 12 equals 36.\n\n11 plus 49 equals 60.",
            "type": "ai",
            "id": "run-d01e8000-b48c-4646-9523-3a9aa8276e17-0",
            "tool_calls": [],
            "invalid_tool_calls": []
          }
        }
      }
    ]
  ],
  "llm_output": null,
  "run": null,
  "type": "LLMResult"
}
content='[gMASK]<sop><|user|>\nWhat is 3 * 12? Also, what is 11 + 49?<|assistant|>\n3 multiplied by 12 equals 36.\n\n11 plus 49 equals 60.' additional_kwargs={} response_metadata={} id='run-d01e8000-b48c-4646-9523-3a9aa8276e17-0'

Description

As shown above, the tools defined with the @tool decorator are bound successfully, which I verified by printing llm_with_tools after the bind_tools call:


However, when I generate output with print(llm_with_tools.invoke(query)), the predefined functions multiply and add are never actually called: the returned AIMessage has an empty tool_calls list, even though the plain-text answers happen to be correct.

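For comparison, this is roughly the shape of AIMessage I expected from a tool-calling-capable model (illustrative values only; ids elided):

AIMessage(
    content="",
    tool_calls=[
        {"name": "multiply", "args": {"a": 3, "b": 12}, "id": "...", "type": "tool_call"},
        {"name": "add", "args": {"a": 11, "b": 49}, "id": "...", "type": "tool_call"},
    ],
)

Instead, content holds the raw chat-template text and tool_calls is [].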

I was simply following the tool-calling tutorial here:

Note: I use ChatHuggingFace because I want to initialize the LLM from a locally saved checkpoint.

I wonder whether this is a bug in ChatHuggingFace.

I'd appreciate any help getting ChatHuggingFace to perform real function calling.
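
In case it helps narrow things down, here is a workaround sketch I am considering: bypass bind_tools and inject the tool schemas directly via the tokenizer's chat template. This is only a sketch and assumes the GLM-4 tokenizer's template accepts the tools argument (supported by transformers apply_chat_template since v4.42); tokenizer and pipe are the objects created inside init_chat() above.

from langchain_core.utils.function_calling import convert_to_openai_tool

tool_schemas = [convert_to_openai_tool(t) for t in (multiply, add)]
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": query}],
    tools=tool_schemas,  # inject the tool definitions into the prompt
    add_generation_prompt=True,
    tokenize=False,
)
# If the template honors tools, the model should emit a tool call that
# then has to be parsed manually from the generated text.
print(pipe(prompt)[0]["generated_text"])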

System Info

My relevant package versions are listed below:

langchain                                0.3.21
langchain-community                      0.3.20
langchain-core                           0.3.47
langchain-huggingface                    0.1.2
langchain-openai                         0.3.8
langchain-text-splitters                 0.3.7
sentence-transformers                    3.4.1
transformers                             4.48.0

If you need the versions of any other packages to reproduce this bug, please let me know.
Thanks in advance.

Metadata

Assignees: no one assigned
Labels: investigate (Flagged for investigation)