
ChatHuggingFace cannot generate responses with functions bound by "bind_tools" #30453

Open
@Ying-Kang


Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

My minimal reproduction code is here:

from langchain_huggingface.llms import HuggingFacePipeline
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from langchain_huggingface import ChatHuggingFace
from langchain_core.tools import tool
import langchain
langchain.debug = True


@tool
def add(a: int, b: int) -> int:
    """Adds a and b."""
    return a + b


@tool
def multiply(a: int, b: int) -> int:
    """Multiplies a and b."""
    return a * b



def init_chat(model_path="pretrained_models/THUDM-glm-4-9b-chat"):
    # Load the GLM-4 chat model from a local checkpoint and wrap it in a
    # LangChain chat model via a transformers text-generation pipeline.
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True)
    pipe = pipeline(
        "text-generation",
        model=model,
        tokenizer=tokenizer,
        max_new_tokens=1280,
        temperature=0.1,
    )
    return ChatHuggingFace(llm=HuggingFacePipeline(pipeline=pipe), tokenizer=tokenizer)


llm = init_chat()

# Bind both tool schemas to the chat model.
llm_with_tools = llm.bind_tools([multiply, add])

print(llm_with_tools)

query = "What is 3 * 12? Also, what is 11 + 49?"

# Expected: an AIMessage whose tool_calls lists multiply/add invocations.
print(llm_with_tools.invoke(query))
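
For context, the follow-up step I expected to work afterwards (the standard tool-execution loop from the LangChain tool-calling tutorial; not part of the minimal repro above) is:

from langchain_core.messages import HumanMessage

messages = [HumanMessage(query)]
ai_msg = llm_with_tools.invoke(messages)
messages.append(ai_msg)
# Execute each requested tool and feed the ToolMessage results back.
for tool_call in ai_msg.tool_calls:
    selected_tool = {"add": add, "multiply": multiply}[tool_call["name"].lower()]
    messages.append(selected_tool.invoke(tool_call))

This loop never executes anything here, because ai_msg.tool_calls comes back empty.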

Error Message and Stack Trace (if applicable)

The code above produces the following output:

Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:01<00:00,  9.08it/s]
Device set to use cuda:0
bound=ChatHuggingFace(llm=HuggingFacePipeline(pipeline=<transformers.pipelines.text_generation.TextGenerationPipeline object at 0x7f7532e0c220>, model_id='pretrained_models/THUDM-glm-4-9b-chat'), tokenizer=ChatGLM4Tokenizer(name_or_path='pretrained_models/THUDM-glm-4-9b-chat', vocab_size=151329, model_max_length=128000, is_fast=False, padding_side='left', truncation_side='right', special_tokens={'eos_token': '<|endoftext|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|endoftext|>', '[MASK]', '[gMASK]', '[sMASK]', '<sop>', '<eop>', '<|system|>', '<|user|>', '<|assistant|>', '<|observation|>', '<|begin_of_image|>', '<|end_of_image|>', '<|begin_of_video|>', '<|end_of_video|>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
        151329: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151330: AddedToken("[MASK]", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151331: AddedToken("[gMASK]", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151332: AddedToken("[sMASK]", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151333: AddedToken("<sop>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151334: AddedToken("<eop>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151335: AddedToken("<|system|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151336: AddedToken("<|user|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151337: AddedToken("<|assistant|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151338: AddedToken("<|observation|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151339: AddedToken("<|begin_of_image|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151340: AddedToken("<|end_of_image|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151341: AddedToken("<|begin_of_video|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
        151342: AddedToken("<|end_of_video|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
}
), model_id='pretrained_models/THUDM-glm-4-9b-chat') kwargs={'tools': [{'type': 'function', 'function': {'name': 'multiply', 'description': 'Multiplies a and b.', 'parameters': {'properties': {'a': {'type': 'integer'}, 'b': {'type': 'integer'}}, 'required': ['a', 'b'], 'type': 'object'}}}, {'type': 'function', 'function': {'name': 'add', 'description': 'Adds a and b.', 'parameters': {'properties': {'a': {'type': 'integer'}, 'b': {'type': 'integer'}}, 'required': ['a', 'b'], 'type': 'object'}}}]} config={} config_factories=[]

[llm/start] [llm:ChatHuggingFace] Entering LLM run with input:
{
  "prompts": [
    "Human: What is 3 * 12? Also, what is 11 + 49?"
  ]
}
[llm/end] [llm:ChatHuggingFace] [3.10s] Exiting LLM run with output:
{
  "generations": [
    [
      {
        "text": "[gMASK]<sop><|user|>\nWhat is 3 * 12? Also, what is 11 + 49?<|assistant|>\n3 multiplied by 12 equals 36.\n\n11 plus 49 equals 60.",
        "generation_info": null,
        "type": "ChatGeneration",
        "message": {
          "lc": 1,
          "type": "constructor",
          "id": [
            "langchain",
            "schema",
            "messages",
            "AIMessage"
          ],
          "kwargs": {
            "content": "[gMASK]<sop><|user|>\nWhat is 3 * 12? Also, what is 11 + 49?<|assistant|>\n3 multiplied by 12 equals 36.\n\n11 plus 49 equals 60.",
            "type": "ai",
            "id": "run-d01e8000-b48c-4646-9523-3a9aa8276e17-0",
            "tool_calls": [],
            "invalid_tool_calls": []
          }
        }
      }
    ]
  ],
  "llm_output": null,
  "run": null,
  "type": "LLMResult"
}
content='[gMASK]<sop><|user|>\nWhat is 3 * 12? Also, what is 11 + 49?<|assistant|>\n3 multiplied by 12 equals 36.\n\n11 plus 49 equals 60.' additional_kwargs={} response_metadata={} id='run-d01e8000-b48c-4646-9523-3a9aa8276e17-0'

Description

As shown above, the tools defined with the @tool decorator are bound successfully, which I verified by printing llm_with_tools after the bind_tools call:


However, when I generate output with print(llm_with_tools.invoke(query)), the predefined functions multiply and add are never actually called: the returned AIMessage has an empty tool_calls list, even though the plain-text answers happen to be correct.

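For comparison, this is roughly the shape of AIMessage I expected from a tool-calling-capable model (illustrative values only; ids elided):

AIMessage(
    content="",
    tool_calls=[
        {"name": "multiply", "args": {"a": 3, "b": 12}, "id": "...", "type": "tool_call"},
        {"name": "add", "args": {"a": 11, "b": 49}, "id": "...", "type": "tool_call"},
    ],
)

Instead, content holds the raw chat-template text and tool_calls is [].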

I was simply following the tool-calling tutorial here:

Note: I use ChatHuggingFace because I want to initialize the LLM from a locally saved checkpoint.

I wonder whether this is a bug in ChatHuggingFace.

I'd appreciate any help getting ChatHuggingFace to perform real function calling.
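
In case it helps narrow things down, here is a workaround sketch I am considering: bypass bind_tools and inject the tool schemas directly via the tokenizer's chat template. This is only a sketch and assumes the GLM-4 tokenizer's template accepts the tools argument (supported by transformers apply_chat_template since v4.42); tokenizer and pipe are the objects created inside init_chat() above.

from langchain_core.utils.function_calling import convert_to_openai_tool

tool_schemas = [convert_to_openai_tool(t) for t in (multiply, add)]
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": query}],
    tools=tool_schemas,  # inject the tool definitions into the prompt
    add_generation_prompt=True,
    tokenize=False,
)
# If the template honors tools, the model should emit a tool call that
# then has to be parsed manually from the generated text.
print(pipe(prompt)[0]["generated_text"])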

System Info

My relevant package versions are listed below:

langchain                                0.3.21
langchain-community                      0.3.20
langchain-core                           0.3.47
langchain-huggingface                    0.1.2
langchain-openai                         0.3.8
langchain-text-splitters                 0.3.7
sentence-transformers                    3.4.1
transformers                             4.48.0

If you need the versions of any other packages to reproduce this bug, please let me know.
Thanks in advance.

Metadata

Assignees: no one assigned
Labels: investigate (Flagged for investigation)