
Fix for issue #19933: langchain_huggingface #25136

Open · wants to merge 6 commits into base: master
Conversation

Soumil32 commented Aug 7, 2024

This partially fixes issue #19933, in which LangChain was not taking the steps needed to apply the correct formatting for different chat models.

The change involves passing the prompt to the pipeline in the ChatML format. This way the pipeline can automatically apply the correct chat formatting for the model.
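As a rough sketch of the idea (the tuple-based messages and helper below are stand-ins for illustration, not the actual langchain_core classes or the PR's code):

```python
# Hypothetical sketch: mapping LangChain-style messages to the
# role/content dicts that a transformers text-generation pipeline
# accepts as chat input, so the pipeline can apply the model's
# chat template itself.
ROLE_MAP = {"system": "system", "human": "user", "ai": "assistant"}

def to_chat_format(messages):
    """Map (message_type, content) pairs to pipeline chat dicts."""
    return [{"role": ROLE_MAP[kind], "content": text} for kind, text in messages]

chat = to_chat_format([
    ("system", "You are a translator from English to French."),
    ("human", "Hi, how are you?"),
])
print(chat[1])  # {'role': 'user', 'content': 'Hi, how are you?'}
```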

In addition, the argument return_full_text=False is now passed to the pipeline in huggingface_pipeline.py. This causes the pipeline to return only the newly generated text, whereas the default behaviour is to return the entire chat history, which we do not need. The output can then be parsed back into an AIMessage.
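A toy stand-in (not the transformers internals) to illustrate what return_full_text controls:

```python
# Toy illustration of the return_full_text flag on a text-generation
# pipeline: with True the output echoes the formatted prompt plus the
# generation; with False only the newly generated text comes back.
def pipeline_output(prompt, completion, return_full_text=True):
    full = prompt + completion
    return full if return_full_text else full[len(prompt):]

prompt = "<|user|>Hi, how are you?<|assistant|>"
out = pipeline_output(prompt, "Bonjour, comment allez-vous ?", return_full_text=False)
print(out)  # Bonjour, comment allez-vous ?
```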

This still only works when ChatHuggingFace is used. If either model_id or tokenizer is not set when initialising ChatHuggingFace, it defaults to the gpt2 tokenizer, which is what caused the issue in the first place. The changes I have made correctly handle parsing the model response back into an AIMessage. The initial issue may still need investigating, but this should work, as the tokenizer should be picked up from the pipeline passed into HuggingFacePipeline.

from langchain_huggingface import ChatHuggingFace, HuggingFacePipeline
from langchain_core.messages import HumanMessage, SystemMessage
from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM
import torch  # needed for torch.bfloat16 below

model_id = "hugging-quants/Meta-Llama-3.1-8B-Instruct-BNB-NF4"  # replace with your model
tokenizer = AutoTokenizer.from_pretrained(model_id)
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]
model = AutoModelForCausalLM.from_pretrained(
  model_id,
  torch_dtype=torch.bfloat16,
  low_cpu_mem_usage=True,
  device_map="auto",
)

pipe = pipeline(
    "text-generation", model=model, tokenizer=tokenizer, max_new_tokens=40, eos_token_id=terminators, pad_token_id=tokenizer.eos_token_id
)

messages = [
    SystemMessage(content="You are a translator which converts the users input from English to French. Never refer to yourself. Just provide the translated text."),
    HumanMessage(content="Hi, how are you?")
]


hf = HuggingFacePipeline(pipeline=pipe)
chat_model = ChatHuggingFace(llm=hf, model_id=model_id)

response = chat_model.invoke(messages)

print(response)

Sorry if my language isn't clear. This is my first time contributing to an OSS project. If you have any questions or clarification is needed, please say so! 😊

Edit: Did some more testing, and my changes fix the issue altogether rather than partially! model_id and tokenizer no longer need to be set.

@efriis efriis added the partner label Aug 7, 2024
@efriis efriis self-assigned this Aug 7, 2024
@dosubot dosubot bot added the size:S This PR changes 10-29 lines, ignoring generated files. label Aug 7, 2024

@Soumil32 changed the title from "Partial fix for issue [#19933](https://github.com/langchain-ai/langchain/issues/19933): langchain_huggingface" to "Partial fix for issue #19933: langchain_huggingface" Aug 7, 2024
@dosubot dosubot bot added langchain Related to the langchain package 🔌: huggingface Primarily related to HuggingFace integrations 🤖:improvement Medium size change to existing code to handle new use-cases labels Aug 7, 2024
@ccurme ccurme removed the langchain Related to the langchain package label Aug 7, 2024
@Soumil32 changed the title from "Partial fix for issue #19933: langchain_huggingface" to "Fix for issue #19933: langchain_huggingface" Aug 7, 2024
Soumil32 commented Aug 8, 2024

@efriis what's the status on this?

Soumil32 commented Aug 8, 2024

This also seems to fix issues #24437 and #22487.

Soumil32 commented

@ccurme Is there anything i still need to do?

ccurme (Collaborator) left a comment:
Could we add some tests to demonstrate the updated behavior?

@@ -269,6 +270,7 @@ def _generate(
        # Process batch of prompts
        responses = self.pipeline(
            batch_prompts,
+           return_full_text=False,
Collaborator commented on this line:

Can we do this in a separate issue / PR?

@ccurme ccurme added the needs test PR needs to be updated with tests label Aug 23, 2024
@efriis efriis assigned ccurme and unassigned efriis Aug 24, 2024
@efriis efriis self-assigned this Sep 23, 2024
Status: In review · 3 participants