
Enable streaming with get_chat_response() #79

Open
@Rumeysakeskin

Description


I want to use streaming together with chat history. This is my current setup:

# Import the Llama class of llama-cpp-python and the LlamaCppPythonProvider of llama-cpp-agent
from llama_cpp import Llama
from llama_cpp_agent.providers import LlamaCppPythonProvider
from llama_cpp_agent.chat_history import BasicChatHistory, BasicChatMessageStore, BasicChatHistoryStrategy


# Create an instance of the Llama class and load the model
llama_model = Llama("gemma-2-2b-it-IQ3_M.gguf", n_batch=1024, n_threads=10, n_gpu_layers=0)
# llama_model = Llama("gemma-2-9b-it-IQ2_M.gguf", n_batch=1024, n_threads=10, n_gpu_layers=40)


# Create the provider by passing the Llama class instance to the LlamaCppPythonProvider class
provider = LlamaCppPythonProvider(llama_model)

from llama_cpp_agent import LlamaCppAgent
from llama_cpp_agent import MessagesFormatterType
# Pass the provider to the LlamaCppAgent class and define the system prompt and a predefined message formatter
agent = LlamaCppAgent(provider,
                      system_prompt="You are a helpful assistant.",
                      predefined_messages_formatter_type=MessagesFormatterType.CHATML)


settings = provider.get_provider_default_settings()
settings.stream = True
settings.temperature = 0.1

# Create a message store for the chat history
chat_history_store = BasicChatMessageStore()

# Create the actual chat history, passing the desired chat history strategy: last_k_messages or last_k_tokens.
# The default strategy uses the last 20 messages. Here we use the last_k_tokens strategy, which includes
# the last k tokens in the chat history; this strategy requires passing the provider to the class.
chat_history = BasicChatHistory(message_store=chat_history_store, chat_history_strategy=BasicChatHistoryStrategy.last_k_tokens, k=7000, llm_provider=provider)

agent_output = agent.get_chat_response(
    "what can you do",  # translated from Turkish: "neler yapabiliyorsun"
    chat_history=chat_history,  # actually use the history created above
    llm_sampling_settings=settings,
)

print(agent_output.strip())
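Setting `settings.stream = True` alone only affects the provider; to consume tokens as they are produced, `get_chat_response()` needs to hand back a generator. Recent llama-cpp-agent versions expose `returns_streaming_generator` (and `print_output`) parameters for this — verify against the docs of your installed version, as these names are an assumption here. The consumption pattern is an ordinary Python generator loop, sketched below with a hypothetical `fake_token_stream()` standing in for the agent call so the snippet runs without a model:

```python
# The real call would look something like (parameter names assumed,
# check your llama-cpp-agent version's documentation):
#
#   stream = agent.get_chat_response(
#       "what can you do",
#       chat_history=chat_history,
#       llm_sampling_settings=settings,
#       returns_streaming_generator=True,
#       print_output=False,
#   )
#
# Below, a plain generator stands in for that stream.

def fake_token_stream():
    # Hypothetical stand-in for the generator returned by the agent.
    for token in ["I ", "can ", "answer ", "questions."]:
        yield token

chunks = []
for token in fake_token_stream():
    print(token, end="", flush=True)  # render each token as it arrives
    chunks.append(token)
print()

# Reassemble the full response once the stream is exhausted.
full_response = "".join(chunks).strip()
```

The key point is that streaming changes the return type: instead of calling `.strip()` on a string immediately, you iterate first and join the chunks afterwards.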
