
Releases: deepset-ai/haystack

v2.15.1

30 Jun 14:26

Bug Fixes

  • Fix _convert_streaming_chunks_to_chat_message, which is used to convert Haystack StreamingChunks into a Haystack ChatMessage. This fixes the scenario where one StreamingChunk contains two ToolCallDeltas in StreamingChunk.tool_calls: both ToolCallDeltas are now saved correctly, whereas before they overwrote each other. This only occurs with some LLM providers, such as Mistral (and not OpenAI), due to how the provider returns tool calls.

v2.15.1-rc1

30 Jun 13:54
Pre-release

v2.15.0

26 Jun 10:52

⭐️ Highlights

Parallel Tool Calling for Faster Agents

  • ToolInvoker now processes all tool calls passed to run or run_async in parallel using an internal ThreadPoolExecutor, reducing the time spent on sequential tool invocations.
  • Because multiple tool calls are batched and executed concurrently, Agents can run complex pipelines with lower latency. A minimal sketch follows this list.
  • You no longer need to pass an async_executor: ToolInvoker manages its own executor, configurable via the max_workers init parameter.
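
A minimal sketch of the new behavior (the add tool and its schema are illustrative, not part of the release):

```python
from haystack.components.tools import ToolInvoker
from haystack.dataclasses import ChatMessage, ToolCall
from haystack.tools import Tool

def add(a: int, b: int) -> int:
    return a + b

add_tool = Tool(
    name="add",
    description="Add two integers",
    parameters={
        "type": "object",
        "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
        "required": ["a", "b"],
    },
    function=add,
)

# max_workers bounds the internal ThreadPoolExecutor; the two tool calls
# below are dispatched concurrently rather than one after the other.
invoker = ToolInvoker(tools=[add_tool], max_workers=4)
message = ChatMessage.from_assistant(tool_calls=[
    ToolCall(tool_name="add", arguments={"a": 1, "b": 2}),
    ToolCall(tool_name="add", arguments={"a": 3, "b": 4}),
])
print(invoker.run(messages=[message])["tool_messages"])
```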

Introducing LLMMessagesRouter

The new LLMMessagesRouter component classifies and routes incoming ChatMessage objects to different connections using a generative LLM. It can be used with general-purpose LLMs as well as with specialized moderation LLMs like Llama Guard.

Usage example:

```python
from haystack.components.generators.chat import HuggingFaceAPIChatGenerator
from haystack.components.routers.llm_messages_router import LLMMessagesRouter
from haystack.dataclasses import ChatMessage

chat_generator = HuggingFaceAPIChatGenerator(
    api_type="serverless_inference_api",
    api_params={"model": "meta-llama/Llama-Guard-4-12B", "provider": "groq"},
)

router = LLMMessagesRouter(
    chat_generator=chat_generator,
    output_names=["unsafe", "safe"],
    output_patterns=["unsafe", "safe"],
)

print(router.run([ChatMessage.from_user("How to rob a bank?")]))
```

New HuggingFaceTEIRanker Component

HuggingFaceTEIRanker enables end-to-end reranking via the Text Embeddings Inference (TEI) API. It supports both self-hosted TEI services and Hugging Face Inference Endpoints, giving you flexible, high-quality reranking out of the box.
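
A hedged usage sketch, assuming a TEI reranking service running locally and the documented url and top_k init parameters:

```python
from haystack.components.rankers import HuggingFaceTEIRanker
from haystack.dataclasses import Document

# Assumes a TEI reranking model (e.g. a bge-reranker) served at this URL
ranker = HuggingFaceTEIRanker(url="http://localhost:8080", top_k=2)

docs = [
    Document(content="Berlin is the capital of Germany."),
    Document(content="Paris is the capital of France."),
]
result = ranker.run(query="What is the capital of France?", documents=docs)
print(result["documents"])
```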

🚀 New Features

  • Added a ComponentInfo dataclass to haystack to store information about the component. We pass it to StreamingChunk so we can tell from which component a stream is coming.

  • Pass the component_info to the StreamingChunk in the OpenAIChatGenerator, AzureOpenAIChatGenerator, HuggingFaceAPIChatGenerator, HuggingFaceGenerator, HuggingFaceLocalGenerator and HuggingFaceLocalChatGenerator.

  • Added the enable_streaming_callback_passthrough parameter to the init, run and run_async methods of ToolInvoker. If set to True, ToolInvoker will pass the streaming_callback function to a tool's invoke method, but only if that method has streaming_callback in its signature. A short sketch follows.
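
A short sketch of the flag (the echo tool is illustrative):

```python
from haystack.components.tools import ToolInvoker
from haystack.tools import Tool

def echo(text: str) -> str:
    return text

echo_tool = Tool(
    name="echo",
    description="Echo the input text",
    parameters={
        "type": "object",
        "properties": {"text": {"type": "string"}},
        "required": ["text"],
    },
    function=echo,
)

# With the flag on, ToolInvoker forwards streaming_callback to a tool's
# invoke method only when that method declares a streaming_callback parameter.
invoker = ToolInvoker(tools=[echo_tool], enable_streaming_callback_passthrough=True)
```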

  • Added dedicated finish_reason field to StreamingChunk class to improve type safety and enable sophisticated streaming UI logic. The field uses a FinishReason type alias with standard values: "stop", "length", "tool_calls", "content_filter", plus Haystack-specific value "tool_call_results" (used by ToolInvoker to indicate tool execution completion).

  • Updated ToolInvoker component to use the new finish_reason field when streaming tool results. The component now sets finish_reason="tool_call_results" in the final streaming chunk to indicate that tool execution has completed, while maintaining backward compatibility by also setting the value in meta["finish_reason"].

  • Added a raise_on_failure boolean parameter to OpenAIDocumentEmbedder and AzureOpenAIDocumentEmbedder. If set to True, the component raises an exception when an API request fails. It defaults to False, so the previous behavior of logging an exception and continuing remains the default.
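
For example (assumes OPENAI_API_KEY is set in the environment):

```python
from haystack.components.embedders import OpenAIDocumentEmbedder

# raise_on_failure=True surfaces API errors as exceptions instead of the
# default log-and-continue behavior.
embedder = OpenAIDocumentEmbedder(raise_on_failure=True)
```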

  • Added AsyncHFTokenStreamingHandler for async streaming support in HuggingFaceLocalChatGenerator.

  • For HuggingFaceAPIGenerator and HuggingFaceAPIChatGenerator, all additional key-value pairs passed in api_params are now forwarded to the initialization of the underlying inference clients. This allows passing additional parameters such as timeout, headers, or provider, so you can easily specify a different inference provider via the provider key in api_params.
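
A sketch of passing extra client parameters through api_params (the provider and timeout values here are illustrative):

```python
from haystack.components.generators.chat import HuggingFaceAPIChatGenerator

# Keys other than "model" are forwarded to the underlying InferenceClient
generator = HuggingFaceAPIChatGenerator(
    api_type="serverless_inference_api",
    api_params={
        "model": "meta-llama/Llama-3.1-8B-Instruct",
        "provider": "together",
        "timeout": 30.0,
    },
)
```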

  • Updated StreamingChunk to add the fields tool_calls, tool_call_result, index, and start to make it easier to format the stream in a streaming callback (see the sketch after this list).

    • Added new dataclass ToolCallDelta for the StreamingChunk.tool_calls field to reflect that the arguments can be a string delta.
    • Updated print_streaming_chunk and _convert_streaming_chunks_to_chat_message utility methods to use these new fields. This especially improves the formatting when using print_streaming_chunk with Agent.
    • Updated OpenAIGenerator, OpenAIChatGenerator, HuggingFaceAPIGenerator, HuggingFaceAPIChatGenerator, HuggingFaceLocalGenerator and HuggingFaceLocalChatGenerator to follow the new dataclasses.
    • Updated ToolInvoker to follow the StreamingChunk dataclass.
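
A hedged sketch of a streaming callback built on the new fields (attribute access follows the descriptions above; ToolCallDelta is assumed to expose tool_name and arguments as optional string deltas):

```python
from haystack.dataclasses import StreamingChunk

def print_chunk(chunk: StreamingChunk) -> None:
    # Plain text deltas
    if chunk.content:
        print(chunk.content, end="", flush=True)
    # Tool-call deltas: arguments may arrive as partial JSON strings
    if chunk.tool_calls:
        for delta in chunk.tool_calls:
            if delta.tool_name:
                print(f"\n[tool: {delta.tool_name}] ", end="")
            if delta.arguments:
                print(delta.arguments, end="")
    # The dedicated finish_reason field replaces digging into chunk.meta
    if chunk.finish_reason == "tool_call_results":
        print("\n[tool execution finished]")
```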

⚡️ Enhancement Notes

  • Added a new deserialize_component_inplace function to handle generic component deserialization that works with any component type.

  • Made doc-parser a core dependency, since ComponentTool, which uses it, is one of the core Tool components.

  • Made the PipelineBase().validate_input method public so users can call it with the confidence that it won't receive breaking changes without warning. The method checks that all required connections in a pipeline have a connection and is called automatically in the run method of Pipeline; it is now exposed publicly for users who want to validate a pipeline before runtime.
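
A small sketch, assuming the now-public method keeps the validate_input(data) signature used internally by Pipeline.run:

```python
from haystack import Pipeline
from haystack.components.builders import PromptBuilder

pipe = Pipeline()
pipe.add_component("builder", PromptBuilder(template="Answer: {{ question }}"))

# Raises if a required input is missing or an expected connection is absent;
# Pipeline.run() performs the same check before executing.
pipe.validate_input({"builder": {"question": "What is Haystack?"}})
```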

  • For component run Datadog tracing, set the span resource name to the component name instead of the operation name.

  • Added a trust_remote_code parameter to the SentenceTransformersSimilarityRanker component. When set to True, this enables execution of custom models and scripts hosted on the Hugging Face Hub.

  • Added a new parameter require_tool_call_ids to ChatMessage.to_openai_dict_format. The default is True, for compatibility with OpenAI's Chat API: if the id field is missing in a Tool Call, an error is raised. Setting it to False is useful for shallow OpenAI-compatible APIs, where the id field is not required.
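
For example:

```python
from haystack.dataclasses import ChatMessage, ToolCall

# A tool call without an id, as some shallow OpenAI-compatible APIs return
msg = ChatMessage.from_assistant(
    tool_calls=[ToolCall(tool_name="search", arguments={"query": "haystack"})]
)

# The default (require_tool_call_ids=True) would raise because id is missing
payload = msg.to_openai_dict_format(require_tool_call_ids=False)
```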

  • Haystack's core modules are now "type complete", meaning that all function parameters and return types are explicitly annotated. This increases the usefulness of the newly added py.typed marker and sidesteps differences in type inference between the various type checker implementations.

  • HuggingFaceAPIChatGenerator now uses the util method _convert_streaming_chunks_to_chat_message. This keeps the way we convert StreamingChunks into a final ChatMessage consistent across generators.

    • If only system messages are provided as input, a warning is logged indicating that this is likely not intended and that user messages should probably also be provided.

⚠️ Deprecation Notes

  • The async_executor parameter in ToolInvoker is deprecated in favor of the max_workers parameter and will be removed in Haystack 2.16.0. Use max_workers to control the number of threads used for parallel tool calling.

🐛 Bug Fixes

  • Fixed the to_dict and from_dict methods of ToolInvoker to properly serialize the streaming_callback init parameter.
  • Fixed a bug where, if raise_on_failure=False and an error occurred mid-batch, the following embeddings would be paired with the wrong documents.
  • Fixed the component_invoker used by ComponentTool to work when a dataclass like ChatMessage is passed directly to component_tool.invoke(...). Previously this would either cause an error or silently skip the input.
  • Fixed a bug in the LLMMetadataExtractor that occurred when processing Document objects with None or empty string content. The component now gracefully handles these cases by marking such documents as failed and providing an appropriate error message in their metadata, without attempting an LLM call.
  • RecursiveDocumentSplitter now generates a unique Document.id for every chunk. The meta fields (split_id, parent_id, etc.) are populated before Document creation, so the hash used for id generation is always unique.
  • Fixed the to_dict and from_dict methods of ConditionalRouter to properly handle the case where output_type is a List of types or a List of strings. This occurs when a user specifies a route in ConditionalRouter with multiple outputs.
  • Fix serialization of GeneratedAnswer when ChatMessage objects are nested in meta.
  • Fixed the serialization of ComponentTool and Tool when outputs_to_string is specified. Previously, if outputs_to_string was not None, an error occurred on deserialization right after serializing.
  • When calling set_output_types we now also check that the decorator @component.output_types is not present on the run_async method of a Component. Previously we only checked that the Component.run method did not possess the decorator.
  • Fix type comparison in schema validation by replacing is not with != when checking the type List[ChatMessage]. This prevents false mismatches due to Python's is operator comparing object identity instead of equality.
  • Re-export symbols in __init__.py files. This ensures that short imports like from haystack.components.builders import ChatPromptBuilder work equivalently to from haystack.components.builders.chat_prompt_builder import ChatPromptBuilder, without causing errors or warnings in mypy/Pylance.
  • The SuperComponent class can now correctly serialize and deserialize a SuperComponent based on an async pipeline. Previously, the SuperComponent class always assumed the underlying pipeline was synchronous.

v2.15.0-rc1

25 Jun 13:16
Pre-release

v2.14.3

19 Jun 10:29
e3ce992

Bug Fixes

  • Fixed the to_dict and from_dict methods of ConditionalRouter to properly handle the case where output_type is a List of types or a List of strings. This occurs when a user specifies a route in ConditionalRouter with multiple outputs.
  • Fixed the serialization of ComponentTool and Tool when outputs_to_string is specified. Previously, if outputs_to_string was not None, an error occurred on deserialization right after serializing.

v2.14.3-rc1

18 Jun 15:03
Pre-release

v2.14.2

04 Jun 09:10
0e3d436

Bug Fixes

  • Fixed a bug in OpenAIDocumentEmbedder and AzureOpenAIDocumentEmbedder where, if an OpenAI API error occurred mid-batch, the following embeddings would be paired with the wrong documents.

New Features

  • Added a raise_on_failure boolean parameter to OpenAIDocumentEmbedder and AzureOpenAIDocumentEmbedder. If set to True, the component raises an exception when an API request fails. It defaults to False, so the previous behavior of logging an exception and continuing remains the default.

v2.14.2-rc1

03 Jun 11:47
Pre-release

v2.14.1

30 May 12:55
045603d

Bug Fixes

  • Fixed a mypy issue in the OpenAIChatGenerator and its handling of stream responses. This issue only occurs with mypy >=1.16.0.
  • Fix type comparison in schema validation by replacing is not with != when checking the type List[ChatMessage]. This prevents false mismatches due to Python's is operator comparing object identity instead of equality.

v2.14.1-rc1

30 May 10:00
5c2e244
Pre-release