Add release notes for Haystack v2.22.0 (#492)

HaystackBot · anakin87 · web-flow · commit 0a538001bc7b · 2026-01-08T15:50:48.000+01:00
Co-authored-by: anakin87 <44616784+anakin87@users.noreply.github.com>
diff --git a/content/release-notes/v2.22.0.md b/content/release-notes/v2.22.0.md
@@ -0,0 +1,150 @@
+---
+title: Haystack 2.22.0
+description: Release notes for Haystack 2.22.0
+toc: True
+date: 2026-01-08
+last_updated: 2026-01-08
+tags: ["Release Notes"]
+link: https://github.com/deepset-ai/haystack/releases/tag/v2.22.0
+---
+
+## ⭐️ Highlights
+
+### ✂️ Smarter Document Chunking with Embedding-Based Splitting
+Introducing the new [EmbeddingBasedDocumentSplitter](https://docs.haystack.deepset.ai/docs/embeddingbaseddocumentsplitter), a component that takes an embedder and splits documents based on semantic similarity rather than fixed sizes or rules.
+
+```python
+from haystack.components.embedders import SentenceTransformersDocumentEmbedder
+from haystack.components.preprocessors import EmbeddingBasedDocumentSplitter
+
+# Initialize an embedder to calculate semantic similarities
+embedder = SentenceTransformersDocumentEmbedder()
+
+# Configure the splitter with parameters that control splitting behavior
+splitter = EmbeddingBasedDocumentSplitter(
+    document_embedder=embedder,
+    sentences_per_group=2,      # Group 2 sentences before calculating embeddings
+    percentile=0.95,            # Split when cosine distance exceeds 95th percentile
+    min_length=50,              # Merge splits shorter than 50 characters
+    max_length=1000             # Further split chunks longer than 1000 characters
+)
+# `doc` is assumed to be a Haystack `Document` created beforehand
+result = splitter.run(documents=[doc])
+```
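+
+To make the `percentile` parameter more concrete, here is a minimal plain-Python sketch of the general idea. It is an illustration only, not Haystack's implementation: the helper names `cosine_distance` and `split_points`, the toy 2-D vectors, and the nearest-rank percentile are all assumptions, and the example uses `percentile=0.5` because it has only three distances.
+
+```python
+import math
+
+
+def cosine_distance(a, b):
+    """Cosine distance (1 - cosine similarity) between two non-zero vectors."""
+    dot = sum(x * y for x, y in zip(a, b))
+    norm_a = math.sqrt(sum(x * x for x in a))
+    norm_b = math.sqrt(sum(y * y for y in b))
+    return 1.0 - dot / (norm_a * norm_b)
+
+
+def split_points(embeddings, percentile):
+    """Hypothetical helper: indices of the groups that start a new chunk."""
+    distances = [
+        cosine_distance(embeddings[i], embeddings[i + 1])
+        for i in range(len(embeddings) - 1)
+    ]
+    if not distances:
+        return []
+    ranked = sorted(distances)
+    # Nearest-rank percentile: the smallest value covering `percentile` of the data
+    rank = max(1, math.ceil(percentile * len(ranked)))
+    threshold = ranked[rank - 1]
+    # Split wherever consecutive groups are farther apart than the threshold
+    return [i + 1 for i, d in enumerate(distances) if d > threshold]
+
+
+# Toy embeddings: groups 0-1 point one way, groups 2-3 another
+group_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
+print(split_points(group_embeddings, percentile=0.5))  # [2]
+```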
+
+### 🔥 `warm_up` Runs Automatically on First Use
+
+Components that define a `warm_up` method now run it automatically on first execution, removing the need for manual calls and preventing errors in standalone usage.
+```python
+from haystack.components.embedders import SentenceTransformersTextEmbedder
+
+text_embedder = SentenceTransformersTextEmbedder()
+# text_embedder.warm_up()  # ❌ No longer needed
+print(text_embedder.run("I love pizza!"))
+
+# {'embedding': [-0.07804739475250244, 0.1498992145061493, ...]}
+```
+
+### 🛠️ Multiple Tool String Outputs with `outputs_to_string`
+Tools can now expose multiple string outputs via the new `outputs_to_string` configuration, giving you fine-grained control over how tool results are surfaced to the LLM, without changing the underlying tool logic.
+
+  ```python
+  def format_documents(documents):
+      return "\n".join(f"{i+1}. Document: {doc.content}" for i, doc in enumerate(documents))
+
+  def format_summary(metadata):
+      return f"Found {metadata['count']} results"
+
+  tool = Tool(
+      name="search",
+      description="Search for documents",
+      parameters={...},
+      function=search_func,  # Returns {"documents": [Document(...)], "metadata": {"count": 5}, "debug_info": {...}}
+      outputs_to_string={
+          "formatted_docs": {"source": "documents", "handler": format_documents},
+          "summary": {"source": "metadata", "handler": format_summary}
+          # Note: "debug_info" is not included, so it won't be converted to a string
+      }
+  )
+
+  # After the tool invocation, the tool result includes:
+  # {
+  #     "formatted_docs": "1. Document: ...\n2. Document: ...",
+  #     "summary": "Found 5 results"
+  # }
+  ```
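+
+Conceptually, each entry in `outputs_to_string` picks one key of the tool's raw result (`source`) and passes it through a `handler`. The sketch below imitates that mapping in plain Python. `apply_outputs_to_string`, `format_titles`, and the sample data are hypothetical names made up for illustration; they are not part of Haystack, and the actual `ToolInvoker` logic may differ.
+
+```python
+def format_titles(titles):
+    return "\n".join(f"{i + 1}. {title}" for i, title in enumerate(titles))
+
+
+def format_summary(metadata):
+    return f"Found {metadata['count']} results"
+
+
+def apply_outputs_to_string(raw_result, outputs_to_string):
+    """Hypothetical helper: run each handler on its configured source key."""
+    return {
+        name: config["handler"](raw_result[config["source"]])
+        for name, config in outputs_to_string.items()
+    }
+
+
+raw_result = {
+    "documents": ["Intro to Haystack", "Building agents"],
+    "metadata": {"count": 2},
+    "debug_info": {"latency_ms": 12},  # not referenced below, so never stringified
+}
+config = {
+    "formatted_docs": {"source": "documents", "handler": format_titles},
+    "summary": {"source": "metadata", "handler": format_summary},
+}
+
+strings = apply_outputs_to_string(raw_result, config)
+print(strings["summary"])  # Found 2 results
+```
+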
+### 🐍 Python 3.10+ Only
+Haystack now requires Python 3.10 or later, as Python 3.9 reached End of Life (EOL) in October 2025.
+
+## ⬆️ Upgrade Notes
+- `HuggingFaceLocalChatGenerator` now uses `Qwen/Qwen3-0.6B` as the default model, replacing the previous default.
+
+## ⚡️ Enhancement Notes
+
+- The parameters `query_suffix` and `document_suffix` have been added to `SentenceTransformersSimilarityRanker` to support the Qwen3 reranker model family.
+
+  Here is an example that uses these parameters with Qwen3-Reranker-0.6B:
+
+  ```python
+  from haystack import Document
+  from haystack.components.rankers.sentence_transformers_similarity import SentenceTransformersSimilarityRanker
+
+  ranker = SentenceTransformersSimilarityRanker(
+      model="tomaarsen/Qwen3-Reranker-0.6B-seq-cls",
+      query_prefix='<|im_start|>system\nJudge whether the Document meets the requirements based on the Query and the Instruct provided. Note that the answer can only be "yes" or "no".<|im_end|>\n<|im_start|>user\n<Instruct>: Given a web search query, retrieve relevant passages that answer the query\n<Query>: ',
+      query_suffix="\n",
+      document_prefix="<Document>: ",
+      document_suffix="<|im_end|>\n<|im_start|>assistant\n<think>\n\n</think>\n\n",
+  )
+
+  result = ranker.run(
+      query="Which planet is known as the Red Planet?",
+      documents=[
+          Document(content="Venus is often called Earth's twin because of its similar size and proximity."),
+          Document(content="Mars, known for its reddish appearance, is often referred to as the Red Planet."),
+          Document(content="Jupiter, the largest planet in our solar system, has a prominent red spot."),
+          Document(content="Saturn, famous for its rings, is sometimes mistaken for the Red Planet."),
+      ],
+  )
+
+  print(result)
+  ```
+
+  NOTE: This only works with the Qwen3 reranker models that use the sequence classification architecture. For example, you can find some on `tomaarsen`'s Hugging Face profile.
+
+- Added reasoning content support to `HuggingFaceAPIChatGenerator`. The component now extracts reasoning content from models that support chain-of-thought reasoning (e.g., DeepSeek R1). Both streaming and non-streaming modes are supported. Access it via `reply.reasoning.reasoning_text`.
+
+- When an Agent runs as part of a Pipeline, the agent's tracing span now uses the component span as its parent. This enables proper nested trace visualization in tracing tools like Datadog, Braintrust, or OpenTelemetry backends.
+
+- The `_handle_async_stream_response()` method in `OpenAIChatGenerator` now handles `asyncio.CancelledError` exceptions. When a streaming task is cancelled mid-stream, the async for loop gracefully closes the stream using `asyncio.shield()` to ensure the cleanup operation completes even during cancellation.
+
+- A new `enable_thinking` parameter has been added to enable thinking mode in chat templates for thinking-capable models, allowing them to generate intermediate reasoning steps before producing final responses.
+
+- Added support for PEP 604 type syntax. When defining types in components, you can now use `X | Y` instead of `Union[X, Y]` and `X | None` instead of `Optional[X]`. The codebase has been migrated to the new syntax, but both syntaxes are fully supported.
+
+- Support Multiple Tool String Outputs
+
+  Added support for tools to define multiple string outputs using the `outputs_to_string` configuration. This allows users to specify how different parts of a tool's output should be converted to strings, enhancing flexibility in handling tool results.
+
+  - Updated `ToolInvoker` to handle multiple output configurations.
+  - Updated `Tool` to validate and store multiple output configurations.
+  - Added tests to verify the functionality of multiple string outputs.
+
+  This enables tools to provide rich, varied context to language models or downstream components without requiring multiple tool calls, while keeping full control over which outputs are stringified.
+
+- Added validation for `inputs_from_state` and `outputs_to_state` parameters in the `Tool` class. Tools now validate at construction time that state mappings reference valid tool parameters and outputs, catching configuration errors early instead of at runtime. The validation uses function introspection and JSON schema to ensure parameter names exist, and subclasses like `ComponentTool` validate against component input/output sockets.
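+
+The construction-time check described in the last item can be sketched with function introspection. This is a simplified illustration, not Haystack's code: `validate_inputs_from_state` is a hypothetical helper, and it assumes that `inputs_from_state` maps state keys to tool parameter names.
+
+```python
+import inspect
+
+
+def validate_inputs_from_state(function, inputs_from_state: dict[str, str] | None) -> None:
+    """Hypothetical check: every mapped parameter must exist on the function."""
+    if inputs_from_state is None:
+        return
+    valid_params = set(inspect.signature(function).parameters)
+    for state_key, param_name in inputs_from_state.items():
+        if param_name not in valid_params:
+            raise ValueError(
+                f"inputs_from_state maps '{state_key}' to unknown parameter "
+                f"'{param_name}'. Valid parameters: {sorted(valid_params)}"
+            )
+
+
+def search(query: str, top_k: int = 5) -> dict:
+    return {"documents": [], "metadata": {"count": 0}}
+
+
+# A mapping that targets an existing parameter passes silently
+validate_inputs_from_state(search, {"user_query": "query"})
+
+# A typo in the parameter name is caught before the tool ever runs
+try:
+    validate_inputs_from_state(search, {"user_query": "question"})
+except ValueError as error:
+    print(error)
+```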
+
+## 🐛 Bug Fixes
+
+- Improved error messages in `ConditionalRouter` when non-string values are provided as route outputs. Users now receive clear guidance (e.g., "use '2' instead of 2") instead of the cryptic "Can't compile non template nodes" error.
+- Fixes Jinja2 variable detection in `ConditionalRouter`, `ChatPromptBuilder`, `PromptBuilder` and `OutputAdapter` by properly skipping variables that are assigned within the template. Previously, in specific scenarios, variables assigned within a template were incorrectly picked up as input variables to the component. For more information, see the parent issue in the Jinja2 library: <https://github.com/pallets/jinja/issues/2069>
+- Fixes deserializing an instance of `NamedEntityExtractor` when `pipeline_kwargs` is stored in the deserialization dict with the value of `None`.
+- When creating an HTTP client object from a dictionary, we now convert the `limits` parameter to an `httpx.Limits` object to avoid an `AttributeError`.
+- Raise a `ValueError` when an async function is passed to the `Tool` class. Async functions are not supported as tools. This change provides a clear error message instead of silent failures where coroutines are never awaited.
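+
+To illustrate why async functions used to fail silently, and how such a guard might look, here is a minimal sketch. `ensure_sync`, `fetch_sync`, and `fetch_async` are hypothetical names for illustration only; the check Haystack actually performs inside `Tool` may be implemented differently.
+
+```python
+import inspect
+
+
+def ensure_sync(function):
+    """Hypothetical guard: reject coroutine functions up front."""
+    if inspect.iscoroutinefunction(function):
+        raise ValueError(
+            f"'{function.__name__}' is an async function, which is not supported as a tool."
+        )
+    return function
+
+
+def fetch_sync(query: str) -> str:
+    return f"results for {query}"
+
+
+async def fetch_async(query: str) -> str:
+    return f"results for {query}"
+
+
+# Calling an async function without awaiting it returns a coroutine object,
+# not the result, which is how the silent failure happens
+pending = fetch_async("haystack")
+print(inspect.iscoroutine(pending))  # True
+pending.close()  # close the never-awaited coroutine to avoid a RuntimeWarning
+
+ensure_sync(fetch_sync)  # accepted
+try:
+    ensure_sync(fetch_async)
+except ValueError as error:
+    print(error)
+```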
+
+## ⚠️ Deprecation Notes
+
+- The `return_empty_on_no_match` parameter of the `RegexTextExtractor` component is deprecated. The component now always returns a dictionary with the key `captured_text`, whose value is the captured text or an empty string if no match is found. The parameter is currently ignored; starting from Haystack 2.23.0, initializing the component with it will raise an error.
+
+## 💙 Big thank you to everyone who contributed to this release!
+
+@anakin87, @ArzelaAscoIi, @bilgeyucel, @Bobholamovic, @davidsbatista, @dfokina, @GunaPalanivel, @majiayu000, @OliverZhangA, @sjrl, @TaMaN2031A, @tommasocerruti, @tstadel, @vblagoje, @YassineGabsi