Python: Fix ChatHistoryTruncationReducer deleting system prompt by roli-lpci · Pull Request #13610 · microsoft/semantic-kernel

roli-lpci · 2026-03-01T09:04:20Z

Summary

ChatHistoryTruncationReducer.reduce() silently deletes system/developer messages when truncating chat history. This is because it calls extract_range(), which unconditionally filters out system/developer messages — a function designed for the summarization use case, not truncation.

Fixes #12612.

Approach

Port the .NET SDK fix (PR #10344) to Python:

Detect the first system/developer message before truncation
Pass has_system_message=True to locate_safe_reduction_index so target_count accounts for the preserved message (matches .NET's targetCount -= hasSystemMessage ? 1 : 0)
Use a simple history[truncation_index:] slice instead of extract_range (which strips system messages)
Prepend the system message if it was truncated away

Also adds a guard for target_count <= 0 after the system message adjustment to prevent IndexError when target_count=1.

Changes

File	Change
`chat_history_reducer_utils.py`	Add `has_system_message` parameter, adjust `target_count`, guard against `<= 0`
`chat_history_truncation_reducer.py`	Detect system message, bypass `extract_range`, prepend if truncated
`test_chat_history_truncation_reducer.py`	Update 2 existing tests + 4 new tests (system, developer, no-system, target_count=1)

Note

The summarization reducer (ChatHistorySummarizationReducer) has the same bug — it also uses extract_range and loses system messages during summarization. The .NET summarization reducer preserves system messages via AssemblySummarizedHistory. This should be addressed in a follow-up PR to keep this change focused.

Test plan

11/11 truncation reducer tests pass (was 7, now 11)
9/9 utils tests pass (no regressions from new parameter)
11/11 summarization reducer tests pass (unaffected)
Reproduces and fixes the exact scenario from Python: Bug: Python 1.34.0 ChatHistoryTruncationReducer deletes the System Prompt #12612
Edge case: target_count=1 with system message does not crash

…osoft#12612) The truncation reducer used `extract_range()` which unconditionally filters out system/developer messages — a function designed for the summarization use case, not truncation. This caused system prompts to be silently deleted when chat history was reduced. Fix: detect the first system/developer message before truncation, pass `has_system_message=True` to `locate_safe_reduction_index` so target_count accounts for the preserved message, use a simple slice instead of `extract_range`, and prepend the system message if it was truncated away. This matches the .NET SDK behavior (PR microsoft#10344, already merged). Also adds a guard for `target_count <= 0` after the system message adjustment to prevent IndexError when target_count=1. Note: The summarization reducer (`ChatHistorySummarizationReducer`) has the same bug — it also uses `extract_range` and loses system messages. That should be addressed in a follow-up PR. Closes microsoft#12612

moonbox3

Automated Code Review

Reviewers: 3 | Confidence: 85%

✗ Correctness

The diff correctly preserves system/developer messages during truncation by adjusting target_count and re-prepending the system message when it gets sliced away. The approach is sound for the common case. However, there is a correctness issue in the edge case where target_count=1 with a system message: locate_safe_reduction_index returns None (meaning 'no reduction needed'), so the history is never reduced even when it contains many messages. The test for this case is lenient and accepts None, masking the bug. Additionally, the ChatHistorySummarizationReducer is not updated with the same system-message preservation logic, which may lead to inconsistent behavior between the two reducers.

✓ Security Reliability

The diff preserves system/developer messages during chat history truncation, aligning with .NET SDK behavior. The changes are generally sound from a security perspective — no injection risks, unsafe deserialization, or secrets. The main reliability concerns are: (1) only the first system/developer message is preserved, silently dropping any subsequent ones; (2) when target_count=1 with a system message, reduction is silently disabled forever, allowing unbounded history growth; and (3) the identity-based membership check for re-prepending the system message could behave unexpectedly if message equality is value-based rather than identity-based.

✗ Test Coverage

The diff adds system/developer message preservation to the truncation reducer and includes several new tests. Overall coverage is good, but there is one test (test_truncation_target_count_1_with_system_message) whose key assertion is unreachable dead code — the if result is not None guard means the test always passes vacuously. Additionally, there are no direct unit tests for the modified locate_safe_reduction_index utility with the new has_system_message parameter, and no tests cover edge cases like multiple system/developer messages or the ChatHistorySummarizationReducer which also calls the utility.

Blocking Issues

When target_count=1 and has_system_message=True, locate_safe_reduction_index returns None (no reduction), leaving the entire history intact even when it far exceeds the target. The caller should still be able to reduce to just the system message in this scenario. The early return None when target_count <= 0 prevents a crash (the scan loop would IndexError at history[total_count]), but the behavior is wrong — the reducer should produce [system_message] with 1 total message, not silently skip reduction.
test_truncation_target_count_1_with_system_message (lines 161-170): The assertion inside if result is not None is dead code. With target_count=1 and a system message, locate_safe_reduction_index decrements target_count to 0 and returns None, so reduce() always returns None. The test gives a false sense of coverage. It should explicitly assert result is None and, ideally, add a separate case that actually exercises the preservation path.

Suggestions

The ChatHistorySummarizationReducer also calls locate_safe_reduction_index but is not updated to pass has_system_message or to preserve system messages. Consider applying the same fix there for consistency.
The check system_message not in truncated_list uses value equality (__eq__). If ChatMessageContent has deep equality semantics (e.g., Pydantic models), a different message with identical fields could cause a false match. Consider using identity comparison (all(msg is not system_message for msg in truncated_list)) for robustness.
The test test_truncation_target_count_1_with_system_message is overly lenient — it accepts None (no reduction) as valid, which masks the bug. The test should assert that reduction occurs and the result contains only the system message.
Only the first system/developer message is found via next(...). If a history legitimately contains multiple system or developer messages, all but the first will be silently dropped on truncation. Consider collecting all such messages or documenting this as a known limitation.
When target_count=1 and a system message exists, locate_safe_reduction_index returns None unconditionally, meaning truncation never fires and history grows unboundedly. Consider logging a warning so users can detect this misconfiguration.
The check system_message not in truncated_list relies on the equality semantics of message objects. If ChatMessageContent.__eq__ is value-based, a different message with identical content could prevent the system message from being re-prepended. Using identity comparison (all(msg is not system_message for msg in truncated_list)) would be more robust.
Add direct unit tests for locate_safe_reduction_index with has_system_message=True, covering the target_count adjustment and the target_count<=0 early return.
Add a test with multiple system/developer messages to verify only the first is preserved (current next() semantics).
Add a test where the system message falls within the retained tail (i.e., system message is NOT truncated away) to exercise the system_message not in truncated_list false branch.
Verify whether ChatHistorySummarizationReducer also needs updating to pass has_system_message to locate_safe_reduction_index, and add corresponding tests if so.

Automated review by moonbox3's agents

moonbox3 · 2026-03-04T00:39:28Z

python/semantic_kernel/contents/history_reducer/chat_history_reducer_utils.py

        The index that identifies the starting point for a reduced history that does not orphan
        sensitive content. Returns None if reduction is not needed.
    """
+    # Adjust target_count to account for the system message that will be preserved separately.


When has_system_message=True and target_count was 1, this decrements to 0 and returns None (no reduction). But the history may have many messages that should be reduced — the reducer should produce a result containing only the system message. Returning None here causes the truncation reducer to silently skip reduction entirely. Consider handling target_count == 0 as 'truncate everything' (return len(history)) and let the caller prepend the system message, or handle this edge case in the truncation reducer itself.

moonbox3 · 2026-03-04T00:39:28Z

python/semantic_kernel/contents/history_reducer/chat_history_truncation_reducer.py

+        truncated_list = history[truncation_index:]
+
+        # Prepend the system/developer message if it was truncated away
+        if system_message is not None and system_message not in truncated_list:


This uses value equality (__eq__) via not in. Since system_message is a reference into the original history list and truncated_list is a slice of the same list, identity comparison would be more precise and avoid a potential false match if another message happens to have identical fields.

Suggested change

if system_message is not None and system_message not in truncated_list:

if system_message is not None and all(msg is not system_message for msg in truncated_list):

moonbox3 · 2026-03-04T00:39:28Z

python/tests/unit/contents/test_chat_history_truncation_reducer.py

+
+async def test_truncation_target_count_1_with_system_message():
+    """Verify target_count=1 with system message does not crash (edge case from review)."""
+    reducer = ChatHistoryTruncationReducer(target_count=1, system_message="System prompt")
+    reducer.add_message(ChatMessageContent(role=AuthorRole.USER, content="Hello"))
+    reducer.add_message(ChatMessageContent(role=AuthorRole.ASSISTANT, content="Hi"))
+    reducer.add_message(ChatMessageContent(role=AuthorRole.USER, content="How are you?"))
+    reducer.add_message(ChatMessageContent(role=AuthorRole.ASSISTANT, content="Good"))
+
+    # Should not crash. Either returns None (no reduction possible)
+    # or returns just the system message.


This test accepts None (no reduction) as a valid outcome for target_count=1 with a system message, but that behavior is a bug — the user asked for at most 1 message and the history has 5. The test should assert that reduction happens and the result contains exactly the system message. As written, this test will pass even when the reducer fails to reduce.

moonbox3 · 2026-03-04T00:39:28Z

python/semantic_kernel/contents/history_reducer/chat_history_reducer_utils.py

        The index that identifies the starting point for a reduced history that does not orphan
        sensitive content. Returns None if reduction is not needed.
    """
+    # Adjust target_count to account for the system message that will be preserved separately.


When target_count is 1 and has_system_message is True, this permanently prevents any reduction. Consider emitting a warning log so callers can detect this silent no-op, since an ever-growing history could eventually cause OOM or token-limit failures.

Suggested change

# Adjust target_count to account for the system message that will be preserved separately.

target_count -= 1

if target_count <= 0:

import logging

logging.getLogger(__name__).warning(

"target_count after accounting for system message is %d; reduction disabled.", target_count

)

return None # Cannot reduce further; only system message would remain

moonbox3 · 2026-03-04T00:39:28Z

python/semantic_kernel/contents/history_reducer/chat_history_truncation_reducer.py

-        truncated_list = extract_range(history, start=truncation_index)
+        truncated_list = history[truncation_index:]
+
+        # Prepend the system/developer message if it was truncated away


Using not in performs value-based equality checks. If ChatMessageContent.__eq__ compares by content, a duplicate message elsewhere in the list could shadow the original system message and prevent it from being re-prepended. An identity check would be safer here.

Suggested change

# Prepend the system/developer message if it was truncated away

if system_message is not None and not any(m is system_message for m in truncated_list):

moonbox3 · 2026-03-04T00:39:28Z

python/tests/unit/contents/test_chat_history_truncation_reducer.py

+    reducer.add_message(ChatMessageContent(role=AuthorRole.USER, content="How are you?"))
+    reducer.add_message(ChatMessageContent(role=AuthorRole.ASSISTANT, content="Good"))
+
+    # Should not crash. Either returns None (no reduction possible)
+    # or returns just the system message.
+    result = await reducer.reduce()
+    if result is not None:
+        # System message must be preserved
+        assert any(m.role == AuthorRole.SYSTEM for m in result.messages)
+


Dead assertion: with target_count=1 and a system message, locate_safe_reduction_index reduces target to 0 and returns None, so reduce() returns None. The if result is not None branch never executes, meaning the system-message assertion is never checked. Assert the expected behavior explicitly.

Suggested change

reducer.add_message(ChatMessageContent(role=AuthorRole.USER, content="How are you?"))

reducer.add_message(ChatMessageContent(role=AuthorRole.ASSISTANT, content="Good"))

# Should not crash. Either returns None (no reduction possible)

# or returns just the system message.

result = await reducer.reduce()

if result is not None:

# System message must be preserved

assert any(m.role == AuthorRole.SYSTEM for m in result.messages)

# target_count=1 with system message means adjusted target becomes 0,

# so no reduction is possible.

result = await reducer.reduce()

assert result is None

1. target_count=1 with system message: return len(history) instead of None so the reducer produces [system_message] instead of silently skipping reduction (unbounded history growth). 2. Dead test assertion: replace vacuously-passing `if result is not None` guard with explicit assertions on result count, role, and content. 3. Identity comparison: use `all(msg is not system_message ...)` instead of `not in` to avoid false matches from Pydantic value-based equality. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

roli-lpci · 2026-03-04T22:45:06Z

Thanks for the thorough review — all three blocking issues were valid. Pushed fixes:

1. target_count=1 with system message (unbounded growth)

locate_safe_reduction_index now returns len(history) instead of None when target_count <= 0 after the system message adjustment. This produces an empty tail (history[len(history):] = []), and the caller prepends the system message, resulting in [system_message]. The reducer now correctly reduces to just the system message instead of silently skipping reduction.

Note: this actually improves on the .NET SDK's handling of the same edge case — the .NET implementation has a latent issue where target_count=1 with a system message can produce unexpected behavior.

2. Dead test assertion

Replaced the vacuous if result is not None guard with explicit assertions:

assert result is not None
assert len(result.messages) == 1
assert result.messages[0].role == AuthorRole.SYSTEM
assert result.messages[0].content == "System prompt"

3. Identity comparison

Switched from system_message not in truncated_list (value equality) to all(msg is not system_message for msg in truncated_list) (identity) to avoid false matches from Pydantic's value-based __eq__.

Regarding the non-blocking suggestions:

ChatHistorySummarizationReducer consistency: Agreed this should be updated too. Happy to add it in this PR or as a follow-up — let me know your preference.
Multiple system/developer messages: The current next() semantics preserve only the first. This matches the .NET SDK behavior. Worth documenting as a known limitation.
Direct unit tests for locate_safe_reduction_index: Can add these if desired.

roli-lpci requested a review from a team as a code owner March 1, 2026 09:04

moonbox3 added the python Pull requests for the Python Semantic Kernel label Mar 1, 2026

roli-lpci force-pushed the fix/python-history-reducer-system-prompt-12612 branch from df8d4a9 to cd25e3a Compare March 3, 2026 06:44

moonbox3 reviewed Mar 4, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Python: Fix ChatHistoryTruncationReducer deleting system prompt#13610

Python: Fix ChatHistoryTruncationReducer deleting system prompt#13610
roli-lpci wants to merge 2 commits intomicrosoft:mainfrom
roli-lpci:fix/python-history-reducer-system-prompt-12612

roli-lpci commented Mar 1, 2026

Uh oh!

moonbox3 left a comment

Uh oh!

moonbox3 Mar 4, 2026

Uh oh!

moonbox3 Mar 4, 2026

Uh oh!

moonbox3 Mar 4, 2026

Uh oh!

moonbox3 Mar 4, 2026

Uh oh!

moonbox3 Mar 4, 2026

Uh oh!

moonbox3 Mar 4, 2026

Uh oh!

roli-lpci commented Mar 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

	if system_message is not None and system_message not in truncated_list:
	if system_message is not None and all(msg is not system_message for msg in truncated_list):

-    # Adjust target_count to account for the system message that will be preserved separately.
+        target_count -= 1
+        if target_count <= 0:
+            import logging
+            logging.getLogger(__name__).warning(
+                "target_count after accounting for system message is %d; reduction disabled.", target_count
+            )
+            return None  # Cannot reduce further; only system message would remain

	# Prepend the system/developer message if it was truncated away
	if system_message is not None and not any(m is system_message for m in truncated_list):

Conversation

roli-lpci commented Mar 1, 2026

Summary

Approach

Changes

Note

Test plan

Uh oh!

moonbox3 left a comment

Choose a reason for hiding this comment

Automated Code Review

✗ Correctness

✓ Security Reliability

✗ Test Coverage

Blocking Issues

Suggestions

Uh oh!

moonbox3 Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

moonbox3 Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

moonbox3 Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

moonbox3 Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

moonbox3 Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

moonbox3 Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

roli-lpci commented Mar 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants