Optimize head‑truncation loop in summarize_real() #3764

Open: wants to merge 2 commits into base branch main

35 changes: 18 additions & 17 deletions aider/history.py
@@ -63,37 +63,38 @@ def summarize_real(self, messages, depth=0):
         if split_index <= min_split:
             return self.summarize_all(messages)

+        # Split head and tail
         head = messages[:split_index]
         tail = messages[split_index:]

-        sized = sized[:split_index]
-        head.reverse()
-        sized.reverse()
-        keep = []
-        total = 0
+        # Only size the head once
+        sized_head = sized[:split_index]

-        # These sometimes come set with value = None
+        # Precompute token limit (fallback to 4096 if undefined)
         model_max_input_tokens = self.models[0].info.get("max_input_tokens") or 4096
-        model_max_input_tokens -= 512
+        model_max_input_tokens -= 512 # reserve buffer for safety

-        for i in range(split_index):
-            total += sized[i][0]
+        keep = []
+        total = 0
+
+        # Iterate in original order, summing tokens until limit
+        for tokens, msg in sized_head:
+            total += tokens
             if total > model_max_input_tokens:
                 break
-            keep.append(head[i])
-
-        keep.reverse()
+            keep.append(msg)
+        # No need to reverse lists back and forth

         summary = self.summarize_all(keep)

-        tail_tokens = sum(tokens for tokens, msg in sized[split_index:])
+        # If the combined summary and tail still fits, return directly
         summary_tokens = self.token_count(summary)
-
-        result = summary + tail
+        tail_tokens = sum(tokens for tokens, _ in sized[split_index:])
         if summary_tokens + tail_tokens < self.max_tokens:
-            return result
+            return summary + tail

-        return self.summarize_real(result, depth + 1)
+        # Otherwise recurse with increased depth
+        return self.summarize_real(summary + tail, depth + 1)

     def summarize_all(self, messages):
         content = ""
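
For reviewing the loop change in the diff above, a minimal standalone sketch may help. It assumes `sized` entries are `(token_count, message)` pairs, as the tuple unpacking in the diff suggests. The helper names `keep_newest` and `keep_oldest` and the sample data are hypothetical and not part of aider. The sketch only contrasts the two loop shapes: walking the head newest-first and reversing back, versus walking it in original order.

```python
def keep_newest(sized_head, budget):
    """Loop shape before the change: walk newest-first, then restore order."""
    keep, total = [], 0
    for tokens, msg in reversed(sized_head):
        total += tokens
        if total > budget:
            break
        keep.append(msg)
    keep.reverse()
    return keep


def keep_oldest(sized_head, budget):
    """Loop shape after the change: walk in original order."""
    keep, total = [], 0
    for tokens, msg in sized_head:
        total += tokens
        if total > budget:
            break
        keep.append(msg)
    return keep


# Hypothetical (token_count, message) pairs; real entries would be message dicts.
sized_head = [(300, "m0"), (300, "m1"), (300, "m2"), (300, "m3")]

# Budget large enough for the whole head: both shapes keep every message.
within_budget = (keep_newest(sized_head, 2000), keep_oldest(sized_head, 2000))

# Budget smaller than the head: the two shapes keep different ends of it.
over_budget = (keep_newest(sized_head, 700), keep_oldest(sized_head, 700))
```

Under these assumptions the two shapes agree whenever the head fits within the budget. When it does not, the newest-first walk keeps the messages nearest the split point, while the in-order walk keeps the earliest ones, so it may be worth confirming which end of the head the summary is meant to cover.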