Conversation

Contributor

@ChelseyZ ChelseyZ commented Dec 26, 2025

  • Core invariant: Larger context windows alone do not prevent context rot; effective systems must control token noise by selective retrieval and context management to maintain agent accuracy over long-running workflows.
  • Logic removed/simplified: None. This PR is documentation-only; it does not modify existing code paths, so no runtime logic is simplified or removed from the codebase.
  • Why no data loss or regression: The change is purely additive (a new markdown blog under blog/en/...), so it does not alter compiled code, APIs, storage schemas, or runtime behavior. No existing files or exported symbols are modified, and all examples are illustrative, so no path in the repository can change production data or behavior.
  • New capability added: An operational guide covering JIT retrieval, pre-retrieval vector search, and hybrid approaches, with concrete Milvus examples and mitigation strategies (compression, external memory, sub-agents, prompt/tool best practices) to help engineers design systems that prevent context rot. This is documentation-only and affects user guidance rather than code.
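As a reader aid, the pre-retrieval pattern the description refers to can be sketched in a few lines. This is an illustration only, not code from the PR: the `embed` function below is a toy stand-in for a real embedding model, and in a real deployment the ranking step would be a Milvus similarity search rather than an in-memory loop.

```python
import math

def embed(text, dim=8):
    # Toy deterministic "embedding": bucket character trigrams into a
    # fixed-size vector. A stand-in for a real embedding model.
    vec = [0.0] * dim
    for i in range(len(text) - 2):
        vec[sum(ord(c) for c in text[i:i + 3]) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def top_k(query, corpus, k=2):
    # Pre-retrieval: rank every stored chunk by cosine similarity to the
    # query and keep only the k best, so the agent's context receives a
    # small, relevant slice instead of the whole corpus.
    q = embed(query)
    scored = [(sum(a * b for a, b in zip(q, embed(doc))), doc) for doc in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]
```

Only the few highest-scoring chunks enter the agent's context, which is the token-noise control the core invariant above calls for.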

@sre-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ChelseyZ

The full list of commands accepted by this bot can be found here.

Details: Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


coderabbitai bot commented Dec 26, 2025

Walkthrough

Adds a new blog post markdown file describing context engineering strategies to prevent context rot in AI agents using Milvus. The file includes front matter metadata and content covering context rot definitions and causes; retrieval approaches (Just-in-Time retrieval, pre-retrieval vector search, hybrid retrieval); Milvus-specific examples, code snippets, and images; decision guidance for selecting approaches; techniques for when context windows are insufficient (two-stage pipelines, compression, external memory, sub-agents); and prompt/tooling best practices. The post references Zilliz Cloud and provides concrete workflow examples.
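The Just-in-Time retrieval approach mentioned in the walkthrough can be sketched as a class that keeps only lightweight identifiers in the prompt and loads full content on demand. `JustInTimeContext` is a hypothetical name used for illustration, not an API from the post.

```python
class JustInTimeContext:
    """Sketch of JIT retrieval: the agent's context holds only a tiny
    index of identifiers; full content is pulled in the moment a step
    actually needs it, keeping the prompt small."""

    def __init__(self, store):
        self.store = store          # identifier -> full content
        self.index = list(store)    # only lightweight keys stay in context
        self.loaded = {}            # content fetched so far

    def prompt_view(self):
        # What the model "sees" by default: keys plus anything loaded.
        return {"index": self.index, "loaded": self.loaded}

    def fetch(self, key):
        # Pull one item into context on demand, not up front.
        self.loaded[key] = self.store[key]
        return self.store[key]
```

Until `fetch` is called, the prompt carries only file paths or links, mirroring how the post describes Claude Code's tiny index.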

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'publish a new blog: context rot' clearly summarizes the main change—adding a new blog post about context rot and prevention strategies. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

📜 Recent review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5529a59 and a0a7422.

📒 Files selected for processing (1)
  • blog/en/keeping-ai-agents-grounded-context-engineering-strategies-that-prevent-context-rot-using-milvus.md
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-12-25T12:37:16.088Z
Learnt from: septemberfd
Repo: milvus-io/community PR: 511
File: blog/en/embedding-first-chunking-second-smarter-rag-retrieval-with-max-min-semantic-chunking.md:7-7
Timestamp: 2025-12-25T12:37:16.088Z
Learning: In milvus-io/community blog posts, the front matter 'cover' field does not require the 'https://' protocol prefix. When editing or adding blog markdown files under the blog directory (e.g., blog/en/...), specify cover URLs without the protocol (the blogging system handles protocol-less URLs). This applies to all markdown files in the blog area.

Applied to files:

  • blog/en/keeping-ai-agents-grounded-context-engineering-strategies-that-prevent-context-rot-using-milvus.md
🪛 LanguageTool
blog/en/keeping-ai-agents-grounded-context-engineering-strategies-that-prevent-context-rot-using-milvus.md

[style] ~19-~19: ‘exact same’ might be wordy. Consider a shorter alternative.
Context: ...ysteriously vanish. But if you drop the exact same prompt into a new chat, suddenly the mo...

(EN_WORDINESS_PREMIUM_EXACT_SAME)


[style] ~21-~21: Consider a different verb to strengthen your wording.
Context: ... tokens to 128K, retrieval accuracy can drop by 15–30%. The model still has room, bu...

(DROP_DECLINE)


[style] ~40-~40: As an alternative to the over-used intensifier ‘extremely’, consider replacing this phrase.
Context: ...o when you ask an LLM to operate across extremely large contexts, you’re pushing it into a regi...

(EN_WEAK_ADJECTIVE)


[style] ~56-~56: For conciseness, consider replacing this expression with an adverb.
Context: ...ific item and inserts it into context at the moment it matters—not before. For example,...

(AT_THE_MOMENT)


[style] ~79-~79: This phrase is redundant. Consider writing “relevant”.
Context: ...me, the system retrieves a small set of highly relevant chunks through similarity searches. - ...

(HIGHLY_RELEVANT)


[style] ~98-~98: To elevate your writing, try using a synonym here.
Context: ...Accuracy:* Before a task begins, it’s hard to predict precisely what the model wil...

(HARD_TO)


[grammar] ~100-~100: Use a hyphen to join words.
Context: ...ep or exploratory workflows. So in real world workloads, a hybrid appaorch is th...

(QB_NEW_EN_HYPHEN)


[grammar] ~100-~100: Ensure spelling is correct
Context: .... So in real world workloads, a hybrid appaorch is the optimal solution. - Vector sea...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[style] ~308-~308: The phrase ‘in many cases’ is used quite frequently. Consider using a less frequent alternative to set your writing apart.
Context: ...automatically produce better results**; in many cases, it does the opposite. When a model is ...

(IN_MANY_STYLE_CASES)

🪛 markdownlint-cli2 (0.18.1)
blog/en/keeping-ai-agents-grounded-context-engineering-strategies-that-prevent-context-rot-using-milvus.md

38-38: Images should have alternate text (alt text)

(MD045, no-alt-text)


120-120: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


142-142: Images should have alternate text (alt text)

(MD045, no-alt-text)


145-145: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


243-243: Images should have alternate text (alt text)

(MD045, no-alt-text)


254-254: Images should have alternate text (alt text)

(MD045, no-alt-text)


266-266: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


295-295: Images should have alternate text (alt text)

(MD045, no-alt-text)


318-318: Spaces inside link text

(MD039, no-space-in-links)


318-318: Spaces inside link text

(MD039, no-space-in-links)


318-318: Spaces inside link text

(MD039, no-space-in-links)


Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 8

🧹 Nitpick comments (1)
blog/en/keeping-ai-agents-grounded-context-engineering-strategies-that-prevent-context-rot-using-milvus.md (1)

19-19: Consider style improvements for clarity and conciseness.

LanguageTool flagged several wordiness and style issues. While not blockers, addressing these will improve readability:

  • Line 19: "exact same" → consider "same"
  • Line 21: "drop by 15–30%" → consider "decline" for stronger wording
  • Line 40: "extremely large contexts" → replace "extremely" with a more precise intensifier
  • Line 56: "at the moment it matters" → consider "when needed" for conciseness
  • Line 79: "highly relevant chunks" → "relevant chunks" (redundant intensifier)
  • Line 98: "it's hard to predict" → consider "difficult to anticipate" to elevate phrasing
  • Line 307: "in many cases" → use less frequent alternative for variety
🔎 Proposed style improvements
-If you've worked with long-running LLM conversations, you've probably had this frustrating moment: halfway through a long thread, the model starts drifting. Answers get vague, reasoning weakens, and key details mysteriously vanish. But if you drop the exact same prompt into a new chat, suddenly the model behaves—focused, accurate, grounded.
+If you've worked with long-running LLM conversations, you've probably had this frustrating moment: halfway through a long thread, the model starts drifting. Answers get vague, reasoning weakens, and key details mysteriously vanish. But if you drop the same prompt into a new chat, suddenly the model behaves—focused, accurate, grounded.

-This isn't the model "getting tired" — it's **context rot**. As a conversation grows, the model has to juggle more information, and its ability to prioritize slowly declines. [Antropic studie](https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents)s show that as context windows stretch from around 8K tokens to 128K, retrieval accuracy can drop by 15–30%. The model still has room, but it loses track of what matters.
+This isn't the model "getting tired" — it's **context rot**. As a conversation grows, the model has to juggle more information, and its ability to prioritize slowly declines. [Anthropic studies](https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents) show that as context windows stretch from around 8K tokens to 128K, retrieval accuracy can decline by 15–30%. The model still has room, but it loses track of what matters.

-The root issue comes from the [Transformer architecture](https://zilliz.com/learn/decoding-transformer-models-a-study-of-their-architecture-and-underlying-principles) itself. Every token must compare itself against every other token, forming pairwise attention across the entire sequence. That means computation grows **O(n²)** with context length. Expanding your prompt from 1K tokens to 100K doesn't make the model "work harder"—it multiplies the number of token interactions by **10,000×**. Then there's the problem with the training data. Models see far more short sequences than long ones. So when you ask an LLM to operate across extremely large contexts, you're pushing it into a regime it wasn't heavily trained for.
+The root issue comes from the [Transformer architecture](https://zilliz.com/learn/decoding-transformer-models-a-study-of-their-architecture-and-underlying-principles) itself. Every token must compare itself against every other token, forming pairwise attention across the entire sequence. That means computation grows **O(n²)** with context length. Expanding your prompt from 1K tokens to 100K doesn't make the model "work harder"—it multiplies the number of token interactions by **10,000×**. Then there's the problem with the training data. Models see far more short sequences than long ones. So when you ask an LLM to operate across substantially larger contexts, you're pushing it into a regime it wasn't heavily trained for.

-Instead of stuffing entire codebases or datasets into its context (which greatly increases the chance of drift and forgetting), Claude Code maintains a tiny index: file paths, commands, and documentation links. When the model needs a piece of information, it retrieves that specific item and inserts it into context **at the moment it matters**—not before.
+Instead of stuffing entire codebases or datasets into its context (which greatly increases the chance of drift and forgetting), Claude Code maintains a tiny index: file paths, commands, and documentation links. When the model needs a piece of information, it retrieves that specific item and inserts it into context **when needed**—not before.

-In a typical RAG setup: - Documents are embedded and stored in a vector database, such as Milvus. - At query time, the system retrieves a small set of highly relevant chunks through similarity searches.
+In a typical RAG setup: - Documents are embedded and stored in a vector database, such as Milvus. - At query time, the system retrieves a small set of relevant chunks through similarity searches.

-**Accuracy:** Before a task begins, it's hard to predict precisely what the model will need—especially for multi-step or exploratory workflows.
+**Accuracy:** Before a task begins, it's difficult to predict precisely what the model will need—especially for multi-step or exploratory workflows.

-**However, a bigger context window doesn't automatically produce better results**; in many cases, it does the opposite. When a model is overloaded, fed stale information, or forced through massive prompts, accuracy quietly drifts.
+**However, a bigger context window doesn't automatically produce better results**; often, it does the opposite. When a model is overloaded, fed stale information, or forced through massive prompts, accuracy quietly drifts.

Note: Line 21 also contains a typo: the link text "Antropic studie" misspells "Anthropic", and the trailing "s" sits outside the markdown link syntax.

Also applies to: 21-21, 40-40, 56-56, 79-79, 98-98, 307-307

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 019d603 and 5529a59.

📒 Files selected for processing (1)
  • blog/en/keeping-ai-agents-grounded-context-engineering-strategies-that-prevent-context-rot-using-milvus.md

🪛 markdownlint-cli2 (0.18.1)
blog/en/keeping-ai-agents-grounded-context-engineering-strategies-that-prevent-context-rot-using-milvus.md

38-38: Images should have alternate text (alt text)

(MD045, no-alt-text)


120-120: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


142-142: Images should have alternate text (alt text)

(MD045, no-alt-text)


145-145: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


228-228: Code block style
Expected: fenced; Actual: indented

(MD046, code-block-style)


243-243: Images should have alternate text (alt text)

(MD045, no-alt-text)


254-254: Images should have alternate text (alt text)

(MD045, no-alt-text)


261-261: Multiple headings with the same content

(MD024, no-duplicate-heading)


265-265: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


294-294: Images should have alternate text (alt text)

(MD045, no-alt-text)


317-317: Spaces inside link text

(MD039, no-space-in-links)


317-317: Spaces inside link text

(MD039, no-space-in-links)


317-317: Spaces inside link text

(MD039, no-space-in-links)
