Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds a new bundled MCP server skill guide for diagnosing financial RAG/tool-using agent failures, and updates the existing tests to include it in the expected skill set.
Changes:
- Added new `financial_rag_robustness` skill documentation (`SKILL.md`).
- Updated skill rendering/content tests to recognize the new skill.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| openbb_platform/extensions/mcp_server/tests/app/test_tool_execution.py | Adds the new skill to the expected bundled-skill set and SKILL.md content checks. |
| openbb_platform/extensions/mcp_server/openbb_mcp_server/skills/financial_rag_robustness/SKILL.md | Introduces a robustness checklist/workflow document for financial RAG and tool-calling agents. |
Comment on lines +53 to +62
| Class | Symptom | First check |
| --- | --- | --- |
| Retrieval miss | Relevant filing, news item, transcript section, or time series was not used | Query terms, filters, date range, provider coverage |
| Retrieval contamination | Irrelevant company, stale period, or wrong asset leaks into the context | Symbol normalization, CIK/ticker mapping, document metadata |
| Tool-call mismatch | Agent called the wrong OpenBB command or omitted a required provider parameter | Tool schema, active category, generated arguments |
| Numeric hallucination | Answer cites a metric not present in tool output or retrieved text | Result payload, units, transformation, as-of date |
| Temporal leak | Answer uses future data or mixes fiscal/calendar periods | Observation date, filing date, period end, release timestamp |
| Unit or scale error | Basis points, percent, dollars, shares, or split-adjusted prices are mixed | Field metadata, provider docs, chart labels |
| Reasoning gap | Data is correct, but conclusion does not follow | Intermediate calculations, assumptions, thresholds |
| Evaluation blind spot | No check would have caught the wrong answer | Add a deterministic assertion or review rubric |
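For illustration only, the "Numeric hallucination" and "Evaluation blind spot" rows could be backed by a deterministic check along the lines of the sketch below. Nothing here comes from the PR or from OpenBB: the function names, the sample answer, and the sample payload are all hypothetical, and the check only compares unsigned numeric literals as plain text.

```python
import re

# Matches unsigned integers and decimals. Assumption: signs, thousands
# separators, and unit conversions are out of scope for this sketch.
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def extract_numbers(text: str) -> set[float]:
    """Return every numeric literal found in the text as a float."""
    return {float(match) for match in NUMBER.findall(text)}


def ungrounded_numbers(answer: str, tool_output: str) -> set[float]:
    """Return numbers cited in the answer that never appear in the tool output."""
    return extract_numbers(answer) - extract_numbers(tool_output)


# Hypothetical agent answer and tool payload, made up for this example.
answer = "Revenue was 394.3 billion, up 7.8 percent."
payload = '{"revenue": 394.3, "unit": "USD billions"}'

# 7.8 is cited in the answer but is absent from the payload.
missing = ungrounded_numbers(answer, payload)
```

A test could then assert that `missing` is empty for answers that are expected to be fully grounded. A real check would also need to account for rounding, units, and values derived from the payload rather than copied from it.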
8486a29 to 109b001
Contributor
This is not the purpose of the MCP server, and this PR is simply AI slop.
Withdrawn by author.