Skip to content

Fix unsafe eval() on LLM output in PlanningAgent#7

Open
Copilot wants to merge 3 commits into
copilot/app-summary-requestfrom
copilot/check-for-unwanted-changes
Open

Fix unsafe eval() on LLM output in PlanningAgent#7
Copilot wants to merge 3 commits into
copilot/app-summary-requestfrom
copilot/check-for-unwanted-changes

Conversation

Copilot AI commented Apr 6, 2026

Copy link
Copy Markdown

Security audit & code injection fix

Repository-wide security scan found one actionable vulnerability: eval() called on raw LLM response text in the research multi-agents sample.

Fix

gemini/agents/research-multi-agents/ev_agent/agent_handler/agent_02_PlanningAgent.py

_determine_visualization_requirement() returns response.text from generate_content() — an untrusted string. This was passed directly to eval(), enabling arbitrary code execution.

# Before (code injection via LLM output)
if eval(needs_visualization):

# After (safe string comparison)
if (isinstance(needs_visualization, bool) and needs_visualization) or (
    isinstance(needs_visualization, str)
    and needs_visualization.strip().lower() == "true"
):

Handles both return paths: str from successful LLM call, bool from the exception fallback.

Audit summary (SECURITY_AUDIT.md)

Added audit report documenting a full scan for spyware, malicious autonomous agents, backdoors, and references to external applications (OmniMed Pro). No malicious code, persistence mechanisms, data exfiltration, obfuscated payloads, or references to OmniMed Pro were found. The repo is a standard fork of GoogleCloudPlatform/generative-ai.

Copilot AI and others added 2 commits April 6, 2026 00:09
…nt and add security audit report

Agent-Logs-Url: https://github.com/agha64113-creator/generative-ai/sessions/dbb27616-abcf-4171-bf32-09d11649dae4

Co-authored-by: agha64113-creator <234804353+agha64113-creator@users.noreply.github.com>
Agent-Logs-Url: https://github.com/agha64113-creator/generative-ai/sessions/dbb27616-abcf-4171-bf32-09d11649dae4

Co-authored-by: agha64113-creator <234804353+agha64113-creator@users.noreply.github.com>
Copilot AI changed the title [WIP] Investigate potential malware in OmniMed Pro repo Fix unsafe eval() on LLM output in PlanningAgent Apr 6, 2026
Copilot AI requested a review from agha64113-creator April 6, 2026 00:11
@agha64113-creator

Copy link
Copy Markdown
Owner

Ok but there is something I have a question about

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR removes unsafe execution of untrusted LLM output in the EV research multi-agent sample’s PlanningAgent, preventing potential arbitrary code execution, and adds a repository security audit report documenting findings.

Changes:

  • Replaced eval(needs_visualization) with safe boolean/string handling when deciding whether to add the visualization step.
  • Added SECURITY_AUDIT.md documenting scan scope, findings, and the fixed vulnerability.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
gemini/agents/research-multi-agents/ev_agent/agent_handler/agent_02_PlanningAgent.py Eliminates eval() on LLM output by switching to safe type checks + string comparison for the visualization toggle.
SECURITY_AUDIT.md Adds an audit report summarizing repository scan results and documenting the eval() remediation.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@agha64113-creator

Copy link
Copy Markdown
Owner

See you all soon

Copy link
Copy Markdown
Owner

@copilot show git log and pull

Copilot AI commented May 2, 2026

Copy link
Copy Markdown
Author

@copilot show git log and pull

Here's the git log for this branch:

Current branch: copilot/check-for-unwanted-changes

Recent commits (most recent first):

44de775 fix: add explicit parentheses for operator precedence clarity
5cd83d0 fix: replace unsafe eval() with safe string comparison in PlanningAgent and add security audit report  
ac8a9c9 Initial plan
f07766d docs: add comprehensive Overview section to README.md (base commit from stacked PR)

Status: Branch is up to date with origin/copilot/check-for-unwanted-changes. Working tree is clean with nothing to commit.

The 3 commits I made in this PR are:

  1. ac8a9c9 - Initial plan
  2. 5cd83d0 - The main security fix (replaced unsafe eval() with safe string comparison)
  3. 44de775 - Added explicit parentheses for code clarity

Copy link
Copy Markdown
Owner

@copilot look for omnimed'

Copilot AI commented May 2, 2026

Copy link
Copy Markdown
Author

@copilot look for omnimed'

I searched the entire repository for "omnimed" (case-insensitive). The only matches are in SECURITY_AUDIT.md which I created as part of this PR.

The audit report documents that no references to "OmniMed Pro" were found in the actual codebase. The mentions in the audit file are simply documenting the negative finding — confirming the repository contains no code related to OmniMed Pro.

Search results:

  • SECURITY_AUDIT.md — 5 occurrences (all in audit documentation stating "NOT FOUND")
  • All other files — 0 occurrences

@agha64113-creator agha64113-creator marked this pull request as ready for review May 2, 2026 23:56
@qodo-code-review

Copy link
Copy Markdown

Review Summary by Qodo

Fix unsafe eval() vulnerability and add security audit report

🐞 Bug fix 📝 Documentation

Grey Divider

Walkthroughs

Description
• Replace unsafe eval() with safe string comparison in PlanningAgent
  - Prevents arbitrary code execution from untrusted LLM output
  - Handles both bool and str return types safely
• Add comprehensive security audit report documenting repository scan
  - Confirms no spyware, malicious code, or OmniMed Pro references
  - Documents the eval() vulnerability fix and CI/CD security status
Diagram
flowchart LR
  A["LLM Output<br/>needs_visualization"] --> B["Remove unsafe<br/>eval()"]
  B --> C["Safe Type Check<br/>bool or str"]
  C --> D["String Comparison<br/>lower() == true"]
  D --> E["Secure Execution<br/>No Code Injection"]
  F["Security Audit<br/>Report"] --> G["Document Findings<br/>& Fixes"]
Loading

Grey Divider

File Changes

1. gemini/agents/research-multi-agents/ev_agent/agent_handler/agent_02_PlanningAgent.py 🐞 Bug fix +4/-2

Replace unsafe eval() with safe type checking

• Removed dangerous eval() call on untrusted LLM output string
• Replaced with safe type checking and string comparison logic
• Handles both bool and str return types from _determine_visualization_requirement()
• Added explicit parentheses for operator precedence clarity

gemini/agents/research-multi-agents/ev_agent/agent_handler/agent_02_PlanningAgent.py


2. SECURITY_AUDIT.md 📝 Documentation +116/-0

Add security audit report and vulnerability documentation

• New comprehensive security audit report documenting full repository scan
• Confirms no spyware, malicious code, autonomous agents, or OmniMed Pro references
• Details the critical eval() vulnerability fix with before/after code
• Validates CI/CD, automation, and infrastructure-as-code security status

SECURITY_AUDIT.md


Grey Divider

Qodo Logo

@qodo-code-review

qodo-code-review Bot commented May 2, 2026

Copy link
Copy Markdown

Code Review by Qodo

🐞 Bugs (2) 📘 Rule violations (0)

Grey Divider


Remediation recommended

1. Toggle return type mismatch 🐞 Bug ⚙ Maintainability
Description
PlanningAgent._determine_visualization_requirement() is typed as returning bool but returns
response.text (str) on the success path, forcing the new call site to handle both bool and str. This
inconsistent contract is easy to misuse in future callers and can lead to incorrect branching when
code trusts the type hint.
Code

gemini/agents/research-multi-agents/ev_agent/agent_handler/agent_02_PlanningAgent.py[R273-276]

+            if (isinstance(needs_visualization, bool) and needs_visualization) or (
+                isinstance(needs_visualization, str)
+                and needs_visualization.strip().lower() == "true"
+            ):
Evidence
The method’s signature and implementation disagree (annotated bool but returns a string), and the
newly-added conditional explicitly checks for both bool and str, confirming the inconsistency is now
part of the control flow.

gemini/agents/research-multi-agents/ev_agent/agent_handler/agent_02_PlanningAgent.py[178-196]
gemini/agents/research-multi-agents/ev_agent/agent_handler/agent_02_PlanningAgent.py[270-276]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`_determine_visualization_requirement()` is annotated as `-> bool` but returns `response.text` (a `str`) when the LLM call succeeds. The caller now compensates with mixed-type checks, which is brittle and misleading.

### Issue Context
This PR removed `eval()` (good), but the safest/cleanest outcome is to normalize the LLM output to a boolean inside `_determine_visualization_requirement()` and keep the return type stable.

### Fix Focus Areas
- gemini/agents/research-multi-agents/ev_agent/agent_handler/agent_02_PlanningAgent.py[178-196]
- gemini/agents/research-multi-agents/ev_agent/agent_handler/agent_02_PlanningAgent.py[270-276]

### Suggested approach
- Parse `response.text` into a bool (e.g., `text = (response.text or '').strip().lower(); return text == 'true'`).
- Keep the annotation as `-> bool` (or adjust it if you truly need multi-type returns).
- Simplify the caller to `if needs_visualization:` once the method returns a bool reliably.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Advisory comments

2. Audit not reproducible 🐞 Bug ⚙ Maintainability
Description
SECURITY_AUDIT.md makes broad, definitive security assertions without listing the scan
tooling/commands, the audited commit SHA, or explicit limitations, so the results can’t be
independently validated. This can mislead readers into treating the document as a formal security
guarantee rather than an informational report.
Code

SECURITY_AUDIT.md[R11-16]

+This repository is a **legitimate fork** of the official [Google Cloud Platform generative-ai](https://github.com/GoogleCloudPlatform/generative-ai) samples repository. It contains 300+ Jupyter notebooks, sample applications, and learning resources for building generative AI applications on Vertex AI.
+
+**No spyware, malicious autonomous agents, backdoors, or unauthorized code was found.**
+
+**No references to "OmniMed Pro" or any external medical iOS application were found anywhere in this repository.**
+
Evidence
The document contains absolute conclusions (“No spyware… was found”, multiple “None found” entries)
but does not include methodology details needed to reproduce/verify those conclusions (tools,
commands, commit/revision, limitations).

SECURITY_AUDIT.md[11-16]
SECURITY_AUDIT.md[21-40]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`SECURITY_AUDIT.md` reads like a definitive security attestation, but it doesn’t include enough methodology to reproduce or validate the findings.

### Issue Context
This is a documentation quality/trust issue: downstream readers need to know *how* the scan was done and what it did/didn’t cover.

### Fix Focus Areas
- SECURITY_AUDIT.md[1-6]
- SECURITY_AUDIT.md[9-40]

### Suggested approach
- Add a **Methodology** section: tools used (e.g., grep patterns, static analyzers), versions, and commands.
- Include the **commit SHA / branch** audited.
- Add a **Limitations** section (e.g., no dynamic analysis, no full dependency supply-chain audit unless performed).
- Optionally soften wording to “No findings detected in the performed scans” to avoid implying an absolute guarantee.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

Qodo Logo

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants