Skip to content

Validator and Feedback records#9

Merged
tpilli merged 14 commits into
mainfrom
dev
Feb 26, 2026
Merged

Validator and Feedback records#9
tpilli merged 14 commits into
mainfrom
dev

Conversation

@tpilli
Copy link
Copy Markdown
Collaborator

@tpilli tpilli commented Feb 10, 2026

store feedback records in cache and send to vertex ai

@github-actions
Copy link
Copy Markdown

🤖 AI-SAST Security Scan

2 potential issue(s) found.

💡 Help us improve! Use the checkboxes below to mark each finding as a true positive (✅) or false positive (❌).
Your feedback helps train the AI to be more accurate over time.

Severity Count
🔥 High 2

📄 **High Issues (2)**
  • ✅ True Positive

  • ❌ False Positive

ID: 486cff7e
Severity: High
Issue: Prompt Injection
Location: src/integrations/bedrock.py
CVSS Vector: N/A

📋 Click to see details, risk, and remediation

Risk:
The generate_with_bedrock and generate_with_claude methods pass the prompt parameter directly to the Amazon Bedrock API without separating it from the system instructions. An attacker who can control the content of the prompt (e.g., by submitting malicious source code for scanning) can inject instructions to manipulate the Large Language Model (LLM). This could be exploited to bypass the security scan, causing vulnerabilities to be missed (Integrity loss), exfiltrate data from the LLM's context window, or cause the scanner to produce misleading results. The attack prerequisite is the ability to provide input to the scanning function, which is expected for any user of a SAST tool.

Remediation:

Separate the untrusted user input from the trusted system instructions. Modify the `_invoke_messages` function to use the dedicated `system` parameter in the Anthropic Messages API. This ensures the model can distinguish between your instructions and the content it is supposed to analyze.

💬 Optional Comment: (Reply to this PR to explain your feedback)


  • ✅ True Positive

  • ❌ False Positive

ID: 7a30f70c
Severity: High
Issue: Prompt Injection
Location: src/integrations/ollama.py
CVSS Vector: N/A

📋 Click to see details, risk, and remediation

Risk:
The function generate_with_ollama accepts a prompt string which, according to the docstring, is used to "Generate text using Ollama model". This user-controllable input appears to be passed directly to a Large Language Model (LLM) without any sanitization or protective measures. An attacker can craft a malicious prompt to manipulate the LLM's behavior, a vulnerability known as Prompt Injection. Successful exploitation could lead to the model ignoring its original instructions, revealing sensitive information from its context, generating malicious or inappropriate content, or performing unintended actions on behalf of the application. This attack requires low-privileged access to the application feature that calls this function.

Remediation:

To mitigate prompt injection, implement a multi-layered defense. Do not pass user input directly to the LLM. Instead, use a hardened system prompt to constrain the model's behavior and treat the user input as data. Additionally, consider input/output validation as a secondary defense.

💬 Optional Comment: (Reply to this PR to explain your feedback)

@github-actions
Copy link
Copy Markdown

🤖 AI-SAST Security Scan

1 potential issue(s) found.

💡 Help us improve! Use the checkboxes below to mark each finding as a true positive (✅) or false positive (❌).
Your feedback helps train the AI to be more accurate over time.

Severity Count
🔥 High 1

📄 **High Issues (1)**
  • ✅ True Positive

  • ❌ False Positive

ID: af8fccdb
Severity: High
Issue: Lack of Defense Against Prompt Injection
Location: src/integrations/bedrock.py
CVSS Vector: N/A

📋 Click to see details, risk, and remediation

Risk:
The prompt parameter is passed directly to the Large Language Model (LLM) without any protective framing, such as a system prompt or other hardening techniques. An attacker can submit a crafted prompt to override the application's intended instructions, leading to a wide range of potential impacts. For example, an attacker could instruct the model to ignore its primary function and instead reveal sensitive data from its context window, generate malicious content (e.g., phishing emails, malware), or deceive downstream systems that trust the LLM's output. The impact is application-dependent but can be severe, potentially leading to data leakage, code execution, or full system compromise. The exploit requires the ability to provide input that is passed to the generate_with_bedrock or generate_with_claude functions.

Remediation:

Harden the client against prompt injection by implementing a system prompt that defines the model's role, constraints, and boundaries. This creates a layer of defense by making it more difficult for user input to override the intended behavior.

💬 Optional Comment: (Reply to this PR to explain your feedback)

@tpilli tpilli changed the title feedback records Validator and Feedback records Feb 26, 2026
@github-actions
Copy link
Copy Markdown

🤖 AI-SAST Security Scan

1 potential issue(s) found.

💡 Help us improve! Use the checkboxes below to mark each finding as a true positive (✅) or false positive (❌).
Your feedback helps train the AI to be more accurate over time.

Severity Count
🔥 High 1

📄 **High Issues (1)**
  • ✅ True Positive

  • ❌ False Positive

ID: 6ab346d8
Severity: High
Issue: Prompt Injection in LLM-based Security Analysis
Location: src/integrations/bedrock.py
CVSS Vector: N/A

📋 Click to see details, risk, and remediation

Risk:
The prompt parameter, which is expected to contain source code for analysis, is directly embedded into the content sent to the Large Language Model (LLM). An attacker who can submit source code for scanning can include malicious instructions within it (e.g., in comments or docstrings). These instructions can cause the LLM to ignore its primary security analysis task and produce a misleading result, such as reporting "no vulnerabilities found" regardless of the code's actual content. This undermines the core function of the AI-SAST tool, allowing vulnerable code to be approved. Exploitation requires the ability to submit code for scanning, which is a standard privilege for a developer using such a tool.

Remediation:

Separate the trusted system instructions from the untrusted user input (the code to be scanned) by using the dedicated `system` parameter in the Anthropic/Bedrock API. This creates a stronger boundary that prevents the user input from easily overriding the model's core instructions.

💬 Optional Comment: (Reply to this PR to explain your feedback)

@tpilli tpilli merged commit da19a23 into main Feb 26, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant