| 1 | +# AI Red Teaming Agent |
| 2 | + |
| 3 | +This repository contains a template for creating an AI Red Teaming Agent using Azure AI Evaluation and Semantic Kernel. The template lets you test and evaluate AI systems for potential vulnerabilities and harmful outputs under adversarial use. The AI Red Teaming Agent leverages state-of-the-art attack strategies from the [Python Risk Identification Tool (PyRIT)](https://github.com/Azure/PyRIT), the open-source framework from Microsoft's AI Red Teaming team. |
| 4 | + |
| 5 | +## Overview |
| 6 | + |
| 7 | +The `template.py` script implements an AI Red Teaming Agent that can: |
| 8 | + |
| 9 | +1. Fetch harmful prompts across different [risk categories](https://learn.microsoft.com/en-us/azure/ai-foundry/concepts/ai-red-teaming-agent#supported-risk-categories) |
| 10 | +2. Send prompts to target generative AI models or applications |
| 11 | +3. Apply transformations to prompts (like base64 encoding) with different [attack strategies](https://learn.microsoft.com/en-us/azure/ai-foundry/concepts/ai-red-teaming-agent#supported-attack-strategies) |
| 12 | +4. Interact with your target model to evaluate its response to potentially harmful inputs |
| 13 | + |
| 14 | +## Prerequisites |
| 15 | + |
| 16 | +- Python 3.10+ |
| 17 | +- An Azure subscription with access to Azure OpenAI |
| 18 | +- An Azure AI Project |
| 19 | +- Ollama (or another model service) running locally or accessible via API as your target for AI red teaming |
| 20 | + |
| 21 | +## Setup Instructions |
| 22 | + |
| 23 | +### 1. Install Required Dependencies |
| 24 | + |
| 25 | +```bash |
| 26 | +pip install semantic-kernel "azure-ai-evaluation[redteam]" python-dotenv requests |
| 27 | +``` |
| 28 | + |
| 29 | +### 2. Environment Variables |
| 30 | + |
| 31 | +Create a `.env` file in the same directory as `template.py` with the following variables: |
| 32 | + |
| 33 | +``` |
| 34 | +# Azure OpenAI Configuration |
| 35 | +AZURE_OPENAI_ENDPOINT=https://your-resource-name.openai.azure.com/ |
| 36 | +AZURE_OPENAI_DEPLOYMENT_NAME=your-deployment-name |
| 37 | +AZURE_OPENAI_API_KEY=your-api-key |
| 38 | + |
| 39 | +# Azure AI Project Configuration |
| 40 | +AZURE_SUBSCRIPTION_ID=your-subscription-id |
| 41 | +AZURE_RESOURCE_GROUP=your-resource-group |
| 42 | +AZURE_PROJECT_NAME=your-project-name |
| 43 | +``` |
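
A missing variable tends to surface later as an opaque authentication or configuration error, so it can help to check for all of them up front. The sketch below is one way to do that. `REQUIRED_VARS` and `missing_env_vars` are hypothetical names that are not part of `template.py`, and the check assumes the `.env` file has already been loaded into the process environment (for example with `python-dotenv`).

```python
import os
from typing import Mapping, Optional

# Variable names taken from the .env example above.
REQUIRED_VARS = (
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_DEPLOYMENT_NAME",
    "AZURE_OPENAI_API_KEY",
    "AZURE_SUBSCRIPTION_ID",
    "AZURE_RESOURCE_GROUP",
    "AZURE_PROJECT_NAME",
)


def missing_env_vars(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
```

Accepting an optional mapping keeps the check easy to exercise without touching the real environment.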
| 44 | + |
| 45 | +### 3. Configure Target Model |
| 46 | + |
| 47 | +By default, the template uses Ollama as the target model. To use Ollama: |
| 48 | + |
| 49 | +1. Make sure Ollama is installed and running on your machine |
| 50 | +2. Update the model name in the `call_ollama` function: |
| 51 | + |
| 52 | +```python |
| 53 | +payload = {"model": "llama2", "prompt": query, "stream": False} # Replace "llama2" with your model |
| 54 | +``` |
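
For orientation, a target function for Ollama might look roughly like the sketch below. This is not the code from `template.py`: it uses the standard library's `urllib` rather than `requests`, and it assumes Ollama's default local endpoint (`http://localhost:11434/api/generate`) and a non-streaming reply whose generated text sits in the `response` field. `build_payload` and `call_ollama_sketch` are hypothetical names.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(query: str, model: str = "llama2") -> dict:
    """Build the request body for a single, non-streaming generation."""
    return {"model": model, "prompt": query, "stream": False}


def call_ollama_sketch(query: str, model: str = "llama2", url: str = OLLAMA_URL) -> str:
    """Send a prompt to the target and return its text, or "error" on failure."""
    data = json.dumps(build_payload(query, model)).encode("utf-8")
    try:
        # Supplying `data` makes urllib issue a POST request.
        request = urllib.request.Request(
            url, data=data, headers={"Content-Type": "application/json"}
        )
        with urllib.request.urlopen(request, timeout=60) as reply:
            return json.loads(reply.read().decode("utf-8"))["response"]
    except (OSError, ValueError, KeyError) as e:
        # OSError covers connection failures and timeouts; ValueError covers
        # malformed URLs and invalid JSON; KeyError covers a missing field.
        print(f"Error occurred: {e}")
        return "error"
```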
| 55 | + |
| 56 | +### 4. Changing the Target Model |
| 57 | + |
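| 58 | +To use a different target model instead of Ollama, replace the `call_ollama` function with your own target function that takes a prompt string and returns the model's response as a string. For example, to call another HTTP API (this assumes `os` and `requests` are already imported in `template.py`): |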
| 59 | + |
| 60 | +```python |
| 61 | +def call_custom_api(query: str) -> str: |
| 62 | +    """ |
| 63 | +    Call a custom API with a prompt and return the response. |
| 64 | +    """ |
| 65 | +    url = "https://your-api-endpoint.com/generate" |
| 66 | +    headers = { |
| 67 | +        "Authorization": f"Bearer {os.environ.get('API_KEY')}", |
| 68 | +        "Content-Type": "application/json" |
| 69 | +    } |
| 70 | +    payload = {"prompt": query, "max_tokens": 300} |
| 71 | +    try: |
| 72 | +        response = requests.post(url, headers=headers, json=payload, timeout=60) |
| 73 | +        response.raise_for_status()  # Treat HTTP 4xx/5xx responses as errors |
| 74 | +        return response.json()["response"] |
| 75 | +    except (requests.RequestException, ValueError, KeyError) as e: |
| 76 | +        print(f"Error occurred: {e}") |
| 77 | +        return "error" |
| 78 | +``` |
| 79 | + |
| 80 | +Then, update the `main` function to use your new target function: |
| 81 | + |
| 82 | +```python |
| 83 | +# Initialize the RedTeamPlugin with the target function |
| 84 | +red_team_plugin = RedTeamPlugin( |
| 85 | +    subscription_id=subscription_id, |
| 86 | +    resource_group=resource_group, |
| 87 | +    project_name=project_name, |
| 88 | +    target_func=call_custom_api  # Use your new function here |
| 89 | +) |
| 90 | +``` |
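
Before pointing the agent at a live endpoint, it can be handy to wire in a stand-in target that needs no network access. The sketch below assumes, as the example above suggests, that a target function takes a prompt string and returns a string. `echo_target` is a hypothetical name and is not part of the template.

```python
def echo_target(query: str) -> str:
    """Stand-in target: return a canned reply instead of calling a model."""
    # Truncate long prompts so the stub reply stays readable in logs.
    preview = query if len(query) <= 40 else query[:40] + "..."
    return f"[stub reply] received {len(query)} characters: {preview}"
```

Passing `target_func=echo_target` should then let you exercise the surrounding plumbing without sending prompts to a real model.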
| 91 | + |
| 92 | +## Running the Agent |
| 93 | + |
| 94 | +Execute the script: |
| 95 | + |
| 96 | +```bash |
| 97 | +python template.py |
| 98 | +``` |
| 99 | + |
| 100 | +The script will: |
| 101 | +1. Run through predefined demonstration messages to show the agent's capabilities |
| 102 | +2. Enter interactive mode where you can interact with the agent directly |
| 103 | + |
| 104 | +## Usage Examples |
| 105 | + |
| 106 | +Here are some example commands you can use in interactive mode: |
| 107 | + |
| 108 | +- `Fetch a harmful prompt in the violence category` |
| 109 | +- `Fetch a harmful prompt in the harassment category` |
| 110 | +- `Convert [prompt] using base64_converter` |
| 111 | +- `Send [prompt] to my target` |
| 112 | + |
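
To make the conversion step concrete, the sketch below shows what base64-encoding a prompt produces. It uses Python's standard `base64` module purely for illustration and is not the converter implementation used by PyRIT or the template. `to_base64` and `from_base64` are hypothetical helpers.

```python
import base64


def to_base64(prompt: str) -> str:
    """Encode a prompt's UTF-8 bytes as a base64 string."""
    return base64.b64encode(prompt.encode("utf-8")).decode("ascii")


def from_base64(encoded: str) -> str:
    """Decode a base64 string back into the original prompt."""
    return base64.b64decode(encoded).decode("utf-8")


print(to_base64("hello"))  # aGVsbG8=
```

The encoded form carries the same content as the original prompt, which is why it is useful for checking whether a target's safeguards still apply to obfuscated input.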
| 113 | +## Customizing Agent Behavior |
| 114 | + |
| 115 | +To change the agent's instructions or behavior, modify the `instructions` parameter when creating the agent: |
| 116 | + |
| 117 | +```python |
| 118 | +agent = ChatCompletionAgent( |
| 119 | +    service=service, |
| 120 | +    name="RedTeamAgent", |
| 121 | +    instructions="Your custom instructions here...", |
| 122 | +    plugins=[red_team_plugin], |
| 123 | +) |
| 124 | +``` |
| 125 | + |
| 126 | +## Security and Responsible Use |
| 127 | + |
| 128 | +This tool is intended for authorized red team exercises and security evaluations only. Always ensure you have proper permission to test any AI system and that you're following all applicable policies and guidelines. |