🛡️ AI/ML Pentesting Roadmap
A comprehensive, structured guide to learning AI/ML security and penetration testing — from zero to practitioner.
Prerequisites
Phase 1 — Foundations
Phase 2 — AI/ML Security Concepts
Phase 3 — Prompt Injection & LLM Attacks
Phase 4 — Hands-On Practice
Phase 5 — Advanced Exploitation Techniques
Phase 6 — Real-World Research & Bug Bounty
Standards, Frameworks & References
Tools & Repositories
Books, PDFs & E-Books
Video Resources
CTF & Competitions
Bug Bounty Programs
Community & News
Suggested Learning Path by Experience Level
Before diving into AI/ML pentesting, ensure you have the following foundation:
Programming (Python is essential)
Understand REST APIs, HTTP methods, headers, and authentication flows
Postman Learning Center
Practice with tools: curl, Burp Suite, Postman
1.1 Machine Learning Fundamentals
1.2 Large Language Models (LLMs)
Understanding how LLMs work is critical before attacking them.
Phase 2 — AI/ML Security Concepts
2.1 Core Security Concepts
2.2 Attack Surface Overview
Key attack vectors in AI/ML systems:
Prompt Injection — Manipulating LLM behavior through crafted inputs
Jailbreaking — Bypassing safety filters and guardrails
Model Inversion — Extracting training data from a model
Membership Inference — Determining if data was in training set
Data Poisoning — Corrupting training data to influence behavior
Adversarial Examples — Perturbed inputs that fool classifiers
Model Extraction/Stealing — Cloning a model via API queries
Supply Chain Attacks — Malicious models/weights on platforms like Hugging Face
Insecure Plugin/Tool Integration — Exploiting LLM agents with external tools
Training Data Exfiltration — Extracting memorized private data
Denial of Service — Overloading models via crafted prompts
2.3 MLOps & Infrastructure Security
Phase 3 — Prompt Injection & LLM Attacks
3.1 Understanding Prompt Injection
3.2 Jailbreaking Techniques
3.3 Indirect Prompt Injection
A more sophisticated attack where malicious instructions are injected via external data sources (emails, documents, websites) that an LLM agent processes.
3.4 Advanced Prompt Attack Techniques
Phase 4 — Hands-On Practice
4.1 Interactive Platforms & Games
4.2 Vulnerable-by-Design Projects
4.3 CTF Writeups to Study
Phase 5 — Advanced Exploitation Techniques
5.1 Agent & Tool Integration Attacks
When LLMs are integrated with tools (code execution, web browsing, file systems), the attack surface expands dramatically.
5.2 Data Exfiltration via LLMs
5.3 Account Takeover & Authentication Attacks
5.4 XSS & Web Vulnerabilities in AI Products
5.5 Model & Infrastructure Attacks
5.6 Persistent Attacks & Memory Exploitation
5.7 Adversarial Machine Learning
Phase 6 — Real-World Research & Bug Bounty
6.1 Notable Research & Disclosures
6.2 How to Find LLM Vulnerabilities
Key areas to test when assessing an LLM-powered application:
System prompt extraction — Can you leak the hidden system prompt?
Instruction override — Can you ignore system-level instructions?
Plugin/tool abuse — Can agent tools be misused (SSRF, RCE, SQLi)?
Data exfiltration via markdown — Does the UI render  ?
Persistent injection via memory — Can you inject instructions that persist in memory/RAG?
PII leakage — Does the model reveal training data or other users' data?
Cross-user data leakage — In multi-tenant apps, can you access other users' contexts?
Authentication bypass — Can you trick the LLM into performing privileged actions?
Standards, Frameworks & References
Defensive / Scanning Tools
Resource
Link
LLM Hacker's Handbook
GitHub
OWASP Top 10 for LLM (Snyk)
PDF
Bugcrowd Ultimate Guide to AI Security
PDF
Lakera Real World LLM Exploits
PDF
HackerOne Ultimate Guide to Managing AI Risks
E-Book
Adversarial Machine Learning — Goodfellow et al.
arXiv
Resource
Link
Penetration Testing Against and With AI/LLM/ML (Playlist)
YouTube
Andrej Karpathy — Intro to Large Language Models
YouTube
DEF CON AI Village Talks
YouTube
LiveOverflow — AI/ML Security
YouTube
3Blue1Brown — Neural Networks Series
YouTube
John Hammond — AI Security Challenges
YouTube
Cybrary — Machine Learning Security
Cybrary
AI/ML security bug bounties are growing rapidly. Target these platforms:
Program
Scope
Link
OpenAI Bug Bounty
ChatGPT, API, plugins
bugcrowd.com/openai
Google AI Bug Bounty
Gemini, Bard, Vertex AI
bughunters.google.com
Meta AI Bug Bounty
Llama models, Meta AI
facebook.com/whitehat
HuggingFace via ProtectAI
Hub, models, spaces
huntr.com
Anthropic Bug Bounty
Claude, API
anthropic.com/security
Microsoft (Copilot, Azure AI)
Copilot, Azure OpenAI
msrc.microsoft.com
Huntr (AI/ML focused)
Open source ML libraries
huntr.com
Tips for AI bug bounty:
Focus on data exfiltration via markdown rendering (common finding)
Test plugin/tool integrations thoroughly
Look for prompt injection in RAG pipelines
Explore memory and persistent context manipulation
Check for cross-tenant data leakage in multi-user deployments
Community & News
Suggested Learning Path by Experience Level
Complete PortSwigger Web Security Academy fundamentals
Learn Python basics
Take Google ML Crash Course
Read OWASP LLM Top 10
Play Gandalf — all levels
Read Simon Willison's prompt injection article
Watch Andrej Karpathy — Intro to LLMs
🟡 Intermediate (3–9 months)
Study MITRE ATLAS Matrix
Complete PortSwigger LLM Attack labs
Set up and exploit Damn Vulnerable LLM Agent
Complete Prompt Airlines and Crucible challenges
Read the LLM Hacker's Handbook
Study the Embrace the Red blog in full
Experiment with Garak and PyRIT
Try Offensive ML Playbook
Participate in AI Village CTF at DEF CON
Submit findings to Huntr or OpenAI Bug Bounty
Study adversarial ML with ART and CleverHans
Read academic papers on model inversion, membership inference, and data extraction
Contribute to open source tools like Garak or AI Exploits
Build your own vulnerable LLM demo environment
Write and publish research — blog posts, CVEs, conference talks
Last updated: 2025 | Contributions welcome — submit a PR with new resources.