This repository was archived by the owner on Oct 21, 2025. It is now read-only.

Commit 5ade364

stared and claude authored
Precommit (#9)
* Add pre-commit hooks configuration

- Add .pre-commit-config.yaml with exact command matching
- Use local hooks to ensure consistency with pyproject.toml settings
- Run checks in order: ty → ruff check → ruff format
- Add pre-commit setup instructions to CLAUDE.md
- Include general file quality checks (trailing whitespace, etc.)

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>

* Fix all type errors and linting issues

- Remove multi_turn parameters from all test classes
- Delete unused experimental files (exploit_simple.py, exploit_unified.py)
- Delete simplified_test.py
- Use Any type for flexible client handling (ANN401 already ignored)
- Fix trailing whitespace and end-of-file issues
- Update base class to accept any client type
- Remove unused imports

* Optimize pre-commit hooks configuration

- Enable auto-fixing: ruff check --fix and ruff format (no --check)
- Remove redundant hooks (trailing-whitespace, end-of-file-fixer, etc)
- Keep only essential validators: YAML, JSON, TOML, merge-conflict
- Ruff already handles Python formatting/whitespace issues

* Revert JSON files to origin/main state

- Undo end-of-file changes to generated JSON files
- These files should not be modified by pre-commit hooks

* Update README with accurate pre-commit setup instructions

- Document uv tool install with pre-commit-uv plugin
- List what hooks do automatically (type check, lint, format)
- Clarify that hooks use same commands/settings as manual runs
- Note that ruff auto-fixes safe issues

---------

Co-authored-by: Claude <[email protected]>
1 parent ebc9a98 commit 5ade364

29 files changed, +435 -840 lines changed

.env.template

Lines changed: 1 addition & 1 deletion
@@ -1,2 +1,2 @@
 # OpenRouter API Configuration
-OPENROUTER_API_KEY=your_openrouter_api_key_here
+OPENROUTER_API_KEY=your_openrouter_api_key_here

.pre-commit-config.yaml

Lines changed: 50 additions & 0 deletions
@@ -0,0 +1,50 @@
+# Pre-commit hooks configuration
+# Install: uv tool install pre-commit pre-commit-uv
+# Setup: pre-commit install
+#
+# IMPORTANT: These hooks use the EXACT same commands as manual runs
+# to ensure consistency with pyproject.toml settings
+
+repos:
+  # uv-specific hooks
+  - repo: https://github.com/astral-sh/uv-pre-commit
+    rev: 0.8.11
+    hooks:
+      - id: uv-lock
+
+  # Run checks in the same order as CI and manual commands
+  # All use 'local' repo to ensure we use exact commands with pyproject.toml settings
+  - repo: local
+    hooks:
+      # 1. Type checking with ty (first)
+      - id: ty-check
+        name: Type check with ty
+        entry: uv run ty check src
+        language: system
+        pass_filenames: false
+        always_run: true
+
+      # 2. Linting with ruff (second) - auto-fixes safe issues
+      - id: ruff-check
+        name: Lint with ruff
+        entry: uv run ruff check src --fix
+        language: system
+        pass_filenames: false
+        always_run: true
+
+      # 3. Formatting with ruff (third) - auto-applies fixes
+      - id: ruff-format
+        name: Format with ruff
+        entry: uv run ruff format src
+        language: system
+        pass_filenames: false
+        always_run: true
+
+  # Minimal but useful file checks
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v5.0.0
+    hooks:
+      - id: check-yaml # Validate YAML syntax
+      - id: check-json # Validate JSON syntax
+      - id: check-toml # Validate TOML syntax (pyproject.toml)
+      - id: check-merge-conflict # Prevent committing merge markers
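
The three local hooks above amount to running fixed commands in a fixed order. As a rough illustration only (this script is not part of the commit), the same sequence could be sketched in Python. The command lists are copied from the `entry` fields; the names `CHECKS`, `run_checks`, and `all_passed` are made up for this example, and the sketch assumes every command is run even if an earlier one fails.

```python
from __future__ import annotations

import subprocess
from collections.abc import Callable, Sequence

# Commands copied from the hook `entry` fields, in the configured order.
CHECKS: list[list[str]] = [
    ["uv", "run", "ty", "check", "src"],
    ["uv", "run", "ruff", "check", "src", "--fix"],
    ["uv", "run", "ruff", "format", "src"],
]


def _run_subprocess(cmd: Sequence[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(list(cmd), check=False).returncode


def run_checks(
    checks: Sequence[Sequence[str]] = CHECKS,
    run: Callable[[Sequence[str]], int] = _run_subprocess,
) -> dict[str, int]:
    """Run every command in order; map each command line to its exit code."""
    results: dict[str, int] = {}
    for cmd in checks:
        results[" ".join(cmd)] = run(cmd)
    return results


def all_passed(results: dict[str, int]) -> bool:
    """True only if every command exited with status 0."""
    return all(code == 0 for code in results.values())
```

The `run` parameter is injectable so the ordering logic can be exercised without `uv`, `ty`, or `ruff` installed; calling `run_checks()` with no arguments would invoke the real tools.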

CLAUDE.md

Lines changed: 26 additions & 18 deletions
@@ -67,24 +67,6 @@ uv run setup
 # Run tests
 uv run pentest
 
-# Run red team attack testing
-uv run attack # Single attack
-
-# Run adaptive batch campaigns (learns between attempts)
-uv run attack --batch 5 # 5 attempts with learning
-uv run attack --batch 10 --steps 2 # 10 attempts, 2 turns each
-
-# With different attacker models (default: Claude Opus 4.1)
-uv run attack --attacker-model openai/gpt-4o --batch 5
-uv run attack --attacker-model meta-llama/llama-3.1-405b-instruct
-uv run attack --attacker-model google/gemini-2.0-flash-exp
-
-# Use custom prompts
-uv run attack --custom
-
-# View and analyze sessions
-uv run sessions
-
 # Code quality checks (ALWAYS run before committing)
 uv run ty check src # Type checking first
 uv run ruff check src # Then linting
@@ -101,6 +83,32 @@ uv run ruff format src # Finally formatting
 
 If any of these fail, fix the issues before committing. The GitHub Actions CI will run these same checks in this order.
 
+### Pre-Commit Hooks (Automated)
+
+This project uses pre-commit hooks to automatically run checks before each commit.
+
+**Setup (one-time):**
+```bash
+# Install pre-commit with uv plugin for faster Python hook installation
+uv tool install pre-commit --with pre-commit-uv
+
+# Install hooks into git
+pre-commit install
+```
+
+**Manual run (test all files):**
+```bash
+pre-commit run --all-files
+```
+
+The hooks will automatically run on `git commit` and check:
+1. Type checking with `ty` (first)
+2. Linting with `ruff` (second)
+3. Formatting with `ruff` (third)
+4. General file quality (trailing whitespace, file endings, etc.)
+
+Files are checked in the same order as CI: types → lint → format
+
 ### Git Workflow
 
 - Create feature branches for significant changes

README.md

Lines changed: 28 additions & 14 deletions
@@ -6,8 +6,8 @@ A comprehensive security testing framework for the OpenAI GPT-OSS-20B Red Teamin
 
 ## Competition Overview
 
-**Target**: OpenAI's gpt-oss-20b open weight model
-**Goal**: Find up to 5 distinct vulnerabilities with high Severity, Breadth, Novelty, Reproducibility, and Methodological Insight
+**Target**: OpenAI's gpt-oss-20b open weight model
+**Goal**: Find up to 5 distinct vulnerabilities with high Severity, Breadth, Novelty, Reproducibility, and Methodological Insight
 **Deadline**: August 26, 2025
 
 ## 🛠️ Getting Started
@@ -50,10 +50,10 @@ A comprehensive security testing framework for the OpenAI GPT-OSS-20B Red Teamin
 # Or test specific categories
 uv run pentest -c deception
 uv run pentest --test-id adderall_001
-
+
 # Run multi-turn attack testing
 uv run attack
-
+
 # View and analyze sessions
 uv run sessions
 ```
@@ -65,18 +65,18 @@ A comprehensive security testing framework for the OpenAI GPT-OSS-20B Red Teamin
 ```bash
 # Run single attack (Claude Opus 4.1 default)
 uv run attack
-
+
 # Run adaptive campaign with multiple attempts
 uv run attack --batch 5 # Run 5 attempts, learning from each
 uv run attack --batch 10 --steps 2 # 10 attempts, each with 2 turns
-
+
 # Try different attacker models
 uv run attack --attacker-model openai/gpt-4o
 uv run attack --attacker-model meta-llama/llama-3.1-405b-instruct --batch 5
-
+
 # Use custom prompts instead of AI-generated
 uv run attack --custom
-
+
 # View and analyze past sessions
 uv run sessions
 ```
@@ -190,7 +190,7 @@ red-team-testbed/
 ├── src/
 │   ├── cli/ # CLI entry points
 │   │   ├── pentest.py # Security test runner
-│   │   ├── review.py # Results evaluation interface
+│   │   ├── review.py # Results evaluation interface
 │   │   ├── setup.py # Environment verification
 │   │   ├── findings.py # Findings browser and export
 │   │   ├── report.py # Report generation
@@ -208,7 +208,7 @@ red-team-testbed/
 │   ├── models.py # Pydantic data models
 │   └── constants.py # Configuration constants
 ├── findings/ # Competition submission files
-├── results/ # Test execution results
+├── results/ # Test execution results
 └── pyproject.toml # Project configuration
 ```
 
@@ -234,7 +234,7 @@ uv run review results/test.json # Review specific file
 uv run findings # Browse exported findings
 # Use 'e' to export individual competition files
 
-# Generate reports and summaries
+# Generate reports and summaries
 uv run report # Comprehensive analysis report
 ```
 
@@ -443,14 +443,28 @@ uv run ty check src
 
 ### Setting up Pre-commit Hooks
 
-Install pre-commit hooks for automatic checks:
+Pre-commit hooks automatically run type checking, linting, and formatting before each commit:
 
 ```bash
-uv pip install pre-commit
+# Install pre-commit with uv plugin for faster Python hook installation
+uv tool install pre-commit --with pre-commit-uv
+
+# Install hooks into git
 pre-commit install
-pre-commit run --all-files # Run manually on all files
+
+# Run manually on all files (optional)
+pre-commit run --all-files
 ```
 
+The hooks will automatically:
+1. Type check with `ty` (catches type errors)
+2. Lint and auto-fix with `ruff check --fix` (fixes safe issues)
+3. Format with `ruff format` (ensures consistent style)
+4. Validate YAML, JSON, and TOML files
+5. Check for merge conflicts
+
+All checks use the exact same commands and settings from `pyproject.toml`.
+
 ### Test Coverage
 
 The project aims for high test coverage. Current test suite includes:

__init__.py

Lines changed: 1 addition & 1 deletion
@@ -5,4 +5,4 @@
 
 __version__ = "1.0.0"
 __author__ = "Red Team Testbed Contributors"
-__description__ = "Red-teaming testbed for OpenAI's GPT-OSS-20B model competition"
+__description__ = "Red-teaming testbed for OpenAI's GPT-OSS-20B model competition"

competition.md

Lines changed: 1 addition & 1 deletion
@@ -106,4 +106,4 @@ This competition represents a significant effort by OpenAI to:
 - Create a structured evaluation process with expert oversight
 - Build a comprehensive understanding of model vulnerabilities
 
-The competition emphasizes finding **novel** vulnerabilities that haven't been previously identified, making original research and creative approaches particularly valuable.
+The competition emphasizes finding **novel** vulnerabilities that haven't been previously identified, making original research and creative approaches particularly valuable.

config.yaml

Lines changed: 2 additions & 2 deletions
@@ -35,7 +35,7 @@ categories:
   - evaluation_awareness
   - chain_of_thought
   - best
-
+
 focus_categories:
   - deception_samples # Primary focus for initial implementation
   - deception_adderall # Academic pentesting for medical advice vulnerabilities
@@ -78,4 +78,4 @@ competition:
   max_findings: 5
   team_name: "red-team-testbed"
   export_format: "kaggle"
-  validate_findings: true
+  validate_findings: true

findings/.gitkeep

Lines changed: 1 addition & 1 deletion
@@ -1,2 +1,2 @@
 # This file ensures the findings directory is tracked in git
-# Exported findings from the CLI UI will be saved here
+# Exported findings from the CLI UI will be saved here

pyproject.toml

Lines changed: 1 addition & 0 deletions
@@ -91,6 +91,7 @@ select = [
 ]
 ignore = [
     "E501", # line too long - handled by formatter
+    "ANN401", # Dynamically typed expressions (Any) - needed for flexible test interfaces
 ]
 
 [tool.ruff.lint.isort]

src/categories/base.py

Lines changed: 1 addition & 1 deletion
@@ -35,7 +35,7 @@ class BaseTest:
 class BaseTester(ABC):
     """Abstract base class for all vulnerability category testers"""
 
-    def __init__(self, client: OllamaClient) -> None:
+    def __init__(self, client: Any) -> None:
         self.client = client
         self.evaluator = ResponseEvaluator()
         self.test_cases = self._initialize_test_cases()
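
Switching the annotation from `OllamaClient` to `Any` accepts any client but gives up static checking of how the client is used. One possible alternative, shown here purely as a sketch, is a structural `typing.Protocol`. The diff does not show which methods the testers call on the client, so the `generate` method, `ChatClient`, `EchoClient`, and `Tester` are all hypothetical names chosen for illustration, not the repository's API.

```python
from __future__ import annotations

from typing import Protocol


class ChatClient(Protocol):
    """Hypothetical interface: any object with a matching `generate` method."""

    def generate(self, prompt: str) -> str: ...


class EchoClient:
    """Stand-in client for this example; it does not inherit from ChatClient."""

    def generate(self, prompt: str) -> str:
        return f"echo: {prompt}"


class Tester:
    """Simplified stand-in for BaseTester, not the repository's class."""

    def __init__(self, client: ChatClient) -> None:
        # A type checker accepts any object whose shape matches ChatClient.
        self.client = client

    def ask(self, prompt: str) -> str:
        return self.client.generate(prompt)
```

With a protocol, `Tester(EchoClient())` type-checks because the method signatures match, while an object lacking `generate` would be reported by the type checker. `Any`, by contrast, silences such errors, which is why the commit also adds `ANN401` to the ruff ignore list.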
