|
| 1 | +# OpenAI GPT-OSS-20B Red Teaming Challenge |
| 2 | + |
| 3 | +## Competition Overview |
| 4 | + |
| 5 | +The OpenAI GPT-OSS-20B Red Teaming Challenge is a Kaggle-hosted competition with a **$500,000 prize pool** aimed at strengthening open source AI safety. The competition invites researchers, developers, and enthusiasts worldwide to help uncover novel risks and vulnerabilities in OpenAI's new open-weight model, gpt-oss-20b. |
| 6 | + |
| 7 | +## Timeline |
| 8 | + |
| 9 | +- **Start Date**: August 5, 2025 |
| 10 | +- **Submission Deadline**: August 26, 2025 at 11:59 PM UTC |
| 11 | +- **Judging Period**: August 27 - September 11, 2025 |
| 12 | +- **Winners Announcement**: September 15, 2025 (estimated) |
| 13 | +- **Virtual Workshop**: October 7, 2025 |
| 14 | + |
| 15 | +## Prize Pool |
| 16 | + |
| 17 | +**Total: $500,000** |
| 18 | + |
| 19 | +The prize fund will be awarded based on review from a panel of expert judges from OpenAI and other leading labs. |
| 20 | + |
| 21 | +## Submission Requirements |
| 22 | + |
| 23 | +### What to Submit |
| 24 | +- Up to **5 distinct issues/vulnerabilities** |
| 25 | +- Detailed, reproducible report for each finding |
| 26 | +- Kaggle Writeup detailing strategy and discovery process |
| 27 | + |
| 28 | +### Format |
| 29 | +- Structured format submissions |
| 30 | +- Clear reproduction steps |
| 31 | +- Evidence of the vulnerability |
| 32 | +- Impact assessment |
| 33 | + |
| 34 | +## Vulnerability Categories (Topics of Interest) |
| 35 | + |
| 36 | +The competition focuses on sophisticated forms of model failure, including: |
| 37 | + |
| 38 | +1. **Deceptive Alignment** |
| 39 | + - Model behaving differently during evaluation vs deployment |
| 40 | + - Hidden goals or motivations |
| 41 | + |
| 42 | +2. **Reward Hacking** |
| 43 | + - Exploiting reward mechanisms in unintended ways |
| 44 | + - Gaming the system without solving the actual task |
| 45 | + |
| 46 | +3. **Lying and Deception** |
| 47 | + - Model knowingly providing false information |
| 48 | + - Strategic deception to achieve goals |
| 49 | + |
| 50 | +4. **Harmful Behaviors** |
| 51 | + - Generating dangerous or harmful content |
| 52 | + - Bypassing safety mechanisms |
| 53 | + |
| 54 | +5. **Novel Exploits** |
| 55 | + - Previously undetected vulnerabilities |
| 56 | + - New attack vectors not covered by existing safety measures |
| 57 | + |
| 58 | +## Evaluation Criteria |
| 59 | + |
| 60 | +Submissions will be evaluated based on: |
| 61 | + |
| 62 | +1. **Novelty** - Has this vulnerability been previously identified? |
| 63 | +2. **Severity** - How serious is the potential harm? |
| 64 | +3. **Reproducibility** - Can the issue be consistently reproduced? |
| 65 | +4. **Breadth** - Does it affect multiple use cases or contexts? |
| 66 | +5. **Methodological Insight** - Does it reveal new understanding about model behavior? |
| 67 | + |
| 68 | +## Judging Panel |
| 69 | + |
| 70 | +- Experts from OpenAI |
| 71 | +- Representatives from other leading AI labs |
| 72 | +- Scoring based on safety research impact |
| 73 | + |
| 74 | +## Competition Goals |
| 75 | + |
| 76 | +1. **Identify Novel Vulnerabilities**: Find flaws that haven't been previously discovered or reported |
| 77 | +2. **Strengthen Open Source Safety**: Improve the safety of open-weight models |
| 78 | +3. **Community Engagement**: Leverage global expertise in AI safety |
| 79 | +4. **Knowledge Sharing**: Create open-source tools and datasets for the community |
| 80 | + |
| 81 | +## Post-Competition |
| 82 | + |
| 83 | +- Publication of a comprehensive report |
| 84 | +- Open-source evaluation dataset based on validated findings |
| 85 | +- Community benefits from shared learnings |
| 86 | +- Virtual workshop for knowledge exchange |
| 87 | + |
| 88 | +## Additional Notes |
| 89 | + |
| 90 | +- Encourages creativity and innovation in methodology |
| 91 | +- Rewards participants who share open-source tooling |
| 92 | +- Notebooks and code sharing are encouraged to help the broader community |
| 93 | +- Focus on responsible disclosure and safety research |
| 94 | + |
| 95 | +## Resources |
| 96 | + |
| 97 | +- **Competition Page**: https://www.kaggle.com/competitions/openai-gpt-oss-20b-red-teaming |
| 98 | +- **Model**: gpt-oss-20b (OpenAI's open-weight model) |
| 99 | +- **Platform**: Kaggle |
| 100 | + |
| 101 | +## Important Considerations |
| 102 | + |
| 103 | +This competition represents a significant effort by OpenAI to: |
| 104 | +- Engage the global community in AI safety |
| 105 | +- Provide substantial financial incentives for safety research |
| 106 | +- Create a structured evaluation process with expert oversight |
| 107 | +- Build a comprehensive understanding of model vulnerabilities |
| 108 | + |
| 109 | +The competition emphasizes finding **novel** vulnerabilities that haven't been previously identified, making original research and creative approaches particularly valuable. |
0 commit comments