Unit420 Kaggle Submission File Mapping
Competition Findings Files
| Competition File | Original Source | Vulnerability Type | Severity | Description |
| --- | --- | --- | --- | --- |
| unit420.findings.1.json | proper_high_severity_finding_system_override_dangerous.json | Authority Override | 9 | System override with dangerous content requests |
| unit420.findings.2.json | proper_high_severity_finding_unconstrained_expert.json | Role Confusion | 8 | Unconstrained expert roleplay bypass |
| unit420.findings.3.json | proper_high_severity_finding_escalating_harm_chain.json | Progressive Escalation | 8 | Multi-turn escalating harm conversation |
| unit420.findings.4.json | proper_high_severity_finding_code_dangerous_instructions.json | Technical Obfuscation | 7 | Dangerous instructions disguised as code |
| unit420.findings.5.json | proper_high_severity_finding_emergency_dangerous_help.json | Context Manipulation | 8 | Emergency context to justify harmful content |
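
The mapping above can be applied mechanically. The following is a minimal sketch, not the team's actual export script: the helper name `export_findings` and the directory arguments are hypothetical, and it assumes the original source files sit together in one directory.

```python
import shutil
from pathlib import Path

# Competition filename -> original source filename, copied from the table above.
FINDINGS_MAP = {
    "unit420.findings.1.json": "proper_high_severity_finding_system_override_dangerous.json",
    "unit420.findings.2.json": "proper_high_severity_finding_unconstrained_expert.json",
    "unit420.findings.3.json": "proper_high_severity_finding_escalating_harm_chain.json",
    "unit420.findings.4.json": "proper_high_severity_finding_code_dangerous_instructions.json",
    "unit420.findings.5.json": "proper_high_severity_finding_emergency_dangerous_help.json",
}


def export_findings(source_dir, dest_dir):
    """Copy each original finding to its competition filename.

    Hypothetical helper: the directory layout is an assumption,
    not something the mapping document specifies.
    """
    source_dir, dest_dir = Path(source_dir), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for competition_name, original_name in FINDINGS_MAP.items():
        src = source_dir / original_name
        # Fail loudly rather than silently skipping a missing finding.
        if not src.is_file():
            raise FileNotFoundError(f"missing source finding: {src}")
        target = dest_dir / competition_name
        shutil.copyfile(src, target)
        written.append(target)
    return written
```

Calling `export_findings("findings_src", "submission")` (both paths are placeholders) would return the five written paths, or raise if any source file is absent.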
| File | Purpose | Vulnerability Categories |
| --- | --- | --- |
| proper_red_team_harness.py | Main systematic probing framework | Comprehensive assessment across all categories |
| run_all_proper_assessments.py | Master execution script | Automated execution of all assessment tools |
| proper_enhanced_severity_targeting.py | High-severity assessment engine | Authority override, role confusion, context manipulation |
| proper_deceptive_alignment_detector.py | Deceptive alignment testing | Strategic deception, capability misrepresentation |
| proper_evaluation_awareness_tester.py | Evaluation awareness testing | Context-dependent behavior, safety theater |
| proper_reasoning_chain_exploiter.py | Reasoning chain exploitation | Chain-of-thought vulnerabilities, progressive escalation |
| proper_best_results_validator.py | Validation methodology | False positive elimination, evasion rate measurement |
Assessment Results (Clean Data)
| File | Source Assessment | Date | Vulnerabilities Found |
| --- | --- | --- | --- |
| proper_severity_assessment_20250816_203928.csv | Enhanced Severity | 2025-08-16 | 15 vulnerabilities |
| proper_deception_assessment_20250816_200301.csv | Deceptive Alignment | 2025-08-16 | 7 vulnerabilities |
| proper_awareness_assessment_20250816_203244.csv | Evaluation Awareness | 2025-08-16 | 9 vulnerabilities |
| proper_reasoning_assessment_20250816_201208.csv | Reasoning Exploits | 2025-08-16 | 12 vulnerabilities |
Total: 43 validated vulnerabilities across 4 assessment categories
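
As a quick consistency check on the stated total, the per-file counts can be summed directly. This sketch only re-adds the figures reported in this document; it does not read the CSV files, whose column layout is not described here.

```python
# Vulnerability counts as reported in this document, keyed by results file.
REPORTED_COUNTS = {
    "proper_severity_assessment_20250816_203928.csv": 15,
    "proper_deception_assessment_20250816_200301.csv": 7,
    "proper_awareness_assessment_20250816_203244.csv": 9,
    "proper_reasoning_assessment_20250816_201208.csv": 12,
}

total = sum(REPORTED_COUNTS.values())
categories = len(REPORTED_COUNTS)

# The document states 43 validated vulnerabilities across 4 categories.
assert total == 43
assert categories == 4
print(f"{total} vulnerabilities across {categories} assessment categories")
```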
| File | Purpose |
| --- | --- |
| Unit420_Kaggle_Writeup.md | Main competition submission writeup |
| METHODOLOGY_DOCUMENTATION.md | Detailed methodology and technical implementation |
| RESULTS_ANALYSIS_FOR_WRITEUP.md | Analysis of which results to include/exclude |
Team Name: Unit420
Team Lead: Yad Konrad (yad.konrad@gmail.com)
GitHub: @0bserver07
Social: @yadkonrad
Team Concept: Unit420 - Creative team name for the competition
Methodology: Precision-focused red-teaming with validated refusal detection
Key Innovation: Hybrid refusal detection eliminating false positives (0/43 FP rate)
Quality over Quantity: 5 validated high-severity findings vs. inflated vulnerability counts
Methodological Rigor: Systematic assessment with proper validation
Reproducible Framework: Open-source scripts with detailed documentation
Competition Alignment: Categories derived directly from official competition topics