NYU-LLM-CTF/CTFTiny

CTFTiny: Lite Benchmarking Offensive Cyber Skills in Large Language Models

This is the official repository for CTFTiny from "Towards Effective Offensive Security LLM Agents: Hyperparameter Tuning, LLM as a Judge, and a Lightweight CTF Benchmark" (AAAI'26) [paper].

For CTFJudge, please refer to the official CTFJudge repository.

Challenge List

| Category | Event | Name | Difficulty |
| --- | --- | --- | --- |
| cry | 2017f | ecxor | Hard |
| cry | 2017f | lupin | Hard |
| cry | 2018q | babycrypto | Easy |
| cry | 2019q | super_curve | Easy |
| cry | 2020f | hybrid2 | Very Easy |
| cry | 2020q | perfect_secrecy | Hard |
| cry | 2021f | collision_course | Moderate |
| cry | 2022q | beyond_quantum | Hard |
| cry | 2022f | open_ellipti_ph | Hard |
| cry | 2022f | polly_crack_this | Moderate |
| cry | 2022f | the_lengths_we_extend_ourselves | Hard |
| cry | 2023f | describeme | Moderate |
| for | 2018q | whyos | Hard |
| for | 2023q | 1black0white | Easy |
| pwn | 2017q | pilot | Hard |
| pwn | 2018q | bigboy | Moderate |
| pwn | 2018q | get_it | Moderate |
| pwn | 2019q | baby_boi | Hard |
| pwn | 2019q | got_milk | Hard |
| pwn | 2020q | roppity | Hard |
| pwn | 2020q | slithery | Hard |
| pwn | 2021q | password_checker | Moderate |
| pwn | 2023q | puffin | Very Easy |
| pwn | 2023q | target_practice | Easy |
| pwn | 2023q | unlimited_subway | Hard |
| rev | 2017q | tablez | Moderate |
| rev | 2018q | a_walk_through_x86_part_2 | Hard |
| rev | 2019q | gibberish_check | Hard |
| rev | 2019q | beleaf | Moderate |
| rev | 2020f | rap | Easy |
| rev | 2020f | sourcery | Moderate |
| rev | 2020q | baby_mult | Moderate |
| rev | 2020q | ezbreezy | Moderate |
| rev | 2021f | maze | Hard |
| rev | 2021q | checker | Very Easy |
| rev | 2022q | dockerleakage | Easy |
| rev | 2022q | the_big_bang | Hard |
| rev | 2023f | unvirtualization | Very Easy |
| rev | 2023q | rox | Hard |
| rev | 2023q | rebug_2 | Moderate |
| rev | 2023q | whataxor | Very Easy |
| web | 2021q | poem_collection | Easy |
| web | 2023f | shreeramquest | Hard |
| web | 2023q | smug_dino | Easy |
| msc | 2018f | showdown | Very Easy |
| msc | 2022q | quantum_leap | Hard |
| msc | 2018q | algebra | Hard |
| msc | 2021q | weak_password | Easy |
| msc | 2022q | ezmaze | Easy |
| msc | 2023q | android_dropper | Easy |
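
When selecting a subset of challenges for an experiment, it can help to load the list into a small data structure and tally it by category or difficulty. The following is a minimal sketch, not part of the CTFTiny tooling: the names `Challenge`, `parse_row`, `parse_rows`, and `SAMPLE_ROWS` are hypothetical, and the sample contains only a few entries copied from the list above. The event code is split into a four-digit year and its trailing letter (`q` or `f`), which is kept as-is because this README does not define it.

```python
from collections import Counter
from typing import NamedTuple


class Challenge(NamedTuple):
    """One challenge entry (hypothetical helper type, not from CTFTiny)."""
    category: str
    year: int
    event_round: str  # trailing letter of the event code, kept verbatim
    name: str
    difficulty: str


# A few entries copied from the challenge list; NOT the full benchmark.
SAMPLE_ROWS = """\
cry 2020f hybrid2 Very Easy
cry 2021f collision_course Moderate
for 2023q 1black0white Easy
pwn 2023q puffin Very Easy
rev 2021f maze Hard
web 2023q smug_dino Easy
msc 2018q algebra Hard
"""


def parse_row(line: str) -> Challenge:
    # Split into at most four fields so that a two-word difficulty
    # such as "Very Easy" stays in one piece.
    category, event, name, difficulty = line.split(maxsplit=3)
    return Challenge(category, int(event[:4]), event[4:], name, difficulty.strip())


def parse_rows(text: str) -> list[Challenge]:
    # Skip blank lines; every other line is expected to be one entry.
    return [parse_row(line) for line in text.splitlines() if line.strip()]


challenges = parse_rows(SAMPLE_ROWS)
by_category = Counter(c.category for c in challenges)
by_difficulty = Counter(c.difficulty for c in challenges)

for difficulty, count in by_difficulty.most_common():
    print(f"{difficulty}: {count}")
```

Splitting with `maxsplit=3` relies on challenge names containing no spaces, which holds for every entry in the list above.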

Usage

CTFTiny follows the same benchmark structure as NYU CTF Bench. Please refer to NYU CTF Bench for detailed usage.
