Security Datasets

Based on Awesome-Cybersecurity-Datasets, we aim to prepare a database of security datasets of ALL kinds with PREPROCESSING available :)

Let's collectively make our lives easier when we search for data to showcase our cool methods!!

Format of documents

We will worry about the directory structure later, but the following must be included with each note:

The name of the dataset
Relevant tags for the dataset (see below)
A brief description of it
The location or instructions on how to gain access to the dataset
BibTeX citation for the dataset
Pre-processing instructions for those who do not know how to use the dataset
(Optional but preferred) PyTorch Dataset class for how to load the dataset - with Train/Val/Test split options
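
A minimal sketch of what the optional loader could look like. The class name `SecurityCSVDataset`, the CSV layout (numeric feature columns plus a `label` column), and the 70/15/15 split are assumptions made for illustration, not a format this repo prescribes. The `try`/`except` import is only there so the sketch runs on machines without PyTorch.

```python
import csv
import random

try:
    from torch.utils.data import Dataset  # real base class when PyTorch is installed
except ImportError:  # fallback so the sketch still runs without PyTorch
    Dataset = object


class SecurityCSVDataset(Dataset):
    """Hypothetical map-style dataset over a CSV file with a `label` column."""

    # Split boundaries as integer percentages of the shuffled rows (assumed 70/15/15).
    SPLITS = {"train": (0, 70), "val": (70, 85), "test": (85, 100)}

    def __init__(self, csv_path, split="train", label_column="label", seed=0):
        if split not in self.SPLITS:
            raise ValueError(f"split must be one of {sorted(self.SPLITS)}")
        with open(csv_path, newline="") as handle:
            rows = list(csv.DictReader(handle))
        # Shuffle with a fixed seed so every split sees the same ordering
        # and the three splits never overlap.
        random.Random(seed).shuffle(rows)
        start, end = self.SPLITS[split]
        first = len(rows) * start // 100
        last = len(rows) * end // 100
        self.rows = rows[first:last]
        self.label_column = label_column

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, index):
        row = self.rows[index]
        # Every column except the label is treated as a numeric feature.
        features = [float(v) for k, v in row.items() if k != self.label_column]
        return features, int(row[self.label_column])
```

Something like `SecurityCSVDataset("data.csv", split="val")` would then give the validation portion. In a real loader `__getitem__` would probably return tensors (e.g. `torch.tensor(features)`); plain Python values are used here to keep the sketch light on dependencies.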

Curated list of tags!!!

IMPORTANT STANDARDIZATION: This repo will only be useful if we can accurately tag the datasets for easy lookup. Below are the features of interest and if any are updated, then the entire repository must be updated for consistency...

The tags are formatted to work in Obsidian, an organisational tool that can link Markdown files based on these tags in a cool UI.

Tags	Purpose
#network_traffic, #host
#urls, #domain_names
#malware, #binaries
#webapps, #software
#email, #fraud, #phishing, #passwords
#simulated/environment, #simulated/users, #real/attackers, #real/users
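
Since the repo is only useful if tags stay consistent, a small check could flag tags that are not in the curated list. This is only a sketch: the function name `find_unknown_tags` is hypothetical, the regular expression is a rough approximation of how a tag looks in a note (not Obsidian's exact rules), and `#honeypot` in the example is a made-up tag used to show a miss.

```python
import re

# Tags copied from the table above.
CURATED_TAGS = {
    "#network_traffic", "#host",
    "#urls", "#domain_names",
    "#malware", "#binaries",
    "#webapps", "#software",
    "#email", "#fraud", "#phishing", "#passwords",
    "#simulated/environment", "#simulated/users",
    "#real/attackers", "#real/users",
}

# Rough approximation of a tag: '#' at the start of a line or after whitespace,
# followed by a letter and then word characters or '/'.
# Markdown headings ('# Title', '## Title') do not match.
TAG_PATTERN = re.compile(r"(?<!\S)#[A-Za-z][\w/]*")


def find_unknown_tags(note_text):
    """Return the sorted tags used in a note that are not in the curated list."""
    return sorted(set(TAG_PATTERN.findall(note_text)) - CURATED_TAGS)


note = "# Example dataset\nTags: #host, #real/attackers, #honeypot\n"
print(find_unknown_tags(note))  # prints ['#honeypot']
```

Running a check like this over every note before committing would catch typos and one-off tags early, which is cheaper than updating the entire repository afterwards.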

Still a work in progress...
