Improve XGBoost Model #3

@appleroll

The current XGBoost model has not been tested with any additional hyperparameter settings. It also trains on only 260k samples. That should be enough, but the current training data is contaminated with test samples, so the sample count could be increased if we remove all test samples from the training data.
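To make the contamination point concrete, here is a minimal sketch of dropping training rows whose text also appears in the test set. It assumes each sample is a dict with a `text` field, which may not match the actual column names in the notebook, and the helper names (`normalize`, `fingerprint`, `remove_test_leakage`) are hypothetical. It only catches exact matches after lowercasing and whitespace collapsing, not paraphrases.

```python
import hashlib


def normalize(text: str) -> str:
    # Lowercase and collapse whitespace so trivially different copies match.
    return " ".join(text.lower().split())


def fingerprint(text: str) -> str:
    # Hash the normalized text so the lookup set stays small for large datasets.
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


def remove_test_leakage(train_rows, test_rows, text_key="text"):
    """Return (clean_train_rows, removed_count).

    `text_key` is an assumption about the schema; change it to the real column.
    """
    test_hashes = {fingerprint(row[text_key]) for row in test_rows}
    kept = [row for row in train_rows if fingerprint(row[text_key]) not in test_hashes]
    return kept, len(train_rows) - len(kept)
```

Printing the removed count before retraining would show how many samples need to be replaced to get back to roughly 260k.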

One dataset that could be used is https://huggingface.co/datasets/m4vic/prompt-injection-dataset

This issue is pretty big, so my request is for pull requests that work towards it, no matter how small or how big, whether that means rewriting the Jupyter Notebook so it doesn't use test samples or adding hyperparameter sweeps.
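As a starting point for the hyperparameter sweep, here is a rough sketch of a plain grid search. The keys in `SEARCH_SPACE` are real `XGBClassifier` keyword arguments, but the values are illustrative guesses, and `iter_configs`, `run_sweep`, and the `evaluate` callback are hypothetical names, not anything in the current notebook.

```python
from itertools import product

# Illustrative search space (assumed values, not tuned for this project).
SEARCH_SPACE = {
    "max_depth": [4, 6, 8],
    "learning_rate": [0.05, 0.1, 0.3],
    "n_estimators": [200, 500],
    "subsample": [0.8, 1.0],
}


def iter_configs(space):
    # Yield one dict per combination of values in the search space.
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


def run_sweep(space, evaluate):
    """Score every configuration and return (best_score, best_params, results).

    `evaluate` takes a params dict and returns a score where higher is better.
    """
    results = [(evaluate(params), params) for params in iter_configs(space)]
    best_score, best_params = max(results, key=lambda r: r[0])
    return best_score, best_params, results
```

In practice `evaluate` would train `XGBClassifier(**params)` on a training split with no test samples in it and return a score on a held-out validation split, so the test set is never used for model selection. This grid has 36 configurations; a random or Bayesian search may be a better fit if training on 260k samples is slow.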
