A machine learning project for anomaly detection and attack classification on the NF-CSE-CIC-IDS2018-v2 dataset, which contains over 18 million NetFlow records with 43 flow features. The project builds and evaluates models that distinguish network intrusions from benign traffic and classify them by attack type.
Detect anomalous traffic using unsupervised learning (e.g., OneClassSVM)
Classify traffic as benign or as a specific attack type using supervised models
Apply dimensionality reduction (PCA) for visualization and model optimization
Compare model performance using evaluation metrics (confusion matrix, classification report) and plots
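The unsupervised and dimensionality-reduction goals above can be sketched as follows. This is a minimal illustration, not the project's actual notebook code: the data is a synthetic stand-in for the real flow features, and the variable names and hyperparameters (`nu=0.05`, `gamma="scale"`) are assumptions chosen for the example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Synthetic stand-in for flow features (NOT the real dataset):
# "benign" flows cluster around 0, "attack" flows are shifted far away.
benign = rng.normal(loc=0.0, scale=1.0, size=(500, 10))
attack = rng.normal(loc=6.0, scale=1.0, size=(50, 10))

# Fit the scaler and the detector on benign traffic only, so that
# anything unlike benign traffic is treated as an anomaly.
scaler = StandardScaler().fit(benign)
detector = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05)  # assumed settings
detector.fit(scaler.transform(benign))

# predict() returns +1 for inliers and -1 for outliers.
X_all = np.vstack([benign, attack])
X_all_scaled = scaler.transform(X_all)
pred = detector.predict(X_all_scaled)

# Project the scaled features to 2D for plotting (e.g. a scatter plot
# colored by `pred`).
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_all_scaled)
print("flagged as anomalous:", int((pred == -1).sum()), "of", len(pred))
```

Fitting on benign traffic only is one common way to use `OneClassSVM`; whether the project trains it that way is not stated here.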
Python, Jupyter Notebook
pandas, numpy, matplotlib, seaborn
scikit-learn:
Preprocessing: StandardScaler, PCA
Models: LogisticRegression, RandomForestClassifier, SVC, KNeighborsClassifier, GaussianNB, OneClassSVM
Model tuning: GridSearchCV
Evaluation: confusion_matrix, classification_report
joblib: For model serialization
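As a rough sketch of how the listed components could fit together, the example below chains `StandardScaler`, `PCA`, and a `RandomForestClassifier`, tunes it with `GridSearchCV`, evaluates it, and saves it with joblib. It makes several assumptions that are not taken from the project: the data comes from `make_classification` rather than the real dataset, `Pipeline` and `train_test_split` are used for convenience, and the parameter grid, the `f1_macro` scoring choice, and the file name `ids_model.joblib` are hypothetical.

```python
import os
import tempfile

import joblib
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic 3-class data standing in for benign traffic plus attack types.
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=8, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scaling and PCA live inside the pipeline so they are refit on each
# cross-validation fold and never see the held-out data.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=10)),
    ("clf", RandomForestClassifier(random_state=0)),
])

# Hypothetical grid; the "clf__" prefix targets the classifier step.
param_grid = {"clf__n_estimators": [50, 100], "clf__max_depth": [None, 10]}
search = GridSearchCV(pipeline, param_grid, cv=3, scoring="f1_macro")
search.fit(X_train, y_train)

# Evaluate the best pipeline on the held-out split.
y_pred = search.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
print(cm)
print(classification_report(y_test, y_pred))

# Serialize the fitted pipeline and load it back.
model_path = os.path.join(tempfile.mkdtemp(), "ids_model.joblib")
joblib.dump(search.best_estimator_, model_path)
restored = joblib.load(model_path)
```

Saving the whole pipeline rather than only the classifier keeps the fitted scaler and PCA together with the model, so new flows go through the same preprocessing at prediction time.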