This repository presents a comparative study of three machine learning models—Random Forest, SVM, and J48—implemented to analyze network traffic. The objective is to detect potential intrusions by evaluating which model performs best on the Tuesday-WorkingHours.pcap_ISCX dataset.
The focus of this project is to assess the effectiveness of different machine learning models in identifying intrusions in network traffic. By comparing the Random Forest, SVM, and J48 models, this study aims to provide insights into the most suitable model for network security analysis.
The dataset is the Tuesday-WorkingHours.pcap_ISCX file from the CIC-IDS-2017 collection. It contains labeled network traffic captured over one working day and is intended for training and evaluating intrusion detection models.
Since the dataset is too large to be hosted directly on GitHub, please follow the instructions below to download it:
- Go to https://www.unb.ca/cic/datasets/ids-2017.html and enter your details to request access to the dataset.
- Download CIC-IDS-2017/CSVs/MachineLearningCSV.zip.
- Unzip the archive, which yields 9 CSV files.
- Copy the Tuesday-WorkingHours.pcap_ISCX CSV file into your project directory.
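Once the file is in the project directory, loading it could look roughly like the sketch below. It is not taken from the notebooks: `load_traffic_csv` is a hypothetical helper, and it assumes the class column is named `Label`, that header names may carry stray whitespace, and that some numeric columns may contain infinite or missing values.

```python
import numpy as np
import pandas as pd


def load_traffic_csv(source, label_column="Label"):
    """Load a traffic CSV and return (features, labels).

    `source` may be a file path or an open file-like object.
    The label column name is an assumption; adjust it to your file.
    """
    df = pd.read_csv(source)
    # Header names may carry leading/trailing spaces; normalise them.
    df.columns = df.columns.str.strip()
    # Treat infinite values as missing, then drop incomplete rows.
    df = df.replace([np.inf, -np.inf], np.nan).dropna()
    labels = df[label_column]
    features = df.drop(columns=[label_column])
    return features, labels
```

A call such as `features, labels = load_traffic_csv("Tuesday-WorkingHours.pcap_ISCX.csv")` would then return the inputs for training, assuming the file carries a `.csv` extension. Dropping incomplete rows is the simplest choice here; imputing values is an alternative if too many rows are lost.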
- Random Forest: An ensemble of decision trees known for robust, accurate classification.
- SVM (Support Vector Machine): Effective in high-dimensional feature spaces, which suits traffic data described by many features.
- J48: Weka's implementation of the C4.5 decision tree algorithm, valued for its interpretability.
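A comparison of the three model families can be sketched with scikit-learn as shown below. This is an illustration under stated assumptions, not the notebooks' code: J48 itself belongs to Weka, so an entropy-based `DecisionTreeClassifier` (which is CART-based, not C4.5) stands in for it; the SVM is an RBF `SVC` behind a `StandardScaler`; and synthetic data from `make_classification` replaces the real dataset so the block runs on its own. The helper name `compare_models` and all hyperparameters are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_models(X, y, seed=0):
    """Fit each model on one stratified split; return {name: (accuracy, macro_f1)}."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    models = {
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        # SVMs are sensitive to feature scale, so standardise first.
        "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
        # Stand-in for J48: scikit-learn trees are CART-based, not C4.5.
        "Decision Tree (entropy)": DecisionTreeClassifier(
            criterion="entropy", random_state=seed
        ),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        predicted = model.predict(X_test)
        scores[name] = (
            accuracy_score(y_test, predicted),
            f1_score(y_test, predicted, average="macro"),
        )
    return scores


# Synthetic stand-in data so the sketch runs without the real dataset.
X_demo, y_demo = make_classification(
    n_samples=600, n_features=20, n_informative=8, n_classes=3, random_state=0
)
demo_scores = compare_models(X_demo, y_demo)
for name, (acc, f1) in demo_scores.items():
    print(f"{name}: accuracy={acc:.3f}, macro-F1={f1:.3f}")
```

Macro-averaged F1 is reported alongside accuracy because intrusion datasets tend to be dominated by benign traffic, where accuracy alone can hide poor detection of rare attack classes.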
Ensure you have Python installed on your machine. You will also need Jupyter Notebook to run the .ipynb files. The required Python libraries include:
- scikit-learn
- pandas
- numpy
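To confirm that the libraries listed above are available before opening the notebooks, a quick check along these lines may help. It is a minimal sketch that uses only the standard library (`importlib.metadata`, Python 3.8+); the helper name `installed_versions` is hypothetical.

```python
from importlib import metadata

# Distribution names as they appear in the list above.
REQUIRED = ("scikit-learn", "pandas", "numpy")


def installed_versions(packages=REQUIRED):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```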
Clone the repository to your local machine:
Navigate to the project directory and install the necessary Python packages: pip install -r requirements.txt
Open Jupyter Notebook in your project directory. In each notebook, run "pip install -r requirements.txt" in the top cell before running the rest of the code, and change the CSV file path to match the location of your copy of the dataset. Run the notebooks in the following order to view the implementation and results:
- RandomForest_ML.ipynb
- SVM_ML.ipynb
- J48_ML.ipynb
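Each notebook needs the CSV path to point at your local copy of the dataset. One way to avoid hard-coding an absolute path is sketched below; `resolve_dataset_path` is a hypothetical helper, and the `data` subdirectory in its default search list is an assumption about how you might organise the project, not something the repository prescribes.

```python
from pathlib import Path


def resolve_dataset_path(filename, search_dirs=(".", "data")):
    """Return the first existing path to `filename`, or raise FileNotFoundError."""
    for directory in search_dirs:
        candidate = Path(directory) / filename
        if candidate.is_file():
            return candidate.resolve()
    raise FileNotFoundError(
        f"{filename} not found in: {', '.join(str(d) for d in search_dirs)}"
    )
```

The returned path can be passed directly to `pandas.read_csv`. Failing early with a clear message is usually easier to debug than a read error further down the notebook.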
Contributions to this project are welcome. Please fork the repository and submit a pull request with your improvements.
This project is licensed under the MIT License - see the LICENSE file for details.
- Thanks to anyone whose code was used as inspiration.
- Special thanks to the maintainers of the datasets and tools used in this project.