Cybersecurity Incident Prediction for the Data Science Master's program at Drexel University
ATTENTION: This project was originally run in Google Colab, which may interfere with the display of tables produced by the .show() command.
Also, please note that this project was done for DSCI632 Applied Cloud Computing, and the class required that all work be done in the PySpark library to demonstrate mastery of the cloud computing techniques developed throughout the course. Ordinarily, I would have leaned toward Scikit-Learn for its well-developed model library, XGBoost for efficient, high-performing gradient-boosted tree models, and possibly a neural network model in PyTorch or Keras for deep learning methods.