This repository contains two Jupyter notebooks:
Data Generation: This notebook generates synthetic sample data for testing and experimentation, and demonstrates how to build such data sets with various Python libraries. It can be run on a GCP n1-standard-32 instance.
Data Cleaning and Analysis: This notebook takes the generated data and performs a series of cleaning and analysis tasks. It demonstrates how to use the Spark RAPIDS library to manipulate and analyze data sets.
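
As a rough illustration of the generate-then-clean workflow that the two notebooks follow, here is a minimal sketch using only the Python standard library. It is not the notebooks' code: the function names (`generate_rows`, `clean_rows`, `mean_by_category`), the column names, and the missing-value rate are all hypothetical, and the real notebooks use other Python libraries and Spark RAPIDS rather than plain Python.

```python
import random
import statistics


def generate_rows(n, seed=0, missing_rate=0.1):
    """Create n synthetic records (hypothetical schema: id, category, amount).

    Roughly `missing_rate` of the amounts are set to None to mimic dirty data.
    """
    rng = random.Random(seed)  # seeded so the output is reproducible
    rows = []
    for i in range(n):
        amount = round(rng.uniform(1.0, 500.0), 2)
        if rng.random() < missing_rate:
            amount = None
        rows.append({"id": i, "category": rng.choice(["a", "b", "c"]), "amount": amount})
    return rows


def clean_rows(rows):
    """Drop records whose amount is missing."""
    return [r for r in rows if r["amount"] is not None]


def mean_by_category(rows):
    """Group cleaned records by category and average the amounts."""
    groups = {}
    for r in rows:
        groups.setdefault(r["category"], []).append(r["amount"])
    return {category: statistics.mean(values) for category, values in groups.items()}


rows = generate_rows(1000)
cleaned = clean_rows(rows)
summary = mean_by_category(cleaned)
print(len(rows), len(cleaned), summary)
```

Fixing the seed keeps the synthetic data identical between runs, which makes it easier to compare results when the cleaning or analysis steps change.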