CLOUDY automates the execution of experiments on Google Cloud. It creates VM instances and buckets, installs dependencies, runs Python scripts, and handles resource cleanup.
The workflow of CLOUDY comprises the following steps:
- The script launch.sh prepares a VM instance, according to the options specified in the config.json file.
- The script setup.sh is executed in the VM to install dependencies and run the Python script indicated.
- The output is saved to an existing bucket, or a new one is created as required.
- The instance is automatically deleted once its execution has finished.
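The first step can be pictured as assembling a `gcloud compute instances create` call from the values in config.json. The sketch below is only an illustration of that idea, not the contents of launch.sh: the function name `build_create_command` is hypothetical, the service account value is a placeholder, and passing the setup script as a startup script is an assumption about how the instance is prepared.

```python
import shlex

# Values mirror the sample config.json in this README.
# SERVICE_ACCOUNT is a hypothetical placeholder, not a real account.
config = {
    "INSTANCE_NAME": "vm",
    "SERVICE_ACCOUNT": "sa-name@project-id.iam.gserviceaccount.com",
    "SETUP_SCRIPT": "setup.sh",
    "MACHINE_TYPE": "n2-standard-2",
    "ZONE": "europe-southwest1-b",
    "IMAGE_FAMILY": "ubuntu-2004-lts",
    "IMAGE_PROJECT": "ubuntu-os-cloud",
}


def build_create_command(cfg):
    """Return the argument list for creating the VM described by cfg."""
    return [
        "gcloud", "compute", "instances", "create", cfg["INSTANCE_NAME"],
        f"--zone={cfg['ZONE']}",
        f"--machine-type={cfg['MACHINE_TYPE']}",
        f"--image-family={cfg['IMAGE_FAMILY']}",
        f"--image-project={cfg['IMAGE_PROJECT']}",
        f"--service-account={cfg['SERVICE_ACCOUNT']}",
        # Assumption: the setup script runs as the instance's startup script.
        f"--metadata-from-file=startup-script={cfg['SETUP_SCRIPT']}",
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is created.
    print(shlex.join(build_create_command(config)))
```

Printing the command rather than executing it makes it easy to review what would be created before any resources are billed.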
This project consists of the following scripts:
- launch.sh: creates a VM instance on Google Cloud according to the configuration defined in config.json. It also downloads and copies your repository to the VM instance.
- setup.sh: runs on the VM instance. Installs dependencies, runs your Python script, and saves the results to a Google Cloud bucket, creating it if necessary.
- clean.sh: cleans up all VM instances and buckets on Google Cloud.
- Makefile: enables the execution of the scripts through simple commands.
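To make the role of setup.sh more concrete, the sketch below lists the kind of commands such a script might issue on the VM. It is an assumption-laden outline rather than the actual script: the function `setup_commands`, the `bucket_exists` flag, and the `results` output directory are hypothetical names introduced here for illustration.

```python
import shlex


def setup_commands(cfg, bucket_exists, output_dir="results"):
    """Return the commands a setup step might run, in order.

    output_dir is a hypothetical location for the script's output.
    """
    bucket_url = f"gs://{cfg['BUCKET_NAME']}"
    commands = [
        # Install the space-separated packages listed in DEPENDENCIES.
        ["pip", "install", *cfg["DEPENDENCIES"].split()],
        # Run the user's script with its arguments.
        ["python3", cfg["SCRIPT_PATH"], *shlex.split(cfg["SCRIPT_ARGS"])],
    ]
    if not bucket_exists:
        # Create the bucket only when it is missing.
        commands.append(["gsutil", "mb", "-l", cfg["BUCKET_ZONE"], bucket_url])
    # Copy the output directory to the bucket.
    commands.append(["gsutil", "cp", "-r", output_dir, bucket_url + "/"])
    return commands


if __name__ == "__main__":
    sample = {
        "BUCKET_NAME": "bucket",
        "BUCKET_ZONE": "eu",
        "SCRIPT_PATH": "foo/foo.py",
        "SCRIPT_ARGS": "cloudy",
        "DEPENDENCIES": "numpy pandas",
    }
    for command in setup_commands(sample, bucket_exists=False):
        print(shlex.join(command))
```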
Prerequisites
First, create a service account on GCP with the required permissions for Compute Engine and Cloud Storage (e.g., storage administrator, compute instances administrator).
Then, install the following dependencies:
- Google Cloud SDK: required to interact with Google Cloud from the command line.
- jq: used to read the JSON configuration file.
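A quick way to confirm that both tools are installed is to look for their executables on the PATH. The following is a minimal sketch, assuming the Google Cloud SDK is reachable as `gcloud` and jq as `jq`; the helper name `missing_tools` is hypothetical and not part of CLOUDY.

```python
import shutil

# Executables provided by the two dependencies listed above.
REQUIRED_TOOLS = ["gcloud", "jq"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from the list that are not found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing dependencies:", ", ".join(missing))
    else:
        print("All dependencies found.")
```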
Edit config.json
Define your custom configuration in the config.json file, located in the root directory of the project. For example:

    {
      "INSTANCE_NAME": "vm",
      "BUCKET_NAME": "bucket",
      "REPO_URL": "https://github.com/manjavacas/cloudy.git",
      "SCRIPT_PATH": "foo/foo.py",
      "SCRIPT_ARGS": "cloudy",
      "DEPENDENCIES": "numpy pandas",
      "SERVICE_ACCOUNT": "[email protected]",
      "SETUP_SCRIPT": "setup.sh",
      "MACHINE_TYPE": "n2-standard-2",
      "ZONE": "europe-southwest1-b",
      "IMAGE_FAMILY": "ubuntu-2004-lts",
      "IMAGE_PROJECT": "ubuntu-os-cloud",
      "BUCKET_ZONE": "eu"
    }

The main options to edit are:
- INSTANCE_NAME and BUCKET_NAME: identifiers for the created instance and bucket.
- REPO_URL: the repository to clone. This is where the code you want to execute is located.
- SCRIPT_PATH and SCRIPT_ARGS: path to the Python script you want to execute in the repository, along with its input arguments.
- DEPENDENCIES: dependencies required to run the Python script.
- SERVICE_ACCOUNT: GCP service account to be used. It must have the necessary permissions.
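Before launching, it can help to check that config.json contains every key used in the example. The sketch below does this in Python; it is not part of CLOUDY, the helper names `load_config` and `validate_config` are hypothetical, and treating all thirteen keys as mandatory string values is an assumption based on the sample file.

```python
import json

# Keys taken from the sample config.json in this README.
REQUIRED_KEYS = [
    "INSTANCE_NAME", "BUCKET_NAME", "REPO_URL", "SCRIPT_PATH",
    "SCRIPT_ARGS", "DEPENDENCIES", "SERVICE_ACCOUNT", "SETUP_SCRIPT",
    "MACHINE_TYPE", "ZONE", "IMAGE_FAMILY", "IMAGE_PROJECT", "BUCKET_ZONE",
]


def load_config(path="config.json"):
    """Parse the JSON configuration file into a dict."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def validate_config(cfg):
    """Return a list of human-readable problems; empty if none are found."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in cfg:
            problems.append(f"missing key: {key}")
        elif not isinstance(cfg[key], str):
            problems.append(f"value is not a string: {key}")
    return problems


if __name__ == "__main__":
    for problem in validate_config(load_config()):
        print(problem)
```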
Run CLOUDY
a. Using Makefile
- To launch a VM instance, run:
$ make launch
- To clean up all VM instances and buckets, run:
$ make clean
- To delete VM instances and buckets and then relaunch, run:
$ make reset
b. Using cloudy.py
Alternatively, you can use the Python script cloudy.py for the same operations:
$ python cloudy.py launch
$ python cloudy.py clean
$ python cloudy.py reset
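The three subcommands map naturally onto the scripts described earlier, with reset being a clean followed by a launch. The sketch below shows one way such a dispatcher could be organized; it is an illustration inferred from the command descriptions, not the actual source of cloudy.py, and the names `PLANS`, `parse_command`, and `scripts_for` are hypothetical.

```python
import argparse

# Scripts each subcommand would run, in order (inferred from this README).
PLANS = {
    "launch": ["launch.sh"],
    "clean": ["clean.sh"],
    "reset": ["clean.sh", "launch.sh"],
}


def parse_command(argv):
    """Parse a single positional subcommand: launch, clean, or reset."""
    parser = argparse.ArgumentParser(prog="cloudy.py")
    parser.add_argument("command", choices=sorted(PLANS))
    return parser.parse_args(argv).command


def scripts_for(command):
    """Return a copy of the ordered script list for a subcommand."""
    return list(PLANS[command])


if __name__ == "__main__":
    # Show the plan for one subcommand without executing anything.
    print(scripts_for(parse_command(["reset"])))
```

Keeping the plan in a dictionary means the order of operations for reset is stated in one place and can be checked without touching any cloud resources.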

