In this module we’re going to walk through installing Kubeflow on IBM Bluemix. While this is very similar to installing Kubeflow on other cloud providers, there are some tricks and gotchas specific to Bluemix that we want the reader to be aware of. In this module we will:
- Create a Kubernetes cluster on IBM Bluemix
- Create a bucket in IBM Object Store and enable the S3 interface
- Install Kubeflow
- Train a simple TensorFlow model (MNIST)
- Create a serving layer for the model to predict new handwritten digits
Important: We presume the user has already signed up for an IBM Cloud account, has installed the CLI tools, and has logged in at the command line. Please go to https://console.bluemix.net/docs/cli/index.html#overview for more info on installing the CLI tools.
For this entire tutorial, there is an "easy way," which will get Kubeflow up and running on IBM Cloud with minimal effort and understanding, and a "long way," which explains what the scripts are doing and, hopefully, helps the user understand what the scripts do, how they do it, and why. The user is encouraged to use the easy way for assistance, but to read through and understand the long way as needed.
The easy way is to clone the tutorial repo, log in to IBM Cloud, and run the create-k8s-cluster.sh script. This can be achieved in the following three lines of code (though login is interactive and may be slightly different depending on the security on your account).
ibmcloud login
git clone https://github.com/intro-to-ml-with-kubeflow/ibm-install-guide
cd ibm-install-guide && ./create-k8s-cluster.sh
That will fire off a series of commands that set up a small Kubernetes cluster on IBM Cloud. It takes about ten minutes to set up a cluster, so while we wait, let’s go through the lines of the script and see what it is doing.
The first thing to do when setting up your Kubernetes cluster is to name it. The name can be anything; the default for this tutorial is kubeflow-tutorial. Next you will select a zone in which to create your cluster. You can see a full list of available zones with the command ibmcloud ks zones. The script simply chooses the last zone in the list.
Next we set the version of Kubernetes we want to run. We have chosen 1.10.11, as it is the latest stable version available on IBM Cloud at the time of writing. Next you will need to choose your machine type. To see a full list of available machines, run ibmcloud ks machine-types $ZONE (replacing $ZONE with your zone). For this tutorial, however, we are going to use the smallest, cheapest machines: u2c.2x4.
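The choices above boil down to a handful of shell variables. Here is a minimal sketch of how the script might set them up; since we cannot call the IBM Cloud CLI here, a sample stands in for the output of ibmcloud ks zones, and the one-zone-per-line format is an assumption about that command's output:

```shell
# Configuration the tutorial script settles on (names from the text):
CLUSTER_NAME=kubeflow-tutorial
KUBE_VERSION=1.10.11
MACHINE_TYPE=u2c.2x4

# Stand-in for `ibmcloud ks zones` output (assumed one zone per line);
# the zone names here are placeholders.
sample_zones='ams03
che01
dal10
dal12'

# The script simply chooses the last zone in the list:
ZONE=$(echo "$sample_zones" | tail -n 1)
echo "$ZONE"
```

In a real run you would replace the sample with the live command, e.g. `ZONE=$(ibmcloud ks zones | tail -n 1)`.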
Next you may need to create new VLANs. If you have already set up a Kubernetes cluster in this zone, however, you won’t need to create them. To find out whether VLANs exist in the zone, run ibmcloud ks vlans $ZONE. If that command returns an empty list, then you can create the cluster (and the VLANs automatically) with the following command:
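A sketch of what that cluster-create invocation looks like, built as a dry-run string so you can inspect it before executing. The flag names and the worker count are assumptions based on the IBM Cloud CLI of this era; verify them with `ibmcloud ks cluster-create --help` before running:

```shell
# Assumed configuration (zone is a placeholder; substitute your own):
CLUSTER_NAME=kubeflow-tutorial
ZONE=dal10
MACHINE_TYPE=u2c.2x4
KUBE_VERSION=1.10.11

# Build the command as a string first so it can be reviewed. Flag names
# and --workers 2 are assumptions; check the CLI help for your version.
CREATE_CMD="ibmcloud ks cluster-create \
  --name $CLUSTER_NAME --zone $ZONE \
  --machine-type $MACHINE_TYPE \
  --kube-version $KUBE_VERSION --workers 2"

echo "$CREATE_CMD"   # inspect, then run with: eval "$CREATE_CMD"
```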
Warning: If you already have VLANs, the previous command will throw an error; please use the following command instead.
If you already have VLANs, you will need to explicitly set the public and private VLANs by their IDs. The shell script looks at the output of ibmcloud ks vlans $ZONE and parses out the IDs of the first public and first private VLAN. You could just as easily do this by hand, writing down the ID of a VLAN you wish to explicitly use.
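The parsing step can be sketched as follows. Since we cannot call the CLI here, a sample stands in for the output of ibmcloud ks vlans $ZONE, and its column layout (ID first, type elsewhere on the line) is an assumption:

```shell
# Stand-in for `ibmcloud ks vlans $ZONE` output; IDs and router names
# are placeholders, and the column layout is an assumption.
sample_vlans='ID        Number   Type      Router
2234945   1425     private   bcr01a.dal10
2234947   1426     public    fcr01a.dal10'

# Take the ID (first column) of the first private and first public VLAN,
# as the tutorial script does:
PRIVATE_VLAN=$(echo "$sample_vlans" | awk '/private/ {print $1; exit}')
PUBLIC_VLAN=$(echo "$sample_vlans" | awk '/public/ {print $1; exit}')
echo "$PRIVATE_VLAN $PUBLIC_VLAN"
```

Those IDs are then passed to the cluster-create command, via flags along the lines of --private-vlan and --public-vlan (again, flag names are assumptions; consult the CLI help).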
To create an S3 bucket using the Bluemix web GUI, first log in to IBM Cloud. You will initially be brought to the Dashboard. In this section, we are going to provision a Cloud Object Storage instance and set up a bucket with the S3 interface. If you already have a Cloud Object Storage instance you wish to use, please skip ahead to the bucket-creation step.
When you log in you should be at the Dashboard. The first step is to click on the Catalog tab at the top of the screen.
In the next screen, search for 'Object Storage'. You should see the Object Storage service come up in the results; click on it.
In the next screen, feel free to name your service something fun or to use the name generated by the system. Click 'Create' (in the bottom left corner) when you’re done.
Warning: You can only have one instance of a Lite plan per service. If you already have a Cloud Object Storage Lite instance, you’ll have to use that one.
After you have created the Cloud Object Storage instance, you can create buckets within it. On the instance’s page, click on the 'Create Bucket' button off to the right.
On the next page, let’s create our bucket. For the purposes of this tutorial, we’re going to call the bucket mnist-bucket-tutorial.
Important: Set the resiliency to "Cross Region" and the location to "us-geo". These are not the default settings.
Finally scroll down and click "Create Bucket".
Note the tabs on the left, and click on "Service Credentials". You will see a blue box on the right that says "New credential". Click on that box.
A dialog will pop up. You can leave everything at the defaults; however, you must add {"HMAC":true} in the Add Inline Configuration Parameters box. Adding this parameter causes the service credential to include the HMAC keys you will later export as AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.
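The resulting credential JSON will contain a cos_hmac_keys section. A sketch of its shape, with placeholder values and the unrelated keys elided:

```json
{
  "apikey": "<elided>",
  "cos_hmac_keys": {
    "access_key_id": "<access_key_id>",
    "secret_access_key": "<secret_access_key>"
  }
}
```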
After you’ve created the credentials, you should see them populate the list on the Service credentials page. Click on View credentials to expose a JSON of the credentials. Under a key called cos_hmac_keys you will see access_key_id and secret_access_key. In the directory where you cloned the tutorial repo, create a file called set-aws-creds.sh and fill it out as follows:
#!/usr/bin/env bash
export AWS_ACCESS_KEY_ID=<access_key_id>
export AWS_SECRET_ACCESS_KEY=<secret_access_key>
where, obviously, you replace <access_key_id> and <secret_access_key> with the values from the JSON. Finally, at the command prompt, make the bash script executable with
chmod +x ./set-aws-creds.sh
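One subtlety: executing ./set-aws-creds.sh runs it in a subshell, so the exported variables would vanish when the script exits; you need to source it for them to land in your current shell. A small demonstration with placeholder values, written to a stand-in file name so your real set-aws-creds.sh is not clobbered:

```shell
# Write a stand-in credentials script with placeholder values (use your
# real HMAC keys in your actual set-aws-creds.sh).
cat > set-aws-creds-example.sh <<'EOF'
#!/usr/bin/env bash
export AWS_ACCESS_KEY_ID=example-access-key
export AWS_SECRET_ACCESS_KEY=example-secret-key
EOF
chmod +x ./set-aws-creds-example.sh

# Source the script (`.` is the POSIX spelling of bash's `source`) so
# the exports take effect in the current shell:
. ./set-aws-creds-example.sh
echo "$AWS_ACCESS_KEY_ID"
```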
That’s it! You’ve created an S3 bucket on IBMCloud!
Once the cluster is set up (you can check progress in the IBM Cloud web GUI), you have created a bucket, and you have entered your credentials into set-aws-creds.sh, you are ready to download the MNIST example, install Kubeflow, and submit the MNIST job for training.
The "easy way" to do everything mentioned above is to run the script once-cluster-is-up.sh. This will take care of everything, and even train and serve the model!