We will create a cluster called determined-seldon-cluster. It will have 2 node pools:
- A node pool with a single non-accelerated node, to host Determined's master, Pachyderm, and Seldon
- A GPU-accelerated node pool with autoscaling capabilities, where each node has 4 vCPUs, 15 GB of memory, and 4 NVIDIA K80 GPUs.
The cluster can be created by running the provided create-cluster.sh script: you only need to change the project's name (which is determined-ai) at the beginning, and possibly some other defaults. This script will:
- Create the cluster with the default node pool
- Create the second, GPU-accelerated node pool for the cluster
- Enable GPU acceleration on the cluster (it simply runs the following command: kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/container-engine-accelerators/master/nvidia-driver-installer/cos/daemonset-preloaded.yaml)
- Create the bucket to store Determined's checkpoints
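The script's contents are not reproduced here, but based on the steps above it presumably boils down to something like the following sketch. The zone, machine types, node counts, and checkpoint bucket name are assumptions for illustration; check the actual create-cluster.sh for the real values:

```shell
# Hypothetical sketch of create-cluster.sh; names, zone, and sizes are placeholders.
PROJECT="determined-ai"
CLUSTER="determined-seldon-cluster"
ZONE="us-central1-c"

# 1. Cluster with the default (non-accelerated) node pool
gcloud container clusters create "$CLUSTER" --project "$PROJECT" --zone "$ZONE" \
    --num-nodes 1 --machine-type n1-standard-8

# 2. Autoscaling GPU node pool (4 vCPUs, 15 GB RAM, 4 K80s per node)
gcloud container node-pools create gpu-pool --cluster "$CLUSTER" \
    --project "$PROJECT" --zone "$ZONE" \
    --machine-type n1-standard-4 --accelerator type=nvidia-tesla-k80,count=4 \
    --enable-autoscaling --min-nodes 0 --max-nodes 4 --num-nodes 0

# 3. NVIDIA driver installer DaemonSet
kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/container-engine-accelerators/master/nvidia-driver-installer/cos/daemonset-preloaded.yaml

# 4. Bucket for Determined's checkpoints (bucket name is an assumption)
gsutil mb -l us-central1 gs://determined-checkpoints
```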
After we have our cluster up and running, we have to open an extra port in the GCP firewall to allow Seldon to connect to Istio (as documented here). We first need to figure out the name of the firewall rule that was created for us, which can be done with the gcloud command. Example:
gcloud compute firewall-rules list --filter="name~gke-determined-seldon-cluster-[0-9a-z]*-master"
NAME NETWORK DIRECTION PRIORITY ALLOW DENY DISABLED
gke-determined-seldon-cluster-e3e95d04-master primary INGRESS 1000 tcp:10250,tcp:443 False
After we get the firewall rule, we simply update it this way:
gcloud compute firewall-rules update gke-determined-seldon-cluster-e3e95d04-master --allow tcp:10250,tcp:443,tcp:15017
Updated [https://www.googleapis.com/compute/v1/projects/determined-ai/global/firewalls/gke-determined-seldon-cluster-e3e95d04-master].
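If you prefer not to copy the rule name and allow list by hand, the lookup and update can be scripted. Below, the listing is hard-coded to the sample output shown above so the parsing can be seen in isolation; the final gcloud call is left commented out since it mutates your project:

```shell
# Sample output of `gcloud compute firewall-rules list --filter=...`;
# on a live project, capture the command's real output instead.
listing='NAME                                           NETWORK  DIRECTION  PRIORITY  ALLOW              DENY  DISABLED
gke-determined-seldon-cluster-e3e95d04-master  primary  INGRESS    1000      tcp:10250,tcp:443  False'

# Rule name is column 1, current allow list is column 5 of the data row
rule=$(echo "$listing" | awk 'NR==2 {print $1}')
allow=$(echo "$listing" | awk 'NR==2 {print $5}')

# Append Istio's webhook port 15017 to the existing allow list
new_allow="$allow,tcp:15017"
echo "$rule -> $new_allow"
# gcloud compute firewall-rules update "$rule" --allow "$new_allow"
```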
The checkpoint bucket was created by the script above, so we only need to create the buckets that store Pachyderm's repositories and the data generated by Seldon's detectors (drift & outlier). They can be created with the following commands:
gsutil mb -l us-central1 gs://determined-pachyderm-data
gsutil mb -l us-central1 gs://determined-seldon-detector
The first step is to add Pachyderm's repository to Helm:
helm repo add pach https://helm.pachyderm.com
helm repo update
Next, a CloudSQL instance for PostgreSQL must be created. If you want to use the provided pachyderm-values.yaml, you have to create a pachyderm database with a pachyderm user whose password is postgres.123. In addition, a Cloud DNS zone named determined must be created, with a pachyderm-db entry pointing to the CloudSQL instance's IP address. You may look at the provided pachyderm-values.yaml for the details.
The next step is to install Pachyderm, using the provided pachyderm-values.yaml file:
helm install pachyderm -f pachyderm-values.yaml pach/pachyderm --version 2.1.3
At this point Pachyderm is installed and we only need to link it to our pachctl command. First, we need to get Pachyderm's public address:
kubectl get services | grep pachd-lb | awk '{print $4}'
34.132.165.26
Then, the printed IP address (34.132.165.26 in this case) must be used in the commands below:
echo '{"pachd_address": "grpc://34.132.165.26:30650"}' | pachctl config set context "determined-seldon-context" --overwrite
pachctl config set active-context "determined-seldon-context"
pachctl version
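The IP lookup and context setup above can be combined into one small script. Here, the kubectl call is replaced by its sample output from above so the JSON construction can be verified in isolation; the pachctl calls that change your configuration are left commented out:

```shell
# Sample `kubectl get services | grep pachd-lb` line; on a live cluster use:
#   ip=$(kubectl get services | grep pachd-lb | awk '{print $4}')
line='pachd-lb   LoadBalancer   10.112.5.10   34.132.165.26   30650:31400/TCP   5m'
ip=$(echo "$line" | awk '{print $4}')

# Build the context JSON that pachctl expects (pachd listens on port 30650)
config=$(printf '{"pachd_address": "grpc://%s:30650"}' "$ip")
echo "$config"
# echo "$config" | pachctl config set context "determined-seldon-context" --overwrite
# pachctl config set active-context "determined-seldon-context"
```

Note that the CLUSTER-IP and ports in the sample line are illustrative; only the EXTERNAL-IP column (field 4) matters here.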
More installation details for Pachyderm can be found here.
The first step would be to download the latest Helm chart and unzip it to a folder. This has already been done for you: you can find the chart inside the determined-chart subfolder (the version is 0.18.1, and it has also been customized a bit for this example). Determined can be installed by issuing this command:
helm install determined determined-chart
You may also want to issue the following command to get the master's public IP:
kubectl get service determined-master-service-determined
Save this IP address to the DET_MASTER variable (this variable will be used by the det command). For example, if the IP address is 35.223.115.12 you have to issue the following command:
export DET_MASTER="35.223.115.12"
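The extraction can also be scripted instead of copying the IP by hand. The service listing below is a sample (CLUSTER-IP and ports are illustrative); on a live cluster, a more robust alternative is `kubectl get service determined-master-service-determined -o jsonpath='{.status.loadBalancer.ingress[0].ip}'`:

```shell
# Sample output of `kubectl get service determined-master-service-determined`
svc='NAME                                   TYPE           CLUSTER-IP    EXTERNAL-IP     PORT(S)          AGE
determined-master-service-determined   LoadBalancer   10.112.3.21   35.223.115.12   8080:30348/TCP   2m'

# EXTERNAL-IP is column 4 of the data row
export DET_MASTER=$(echo "$svc" | awk 'NR==2 {print $4}')
echo "$DET_MASTER"
```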
Just for completeness, the latest Helm chart can be downloaded from here:
https://docs.determined.ai/latest/_downloads/389266101877e29ab82805a88a6fc4a6/determined-latest.tgz
More installation details on Determined AI on Kubernetes can be found here.
Seldon Deploy can be installed using the instructions at the following link: https://deploy.seldon.io/en/v1.5/contents/getting-started/trial-installation/index.html#non-local. We are using a non-local installation, and hence some packages are required before starting the process. These packages can be installed with the following commands:
sudo apt install python3-venv
sudo apt install rustc
pip install wheel
pip install bcrypt
If Seldon Deploy is correctly installed, running the kubectl get pods -n seldon-system command should provide this output:
$ kubectl get pods -n seldon-system
NAME READY STATUS RESTARTS AGE
keycloak-0 1/1 Running 0 31h
seldon-controller-manager-7f78464f7f-jk86c 1/1 Running 0 37h
seldon-core-analytics-kube-state-metrics-94bb6cb9-m68qj 1/1 Running 0 37h
seldon-core-analytics-prometheus-alertmanager-6d9f85b55d-bhc86 2/2 Running 0 37h
seldon-core-analytics-prometheus-node-exporter-m6rm5 1/1 Running 0 37h
seldon-core-analytics-prometheus-pushgateway-8476474cff-jspkq 1/1 Running 0 37h
seldon-core-analytics-prometheus-seldon-55c65f8f48-fbksk 2/2 Running 0 37h
seldon-deploy-9797bc555-s7vtj 1/1 Running 0 31h
We also need to get the Istio service address, as it will be used to access Seldon's API. We can use the usual kubectl command for this:
$ kubectl get svc -n istio-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
istio-ingressgateway LoadBalancer 10.112.10.167 35.188.211.134 15021:32641/TCP,80:32490/TCP,443:32719/TCP 76d
istiod ClusterIP 10.112.4.57 <none> 15010/TCP,15012/TCP,443/TCP,15014/TCP 76d
knative-local-gateway ClusterIP 10.112.9.246 <none> 80/TCP 50d
Here, we need the external IP of the istio-ingressgateway service, which is 35.188.211.134.
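Rather than reading the IP off the table, it can be picked out with awk. The listing below is the sample output from above; on a live cluster, pipe the real `kubectl get svc -n istio-system` output, or use `kubectl get svc istio-ingressgateway -n istio-system -o jsonpath='{.status.loadBalancer.ingress[0].ip}'`:

```shell
# Sample `kubectl get svc -n istio-system` output from above
svc='NAME                    TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)                                      AGE
istio-ingressgateway    LoadBalancer   10.112.10.167   35.188.211.134   15021:32641/TCP,80:32490/TCP,443:32719/TCP   76d
istiod                  ClusterIP      10.112.4.57     <none>           15010/TCP,15012/TCP,443/TCP,15014/TCP        76d'

# Select the ingress gateway row by name; EXTERNAL-IP is column 4
SELDON_HOST=$(echo "$svc" | awk '$1 == "istio-ingressgateway" {print $4}')
echo "$SELDON_HOST"
```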
There are a couple of secrets that need to be created to store sensitive information. The first will be used by Pachyderm and hence will be created in the default namespace (where Pachyderm is installed). The second will be used by Seldon (actually by the serving image) and hence will be created in the seldon namespace.
The secret can be easily created using the provided pachyderm-seldon/pipeline-secret.yaml file. Here is its content:
apiVersion: v1
kind: Secret
metadata:
  name: pipeline-secret
stringData:
  det_master: determined-master-service-determined.default:8080
  det_user: determined
  det_password: dai
  pac_token: PACHYDERM_TOKEN
  sel_url: https://SELDON_HOST
  sel_secret: sd-api-secret
  sel_namespace: seldon
As you can see, you just need to replace a few placeholders with the proper values:
- PACHYDERM_TOKEN: we have to generate a token with the pachctl command and put it here (the process is described below)
- SELDON_HOST: this should be replaced with the external IP address of the Istio service, described above
Now, with Pachyderm Enterprise, we have to generate a token and provide read access to our repositories. The token is generated with the following command (we will call the associated user seldon):
pachctl auth get-robot-token seldon
We will get an output like the following one:
Token: 3cb22a223d0d4b9c90cb88b4fc2a48bb
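Filling in the two placeholders can be scripted with sed. Below, the pachctl output is simulated with the sample token from above (on a live system, capture `pachctl auth get-robot-token seldon` instead), and the substitution is demonstrated on just the two relevant lines; in practice you would run sed over the whole secret manifest:

```shell
# Simulated token capture; with a live Pachyderm it would be:
#   token=$(pachctl auth get-robot-token seldon | awk '{print $2}')
token=$(echo 'Token: 3cb22a223d0d4b9c90cb88b4fc2a48bb' | awk '{print $2}')
seldon_host='35.188.211.134'   # Istio ingress external IP found earlier

# Substitute both placeholders (in practice: sed ... pipeline-secret.yaml)
out=$(printf 'pac_token: PACHYDERM_TOKEN\nsel_url: https://SELDON_HOST\n' \
  | sed -e "s/PACHYDERM_TOKEN/$token/" -e "s|SELDON_HOST|$seldon_host|")
echo "$out"
```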
Finally, the secret can be created with the usual kubectl command:
$ kubectl apply -f pipeline-secret.yaml
secret/pipeline-secret created
Generally speaking, a Kubernetes cluster should have minimal permissions on its surrounding environment (such as buckets), and applications should use service accounts to access resources. The serving image, run by Seldon, needs to access Determined's bucket in order to download a checkpoint; if we create a service account for that, we can even run predictions locally, from our development environment. The account may have a generic name, but it must have the "Storage Object Viewer" role. During creation, generate and download the JSON key file, and name it service-account.json. We can then generate the needed secret this way:
kubectl create secret generic deployment-secret -n seldon --from-file=service-account.json
If you want to run predictions locally, you have to put this file into container/serve/config. Then you can see the predictions by running the container/serve/predict_local.py script.
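Once a model is deployed, a quick smoke test can also be run from outside the cluster: Seldon Core deployments expose a REST endpoint through the Istio gateway at a well-known path. The deployment name and payload shape below are assumptions (adjust them to your actual SeldonDeployment), and the curl call is left commented out since it requires a reachable cluster:

```shell
ISTIO_IP='35.188.211.134'   # Istio ingress external IP found earlier
NAMESPACE='seldon'
DEPLOYMENT='my-model'       # hypothetical SeldonDeployment name

# Seldon Core REST prediction endpoint format:
#   http://<ingress>/seldon/<namespace>/<deployment>/api/v1.0/predictions
URL="http://$ISTIO_IP/seldon/$NAMESPACE/$DEPLOYMENT/api/v1.0/predictions"
PAYLOAD='{"data": {"ndarray": [[1.0, 2.0, 3.0]]}}'
echo "$URL"
# curl -s -X POST -H 'Content-Type: application/json' -d "$PAYLOAD" "$URL"
```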