Pipeline example with OpenVINO inference execution engine

This notebook illustrates how you can serve an ensemble of models using the OpenVINO prediction model. The demo includes ResNet50 and DenseNet169 models optimized with the OpenVINO model optimizer. The precision of their graph operations has been reduced from FP32 to INT8, which significantly improves execution performance with minimal impact on accuracy. The gain is particularly visible on the latest Cascade Lake CPUs with the VNNI extension.
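To give a feel for why reduced precision costs little accuracy, the sketch below applies a simple symmetric linear quantization to a handful of made-up FP32 values and measures the round-trip error. It only illustrates the general idea; it is not the calibration procedure used by the OpenVINO tools, and the function names and sample values are hypothetical.

```python
def quantize_int8(values):
    """Map floats to int8 codes using one symmetric scale (illustrative only)."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs else 1.0
    # Round to the nearest code and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Convert int8 codes back to approximate float values."""
    return [c * scale for c in codes]


weights = [0.8213, -0.4177, 0.0531, -1.2700, 0.3369]  # made-up values
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(codes)
print(f"scale: {scale:.5f}, max round-trip error: {max_error:.5f}")
```

With this scheme the round-trip error of each value is bounded by half the scale, which is why the accuracy loss tends to stay small when the value range is well calibrated.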

pipeline

Install Seldon Core on Minikube or on any Kubernetes cluster

The Minikube example below assumes that version 0.30.0 is installed.

It also assumes that:

  • You have 4G of memory available
  • You have 4 CPU cores available
  • You have 20G of free disk space

If you already have a Kubernetes cluster, you can skip the Minikube setup steps.

[25]:
!minikube start --memory 4096 --cpus 4 --disk-size 20g
Starting local Kubernetes v1.13.2 cluster...
Starting VM...
Getting VM IP address...
Moving files into cluster...
Setting up certs...
Connecting to cluster...
Setting up kubeconfig...
Stopping extra container runtimes...
Starting cluster components...
Verifying kubelet health ...
Verifying apiserver health ...
Kubectl is now configured to use the cluster.
Loading cached images from config file.


Everything looks great. Please enjoy minikube!
[2]:
!kubectl create namespace seldon
namespace/seldon created
[1]:
!kubectl config set-context $(kubectl config current-context) --namespace=seldon
Context "minikube" modified.
[4]:
!kubectl create clusterrolebinding kube-system-cluster-admin --clusterrole=cluster-admin --serviceaccount=kube-system:default
clusterrolebinding.rbac.authorization.k8s.io/kube-system-cluster-admin created
[5]:
!helm init
$HELM_HOME has been configured at /home/clive/.helm.

Tiller (the Helm server-side component) has been installed into your Kubernetes Cluster.

Please note: by default, Tiller is deployed with an insecure 'allow unauthenticated users' policy.
To prevent this, run `helm init` with the --tiller-tls-verify flag.
For more information on securing your installation see: https://docs.helm.sh/using_helm/#securing-your-helm-installation
Happy Helming!
[6]:
!kubectl rollout status deploy/tiller-deploy -n kube-system
Waiting for deployment "tiller-deploy" rollout to finish: 0 of 1 updated replicas are available...
deployment "tiller-deploy" successfully rolled out
[1]:
!helm install ../../../helm-charts/seldon-core-operator --name seldon-core --set usageMetrics.enabled=true --namespace seldon-system
NAME:   seldon-core
LAST DEPLOYED: Wed Apr 24 15:19:35 2019
NAMESPACE: seldon-system
STATUS: DEPLOYED

RESOURCES:
==> v1beta1/CustomResourceDefinition
NAME                                         AGE
seldondeployments.machinelearning.seldon.io  1s

==> v1/ClusterRole
seldon-operator-manager-role  1s

==> v1/ClusterRoleBinding
NAME                                 AGE
seldon-operator-manager-rolebinding  1s

==> v1/Service
NAME                                        TYPE       CLUSTER-IP      EXTERNAL-IP  PORT(S)  AGE
seldon-operator-controller-manager-service  ClusterIP  10.107.189.217  <none>       443/TCP  1s

==> v1/StatefulSet
NAME                                DESIRED  CURRENT  AGE
seldon-operator-controller-manager  1        1        1s

==> v1/Pod(related)
NAME                                  READY  STATUS             RESTARTS  AGE
seldon-operator-controller-manager-0  0/1    ContainerCreating  0         1s

==> v1/Secret
NAME                                   TYPE    DATA  AGE
seldon-operator-webhook-server-secret  Opaque  0     1s


NOTES:
NOTES: TODO


[2]:
!kubectl rollout status statefulset.apps/seldon-operator-controller-manager -n seldon-system
partitioned roll out complete: 1 new pods have been updated...

Setup Ingress

There are gRPC issues with the latest Ambassador, so we recommend 0.40.2 until these are fixed.

[4]:
!helm install stable/ambassador --name ambassador --set crds.keep=false
NAME:   ambassador
LAST DEPLOYED: Wed Apr 24 15:20:31 2019
NAMESPACE: seldon
STATUS: DEPLOYED

RESOURCES:
==> v1/Service
NAME               TYPE          CLUSTER-IP     EXTERNAL-IP  PORT(S)                     AGE
ambassador-admins  ClusterIP     10.110.65.117  <none>       8877/TCP                    0s
ambassador         LoadBalancer  10.96.204.128  <pending>    80:32059/TCP,443:31261/TCP  0s

==> v1/Deployment
NAME        DESIRED  CURRENT  UP-TO-DATE  AVAILABLE  AGE
ambassador  3        3        3           0          0s

==> v1/Pod(related)
NAME                         READY  STATUS             RESTARTS  AGE
ambassador-5b89d44544-46tmd  0/1    ContainerCreating  0         0s
ambassador-5b89d44544-dlvhg  0/1    ContainerCreating  0         0s
ambassador-5b89d44544-nhzj2  0/1    ContainerCreating  0         0s

==> v1/ServiceAccount
NAME        SECRETS  AGE
ambassador  1        0s

==> v1beta1/ClusterRole
NAME        AGE
ambassador  0s

==> v1beta1/ClusterRoleBinding
NAME        AGE
ambassador  0s


NOTES:
Congratuations! You've successfully installed Ambassador.

For help, visit our Slack at https://d6e.co/slack or view the documentation online at https://www.getambassador.io.

To get the IP address of Ambassador, run the following commands:
NOTE: It may take a few minutes for the LoadBalancer IP to be available.
     You can watch the status of by running 'kubectl get svc -w  --namespace seldon ambassador'

  On GKE/Azure:
  export SERVICE_IP=$(kubectl get svc --namespace seldon ambassador -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

  On AWS:
  export SERVICE_IP=$(kubectl get svc --namespace seldon ambassador -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')

  echo http://$SERVICE_IP:

[5]:
!kubectl rollout status deployment.apps/ambassador
Waiting for deployment "ambassador" rollout to finish: 0 of 3 updated replicas are available...
Waiting for deployment "ambassador" rollout to finish: 1 of 3 updated replicas are available...
Waiting for deployment "ambassador" rollout to finish: 2 of 3 updated replicas are available...
deployment "ambassador" successfully rolled out

(Optional) Install Jaeger

We will use the Jaeger All-in-1 resource found at the Jaeger Kubernetes repo.

[6]:
!kubectl create -f https://raw.githubusercontent.com/jaegertracing/jaeger-kubernetes/master/all-in-one/jaeger-all-in-one-template.yml -n seldon
deployment.extensions/jaeger created
service/jaeger-query created
service/jaeger-collector created
service/jaeger-agent created
service/zipkin created

Start Jaeger UI

minikube service jaeger-query -n seldon

(Optional) Build Model, Combiner and Transformer Images

This step is optional. You can skip building the Docker images for the pipeline components and rely on the prebuilt versions in the public Docker registry.

The commands below build the components using the Docker daemon inside Minikube.

Alternatively, you can change the REGISTRY variable to your private registry and drop the eval $(minikube docker-env) && prefix. In that case, after the images are built, you need to push them to your Docker registry and update the image names in the pipeline JSON file.

[4]:
%env REGISTRY=docker.io/seldonio
env: REGISTRY=docker.io/seldonio
[18]:
!eval $(minikube docker-env) && cd resources/model && s2i build -E environment_grpc . ${REGISTRY}/seldon-core-s2i-openvino:0.1 ${REGISTRY}/seldon-openvino-prediction:0.1
---> Installing application source...
---> Installing dependencies ...
Looking in links: /whl
Collecting google-cloud-storage==1.13.0 (from -r requirements.txt (line 1))
  Url '/whl' is ignored. It is either a non-existing path or lacks a specific scheme.
Downloading https://files.pythonhosted.org/packages/d7/62/a2e3111bf4d1eb54fe86dec694418644e024eb059bf1e66ebdcf9f98ad70/google_cloud_storage-1.13.0-py2.py3-none-any.whl (59kB)
Collecting boto3==1.9.34 (from -r requirements.txt (line 2))
  Url '/whl' is ignored. It is either a non-existing path or lacks a specific scheme.
Downloading https://files.pythonhosted.org/packages/94/04/c48c102e11b0cb2e3d4a7bdda49647b40e2ae03279ce9ba935e4ae66ab89/boto3-1.9.34-py2.py3-none-any.whl (128kB)
  Url '/whl' is ignored. It is either a non-existing path or lacks a specific scheme.
Collecting google-cloud-core<0.29dev,>=0.28.0 (from google-cloud-storage==1.13.0->-r requirements.txt (line 1))
Downloading https://files.pythonhosted.org/packages/0f/41/ae2418b4003a14cf21c1c46d61d1b044bf02cf0f8f91598af572b9216515/google_cloud_core-0.28.1-py2.py3-none-any.whl
Collecting google-api-core<2.0.0dev,>=0.1.1 (from google-cloud-storage==1.13.0->-r requirements.txt (line 1))
  Url '/whl' is ignored. It is either a non-existing path or lacks a specific scheme.
Downloading https://files.pythonhosted.org/packages/8b/01/13758ff9b970008ccf9e0dcc3b86d0e01937d7485b9a2c6142c9c2bdb4da/google_api_core-1.7.0-py2.py3-none-any.whl (64kB)
Collecting google-resumable-media>=0.3.1 (from google-cloud-storage==1.13.0->-r requirements.txt (line 1))
  Url '/whl' is ignored. It is either a non-existing path or lacks a specific scheme.
Downloading https://files.pythonhosted.org/packages/e2/5d/4bc5c28c252a62efe69ed1a1561da92bd5af8eca0cdcdf8e60354fae9b29/google_resumable_media-0.3.2-py2.py3-none-any.whl
Collecting botocore<1.13.0,>=1.12.34 (from boto3==1.9.34->-r requirements.txt (line 2))
  Url '/whl' is ignored. It is either a non-existing path or lacks a specific scheme.
Downloading https://files.pythonhosted.org/packages/c8/6c/2058039815eb4eac4f2f7462ecae3e352e994d6618ba1f27114d9b985618/botocore-1.12.89-py2.py3-none-any.whl (5.2MB)
Collecting jmespath<1.0.0,>=0.7.1 (from boto3==1.9.34->-r requirements.txt (line 2))
  Url '/whl' is ignored. It is either a non-existing path or lacks a specific scheme.
Downloading https://files.pythonhosted.org/packages/b7/31/05c8d001f7f87f0f07289a5fc0fc3832e9a57f2dbd4d3b0fee70e0d51365/jmespath-0.9.3-py2.py3-none-any.whl
Collecting s3transfer<0.2.0,>=0.1.10 (from boto3==1.9.34->-r requirements.txt (line 2))
  Url '/whl' is ignored. It is either a non-existing path or lacks a specific scheme.
Downloading https://files.pythonhosted.org/packages/d7/14/2a0004d487464d120c9fb85313a75cd3d71a7506955be458eebfe19a6b1d/s3transfer-0.1.13-py2.py3-none-any.whl (59kB)
Requirement already satisfied: setuptools>=34.0.0 in /opt/conda/lib/python3.6/site-packages (from google-api-core<2.0.0dev,>=0.1.1->google-cloud-storage==1.13.0->-r requirements.txt (line 1)) (40.8.0)
Requirement already satisfied: requests<3.0.0dev,>=2.18.0 in /opt/conda/lib/python3.6/site-packages (from google-api-core<2.0.0dev,>=0.1.1->google-cloud-storage==1.13.0->-r requirements.txt (line 1)) (2.18.4)
Requirement already satisfied: protobuf>=3.4.0 in /opt/conda/lib/python3.6/site-packages (from google-api-core<2.0.0dev,>=0.1.1->google-cloud-storage==1.13.0->-r requirements.txt (line 1)) (3.6.1)
Collecting googleapis-common-protos!=1.5.4,<2.0dev,>=1.5.3 (from google-api-core<2.0.0dev,>=0.1.1->google-cloud-storage==1.13.0->-r requirements.txt (line 1))
  Url '/whl' is ignored. It is either a non-existing path or lacks a specific scheme.
Downloading https://files.pythonhosted.org/packages/61/29/1549f61917eadd11650e42b78b4afcfe9cb467157af4510ab8cb59535f14/googleapis-common-protos-1.5.6.tar.gz
Requirement already satisfied: six>=1.10.0 in /opt/conda/lib/python3.6/site-packages (from google-api-core<2.0.0dev,>=0.1.1->google-cloud-storage==1.13.0->-r requirements.txt (line 1)) (1.11.0)
Collecting pytz (from google-api-core<2.0.0dev,>=0.1.1->google-cloud-storage==1.13.0->-r requirements.txt (line 1))
  Url '/whl' is ignored. It is either a non-existing path or lacks a specific scheme.
Downloading https://files.pythonhosted.org/packages/61/28/1d3920e4d1d50b19bc5d24398a7cd85cc7b9a75a490570d5a30c57622d34/pytz-2018.9-py2.py3-none-any.whl (510kB)
Collecting google-auth<2.0dev,>=0.4.0 (from google-api-core<2.0.0dev,>=0.1.1->google-cloud-storage==1.13.0->-r requirements.txt (line 1))
  Url '/whl' is ignored. It is either a non-existing path or lacks a specific scheme.
Downloading https://files.pythonhosted.org/packages/4e/85/71b2dfbf5b4241cd031cc333ed71f90a271074a97cb2c517bb65f07a1a90/google_auth-1.6.2-py2.py3-none-any.whl (73kB)
Collecting docutils>=0.10 (from botocore<1.13.0,>=1.12.34->boto3==1.9.34->-r requirements.txt (line 2))
  Url '/whl' is ignored. It is either a non-existing path or lacks a specific scheme.
Downloading https://files.pythonhosted.org/packages/36/fa/08e9e6e0e3cbd1d362c3bbee8d01d0aedb2155c4ac112b19ef3cae8eed8d/docutils-0.14-py3-none-any.whl (543kB)
Collecting python-dateutil<3.0.0,>=2.1; python_version >= "2.7" (from botocore<1.13.0,>=1.12.34->boto3==1.9.34->-r requirements.txt (line 2))
  Url '/whl' is ignored. It is either a non-existing path or lacks a specific scheme.
Downloading https://files.pythonhosted.org/packages/41/17/c62faccbfbd163c7f57f3844689e3a78bae1f403648a6afb1d0866d87fbb/python_dateutil-2.8.0-py2.py3-none-any.whl (226kB)
Requirement already satisfied: urllib3<1.25,>=1.20; python_version >= "3.4" in /opt/conda/lib/python3.6/site-packages (from botocore<1.13.0,>=1.12.34->boto3==1.9.34->-r requirements.txt (line 2)) (1.22)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /opt/conda/lib/python3.6/site-packages (from requests<3.0.0dev,>=2.18.0->google-api-core<2.0.0dev,>=0.1.1->google-cloud-storage==1.13.0->-r requirements.txt (line 1)) (3.0.4)
Requirement already satisfied: idna<2.7,>=2.5 in /opt/conda/lib/python3.6/site-packages (from requests<3.0.0dev,>=2.18.0->google-api-core<2.0.0dev,>=0.1.1->google-cloud-storage==1.13.0->-r requirements.txt (line 1)) (2.6)
Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/lib/python3.6/site-packages (from requests<3.0.0dev,>=2.18.0->google-api-core<2.0.0dev,>=0.1.1->google-cloud-storage==1.13.0->-r requirements.txt (line 1)) (2018.1.18)
Collecting cachetools>=2.0.0 (from google-auth<2.0dev,>=0.4.0->google-api-core<2.0.0dev,>=0.1.1->google-cloud-storage==1.13.0->-r requirements.txt (line 1))
  Url '/whl' is ignored. It is either a non-existing path or lacks a specific scheme.
Downloading https://files.pythonhosted.org/packages/39/2b/d87fc2369242bd743883232c463f28205902b8579cb68dcf5b11eee1652f/cachetools-3.1.0-py2.py3-none-any.whl
Collecting pyasn1-modules>=0.2.1 (from google-auth<2.0dev,>=0.4.0->google-api-core<2.0.0dev,>=0.1.1->google-cloud-storage==1.13.0->-r requirements.txt (line 1))
  Url '/whl' is ignored. It is either a non-existing path or lacks a specific scheme.
Downloading https://files.pythonhosted.org/packages/da/98/8ddd9fa4d84065926832bcf2255a2b69f1d03330aa4d1c49cc7317ac888e/pyasn1_modules-0.2.4-py2.py3-none-any.whl (66kB)
Collecting rsa>=3.1.4 (from google-auth<2.0dev,>=0.4.0->google-api-core<2.0.0dev,>=0.1.1->google-cloud-storage==1.13.0->-r requirements.txt (line 1))
  Url '/whl' is ignored. It is either a non-existing path or lacks a specific scheme.
Downloading https://files.pythonhosted.org/packages/02/e5/38518af393f7c214357079ce67a317307936896e961e35450b70fad2a9cf/rsa-4.0-py2.py3-none-any.whl
  Url '/whl' is ignored. It is either a non-existing path or lacks a specific scheme.
Collecting pyasn1<0.5.0,>=0.4.1 (from pyasn1-modules>=0.2.1->google-auth<2.0dev,>=0.4.0->google-api-core<2.0.0dev,>=0.1.1->google-cloud-storage==1.13.0->-r requirements.txt (line 1))
Downloading https://files.pythonhosted.org/packages/7b/7c/c9386b82a25115cccf1903441bba3cbadcfae7b678a20167347fa8ded34c/pyasn1-0.4.5-py2.py3-none-any.whl (73kB)
Building wheels for collected packages: googleapis-common-protos
Running setup.py bdist_wheel for googleapis-common-protos: started
Running setup.py bdist_wheel for googleapis-common-protos: finished with status 'done'
Stored in directory: /root/.cache/pip/wheels/da/6b/81/8573adcbe2aa2ecba92c341dfe19c5b5a733f4514297ba52b4
Successfully built googleapis-common-protos
Installing collected packages: googleapis-common-protos, pytz, cachetools, pyasn1, pyasn1-modules, rsa, google-auth, google-api-core, google-cloud-core, google-resumable-media, google-cloud-storage, jmespath, docutils, python-dateutil, botocore, s3transfer, boto3
Successfully installed boto3-1.9.34 botocore-1.12.89 cachetools-3.1.0 docutils-0.14 google-api-core-1.7.0 google-auth-1.6.2 google-cloud-core-0.28.1 google-cloud-storage-1.13.0 google-resumable-media-0.3.2 googleapis-common-protos-1.5.6 jmespath-0.9.3 pyasn1-0.4.5 pyasn1-modules-0.2.4 python-dateutil-2.8.0 pytz-2018.9 rsa-4.0 s3transfer-0.1.13
You are using pip version 10.0.1, however version 19.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
Build completed successfully
[11]:
!eval $(minikube docker-env) && cd resources/combiner && s2i build -E environment_grpc . ${REGISTRY}/seldon-core-s2i-openvino:0.1 ${REGISTRY}/imagenet_combiner:0.1
---> Installing application source...
Build completed successfully
[12]:
!eval $(minikube docker-env) && cd resources/transformer && s2i build -E environment_grpc . ${REGISTRY}/seldon-core-s2i-openvino:0.1 ${REGISTRY}/imagenet_transformer:0.1
---> Installing application source...
---> Installing dependencies ...
Looking in links: /whl
You are using pip version 10.0.1, however version 19.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
Build completed successfully

Deploy Seldon pipeline with Intel OpenVINO models ensemble

  • Ingest compressed JPEG binary and transform to TensorFlow Proto payload
  • Ensemble two OpenVINO optimized models for ImageNet classification: ResNet50, DenseNet169
  • Return result in human readable text
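The ensemble step can be pictured as averaging the class scores returned by the two models and then mapping the winning index to a label. The sketch below assumes that behaviour purely for illustration; the actual combiner and output transformer under resources/ may work differently, and the short label list and score vectors are invented.

```python
def combine_mean(predictions):
    """Element-wise mean of equally sized score vectors."""
    if not predictions:
        raise ValueError("at least one prediction is required")
    size = len(predictions[0])
    if any(len(p) != size for p in predictions):
        raise ValueError("all predictions must have the same length")
    return [sum(column) / len(predictions) for column in zip(*predictions)]


def top_label(scores, labels):
    """Return the label with the highest score."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best]


labels = ["zebra", "pelican", "Eskimo dog, husky"]  # toy label set
resnet_scores = [0.70, 0.20, 0.10]    # invented model output
densenet_scores = [0.40, 0.50, 0.10]  # invented model output

combined = combine_mean([resnet_scores, densenet_scores])
print(top_label(combined, labels))
```

Here the two models disagree on the top class, and the averaged scores decide the final answer.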
[7]:
!pip install graphviz
import sys
sys.path.append("../../../notebooks")
from visualizer import *
Requirement already satisfied: graphviz in /home/clive/anaconda3/lib/python3.6/site-packages (0.8.2)
You are using pip version 19.0.3, however version 19.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
[8]:
get_graph("seldon_ov_predict_ensemble.json")
[8]:
../_images/examples_openvino_ensemble_23_0.svg
[9]:
!pygmentize seldon_ov_predict_ensemble.json
{
  "apiVersion": "machinelearning.seldon.io/v1alpha2",
  "kind": "SeldonDeployment",
  "metadata": {
    "labels": {
      "app": "seldon"
    },
    "name": "openvino-model",
    "namespace": "seldon"
  },
    "spec": {
        "annotations" : {
            "seldon.io/grpc-read-timeout":"100000"
        },
    "name": "openvino",
    "predictors": [
      {
        "componentSpecs": [{
          "spec": {
            "containers": [
              {
                "name": "imagenet-itransformer",
                "image": "seldonio/openvino-demo-transformer:0.1",
                "env": [
                  {
                    "name": "TRACING",
                    "value": "1"
                  },
                  {
                    "name": "JAEGER_AGENT_HOST",
                    "value": "jaeger-agent"
                  },
                  {
                    "name": "DTYPE",
                    "value": "float32"
                  }
                ]
              },
              {
                "name": "imagenet-otransformer",
                "image": "seldonio/openvino-demo-transformer:0.1",
                "env": [
                  {
                    "name": "TRACING",
                    "value": "1"
                  },
                  {
                    "name": "JAEGER_AGENT_HOST",
                    "value": "jaeger-agent"
                  }
                ]
              },
              {
                "name": "imagenet-combiner",
                "image": "seldonio/openvino-demo-combiner:0.1",
                "env": [
                  {
                    "name": "TRACING",
                    "value": "1"
                  },
                  {
                    "name": "JAEGER_AGENT_HOST",
                    "value": "jaeger-agent"
                  }
                ]
              },
              {
                "name": "prediction1",
                "image": "seldonio/openvino-demo-prediction:0.1",
                "env": [
                  {
                    "name": "XML_PATH",
                    "value": "gs://intelai_public_models/densenet_169/1/densenet_169_i8.xml"
                  },
                  {
                    "name": "BIN_PATH",
                    "value": "gs://intelai_public_models/densenet_169/1/densenet_169_i8.bin"
                  },
                  {
                    "name": "http_proxy",
                    "value": ""
                  },
                  {
                    "name": "https_proxy",
                    "value": ""
                  },
                  {
                    "name": "TRACING",
                    "value": "1"
                  },
                  {
                    "name": "JAEGER_AGENT_HOST",
                    "value": "jaeger-agent"
                  }
                ]
              },
              {
                "name": "prediction2",
                "image": "seldonio/openvino-demo-prediction:0.1",
                "env": [
                   {
                     "name": "XML_PATH",
                     "value": "gs://intelai_public_models/resnet_50_i8/1/resnet_50_i8.xml"
                   },
                   {
                    "name": "BIN_PATH",
                    "value": "gs://intelai_public_models/resnet_50_i8/1/resnet_50_i8.bin"
                   },
                   {
                     "name": "http_proxy",
                    "value": ""
                   },
                   {
                     "name": "https_proxy",
                     "value": ""
                   },
                   {
                    "name": "TRACING",
                     "value": "1"
                   },
                   {
                     "name": "JAEGER_AGENT_HOST",
                     "value": "jaeger-agent"
                   }
                  ]
              }
            ],
            "terminationGracePeriodSeconds": 1
          }
        }],
        "graph": {
          "name": "imagenet-otransformer",
          "endpoint": { "type" : "GRPC" },
          "type": "OUTPUT_TRANSFORMER",
          "children": [
            {

              "name": "imagenet-itransformer",
              "endpoint": { "type" : "GRPC" },
              "type": "TRANSFORMER",
              "children": [
                {
                  "name": "imagenet-combiner",
                  "endpoint": { "type" : "GRPC" },
                  "type": "COMBINER",
                  "children": [
                    {
                      "name": "prediction1",
                      "endpoint": { "type" : "GRPC" },
                      "type": "MODEL",
                      "children": []
                    },
                    {
                      "name": "prediction2",
                      "endpoint": { "type" : "GRPC" },
                      "type": "MODEL",
                      "children": []
                    }
                  ]
                }
              ]
            }
          ]
        },
        "name": "openvino",
        "replicas": 1,
        "svcOrchSpec" : {
          "env": [
            {
              "name": "TRACING",
              "value": "1"
            },
            {
              "name": "JAEGER_AGENT_HOST",
              "value": "jaeger-agent"
            },
            {
              "name": "JAEGER_AGENT_PORT",
              "value": "5775"
            },
            {
              "name": "JAEGER_SAMPLER_TYPE",
              "value": "const"
            },
            {
              "name": "JAEGER_SAMPLER_PARAM",
              "value": "1"
            }
          ]
        }
      }
    ]
  }
}
[10]:
!kubectl apply -f seldon_ov_predict_ensemble.json
seldondeployment.machinelearning.seldon.io/openvino-model created
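The graph section of the deployment is a tree: a request enters at the root node and is passed down to the children. The sketch below walks that tree and prints every node with its type. The graph literal is copied from the definition above with the endpoint fields dropped for brevity, and the helper name is hypothetical.

```python
graph = {
    "name": "imagenet-otransformer",
    "type": "OUTPUT_TRANSFORMER",
    "children": [{
        "name": "imagenet-itransformer",
        "type": "TRANSFORMER",
        "children": [{
            "name": "imagenet-combiner",
            "type": "COMBINER",
            "children": [
                {"name": "prediction1", "type": "MODEL", "children": []},
                {"name": "prediction2", "type": "MODEL", "children": []},
            ],
        }],
    }],
}


def walk(node, depth=0):
    """Yield (depth, name, type) for every node, parents before children."""
    yield depth, node["name"], node["type"]
    for child in node.get("children", []):
        yield from walk(child, depth + 1)


for depth, name, node_type in walk(graph):
    print("  " * depth + f"{name} ({node_type})")
```

Every name yielded by the walk should match a container name in componentSpecs, which is a quick consistency check to run before applying a modified definition.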

Executing the pipeline

Connectivity with the seldon pipeline

You can connect to the Seldon Ambassador endpoint using one of the following options:

  • Establish a tunnel over HTTP with the kubectl port-forward command.

kubectl port-forward $(kubectl get pods -n seldon -l app.kubernetes.io/name=ambassador -o jsonpath='{.items[0].metadata.name}') -n seldon 8080:8080

  • Expose the service seldon-core-ambassador outside of the Kubernetes cluster using a LoadBalancer or NodePort type.

kubectl edit service seldon-core-ambassador

Check the assigned External IP address with:

kubectl get service seldon-core-ambassador
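Before running the client, it can help to confirm that the chosen endpoint accepts TCP connections. The sketch below is a generic reachability check that uses only the Python standard library. The host and port are assumptions that match the port-forward command above (local port 8080), and the helper name is hypothetical. A successful connection only shows that something is listening, not that the pipeline is ready to serve requests.

```python
import socket


def endpoint_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # Assumed endpoint: the local end of the kubectl port-forward tunnel.
    print(endpoint_reachable("localhost", 8080))
```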

Using the example gRPC client

Install the client dependencies: the seldon-core and grpcio packages.

[11]:
!pip install -q seldon-core grpcio
You are using pip version 19.0.3, however version 19.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
[12]:
!python seldon_grpc_client.py --debug
meta {
  puid: "meabbkemsugfo0vc84j7vc19us"
  routing {
    key: "imagenet-combiner"
    value: -1
  }
  routing {
    key: "imagenet-itransformer"
    value: -1
  }
  routing {
    key: "imagenet-otransformer"
    value: -1
  }
  requestPath {
    key: "imagenet-combiner"
    value: "seldonio/openvino-demo-combiner:0.1"
  }
  requestPath {
    key: "imagenet-itransformer"
    value: "seldonio/openvino-demo-transformer:0.1"
  }
  requestPath {
    key: "imagenet-otransformer"
    value: "seldonio/openvino-demo-transformer:0.1"
  }
  requestPath {
    key: "prediction1"
    value: "seldonio/openvino-demo-prediction:0.1"
  }
  requestPath {
    key: "prediction2"
    value: "seldonio/openvino-demo-prediction:0.1"
  }
}
strData: "Eskimo dog, husky"

Duration 8459.775 ms
meta {
  puid: "3l9jq1tm0c5lu00gp476hca9bd"
  routing {
    key: "imagenet-combiner"
    value: -1
  }
  routing {
    key: "imagenet-itransformer"
    value: -1
  }
  routing {
    key: "imagenet-otransformer"
    value: -1
  }
  requestPath {
    key: "imagenet-combiner"
    value: "seldonio/openvino-demo-combiner:0.1"
  }
  requestPath {
    key: "imagenet-itransformer"
    value: "seldonio/openvino-demo-transformer:0.1"
  }
  requestPath {
    key: "imagenet-otransformer"
    value: "seldonio/openvino-demo-transformer:0.1"
  }
  requestPath {
    key: "prediction1"
    value: "seldonio/openvino-demo-prediction:0.1"
  }
  requestPath {
    key: "prediction2"
    value: "seldonio/openvino-demo-prediction:0.1"
  }
}
strData: "zebra"

Duration 7514.488 ms
meta {
  puid: "467pdas9vi2mb0t0hrvjajud4s"
  routing {
    key: "imagenet-combiner"
    value: -1
  }
  routing {
    key: "imagenet-itransformer"
    value: -1
  }
  routing {
    key: "imagenet-otransformer"
    value: -1
  }
  requestPath {
    key: "imagenet-combiner"
    value: "seldonio/openvino-demo-combiner:0.1"
  }
  requestPath {
    key: "imagenet-itransformer"
    value: "seldonio/openvino-demo-transformer:0.1"
  }
  requestPath {
    key: "imagenet-otransformer"
    value: "seldonio/openvino-demo-transformer:0.1"
  }
  requestPath {
    key: "prediction1"
    value: "seldonio/openvino-demo-prediction:0.1"
  }
  requestPath {
    key: "prediction2"
    value: "seldonio/openvino-demo-prediction:0.1"
  }
}
strData: "pelican"

Duration 8559.143 ms
average duration: 8177.802 ms
average accuracy: 100.0

For more extensive tests, see the client help.

You can replace the default test input file with your own labeled list of images, for example to calculate accuracy on the complete ImageNet dataset. Follow the format of the file input_images.txt: each line contains the path to an image and its ImageNet class.
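As a rough illustration of that file format and of how an accuracy figure can be derived from it, the sketch below parses lines made of a path followed by a class and compares the labels with predicted class indices. It assumes that the two fields are separated by whitespace and that the class is an integer index; check input_images.txt for the exact layout. The file names, class numbers and predictions in the example are invented.

```python
def parse_labeled_list(text):
    """Return (path, class_id) pairs from 'path class' lines, skipping blanks."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last run of whitespace so paths may contain spaces.
        path, class_id = line.rsplit(None, 1)
        samples.append((path, int(class_id)))
    return samples


def accuracy_percent(samples, predicted):
    """Percentage of samples whose predicted class equals the labeled class."""
    if not samples:
        raise ValueError("no samples")
    if len(samples) != len(predicted):
        raise ValueError("one prediction per sample is required")
    hits = sum(1 for (_, expected), got in zip(samples, predicted) if expected == got)
    return 100.0 * hits / len(samples)


example = """images/dog.jpeg 248
images/zebra.jpeg 340
images/pelican.jpeg 144
"""

samples = parse_labeled_list(example)
print(accuracy_percent(samples, [248, 340, 144]))
```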

[28]:
!python seldon_grpc_client.py --help
usage: seldon_grpc_client.py [-h] [--repeats REPEATS] [--debug]
                             [--test-input TEST_INPUT]

optional arguments:
  -h, --help            show this help message and exit
  --repeats REPEATS
  --debug
  --test-input TEST_INPUT

Examining the logs

You can use the logs of the Seldon containers to get additional details about the execution:

[2]:
!kubectl logs $(kubectl get pods -l seldon-app=openvino -o jsonpath='{.items[0].metadata.name}') prediction1 --tail=10
2019-02-12 10:32:38,646 - Prediction - DEBUG - Processing time: 27.29 ms
2019-02-12 10:32:38,646 - Prediction:predict:103 - DEBUG:  Processing time: 27.29 ms
2019-02-12 10:32:38,699 - Prediction - DEBUG - Processing time: 26.76 ms
2019-02-12 10:32:38,699 - Prediction:predict:103 - DEBUG:  Processing time: 26.76 ms
2019-02-12 10:37:59,123 - Prediction - DEBUG - Processing time: 27.28 ms
2019-02-12 10:37:59,123 - Prediction:predict:103 - DEBUG:  Processing time: 27.28 ms
2019-02-12 10:37:59,174 - Prediction - DEBUG - Processing time: 26.20 ms
2019-02-12 10:37:59,174 - Prediction:predict:103 - DEBUG:  Processing time: 26.20 ms
2019-02-12 10:37:59,228 - Prediction - DEBUG - Processing time: 27.33 ms
2019-02-12 10:37:59,228 - Prediction:predict:103 - DEBUG:  Processing time: 27.33 ms
[3]:
!kubectl logs $(kubectl get pods -l seldon-app=openvino -o jsonpath='{.items[0].metadata.name}') prediction2 --tail=10
2019-02-12 10:32:38,630 - Prediction - DEBUG - Processing time: 9.86 ms
2019-02-12 10:32:38,630 - Prediction:predict:103 - DEBUG:  Processing time: 9.86 ms
2019-02-12 10:32:38,681 - Prediction - DEBUG - Processing time: 9.32 ms
2019-02-12 10:32:38,681 - Prediction:predict:103 - DEBUG:  Processing time: 9.32 ms
2019-02-12 10:37:59,111 - Prediction - DEBUG - Processing time: 15.16 ms
2019-02-12 10:37:59,111 - Prediction:predict:103 - DEBUG:  Processing time: 15.16 ms
2019-02-12 10:37:59,158 - Prediction - DEBUG - Processing time: 9.16 ms
2019-02-12 10:37:59,158 - Prediction:predict:103 - DEBUG:  Processing time: 9.16 ms
2019-02-12 10:37:59,211 - Prediction - DEBUG - Processing time: 9.62 ms
2019-02-12 10:37:59,211 - Prediction:predict:103 - DEBUG:  Processing time: 9.62 ms
[4]:
!kubectl logs $(kubectl get pods -l seldon-app=openvino -o jsonpath='{.items[0].metadata.name}') imagenet-itransformer --tail=10
2019-02-12 10:37:59,086 - ImageNetTransformer:transform_input_grpc:43 - INFO:  jpeg preprocessing: 1.464 ms
2019-02-12 10:37:59,089 - ImageNetTransformer:transform_input_grpc:50 - INFO:  Total transformation: 4.042 ms
2019-02-12 10:37:59,137 - ImageNetTransformer:transform_input_grpc:33 - INFO:  Transform called
2019-02-12 10:37:59,139 - ImageNetTransformer:transform_input_grpc:40 - INFO:  Shape: (1, 3, 224, 224); Dtype: float32; Min: 0.0; Max: 255.0
2019-02-12 10:37:59,140 - ImageNetTransformer:transform_input_grpc:43 - INFO:  jpeg preprocessing: 2.222 ms
2019-02-12 10:37:59,142 - ImageNetTransformer:transform_input_grpc:50 - INFO:  Total transformation: 4.92 ms
2019-02-12 10:37:59,188 - ImageNetTransformer:transform_input_grpc:33 - INFO:  Transform called
2019-02-12 10:37:59,191 - ImageNetTransformer:transform_input_grpc:40 - INFO:  Shape: (1, 3, 224, 224); Dtype: float32; Min: 0.0; Max: 255.0
2019-02-12 10:37:59,191 - ImageNetTransformer:transform_input_grpc:43 - INFO:  jpeg preprocessing: 2.3249999999999997 ms
2019-02-12 10:37:59,194 - ImageNetTransformer:transform_input_grpc:50 - INFO:  Total transformation: 5.760999999999999 ms
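Latency figures like these are easy to summarize with a short script. The sketch below extracts the "Processing time" values from log text and averages them. Each measurement appears twice in the prediction logs above (once per logger format), so the pattern is an assumption that keeps only the lines containing "Prediction - DEBUG". The sample lines are copied from the prediction1 output.

```python
import re
from statistics import mean

# Matches only the first of the two logger formats to avoid double counting.
PATTERN = re.compile(r"Prediction - DEBUG - Processing time: ([0-9.]+) ms")


def processing_times(log_text):
    """Return the processing times in milliseconds found in the log text."""
    return [float(match.group(1)) for match in PATTERN.finditer(log_text)]


sample = """\
2019-02-12 10:37:59,123 - Prediction - DEBUG - Processing time: 27.28 ms
2019-02-12 10:37:59,123 - Prediction:predict:103 - DEBUG:  Processing time: 27.28 ms
2019-02-12 10:37:59,174 - Prediction - DEBUG - Processing time: 26.20 ms
2019-02-12 10:37:59,174 - Prediction:predict:103 - DEBUG:  Processing time: 26.20 ms
"""

times = processing_times(sample)
print(f"{len(times)} samples, mean {mean(times):.2f} ms")
```

The same function can be fed the full output of kubectl logs to compare the two prediction containers over a longer run.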

Performance consideration

In a production environment with shared workloads, you might consider constraining the CPU resources for individual pipeline components. You can restrict the assigned capacity using Kubernetes resource requests and limits. This configuration can be added to the Seldon pipeline definition.

Another option for tuning resource allocation is to add the OMP_NUM_THREADS environment variable. It sets how many threads the OpenVINO execution engine uses and therefore how many CPU cores it can consume. The recommended value is equal to the number of allocated physical CPU cores.

In tests using the GKE service in Google Cloud on nodes with 32 Skylake vCPUs assigned, the following configuration was set on the prediction components. It achieved the optimal latency and throughput:

"resources": {
  "requests": {
     "cpu": "1"
  },
  "limits": {
     "cpu": "32"
  }
}

"env": [
  {
    "name": "KMP_AFFINITY",
    "value": "granularity=fine,verbose,compact,1,0"
  },
  {
    "name": "KMP_BLOCKTIME",
    "value": "1"
  },
  {
    "name": "OMP_NUM_THREADS",
    "value": "8"
  }
]
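One way to apply such settings is to patch the deployment definition programmatically before running kubectl apply. The sketch below adds the resources and env entries shown above to every container whose name starts with "prediction". The helper name and the trimmed-down deployment dictionary are hypothetical; in practice the dictionary would be loaded from seldon_ov_predict_ensemble.json with json.load.

```python
import copy
import json

RESOURCES = {"requests": {"cpu": "1"}, "limits": {"cpu": "32"}}
EXTRA_ENV = [
    {"name": "KMP_AFFINITY", "value": "granularity=fine,verbose,compact,1,0"},
    {"name": "KMP_BLOCKTIME", "value": "1"},
    {"name": "OMP_NUM_THREADS", "value": "8"},
]


def tune_prediction_containers(deployment):
    """Return a copy with CPU resources and threading env set on prediction containers."""
    tuned = copy.deepcopy(deployment)
    extra_names = {entry["name"] for entry in EXTRA_ENV}
    for predictor in tuned["spec"]["predictors"]:
        for component in predictor["componentSpecs"]:
            for container in component["spec"]["containers"]:
                if not container["name"].startswith("prediction"):
                    continue
                container["resources"] = copy.deepcopy(RESOURCES)
                # Drop any existing entries with the same names, then append ours.
                kept = [e for e in container.get("env", []) if e["name"] not in extra_names]
                container["env"] = kept + copy.deepcopy(EXTRA_ENV)
    return tuned


# Trimmed-down stand-in for the real SeldonDeployment definition.
deployment = {"spec": {"predictors": [{"componentSpecs": [{"spec": {"containers": [
    {"name": "imagenet-combiner", "env": []},
    {"name": "prediction1", "env": [{"name": "TRACING", "value": "1"}]},
]}}]}]}}

print(json.dumps(tune_prediction_containers(deployment), indent=2))
```

Working on a deep copy keeps the original definition untouched, so the tuned and untuned variants can be compared or applied separately.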