This page was generated from notebooks/max_grpc_msg_size.ipynb.

Increasing the Maximum Message Size for gRPC

Prerequisites

You will need a running Minikube cluster, Helm, and the seldon-core Python package installed (for the SeldonClient examples below).

Create Cluster

Start minikube, ensuring custom resource validation is activated and at least 4Gb of memory is available.

An example start command using the kvm2 driver would look like:

minikube start --vm-driver kvm2 --memory 4096

Running this notebook

You will need to start Jupyter with settings to allow for large payloads, for example:

jupyter notebook --NotebookApp.iopub_data_rate_limit=1000000000

Setup

[1]:
!kubectl create namespace seldon
namespace/seldon created
[2]:
!kubectl config set-context $(kubectl config current-context) --namespace=seldon
Context "minikube" modified.
[3]:
!kubectl create clusterrolebinding kube-system-cluster-admin --clusterrole=cluster-admin --serviceaccount=kube-system:default
Error from server (AlreadyExists): clusterrolebindings.rbac.authorization.k8s.io "kube-system-cluster-admin" already exists

Install Helm

[4]:
!kubectl -n kube-system create sa tiller
!kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller
!helm init --service-account tiller
serviceaccount/tiller created
clusterrolebinding.rbac.authorization.k8s.io/tiller created
$HELM_HOME has been configured at /home/clive/.helm.

Tiller (the Helm server-side component) has been installed into your Kubernetes Cluster.

Please note: by default, Tiller is deployed with an insecure 'allow unauthenticated users' policy.
To prevent this, run `helm init` with the --tiller-tls-verify flag.
For more information on securing your installation see: https://docs.helm.sh/using_helm/#securing-your-helm-installation
Happy Helming!
[5]:
!kubectl rollout status deploy/tiller-deploy -n kube-system
Waiting for deployment "tiller-deploy" rollout to finish: 0 of 1 updated replicas are available...
deployment "tiller-deploy" successfully rolled out

Start seldon-core

[4]:
!helm install ../helm-charts/seldon-core-operator --name seldon-core --set usageMetrics.enabled=true --namespace seldon-system
NAME:   seldon-core
LAST DEPLOYED: Tue Apr 16 07:58:23 2019
NAMESPACE: seldon-system
STATUS: DEPLOYED

RESOURCES:
==> v1beta1/Deployment
NAME                        DESIRED  CURRENT  UP-TO-DATE  AVAILABLE  AGE
seldon-spartakus-volunteer  1        0        0           0          1s

==> v1beta1/ClusterRoleBinding
NAME                        AGE
seldon-spartakus-volunteer  0s

==> v1/ConfigMap
NAME                     DATA  AGE
seldon-spartakus-config  3     1s

==> v1/ClusterRoleBinding
NAME                                 AGE
seldon-operator-manager-rolebinding  1s

==> v1/ClusterRole
NAME                          AGE
seldon-operator-manager-role  1s

==> v1/Service
NAME                                        TYPE       CLUSTER-IP      EXTERNAL-IP  PORT(S)  AGE
seldon-operator-controller-manager-service  ClusterIP  10.111.148.241  <none>       443/TCP  1s

==> v1/StatefulSet
NAME                                DESIRED  CURRENT  AGE
seldon-operator-controller-manager  1        1        1s

==> v1/ServiceAccount
NAME                        SECRETS  AGE
seldon-spartakus-volunteer  1        1s

==> v1beta1/ClusterRole
NAME                        AGE
seldon-spartakus-volunteer  0s

==> v1/Pod(related)
NAME                                  READY  STATUS             RESTARTS  AGE
seldon-operator-controller-manager-0  0/1    ContainerCreating  0         0s

==> v1/Secret
NAME                                   TYPE    DATA  AGE
seldon-operator-webhook-server-secret  Opaque  0     1s

==> v1beta1/CustomResourceDefinition
NAME                                         AGE
seldondeployments.machinelearning.seldon.io  1s


NOTES:
NOTES: TODO


Check all services are running before proceeding.

[5]:
!kubectl rollout status statefulset.apps/seldon-operator-controller-manager -n seldon-system
partitioned roll out complete: 1 new pods have been updated...

Setup Ingress

Please note: There are reported gRPC issues with ambassador (see https://github.com/SeldonIO/seldon-core/issues/473).

[6]:
!helm install stable/ambassador --name ambassador --set crds.keep=false
NAME:   ambassador
LAST DEPLOYED: Tue Apr 16 07:59:02 2019
NAMESPACE: seldon
STATUS: DEPLOYED

RESOURCES:
==> v1beta1/ClusterRole
NAME        AGE
ambassador  1s

==> v1beta1/ClusterRoleBinding
NAME        AGE
ambassador  1s

==> v1/Service
NAME               TYPE          CLUSTER-IP      EXTERNAL-IP  PORT(S)                     AGE
ambassador-admins  ClusterIP     10.100.193.126  <none>       8877/TCP                    1s
ambassador         LoadBalancer  10.100.215.36   <pending>    80:31251/TCP,443:31599/TCP  0s

==> v1/Deployment
NAME        DESIRED  CURRENT  UP-TO-DATE  AVAILABLE  AGE
ambassador  3        3        3           0          0s

==> v1/Pod(related)
NAME                         READY  STATUS             RESTARTS  AGE
ambassador-5b89d44544-dh8q6  0/1    ContainerCreating  0         0s
ambassador-5b89d44544-p5c8c  0/1    ContainerCreating  0         0s
ambassador-5b89d44544-qjtjh  0/1    ContainerCreating  0         0s

==> v1/ServiceAccount
NAME        SECRETS  AGE
ambassador  1        1s


NOTES:
Congratuations! You've successfully installed Ambassador.

For help, visit our Slack at https://d6e.co/slack or view the documentation online at https://www.getambassador.io.

To get the IP address of Ambassador, run the following commands:
NOTE: It may take a few minutes for the LoadBalancer IP to be available.
     You can watch the status of by running 'kubectl get svc -w  --namespace seldon ambassador'

  On GKE/Azure:
  export SERVICE_IP=$(kubectl get svc --namespace seldon ambassador -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

  On AWS:
  export SERVICE_IP=$(kubectl get svc --namespace seldon ambassador -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')

  echo http://$SERVICE_IP:

[7]:
!kubectl rollout status deployment.apps/ambassador
Waiting for deployment "ambassador" rollout to finish: 0 of 3 updated replicas are available...
Waiting for deployment "ambassador" rollout to finish: 1 of 3 updated replicas are available...
Waiting for deployment "ambassador" rollout to finish: 2 of 3 updated replicas are available...
deployment "ambassador" successfully rolled out

Port forward to Ambassador

kubectl port-forward $(kubectl get pods -n seldon -l app.kubernetes.io/name=ambassador -o jsonpath='{.items[0].metadata.name}') -n seldon 8003:8080
[8]:
!pygmentize resources/model_long_timeouts.json
{
    "apiVersion": "machinelearning.seldon.io/v1alpha2",
    "kind": "SeldonDeployment",
    "metadata": {
        "labels": {
            "app": "seldon"
        },
        "name": "model-long-timeout"
    },
    "spec": {
        "annotations": {
            "deployment_version": "v1",
            "seldon.io/rest-read-timeout":"100000",
            "seldon.io/rest-connection-timeout":"100000",
            "seldon.io/grpc-read-timeout":"100000"
        },
        "name": "long-to",
        "oauth_key": "oauth-key",
        "oauth_secret": "oauth-secret",
        "predictors": [
            {
                "componentSpecs": [{
                    "spec": {
                        "containers": [
                            {
                                "image": "seldonio/mock_classifier:1.0",
                                "imagePullPolicy": "IfNotPresent",
                                "name": "classifier",
                                "resources": {
                                    "requests": {
                                        "memory": "1Mi"
                                    }
                                }
                            }
                        ],
                        "terminationGracePeriodSeconds": 20
                    }
                }],
                "graph": {
                    "children": [],
                    "name": "classifier",
                    "endpoint": {
                        "type" : "REST"
                    },
                    "type": "MODEL"
                },
                "name": "test",
                "replicas": 1,
                "annotations": {
                    "predictor_version" : "v1"
                }
            }
        ]
    }
}

Create Seldon Deployment

Deploy the runtime graph to kubernetes.

[9]:
!kubectl apply -f resources/model_long_timeouts.json -n seldon
seldondeployment.machinelearning.seldon.io/model-long-timeout created
[10]:
!kubectl rollout status deploy/long-to-test-7cd068f
Waiting for deployment "long-to-test-7cd068f" rollout to finish: 0 of 1 updated replicas are available...
deployment "long-to-test-7cd068f" successfully rolled out

Get predictions - default gRPC max message size

[11]:
from seldon_core.seldon_client import SeldonClient
sc = SeldonClient(deployment_name="model-long-timeout",namespace="seldon",
                  grpc_max_send_message_length=50 * 1024 * 1024, grpc_max_receive_message_length=50 * 1024 * 1024)

Send a small request, which should succeed.

[12]:
r = sc.predict(gateway="ambassador",transport="grpc")
print(r)
Success:True message:
Request:
data {
  tensor {
    shape: 1
    shape: 1
    values: 0.7542117938738865
  }
}

Response:
meta {
  puid: "h568dh0e9mfbhfa4isnlotkkm1"
  requestPath {
    key: "classifier"
    value: "seldonio/mock_classifier:1.0"
  }
}
data {
  names: "proba"
  tensor {
    shape: 1
    shape: 1
    values: 0.10317308463100552
  }
}

Send a large request that exceeds the default gRPC maximum message size; it will fail.

[13]:
r = sc.predict(gateway="ambassador",transport="grpc",shape=(1000000,1))
print(r.success,r.msg)
False <_Rendezvous of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "upstream connect error or disconnect/reset before headers"
        debug_error_string = "{"created":"@1555398231.521439766","description":"Error received from peer","file":"src/core/lib/surface/call.cc","file_line":1095,"grpc_message":"upstream connect error or disconnect/reset before headers","grpc_status":14}"
>
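The failure above is expected. As a rough sketch of the arithmetic (assuming each tensor value is serialized as a packed 8-byte double in the protobuf payload), a request of shape (1000000, 1) is already about twice the default limit:

```python
# Rough estimate of why a (1000000, 1) tensor breaks the default limit.
# Assumption: each value is serialized as a packed double (8 bytes each).
DEFAULT_GRPC_MAX_BYTES = 4 * 1024 * 1024  # gRPC's default max message size (4 MiB)

num_values = 1_000_000 * 1  # shape=(1000000, 1)
approx_payload_bytes = num_values * 8  # ~8 MB of tensor data alone

print(approx_payload_bytes > DEFAULT_GRPC_MAX_BYTES)  # True: payload exceeds the default
```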
[14]:
!kubectl delete -f resources/model_long_timeouts.json
seldondeployment.machinelearning.seldon.io "model-long-timeout" deleted

Allowing larger gRPC messages

Now we change our SeldonDeployment to include an annotation for the maximum gRPC message size.

[15]:
!pygmentize resources/model_grpc_size.json
{
    "apiVersion": "machinelearning.seldon.io/v1alpha2",
    "kind": "SeldonDeployment",
    "metadata": {
        "labels": {
            "app": "seldon"
        },
        "name": "seldon-model"
    },
    "spec": {
        "annotations": {
            "seldon.io/grpc-max-message-size":"10000000",
            "seldon.io/rest-read-timeout":"100000",
            "seldon.io/rest-connection-timeout":"100000",
            "seldon.io/grpc-read-timeout":"100000"
        },
        "name": "test-deployment",
        "oauth_key": "oauth-key",
        "oauth_secret": "oauth-secret",
        "predictors": [
            {
                "componentSpecs": [{
                    "spec": {
                        "containers": [
                            {
                                "image": "seldonio/mock_classifier_grpc:1.0",
                                "imagePullPolicy": "IfNotPresent",
                                "name": "classifier",
                                "resources": {
                                    "requests": {
                                        "memory": "1Mi"
                                    }
                                }
                            }
                        ],
                        "terminationGracePeriodSeconds": 20
                    }
                }],
                "graph": {
                    "children": [],
                    "name": "classifier",
                    "endpoint": {
                        "type" : "GRPC"
                    },
                    "type": "MODEL"
                },
                "name": "grpc-size",
                "replicas": 1,
                "annotations": {
                    "predictor_version" : "v1"
                }
            }
        ]
    }
}
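For reference, the annotation value corresponds to gRPC's standard channel/server options that cap message sizes. A minimal sketch of the equivalent option list (the option keys are gRPC's own; wiring them into a server or channel is shown only as an illustration, not as Seldon's implementation):

```python
# Illustrative sketch: these are gRPC's standard option keys for the limits
# that the seldon.io/grpc-max-message-size annotation configures.
GRPC_MAX_MSG_SIZE = 10_000_000  # matches the annotation value above

options = [
    ("grpc.max_send_message_length", GRPC_MAX_MSG_SIZE),
    ("grpc.max_receive_message_length", GRPC_MAX_MSG_SIZE),
]

# These options would be passed when building a server or channel, e.g.
#   grpc.server(executor, options=options)
#   grpc.insecure_channel(target, options=options)
print(options)
```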
[16]:
!kubectl create -f resources/model_grpc_size.json -n seldon
seldondeployment.machinelearning.seldon.io/seldon-model created
[17]:
!kubectl rollout status deploy/test-deployment-grpc-size-fd60a01
Waiting for deployment "test-deployment-grpc-size-fd60a01" rollout to finish: 0 of 1 updated replicas are available...
deployment "test-deployment-grpc-size-fd60a01" successfully rolled out

Send a request via ambassador. This should succeed.

[18]:
sc = SeldonClient(deployment_name="seldon-model",namespace="seldon",
                  grpc_max_send_message_length=50 * 1024 * 1024, grpc_max_receive_message_length=50 * 1024 * 1024)
r = sc.predict(gateway="ambassador",transport="grpc",shape=(1000000,1))
print(r.success,r.msg)
True
[19]:
!kubectl delete -f resources/model_grpc_size.json -n seldon
seldondeployment.machinelearning.seldon.io "seldon-model" deleted