This page was generated from notebooks/max_grpc_msg_size.ipynb.

Increasing the Maximum Message Size for gRPC

Running this notebook

You will need to start Jupyter with settings to allow for large payloads, for example:

jupyter notebook --NotebookApp.iopub_data_rate_limit=1000000000

Setup Seldon Core

Use the setup notebook to Setup Cluster with Ambassador Ingress and Install Seldon Core. Instructions are also available online.

[1]:
!pygmentize resources/model_long_timeouts.json
{
    "apiVersion": "machinelearning.seldon.io/v1alpha2",
    "kind": "SeldonDeployment",
    "metadata": {
        "labels": {
            "app": "seldon"
        },
        "name": "model-long-timeout"
    },
    "spec": {
        "annotations": {
            "deployment_version": "v1",
            "seldon.io/rest-read-timeout":"100000",
            "seldon.io/rest-connection-timeout":"100000",
            "seldon.io/grpc-read-timeout":"100000"
        },
        "name": "long-to",
        "oauth_key": "oauth-key",
        "oauth_secret": "oauth-secret",
        "predictors": [
            {
                "componentSpecs": [{
                    "spec": {
                        "containers": [
                            {
                                "image": "seldonio/mock_classifier:1.0",
                                "imagePullPolicy": "IfNotPresent",
                                "name": "classifier",
                                "resources": {
                                    "requests": {
                                        "memory": "1Mi"
                                    }
                                }
                            }
                        ],
                        "terminationGracePeriodSeconds": 20
                    }
                }],
                "graph": {
                    "children": [],
                    "name": "classifier",
                    "endpoint": {
                        "type" : "REST"
                    },
                    "type": "MODEL"
                },
                "name": "test",
                "replicas": 1,
                "annotations": {
                    "predictor_version" : "v1"
                }
            }
        ]
    }
}

Create Seldon Deployment

Deploy the runtime graph to kubernetes.

[2]:
!kubectl apply -f resources/model_long_timeouts.json -n seldon
seldondeployment.machinelearning.seldon.io/model-long-timeout created
[3]:
!kubectl rollout status deploy/long-to-test-7cd068f
Waiting for deployment "long-to-test-7cd068f" rollout to finish: 0 of 1 updated replicas are available...
deployment "long-to-test-7cd068f" successfully rolled out

Get predictions - no gRPC max message size

[4]:
from seldon_core.seldon_client import SeldonClient
sc = SeldonClient(deployment_name="model-long-timeout",namespace="seldon",
                  grpc_max_send_message_length=50 * 1024 * 1024, grpc_max_receive_message_length=50 * 1024 * 1024)
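Note that the limits must be raised on the client as well, since gRPC's defaults apply independently on each side of the connection. As a quick sanity check on the value passed above:

```python
# The SeldonClient call above raises both client-side limits to 50 MiB;
# gRPC's default maximum message size (4 MiB) would otherwise still apply
# on the client even if the server accepted larger messages.
client_limit = 50 * 1024 * 1024
print(client_limit)  # 52428800 bytes, i.e. 50 MiB
```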

Send a small request, which should succeed.

[5]:
r = sc.predict(gateway="ambassador",transport="grpc")
print(r)
Success:True message:
Request:
data {
  tensor {
    shape: 1
    shape: 1
    values: 0.9674523986034762
  }
}

Response:
meta {
  puid: "skh2656gs2fa8r8na7h0khj78q"
  requestPath {
    key: "classifier"
    value: "seldonio/mock_classifier:1.0"
  }
}
data {
  names: "proba"
  tensor {
    shape: 1
    shape: 1
    values: 0.12463905953323141
  }
}

Send a large request which exceeds the default gRPC maximum message size (4 MiB) and will therefore fail.
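A back-of-the-envelope estimate shows why this request is rejected (the ~9 bytes per value is a rough assumption for an 8-byte protobuf double plus framing, not an exact wire size):

```python
# Rough payload size for a request of shape (1000000, 1).
# Assumption: each tensor value serializes to ~9 bytes
# (8-byte double plus protobuf tag/framing overhead).
n_values = 1000000 * 1
approx_payload_bytes = n_values * 9       # ~9 MB
default_grpc_limit = 4 * 1024 * 1024      # gRPC's 4 MiB default

print(approx_payload_bytes > default_grpc_limit)  # True
```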

[6]:
r = sc.predict(gateway="ambassador",transport="grpc",shape=(1000000,1))
print(r.success,r.msg)
--------------------------------------------------------------------------
_Rendezvous                              Traceback (most recent call last)
<ipython-input-6-e4a696bf468f> in <module>
----> 1 r = sc.predict(gateway="ambassador",transport="grpc",shape=(1000000,1))
      2 print(r.success,r.msg)

~/anaconda3/envs/seldoncore/lib/python3.7/site-packages/seldon_core/seldon_client.py in predict(self, gateway, transport, deployment_name, payload_type, oauth_key, oauth_secret, seldon_rest_endpoint, seldon_grpc_endpoint, gateway_endpoint, microservice_endpoint, method, shape, namespace, data, bin_data, str_data, json_data, names, gateway_prefix, headers, http_path)
    366                 return rest_predict_gateway(**k)
    367             elif k["transport"] == "grpc":
--> 368                 return grpc_predict_gateway(**k)
    369             else:
    370                 raise SeldonClientException("Unknown transport " + k["transport"])

~/anaconda3/envs/seldoncore/lib/python3.7/site-packages/seldon_core/seldon_client.py in grpc_predict_gateway(deployment_name, namespace, gateway_endpoint, shape, data, headers, payload_type, bin_data, str_data, json_data, grpc_max_send_message_length, grpc_max_receive_message_length, names, call_credentials, channel_credentials, **kwargs)
   1852         for k in headers:
   1853             metadata.append((k, headers[k]))
-> 1854     response = stub.Predict(request=request, metadata=metadata)
   1855     return SeldonClientPrediction(request, response, True, "")
   1856

~/anaconda3/envs/seldoncore/lib/python3.7/site-packages/grpc/_channel.py in __call__(self, request, timeout, metadata, credentials, wait_for_ready, compression)
    688         state, call, = self._blocking(request, timeout, metadata, credentials,
    689                                       wait_for_ready, compression)
--> 690         return _end_unary_response_blocking(state, call, False, None)
    691
    692     def with_call(self,

~/anaconda3/envs/seldoncore/lib/python3.7/site-packages/grpc/_channel.py in _end_unary_response_blocking(state, call, with_call, deadline)
    590             return state.response
    591     else:
--> 592         raise _Rendezvous(state, None, None, deadline)
    593
    594

_Rendezvous: <_Rendezvous of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "upstream connect error or disconnect/reset before headers. reset reason: remote reset"
        debug_error_string = "{"created":"@1575289220.338184210","description":"Error received from peer ipv6:[::1]:8003","file":"src/core/lib/surface/call.cc","file_line":1055,"grpc_message":"upstream connect error or disconnect/reset before headers. reset reason: remote reset","grpc_status":14}"
>
[7]:
!kubectl delete -f resources/model_long_timeouts.json
seldondeployment.machinelearning.seldon.io "model-long-timeout" deleted

Allowing larger gRPC messages

Now we change our SeldonDeployment to include an annotation for the maximum gRPC message size.
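The `seldon.io/grpc-max-message-size` annotation raises the limit on the server side. For reference, a raw gRPC client would set the equivalent limits via channel options; a minimal sketch (the endpoint shown in the comment is a hypothetical placeholder):

```python
# Matches "seldon.io/grpc-max-message-size": "10000000" in the manifest.
MAX_MSG_BYTES = 10_000_000

# Standard gRPC channel options controlling message size limits.
channel_options = [
    ("grpc.max_send_message_length", MAX_MSG_BYTES),
    ("grpc.max_receive_message_length", MAX_MSG_BYTES),
]
# A raw client would pass these when creating the channel, e.g.:
# import grpc
# channel = grpc.insecure_channel("<ambassador-host>:80", options=channel_options)
```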

[8]:
!pygmentize resources/model_grpc_size.json
{
    "apiVersion": "machinelearning.seldon.io/v1alpha2",
    "kind": "SeldonDeployment",
    "metadata": {
        "labels": {
            "app": "seldon"
        },
        "name": "seldon-model"
    },
    "spec": {
        "annotations": {
            "seldon.io/grpc-max-message-size":"10000000",
            "seldon.io/rest-read-timeout":"100000",
            "seldon.io/rest-connection-timeout":"100000",
            "seldon.io/grpc-read-timeout":"100000"
        },
        "name": "test-deployment",
        "oauth_key": "oauth-key",
        "oauth_secret": "oauth-secret",
        "predictors": [
            {
                "componentSpecs": [{
                    "spec": {
                        "containers": [
                            {
                                "image": "seldonio/mock_classifier_grpc:1.0",
                                "imagePullPolicy": "IfNotPresent",
                                "name": "classifier",
                                "resources": {
                                    "requests": {
                                        "memory": "1Mi"
                                    }
                                }
                            }
                        ],
                        "terminationGracePeriodSeconds": 20
                    }
                }],
                "graph": {
                    "children": [],
                    "name": "classifier",
                    "endpoint": {
                        "type" : "GRPC"
                    },
                    "type": "MODEL"
                },
                "name": "grpc-size",
                "replicas": 1,
                "annotations": {
                    "predictor_version" : "v1"
                }
            }
        ]
    }
}
[9]:
!kubectl create -f resources/model_grpc_size.json -n seldon
seldondeployment.machinelearning.seldon.io/seldon-model created
[10]:
!kubectl rollout status deploy/test-deployment-grpc-size-fd60a01
Waiting for deployment "test-deployment-grpc-size-fd60a01" rollout to finish: 0 of 1 updated replicas are available...
deployment "test-deployment-grpc-size-fd60a01" successfully rolled out

Send a large request via Ambassador. This should now succeed.

[12]:
sc = SeldonClient(deployment_name="seldon-model",namespace="seldon",
                  grpc_max_send_message_length=50 * 1024 * 1024, grpc_max_receive_message_length=50 * 1024 * 1024)
r = sc.predict(gateway="ambassador",transport="grpc",shape=(1000000,1))
print(r.success,r.msg)
True
[13]:
!kubectl delete -f resources/model_grpc_size.json -n seldon
seldondeployment.machinelearning.seldon.io "seldon-model" deleted