This page was generated from notebooks/max_grpc_msg_size.ipynb.
Increasing the Maximum Message Size for gRPC¶
Running this notebook¶
You will need to start Jupyter with settings to allow for large payloads, for example:
jupyter notebook --NotebookApp.iopub_data_rate_limit=1000000000
[1]:
from IPython.core.magic import register_line_cell_magic


@register_line_cell_magic
def writetemplate(line, cell):
    with open(line, "w") as f:
        f.write(cell.format(**globals()))
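The magic simply runs the cell body through str.format with the notebook's globals, so a placeholder such as {VERSION} is replaced by the variable of the same name. A minimal sketch of that substitution (the template string here is illustrative):

```python
# Sketch of the substitution %%writetemplate performs:
# the cell body is rendered with str.format(**globals()).
VERSION = "1.5.0-dev"  # stands in for the value read from ../version.txt

template = "image: seldonio/mock_classifier:{VERSION}"
rendered = template.format(VERSION=VERSION)
print(rendered)  # image: seldonio/mock_classifier:1.5.0-dev
```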
Setup Seldon Core¶
Use the setup notebook to Setup Cluster with Ambassador Ingress and Install Seldon Core. Instructions also online.
[2]:
!kubectl create namespace seldon
Error from server (AlreadyExists): namespaces "seldon" already exists
[3]:
!kubectl config set-context $(kubectl config current-context) --namespace=seldon
Context "kind-kind" modified.
[4]:
VERSION = !cat ../version.txt
VERSION = VERSION[0]
VERSION
[4]:
'1.5.0-dev'
We now add the annotations "seldon.io/rest-timeout": "100000" and "seldon.io/grpc-timeout": "100000" to our model configuration file.
[18]:
%%writetemplate resources/model_long_timeouts.yaml
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  labels:
    app: seldon
  name: model-long-timeout
spec:
  annotations:
    deployment_version: v1
    seldon.io/grpc-timeout: '100000'
    seldon.io/rest-timeout: '100000'
  name: long-to
  predictors:
  - annotations:
      predictor_version: v1
    componentSpecs:
    - spec:
        containers:
        - image: seldonio/mock_classifier:{VERSION}
          imagePullPolicy: IfNotPresent
          name: classifier
          resources:
            requests:
              memory: 1Mi
        terminationGracePeriodSeconds: 20
    graph:
      children: []
      name: classifier
      type: MODEL
    name: test
    replicas: 1
Create Seldon Deployment¶
Deploy the runtime graph to kubernetes.
[19]:
!kubectl apply -f resources/model_long_timeouts.yaml -n seldon
seldondeployment.machinelearning.seldon.io/model-long-timeout created
[20]:
!kubectl rollout status deploy/$(kubectl get deploy -l seldon-deployment-id=model-long-timeout -o jsonpath='{.items[0].metadata.name}')
deployment "model-long-timeout-test-0-classifier" successfully rolled out
Get predictions¶
[21]:
from seldon_core.seldon_client import SeldonClient
sc = SeldonClient(
    deployment_name="model-long-timeout",
    namespace="seldon",
    grpc_max_send_message_length=50 * 1024 * 1024,
    grpc_max_receive_message_length=50 * 1024 * 1024,
)
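These two client arguments correspond to the standard gRPC channel options of the same names. A minimal sketch with the raw grpc library, assuming a locally port-forwarded Ambassador endpoint at localhost:8003 (the address is an assumption for illustration):

```python
import grpc

MAX_MSG = 50 * 1024 * 1024  # 50 MiB, matching the SeldonClient settings above

# Core gRPC channel options controlling per-message size limits.
options = [
    ("grpc.max_send_message_length", MAX_MSG),
    ("grpc.max_receive_message_length", MAX_MSG),
]

# "localhost:8003" is an assumed endpoint for illustration only.
channel = grpc.insecure_channel("localhost:8003", options=options)
```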
Send a small request which should succeed.
[22]:
r = sc.predict(gateway="ambassador", transport="grpc")
assert r.success == True
print(r)
Success:True message:
Request:
{'meta': {}, 'data': {'tensor': {'shape': [1, 1], 'values': [0.4806932754099743]}}}
Response:
{'meta': {}, 'data': {'names': ['proba'], 'tensor': {'shape': [1, 1], 'values': [0.08047035772935462]}}}
Send a large request, which will fail because the model's default maximum gRPC message size is 4MB (4194304 bytes).
[23]:
r = sc.predict(gateway="ambassador", transport="grpc", shape=(1000000, 1))
print(r.success, r.msg)
False <_InactiveRpcError of RPC that terminated with:
status = StatusCode.RESOURCE_EXHAUSTED
details = "Received message larger than max (8000023 vs. 4194304)"
debug_error_string = "{"created":"@1603887710.710555595","description":"Error received from peer ipv6:[::1]:8003","file":"src/core/lib/surface/call.cc","file_line":1061,"grpc_message":"Received message larger than max (8000023 vs. 4194304)","grpc_status":8}"
>
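The numbers in the error line up with the request we sent: a (1000000, 1) tensor of float64 values is about 8 MB once serialized, while gRPC's default per-message limit is 4 MiB. A quick check of that arithmetic:

```python
# A (1000000, 1) tensor of float64 values: 8 bytes per value.
payload_bytes = 1_000_000 * 8    # ~8 MB of raw data; the wire message
                                 # (8000023 bytes) adds protobuf framing
default_limit = 4 * 1024 * 1024  # gRPC default max message size: 4194304

assert payload_bytes > default_limit
print(payload_bytes, default_limit)  # 8000000 4194304
```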
[24]:
!kubectl delete -f resources/model_long_timeouts.yaml
seldondeployment.machinelearning.seldon.io "model-long-timeout" deleted
Allowing larger gRPC messages¶
Now we change our SeldonDeployment to include an annotation for the maximum gRPC message size.
[25]:
%%writetemplate resources/model_grpc_size.yaml
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  labels:
    app: seldon
  name: seldon-model
spec:
  annotations:
    seldon.io/grpc-max-message-size: '10000000'
    seldon.io/grpc-timeout: '100000'
    seldon.io/rest-timeout: '100000'
  name: test-deployment
  predictors:
  - annotations:
      predictor_version: v1
    componentSpecs:
    - spec:
        containers:
        - image: seldonio/mock_classifier:{VERSION}
          imagePullPolicy: IfNotPresent
          name: classifier
          resources:
            requests:
              memory: 1Mi
        terminationGracePeriodSeconds: 20
    graph:
      children: []
      endpoint:
        type: GRPC
      name: classifier
      type: MODEL
    name: grpc-size
    replicas: 1
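Before deploying, we can sanity-check that the new limit covers the message that previously failed (both sizes come from the error output above):

```python
new_limit = 10_000_000      # value of seldon.io/grpc-max-message-size
failed_message = 8_000_023  # size reported in the earlier RESOURCE_EXHAUSTED error

assert failed_message < new_limit
```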
[26]:
!kubectl create -f resources/model_grpc_size.yaml -n seldon
seldondeployment.machinelearning.seldon.io/seldon-model created
[27]:
!kubectl rollout status deploy/$(kubectl get deploy -l seldon-deployment-id=seldon-model -o jsonpath='{.items[0].metadata.name}')
Waiting for deployment "seldon-model-grpc-size-0-classifier" rollout to finish: 0 of 1 updated replicas are available...
deployment "seldon-model-grpc-size-0-classifier" successfully rolled out
Send a large request via Ambassador. This should now succeed.
[28]:
sc = SeldonClient(
    deployment_name="seldon-model",
    namespace="seldon",
    grpc_max_send_message_length=50 * 1024 * 1024,
    grpc_max_receive_message_length=50 * 1024 * 1024,
)
r = sc.predict(gateway="ambassador", transport="grpc", shape=(1000000, 1))
assert r.success == True
print(r.success)
True
[29]:
!kubectl delete -f resources/model_grpc_size.yaml -n seldon
seldondeployment.machinelearning.seldon.io "seldon-model" deleted
[ ]: