This page was generated from notebooks/protocol_examples.ipynb.

Basic Examples with Different Protocols

Prerequisites

  • A Kubernetes cluster with kubectl configured

  • curl

  • grpcurl

  • pygmentize

Setup Seldon Core

Use the Setup Cluster notebook to set up Seldon Core with an ingress - either Ambassador or Istio.

Then, in a separate terminal, port-forward to that ingress on localhost:8003 with either:

  • Ambassador: kubectl port-forward $(kubectl get pods -n seldon -l app.kubernetes.io/name=ambassador -o jsonpath='{.items[0].metadata.name}') -n seldon 8003:8080

  • Istio: kubectl port-forward $(kubectl get pods -l istio=ingressgateway -n istio-system -o jsonpath='{.items[0].metadata.name}') -n istio-system 8003:80

[ ]:
!kubectl create namespace seldon
[ ]:
!kubectl config set-context $(kubectl config current-context) --namespace=seldon
[ ]:
import json
import time

Seldon Protocol REST Model

We will deploy a model with a REST endpoint that uses the Seldon protocol, by specifying the attributes protocol: seldon and transport: rest.

[ ]:
%%writefile resources/model_seldon_rest.yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: rest-seldon
spec:
  name: restseldon
  protocol: seldon
  transport: rest
  predictors:
  - componentSpecs:
    - spec:
        containers:
        - image: seldonio/mock_classifier_rest:1.3
          name: classifier
    graph:
      name: classifier
      type: MODEL
    name: model
    replicas: 1
[ ]:
!kubectl apply -f resources/model_seldon_rest.yaml
[ ]:
!kubectl rollout status deploy/$(kubectl get deploy -l seldon-deployment-id=rest-seldon -o jsonpath='{.items[0].metadata.name}')
[ ]:
for i in range(60):
    state=!kubectl get sdep rest-seldon -o jsonpath='{.status.state}'
    state=state[0]
    print(state)
    if state=="Available":
        break
    time.sleep(1)
assert(state=="Available")

We can now send requests using the Seldon protocol format.

[ ]:
X=!curl -s -d '{"data": {"ndarray":[[1.0, 2.0, 5.0]]}}' \
   -X POST http://localhost:8003/seldon/seldon/rest-seldon/api/v1.0/predictions \
   -H "Content-Type: application/json"
d=json.loads(X[0])
print(d)
assert(d["data"]["ndarray"][0][0] > 0.4)
[ ]:
!kubectl delete -f resources/model_seldon_rest.yaml

Seldon Protocol GRPC Model

We will deploy a model with a gRPC endpoint that uses the Seldon protocol, by specifying the attributes protocol: seldon and transport: grpc.

[ ]:
%%writefile resources/model_seldon_grpc.yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: grpc-seldon
spec:
  name: grpcseldon
  protocol: seldon
  transport: grpc
  predictors:
  - componentSpecs:
    - spec:
        containers:
        - image: seldonio/mock_classifier_grpc:1.3
          name: classifier
    graph:
      name: classifier
      type: MODEL
      endpoint:
        type: GRPC
    name: model
    replicas: 1
[ ]:
!kubectl apply -f resources/model_seldon_grpc.yaml
[ ]:
!kubectl rollout status deploy/$(kubectl get deploy -l seldon-deployment-id=grpc-seldon \
                                 -o jsonpath='{.items[0].metadata.name}')
[ ]:
for i in range(60):
    state=!kubectl get sdep grpc-seldon -o jsonpath='{.status.state}'
    state=state[0]
    print(state)
    if state=="Available":
        break
    time.sleep(1)
assert(state=="Available")

We can now send gRPC requests using grpcurl, leveraging the Seldon protocol.

[ ]:
X=!cd ../executor/proto && grpcurl -d '{"data":{"ndarray":[[1.0,2.0,5.0]]}}' \
         -rpc-header seldon:grpc-seldon -rpc-header namespace:seldon \
         -plaintext \
         -proto ./prediction.proto  0.0.0.0:8003 seldon.protos.Seldon/Predict
d=json.loads("".join(X))
print(d)
assert(d["data"]["ndarray"][0][0] > 0.4)
[ ]:
!kubectl delete -f resources/model_seldon_grpc.yaml

Tensorflow Protocol REST Model

We will deploy a model with a REST endpoint that uses the TensorFlow protocol, by specifying the attribute protocol: tensorflow.

[ ]:
%%writefile resources/model_tfserving_rest.yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: rest-tfserving
spec:
  name: resttfserving
  protocol: tensorflow
  transport: rest
  predictors:
  - componentSpecs:
    - spec:
        containers:
        - args:
          - --port=8500
          - --rest_api_port=8501
          - --model_name=halfplustwo
          - --model_base_path=gs://seldon-models/tfserving/half_plus_two
          image: tensorflow/serving
          name: halfplustwo
          ports:
          - containerPort: 8501
            name: http
    graph:
      name: halfplustwo
      type: MODEL
      endpoint:
        service_port: 8501
    name: model
    replicas: 1
[ ]:
!kubectl apply -f resources/model_tfserving_rest.yaml
[ ]:
!kubectl rollout status deploy/$(kubectl get deploy -l seldon-deployment-id=rest-tfserving \
                                 -o jsonpath='{.items[0].metadata.name}')
[ ]:
for i in range(60):
    state=!kubectl get sdep rest-tfserving -o jsonpath='{.status.state}'
    state=state[0]
    print(state)
    if state=="Available":
        break
    time.sleep(1)
assert(state=="Available")

We can now send requests using the TensorFlow REST protocol format.

[ ]:
X=!curl -s -d '{"instances": [1.0, 2.0, 5.0]}' \
   -X POST http://localhost:8003/seldon/seldon/rest-tfserving/v1/models/halfplustwo/:predict \
   -H "Content-Type: application/json"
d=json.loads("".join(X))
print(d)
assert(d["predictions"][0] == 2.5)
[ ]:
!kubectl delete -f resources/model_tfserving_rest.yaml

Tensorflow Protocol GRPC Model

We will deploy a model with a gRPC endpoint that uses the TensorFlow protocol, by specifying the attribute protocol: tensorflow.

[ ]:
%%writefile resources/model_tfserving_grpc.yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: grpc-tfserving
spec:
  name: grpctfserving
  protocol: tensorflow
  transport: grpc
  predictors:
  - componentSpecs:
    - spec:
        containers:
        - args:
          - --port=8500
          - --rest_api_port=8501
          - --model_name=halfplustwo
          - --model_base_path=gs://seldon-models/tfserving/half_plus_two
          image: tensorflow/serving
          name: halfplustwo
          ports:
          - containerPort: 8500
            name: grpc
    graph:
      name: halfplustwo
      type: MODEL
      endpoint:
        service_port: 8500
        type: GRPC
    name: model
    replicas: 1
[ ]:
!kubectl apply -f resources/model_tfserving_grpc.yaml
[ ]:
!kubectl rollout status deploy/$(kubectl get deploy -l seldon-deployment-id=grpc-tfserving \
                                 -o jsonpath='{.items[0].metadata.name}')
[ ]:
for i in range(60):
    state=!kubectl get sdep grpc-tfserving -o jsonpath='{.status.state}'
    state=state[0]
    print(state)
    if state=="Available":
        break
    time.sleep(1)
assert(state=="Available")

We can now send requests using the TensorFlow gRPC protocol format.

[ ]:
X=!cd ../executor/proto && grpcurl \
   -d '{"model_spec":{"name":"halfplustwo"},"inputs":{"x":{"dtype": 1, "tensor_shape": {"dim":[{"size": 3}]}, "floatVal" : [1.0, 2.0, 3.0]}}}' \
   -rpc-header seldon:grpc-tfserving -rpc-header namespace:seldon \
   -plaintext -proto ./prediction_service.proto \
   0.0.0.0:8003 tensorflow.serving.PredictionService/Predict
d=json.loads("".join(X))
print(d)
assert(d["outputs"]["x"]["floatVal"][0] == 2.5)
[ ]:
!kubectl delete -f resources/model_tfserving_grpc.yaml