Shadow Rollout with Seldon and Ambassador

This notebook shows how to use “shadow” deployments to mirror traffic: requests are sent not only to the main Seldon Deployment but also to a shadow deployment, whose response is discarded. This lets you test new models against production traffic and analyse how they perform before putting them live.

This is useful when you want to test a new model or a higher-latency inference pipeline (e.g., one with explanation components) with production traffic, but without affecting the live deployment.

Setup Seldon Core

Use the setup notebook to Setup Cluster with Ambassador Ingress and Install Seldon Core. Instructions are also available online.

Set up Port Forward

Ensure you port-forward to Grafana:

kubectl port-forward $(kubectl get pods -n seldon -l app=grafana-prom-server -o jsonpath='{.items[0].metadata.name}') -n seldon 3000:3000

Launch main model

We will create a very simple Seldon Deployment with a dummy model image seldonio/mock_classifier:1.0. This deployment is named example.

[17]:
!pygmentize model.json
{
    "apiVersion": "machinelearning.seldon.io/v1alpha2",
    "kind": "SeldonDeployment",
    "metadata": {
        "labels": {
            "app": "seldon"
        },
        "name": "example"
    },
    "spec": {
        "name": "production-model",
        "predictors": [
            {
                "componentSpecs": [{
                    "spec": {
                        "containers": [
                            {
                                "image": "seldonio/mock_classifier:1.0",
                                "imagePullPolicy": "IfNotPresent",
                                "name": "classifier"
                            }
                        ],
                        "terminationGracePeriodSeconds": 1
                    }}
                                  ],
                "graph":
                {
                    "children": [],
                    "name": "classifier",
                    "type": "MODEL",
                    "endpoint": {
                        "type": "REST"
                    }},
                "name": "single",
                "replicas": 1
            }
        ]
    }
}
[8]:
!kubectl create -f model.json
seldondeployment.machinelearning.seldon.io/example created
[9]:
!kubectl rollout status deploy/production-model-single-7cd068f
Waiting for deployment "production-model-single-7cd068f" rollout to finish: 0 of 1 updated replicas are available...
deployment "production-model-single-7cd068f" successfully rolled out

Get predictions

[10]:
from seldon_core.seldon_client import SeldonClient
sc = SeldonClient(deployment_name="example",namespace="seldon")

REST Request

[11]:
r = sc.predict(gateway="ambassador",transport="rest")
print(r)
Success:True message:
Request:
data {
  tensor {
    shape: 1
    shape: 1
    values: 0.25103583044502875
  }
}

Response:
meta {
  puid: "j7mhk9clk2fbu8tdq3ka8m2274"
  requestPath {
    key: "classifier"
    value: "seldonio/mock_classifier:1.0"
  }
}
data {
  names: "proba"
  tensor {
    shape: 1
    shape: 1
    values: 0.06503212230125832
  }
}
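When no data argument is supplied, SeldonClient generates a random payload like the one above. As a minimal sketch, you can build the payload explicitly instead (the shape below simply mirrors the 1x1 tensor in the output; passing it via the `data` argument is how the client accepts explicit input):

```python
import numpy as np

# A batch of one sample with one feature, matching the (1, 1) tensor shape above.
batch = np.random.rand(1, 1)

print(batch.shape)

# The batch could then be sent explicitly, e.g.:
# r = sc.predict(gateway="ambassador", transport="rest", data=batch)
```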

gRPC Request

[12]:
r = sc.predict(gateway="ambassador",transport="grpc")
print(r)
Success:True message:
Request:
data {
  tensor {
    shape: 1
    shape: 1
    values: 0.6342191670710421
  }
}

Response:
meta {
  puid: "cb93sjuopd985fa8jde1bt023m"
  requestPath {
    key: "classifier"
    value: "seldonio/mock_classifier:1.0"
  }
}
data {
  names: "proba"
  tensor {
    shape: 1
    shape: 1
    values: 0.09258712191804268
  }
}

Launch Shadow

We will now create a new Seldon Deployment for our shadow with a new model image seldonio/mock_classifier_rest:1.1. To make it a shadow of the original example deployment we add two annotations:

"annotations": {
        "seldon.io/ambassador-service-name":"example",
        "seldon.io/ambassador-shadow":"true"
    },

The first tells Seldon to use example as the Ambassador service name rather than the default, which would be the deployment name - in this case example-shadow. This ensures the Ambassador mapping applies to the same prefix as the original deployment. The second states we want to use Ambassador’s shadow functionality.
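The shadow spec is essentially a copy of the production spec with those two annotations added and the names changed. A minimal sketch of deriving one from the other in Python (the `make_shadow` helper is an illustrative assumption, not part of seldon_core):

```python
import copy
import json

def make_shadow(spec, service_name="example"):
    """Return a copy of a SeldonDeployment spec turned into an Ambassador shadow."""
    shadow = copy.deepcopy(spec)
    # Give the shadow its own resource name and spec name.
    shadow["metadata"]["name"] = spec["metadata"]["name"] + "-shadow"
    shadow["spec"]["name"] = "shadow-model"
    # The two annotations that make this deployment a shadow of `service_name`.
    shadow["spec"]["annotations"] = {
        "seldon.io/ambassador-service-name": service_name,
        "seldon.io/ambassador-shadow": "true",
    }
    return shadow

# e.g.: json.dump(make_shadow(json.load(open("model.json"))), open("shadow.json", "w"))
```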

[13]:
!pygmentize shadow.json
{
    "apiVersion": "machinelearning.seldon.io/v1alpha2",
    "kind": "SeldonDeployment",
    "metadata": {
        "labels": {
            "app": "seldon"
        },
        "name": "example-shadow"
    },
    "spec": {
        "name": "shadow-model",
        "annotations": {
            "seldon.io/ambassador-service-name":"example",
            "seldon.io/ambassador-shadow":"true"
        },
        "predictors": [
            {
                "componentSpecs": [{
                    "spec": {
                        "containers": [
                            {
                                "image": "seldonio/mock_classifier_rest:1.1",
                                "imagePullPolicy": "IfNotPresent",
                                "name": "classifier"
                            }
                        ],
                        "terminationGracePeriodSeconds": 1
                    }}
                                  ],
                "graph":
                {
                    "children": [],
                    "name": "classifier",
                    "type": "MODEL",
                    "endpoint": {
                        "type": "REST"
                    }},
                "name": "single",
                "replicas": 1
            }
        ]
    }
}
[14]:
!kubectl create -f shadow.json
seldondeployment.machinelearning.seldon.io/example-shadow created
[15]:
!kubectl rollout status deploy/shadow-model-single-4c8805f
Waiting for deployment "shadow-model-single-4c8805f" rollout to finish: 0 of 1 updated replicas are available...
deployment "shadow-model-single-4c8805f" successfully rolled out

Let’s send a bunch of requests to the endpoint.

[16]:
for i in range(1000):
    r = sc.predict(gateway="ambassador",transport="rest")

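Alongside the Grafana dashboard, you can also record client-side latency while driving traffic and compare it before and after the shadow is in place; a sketch (the `summarise` helper is an illustrative assumption, not part of seldon_core):

```python
import time

def summarise(latencies):
    """Return count, mean and approximate 95th-percentile latency in seconds."""
    ordered = sorted(latencies)
    p95 = ordered[int(0.95 * (len(ordered) - 1))]
    return {"n": len(ordered), "mean": sum(ordered) / len(ordered), "p95": p95}

# Collect per-request wall-clock latency while sending traffic, e.g.:
# latencies = []
# for _ in range(1000):
#     start = time.time()
#     sc.predict(gateway="ambassador", transport="rest")
#     latencies.append(time.time() - start)
# print(summarise(latencies))
```

Because Ambassador discards the shadow's responses, the client-side numbers reflect only the production path; the shadow's own behaviour is best observed through its metrics and logs.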
Now view the analytics dashboard at http://localhost:3000. You should see a dashboard view like below, with the two models, production and shadow, both receiving requests.

(Grafana dashboard screenshot: production and shadow models receiving requests)
