Autoscaling Seldon Deployments

Prerequisites

  • The cluster should have heapster and metrics-server running in the kube-system namespace

  • For Kind, install ../../testing/scripts/metrics.yaml. See https://github.com/kubernetes-sigs/kind/issues/398

  • For Minikube run:

    minikube addons enable metrics-server
    minikube addons enable heapster
    

Setup Seldon Core

Use the setup notebook to set up a cluster with Ambassador Ingress and install Seldon Core. The instructions are also available online.

[1]:
!kubectl create namespace seldon
Error from server (AlreadyExists): namespaces "seldon" already exists
[2]:
!kubectl config set-context $(kubectl config current-context) --namespace=seldon
Context "kind-kind" modified.

Create model with autoscaler

To create a model with a HorizontalPodAutoscaler there are two steps:

  1. Ensure you have a resource request for the metric you want to scale on if it is a standard metric such as cpu or memory, e.g.:

resources:
  requests:
    cpu: '0.5'
  2. Add an HPA spec referring to this Deployment, e.g.:

- hpaSpec:
    maxReplicas: 3
    metrics:
    - resource:
        name: cpu
        targetAverageUtilization: 10
      type: Resource
    minReplicas: 1
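
As a rough guide to what this spec asks for, the Kubernetes documentation describes the HPA's core calculation as desiredReplicas = ceil(currentReplicas * currentMetricValue / targetMetricValue), with CPU utilization measured as a percentage of the container's request. The Python sketch below is a simplified illustration of that formula, not the controller's implementation: the helper names are hypothetical, the 0.1 tolerance is the documented default, and details such as unready pods and missing metrics are ignored.

```python
import math


def cpu_to_millicores(quantity):
    """Convert a Kubernetes CPU quantity such as '0.5' or '500m' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas, tolerance=0.1):
    """Simplified HPA rule: scale by the usage ratio, then clamp to the bounds."""
    ratio = current_utilization / target_utilization
    # Within the tolerance band the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


request = cpu_to_millicores("0.5")
print(request)                            # 500
print(request * 10 // 100)                # 50 millicores per pod is the 10% target
print(desired_replicas(1, 29, 10, 1, 3))  # 3
```

With a request of 0.5 CPU and a 10% target, an average of more than about 50 millicores per pod is enough to trigger a scale-up, which is why light load suffices in this example.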

The full SeldonDeployment spec is shown below.

[3]:
!pygmentize model_with_hpa.yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: seldon-model
spec:
  name: test-deployment
  predictors:
  - componentSpecs:
    - hpaSpec:
        maxReplicas: 3
        metrics:
        - resource:
            name: cpu
            targetAverageUtilization: 10
          type: Resource
        minReplicas: 1
      spec:
        containers:
        - image: seldonio/mock_classifier:1.5.0-dev
          imagePullPolicy: IfNotPresent
          name: classifier
          resources:
            requests:
              cpu: '0.5'
        terminationGracePeriodSeconds: 1
    graph:
      children: []
      name: classifier
      type: MODEL
    name: example
[4]:
!kubectl create -f model_with_hpa.yaml
seldondeployment.machinelearning.seldon.io/seldon-model created
[5]:
!kubectl rollout status deploy/$(kubectl get deploy -l seldon-deployment-id=seldon-model -o jsonpath='{.items[0].metadata.name}')
Waiting for deployment "seldon-model-example-0-classifier" rollout to finish: 0 of 1 updated replicas are available...
deployment "seldon-model-example-0-classifier" successfully rolled out

Create Load

We label some nodes for the load tester. We try the first two nodes because, on Kind, the first node listed is the master. If a node already carries the label, kubectl reports an error that can be ignored.

[6]:
!kubectl label nodes $(kubectl get nodes -o jsonpath='{.items[0].metadata.name}') role=locust
!kubectl label nodes $(kubectl get nodes -o jsonpath='{.items[1].metadata.name}') role=locust
error: 'role' already has a value (locust), and --overwrite is false
error: 'role' already has a value (locust), and --overwrite is false
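
The errors above appear because the nodes were already labelled on an earlier run. If re-runnable labelling is preferred, kubectl label accepts --overwrite, as the error message itself hints. A minimal sketch, assuming kubectl is on the PATH; label_command and label_first_nodes are hypothetical helper names, and the node name in the usage line is only an example:

```python
import subprocess


def label_command(node, key="role", value="locust", overwrite=True):
    """Build the kubectl argument list that labels one node."""
    cmd = ["kubectl", "label", "nodes", node, f"{key}={value}"]
    if overwrite:
        # --overwrite replaces an existing value instead of failing.
        cmd.append("--overwrite")
    return cmd


def label_first_nodes(count=2):
    """Label the first `count` nodes of the cluster (needs a live cluster)."""
    names = subprocess.run(
        ["kubectl", "get", "nodes", "-o", "jsonpath={.items[*].metadata.name}"],
        check=True, capture_output=True, text=True,
    ).stdout.split()
    for node in names[:count]:
        subprocess.run(label_command(node), check=True)


# Example node name only; no cluster is contacted here.
print(label_command("kind-control-plane"))
```

Calling label_first_nodes() against a running cluster would then label the first two nodes without failing on repeat runs.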
[7]:
!helm install loadtester ../../../helm-charts/seldon-core-loadtesting  \
    --set locust.host=http://seldon-model-example:8000 \
    --set oauth.enabled=false \
    --set locust.hatchRate=1 \
    --set locust.clients=1 \
    --set loadtest.sendFeedback=0 \
    --set locust.minWait=0 \
    --set locust.maxWait=0 \
    --set replicaCount=1
NAME: loadtester
LAST DEPLOYED: Sun Nov  1 13:13:47 2020
NAMESPACE: seldon
STATUS: deployed
REVISION: 1
TEST SUITE: None

After a few minutes you should see the deployment seldon-model-example-0-classifier scaled up to 3 replicas.

[8]:
import json
import time

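def getNumberPods():
    # The IPython "!" capture returns the command output as a list of lines.
    dp=!kubectl get deployment seldon-model-example-0-classifier -o json
    dp=json.loads("".join(dp))
    return dp["status"]["replicas"]

# Poll every 5 seconds, for up to 60 attempts, until the HPA adds a replica.
scaled = False
for i in range(60):
    pods = getNumberPods()
    print(pods)
    if pods > 1:
        scaled = True
        break
    time.sleep(5)
assert scaled, "deployment did not scale beyond 1 replica in time"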
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
3
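
The loop above relies on IPython's "!" syntax, which is not available in a plain Python script. One possible equivalent using subprocess is sketched below; get_replicas and wait_for_scale_up are hypothetical names, the "seldon" namespace default is taken from the setup cells, and the final lines use a stand-in counter rather than a real cluster.

```python
import json
import subprocess
import time


def get_replicas(name, namespace="seldon"):
    """Return status.replicas for a deployment (needs kubectl and a live cluster)."""
    out = subprocess.run(
        ["kubectl", "get", "deployment", name, "-n", namespace, "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)["status"].get("replicas", 0)


def wait_for_scale_up(get_count, threshold=1, attempts=60, delay=5.0,
                      sleep=time.sleep):
    """Poll get_count() until it exceeds threshold; return False on timeout."""
    for _ in range(attempts):
        if get_count() > threshold:
            return True
        sleep(delay)
    return False


# Stand-in for the cluster: two polls report 1 replica, the third reports 3.
fake_counts = iter([1, 1, 3])
print(wait_for_scale_up(lambda: next(fake_counts), sleep=lambda s: None))  # True
```

Against a cluster, the call would be wait_for_scale_up(lambda: get_replicas("seldon-model-example-0-classifier")).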
[ ]:
!kubectl get pods,deployments,hpa

Remove Load

After 5-10 minutes you should see the deployment's replicas decrease to 1.
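
The delay is expected: by default the HPA controller applies a scale-down stabilization window (five minutes unless the cluster overrides --horizontal-pod-autoscaler-downscale-stabilization) and acts on the highest recommendation seen within it. The sketch below is a toy model of that rule, assuming one recommendation every 15 seconds; the class name and timings are illustrative and not taken from the controller's code.

```python
from collections import deque


class ScaleDownStabilizer:
    """Toy model: act on the highest recommendation inside a sliding window."""

    def __init__(self, window_seconds=300):
        self.window_seconds = window_seconds
        self._history = deque()  # (timestamp, recommended replicas)

    def recommend(self, now, desired):
        self._history.append((now, desired))
        # Drop recommendations that have aged out of the window.
        while self._history[0][0] < now - self.window_seconds:
            self._history.popleft()
        return max(replicas for _, replicas in self._history)


stabilizer = ScaleDownStabilizer(window_seconds=300)
stabilizer.recommend(0, 3)  # last recommendation made while under load
# After the load is removed, the raw recommendation drops to 1.
timeline = {t: stabilizer.recommend(t, 1) for t in range(15, 331, 15)}
print(timeline[300], timeline[315])  # 3 1
```

In this model the replica count only drops once the last high recommendation is older than the window, which accounts for most of the wait; the time metrics take to reflect the lower load adds to it.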

[9]:
!helm delete loadtester -n seldon
release "loadtester" uninstalled
[10]:
!kubectl get pods,deployments,hpa
NAME                                                     READY   STATUS    RESTARTS   AGE
pod/ambassador-6747c68887-2rddl                          1/1     Running   0          22h
pod/jaeger-5cb557b89d-khfb8                              1/1     Running   0          22h
pod/jaeger-operator-67777ffc99-m25fp                     1/1     Running   0          22h
pod/locust-master-1-6sbss                                1/1     Running   0          125m
pod/locust-slave-1-nlwgv                                 1/1     Running   0          125m
pod/seldon-model-example-0-classifier-7cf4bd7485-fvn7f   2/2     Running   0          126m
pod/seldon-model-example-0-classifier-7cf4bd7485-jlsjg   2/2     Running   0          124m
pod/seldon-model-example-0-classifier-7cf4bd7485-p9j4w   0/2     Pending   0          124m

NAME                                                READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/ambassador                          1/1     1            1           22h
deployment.apps/jaeger                              1/1     1            1           22h
deployment.apps/jaeger-operator                     1/1     1            1           22h
deployment.apps/seldon-model-example-0-classifier   2/3     3            2           126m

NAME                                                                    REFERENCE                                      TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/seldon-model-example-0-classifier   Deployment/seldon-model-example-0-classifier   29%/10%   1         3         3          126m
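
In the listing above the deployment reports 2/3 ready because one pod is still Pending, while the HPA shows 3 replicas. The polling cell earlier reads status.replicas, which also counts pods that are not yet ready. The sketch below pulls the related fields from a deployment's JSON; replica_summary is a hypothetical helper, and the sample document is hand-written to mirror the listing, not captured from the cluster.

```python
import json


def replica_summary(deployment_json):
    """Extract total, ready and available replica counts from deployment JSON."""
    status = json.loads(deployment_json).get("status", {})
    return {
        "replicas": status.get("replicas", 0),          # all pods, ready or not
        "ready": status.get("readyReplicas", 0),        # pods passing readiness
        "available": status.get("availableReplicas", 0),
    }


# Hand-written sample matching the 2/3 READY line in the listing.
sample = json.dumps({
    "status": {
        "replicas": 3,
        "updatedReplicas": 3,
        "readyReplicas": 2,
        "availableReplicas": 2,
        "unavailableReplicas": 1,
    }
})
print(replica_summary(sample))  # {'replicas': 3, 'ready': 2, 'available': 2}
```

Checking the "ready" count instead of "replicas" would be the stricter test if the goal is to confirm that the new pods are actually serving.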
[11]:
!kubectl delete -f model_with_hpa.yaml
seldondeployment.machinelearning.seldon.io "seldon-model" deleted