This page was generated from examples/models/autoscaling/autoscaling_example.ipynb.

Autoscaling Seldon Deployments¶

Prerequisites¶

The cluster should have metrics-server running in the kube-system namespace.
For Kind, install ../../testing/scripts/metrics.yaml (see https://github.com/kubernetes-sigs/kind/issues/398).
For Minikube, run:
```
minikube addons enable metrics-server
```

Setup Seldon Core¶

Use the setup notebook to set up a cluster with Ambassador Ingress and install Seldon Core. Instructions are also available online.

[1]:

!kubectl create namespace seldon

Error from server (AlreadyExists): namespaces "seldon" already exists

[2]:

!kubectl config set-context $(kubectl config current-context) --namespace=seldon

Context "kind-ansible" modified.

Create model with v2beta1 autoscaler¶

To create a model with a HorizontalPodAutoscaler there are two steps:

If you want to scale on a standard metric such as cpu or memory, ensure the container has a resource request for that metric, e.g.:

resources:
  requests:
    cpu: '0.5'

Add a v2beta1 HPA spec referring to this Deployment, e.g.:

- hpaSpec:
    maxReplicas: 3
    minReplicas: 1
    metrics:
    - resource:
        name: cpu
        targetAverageUtilization: 10
      type: Resource
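
A utilization target is expressed as a percentage of the requested resource, so the two fragments above work together. The following is a minimal sketch of that arithmetic, not part of Seldon Core; the helper names `cpu_to_millicores` and `target_millicores_per_pod` are hypothetical, and it considers only the classifier container's request (any other container in the pod with a CPU request would also count toward the pod's utilization).

```python
from decimal import Decimal


def cpu_to_millicores(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity such as '0.5' or '500m' to millicores."""
    quantity = quantity.strip()
    if quantity.endswith("m"):
        return int(quantity[:-1])
    # Decimal avoids float rounding for values like '0.1'
    return int(Decimal(quantity) * 1000)


def target_millicores_per_pod(cpu_request: str, utilization_percent: int) -> float:
    """Average CPU usage per pod that corresponds to the utilization target."""
    return cpu_to_millicores(cpu_request) * utilization_percent / 100


print(cpu_to_millicores("0.5"))              # 500
print(target_millicores_per_pod("0.5", 10))  # 50.0
```

With a request of '0.5' and a target of 10, the autoscaler aims for an average of about 50 millicores per pod, which is deliberately low so that a small load test triggers scaling.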

The full SeldonDeployment spec is shown below.

[5]:

!pygmentize model_with_hpa_v2beta1.yaml

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: seldon-model
spec:
  name: test-deployment
  predictors:
  - componentSpecs:
    - hpaSpec:
        maxReplicas: 3
        metrics:
        - resource:
            name: cpu
            targetAverageUtilization: 10
          type: Resource
        minReplicas: 1
      spec:
        containers:
        - image: seldonio/mock_classifier:1.5.0-dev
          imagePullPolicy: IfNotPresent
          name: classifier
          resources:
            requests:
              cpu: '0.5'
        terminationGracePeriodSeconds: 1
    graph:
      children: []
      name: classifier
      type: MODEL
    name: example

[6]:

!kubectl create -f model_with_hpa_v2beta1.yaml

seldondeployment.machinelearning.seldon.io/seldon-model created

[7]:

!kubectl rollout status deploy/$(kubectl get deploy -l seldon-deployment-id=seldon-model -o jsonpath='{.items[0].metadata.name}')

Waiting for deployment "seldon-model-example-0-classifier" rollout to finish: 0 of 1 updated replicas are available...
deployment "seldon-model-example-0-classifier" successfully rolled out

Create Load¶

We label nodes with role=locust for the load tester. The notebook attempts the first two nodes because on Kind the first node listed is the control-plane node; the command below labels the first one.

[9]:

!kubectl label nodes $(kubectl get nodes -o jsonpath='{.items[0].metadata.name}') role=locust

node/ansible-control-plane not labeled

[10]:

!helm install loadtester ../../../helm-charts/seldon-core-loadtesting  \
    --set locust.host=http://seldon-model-example:8000 \
    --set oauth.enabled=false \
    --set locust.hatchRate=1 \
    --set locust.clients=1 \
    --set loadtest.sendFeedback=0 \
    --set locust.minWait=0 \
    --set locust.maxWait=0 \
    --set replicaCount=1

NAME: loadtester
LAST DEPLOYED: Sat Mar  4 09:13:46 2023
NAMESPACE: seldon
STATUS: deployed
REVISION: 1
TEST SUITE: None

After a few minutes you should see the deployment seldon-model-example-0-classifier scaled to 3 replicas.

[11]:

import json
import time


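def getNumberPods():
    # The IPython "!" magic returns the command output as a list of lines
    dp = !kubectl get deployment seldon-model-example-0-classifier -o json
    dp = json.loads("".join(dp))
    return dp["status"]["replicas"]


# Poll every 5 seconds (60 attempts) until the HPA has added at least one replica
scaled = False
for i in range(60):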
    pods = getNumberPods()
    print(pods)
    if pods > 1:
        scaled = True
        break
    time.sleep(5)
assert scaled
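
The cell above relies on the IPython `!` shell magic, which is only available inside a notebook. As a rough sketch of the same polling loop for a plain Python script, the version below uses `subprocess` instead. The function names and the `run` and `sleep` parameters are assumptions introduced here so that the command runner can be swapped out for testing; they are not part of the original notebook.

```python
import json
import subprocess
import time


def get_replicas(deployment, run=subprocess.check_output):
    """Return the replica count reported in the Deployment status."""
    out = run(["kubectl", "get", "deployment", deployment, "-o", "json"])
    status = json.loads(out).get("status", {})
    # status.replicas may be absent when the Deployment has no pods yet
    return status.get("replicas", 0)


def wait_for_scale_up(deployment, min_replicas=2, attempts=60, interval=5.0,
                      run=subprocess.check_output, sleep=time.sleep):
    """Poll until the Deployment reports at least min_replicas replicas."""
    for _ in range(attempts):
        if get_replicas(deployment, run=run) >= min_replicas:
            return True
        sleep(interval)
    return False
```

Calling `wait_for_scale_up("seldon-model-example-0-classifier")` against a cluster with `kubectl` configured should mirror the notebook cell: it returns `True` once more than one replica is reported and `False` if the attempts run out.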

[12]:

!kubectl get pods,deployments,hpa

NAME                                                     READY   STATUS    RESTARTS   AGE
pod/locust-master-1-xjplw                                1/1     Running   0          85s
pod/locust-slave-1-gljjf                                 1/1     Running   0          85s
pod/seldon-model-example-0-classifier-795b9cc8b6-7jfgp   0/2     Running   0          15s
pod/seldon-model-example-0-classifier-795b9cc8b6-bqwg9   2/2     Running   0          80m
pod/seldon-model-example-0-classifier-795b9cc8b6-fms5f   0/2     Running   0          15s

NAME                                                READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/seldon-model-example-0-classifier   1/3     3            1           80m

NAME                                                                    REFERENCE                                      TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/seldon-model-example-0-classifier   Deployment/seldon-model-example-0-classifier   60%/10%   1         3         1          80m
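
The TARGETS column reads 60%/10%: current average utilization against the target. The Kubernetes documentation describes the core scaling rule as desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), bounded by minReplicas and maxReplicas. The sketch below is a simplified illustration of that rule only; the function name is hypothetical, the 0.1 tolerance is the documented controller default, and the real controller additionally accounts for unready pods, missing metrics, and stabilization windows.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas, tolerance=0.1):
    """Simplified HPA rule: scale by the usage/target ratio, then clamp."""
    ratio = current_utilization / target_utilization
    # Within the tolerance band around the target, the replica count is left alone
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


print(desired_replicas(1, 60, 10, 1, 3))    # 3
print(desired_replicas(2, 10.5, 10, 1, 3))  # 2
```

For the output above, one replica at 60% against a 10% target gives ceil(1 * 6) = 6, which is capped at maxReplicas, so the Deployment is scaled to 3.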

[13]:

!helm delete loadtester -n seldon

release "loadtester" uninstalled

[14]:

!kubectl delete -f model_with_hpa_v2beta1.yaml

seldondeployment.machinelearning.seldon.io "seldon-model" deleted

Create model with v2 autoscaler¶

To create a model with a HorizontalPodAutoscaler there are two steps:

If you want to scale on a standard metric such as cpu or memory, ensure the container has a resource request for that metric, e.g.:

resources:
  requests:
    cpu: '0.5'

Add a v2 HPA spec referring to this Deployment, using the metricsv2 field, e.g.:

- hpaSpec:
    maxReplicas: 3
    minReplicas: 1
    metricsv2:
    - resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 10
      type: Resource
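
The only difference from the v2beta1 example is the shape of the metric target: the flat `targetAverageUtilization` field becomes a nested `target` block under `metricsv2`. As an illustration of that mapping, here is a small sketch that rewrites resource metrics from one layout to the other. The function `to_metrics_v2` is a hypothetical helper written for this page, not a Seldon Core API, and it handles only `Resource` metrics.

```python
def to_metrics_v2(metrics):
    """Translate v2beta1-style resource metrics into the v2 layout."""
    converted = []
    for metric in metrics:
        if metric.get("type") != "Resource":
            raise ValueError(f"unsupported metric type: {metric.get('type')}")
        resource = metric["resource"]
        if "targetAverageUtilization" in resource:
            target = {"type": "Utilization",
                      "averageUtilization": resource["targetAverageUtilization"]}
        elif "targetAverageValue" in resource:
            target = {"type": "AverageValue",
                      "averageValue": resource["targetAverageValue"]}
        else:
            raise ValueError("resource metric has no target")
        converted.append({"type": "Resource",
                          "resource": {"name": resource["name"], "target": target}})
    return converted


v2beta1 = [{"type": "Resource",
            "resource": {"name": "cpu", "targetAverageUtilization": 10}}]
print(to_metrics_v2(v2beta1))
```

The printed structure corresponds to the `metricsv2` entry shown above.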

The full SeldonDeployment spec is shown below.

[16]:

!pygmentize model_with_hpa_v2.yaml

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: seldon-model
spec:
  name: test-deployment
  predictors:
  - componentSpecs:
    - hpaSpec:
        maxReplicas: 3
        metricsv2:
        - resource:
            name: cpu
            target:
              type: Utilization
              averageUtilization: 10
          type: Resource
        minReplicas: 1
      spec:
        containers:
        - image: seldonio/mock_classifier:1.5.0-dev
          imagePullPolicy: IfNotPresent
          name: classifier
          resources:
            requests:
              cpu: '0.5'
        terminationGracePeriodSeconds: 1
    graph:
      children: []
      name: classifier
      type: MODEL
    name: example

[17]:

!kubectl create -f model_with_hpa_v2.yaml

seldondeployment.machinelearning.seldon.io/seldon-model created

[18]:

!kubectl rollout status deploy/$(kubectl get deploy -l seldon-deployment-id=seldon-model -o jsonpath='{.items[0].metadata.name}')

Waiting for deployment "seldon-model-example-0-classifier" rollout to finish: 0 of 1 updated replicas are available...
deployment "seldon-model-example-0-classifier" successfully rolled out

Create Load¶

We label nodes with role=locust for the load tester. The notebook attempts the first two nodes because on Kind the first node listed is the control-plane node; the command below labels the first one.

[19]:

!kubectl label nodes $(kubectl get nodes -o jsonpath='{.items[0].metadata.name}') role=locust

node/ansible-control-plane not labeled

[20]:

!helm install loadtester ../../../helm-charts/seldon-core-loadtesting  \
    --set locust.host=http://seldon-model-example:8000 \
    --set oauth.enabled=false \
    --set locust.hatchRate=1 \
    --set locust.clients=1 \
    --set loadtest.sendFeedback=0 \
    --set locust.minWait=0 \
    --set locust.maxWait=0 \
    --set replicaCount=1

NAME: loadtester
LAST DEPLOYED: Sat Mar  4 09:20:04 2023
NAMESPACE: seldon
STATUS: deployed
REVISION: 1
TEST SUITE: None

After a few minutes you should see the deployment seldon-model-example-0-classifier scaled to 3 replicas.

[21]:

import json
import time


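def getNumberPods():
    # The IPython "!" magic returns the command output as a list of lines
    dp = !kubectl get deployment seldon-model-example-0-classifier -o json
    dp = json.loads("".join(dp))
    return dp["status"]["replicas"]


# Poll every 5 seconds (60 attempts) until the HPA has added at least one replica
scaled = False
for i in range(60):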
    pods = getNumberPods()
    print(pods)
    if pods > 1:
        scaled = True
        break
    time.sleep(5)
assert scaled

[22]:

!kubectl get pods,deployments,hpa

NAME                                                     READY   STATUS    RESTARTS   AGE
pod/locust-master-1-qhvt6                                1/1     Running   0          11m
pod/locust-slave-1-gnz8h                                 1/1     Running   0          11m
pod/seldon-model-example-0-classifier-5f6445c99c-6t42q   2/2     Running   0          10m
pod/seldon-model-example-0-classifier-5f6445c99c-fqfd9   2/2     Running   0          10m
pod/seldon-model-example-0-classifier-5f6445c99c-s4wrv   2/2     Running   0          11m

NAME                                                READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/seldon-model-example-0-classifier   3/3     3            3           11m

NAME                                                                    REFERENCE                                      TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/seldon-model-example-0-classifier   Deployment/seldon-model-example-0-classifier   21%/10%   1         3         3          11m

[23]:

!helm delete loadtester -n seldon

release "loadtester" uninstalled

[24]:

!kubectl delete -f model_with_hpa_v2.yaml

seldondeployment.machinelearning.seldon.io "seldon-model" deleted