Autoscaling Seldon Deployments¶

Prerequisites¶

Setup Seldon Core¶

Use the setup notebook to set up a cluster with Ambassador Ingress and install Seldon Core. The instructions are also available online.

[1]:
!kubectl create namespace seldon
Error from server (AlreadyExists): namespaces "seldon" already exists
[2]:
!kubectl config set-context $(kubectl config current-context) --namespace=seldon
Context "kind-ansible" modified.

Create model with v2beta1 autoscaler¶

To create a model with a HorizontalPodAutoscaler there are two steps:

  1. Ensure you have a resource request for the metric you want to scale on if it is a standard metric such as cpu or memory, e.g.:

resources:
  requests:
    cpu: '0.5'
  2. Add a v2beta1 HPA spec referring to this Deployment, e.g.:

- hpaSpec:
    maxReplicas: 3
    minReplicas: 1
    metrics:
    - resource:
        name: cpu
        targetAverageUtilization: 10
      type: Resource
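
As a rough illustration of how the two settings above interact, the sketch below applies the replica formula documented for the Kubernetes HorizontalPodAutoscaler, desired = ceil(current * currentUtilization / targetUtilization), clamped to minReplicas and maxReplicas. CPU utilization is measured relative to the container's CPU request, which is why the request is needed. The function names and the 300m usage figure are hypothetical, and the real controller also considers pod readiness, missing metrics and stabilization, which this sketch ignores.

```python
import math


def cpu_to_millicores(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity such as '0.5' or '500m' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas, tolerance=0.1):
    """Simplified HPA calculation (assumption: the default 10% tolerance)."""
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Never go below minReplicas or above maxReplicas.
    return max(min_replicas, min(max_replicas, desired))


request_m = cpu_to_millicores("0.5")      # the request from the spec: 500m
usage_m = 300                             # hypothetical measured usage per pod
utilization = 100 * usage_m / request_m   # 60.0 percent of the request
print(desired_replicas(1, utilization, 10, min_replicas=1, max_replicas=3))
```

With a 10% target and a 500m request, a single pod only has to use more than about 55m of CPU before the autoscaler asks for more replicas, so even a light load is enough to reach maxReplicas.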

The full SeldonDeployment spec is shown below.

[5]:
!pygmentize model_with_hpa_v2beta1.yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: seldon-model
spec:
  name: test-deployment
  predictors:
  - componentSpecs:
    - hpaSpec:
        maxReplicas: 3
        metrics:
        - resource:
            name: cpu
            targetAverageUtilization: 10
          type: Resource
        minReplicas: 1
      spec:
        containers:
        - image: seldonio/mock_classifier:1.5.0-dev
          imagePullPolicy: IfNotPresent
          name: classifier
          resources:
            requests:
              cpu: '0.5'
        terminationGracePeriodSeconds: 1
    graph:
      children: []
      name: classifier
      type: MODEL
    name: example
[6]:
!kubectl create -f model_with_hpa_v2beta1.yaml
seldondeployment.machinelearning.seldon.io/seldon-model created
[7]:
!kubectl rollout status deploy/$(kubectl get deploy -l seldon-deployment-id=seldon-model -o jsonpath='{.items[0].metadata.name}')
Waiting for deployment "seldon-model-example-0-classifier" rollout to finish: 0 of 1 updated replicas are available...
deployment "seldon-model-example-0-classifier" successfully rolled out

Create Load¶

We label a node with role=locust for the load tester. On Kind, the first node listed is the control-plane node. A "not labeled" response means the node already carries the label.

[9]:
!kubectl label nodes $(kubectl get nodes -o jsonpath='{.items[0].metadata.name}') role=locust
node/ansible-control-plane not labeled
[10]:
!helm install loadtester ../../../helm-charts/seldon-core-loadtesting  \
    --set locust.host=http://seldon-model-example:8000 \
    --set oauth.enabled=false \
    --set locust.hatchRate=1 \
    --set locust.clients=1 \
    --set loadtest.sendFeedback=0 \
    --set locust.minWait=0 \
    --set locust.maxWait=0 \
    --set replicaCount=1
NAME: loadtester
LAST DEPLOYED: Sat Mar  4 09:13:46 2023
NAMESPACE: seldon
STATUS: deployed
REVISION: 1
TEST SUITE: None

After a few minutes you should see the deployment seldon-model-example-0-classifier scaled to 3 replicas.

[11]:
import json
import time


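def getNumberPods():
    # Fetch the Deployment as JSON using IPython's "!" shell capture.
    dp = !kubectl get deployment seldon-model-example-0-classifier -o json
    dp = json.loads("".join(dp))
    # status.replicas is the number of pods the Deployment currently manages.
    return dp["status"]["replicas"]


# Poll every 5 seconds, for up to 5 minutes, until the HPA has scaled up.
scaled = False
for i in range(60):
    pods = getNumberPods()
    print(pods)
    if pods > 1:
        scaled = True
        break
    time.sleep(5)
assert scaled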
3
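
The polling loop above relies on IPython's "!" syntax, so it only runs inside a notebook. A minimal sketch of the same wait logic as a plain Python helper is shown here. The helper name `wait_for_scale_up` and its parameters are hypothetical, and the replica source is passed in as a function so that the example runs without a cluster. In practice that function would wrap the `kubectl get deployment ... -o json` call used above.

```python
import time
from typing import Callable


def wait_for_scale_up(get_replicas: Callable[[], int], threshold: int = 1,
                      attempts: int = 60, delay_seconds: float = 5.0) -> int:
    """Poll get_replicas until it reports more than `threshold` replicas."""
    for _ in range(attempts):
        replicas = get_replicas()
        if replicas > threshold:
            return replicas
        time.sleep(delay_seconds)
    raise TimeoutError(f"still at or below {threshold} replica(s) after {attempts} checks")


# Stand-in for the cluster: two readings of 1 replica, then 3.
readings = iter([1, 1, 3])
print(wait_for_scale_up(lambda: next(readings), delay_seconds=0))
```

Raising `TimeoutError` instead of asserting on a flag makes the failure reason explicit when the deployment never scales.
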
[12]:
!kubectl get pods,deployments,hpa
NAME                                                     READY   STATUS    RESTARTS   AGE
pod/locust-master-1-xjplw                                1/1     Running   0          85s
pod/locust-slave-1-gljjf                                 1/1     Running   0          85s
pod/seldon-model-example-0-classifier-795b9cc8b6-7jfgp   0/2     Running   0          15s
pod/seldon-model-example-0-classifier-795b9cc8b6-bqwg9   2/2     Running   0          80m
pod/seldon-model-example-0-classifier-795b9cc8b6-fms5f   0/2     Running   0          15s

NAME                                                READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/seldon-model-example-0-classifier   1/3     3            1           80m

NAME                                                                    REFERENCE                                      TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/seldon-model-example-0-classifier   Deployment/seldon-model-example-0-classifier   60%/10%   1         3         1          80m
[13]:
!helm delete loadtester -n seldon
release "loadtester" uninstalled
[14]:
!kubectl delete -f model_with_hpa_v2beta1.yaml
seldondeployment.machinelearning.seldon.io "seldon-model" deleted

Create model with v2 autoscaler¶

To create a model with a HorizontalPodAutoscaler there are two steps:

  1. Ensure you have a resource request for the metric you want to scale on if it is a standard metric such as cpu or memory, e.g.:

resources:
  requests:
    cpu: '0.5'
  2. Add a v2 HPA spec, under metricsv2, referring to this Deployment, e.g.:

- hpaSpec:
    maxReplicas: 3
    minReplicas: 1
    metricsv2:
    - resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 10
      type: Resource
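
The only difference from the v2beta1 example is the shape of the metric entry: the flat `targetAverageUtilization` field becomes a nested `target` block, and the list is given under `metricsv2` instead of `metrics`. The sketch below shows that mapping on plain Python dictionaries. The function `to_v2_metric` is a hypothetical helper written for illustration, not part of Seldon Core, and it handles Resource metrics only.

```python
def to_v2_metric(metric: dict) -> dict:
    """Convert a v2beta1-style Resource metric entry to the v2 layout."""
    if metric.get("type") != "Resource":
        raise ValueError("this sketch only handles Resource metrics")
    resource = metric["resource"]
    if "targetAverageUtilization" in resource:
        target = {
            "type": "Utilization",
            "averageUtilization": resource["targetAverageUtilization"],
        }
    elif "targetAverageValue" in resource:
        target = {
            "type": "AverageValue",
            "averageValue": resource["targetAverageValue"],
        }
    else:
        raise ValueError("metric has no target")
    return {"type": "Resource", "resource": {"name": resource["name"], "target": target}}


v2beta1_metric = {
    "type": "Resource",
    "resource": {"name": "cpu", "targetAverageUtilization": 10},
}
print(to_v2_metric(v2beta1_metric))
```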

The full SeldonDeployment spec is shown below.

[16]:
!pygmentize model_with_hpa_v2.yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: seldon-model
spec:
  name: test-deployment
  predictors:
  - componentSpecs:
    - hpaSpec:
        maxReplicas: 3
        metricsv2:
        - resource:
            name: cpu
            target:
              type: Utilization
              averageUtilization: 10
          type: Resource
        minReplicas: 1
      spec:
        containers:
        - image: seldonio/mock_classifier:1.5.0-dev
          imagePullPolicy: IfNotPresent
          name: classifier
          resources:
            requests:
              cpu: '0.5'
        terminationGracePeriodSeconds: 1
    graph:
      children: []
      name: classifier
      type: MODEL
    name: example
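
Step 1 is easy to forget, so a quick local check of the manifest before creating it can help. The sketch below walks a SeldonDeployment held as a Python dictionary and reports containers that lack a request for a resource named in `metrics` or `metricsv2`. The function `missing_requests` is a hypothetical helper, and the dictionary is a hand-written, trimmed copy of the spec above rather than the parsed file.

```python
def missing_requests(seldon_deployment: dict) -> list:
    """Return (predictor, container, resource) tuples with no matching request."""
    problems = []
    for predictor in seldon_deployment["spec"]["predictors"]:
        for component in predictor.get("componentSpecs", []):
            hpa = component.get("hpaSpec") or {}
            # Accept both the v2beta1 ("metrics") and v2 ("metricsv2") fields.
            metrics = hpa.get("metrics", []) + hpa.get("metricsv2", [])
            wanted = {m["resource"]["name"] for m in metrics if m.get("type") == "Resource"}
            for container in component["spec"]["containers"]:
                requests = container.get("resources", {}).get("requests", {})
                for name in sorted(wanted - set(requests)):
                    problems.append((predictor["name"], container["name"], name))
    return problems


manifest = {
    "spec": {
        "predictors": [{
            "name": "example",
            "componentSpecs": [{
                "hpaSpec": {
                    "minReplicas": 1,
                    "maxReplicas": 3,
                    "metricsv2": [{
                        "type": "Resource",
                        "resource": {
                            "name": "cpu",
                            "target": {"type": "Utilization", "averageUtilization": 10},
                        },
                    }],
                },
                "spec": {"containers": [{
                    "name": "classifier",
                    "resources": {"requests": {"cpu": "0.5"}},
                }]},
            }],
        }],
    },
}
print(missing_requests(manifest))
```

An empty list means every container has a request for each resource the autoscaler scales on.
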
[17]:
!kubectl create -f model_with_hpa_v2.yaml
seldondeployment.machinelearning.seldon.io/seldon-model created
[18]:
!kubectl rollout status deploy/$(kubectl get deploy -l seldon-deployment-id=seldon-model -o jsonpath='{.items[0].metadata.name}')
Waiting for deployment "seldon-model-example-0-classifier" rollout to finish: 0 of 1 updated replicas are available...
deployment "seldon-model-example-0-classifier" successfully rolled out

Create Load¶

We label a node with role=locust for the load tester. On Kind, the first node listed is the control-plane node. A "not labeled" response means the node already carries the label.

[19]:
!kubectl label nodes $(kubectl get nodes -o jsonpath='{.items[0].metadata.name}') role=locust
node/ansible-control-plane not labeled
[20]:
!helm install loadtester ../../../helm-charts/seldon-core-loadtesting  \
    --set locust.host=http://seldon-model-example:8000 \
    --set oauth.enabled=false \
    --set locust.hatchRate=1 \
    --set locust.clients=1 \
    --set loadtest.sendFeedback=0 \
    --set locust.minWait=0 \
    --set locust.maxWait=0 \
    --set replicaCount=1
NAME: loadtester
LAST DEPLOYED: Sat Mar  4 09:20:04 2023
NAMESPACE: seldon
STATUS: deployed
REVISION: 1
TEST SUITE: None

After a few minutes you should see the deployment seldon-model-example-0-classifier scaled to 3 replicas.

[21]:
import json
import time


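def getNumberPods():
    # Fetch the Deployment as JSON using IPython's "!" shell capture.
    dp = !kubectl get deployment seldon-model-example-0-classifier -o json
    dp = json.loads("".join(dp))
    # status.replicas is the number of pods the Deployment currently manages.
    return dp["status"]["replicas"]


# Poll every 5 seconds, for up to 5 minutes, until the HPA has scaled up.
scaled = False
for i in range(60):
    pods = getNumberPods()
    print(pods)
    if pods > 1:
        scaled = True
        break
    time.sleep(5)
assert scaled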
1
1
1
1
1
1
3
[22]:
!kubectl get pods,deployments,hpa
NAME                                                     READY   STATUS    RESTARTS   AGE
pod/locust-master-1-qhvt6                                1/1     Running   0          11m
pod/locust-slave-1-gnz8h                                 1/1     Running   0          11m
pod/seldon-model-example-0-classifier-5f6445c99c-6t42q   2/2     Running   0          10m
pod/seldon-model-example-0-classifier-5f6445c99c-fqfd9   2/2     Running   0          10m
pod/seldon-model-example-0-classifier-5f6445c99c-s4wrv   2/2     Running   0          11m

NAME                                                READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/seldon-model-example-0-classifier   3/3     3            3           11m

NAME                                                                    REFERENCE                                      TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/seldon-model-example-0-classifier   Deployment/seldon-model-example-0-classifier   21%/10%   1         3         3          11m
[23]:
!helm delete loadtester -n seldon
release "loadtester" uninstalled
[24]:
!kubectl delete -f model_with_hpa_v2.yaml
seldondeployment.machinelearning.seldon.io "seldon-model" deleted