This page was generated from examples/models/autoscaling/autoscaling_example.ipynb.
Autoscaling Seldon Deployments¶
Prerequisites¶
The cluster should have metrics-server running in the kube-system namespace.
For Kind, install ../../testing/scripts/metrics.yaml. See https://github.com/kubernetes-sigs/kind/issues/398
For Minikube, run:
minikube addons enable metrics-server
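One way to confirm this prerequisite from a notebook cell is to inspect the metrics-server Deployment. The sketch below is illustrative only: it assumes the Deployment is named metrics-server (the usual name, though an install may differ), and both helper names are hypothetical.

```python
import json
import subprocess


def deployment_is_ready(deployment_json: str) -> bool:
    """Return True if a Deployment JSON document reports at least one ready replica."""
    status = json.loads(deployment_json).get("status", {})
    return status.get("readyReplicas", 0) >= 1


def metrics_server_ready(name: str = "metrics-server",
                         namespace: str = "kube-system") -> bool:
    """Fetch the Deployment with kubectl and check its readiness.

    Assumption: the Deployment is called `metrics-server`; pass `name` if not.
    """
    result = subprocess.run(
        ["kubectl", "get", "deployment", name, "-n", namespace, "-o", "json"],
        capture_output=True,
        text=True,
    )
    # A non-zero exit code usually means the Deployment does not exist.
    return result.returncode == 0 and deployment_is_ready(result.stdout)
```

Calling `metrics_server_ready()` before creating the model gives an early signal if the HPA will have no CPU metrics to act on.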
Setup Seldon Core¶
Use the setup notebook to set up the cluster with Ambassador Ingress and install Seldon Core. The instructions are also available online.
[1]:
!kubectl create namespace seldon
Error from server (AlreadyExists): namespaces "seldon" already exists
[2]:
!kubectl config set-context $(kubectl config current-context) --namespace=seldon
Context "kind-ansible" modified.
Create model with v2beta1 autoscaler¶
To create a model with a HorizontalPodAutoscaler, follow these steps:
Ensure you have a resource request for the metric you want to scale on if it is a standard metric such as cpu or memory, e.g.:
resources:
  requests:
    cpu: '0.5'
Add a v2beta1 HPA spec referring to this Deployment, e.g.:
- hpaSpec:
    maxReplicas: 3
    minReplicas: 1
    metrics:
    - resource:
        name: cpu
        targetAverageUtilization: 10
      type: Resource
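To make the target concrete: a Resource metric expressed as an average utilization is measured as a percentage of the requested amount, which is why the resource request is needed. The sketch below works through the arithmetic for the values used here. The helper names are hypothetical, and it considers only this one container's request; if other containers in the pod also declare CPU requests, they contribute to the total as well.

```python
def cpu_to_millicores(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity such as '0.5' or '500m' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def target_usage_millicores(request: str, target_utilization_pct: int) -> float:
    """Average per-pod CPU usage (in millicores) that corresponds to the target."""
    return cpu_to_millicores(request) * target_utilization_pct / 100


print(cpu_to_millicores("0.5"))            # 500
print(target_usage_millicores("0.5", 10))  # 50.0
```

With a 0.5 CPU request and a 10% target, the autoscaler aims for roughly 50 millicores of average usage per pod, a low threshold that makes scaling easy to trigger in this example.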
The full SeldonDeployment spec is shown below.
[5]:
!pygmentize model_with_hpa_v2beta1.yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: seldon-model
spec:
  name: test-deployment
  predictors:
  - componentSpecs:
    - hpaSpec:
        maxReplicas: 3
        metrics:
        - resource:
            name: cpu
            targetAverageUtilization: 10
          type: Resource
        minReplicas: 1
      spec:
        containers:
        - image: seldonio/mock_classifier:1.5.0-dev
          imagePullPolicy: IfNotPresent
          name: classifier
          resources:
            requests:
              cpu: '0.5'
        terminationGracePeriodSeconds: 1
    graph:
      children: []
      name: classifier
      type: MODEL
    name: example
[6]:
!kubectl create -f model_with_hpa_v2beta1.yaml
seldondeployment.machinelearning.seldon.io/seldon-model created
[7]:
!kubectl rollout status deploy/$(kubectl get deploy -l seldon-deployment-id=seldon-model -o jsonpath='{.items[0].metadata.name}')
Waiting for deployment "seldon-model-example-0-classifier" rollout to finish: 0 of 1 updated replicas are available...
deployment "seldon-model-example-0-classifier" successfully rolled out
Create Load¶
We label a node with role=locust for the load tester. On Kind, the first node listed is the control-plane node, which is the one labelled here.
[9]:
!kubectl label nodes $(kubectl get nodes -o jsonpath='{.items[0].metadata.name}') role=locust
node/ansible-control-plane not labeled
[10]:
!helm install loadtester ../../../helm-charts/seldon-core-loadtesting \
--set locust.host=http://seldon-model-example:8000 \
--set oauth.enabled=false \
--set locust.hatchRate=1 \
--set locust.clients=1 \
--set loadtest.sendFeedback=0 \
--set locust.minWait=0 \
--set locust.maxWait=0 \
--set replicaCount=1
NAME: loadtester
LAST DEPLOYED: Sat Mar 4 09:13:46 2023
NAMESPACE: seldon
STATUS: deployed
REVISION: 1
TEST SUITE: None
After a few minutes you should see the deployment seldon-model-example-0-classifier scaled to 3 replicas.
[11]:
import json
import time


def getNumberPods():
    # Read the Deployment as JSON and return its current replica count
    dp = !kubectl get deployment seldon-model-example-0-classifier -o json
    dp = json.loads("".join(dp))
    return dp["status"]["replicas"]


# Poll every 5 seconds (up to 60 times) until there is more than one replica
scaled = False
for i in range(60):
    pods = getNumberPods()
    print(pods)
    if pods > 1:
        scaled = True
        break
    time.sleep(5)
assert scaled
3
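The cell above relies on the IPython `!` shell syntax, so it only runs inside a notebook. As a rough equivalent for a plain Python script, the sketch below does the same polling with `subprocess`. The function names and the injectable `fetch` parameter are hypothetical choices made so the loop can be exercised without a cluster.

```python
import json
import subprocess
import time
from typing import Callable


def get_replicas(deployment: str = "seldon-model-example-0-classifier") -> int:
    """Return status.replicas for a Deployment, using kubectl via subprocess."""
    out = subprocess.run(
        ["kubectl", "get", "deployment", deployment, "-o", "json"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return json.loads(out)["status"].get("replicas", 0)


def wait_for_scale_up(fetch: Callable[[], int], more_than: int = 1,
                      attempts: int = 60, interval: float = 5.0) -> bool:
    """Poll `fetch` until it returns more than `more_than`, or attempts run out."""
    for _ in range(attempts):
        if fetch() > more_than:
            return True
        time.sleep(interval)
    return False
```

In a script this would be used as `assert wait_for_scale_up(get_replicas)`, which mirrors the 60 attempts at 5 second intervals of the notebook cell.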
[12]:
!kubectl get pods,deployments,hpa
NAME                                                     READY   STATUS    RESTARTS   AGE
pod/locust-master-1-xjplw                                1/1     Running   0          85s
pod/locust-slave-1-gljjf                                 1/1     Running   0          85s
pod/seldon-model-example-0-classifier-795b9cc8b6-7jfgp   0/2     Running   0          15s
pod/seldon-model-example-0-classifier-795b9cc8b6-bqwg9   2/2     Running   0          80m
pod/seldon-model-example-0-classifier-795b9cc8b6-fms5f   0/2     Running   0          15s

NAME                                                READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/seldon-model-example-0-classifier   1/3     3            1           80m

NAME                                                                    REFERENCE                                      TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/seldon-model-example-0-classifier   Deployment/seldon-model-example-0-classifier   60%/10%   1         3         1          80m
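The TARGETS column (60%/10%) explains why the deployment jumps straight to 3 replicas. The following is a simplified sketch of the replica calculation described in the Kubernetes HPA documentation, `ceil(currentReplicas * currentMetric / targetMetric)`, clamped to the configured bounds. It assumes the default tolerance of 0.1 and ignores details such as stabilization windows, unready pods and missing metrics; the function name is hypothetical.

```python
import math


def desired_replicas(current_replicas: int, current_pct: float, target_pct: float,
                     min_replicas: int, max_replicas: int,
                     tolerance: float = 0.1) -> int:
    """Simplified HPA calculation for a utilization target."""
    ratio = current_pct / target_pct
    # Within the tolerance band the autoscaler leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # The result is always kept between minReplicas and maxReplicas.
    return max(min_replicas, min(max_replicas, wanted))


# Values from the output above: 1 replica at 60% against a 10% target.
print(desired_replicas(1, 60, 10, min_replicas=1, max_replicas=3))  # 3
```

The unclamped result would be 6 replicas, so maxReplicas: 3 is the limiting factor here.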
[13]:
!helm delete loadtester -n seldon
release "loadtester" uninstalled
[14]:
!kubectl delete -f model_with_hpa_v2beta1.yaml
seldondeployment.machinelearning.seldon.io "seldon-model" deleted
Create model with v2 autoscaler¶
To create a model with a HorizontalPodAutoscaler, follow these steps:
Ensure you have a resource request for the metric you want to scale on if it is a standard metric such as cpu or memory, e.g.:
resources:
  requests:
    cpu: '0.5'
Add a v2 HPA spec referring to this Deployment, using the metricsv2 field in place of metrics, e.g.:
- hpaSpec:
    maxReplicas: 3
    minReplicas: 1
    metricsv2:
    - resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 10
      type: Resource
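The metricsv2 entry carries the same information as the v2beta1 metrics entry shown earlier on this page, but nests the value under a target block. As a rough illustration, the hypothetical helper below converts one form into the other for the CPU utilization case used here. It is a sketch for comparing the two shapes, not part of Seldon Core.

```python
def metric_v2beta1_to_v2(metric: dict) -> dict:
    """Convert a Resource metric using targetAverageUtilization to the v2 shape.

    Only utilization targets on Resource metrics are handled in this sketch.
    """
    if metric.get("type") != "Resource":
        raise ValueError("this sketch only handles Resource metrics")
    resource = metric["resource"]
    return {
        "type": "Resource",
        "resource": {
            "name": resource["name"],
            "target": {
                "type": "Utilization",
                "averageUtilization": resource["targetAverageUtilization"],
            },
        },
    }


# The v2beta1 entry from the first example on this page.
old = {"type": "Resource",
       "resource": {"name": "cpu", "targetAverageUtilization": 10}}
new = metric_v2beta1_to_v2(old)
print(new["resource"]["target"])
```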
The full SeldonDeployment spec is shown below.
[16]:
!pygmentize model_with_hpa_v2.yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: seldon-model
spec:
  name: test-deployment
  predictors:
  - componentSpecs:
    - hpaSpec:
        maxReplicas: 3
        metricsv2:
        - resource:
            name: cpu
            target:
              type: Utilization
              averageUtilization: 10
          type: Resource
        minReplicas: 1
      spec:
        containers:
        - image: seldonio/mock_classifier:1.5.0-dev
          imagePullPolicy: IfNotPresent
          name: classifier
          resources:
            requests:
              cpu: '0.5'
        terminationGracePeriodSeconds: 1
    graph:
      children: []
      name: classifier
      type: MODEL
    name: example
[17]:
!kubectl create -f model_with_hpa_v2.yaml
seldondeployment.machinelearning.seldon.io/seldon-model created
[18]:
!kubectl rollout status deploy/$(kubectl get deploy -l seldon-deployment-id=seldon-model -o jsonpath='{.items[0].metadata.name}')
Waiting for deployment "seldon-model-example-0-classifier" rollout to finish: 0 of 1 updated replicas are available...
deployment "seldon-model-example-0-classifier" successfully rolled out
Create Load¶
We label a node with role=locust for the load tester. On Kind, the first node listed is the control-plane node, which is the one labelled here.
[19]:
!kubectl label nodes $(kubectl get nodes -o jsonpath='{.items[0].metadata.name}') role=locust
node/ansible-control-plane not labeled
[20]:
!helm install loadtester ../../../helm-charts/seldon-core-loadtesting \
--set locust.host=http://seldon-model-example:8000 \
--set oauth.enabled=false \
--set locust.hatchRate=1 \
--set locust.clients=1 \
--set loadtest.sendFeedback=0 \
--set locust.minWait=0 \
--set locust.maxWait=0 \
--set replicaCount=1
NAME: loadtester
LAST DEPLOYED: Sat Mar 4 09:20:04 2023
NAMESPACE: seldon
STATUS: deployed
REVISION: 1
TEST SUITE: None
After a few minutes you should see the deployment seldon-model-example-0-classifier scaled to 3 replicas.
[21]:
import json
import time


def getNumberPods():
    # Read the Deployment as JSON and return its current replica count
    dp = !kubectl get deployment seldon-model-example-0-classifier -o json
    dp = json.loads("".join(dp))
    return dp["status"]["replicas"]


# Poll every 5 seconds (up to 60 times) until there is more than one replica
scaled = False
for i in range(60):
    pods = getNumberPods()
    print(pods)
    if pods > 1:
        scaled = True
        break
    time.sleep(5)
assert scaled
1
1
1
1
1
1
3
[22]:
!kubectl get pods,deployments,hpa
NAME                                                     READY   STATUS    RESTARTS   AGE
pod/locust-master-1-qhvt6                                1/1     Running   0          11m
pod/locust-slave-1-gnz8h                                 1/1     Running   0          11m
pod/seldon-model-example-0-classifier-5f6445c99c-6t42q   2/2     Running   0          10m
pod/seldon-model-example-0-classifier-5f6445c99c-fqfd9   2/2     Running   0          10m
pod/seldon-model-example-0-classifier-5f6445c99c-s4wrv   2/2     Running   0          11m

NAME                                                READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/seldon-model-example-0-classifier   3/3     3            3           11m

NAME                                                                    REFERENCE                                      TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/seldon-model-example-0-classifier   Deployment/seldon-model-example-0-classifier   21%/10%   1         3         3          11m
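In this output the load is still above target (21% against 10%) even though all 3 replicas are running, because the autoscaler has reached maxReplicas. To detect that situation programmatically, one option is to read the HPA object as JSON. The sketch below assumes the object is served in the autoscaling/v2 shape (target and current blocks under each resource metric). The function name is hypothetical, and the sample dictionary is a hand-written stand-in modelled on the output above, not captured cluster output.

```python
def hpa_summary(hpa: dict) -> dict:
    """Summarise whether an autoscaling/v2 HPA is capped while over target."""
    spec, status = hpa["spec"], hpa.get("status", {})
    # Target utilization per resource, e.g. {"cpu": 10}
    targets = {
        m["resource"]["name"]: m["resource"]["target"].get("averageUtilization")
        for m in spec.get("metrics", [])
        if m.get("type") == "Resource"
    }
    # Observed utilization per resource, e.g. {"cpu": 21}
    current = {
        m["resource"]["name"]: m["resource"]["current"].get("averageUtilization")
        for m in (status.get("currentMetrics") or [])
        if m.get("type") == "Resource"
    }
    replicas = status.get("currentReplicas", 0)
    over = [
        name for name, tgt in targets.items()
        if tgt is not None and current.get(name) is not None and current[name] > tgt
    ]
    return {
        "replicas": replicas,
        "at_max": replicas >= spec["maxReplicas"],
        "over_target": over,
    }


# Hand-written sample mirroring the values shown above (hypothetical data).
sample = {
    "spec": {
        "minReplicas": 1,
        "maxReplicas": 3,
        "metrics": [{"type": "Resource", "resource": {
            "name": "cpu",
            "target": {"type": "Utilization", "averageUtilization": 10}}}],
    },
    "status": {
        "currentReplicas": 3,
        "currentMetrics": [{"type": "Resource", "resource": {
            "name": "cpu", "current": {"averageUtilization": 21}}}],
    },
}
print(hpa_summary(sample))
```

When `at_max` is true and `over_target` is non-empty, raising maxReplicas or the CPU request is what would let the deployment absorb more load.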
[23]:
!helm delete loadtester -n seldon
release "loadtester" uninstalled
[24]:
!kubectl delete -f model_with_hpa_v2.yaml
seldondeployment.machinelearning.seldon.io "seldon-model" deleted