This page was generated from testing/benchmarking/logger/kafka.ipynb.
Kafka Request Logging Tests¶
This notebook illustrates how to test your model with Kafka payload logging.
Prerequisites¶
An authenticated K8S cluster with istio and Seldon Core installed
You can use the Ansible seldon-core and kafka playbooks in the root ansible folder.
vegeta and ghz benchmarking tools
Port-forward to the Istio ingress gateway:
kubectl port-forward $(kubectl get pods -l istio=ingressgateway -n istio-system -o jsonpath='{.items[0].metadata.name}') -n istio-system 8003:8080
Tested on GKE with 6 e2-standard-32 nodes (32 vCPUs each)
[ ]:
from IPython.core.magic import register_line_cell_magic


@register_line_cell_magic
def writetemplate(line, cell):
    with open(line, "w") as f:
        f.write(cell.format(**globals()))
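The writetemplate magic above writes the cell body to the file named on the magic line, after substituting {NAME} placeholders with str.format over the notebook's globals. As a plain-Python sketch of the same mechanism (the TAG variable and example.yaml file are illustrative, not part of this notebook):

```python
# Sketch of what the %%writetemplate magic does: write the text to a
# file after substituting {NAME} placeholders via str.format(**globals()).
TAG = "1.14.0"  # hypothetical value standing in for a notebook global

template = "image: seldonio/seldon-core-operator:{TAG}\n"

with open("example.yaml", "w") as f:
    f.write(template.format(**globals()))

with open("example.yaml") as f:
    print(f.read())  # prints: image: seldonio/seldon-core-operator:1.14.0
```

This is why the YAML cells below can reference notebook variables such as {duration} and {workers} directly.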
[ ]:
VERSION = !cat ../../../version.txt
VERSION = VERSION[0]
VERSION
[ ]:
!kubectl create namespace seldon
CIFAR10 Model running on Triton Inference Server¶
We run the CIFAR10 image model on the Triton Inference Server, configured to allow the model 5 CPUs on Triton.
[ ]:
%%writetemplate model.yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: cifar10
  namespace: seldon
spec:
  name: resnet32
  predictors:
  - componentSpecs:
    - spec:
        containers:
        - name: cifar10
          resources:
            requests:
              cpu: 5
            limits:
              cpu: 5
    graph:
      implementation: TRITON_SERVER
      logger:
        mode: all
      modelUri: gs://seldon-models/triton/tf_cifar10_5cpu
      name: cifar10
    name: default
    svcOrchSpec:
      env:
      - name: LOGGER_KAFKA_BROKER
        value: seldon-kafka-plain-0.kafka:9092
      - name: LOGGER_KAFKA_TOPIC
        value: seldon
      - name: GOMAXPROCS
        value: "2"
      resources:
        requests:
          memory: "3G"
          cpu: 2
        limits:
          memory: "3G"
          cpu: 2
    replicas: 15
  protocol: kfserving
[ ]:
!kubectl apply -f model.yaml -n seldon
[ ]:
!kubectl wait --for condition=ready --timeout=600s pods --all -n seldon
[ ]:
!curl -X POST -H 'Content-Type: application/json' \
-d '@./truck-v2.json' \
http://localhost:8003/seldon/seldon/cifar10/v2/models/cifar10/infer
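The request body posted above follows the KFServing v2 (Open Inference) protocol. As a sketch of how such a payload is structured — the tensor name input_1 and the zero-filled image are illustrative assumptions; the real values live in truck-v2.json:

```python
import json

# Build a v2-inference-protocol payload for one 32x32x3 CIFAR10 image.
# "input_1" is a hypothetical tensor name; the actual name is defined by
# the Triton model and used in truck-v2.json.
image = [[[0.0, 0.0, 0.0] for _ in range(32)] for _ in range(32)]

payload = {
    "inputs": [
        {
            "name": "input_1",
            "shape": [1, 32, 32, 3],
            "datatype": "FP32",
            "data": [image],
        }
    ]
}

body = json.dumps(payload)
```

Such a body could then be POSTed to the same /v2/models/cifar10/infer endpoint, e.g. with the requests library, instead of curl.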
Direct Tests to Validate Setup¶
[ ]:
%%bash
vegeta attack -format=json -duration=10s -rate=0 -max-workers=1 -targets=vegeta_cifar10.json |
vegeta report -type=text
Run Vegeta Benchmark¶
[ ]:
!kubectl create -f configmap_cifar10.yaml -n seldon
[ ]:
workers = 10
duration = "300s"
[ ]:
%%writetemplate job-vegeta-cifar10.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: cifar10-loadtest
spec:
  backoffLimit: 6
  parallelism: 16
  template:
    metadata:
      annotations:
        sidecar.istio.io/inject: "false"
    spec:
      containers:
      - args:
        - vegeta -cpus=1 attack -format=json -keepalive=false -duration={duration} -rate=0 -max-workers={workers} -targets=/var/vegeta/cifar10.json | vegeta report -type=text
        command:
        - sh
        - -c
        image: peterevans/vegeta:latest
        imagePullPolicy: Always
        name: vegeta
        volumeMounts:
        - mountPath: /var/vegeta
          name: vegeta-cfg
      restartPolicy: Never
      volumes:
      - configMap:
          defaultMode: 420
          name: vegeta-cfg
        name: vegeta-cfg
[ ]:
!kubectl create -f job-vegeta-cifar10.yaml -n seldon
[ ]:
!kubectl wait --for=condition=complete job/cifar10-loadtest -n seldon
[ ]:
!kubectl delete -f job-vegeta-cifar10.yaml -n seldon
[ ]:
!kubectl delete -f model.yaml
Summary¶
By looking at the Kafka Grafana monitoring dashboard one can inspect the achieved message rate.
You can port-forward to it with:
kubectl port-forward svc/kafka-grafana -n kafka 3000:80
The default login and password are both set to admin.
On the above deployment and test we see around 3K predictions per second, resulting in 6K Kafka messages per second, since with logger mode: all each prediction is logged as both a request and a response message.
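The relationship between the prediction rate and the Kafka message rate can be sketched as a quick back-of-the-envelope calculation (the 3K figure is the rate observed in this test):

```python
# With logger mode "all", each prediction emits two Kafka messages:
# one for the request payload and one for the response payload.
predictions_per_second = 3_000   # observed in the load test above
messages_per_prediction = 2      # request + response (mode: all)

kafka_messages_per_second = predictions_per_second * messages_per_prediction
print(kafka_messages_per_second)  # prints: 6000
```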