Alibaba Cloud Container Service for Kubernetes (ACK) Deep MNIST Example

In this example we will deploy a TensorFlow MNIST model to the Alibaba Cloud Container Service for Kubernetes.

This tutorial is broken down into the following sections:

  1. Train a tensorflow model to predict mnist locally

  2. Containerise the tensorflow model with our docker utility

  3. Test model locally with docker

  4. Set-up and configure Alibaba Cloud environment

  5. Deploy your model and visualise requests

Let’s get started! 🚀🔥


Dependencies:

  • Helm v3.0.0+

  • kubectl v1.14+

  • Python 3.6+

  • Python DEV requirements

1) Train a tensorflow model to predict mnist locally

We will load the MNIST images, together with their labels, and then train a TensorFlow model to predict the right labels.

from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
import tensorflow as tf

if __name__ == "__main__":

    x = tf.placeholder(tf.float32, [None, 784], name="x")

    W = tf.Variable(tf.zeros([784, 10]))
    b = tf.Variable(tf.zeros([10]))

    y = tf.nn.softmax(tf.matmul(x, W) + b, name="y")

    y_ = tf.placeholder(tf.float32, [None, 10])

    cross_entropy = tf.reduce_mean(
        -tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1])
    )

    train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

    init = tf.global_variables_initializer()

    sess = tf.Session()
    sess.run(init)  # initialise W and b before training

    for i in range(1000):
        # Train on mini-batches of 100 images
        batch_xs, batch_ys = mnist.train.next_batch(100)
        sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

    correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
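    # Report accuracy on the held-out test set
    print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))

    # Save the trained graph and weights for the wrapper in the next step
    saver = tf.train.Saver()
    saver.save(sess, "model/deep_mnist_model")

2) Containerise the tensorflow model with our docker utility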

Create a wrapper file that exposes the functionality through a predict function:

import tensorflow as tf
import numpy as np

class DeepMnist(object):
    def __init__(self):
        self.class_names = ["class:{}".format(str(i)) for i in range(10)]
        self.sess = tf.Session()
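        saver = tf.train.import_meta_graph("model/deep_mnist_model.meta")
        # Load the trained weights; assumes the checkpoint from step 1 is in ./model/
        saver.restore(self.sess, tf.train.latest_checkpoint("./model/"))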

        graph = tf.get_default_graph()
        self.x = graph.get_tensor_by_name("x:0")
        self.y = graph.get_tensor_by_name("y:0")

    def predict(self, X, feature_names):
        # Run the softmax output tensor on the incoming batch
        predictions = self.sess.run(self.y, feed_dict={self.x: X})
        return predictions.astype(np.float64)
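
Before building the image it can help to check that a wrapper honours the predict contract: a batch of 784-pixel rows in, one row of 10 float64 probabilities per input out. The sketch below is only an illustration; `DummyModel` and `check_predict_contract` are hypothetical names, and the dummy class stands in for DeepMnist so the check runs without a trained checkpoint.

```python
import numpy as np


class DummyModel:
    """Hypothetical stand-in with the same predict signature as DeepMnist."""

    def predict(self, X, feature_names):
        # Uniform probabilities over the 10 digits for every input row.
        return np.full((np.asarray(X).shape[0], 10), 0.1, dtype=np.float64)


def check_predict_contract(model, n_rows=2):
    """Send random 784-pixel rows and validate the shape and dtype of the output."""
    X = np.random.rand(n_rows, 784).astype(np.float32)
    out = model.predict(X, feature_names=None)
    assert out.shape == (n_rows, 10), out.shape
    assert out.dtype == np.float64, out.dtype
    assert np.allclose(out.sum(axis=1), 1.0, atol=1e-4)  # softmax rows sum to 1
    return out


check_predict_contract(DummyModel())
```

Passing `DeepMnist()` instead of `DummyModel()` should exercise the real wrapper once the model files from step 1 exist.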

Define the dependencies for the wrapper in the requirements.txt:

%%writefile requirements.txt
Overwriting requirements.txt

You need to make sure that you have added the .s2i/environment configuration file in this folder with the following content:

!mkdir -p .s2i
%%writefile .s2i/environment
Overwriting .s2i/environment
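
The contents of the two files are not reproduced in the cells above. As a rough sketch, they might look like the following, assuming the wrapper only depends on TensorFlow 1.x and is served as a REST model. The `write_s2i_config` helper and the version pin are illustrative assumptions. `MODEL_NAME`, `SERVICE_TYPE` and `PERSISTENCE` are settings read by the Seldon Python s2i builder; `API_TYPE` is used by older builder images and may be ignored by newer ones.

```python
from pathlib import Path

# Assumed file contents; adjust them to your own setup.
REQUIREMENTS = "tensorflow>=1.12.0,<2.0\n"  # assumption: the TF 1.x API is used
S2I_ENVIRONMENT = (
    "MODEL_NAME=DeepMnist\n"  # must match the wrapper class (and its file name)
    "API_TYPE=REST\n"
    "SERVICE_TYPE=MODEL\n"
    "PERSISTENCE=0\n"
)


def write_s2i_config(folder="."):
    """Write requirements.txt and .s2i/environment into `folder` (hypothetical helper)."""
    root = Path(folder)
    (root / ".s2i").mkdir(parents=True, exist_ok=True)
    (root / "requirements.txt").write_text(REQUIREMENTS)
    (root / ".s2i" / "environment").write_text(S2I_ENVIRONMENT)
    return root
```

Calling `write_s2i_config(".")` from the folder that holds the wrapper would produce both files in one go.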

Now we can build a docker image named “deep-mnist” with the tag 0.1

!s2i build . seldonio/seldon-core-s2i-python37:1.19.0-dev deep-mnist:0.1

3) Test model locally with docker

We first run the docker image we just created as a container called “mnist_predictor”

!docker run --name "mnist_predictor" -d --rm -p 5000:5000 deep-mnist:0.1

Send a sample image that conforms to the contract

import matplotlib.pyplot as plt
import numpy as np

# This is the variable that was initialised at the beginning of the file
i = [0]
x = mnist.test.images[i]
y = mnist.test.labels[i]
plt.imshow(x.reshape((28, 28)), cmap="gray")
print("Expected label: ", np.sum(range(0, 10) * y), ". One hot encoding: ", y)
Expected label:  7.0 . One hot encoding:  [[0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]]
import math

import numpy as np

from seldon_core.seldon_client import SeldonClient

# We now test the REST endpoint expecting the same result
endpoint = "localhost:5000"  # assumed from the -p 5000:5000 mapping in the docker run command
batch = x
payload_type = "ndarray"

sc = SeldonClient(microservice_endpoint=endpoint)

# We use the microservice, instead of the "predict" function
client_prediction = sc.microservice(
    data=batch, method="predict", payload_type=payload_type, names=["tfidf"]
)

# Print the probability returned for each of the 10 digits
for proba, label in zip(
    client_prediction.response.data.ndarray.values[0].list_value.ListFields()[0][1],
    range(0, 10),
):
    print(f"LABEL {label}:\t {proba.number_value*100:6.4f} %")
LABEL 0:         0.0064 %
LABEL 1:         0.0000 %
LABEL 2:         0.0155 %
LABEL 3:         0.2862 %
LABEL 4:         0.0003 %
LABEL 5:         0.0027 %
LABEL 6:         0.0000 %
LABEL 7:         99.6643 %
LABEL 8:         0.0020 %
LABEL 9:         0.0227 %
!docker rm mnist_predictor --force
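
For reference, with the "ndarray" payload type the request body is plain JSON of the form {"data": {"names": [...], "ndarray": [[...]]}}, and the response carries the probabilities in the same shape. The sketch below builds such a body and reads a response back without any network call; `build_payload` and `predicted_label` are hypothetical helper names and are not part of seldon_core.

```python
import json

import numpy as np


def build_payload(batch, names=None):
    """Build the JSON body for a Seldon "ndarray" prediction request."""
    data = {"ndarray": np.asarray(batch).tolist()}
    if names is not None:
        data["names"] = list(names)
    return json.dumps({"data": data})


def predicted_label(response_json):
    """Return the index of the most probable class for the first row."""
    rows = json.loads(response_json)["data"]["ndarray"]
    return int(np.argmax(rows[0]))


# Example with a blank 28x28 image flattened to 784 values
body = build_payload(np.zeros((1, 784)))
```

A body like this could be posted with any HTTP client, which is handy when debugging the container without the Python client installed.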

4) Set-up and configure Alibaba Cloud environment

4.1 Create a managed Kubernetes cluster

We first need to create a cluster in Alibaba Cloud by following the official instructions for creating a managed Kubernetes cluster. Make sure you expose the cluster with an elastic IP by ticking the corresponding checkbox.

Follow the instructions through to the end, at which point the finished cluster should be listed in your Alibaba Cloud console.


4.2 Copy the kubectl configuration to access the cluster

Once the cluster has been created, you can use your local kubectl by copying the configuration details from the cluster overview page into your ~/.kube/config file.
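
If you would rather script this step, a minimal sketch is shown below. It assumes you have pasted the configuration from the overview page into a string; `install_kubeconfig` is a hypothetical helper, and it keeps a `.bak` copy of any existing file so that other cluster credentials are not lost.

```python
import shutil
from pathlib import Path


def install_kubeconfig(config_text, path=Path.home() / ".kube" / "config"):
    """Write `config_text` to `path`, keeping a backup of any existing file."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    if path.exists():
        # Preserve the previous configuration next to the new one
        shutil.copy2(path, path.with_name(path.name + ".bak"))
    path.write_text(config_text)
    path.chmod(0o600)  # the file holds cluster credentials
    return path
```

Afterwards, `kubectl config current-context` should report the new cluster.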

4.3 Create an Alibaba Container Registry to push the image

Finally, we need to create a container registry repository by following the Alibaba Cloud Container Registry guide.

4.4 Set up Seldon Core

Use the setup notebook to set up the cluster with Ambassador Ingress and install Seldon Core. The instructions are also available online.

Finally we install the Seldon Analytics Package

!helm install seldon-core-analytics seldon-core-analytics --repo
!kubectl rollout status deployment.apps/grafana-prom-deployment
!kubectl patch svc grafana-prom --type='json' -p '[{"op":"replace","path":"/spec/type","value":"LoadBalancer"}]'
deployment "grafana-prom-deployment" successfully rolled out
service/grafana-prom patched

4.5 Push docker image

We’ll now make sure the image is accessible within the Kubernetes cluster by pushing it to the repo that we created in step 4.3. This should look as follows in your dashboard:


To push the image we first tag it

!docker tag deep-mnist:0.1

And then we push it

!docker push

5) Deploy your model and visualise requests

IMPORTANT: Make sure you replace the image URL with the one for your own repository, in the format registry-intl.[REGION].aliyuncs.com/[NAMESPACE]/[REPO]:0.1
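
As an illustration of how the image URL is put together, here is a small sketch. It assumes the international Alibaba Cloud registry domain pattern registry-intl.<region>.aliyuncs.com; the function names and the region, namespace and repository values in the example are placeholders, not the values used by this tutorial.

```python
def image_reference(region, namespace, repo, tag="0.1"):
    """Compose a full image reference for the Alibaba Cloud registry (assumed pattern)."""
    return f"registry-intl.{region}.aliyuncs.com/{namespace}/{repo}:{tag}"


def tag_and_push_commands(local_image, remote_image):
    """Return the docker commands (as argument lists) that tag and push an image."""
    return [
        ["docker", "tag", local_image, remote_image],
        ["docker", "push", remote_image],
    ]


# Placeholder values; replace them with your own region, namespace and repository
remote = image_reference("eu-west-1", "my-namespace", "deep-mnist")
commands = tag_and_push_commands("deep-mnist:0.1", remote)
```

Each entry in `commands` can be passed to `subprocess.run(..., check=True)` once you are logged in to the registry.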

%%writefile deep_mnist_deployment.yaml
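# Assumed API version; use the one served by your Seldon Core installation
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  labels:
    app: seldon
  name: deep-mnist
spec:
  annotations:
    project_name: Tensorflow MNIST
    deployment_version: v1
  name: deep-mnist
  predictors:
  - componentSpecs:
    - spec:
        containers:
        - image:  # set this to the URL of the image you pushed in step 4.5
          imagePullPolicy: IfNotPresent
          name: classifier
          resources:
            requests:
              memory: 1Mi
        terminationGracePeriodSeconds: 20
    graph:
      children: []
      name: classifier
      endpoint:
        type: REST
      type: MODEL
    name: single-model
    replicas: 1
    annotations:
      predictor_version: v1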

Overwriting deep_mnist_deployment.yaml

Run the deployment in your cluster

!kubectl apply -f deep_mnist_deployment.yaml

And let’s check that it’s been created.

!kubectl get pods
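
To check readiness programmatically rather than by eye, one option is to parse the output of `kubectl get pods -o json`. The sketch below works on that JSON text; `pods_ready` is a hypothetical helper and the pod name in the sample is made up for illustration.

```python
import json


def pods_ready(kubectl_json):
    """Map each pod name to True if all of its containers report ready."""
    result = {}
    for pod in json.loads(kubectl_json)["items"]:
        statuses = pod.get("status", {}).get("containerStatuses", [])
        # A pod with no container statuses yet is treated as not ready
        ready = bool(statuses) and all(s.get("ready", False) for s in statuses)
        result[pod["metadata"]["name"]] = ready
    return result


# Made-up sample in the shape returned by `kubectl get pods -o json`
sample = json.dumps({
    "items": [
        {
            "metadata": {"name": "example-pod"},
            "status": {"containerStatuses": [{"ready": True}, {"ready": False}]},
        }
    ]
})
print(pods_ready(sample))
```

The deployment is ready to receive requests once every pod in the result maps to True.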

Test the model

We’ll use the first example from our test dataset

import matplotlib.pyplot as plt
import numpy as np

# This is the variable that was initialised at the beginning of the file
i = [0]
x = mnist.test.images[i]
y = mnist.test.labels[i]
plt.imshow(x.reshape((28, 28)), cmap="gray")
print("Expected label: ", np.sum(range(0, 10) * y), ". One hot encoding: ", y)
Expected label:  7.0 . One hot encoding:  [[0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]]

First we need to find the URL that we’ll use

You need to add it to the script in the next block

!kubectl get svc ambassador -o jsonpath='{.status.loadBalancer.ingress[0].ip}'

We can now add the URL above to send our request:

import math
import subprocess

import numpy as np

from seldon_core.seldon_client import SeldonClient

# Add the URL you found above, here:
HOST = ""

port = "80"  # Make sure you use the port above
batch = x
payload_type = "ndarray"

sc = SeldonClient(
    gateway="ambassador", gateway_endpoint=HOST + ":" + port, namespace="default"
)

client_prediction = sc.predict(
    data=batch, deployment_name="deep-mnist", names=["text"], payload_type=payload_type
)

print(client_prediction.response)
meta {
  puid: "krop281g4e1mrsgf60m7ilh1fc"
  requestPath {
    key: "classifier"
    value: ""
  }
}
data {
  names: "class:0"
  names: "class:1"
  names: "class:2"
  names: "class:3"
  names: "class:4"
  names: "class:5"
  names: "class:6"
  names: "class:7"
  names: "class:8"
  names: "class:9"
  ndarray {
    values {
      list_value {
        values {
          number_value: 4.4009637349518016e-05
        }
        values {
          number_value: 6.321029477618367e-09
        }
        values {
          number_value: 0.0001286377664655447
        }
        values {
          number_value: 0.0030034701339900494
        }
        values {
          number_value: 3.061768893530825e-06
        }
        values {
          number_value: 1.882280412246473e-05
        }
        values {
          number_value: 1.8899024567531342e-08
        }
        values {
          number_value: 0.9959828853607178
        }
        values {
          number_value: 2.070009031740483e-05
        }
        values {
          number_value: 0.0007983553223311901
        }
      }
    }
  }
}

Let’s see the predictions for each label

It seems that it correctly predicted the number 7

for proba, label in zip(
    client_prediction.response.data.ndarray.values[0].list_value.ListFields()[0][1],
    range(0, 10),
):
    print(f"LABEL {label}:\t {proba.number_value*100:6.4f} %")
LABEL 0:         0.0044 %
LABEL 1:         0.0000 %
LABEL 2:         0.0129 %
LABEL 3:         0.3003 %
LABEL 4:         0.0003 %
LABEL 5:         0.0019 %
LABEL 6:         0.0000 %
LABEL 7:         99.5983 %
LABEL 8:         0.0021 %
LABEL 9:         0.0798 %
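
The predicted digit is simply the label with the highest probability. A short sketch of that final step, using the probabilities from the response printed earlier (rounded to eight significant digits):

```python
import numpy as np

# Probabilities copied from the response printed above (rounded)
probabilities = np.array([
    4.4009637e-05, 6.3210295e-09, 1.2863777e-04, 3.0034701e-03, 3.0617689e-06,
    1.8822804e-05, 1.8899025e-08, 9.9598289e-01, 2.0700090e-05, 7.9835532e-04,
])

label = int(np.argmax(probabilities))  # index of the most probable digit
print(label, f"{probabilities[label] * 100:.4f} %")  # prints: 7 99.5983 %
```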

Finally, let’s visualise the metrics that Seldon provides out of the box

For this we can access the URL returned by the command below. It will request a username and password, which by default are set to the following:

  • Username: admin

  • Password: admin

You will be able to access it at http://[URL]/d/ejAHFXIWz/prediction-analytics?orgId=1

!kubectl get svc grafana-prom -o jsonpath='{.status.loadBalancer.ingress[0].ip}'

The metrics include requests per second as well as latency. You can add your own custom metrics and try out more complex deployments by following the further guides in the Seldon Core documentation.
