MLFlow Pre-packaged Model Server A/B Test Deployment

In this example we will build two models with MLFlow and deploy them as an A/B test.

This is powerful because it allows you to deploy a new model alongside the old one while directing only a percentage of the traffic to it.

These deployment strategies are quite simple to set up with Seldon, and can be extended to shadow deployments, multi-armed bandits, etc.

Tutorial Overview

This tutorial breaks down into the following sections:

  1. Train the MLFlow elastic net wine example
  2. Deploy your trained model leveraging our pre-packaged MLFlow model server
  3. Test the deployed MLFlow model by sending requests
  4. Deploy your second model as an A/B test
  5. Visualise and monitor the performance of your models using Seldon Analytics

Dependencies:

For this example to work you must be running Seldon 0.3.2 or above - you can follow our getting started guide for this.

As for other dependencies, make sure you have installed the following (a quick version check follows the list):

  • Helm v2.13.1+
  • kubectl v1.14+
  • Python 3.6+
  • MLFlow 1.1.0
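
As a quick sanity check, a minimal sketch like the following prints the versions of the Python-side dependencies (assuming they are already installed in your notebook environment):

import sys

import mlflow
import pandas
import sklearn

# Print versions so they can be compared against the list above;
# this tutorial was written against MLFlow 1.1.0 and Python 3.6+.
print("Python:", sys.version.split()[0])
print("MLFlow:", mlflow.__version__)
print("scikit-learn:", sklearn.__version__)
print("pandas:", pandas.__version__)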

Let’s get started! 🚀🔥

1) Train the first MLFlow Elastic Net Wine example

We will use the elastic net wine example from MLFlow v1.1.0. First we'll import all the dependencies.

[1]:
import os

import pandas as pd
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.linear_model import ElasticNet

import mlflow
import mlflow.sklearn

Let's load the wine dataset, which is also in this folder:

[3]:
data = pd.read_csv("wine-quality.csv")
data.head()
[3]:
   fixed acidity  volatile acidity  citric acid  residual sugar  chlorides  free sulfur dioxide  total sulfur dioxide  density    pH  sulphates  alcohol  quality
0            7.0              0.27         0.36            20.7      0.045                 45.0                 170.0   1.0010  3.00       0.45      8.8        6
1            6.3              0.30         0.34             1.6      0.049                 14.0                 132.0   0.9940  3.30       0.49      9.5        6
2            8.1              0.28         0.40             6.9      0.050                 30.0                  97.0   0.9951  3.26       0.44     10.1        6
3            7.2              0.23         0.32             8.5      0.058                 47.0                 186.0   0.9956  3.19       0.40      9.9        6
4            7.2              0.23         0.32             8.5      0.058                 47.0                 186.0   0.9956  3.19       0.40      9.9        6

Then we will define all the functions we will use to train the model

[6]:
def eval_metrics(actual, pred):
    rmse = np.sqrt(mean_squared_error(actual, pred))
    mae = mean_absolute_error(actual, pred)
    r2 = r2_score(actual, pred)
    return rmse, mae, r2

def run_train_model_iteration(data, seed=40):
    """
    This function takes a pandas dataframe and returns
    """
    np.random.seed(seed)
    train, test = train_test_split(data)
    train_x = train.drop(["quality"], axis=1)
    test_x = test.drop(["quality"], axis=1)
    train_y = train[["quality"]]
    test_y = test[["quality"]]

    alpha = 0.5
    l1_ratio = 0.5

    with mlflow.start_run():
        lr = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, random_state=42)
        lr.fit(train_x, train_y)

        predicted_qualities = lr.predict(test_x)

        (rmse, mae, r2) = eval_metrics(test_y, predicted_qualities)

        print("Elasticnet model (alpha=%f, l1_ratio=%f):" % (alpha, l1_ratio))
        print("  RMSE: %s" % rmse)
        print("  MAE: %s" % mae)
        print("  R2: %s" % r2)

        # Log the trained model as an MLFlow artifact so it can be served later
        mlflow.sklearn.log_model(lr, "model")

Now we can train a first model, which the function above records as an MLFlow run.

[7]:
run_train_model_iteration(data)
Elasticnet model (alpha=0.500000, l1_ratio=0.500000):
  RMSE: 0.8222428497595403
  MAE: 0.6278761410160693
  R2: 0.12678721972772622

Each of these iterations will create a new run which can be visualised through the MLFlow dashboard as per the screenshot below.

image0
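
If you want to log additional runs, for example to produce a second candidate model for the A/B test later on, a minimal sketch is to call the training function again; the seed value here is an arbitrary illustrative choice:

# A different seed gives a different train/test split, and MLFlow
# records the result as a separate run under the mlruns folder.
run_train_model_iteration(data, seed=7)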

Each of these models can be found in the mlruns folder:

[12]:
!ls mlruns/0
012c5eaa115a4f43b5e4b74cb63d5c56  meta.yaml

Inside the hash-named folders we can find the artefacts of our model, which we'll use to deploy with Seldon:

[20]:
print("mlruns/0/"+next(os.walk("mlruns/0"))[1][0]+"/artifacts/")
mlruns/0/012c5eaa115a4f43b5e4b74cb63d5c56/artifacts/

Now we should upload the newly trained model to a public Google Cloud Storage or S3 bucket.

To make things simpler, we have already done this; you can find the uploaded model at gs://seldon-models/mlflow/elasticnet_wine
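
If you want to use your own bucket instead, a hedged sketch of the upload step (assuming gsutil is configured with write access; the run hash and bucket name are placeholders) would be:

# Copy the MLFlow model artifacts into your own publicly readable bucket.
!gsutil cp -r mlruns/0/<run-hash>/artifacts/model/* gs://<your-bucket>/mlflow/elasticnet_wine/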

2) Deploy your model using the Pre-packaged Model Server for MLFlow

Once you have a Kubernetes cluster with Seldon and Ambassador running, we can deploy our trained MLFlow model.

For this we have to create a SeldonDeployment configuration for the model server, shown below.

We will use the model we uploaded to our Google bucket (gs://seldon-models/mlflow/elasticnet_wine), but you can use your own model if you uploaded it to a public bucket.

[22]:
%%writefile mlflow-model-server-seldon-config.yaml
---
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: mlflow-deployment
spec:
  name: mlflow-deployment
  predictors:
  - graph:
      children: []
      implementation: MLFLOW_SERVER
      modelUri: gs://seldon-models/mlflow/elasticnet_wine
      name: wines-classifier
    name: mlflow-deployment-dag
    replicas: 1
Writing mlflow-model-server-seldon-config.yaml

Once we have written our configuration file, we can deploy it to our cluster with the following command:

[24]:
!kubectl apply -f mlflow-model-server-seldon-config.yaml
seldondeployment.machinelearning.seldon.io/mlflow-deployment created

Once it’s created we just wait until it’s deployed.

It will download the image for the pre-packaged MLFlow model server and initialise it with the model we specified above.

You can check the status of the deployment with the following command:

[26]:
!kubectl rollout status deployment.apps/mlflow-deployment-mlflow-deployment-dag
deployment "mlflow-deployment-mlflow-deployment-dag" successfully rolled out

Once it’s deployed, we should see a “successfully rolled out” message above. We can now test it!

3) Test the deployed MLFlow model by sending requests

Now that our model is deployed in Kubernetes, we can send requests to it.

We will first need the URL that is currently available through Ambassador.

If you are running this locally, you should be able to reach it through localhost; in this case we can use port 80.

[35]:
!kubectl get svc | grep ambassador
ambassador                                                  LoadBalancer   10.100.227.53   localhost     80:31215/TCP,443:31622/TCP   16d
ambassador-admins                                           ClusterIP      10.101.19.26    <none>        8877/TCP                     16d
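
If your cluster does not expose Ambassador through a LoadBalancer, an alternative sketch (assuming the default service name) is to port-forward it locally:

# Run this in a separate terminal; Ambassador is then reachable at localhost:8003.
!kubectl port-forward svc/ambassador 8003:80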

Now we will select the first datapoint in our dataset to send to the model.

[59]:
x_0 = data.drop(["quality"], axis=1).values[:1]
print(list(x_0[0]))
[7.0, 0.27, 0.36, 20.7, 0.045, 45.0, 170.0, 1.001, 3.0, 0.45, 8.8]
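
For reference, the request body we will send is a standard SeldonMessage JSON document; a minimal sketch that builds it from x_0 (matching the curl request below) is:

import json

# Build the SeldonMessage payload from the first datapoint.
payload = {"data": {"names": [], "ndarray": x_0.tolist()}}
print(json.dumps(payload))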

We can try sending a request first using curl:

[54]:
%%bash
curl -X POST -H 'Content-Type: application/json' \
    -d "{'data': {'names': [], 'ndarray': [[7.0, 0.27, 0.36, 20.7, 0.045, 45.0, 170.0, 1.001, 3.0, 0.45, 8.8]]}}" \
    http://localhost:80/seldon/default/mlflow-deployment/api/v0.1/predictions
{
  "meta": {
    "puid": "4gapbfom6aa1nb6bm71jcsdo5q",
    "tags": {
    },
    "routing": {
    },
    "requestPath": {
      "wines-classifier": ""
    },
    "metrics": []
  },
  "data": {
    "names": [],
    "ndarray": [5.655099099229193]
  }
}
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   354  100   250  100   104  12500   5200 --:--:-- --:--:-- --:--:-- 17700

We can also send the request using our Python client:

[60]:
from seldon_core.seldon_client import SeldonClient

HOST = "localhost" # Add the URL you found above
port = "80" # Make sure you use the port above
batch = x_0
payload_type = "ndarray"

sc = SeldonClient(
    gateway="ambassador",
    gateway_endpoint=HOST + ":" + port,
    namespace="default")

client_prediction = sc.predict(
    data=batch,
    deployment_name="mlflow-deployment",
    names=[],
    payload_type=payload_type)

print(client_prediction.response)
meta {
  puid: "kt99thn77rajhquoq50jb49hmh"
  requestPath {
    key: "wines-classifier"
  }
}
data {
  ndarray {
    values {
      number_value: 5.655099099229193
    }
  }
}
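
Besides the raw protobuf printed above, the SeldonClient call returns a convenience wrapper; a quick sketch to check whether the request succeeded (assuming a seldon_core client version that exposes these fields) is:

# The prediction object carries a success flag and a status message.
print(client_prediction.success, client_prediction.msg)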

4) Deploy your second model as an A/B test

Now that we have a model in production, it’s possible to deploy a second model as an A/B test.

By leveraging this, we will redirect 20% of the traffic to the new model (the a predictor below) and keep 80% on the existing one (the b predictor).

This can be done by simply adding a traffic attribute to each predictor, as shown below.

[66]:
%%writefile ab-test-mlflow-model-server-seldon-config.yaml
---
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: mlflow-deployment
spec:
  name: mlflow-deployment
  predictors:
  - graph:
      children: []
      implementation: MLFLOW_SERVER
      modelUri: gs://seldon-models/mlflow/elasticnet_wine
      name: wines-classifier
    name: a-mlflow-deployment-dag
    replicas: 1
    traffic: 20
  - graph:
      children: []
      implementation: MLFLOW_SERVER
      modelUri: gs://seldon-models/mlflow/elasticnet_wine
      name: wines-classifier
    name: b-mlflow-deployment-dag
    replicas: 1
    traffic: 80
Overwriting ab-test-mlflow-model-server-seldon-config.yaml

As with the model above, we only need to apply the configuration to deploy it:

[67]:
!kubectl apply -f ab-test-mlflow-model-server-seldon-config.yaml
seldondeployment.machinelearning.seldon.io/mlflow-deployment configured
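
As before, we can wait until both predictors have rolled out; the deployment names below are inferred from the pod names shown in the next step:

!kubectl rollout status deployment.apps/mlflow-deployment-a-mlflow-deployment-dag
!kubectl rollout status deployment.apps/mlflow-deployment-b-mlflow-deployment-dag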

We can check that the models have been deployed and are running with the following command.

We should now see the pods for both the “a-” model and the “b-” model.

[70]:
!kubectl get pods
NAME                                                         READY   STATUS    RESTARTS   AGE
ambassador-6657ccd4f6-4z6xh                                  1/1     Running   11         16d
ambassador-6657ccd4f6-5mc46                                  1/1     Running   11         16d
ambassador-6657ccd4f6-qqfgn                                  1/1     Running   13         16d
mlflow-deployment-a-mlflow-deployment-dag-68d4c9fcf5-crd9q   2/2     Running   0          2m25s
mlflow-deployment-b-mlflow-deployment-dag-8cdcccbfc-rhc4p    2/2     Running   0          2m25s
seldon-operator-controller-manager-0                         1/1     Running   0          110m

5) Visualise and monitor the performance of your models using Seldon Analytics

This section is optional, but by following the instructions you will be able to visualise the performance of both models as per the chart below.

In order for this example to work you need to install and run the Grafana Analytics package for Seldon Core.

We can access the Grafana dashboard through the port provided by the command below. It will request a username and password, which by default are set to the following:

  • Username: admin
  • Password: admin

[85]:
!kubectl get svc grafana-prom -o jsonpath='{.spec.ports[0].nodePort}'
31212
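
If you cannot reach the NodePort directly, a port-forward sketch (assuming the grafana-prom service serves HTTP on port 80) is:

# Run in a separate terminal, then open http://localhost:3000 in your browser.
!kubectl port-forward svc/grafana-prom 3000:80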
Now that we can access Grafana, go to the Prediction Analytics dashboard, where you'll be able to see the metrics.

Now we can run the following `while True` loop to start sending some data:
[ ]:
while True:
    client_prediction = sc.predict(
        data=batch,
        deployment_name="mlflow-deployment",
        names=[],
        payload_type=payload_type)

    print(client_prediction.response)
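
If you prefer a bounded burst of traffic instead of an infinite loop, a simple variant is:

# Send a fixed number of requests; adjust the count as needed.
for _ in range(1000):
    client_prediction = sc.predict(
        data=batch,
        deployment_name="mlflow-deployment",
        names=[],
        payload_type=payload_type)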

You should now be able to see the metrics reflected as per the chart below.

In the chart you can visualise, on the bottom left, the requests per second, which shows the different traffic breakdown we specified.

You can add your own custom metrics, and try out other more complex deployments, by following the further guides at https://docs.seldon.io/projects/seldon-core/en/latest/workflow/README.html

image0