# MLflow Server

If you have a trained MLflow model, you can deploy one (or several) of its saved versions using Seldon's prepackaged MLflow server. During initialisation, the built-in reusable server will create the [Conda environment](https://www.mlflow.org/docs/latest/projects.html#project-environments) specified in your `conda.yaml` file.

## Pre-requisites

To use the built-in MLflow server the following pre-requisites need to be met:

- Your [MLmodel artifact folder](https://www.mlflow.org/docs/latest/models.html) needs to be accessible remotely (e.g. as `gs://seldon-models/mlflow/elasticnet_wine_1.8.0`).
- Your model needs to be compatible with the [python_function flavour](https://www.mlflow.org/docs/latest/models.html#python-function-python-function).
- Your `MLproject` environment needs to be specified using Conda.

## Conda environment creation

The MLflow built-in server will create the Conda environment specified in your `MLmodel`'s `conda.yaml` file during initialisation. Note that this approach may slow down your Kubernetes `SeldonDeployment` startup time considerably.

In some cases, it may be worth considering [creating your own custom reusable server](./custom.md). For example, once the Conda environment can be considered stable, you can create your own image with a fixed set of dependencies. This image can then be re-used across different model versions that share the same pre-loaded environment.

Note that installation of `conda` packages may take longer than the `livenessProbe` limits allow. This can be worked around by setting longer limits; see our [elasticnet wine manifest](https://github.com/SeldonIO/seldon-core/blob/master/servers/mlflowserver/samples/elasticnet_wine.yaml) for an example.

## Examples

An example manifest for a saved elasticnet wine model can be found below:

```yaml
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: mlflow
spec:
  name: wines
  predictors:
    - graph:
        children: []
        implementation: MLFLOW_SERVER
        modelUri: gs://seldon-models/mlflow/elasticnet_wine_1.8.0
        name: classifier
      name: default
      replicas: 1
```

## MLflow xtype

By default the server will call your loaded model's `predict` function with a `numpy.ndarray`. If you want it to be called with a `pandas.DataFrame` instead, you can pass the parameter `xtype` set to `DataFrame`. For example:

```yaml
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: mlflow
spec:
  name: wines
  predictors:
    - graph:
        children: []
        implementation: MLFLOW_SERVER
        modelUri: gs://seldon-models/mlflow/elasticnet_wine_1.8.0
        name: classifier
        parameters:
          - name: xtype
            type: STRING
            value: DataFrame
      name: default
      replicas: 1
```
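Once one of the deployments above is running, you can query it over the Seldon protocol REST endpoint. Below is a minimal sketch; the `localhost:8003` ingress address and the `seldon` namespace are assumptions to adapt to your own cluster setup:

```python
import json

import requests

# Hypothetical ingress address and namespace; adjust both to your cluster.
endpoint = "http://localhost:8003/seldon/seldon/mlflow/api/v1.0/predictions"

payload = {
    "data": {
        # With xtype=DataFrame the "names" below become the DataFrame
        # columns; with the default numpy.ndarray mode they are optional.
        "names": [
            "fixed acidity", "volatile acidity", "citric acid",
            "residual sugar", "chlorides", "free sulfur dioxide",
            "total sulfur dioxide", "density", "pH", "sulphates", "alcohol",
        ],
        "ndarray": [[7.4, 0.7, 0.0, 1.9, 0.076, 11.0, 34.0, 0.9978, 3.51, 0.56, 9.4]],
    }
}

response = requests.post(endpoint, json=payload)
print(json.dumps(response.json(), indent=2))
```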
You can also try out a [worked notebook](../examples/server_examples.html#Serve-MLflow-Elasticnet-Wines-Model) or check our [talk at the Spark + AI Summit 2019](https://www.youtube.com/watch?v=D6eSfd9w9eA).

## Open Inference Protocol (or V2 protocol)

The MLflow server can also be used to expose an API compatible with the [Open Inference Protocol](../graph/protocols.md#v2-protocol). Note that, under the hood, it will use the [Seldon MLServer](https://github.com/SeldonIO/MLServer) runtime.

### Create a model using `mlflow` and deploy to `seldon-core`

As an example we are going to use the elasticnet wine model.

- Create a `conda` environment:

  ```bash
  $ conda create -y -n python3.8-mlflow-example python=3.8
  $ conda activate python3.8-mlflow-example
  ```

- Install `mlflow`:

  ```bash
  $ pip install mlflow
  ```

- Train the elasticnet wine example (a condensed sketch of what the training script does is shown after this list):

  ```bash
  $ git clone https://github.com/mlflow/mlflow
  $ cd mlflow/examples
  $ python sklearn_elasticnet_wine/train.py
  ```

  After the script ends, the model will be persisted at `mlruns/0/<run-id>/artifacts/model`. The run ID can also be found in the MLflow UI (`mlflow ui`).

- Install the additional packages required to deploy the model, and pack the conda environment using [conda-pack](https://conda.github.io/conda-pack/):

  ```bash
  $ pip install conda-pack
  $ pip install mlserver
  $ pip install mlserver-mlflow
  $ cd mlflow/examples/mlruns/0/<run-id>/artifacts/model
  $ conda pack -o environment.tar.gz -f
  ```

  This packs the current conda environment into `environment.tar.gz`, which `mlserver` requires in order to recreate the environment used during training when serving the model.

- Copy the model directory to a Google Storage bucket that is accessible by `seldon-core`:

  ```bash
  $ gsutil cp -r ../model gs://seldon-models/test/elasticnet_wine_
  ```

- Deploy the model to `seldon-core`. To enable support for the Open Inference Protocol, it's enough to set the `protocol` of the `SeldonDeployment` to `v2`. For example:

  ```yaml
  apiVersion: machinelearning.seldon.io/v1alpha2
  kind: SeldonDeployment
  metadata:
    name: mlflow
  spec:
    protocol: v2  # Activate the Open Inference Protocol
    name: wines
    predictors:
      - graph:
          children: []
          implementation: MLFLOW_SERVER
          modelUri: gs://seldon-models/test/elasticnet_wine_
          name: classifier
        name: default
        replicas: 1
  ```

- Get predictions from the deployed model using REST:

  ```python
  import json

  import requests

  inference_request = {
      "parameters": {"content_type": "pd"},
      "inputs": [
          {"name": "fixed acidity", "shape": [1], "datatype": "FP32", "data": [7.4], "parameters": {"content_type": "np"}},
          {"name": "volatile acidity", "shape": [1], "datatype": "FP32", "data": [0.7000], "parameters": {"content_type": "np"}},
          {"name": "citric acid", "shape": [1], "datatype": "FP32", "data": [0], "parameters": {"content_type": "np"}},
          {"name": "residual sugar", "shape": [1], "datatype": "FP32", "data": [1.9], "parameters": {"content_type": "np"}},
          {"name": "chlorides", "shape": [1], "datatype": "FP32", "data": [0.076], "parameters": {"content_type": "np"}},
          {"name": "free sulfur dioxide", "shape": [1], "datatype": "FP32", "data": [11], "parameters": {"content_type": "np"}},
          {"name": "total sulfur dioxide", "shape": [1], "datatype": "FP32", "data": [34], "parameters": {"content_type": "np"}},
          {"name": "density", "shape": [1], "datatype": "FP32", "data": [0.9978], "parameters": {"content_type": "np"}},
          {"name": "pH", "shape": [1], "datatype": "FP32", "data": [3.51], "parameters": {"content_type": "np"}},
          {"name": "sulphates", "shape": [1], "datatype": "FP32", "data": [0.56], "parameters": {"content_type": "np"}},
          {"name": "alcohol", "shape": [1], "datatype": "FP32", "data": [9.4], "parameters": {"content_type": "np"}},
      ],
  }

  endpoint = "http://localhost:8003/seldon/seldon/mlflow/v2/models/infer"
  response = requests.post(endpoint, json=inference_request)
  print(json.dumps(response.json(), indent=2))
  ```
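For reference, here is a condensed, self-contained sketch of the core steps `sklearn_elasticnet_wine/train.py` performs. The synthetic data below is a stand-in for the wine-quality CSV that the real script reads, so only the MLflow logging flow is illustrative:

```python
import numpy as np
import mlflow
import mlflow.sklearn
from sklearn.linear_model import ElasticNet

# Synthetic stand-in for the wine-quality dataset (11 features), so the
# sketch runs without downloading anything.
rng = np.random.default_rng(42)
X = rng.random((100, 11))
y = rng.random(100)

with mlflow.start_run():
    model = ElasticNet(alpha=0.5, l1_ratio=0.5, random_state=42)
    model.fit(X, y)
    mlflow.log_param("alpha", 0.5)
    mlflow.log_param("l1_ratio", 0.5)
    # Persists the model, together with its MLmodel and conda.yaml files,
    # under mlruns/<experiment-id>/<run-id>/artifacts/model
    mlflow.sklearn.log_model(model, "model")
```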
### Caveats

- The version of `mlserver` installed in the conda environment will need to match the version supported by `seldon-core`. We are working on tooling to make this more seamless. A quick way to check which versions got packed is shown after this list.
- Check the caveats of using [`conda-pack`](https://conda.github.io/conda-pack/#caveats).
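To check the versions before running `conda pack`, you can query the package metadata from inside the activated environment. A minimal sketch using only the standard library:

```python
# Print the mlserver package versions installed in the active conda
# environment so they can be compared against the version that your
# seldon-core installation supports.
from importlib.metadata import version

for package in ("mlserver", "mlserver-mlflow"):
    print(package, version(package))
```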