General Predictive Services
This guide takes you through the steps to serve general predictive models through Seldon.
Seldon provides the ability to build supervised-learning predictive models and put them into production at scale. The structure of a general predictive pipeline is shown in the diagram below:
There are offline and real-time components. Raw data to be used for creating a predictive pipeline is sent to Seldon via its REST API in real time. This data can be sent as arbitrary JSON, which gives a client complete freedom to provide whatever data is available. This raw JSON is usually not in a format suited to building machine learning models directly. Therefore, in the offline modelling stage the data is first sent through an (optional) set of feature transformations that extract and create features useful for building predictive models. After these transformations a model can be built to predict some target feature in the data based on the extracted or transformed features.
At runtime, as prediction calls come in, the same set of transformations performed offline must be repeated so that the model is scored against the same final features it was built on. Once the transformations have been applied, the features can be scored and a predictive result returned to the client in real time.
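The point that one transformation must be shared by the offline and runtime stages can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration: the `price` and `country` fields, the function names, and the toy nearest-centroid model are hypothetical and are not part of Seldon's API or its Python library.

```python
import json
import math


def extract_features(raw_event: dict) -> list:
    """Turn one raw JSON event into a fixed-length numeric feature vector.

    The 'price' and 'country' fields are hypothetical examples.
    """
    price = float(raw_event.get("price", 0.0))
    country = raw_event.get("country", "")
    return [math.log1p(price), 1.0 if country == "UK" else 0.0]


def train(events, targets):
    """Offline stage: transform raw events, then fit a toy nearest-centroid model."""
    sums, counts = {}, {}
    for event, target in zip(events, targets):
        features = extract_features(event)
        acc = sums.setdefault(target, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[target] = counts.get(target, 0) + 1
    # One centroid (mean feature vector) per target label.
    return {t: [v / counts[t] for v in acc] for t, acc in sums.items()}


def predict(model, raw_json: str):
    """Runtime stage: apply the same transformation, then score against the model."""
    features = extract_features(json.loads(raw_json))

    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))

    return min(model, key=lambda target: squared_distance(model[target]))


# Offline: build the model from historical raw events.
model = train(
    [{"price": 1, "country": "UK"}, {"price": 100, "country": "US"}],
    ["a", "b"],
)
# Runtime: score an incoming raw JSON request with the same transformation.
prediction = predict(model, '{"price": 2, "country": "UK"}')
```

Because `extract_features` is the only place where raw JSON is turned into features, the offline and runtime paths cannot drift apart, which is the property the pipeline described above relies on.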
Set up a client in Seldon with seldon-cli client. This will create consumer keys that can be used in the next section.
Events can be sent to Seldon via the prediction API. They will be transferred from the Seldon Server(s) to a central store using Fluentd and stored as JSON at /seldon-data/logs/events.<year>/<month>/<day>/<hour>/file.gz if the default Fluentd configuration is kept. These files contain events for all the clients that have been set up, as many clients can use the same API via different JS and OAuth consumer keys. To split the events for each client into separate folders for processing by modelling jobs, we provide a Spark job that can be run using seldon-cli client --action processevents.
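To make the per-client separation concrete, here is a small local sketch of the same idea. It is not the Spark job that Seldon provides. It assumes, purely for illustration, that each gzip file holds one JSON object per line and that every event carries a `client` field; the field name, the function name, and the output layout are all hypothetical.

```python
import gzip
import json
from collections import defaultdict
from pathlib import Path


def split_events_by_client(input_files, output_dir, client_field="client"):
    """Group JSON-lines events by client and write one folder per client.

    Returns a dict mapping each client name to the number of events written.
    """
    grouped = defaultdict(list)
    for path in input_files:
        with gzip.open(path, "rt", encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if not line:
                    continue  # skip blank lines
                event = json.loads(line)
                # Events without the (hypothetical) client field go to "unknown".
                grouped[event.get(client_field, "unknown")].append(event)

    written = {}
    for client, events in grouped.items():
        client_dir = Path(output_dir) / client
        client_dir.mkdir(parents=True, exist_ok=True)
        with open(client_dir / "events.json", "w", encoding="utf-8") as out:
            for event in events:
                out.write(json.dumps(event) + "\n")
        written[client] = len(events)
    return written
```

This version holds every event in memory before writing, which is fine for a demonstration but is exactly the limitation a distributed job avoids on production-sized logs.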
You have other custom options for integration if needed:
- You can also place any historical data you may have into
- You can change the Fluentd configuration to push the data to a custom location
- You can bypass this section and use your own events datastore to build predictive models and serve them through Seldon
You can create your machine learning model using the toolkit of your choice. However, to make building predictive pipelines easier, and to allow them to be used at runtime as well as during modelling, we presently provide a Python library that allows you to create pandas and scikit-learn compatible predictive pipelines. For details on using these see here.
To serve predictions for your model you need to provide a runtime microservice, packaged as a Docker container, that conforms to the microservice prediction API. If your model is built using our Python predictive pipelines you can easily package it as a microservice.
Once packaged as a Docker container the microservice can be started using the command line script start-microservice.
The script creates a Kubernetes deployment for the microservice in kubernetes/conf/microservices. If the microservice is already running, Kubernetes will roll down the previous version and roll up the new version.
For example, to start the XGBoost Iris microservice on the client “test”:
The script will use the seldon-cli to update the “test” client to add the microservice as a runtime algorithm. You can now get predictions via the Seldon API.
Run A/B Tests
To test different prediction algorithms in a live setting you will want to run A/B tests.
If you have two microservices you want to test in an A/B test you can use the script start-microservice.
An example that starts two variations of the example Iris predictor, one using XGBoost and one using scikit-learn, and sends 50% of the traffic to each is shown below:
The script will start both microservices and provide the correct configuration via the Seldon CLI. The configuration used for the above example is shown below:
See here on how to use the Seldon CLI to set a custom prediction algorithm configuration.
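To picture what sending 50% of the traffic to each variant means at request time, the sketch below assigns users to variants by hashing a user id against cumulative traffic ratios. This guide does not describe how Seldon actually performs the split, so this is only an illustration of the general technique; the function name, the variant names, and the use of a user id as the routing key are all assumptions.

```python
import hashlib


def choose_variant(user_id: str, variants):
    """Deterministically map a user to a variant according to traffic ratios.

    `variants` is a list of (name, ratio) pairs; ratios need not sum to 1.
    """
    total = sum(ratio for _, ratio in variants)
    # Hash the user id to a stable point in [0, 1).
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    point = int(digest[:8], 16) / 16 ** 8
    cumulative = 0.0
    for name, ratio in variants:
        cumulative += ratio / total
        if point < cumulative:
            return name
    # Guard against floating-point rounding at the upper edge.
    return variants[-1][0]


# Hypothetical variant names mirroring the 50/50 example above.
VARIANTS = [("xgboost", 0.5), ("scikit-learn", 0.5)]
```

Hashing rather than drawing a random number per request keeps each user on the same variant for the whole test, which makes per-user outcome comparisons between the two predictors cleaner.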