Custom Init Containers with rclone and MinIO

In this tutorial we take a deep dive into some of the internals of how Storage Initializers are used by Prepackaged Model Servers.

We will also write a custom Init Container that will use rclone to download model artifacts from the in-cluster MinIO storage.

Prerequisites

  • A Kubernetes cluster with kubectl configured

  • curl

Setup Seldon Core

Use the setup notebook to set up a cluster with Ambassador Ingress and install Seldon Core. Instructions are also available online.

Setup MinIO

Use the provided notebook to install MinIO in your cluster and configure the mc CLI tool. Instructions are also available online.

Copy iris model into local MinIO

[1]:
%%bash
mc config host add gcs https://storage.googleapis.com "" ""

mc mb minio-seldon/iris -p
mc cp gcs/seldon-models/v1.15.0-dev/sklearn/iris/model.joblib minio-seldon/iris/
mc cp gcs/seldon-models/v1.15.0-dev/sklearn/iris/metadata.yaml minio-seldon/iris/
Added `gcs` successfully.
Bucket created successfully `minio-seldon/iris`.
`gcs/seldon-models/sklearn/iris/model.joblib` -> `minio-seldon/iris/model.joblib`
Total: 0 B, Transferred: 1.06 KiB, Speed: 10.35 KiB/s
`gcs/seldon-models/sklearn/iris/metadata.yaml` -> `minio-seldon/iris/metadata.yaml`
Total: 0 B, Transferred: 162 B, Speed: 1.35 KiB/s
[2]:
%%bash
mc ls minio-seldon/iris/
[2021-02-09 18:11:17 GMT]    162B metadata.yaml
[2021-02-09 18:11:16 GMT]  1.1KiB model.joblib

Init Containers Deep Dive

Typically, when using a Prepackaged Model Server such as the SKLearn server, one defines a Seldon Deployment as follows:

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: sklearn-default-init-container
spec:
  name: iris
  predictors:
  - graph:
      implementation: SKLEARN_SERVER
      modelUri: gs://seldon-models/v1.17.0-dev/sklearn/iris
      envSecretRefName: seldon-init-container-secret
      name: classifier
    name: default
    replicas: 1

This uses the default storage initializer defined in the Helm values, e.g.:

storageInitializer:
  image: kfserving/storage-initializer:v0.6.1

A few things effectively happen here:

  • an emptyDir: {} volume is created and mounted into both the model container classifier and the init container classifier-model-initializer

  • the contents of the seldon-init-container-secret secret are exposed inside the init container as environment variables

  • the init container is called with two arguments: the source of the artifacts to download, gs://seldon-models/v1.17.0-dev/sklearn/iris, and the destination, /mnt/models

This is well illustrated by the following effective resource definition:

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: custom-init-container
spec:
  name: iris
  predictors:
  - componentSpecs:
    - spec:
        containers:
        - name: classifier
          volumeMounts:
          - mountPath: /mnt/models
            name: classifier-provision-location
            readOnly: true

        initContainers:
        - name: classifier-model-initializer
          image: kfserving/storage-initializer:v0.6.1
          imagePullPolicy: IfNotPresent
          args:
          - gs://seldon-models/v1.17.0-dev/sklearn/iris
          - /mnt/models

          envFrom:
          - secretRef:
              name: seldon-init-container-secret

          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File

          volumeMounts:
          - mountPath: /mnt/models
            name: classifier-provision-location

        volumes:
        - emptyDir: {}
          name: classifier-provision-location

    graph:
      children: []
      implementation: SKLEARN_SERVER
      modelUri: gs://seldon-models/v1.17.0-dev/sklearn/iris
      name: classifier
    name: default
    replicas: 1

Note:

  • The init container name is constructed from the ${predictiveUnitContainerName}-model-initializer pattern.

  • If an init container with a name matching this pattern is provided explicitly, Seldon Core won’t create one automatically.
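The naming rule and the two positional arguments described above can be sketched in Python. This only illustrates the shape of the resulting init container; it is not Seldon Core's actual implementation. The helper name build_init_container and its parameters are hypothetical, and deriving the volume name from the container name is an assumption based on the single example shown above.

```python
def build_init_container(container_name, model_uri, image, secret_name=None,
                         mount_path="/mnt/models"):
    """Return a dict mirroring the init container shown above (illustrative only)."""
    spec = {
        # Name follows the ${predictiveUnitContainerName}-model-initializer pattern.
        "name": f"{container_name}-model-initializer",
        "image": image,
        # Two positional arguments: source first, then destination.
        "args": [model_uri, mount_path],
        "volumeMounts": [
            # Assumption: the volume name is derived from the container name,
            # as in the "classifier-provision-location" example above.
            {"mountPath": mount_path, "name": f"{container_name}-provision-location"}
        ],
    }
    if secret_name:
        # Keys of the secret become environment variables in the init container.
        spec["envFrom"] = [{"secretRef": {"name": secret_name}}]
    return spec


init = build_init_container(
    "classifier",
    "gs://seldon-models/v1.17.0-dev/sklearn/iris",
    "kfserving/storage-initializer:v0.6.1",
    secret_name="seldon-init-container-secret",
)
print(init["name"])  # classifier-model-initializer
print(init["args"])
```

Supplying an init container under this exact name yourself is what the next section relies on to replace the default one.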

Custom Init Container (full inline definition)

We will now define an init container that uses the rclone/rclone:latest image, with a fully explicit definition.

Note: currently, if an init container with a matching name is provided manually, it is used as is.

[3]:
%%writefile explicit-init-definition.yaml
apiVersion: v1
kind: Secret
metadata:
  name: mysecret
type: Opaque
stringData:
  rclone.conf: |
    [cluster-minio]
    type = s3
    provider = minio
    env_auth = false
    access_key_id = minioadmin
    secret_access_key = minioadmin
    endpoint = http://minio.minio-system.svc.cluster.local:9000


---

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: explicit-init-definition
spec:
  name: iris
  predictors:
  - componentSpecs:
    - spec:
        containers:
        - name: classifier
          volumeMounts:
          - mountPath: /mnt/models
            name: classifier-provision-location
            readOnly: true

        initContainers:
        - name: classifier-model-initializer
          image: rclone/rclone:latest
          imagePullPolicy: IfNotPresent

          args:
            - "copy"
            - "cluster-minio:sklearn/iris"
            - "/mnt/models"

          volumeMounts:
          - mountPath: /mnt/models
            name: classifier-provision-location

          - name: config
            mountPath: "/config/rclone"
            readOnly: true

        volumes:
        - name: classifier-provision-location
          emptyDir: {}

        - name: config
          secret:
            secretName: mysecret

    graph:
      implementation: SKLEARN_SERVER
      modelUri: "dummy value"
      name: classifier
    name: default
    replicas: 1
Overwriting explicit-init-definition.yaml
[4]:
!kubectl apply -f explicit-init-definition.yaml
secret/mysecret created
seldondeployment.machinelearning.seldon.io/explicit-init-definition created
[5]:
!kubectl rollout status deploy/$(kubectl get deploy -l seldon-deployment-id=explicit-init-definition -o jsonpath='{.items[0].metadata.name}')
Waiting for deployment "explicit-init-definition-default-0-classifier" rollout to finish: 0 of 1 updated replicas are available...
deployment "explicit-init-definition-default-0-classifier" successfully rolled out
[6]:
%%bash
curl -s -X POST -H 'Content-Type: application/json' \
    -d '{"data":{"ndarray":[[5.964, 4.006, 2.081, 1.031]]}}' \
    http://localhost:8003/seldon/seldon/explicit-init-definition/api/v1.0/predictions  | jq .
{
  "data": {
    "names": [
      "t:0",
      "t:1",
      "t:2"
    ],
    "ndarray": [
      [
        0.9548873249364169,
        0.04505474761561406,
        5.7927447968952436e-05
      ]
    ]
  },
  "meta": {
    "requestPath": {
      "classifier": "seldonio/sklearnserver:1.6.0-dev"
    }
  }
}
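The same request can also be prepared from Python. The following is a minimal sketch using only the standard library, assuming the ingress is reachable on localhost:8003 and the deployment lives in the seldon namespace, as in the curl call above. The helper names build_prediction_request and predicted_class are hypothetical.

```python
import json
import urllib.request


def build_prediction_request(deployment, rows, host="http://localhost:8003",
                             namespace="seldon"):
    """Build (but do not send) a POST request for the predictions endpoint."""
    url = f"{host}/seldon/{namespace}/{deployment}/api/v1.0/predictions"
    body = json.dumps({"data": {"ndarray": rows}}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )


def predicted_class(response):
    """Return the index of the highest score in the first output row."""
    scores = response["data"]["ndarray"][0]
    return max(range(len(scores)), key=scores.__getitem__)


request = build_prediction_request(
    "explicit-init-definition", [[5.964, 4.006, 2.081, 1.031]]
)

# Sending the request needs the cluster and port-forward to be up:
# with urllib.request.urlopen(request) as resp:
#     print(predicted_class(json.load(resp)))
```

For the response shown above, predicted_class would return 0, since the first entry of ndarray is the largest.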

Custom Init Container (the right way)

It is also possible to prepare a custom init container image in order to:

  • use it on an individual deployment

  • set it as a new default

For this purpose we need to build a Docker image whose entrypoint accepts two arguments:

  • source

  • destination

Because copying artifacts with rclone between two locations is done with

rclone copy source destination

we prepare the following Dockerfile:

[7]:
%%writefile Dockerfile
FROM rclone/rclone:latest
ENTRYPOINT ["rclone", "copy"]
Overwriting Dockerfile
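Since Kubernetes appends a container's args to the image ENTRYPOINT when no command is set, the two init container arguments end up as the source and destination of rclone copy. A small sketch of that composition follows; the helper name effective_command is hypothetical, and the remote path is borrowed from the explicit definition earlier purely for illustration.

```python
import shlex

ENTRYPOINT = ["rclone", "copy"]  # as declared in the Dockerfile above


def effective_command(model_uri, destination="/mnt/models"):
    """Combine the image ENTRYPOINT with the two init container arguments."""
    return ENTRYPOINT + [model_uri, destination]


cmd = effective_command("cluster-minio:sklearn/iris")
print(shlex.join(cmd))  # rclone copy cluster-minio:sklearn/iris /mnt/models
```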

This example image is built and published as seldonio/rclone-init-container-example.

The rclone tool can be configured using either an rclone.conf config file (as above) or environment variables of the form RCLONE_CONFIG_<REMOTE>_<OPTION>. Note the remote name mys3 embedded in the names of the environment variables defined in the following secret:

[8]:
%%writefile seldon-reclone-secret.yaml

apiVersion: v1
kind: Secret
metadata:
  name: seldon-rclone-secret
type: Opaque
stringData:
  RCLONE_CONFIG_MYS3_TYPE: s3
  RCLONE_CONFIG_MYS3_PROVIDER: minio
  RCLONE_CONFIG_MYS3_ENV_AUTH: "false"
  RCLONE_CONFIG_MYS3_ACCESS_KEY_ID: minioadmin
  RCLONE_CONFIG_MYS3_SECRET_ACCESS_KEY: minioadmin
  RCLONE_CONFIG_MYS3_ENDPOINT: http://minio.minio-system.svc.cluster.local:9000
Overwriting seldon-reclone-secret.yaml
[9]:
!kubectl apply -f seldon-reclone-secret.yaml
secret/seldon-rclone-secret configured
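The correspondence between an rclone.conf section and the environment variable names used in the secret can be sketched as below. This is an illustrative helper, not part of rclone or Seldon Core, and the function name rclone_conf_to_env is hypothetical. It assumes remote names made only of letters, digits and underscores, since the name has to fit into a valid environment variable name.

```python
import configparser


def rclone_conf_to_env(conf_text):
    """Map rclone.conf sections to RCLONE_CONFIG_<REMOTE>_<OPTION> variables."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    env = {}
    for remote in parser.sections():
        # Assumption: only names usable inside an environment variable are accepted.
        if not remote.replace("_", "").isalnum():
            raise ValueError(f"remote name not usable in an env var: {remote!r}")
        for option, value in parser[remote].items():
            env[f"RCLONE_CONFIG_{remote.upper()}_{option.upper()}"] = value
    return env


conf = """\
[mys3]
type = s3
provider = minio
env_auth = false
access_key_id = minioadmin
secret_access_key = minioadmin
endpoint = http://minio.minio-system.svc.cluster.local:9000
"""

for name, value in rclone_conf_to_env(conf).items():
    print(f"{name}: {value}")
```

The printed names match the keys of the seldon-rclone-secret defined above.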

With the above in place, we can easily define our sklearn server using modelUri: mys3:sklearn/iris:

[10]:
%%writefile rclone-default-init.yaml

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: rclone-as-default-init-container
spec:
  name: iris
  predictors:
  - name: default
    replicas: 1
    graph:
      name: classifier
      implementation: SKLEARN_SERVER
      modelUri: mys3:sklearn/iris
      storageInitializerImage: seldonio/rclone-init-container-example:0.1
      envSecretRefName: seldon-rclone-secret
Overwriting rclone-default-init.yaml
[11]:
!kubectl apply -f rclone-default-init.yaml
seldondeployment.machinelearning.seldon.io/rclone-as-default-init-container created
[12]:
!kubectl rollout status deploy/$(kubectl get deploy -l seldon-deployment-id=rclone-as-default-init-container -o jsonpath='{.items[0].metadata.name}')
Waiting for deployment "rclone-as-default-init-container-default-0-classifier" rollout to finish: 0 of 1 updated replicas are available...
deployment "rclone-as-default-init-container-default-0-classifier" successfully rolled out
[13]:
%%bash
curl -s -X POST -H 'Content-Type: application/json' \
    -d '{"data":{"ndarray":[[5.964, 4.006, 2.081, 1.031]]}}' \
    http://localhost:8003/seldon/seldon/rclone-as-default-init-container/api/v1.0/predictions  | jq .
{
  "data": {
    "names": [
      "t:0",
      "t:1",
      "t:2"
    ],
    "ndarray": [
      [
        0.9548873249364169,
        0.04505474761561406,
        5.7927447968952436e-05
      ]
    ]
  },
  "meta": {
    "requestPath": {
      "classifier": "seldonio/sklearnserver:1.6.0-dev"
    }
  }
}

Set new default storage initializer

To use our newly created init container as the default, configure the Seldon Core installation by setting the following Helm values:

storageInitializer:
  image: seldonio/rclone-init-container-example:0.1

predictiveUnit:
  defaultEnvSecretRefName: seldon-rclone-secret

Cleanup

[14]:
%%bash
kubectl delete -f explicit-init-definition.yaml
kubectl delete -f rclone-default-init.yaml
secret "mysecret" deleted
seldondeployment.machinelearning.seldon.io "explicit-init-definition" deleted
seldondeployment.machinelearning.seldon.io "rclone-as-default-init-container" deleted