Distributed Tracing Template

This template illustrates the configuration needed to enable distributed tracing with Jaeger.

Dependencies

Test using Minikube

Due to a `minikube/s2i issue <https://github.com/SeldonIO/seldon-core/issues/253>`__, you will need `s2i >= 1.1.13 <https://github.com/openshift/source-to-image/releases/tag/v1.1.13>`__.

[ ]:
!minikube start --memory 4096 --feature-gates=CustomResourceValidation=true --extra-config=apiserver.Authorization.Mode=RBAC
[ ]:
!kubectl create clusterrolebinding kube-system-cluster-admin --clusterrole=cluster-admin --serviceaccount=kube-system:default
[2]:
!kubectl create namespace seldon
namespace/seldon created
[3]:
!kubectl config set-context $(kubectl config current-context) --namespace=seldon
Context "minikube" modified.
[ ]:
!helm init
[ ]:
!kubectl rollout status deploy/tiller-deploy -n kube-system

Install Jaeger

We will use the Jaeger all-in-one resource from the Jaeger Kubernetes repo.

[4]:
!kubectl create -f https://raw.githubusercontent.com/jaegertracing/jaeger-kubernetes/master/all-in-one/jaeger-all-in-one-template.yml -n seldon
deployment.extensions/jaeger created
service/jaeger-query created
service/jaeger-collector created
service/jaeger-agent created
service/zipkin created

Start Jaeger UI

minikube service jaeger-query -n seldon

Install Seldon

[5]:
!helm install ../../../helm-charts/seldon-core-operator --name seldon-core --set usageMetrics.enabled=true --namespace seldon-system
NAME:   seldon-core
LAST DEPLOYED: Tue Apr 16 11:41:15 2019
NAMESPACE: seldon-system
STATUS: DEPLOYED

RESOURCES:
==> v1beta1/ClusterRole
NAME                        AGE
seldon-spartakus-volunteer  1s

==> v1beta1/ClusterRoleBinding
NAME                        AGE
seldon-spartakus-volunteer  1s

==> v1/Secret
NAME                                   TYPE    DATA  AGE
seldon-operator-webhook-server-secret  Opaque  0     1s

==> v1/ClusterRoleBinding
NAME                                 AGE
seldon-operator-manager-rolebinding  1s

==> v1/Service
NAME                                        TYPE       CLUSTER-IP   EXTERNAL-IP  PORT(S)  AGE
seldon-operator-controller-manager-service  ClusterIP  10.99.24.17  <none>       443/TCP  1s

==> v1/ServiceAccount
NAME                        SECRETS  AGE
seldon-spartakus-volunteer  1        1s

==> v1/StatefulSet
NAME                                DESIRED  CURRENT  AGE
seldon-operator-controller-manager  1        1        1s

==> v1/Pod(related)
NAME                                  READY  STATUS             RESTARTS  AGE
seldon-operator-controller-manager-0  0/1    ContainerCreating  0         1s

==> v1/ConfigMap
NAME                     DATA  AGE
seldon-spartakus-config  3     1s

==> v1beta1/CustomResourceDefinition
NAME                                         AGE
seldondeployments.machinelearning.seldon.io  1s

==> v1/ClusterRole
seldon-operator-manager-role  1s

==> v1beta1/Deployment
NAME                        DESIRED  CURRENT  UP-TO-DATE  AVAILABLE  AGE
seldon-spartakus-volunteer  1        0        0           0          1s


NOTES:
NOTES: TODO


[6]:
!kubectl rollout status statefulset.apps/seldon-operator-controller-manager -n seldon-system
partitioned roll out complete: 1 new pods have been updated...

Setup Ingress

The latest Ambassador release has gRPC issues, so we recommend version 0.40.2 until they are fixed.

[7]:
!helm install stable/ambassador --name ambassador --set image.tag=0.40.2
NAME:   ambassador
LAST DEPLOYED: Tue Apr 16 11:42:03 2019
NAMESPACE: seldon
STATUS: DEPLOYED

RESOURCES:
==> v1/Service
NAME               TYPE          CLUSTER-IP      EXTERNAL-IP  PORT(S)                     AGE
ambassador-admins  ClusterIP     10.101.227.241  <none>       8877/TCP                    0s
ambassador         LoadBalancer  10.110.126.111  <pending>    80:31342/TCP,443:31890/TCP  0s

==> v1/Deployment
NAME        DESIRED  CURRENT  UP-TO-DATE  AVAILABLE  AGE
ambassador  3        3        3           0          0s

==> v1/Pod(related)
NAME                         READY  STATUS             RESTARTS  AGE
ambassador-5b89d44544-jzwlp  0/1    ContainerCreating  0         0s
ambassador-5b89d44544-wf62d  0/1    ContainerCreating  0         0s
ambassador-5b89d44544-wjvhh  0/1    ContainerCreating  0         0s

==> v1/ServiceAccount
NAME        SECRETS  AGE
ambassador  1        0s

==> v1beta1/ClusterRole
NAME        AGE
ambassador  0s

==> v1beta1/ClusterRoleBinding
NAME        AGE
ambassador  0s


NOTES:
Congratuations! You've successfully installed Ambassador.

For help, visit our Slack at https://d6e.co/slack or view the documentation online at https://www.getambassador.io.

To get the IP address of Ambassador, run the following commands:
NOTE: It may take a few minutes for the LoadBalancer IP to be available.
     You can watch the status of by running 'kubectl get svc -w  --namespace seldon ambassador'

  On GKE/Azure:
  export SERVICE_IP=$(kubectl get svc --namespace seldon ambassador -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

  On AWS:
  export SERVICE_IP=$(kubectl get svc --namespace seldon ambassador -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')

  echo http://$SERVICE_IP:

[8]:
!kubectl rollout status deployment.apps/ambassador
Waiting for deployment "ambassador" rollout to finish: 0 of 3 updated replicas are available...
Waiting for deployment "ambassador" rollout to finish: 1 of 3 updated replicas are available...
Waiting for deployment "ambassador" rollout to finish: 2 of 3 updated replicas are available...
deployment "ambassador" successfully rolled out

Create Jaeger ConfigMap

[9]:
!pygmentize tracing-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: tracing-config
data:
  tracing.yml: |
    sampler:
      type: const
      param: 1
    local_agent:
      reporting_host: jaeger-agent
      reporting_port: 5775
    logging: true
[5]:
!kubectl apply -f tracing-configmap.yaml -n seldon
configmap/tracing-config unchanged
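
The ConfigMap above can also be generated rather than written by hand, which helps when the agent host or sampler settings differ between clusters. The following is a minimal sketch, not part of the Seldon tooling: the function name `build_tracing_configmap` and its parameters are hypothetical, and the defaults simply mirror the values shown in `tracing-configmap.yaml`.

```python
import json


def build_tracing_configmap(agent_host="jaeger-agent", agent_port=5775,
                            sampler_type="const", sampler_param=1,
                            name="tracing-config"):
    """Return a ConfigMap manifest (as a dict) holding a Jaeger client config.

    Hypothetical helper: the defaults mirror tracing-configmap.yaml above.
    """
    # The embedded tracing.yml is kept as a plain string, exactly as it
    # would appear under the "tracing.yml: |" key of the ConfigMap.
    tracing_yml = (
        "sampler:\n"
        f"  type: {sampler_type}\n"
        f"  param: {sampler_param}\n"
        "local_agent:\n"
        f"  reporting_host: {agent_host}\n"
        f"  reporting_port: {agent_port}\n"
        "logging: true\n"
    )
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        "data": {"tracing.yml": tracing_yml},
    }


if __name__ == "__main__":
    # Print the manifest as JSON so it can be redirected to a file.
    print(json.dumps(build_tracing_configmap(), indent=2))
```

Since kubectl accepts JSON manifests as well as YAML, the printed output could be saved to a file and applied with `kubectl apply -f <file> -n seldon`.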

Run Example REST Deployment

[11]:
!pygmentize deployment_rest.json
{
    "apiVersion": "machinelearning.seldon.io/v1alpha2",
    "kind": "SeldonDeployment",
    "metadata": {
        "labels": {
            "app": "seldon"
        },
        "name": "tracing-example",
        "namespace": "seldon"
    },
    "spec": {
        "name": "tracing-example",
        "oauth_key": "oauth-key",
        "oauth_secret": "oauth-secret",
        "predictors": [
            {
                "componentSpecs": [{
                    "spec": {
                        "containers": [
                            {
                                "name": "model1",
                                "image": "seldonio/mock_classifier_rest:1.1",
                                "env": [
                                    {
                                        "name": "TRACING",
                                        "value": "1"
                                    },
                                    {
                                        "name": "JAEGER_CONFIG_PATH",
                                        "value": "/etc/tracing/config/tracing.yml"
                                    }
                                ],
                                "volumeMounts": [
                                    {
                                        "mountPath": "/etc/tracing/config",
                                        "name": "tracing-config"
                                    }
                                ]
                            }
                        ],
                        "terminationGracePeriodSeconds": 1,
                        "volumes": [
                            {
                                "name": "tracing-config",
                                "volumeSource" : {
                                    "configMap": {
                                        "localObjectReference" :
                                        {
                                            "name": "tracing-config"
                                        },
                                        "items": [
                                            {
                                                "key": "tracing.yml",
                                                "path":  "tracing.yml"
                                            }
                                        ]
                                    }
                                }
                            }
                        ]
                    }
                }],
                "graph": {
                    "name": "model1",
                    "endpoint": { "type" : "REST" },
                    "type": "MODEL",
                    "children": [
                    ]
                },
                "name": "tracing",
                "replicas": 1,
                "svcOrchSpec" : {
                    "env": [
                        {
                            "name": "TRACING",
                            "value": "1"
                        },
                        {
                            "name": "JAEGER_AGENT_HOST",
                            "value": "jaeger-agent"
                        },
                        {
                            "name": "JAEGER_AGENT_PORT",
                            "value": "5775"
                        },
                        {
                            "name": "JAEGER_SAMPLER_TYPE",
                            "value": "const"
                        },
                        {
                            "name": "JAEGER_SAMPLER_PARAM",
                            "value": "1"
                        }
                    ]
                }
            }
        ]
    }
}
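
The deployment above wires tracing in three places: the `TRACING` and `JAEGER_CONFIG_PATH` variables on the model container, the `tracing-config` volume mount that makes the file available, and the `TRACING` variable on the service orchestrator. A small consistency check can catch a mismatch before the resource is created. This is only a sketch under the assumption that the deployment follows the layout shown above; `check_tracing` and its helpers are hypothetical names, not part of Seldon Core.

```python
import posixpath


def check_tracing(deployment):
    """Return a list of problems found; an empty list means all checks passed.

    Hypothetical helper that assumes the SeldonDeployment layout shown above.
    """
    problems = []
    for predictor in deployment["spec"]["predictors"]:
        for component in predictor.get("componentSpecs", []):
            pod_spec = component["spec"]
            volume_names = {v["name"] for v in pod_spec.get("volumes", [])}
            for container in pod_spec.get("containers", []):
                problems += _check_container(container, volume_names)
        # The service orchestrator needs its own TRACING flag.
        orch_env = _env_dict(predictor.get("svcOrchSpec", {}))
        if orch_env.get("TRACING") != "1":
            problems.append(
                f"predictor {predictor.get('name')}: svcOrchSpec lacks TRACING=1"
            )
    return problems


def _env_dict(spec):
    """Flatten a Kubernetes-style env list into a name -> value dict."""
    return {e["name"]: e.get("value") for e in spec.get("env", [])}


def _check_container(container, volume_names):
    name = container["name"]
    env = _env_dict(container)
    problems = []
    if env.get("TRACING") != "1":
        problems.append(f"{name}: TRACING is not '1'")
    config_path = env.get("JAEGER_CONFIG_PATH")
    if not config_path:
        problems.append(f"{name}: JAEGER_CONFIG_PATH is not set")
        return problems
    # The config file must live directly inside a mounted directory.
    config_dir = posixpath.dirname(config_path)
    mounts = [m for m in container.get("volumeMounts", [])
              if m["mountPath"].rstrip("/") == config_dir]
    if not mounts:
        problems.append(f"{name}: no volumeMount covers {config_dir}")
    for mount in mounts:
        if mount["name"] not in volume_names:
            problems.append(f"{name}: volume '{mount['name']}' is not declared")
    return problems
```

For example, loading `deployment_rest.json` with `json.load` and passing the result to `check_tracing` should return an empty list if the file matches the listing above.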
[6]:
!kubectl create -f deployment_rest.json
seldondeployment.machinelearning.seldon.io/tracing-example created
[7]:
!kubectl rollout status deployment/tracing-example-tracing-535f3a8
Waiting for deployment "tracing-example-tracing-535f3a8" rollout to finish: 0 of 1 updated replicas are available...
deployment "tracing-example-tracing-535f3a8" successfully rolled out
[8]:
!seldon-core-api-tester contract.json `minikube ip` `kubectl get svc ambassador -o jsonpath='{.spec.ports[0].nodePort}'` \
    tracing-example --namespace seldon -p
----------------------------------------
SENDING NEW REQUEST:

[[ 0.871  0.027 -0.599]]
RECEIVED RESPONSE:
meta {
  puid: "kl0gv9mc7pe39kst3n7loeu6va"
  requestPath {
    key: "model1"
    value: "seldonio/mock_classifier_rest:1.1"
  }
}
data {
  names: "proba"
  ndarray {
    values {
      list_value {
        values {
          number_value: 0.056412411380098754
        }
      }
    }
  }
}
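
The API tester hides the HTTP call it makes. For readers who want to generate a trace without it, the sketch below builds an equivalent REST request using only the Python standard library. Two things here are assumptions and should be checked against your Seldon Core version: the Ambassador URL path (`/seldon/<namespace>/<deployment>/api/v0.1/predictions`; some releases omit the namespace segment) and the `ndarray` payload shape. The function name `build_prediction_request` is hypothetical.

```python
import json
import urllib.request


def build_prediction_request(host, port, deployment, namespace, values):
    """Build (but do not send) a POST request for a Seldon REST prediction.

    ASSUMPTION: the URL path and payload layout below are illustrative and
    may differ between Seldon Core releases.
    """
    url = (
        f"http://{host}:{port}/seldon/{namespace}/{deployment}"
        "/api/v0.1/predictions"
    )
    # One row of features, wrapped as a 2-D ndarray.
    payload = {"data": {"ndarray": [values]}}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The host and port would come from `minikube ip` and the Ambassador node port, as in the command above. Passing the returned object to `urllib.request.urlopen` sends the request, after which a new trace should appear in Jaeger.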


Check the Jaeger UI. You should be able to find traces like the one below:

[Figure: Jaeger UI trace for the REST deployment]

[9]:
!kubectl delete -f deployment_rest.json
seldondeployment.machinelearning.seldon.io "tracing-example" deleted

Run Example GRPC Deployment

[10]:
!kubectl create -f deployment_grpc.json
seldondeployment.machinelearning.seldon.io/tracing-example created
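
The contents of `deployment_grpc.json` are not listed here. Judging from the response further down, it uses the `seldonio/mock_classifier_grpc:1.1` image. The sketch below shows one way such a variant could be derived from the REST deployment. It rests on two assumptions: that the gRPC file differs only in the container image and the graph endpoint type, and that the endpoint type value is `"GRPC"`. The function names are hypothetical.

```python
import copy


def to_grpc_variant(rest_deployment,
                    grpc_image="seldonio/mock_classifier_grpc:1.1"):
    """Return a copy of a REST SeldonDeployment switched to gRPC.

    ASSUMPTION: only the image and the graph endpoint type need to change.
    Every container receives the same image, which is only appropriate for
    single-container examples like the one in this notebook.
    """
    grpc = copy.deepcopy(rest_deployment)  # leave the input untouched
    for predictor in grpc["spec"]["predictors"]:
        for component in predictor.get("componentSpecs", []):
            for container in component["spec"].get("containers", []):
                container["image"] = grpc_image
        _set_endpoint_type(predictor["graph"], "GRPC")
    return grpc


def _set_endpoint_type(node, endpoint_type):
    """Recursively set the endpoint type on a graph node and its children."""
    node.setdefault("endpoint", {})["type"] = endpoint_type
    for child in node.get("children", []):
        _set_endpoint_type(child, endpoint_type)
```

The tracing environment variables and the ConfigMap volume are copied unchanged, so the same Jaeger settings apply to both variants.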
[11]:
!kubectl rollout status deployment/tracing-example-tracing-d240ae0
Waiting for deployment "tracing-example-tracing-d240ae0" rollout to finish: 0 of 1 updated replicas are available...
deployment "tracing-example-tracing-d240ae0" successfully rolled out
[12]:
!seldon-core-api-tester contract.json `minikube ip` `kubectl get svc ambassador -o jsonpath='{.spec.ports[0].nodePort}'` \
    tracing-example --namespace seldon -p --grpc
----------------------------------------
SENDING NEW REQUEST:

[[-0.024 -0.489 -0.362]]
RECEIVED RESPONSE:
meta {
  puid: "7m4ikainug51qqsq626cos499o"
  requestPath {
    key: "model1"
    value: "seldonio/mock_classifier_grpc:1.1"
  }
}
data {
  names: "proba"
  ndarray {
    values {
      list_value {
        values {
          number_value: 0.03885332622207222
        }
      }
    }
  }
}


Check the Jaeger UI. You should be able to find traces like the one below:

[Figure: Jaeger UI trace for the gRPC deployment]

[13]:
!kubectl delete -f deployment_grpc.json
seldondeployment.machinelearning.seldon.io "tracing-example" deleted