View Operator Logs

    This section provides information about how to diagnose and troubleshoot problems with the Couchbase Operator or your deployment.

    When troubleshooting the Couchbase Operator, it is important to rule out Kubernetes itself as the root cause of the problem you are experiencing. See the Kubernetes Troubleshooting Guide for information about debugging applications within a Kubernetes cluster.

    The following topics are also helpful to understand when troubleshooting the Operator:

    Full Deployment Logs

    The Operator is distributed with a support tool that automatically collects resources, logs, and events from the Kubernetes cluster for use in support cases. It can also collect logs from Couchbase Server instances via cbcollect_info. Please see the documentation.

    Operator Logs

    The Couchbase Operator generates logs that can help troubleshoot your deployment. Using kubectl or oc, you can choose to print the Operator logs to stdout.

    On Kubernetes, get the name of the Operator pod:

    $ kubectl get po -lapp=couchbase-operator
    NAME                                  READY     STATUS    RESTARTS   AGE
    couchbase-operator-1917615544-h20bm   1/1       Running   0          20h

    Get the operator logs:

    $ kubectl logs couchbase-operator-1917615544-h20bm
    time="2018-01-23T22:56:34Z" level=info msg="couchbase-operator v1.1.0 (release)" module=main
    time="2018-01-23T22:56:34Z" level=info msg="Obtaining resource lock" module=main
    time="2018-01-23T22:56:34Z" level=info msg="Starting event recorder" module=main
    time="2018-01-23T22:56:34Z" level=info msg="Attempting to be elected the couchbase-operator leader" module=main
    time="2018-01-23T22:56:51Z" level=info msg="I'm the leader, attempt to start the operator" module=main
    time="2018-01-23T22:56:51Z" level=info msg="Creating the couchbase-operator controller" module=main

    You can also read the logs through the deployment. Because the deployment runs only one instance of the Operator, kubectl automatically selects the correct pod:

    $ kubectl logs deployment/couchbase-operator

    On OpenShift, get the name of the Operator pod:

    $ oc get po -lapp=couchbase-operator
    NAME                                  READY     STATUS    RESTARTS   AGE
    couchbase-operator-1917615544-h20bm   1/1       Running   0          20h

    Get the operator logs:

    $ oc logs couchbase-operator-1917615544-h20bm
    time="2018-01-23T22:56:34Z" level=info msg="couchbase-operator v1.1.0 (release)" module=main
    time="2018-01-23T22:56:34Z" level=info msg="Obtaining resource lock" module=main
    time="2018-01-23T22:56:34Z" level=info msg="Starting event recorder" module=main
    time="2018-01-23T22:56:34Z" level=info msg="Attempting to be elected the couchbase-operator leader" module=main
    time="2018-01-23T22:56:51Z" level=info msg="I'm the leader, attempt to start the operator" module=main
    time="2018-01-23T22:56:51Z" level=info msg="Creating the couchbase-operator controller" module=main

    You can also read the logs through the deployment. Because the deployment runs only one instance of the Operator, oc automatically selects the correct pod:

    $ oc logs deployment/couchbase-operator

    Watch for the following messages, which indicate that the Operator is unable to reconcile your cluster into the desired state:

    • Logs with level=error

    • Operator is unable to get cluster state after N retries
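    The two indicators above can be picked out of a long log with a small filter. The following is a minimal sketch, not part of the Operator tooling: the function name filter_operator_problems is hypothetical, and the exact wording of the retry message is assumed to contain the phrase "unable to get cluster state".

```shell
# Hypothetical helper: read Operator log lines on stdin and print only the
# lines that suggest a reconciliation problem.
# Assumption: the retry message contains "unable to get cluster state".
filter_operator_problems() {
  # "|| true" keeps the exit status at zero when nothing matches.
  grep -E 'level=error|unable to get cluster state' || true
}

# Typical use against a live cluster (not run here):
#   kubectl logs deployment/couchbase-operator | filter_operator_problems
#   kubectl logs --since=1h deployment/couchbase-operator | filter_operator_problems
```

    An empty result only means that neither pattern appeared in the selected log window; it does not prove that the cluster is healthy.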

    Profiling the Operator

    The Couchbase Operator serves profiling data on its default listenAddress, localhost:8080. You can access this endpoint by running a remote shell or by forwarding the port to your local system.

    On Kubernetes, access goroutine stack backtraces via a shell:

    $ kubectl exec -it couchbase-operator-599bcf47f-8wswh -- sh
    $ wget -O- 'http://localhost:8080/debug/pprof/goroutine?debug=1' | less

    Access Go memory usage via a port forward:

    $ kubectl port-forward couchbase-operator-599bcf47f-8wswh 8080:8080
    $ go tool pprof localhost:8080/debug/pprof/heap
    (pprof) traces

    On OpenShift, access goroutine stack backtraces via a shell:

    $ oc exec -it couchbase-operator-599bcf47f-8wswh -- sh
    $ wget -O- 'http://localhost:8080/debug/pprof/goroutine?debug=1' | less

    Access Go memory usage via a port forward:

    $ oc port-forward couchbase-operator-599bcf47f-8wswh 8080:8080
    $ go tool pprof localhost:8080/debug/pprof/heap
    (pprof) traces

    For additional details on the Go pprof feature, see the official Go documentation.

    Couchbase Server Logs

    In most situations, the cbopinfo command collects and downloads logs successfully. Collection can fail in some cases, for example when a stateful service crashes repeatedly and the Operator keeps recovering the pod. In that situation the pod is not alive long enough for logs to be collected, so a manual collection method is provided.

    The general log collection process is as follows:

    1. Pause the Operator for the cluster by setting spec.paused to true

    2. Create a temporary pod resource with the persistent volumes mounted

    3. Run the cbcollect_info command

    4. Download the logs from the pod

    5. Delete the temporary pod

    6. Unpause the Operator by unsetting spec.paused
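    Steps 1 and 6 can be sketched with kubectl patch. This is an illustration under stated assumptions rather than the documented procedure: the helper names paused_patch and set_paused are hypothetical, the cluster name cb-example is inferred from the pod name cb-example-0005, and the custom resource is assumed to be addressable as couchbasecluster in the current namespace.

```shell
# Build the JSON merge patch that sets spec.paused to the given value
# (pass "true" or "false").
paused_patch() {
  printf '{"spec":{"paused":%s}}' "$1"
}

# Hypothetical wrapper: pause or resume Operator management of a cluster.
# Arguments: cluster name, then "true" or "false".
# Assumption: the CouchbaseCluster resource is reachable as "couchbasecluster".
set_paused() {
  kubectl patch couchbasecluster "$1" --type merge --patch "$(paused_patch "$2")"
}

# Example against a live cluster (not run here):
#   set_paused cb-example true     # step 1: pause the Operator
#   set_paused cb-example false    # step 6: resume the Operator
```

    Setting the field back to false is used here as the way to unpause; editing the resource and removing the field by hand would be an alternative.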

    Creating a Temporary Pod

    The basic template will look like the following:

    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: cb-example-0005
      namespace: default
    spec:
      restartPolicy: Never
      containers:
      - name: couchbase-server
        image: couchbase/server:6.5.0
        command: ['/bin/sleep']
        args:
        - '86400'
        volumeMounts:
        - mountPath: /opt/couchbase/var/lib/couchbase
          name: pvc-couchbase-cb-example-0005-00-default
          subPath: default
        - mountPath: /opt/couchbase/etc
          name: pvc-couchbase-cb-example-0005-00-default
          subPath: etc
        - mountPath: /mnt/data
          name: pvc-couchbase-cb-example-0005-00-data
      volumes:
        - name: pvc-couchbase-cb-example-0005-00-default
          persistentVolumeClaim:
            claimName: pvc-couchbase-cb-example-0005-00-default
        - name: pvc-couchbase-cb-example-0005-00-data
          persistentVolumeClaim:
            claimName: pvc-couchbase-cb-example-0005-00-data

    The pod contains a single container running a Couchbase Server image as this contains all the necessary command line tools. We modify the container entry point to run /bin/sleep for 86400 seconds (a day) while logs are collected and downloaded.

    The associated volumes that need to be defined for the pod can be determined by running the following command, assuming the pod you wish to collect from is cb-example-0005:

    $ kubectl get pvc -lcouchbase_node=cb-example-0005

    Any returned volumes will need to be defined in volumes and be correctly mounted in the pod via the volumeMounts. volumeMounts names refer to their corresponding entries in volumes. The following documents the volumeMounts required for each entry in volumes given the returned persistent volume claims:

    pvc-couchbase-cb-example-0005-00-default

    The default persistent volume claim requires two volumeMounts. The default subPath must be mounted at /opt/couchbase/var/lib/couchbase. The etc subPath must be mounted at /opt/couchbase/etc.

    pvc-couchbase-cb-example-0005-00-data

    If specified, the data persistent volume claim requires a single mount in volumeMounts, and must be mounted at /mnt/data.

    pvc-couchbase-cb-example-0005-00-index

    If specified, the index persistent volume claim requires a single mount in volumeMounts, and must be mounted at /mnt/index.

    pvc-couchbase-cb-example-0005-00-analytics-00

    If specified, each analytics persistent volume claim requires a single mount in volumeMounts. This claim must be mounted at /mnt/analytics-00. If multiple analytics volumes are specified, they have different numeric suffixes, e.g. pvc-couchbase-cb-example-0005-00-analytics-01 would be mounted at /mnt/analytics-01.
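    The mapping from claim names to mount paths described above can be expressed as a small lookup. The sketch below is only an aid for filling in the pod template: the function name mounts_for_claim and its output format are hypothetical, and it assumes claim names end in the suffixes shown above.

```shell
# Hypothetical helper: print the mount(s) required for one persistent volume
# claim name, one per line as "<mountPath> <subPath>" ("-" means no subPath).
mounts_for_claim() {
  case "$1" in
    *-default)
      # The default claim needs two mounts, one per subPath.
      echo "/opt/couchbase/var/lib/couchbase default"
      echo "/opt/couchbase/etc etc"
      ;;
    *-data)  echo "/mnt/data -" ;;
    *-index) echo "/mnt/index -" ;;
    *-analytics-[0-9][0-9])
      # Reuse the numeric suffix of the claim name in the mount path.
      echo "/mnt/analytics-${1##*-} -"
      ;;
    *)
      echo "unrecognized claim name: $1" >&2
      return 1
      ;;
  esac
}

# Example against a live cluster (not run here):
#   for claim in $(kubectl get pvc -lcouchbase_node=cb-example-0005 -o name); do
#     echo "$claim:"; mounts_for_claim "$claim"
#   done
```

    Each printed line corresponds to one entry under volumeMounts, and each claim also needs one entry under volumes.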

    Collecting & Downloading Logs

    See the cbcollect_info documentation for the full set of options. A typical command is:

    $ kubectl exec -ti pod/cb-example-0005 -- /opt/couchbase/bin/cbcollect_info /tmp/cbinfo-default-cb-example-0005-$(date +%y%m%dT%H%M%S%z)

    The pod name is the name given to the pod in the template. The log file name follows the convention cbinfo, namespace, pod name, timestamp, joined with hyphens.

    Once collection is complete, the logs can be downloaded to the local host:

    $ kubectl cp default/cb-example-0005:/tmp/cbinfo-default-cb-example-0005-181005T154746+0100.zip .
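    The naming convention and the remaining steps can be tied together in a short script. This is a sketch under assumptions: the helper name cbinfo_name is hypothetical, and the commented commands reuse the namespace default and the pod cb-example-0005 from the template above.

```shell
# Hypothetical helper: build a log file name of the form
# cbinfo-<namespace>-<pod>-<timestamp>.
# The third argument is optional and defaults to the current local time.
cbinfo_name() {
  printf 'cbinfo-%s-%s-%s' "$1" "$2" "${3:-$(date +%y%m%dT%H%M%S%z)}"
}

# Example against a live cluster (not run here):
#   name="$(cbinfo_name default cb-example-0005)"
#   kubectl exec -ti pod/cb-example-0005 -- /opt/couchbase/bin/cbcollect_info "/tmp/$name"
#   kubectl exec pod/cb-example-0005 -- ls /tmp    # confirm the file name before copying
#   kubectl delete pod cb-example-0005             # step 5, after the download
```

    Computing the name once and reusing it avoids a mismatch between the path written by cbcollect_info and the path passed to kubectl cp.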

    See Also

    Refer to the Couchbase Server Troubleshooting guide for additional information about reporting issues.