cbbackupmgr backup

    Backs up data from a Couchbase cluster

    SYNOPSIS

    cbbackupmgr backup [--archive <archive_dir>] [--repo <repo_name>]
                       [--cluster <url>] [--username <username>]
                       [--password <password>] [--resume] [--purge]
                       [--threads <num>] [--cacert <file>] [--no-ssl-verify]
                       [--value-compression <type>] [--no-progress-bar]
                       [--skip-last-compaction] [--consistency-check <window>]
                       [--obj-access-key-id <access_key_id>] [--obj-cacert <cert_path>]
                       [--obj-endpoint <endpoint>] [--obj-no-ssl-verify]
                       [--obj-region <region>] [--obj-staging-dir <staging_dir>]
                       [--obj-secret-access-key <secret_access_key>]
                       [--s3-force-path-style] [--s3-log-level <level>]

    DESCRIPTION

    Backs up a Couchbase cluster into the specified backup repository. A backup repository must be created before running the backup command; see cbbackupmgr-config for details on creating one. The backup command uses information from the previous backup to back up all new data on a Couchbase cluster. If no previous backup exists, all data on the cluster is backed up. The backup is taken according to the backup repository’s backup configuration. Each backup creates a new folder in the backup repository. This folder contains all data from the backup and is named after the time at which the backup was started.

    As the backup runs, it tracks its progress which allows failed backups to be resumed from the point where they left off. If a backup fails before it is complete it is considered a partial backup. To attempt to complete the backup process, the backup may be resumed with the --resume flag. It may also be deleted and resumed from the previous successful backup with the --purge flag.

    The backup command is capable of backing up data while a cluster rebalance operation is in progress. During a rebalance, the backup command tracks data as it moves around the cluster and completes the backup. However, users should use caution when running backups during a rebalance, since both operations can be resource intensive and may cause temporary performance degradation in other parts of the cluster. See the --threads flag for information on how to lower the impact of the backup command on your Couchbase cluster.

    The backup command is also capable of backing up data when there are server failures in the target backup cluster. When a server failure occurs, the backup command waits 180 seconds for the failed server to come back online or to be failed over and removed from the cluster. If 180 seconds pass without the failed server coming back online or being failed over, the backup command marks the data on that node as failed and attempts to back up the rest of the data from the cluster. The backup is then marked as a partial backup in the backup archive and must be either resumed or purged when the backup command is next invoked.

    OPTIONS

    Below is a list of required and optional parameters for the backup command.

    Required

    -a,--archive <archive_dir>

    The location of the backup archive directory. When backing up directly to S3, prefix the archive path with s3://${BUCKET_NAME}/.

    -r,--repo <repo_name>

    The name of the backup repository to back up data into.

    -c,--cluster <hostname>

    The hostname of one of the nodes in the cluster to back up. See the Host Formats section below for hostname specification details.

    -u,--username <username>

    The username for cluster authentication. The user must have the appropriate privileges to take a backup.

    -p,--password <password>

    The password for cluster authentication. The user must have the appropriate privileges to take a backup. If no password is supplied to this option then you will be prompted to enter your password.

    Optional

    --resume

    If the previous backup did not complete successfully it can be resumed from where it left off by specifying this flag. Note that the resume and purge flags may not be specified at the same time.

    --purge

    If the previous backup did not complete successfully, specifying this flag removes the partial backup and starts a new backup from the point of the previous successful backup. Note that the purge and resume flags may not be specified at the same time.

    --no-ssl-verify

    Skips the SSL verification phase. Specifying this flag will allow a connection using SSL encryption, but will not verify the identity of the server you connect to. You are vulnerable to a man-in-the-middle attack if you use this flag. Either this flag or the --cacert flag must be specified when using an SSL encrypted connection.

    --cacert <cert_path>

    Specifies a CA certificate that will be used to verify the identity of the server being connected to. Either this flag or the --no-ssl-verify flag must be specified when using an SSL encrypted connection.

    --value-compression <compression_policy>

    Specifies a compression policy for backed up values. When Couchbase sends data to the backup client, the data stream may contain all compressed values, all uncompressed values, or a mix of both. To back up all data in the same form that the backup client receives it, specify "unchanged". To store all values uncompressed, specify "uncompressed"; this policy uncompresses any compressed values received from Couchbase and may increase the backup file size. To compress all values, specify "compressed"; this compresses any uncompressed values before writing them to disk. The default value for this option is "compressed".

    -t,--threads <num>

    Specifies the number of concurrent clients to use when taking a backup. Fewer clients means backups will take longer, but fewer cluster resources will be used to complete the backup. More clients means faster backups, but at the cost of more cluster resource usage. This parameter defaults to 1 if it is not specified. It is recommended that this parameter is not set higher than the number of CPUs on the machine where the backup is taking place.

    --no-progress-bar

    By default, a progress bar is printed to stdout so that the user can see how long the backup is expected to take, the amount of data that is being transferred per second, and the amount of data that has been backed up. Specifying this flag disables the progress bar and is useful when running automated jobs.

    --consistency-check <window>

    Providing a window larger than 1 enables the consistency checker. The window is given in seconds, and a warning is shown if the backup consistency window is larger than the one provided. This feature is developer preview. See DISCUSSION for more information.
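
    As an illustration of the automated-job case mentioned under --no-progress-bar, the following sketch wraps the backup command in a shell function suitable for a scheduler. The function name, its argument order, and the log file handling are assumptions made for this example; the password is expected in the CB_PASSWORD environment variable rather than on the command line.

```shell
#!/bin/sh
# Sketch of a scheduler-friendly wrapper (hypothetical helper, not part of
# cbbackupmgr). Usage:
#   scheduled_backup <archive> <repo> <cluster> <username> <log_file>
# The password is expected in the exported CB_PASSWORD environment variable.
scheduled_backup() {
    archive=$1 repo=$2 cluster=$3 username=$4 log_file=$5

    # --no-progress-bar keeps progress bar output out of the log file.
    cbbackupmgr backup --archive "$archive" --repo "$repo" \
        --cluster "$cluster" --username "$username" \
        --no-progress-bar >>"$log_file" 2>&1
    status=$?

    if [ "$status" -ne 0 ]; then
        echo "backup exited with status $status" >>"$log_file"
    fi
    # Pass the exit status back so the scheduler can react to failures.
    return "$status"
}
```

    A scheduler entry could then call, for example, scheduled_backup /data/backups example couchbase://172.23.10.5 Administrator /var/log/backup.log, where the log path is a placeholder.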

    Cloud integration

    Backing up directly to object store is only supported for 6.6.0 clusters. It’s likely that backing up older clusters will result in significantly higher memory consumption.

    Required

    --obj-staging-dir <staging_dir>

    When performing an operation on an archive which is located in the cloud, such as AWS, the staging directory is used to store local metadata files. This directory can be temporary (it’s not treated as a persistent store) and is only used during the backup. NOTE: Do not use /tmp as your obj-staging-dir. See Disk requirements in cbbackupmgr-cloud for more information.

    Optional

    --obj-access-key-id <access_key_id>

    The access key id which has access to your chosen object store. This option can be omitted when using the shared config functionality provided by your chosen object store. Can alternatively be provided using the CB_OBJSTORE_ACCESS_KEY_ID environment variable.

    --obj-cacert <cert_path>

    Specifies a CA certificate that will be used to verify the identity of the object store being connected to.

    --obj-endpoint <endpoint>

    The host/address of your object store.

    --obj-no-ssl-verify

    Skips the SSL verification phase when connecting to the object store. Specifying this flag will allow a connection using SSL encryption, but you are vulnerable to a man-in-the-middle attack.

    --obj-region <region>

    The region in which your bucket/container resides. For AWS this option may be omitted when using the shared config functionality. See the AWS section of the cloud documentation for more information.

    --obj-secret-access-key <secret_access_key>

    The secret access key which has access to your chosen object store. This option can be omitted when using the shared config functionality provided by your chosen object store. Can alternatively be provided using the CB_OBJSTORE_SECRET_ACCESS_KEY environment variable.

    AWS S3 Options

    Optional

    --s3-force-path-style

    By default the updated virtual style paths will be used when interfacing with AWS S3. This option will force the AWS SDK to use the alternative path style URLs which are often required by S3 compatible object stores.

    --s3-log-level <level>

    Set the log level for the AWS SDK. By default logging will be disabled. Valid options are debug, debug-with-signing, debug-with-body, debug-with-request-retries, debug-with-request-errors, and debug-with-event-stream-body.
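
    The cloud options above can be combined roughly as in the following sketch, which backs up directly to an S3 bucket. The function name, the argument order, and every argument value are placeholders assumed for illustration. Credentials are expected in the CB_PASSWORD, CB_OBJSTORE_ACCESS_KEY_ID, and CB_OBJSTORE_SECRET_ACCESS_KEY environment variables. Rejecting subdirectories of /tmp, in addition to /tmp itself, is an extra precaution added here.

```shell
#!/bin/sh
# Sketch: back up directly to S3 using only options documented above
# (hypothetical helper, not part of cbbackupmgr). Usage:
#   s3_backup <bucket> <archive_path> <repo> <cluster> <username> \
#             <staging_dir> <region>
s3_backup() {
    bucket=$1 archive_path=$2 repo=$3 cluster=$4 username=$5
    staging_dir=$6 region=$7

    # /tmp must not be used as the staging directory (see --obj-staging-dir).
    # Refusing paths below /tmp as well is an assumption of this sketch.
    case $staging_dir in
        /tmp|/tmp/*)
            echo "refusing to stage under /tmp: $staging_dir" >&2
            return 1
            ;;
    esac

    # The archive path is prefixed with s3://<bucket>/ as described for
    # the --archive option.
    cbbackupmgr backup \
        --archive "s3://${bucket}/${archive_path}" \
        --repo "$repo" \
        --cluster "$cluster" \
        --username "$username" \
        --obj-staging-dir "$staging_dir" \
        --obj-region "$region"
}
```

    When targeting an S3 compatible object store rather than AWS itself, --obj-endpoint and --s3-force-path-style would typically be added to the same invocation.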

    HOST FORMATS

    When specifying a host for the cbbackupmgr command the following formats are expected:

    • couchbase://<addr>

    • <addr>:<port>

    • http://<addr>:<port>

    It is recommended to use the couchbase://<addr> format for standard installations. The other two formats allow a port number to be given, which is needed for non-default installations where the admin port has been set up on a port other than 8091.
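
    The recommendation above can be expressed as a small helper that picks a host format from an address and an optional admin port. This is a minimal sketch; the function name cluster_url is hypothetical, and it performs no validation of the address itself.

```shell
#!/bin/sh
# Sketch: choose a host format following the recommendation above
# (hypothetical helper). Usage: cluster_url <addr> [<admin_port>]
cluster_url() {
    addr=$1
    port=${2:-8091}   # 8091 is the default admin port named above

    if [ "$port" = "8091" ]; then
        # Standard installation: use the recommended couchbase:// format.
        printf 'couchbase://%s\n' "$addr"
    else
        # Non-default admin port: use a format that carries the port.
        printf 'http://%s:%s\n' "$addr" "$port"
    fi
}
```

    The result can be passed to the --cluster option, for example -c "$(cluster_url 172.23.10.5 9000)", where the port 9000 is only an example value.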

    EXAMPLES

    The following command is used to take a backup of a Couchbase cluster.

    $ cbbackupmgr config --archive /data/backups --repo example
    $ cbbackupmgr backup -a /data/backups -r example \
     -c couchbase://172.23.10.5 -u Administrator -p password

    Once the backup has finished there will be a new directory in the specified backup repository containing the backed up data. You can see this new directory using the cbbackupmgr-info command.

    $ cbbackupmgr info -a /data/backups --all
    Name         | UUID                                 | Size     | # Repos  |
    example      | 32c97d5f-821a-4284-840b-9ee7cf8733a3 | 55.56MB  | 1        |
    *  Name        | Size     | # Backups  |
    *  Manchester  | 55.56MB  | 1          |
    +    Backup                      | Size     | Type | Source               | Cluster UUID                     | Range | Events  | Aliases  | Complete  |
    +    2019-03-15T13_52_27.18301Z  | 55.56MB  | FULL | http://172.23.10.5   | c044f5eeb1dc16d0cd49dac29074b5f9 | N/A   | 0       | 1        | true      |
    -      Bucket          | Size     | Items  | Mutations | Tombstones | Views  | FTS  | Indexes  | CBAS  |
    -      beer-sample     | 6.85MB   | 7303   | 7303      | 0          | 1      | 0    | 1        | 0     |
    -      gamesim-sample  | 2.86MB   | 586    | 586       | 0          | 1      | 0    | 1        | 0     |
    -      travel-sample   | 42.72MB  | 31591  | 31591     | 0          | 0      | 0    | 10       | 0     |

    If a backup fails then it is considered a partial backup and the backup client will not be able to back up any new data until the user decides whether to resume or purge the partial backup. This decision is made by specifying either the --resume or the --purge flag on the next invocation of the backup command. Below is an example of how this process works if the user wants to resume a backup.

    $ cbbackupmgr config -a /data/backups -r example
    $ cbbackupmgr backup -a /data/backups -r example \
     -c 172.23.10.5 -u Administrator -p password
    Error backing up cluster: Not all data was backed up due to connectivity
    issues. Check to make sure there were no server side failures during
    backup. See backup logs for more details on what wasn't backed up.
    $ cbbackupmgr backup -a /data/backups -r example \
     -c 172.23.10.5 -u Administrator -p password
    Error backing up cluster: Partial backup error 2016-02-11T17:00:19.594970735-08:00
    $ cbbackupmgr backup -a /data/backups -r example -c 172.23.10.5 \
     -u Administrator -p password --resume
    Backup successfully completed
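
    The resume flow shown above could be automated along the following lines. This is a sketch under the assumption that cbbackupmgr exits with a non-zero status when a backup fails; the function name backup_with_resume is hypothetical. A failure that leaves no partial backup behind, such as incorrect credentials, would be expected to fail again on the retry.

```shell
#!/bin/sh
# Sketch: run a backup and, if it fails, retry once with --resume
# (hypothetical helper). All arguments are passed through unchanged to
# "cbbackupmgr backup". Assumes a non-zero exit status signals a failure.
backup_with_resume() {
    if cbbackupmgr backup "$@"; then
        return 0
    fi

    echo "backup failed, retrying with --resume" >&2
    # The exit status of the retry becomes the status of the function.
    cbbackupmgr backup "$@" --resume
}
```

    To discard the partial backup instead of resuming it, the retry would pass --purge in place of --resume; the two flags cannot be combined.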

    To back up a cluster with a different number of concurrent clients and decrease the backup time, you can specify the --threads flag. Remember that specifying a higher number of concurrent clients increases the amount of resources the cluster uses to complete the backup. Below is an example of using 16 concurrent clients.

    $ cbbackupmgr config -a /data/backups -r example
    $ cbbackupmgr backup -a /data/backups -r example \
     -c 172.23.10.5 -u Administrator -p password -t 16
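
    Because the --threads option should not exceed the number of CPUs on the machine taking the backup, a requested value can be capped before it is passed to the command. The helpers below are a sketch with hypothetical names; reading the CPU count through getconf _NPROCESSORS_ONLN is an assumption about the host platform.

```shell
#!/bin/sh
# Sketch: cap a requested thread count at the number of online CPUs
# (hypothetical helpers).

# Prints the number of online CPUs, falling back to 1 if getconf fails.
cpu_count() {
    getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
}

# Usage: capped_threads <requested> [<cpus>]
# The second argument overrides the detected CPU count.
capped_threads() {
    requested=$1
    cpus=${2:-$(cpu_count)}

    if [ "$requested" -gt "$cpus" ]; then
        echo "$cpus"
    else
        echo "$requested"
    fi
}
```

    The result can then be supplied to the backup command, for example -t "$(capped_threads 16)".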

    DISCUSSION

    This command always backs up data incrementally. By using the vbucket sequence number that is associated with each item, the backup command is able to examine previous backups in order to determine where the last backup finished.

    When backing up a cluster, data for each bucket is backed up in the following order:

    • Bucket Settings

    • View Definitions

    • Global Secondary Index (GSI) Definitions

    • Full-Text Index Definitions

    • Key-Value Data

    The backup command will store everything that is persisted to disk on the Couchbase Server nodes at the time the backup is started. Couchbase Server is consistent at a vBucket level and not across a whole bucket. The tool tries to provide a strong consistency window by opening all connections to every node at the same time. Because Couchbase is a distributed system, there are times when this is not possible, such as when the cluster is under-resourced or there are network issues. These may affect the consistency of the backup across the vBuckets. cbbackupmgr backup provides a developer preview feature that checks that the backup is inside a consistency window.

    ENVIRONMENT AND CONFIGURATION VARIABLES

    CB_CLUSTER

    Specifies the hostname of the Couchbase cluster to connect to. If the hostname is supplied as a command line argument then this value is overridden.

    CB_USERNAME

    Specifies the username for authentication to a Couchbase cluster. If the username is supplied as a command line argument then this value is overridden.

    CB_PASSWORD

    Specifies the password for authentication to a Couchbase cluster. If the password is supplied as a command line argument then this value is overridden.

    CB_ARCHIVE_PATH

    Specifies the path to the backup archive. If the archive path is supplied as a command line argument then this value is overridden.

    CB_OBJSTORE_STAGING_DIRECTORY

    Specifies the path to the staging directory. If the --obj-staging-dir argument is provided in the command line then this value is overridden.

    CB_OBJSTORE_REGION

    Specifies the object store region. If the --obj-region argument is provided in the command line then this value is overridden.

    CB_OBJSTORE_ACCESS_KEY_ID

    Specifies the object store access key id. If the --obj-access-key-id argument is provided in the command line this value is overridden.

    CB_OBJSTORE_SECRET_ACCESS_KEY

    Specifies the object store secret access key. If the --obj-secret-access-key argument is provided in the command line this value is overridden.

    CB_AWS_ENABLE_EC2_METADATA

    By default cbbackupmgr will disable fetching EC2 instance metadata. Setting this environment variable to true will allow the AWS SDK to fetch metadata from the EC2 instance endpoint.
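
    The variables above allow connection details, in particular the password, to be kept off the command line. The following sketch checks that the four cluster-related variables are set before running a backup. The function name backup_from_env is hypothetical, and the variables must be exported for cbbackupmgr to see them.

```shell
#!/bin/sh
# Sketch: run a backup with settings taken from the environment
# (hypothetical helper). Usage: backup_from_env <repo>
# CB_CLUSTER, CB_USERNAME, CB_PASSWORD and CB_ARCHIVE_PATH must be
# exported beforehand.
backup_from_env() {
    repo=$1

    for name in CB_CLUSTER CB_USERNAME CB_PASSWORD CB_ARCHIVE_PATH; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing environment variable: $name" >&2
            return 1
        fi
    done

    # Only the repository is given as an argument; the remaining
    # settings are read from the environment by cbbackupmgr.
    cbbackupmgr backup --repo "$repo"
}
```

    Any of these settings can still be overridden for a single run by passing the corresponding command line argument.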

    FILES

    restrictions.json

    Keeps a list of restrictions used to ensure data is not restored to the wrong place.

    bucket-config.json

    Stores the bucket configuration settings for a bucket.

    views.json

    Stores the view definitions for a bucket.

    gsi.json

    Stores the global secondary index (GSI) definitions for a bucket.

    full-text.json

    Stores the full-text index definitions for a bucket.

    shard-*.fdb

    Stores the key-value data for a bucket.
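
    To check which of the files listed above a backup contains, a recursive search by file name is usually enough. The sketch below assumes nothing about the directory layout beneath the archive, which is not described here; the function name list_backup_files is hypothetical.

```shell
#!/bin/sh
# Sketch: list the files named above anywhere under a directory
# (hypothetical helper). Usage: list_backup_files <directory>
# The search is recursive because the layout below the archive
# directory is not specified in this document.
list_backup_files() {
    find "$1" -type f \( \
        -name 'restrictions.json' -o \
        -name 'bucket-config.json' -o \
        -name 'views.json' -o \
        -name 'gsi.json' -o \
        -name 'full-text.json' -o \
        -name 'shard-*.fdb' \
    \) | sort
}
```

    For example, list_backup_files /data/backups/example would print the matching paths for every backup in that repository, one per line.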

    CBBACKUPMGR

    Part of the cbbackupmgr suite