cbbackupmgr
A utility for backing up and restoring a Couchbase cluster
DESCRIPTION
cbbackupmgr is a high performance backup and restore client for Couchbase Server.
See cbbackupmgr-tutorial to get started. For more information on how specific commands work you can run "cbbackupmgr <command> --help". For backup strategies see cbbackupmgr-strategies.
OPTIONS
- --version
-
Prints the cbbackupmgr suite version that the cbbackupmgr program came from.
- --help
-
Prints the synopsis and a list of commands. If a cbbackupmgr command is named, this option will bring up the manual page for that command.
CBBACKUPMGR COMMANDS
- cbbackupmgr-backup
-
Backs up data from a Couchbase cluster.
- cbbackupmgr-collect-logs
-
Collects diagnostic information.
- cbbackupmgr-compact
-
Compacts a backup.
- cbbackupmgr-config
-
Creates a new backup repository.
- cbbackupmgr-info
-
Displays information about the backups in the archive.
- cbbackupmgr-merge
-
Merges backups together.
- cbbackupmgr-remove
-
Deletes a backup repository.
- cbbackupmgr-restore
-
Restores a backup from the archive.
IDENTIFIER TERMINOLOGY
- <archive>
-
The root directory containing multiple backup repositories. This is the top-level backup directory and contains all backup data as well as backup logs.
- <repository>
-
Contains a backup configuration used for taking actual backups. A repository should be created for a specific Couchbase cluster and it will contain multiple incremental backups.
- <backup>
-
A backup of a Couchbase cluster at a given point in time. All backups are incremental backups.
- <bucket>
-
A backup may consist of one or more buckets. Each bucket is stored separately.
- <collection_string>
-
A dot-separated string which indicates the location of documents on a Couchbase cluster. Collection strings are expected in the format 'bucket.scope.collection'. The collection string 'food.produce.fruit' refers to the 'fruit' collection inside the 'produce' scope which is contained in the 'food' bucket. If the bucket name contains a '.' character, escape it with a backslash; for example, 'fo\.od.produce.fruit' is a valid collection string. You may also omit the scope/collection portion of the collection string; in this case the absence of a scope/collection is treated as a wildcard indicating all collections inside a scope or all scopes inside a bucket. For example, the collection string 'food.produce' refers to all of the collections inside the 'produce' scope of the 'food' bucket.
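The collection string rules above can be illustrated with a short Python sketch. This is not part of cbbackupmgr; the names parse_collection_string and CollectionPath are hypothetical, and the sketch assumes a backslash only ever escapes a '.' character. An omitted scope or collection is returned as None to stand for the wildcard.

```python
import re
from typing import NamedTuple, Optional


class CollectionPath(NamedTuple):
    """Hypothetical container for a parsed collection string."""
    bucket: str
    scope: Optional[str]       # None means "all scopes inside the bucket"
    collection: Optional[str]  # None means "all collections inside the scope"


def parse_collection_string(text: str) -> CollectionPath:
    # Split on dots that are NOT preceded by a backslash.
    parts = re.split(r"(?<!\\)\.", text)
    if not 1 <= len(parts) <= 3 or any(part == "" for part in parts):
        raise ValueError(f"invalid collection string: {text!r}")
    # Turn the escaped form '\.' back into a literal '.'.
    names = [part.replace("\\.", ".") for part in parts]
    # Pad the omitted trailing portions with None (the wildcard).
    padded = names + [None] * (3 - len(names))
    return CollectionPath(*padded)


print(parse_collection_string("food.produce.fruit"))
print(parse_collection_string(r"fo\.od.produce.fruit"))
print(parse_collection_string("food.produce"))
```

The last call yields a path whose collection is None, which corresponds to every collection inside the 'produce' scope.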
ENVIRONMENT AND CONFIGURATION VARIABLES
- CB_CLUSTER
-
Specifies the hostname of the Couchbase cluster to connect to. If the hostname is supplied as a command line argument then this value is overridden.
- CB_USERNAME
-
Specifies the username for authentication to a Couchbase cluster. If the username is supplied as a command line argument then this value is overridden.
- CB_PASSWORD
-
Specifies the password for authentication to a Couchbase cluster. If the password is supplied as a command line argument then this value is overridden.
- CB_CLIENT_CERT
-
The path to a client certificate used to authenticate when connecting to a cluster. May be supplied with CB_CLIENT_KEY as an alternative to the CB_USERNAME and CB_PASSWORD variables. See the CERTIFICATE AUTHENTICATION section for more information.
- CB_CLIENT_CERT_PASSWORD
-
The password for the certificate provided to the CB_CLIENT_CERT variable. When using this variable, the certificate/key pair is expected to be in the PKCS#12 format. See the CERTIFICATE AUTHENTICATION section for more information.
- CB_CLIENT_KEY
-
The path to the client private key whose public key is contained in the certificate provided to the CB_CLIENT_CERT variable. May be supplied with CB_CLIENT_CERT as an alternative to the CB_USERNAME and CB_PASSWORD variables. See the CERTIFICATE AUTHENTICATION section for more information.
- CB_CLIENT_KEY_PASSWORD
-
The password for the key provided to the CB_CLIENT_KEY variable. When using this variable, the key is expected to be in the PKCS#8 format. See the CERTIFICATE AUTHENTICATION section for more information.
- CB_ARCHIVE_PATH
-
Specifies the path to the backup archive. If the archive path is supplied as a command line argument then this value is overridden.
- CB_OBJSTORE_STAGING_DIRECTORY
-
Specifies the path to the staging directory. If the --obj-staging-dir argument is provided in the command line then this value is overridden.
- CB_OBJSTORE_REGION
-
Specifies the object store region. If the --obj-region argument is provided in the command line then this value is overridden.
- CB_OBJSTORE_ACCESS_KEY_ID
-
Specifies the object store access key id. If the --obj-access-key-id argument is provided in the command line this value is overridden.
- CB_OBJSTORE_SECRET_ACCESS_KEY
-
Specifies the object store secret access key. If the --obj-secret-access-key argument is provided in the command line this value is overridden.
- CB_OBJSTORE_REFRESH_TOKEN
-
Specifies the refresh token to use. If the --obj-refresh-token argument is provided in the command line, this value is overridden.
- CB_AWS_ENABLE_EC2_METADATA
-
By default cbbackupmgr will disable fetching EC2 instance metadata. Setting this environment variable to true will allow the AWS SDK to fetch metadata from the EC2 instance endpoint.
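The override rule shared by most of the variables above (a command line argument takes precedence over the environment variable) can be sketched in Python as follows. The helper resolve_setting is hypothetical and only models the precedence; the cluster addresses are placeholder values.

```python
import os
from typing import Mapping, Optional


def resolve_setting(cli_value: Optional[str], env_name: str,
                    environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the command line value if given, else the environment value."""
    if cli_value is not None:
        return cli_value          # the command line argument wins
    return environ.get(env_name)  # fall back to the variable (None if unset)


# Placeholder environment used only for this illustration.
example_env = {"CB_CLUSTER": "couchbase://10.0.0.1"}

# No command line argument: the environment variable is used.
print(resolve_setting(None, "CB_CLUSTER", example_env))
# Command line argument supplied: the environment variable is overridden.
print(resolve_setting("couchbase://10.0.0.2", "CB_CLUSTER", example_env))
```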
DISCUSSION
The cbbackupmgr command is used for backing up Couchbase clusters, managing those backups, and restoring them.
The cbbackupmgr command was built around the concept of taking only incremental backups. This concept is important because as the data in a cluster grows it becomes increasingly difficult to take full backups in a reasonable amount of time. By taking incremental backups we are able to reduce the time it takes to back up a cluster by ensuring that we transfer the smallest amount of data possible each time we back the cluster up.
A consequence of taking incremental backups is that we must know about the previous backups that we have taken in order to know where we left off. This means that the cbbackupmgr command must manage the backups it has taken. The cbbackupmgr command does this by using the concept of a backup archive and a backup repository. A backup repository is a directory that contains a backup configuration for backing up a specific cluster. Normally there will be one backup repository per Couchbase cluster. Each time you want to back up this cluster you will specify this backup repository with the cbbackupmgr-backup command and the backup tool will automatically find where the last backup finished and incrementally back up new data in that cluster.
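As an illustration of the workflow just described, the following Python sketch builds the argument lists for creating a repository once and then backing up into it repeatedly. It only prints the commands and does not run them. The archive path, repository name and cluster address are placeholders, the helper names are hypothetical, and credentials are assumed to come from the CB_USERNAME and CB_PASSWORD variables. See cbbackupmgr-config and cbbackupmgr-backup for the authoritative option lists.

```python
import shlex
from typing import List


def config_command(archive: str, repo: str) -> List[str]:
    """Arguments for creating a backup repository (run once per cluster)."""
    return ["cbbackupmgr", "config", "--archive", archive, "--repo", repo]


def backup_command(archive: str, repo: str, cluster: str) -> List[str]:
    """Arguments for one incremental backup into an existing repository."""
    return ["cbbackupmgr", "backup", "--archive", archive,
            "--repo", repo, "--cluster", cluster]


# Placeholder values chosen for the example only.
ARCHIVE = "/data/backup"
REPO = "example-cluster"
CLUSTER = "couchbase://127.0.0.1"

# Step 1: create the repository once.
print(shlex.join(config_command(ARCHIVE, REPO)))
# Step 2: repeat the same backup command; each run continues from
# where the previous backup in the repository finished.
print(shlex.join(backup_command(ARCHIVE, REPO, CLUSTER)))
```

Each list could be handed to subprocess.run on a host where cbbackupmgr is installed; that step is left out so the sketch has no side effects.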
The backup archive is the top-level directory and contains one or more backup repositories and a logs folder. Logging for all backup repositories is contained in the logs folder in the backup.log file.
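The layout described above can be mocked up with a small Python sketch. It creates a throwaway directory tree, not a real archive, and assumes nothing beyond what this section states: an archive holds a logs folder containing backup.log, plus one directory per repository. The helper names and repository names are hypothetical. For a real archive, cbbackupmgr-info is the supported way to inspect its contents.

```python
import tempfile
from pathlib import Path
from typing import List


def make_example_archive(root: Path, repos: List[str]) -> Path:
    """Create a mock archive: a logs folder plus one directory per repository."""
    archive = root / "archive"
    (archive / "logs").mkdir(parents=True)
    (archive / "logs" / "backup.log").touch()  # shared log for all repositories
    for name in repos:
        (archive / name).mkdir()
    return archive


def list_repositories(archive: Path) -> List[str]:
    """Treat every sub-directory except 'logs' as a repository (an assumption)."""
    return sorted(p.name for p in archive.iterdir()
                  if p.is_dir() and p.name != "logs")


with tempfile.TemporaryDirectory() as tmp:
    example = make_example_archive(Path(tmp), ["cluster-a", "cluster-b"])
    print(list_repositories(example))
```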
In an incremental approach the amount of data being stored in the backup archive is always increasing. To handle this issue the backup command allows backups to be merged together. This allows data to be deduplicated resulting in a single backup that takes up less disk space than the multiple previous backups. More information about how to take advantage of incremental backups and merges is contained in cbbackupmgr-strategies.
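The space saving from a merge can be seen in a deliberately simplified Python model. This is a conceptual sketch only and does not reflect how cbbackupmgr stores or merges data on disk: each incremental backup is modelled as a mapping from document key to the value captured in that backup, and a merge keeps only the newest value of each key. The function name merge_backups and the sample data are hypothetical.

```python
from typing import Dict, List

Backup = Dict[str, str]  # document key -> value captured in that backup


def merge_backups(backups: List[Backup]) -> Backup:
    """Fold backups oldest-to-newest so later values replace earlier ones."""
    merged: Backup = {}
    for backup in backups:
        merged.update(backup)
    return merged


# Three incremental backups; key "a" changes in every one of them.
incrementals = [
    {"a": "v1", "b": "v1", "c": "v1"},
    {"a": "v2", "d": "v1"},
    {"a": "v3", "b": "v2"},
]

merged = merge_backups(incrementals)
stored_before = sum(len(b) for b in incrementals)
print(f"entries before merge: {stored_before}, after merge: {len(merged)}")
```

In this model the three backups hold seven entries between them, while the merged backup holds four, one per distinct key.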
The minimum hardware requirement for running cbbackupmgr is four CPU cores and 8GiB RAM. The recommended hardware is sixteen CPU cores, 16GiB RAM and SSD disks.
OPERATIONS DURING MAJOR CLUSTER CONFIGURATION CHANGES
Operations (commands or sub-commands) which connect to a cluster are not supported during major cluster configuration changes.
For example, performing an import/export, making a backup or performing a restore whilst changing the TLS configuration/security settings is unsupported.
These types of changes (e.g. changing the TLS mode to strict) are not expected to be time consuming so it’s generally expected that operations should be started after completing the configuration change.
Please note that this does not include rebalances; operations may be performed during a rebalance. The reason for this distinction is that major cluster configuration changes are generally quick, whilst rebalances for large data sets may be time consuming.
FURTHER DOCUMENTATION
A tutorial for getting started with the backup command is also available in cbbackupmgr-tutorial.
A guide for production backup strategies is available in cbbackupmgr-strategies.
REPORTING BUGS
Report urgent issues to the Couchbase Support Team at support@couchbase.com. Bugs can be reported to the Couchbase Jira Bug Tracker at http://www.couchbase.com/issues.
CBBACKUPMGR
Part of the cbbackupmgr suite