cbbackupmgr restore

Restores data from the backup archive to a Couchbase cluster

SYNOPSIS

cbbackupmgr restore [--archive <archive_dir>] [--repo <repo_name>]
                    [--cluster <host>] [--username <username>]
                    [--password <password>] [--start <backup>] [--end <backup>]
                    [--exclude-buckets <bucket_list>]
                    [--include-buckets <bucket_list>]
                    [--map-buckets <list>] [--disable-bucket-config]
                    [--disable-views] [--disable-gsi-indexes]
                    [--disable-ft-indexes] [--disable-data]
                    [--force-updates] [--threads <integer>]
                    [--no-progress-bar]

DESCRIPTION

Restores data from the backup archive to a target Couchbase cluster. By default, all data, index definitions, view definitions, full-text index definitions, and bucket configuration data are restored to the cluster unless specified otherwise in the repository's backup config or through command line parameters when running the restore command. For example, if you changed bucket configuration settings since your last backup, then restoring a previous backup will by default overwrite these settings unless you explicitly tell cbbackupmgr not to restore the bucket settings using the --disable-bucket-config flag.

The restore command is capable of restoring a single backup or a range of backups. When restoring a single backup, all data from that backup is restored. If a range of backups is restored, then cbbackupmgr will take into account any failovers that may have occurred in between the time that the backups were originally taken. If a failover did occur in between the backups, and the backup archive contains data that no longer exists in the cluster, then the data that no longer exists will be skipped during the restore. If no failovers occurred in between backups then restoring a range of backups will restore all data from each backup. If all data must be restored regardless of whether a failover occurred in between the original backups, then data should be restored one backup at a time.

The restore command is guaranteed to work during rebalances and failovers. If a rebalance is taking place, cbbackupmgr will track the movement of vbuckets around a Couchbase cluster and ensure that data is restored to the appropriate node. If a failover occurs during the restore then the client will wait 180 seconds for the failed node to be removed from the cluster. If the failed node is not removed in 180 seconds then the restore will fail, but if the failed node is removed before the timeout then data will continue to be restored.

Note that if you are restoring indexes then it is highly likely that you will need to take some manual steps in order to properly restore them. This is because by default indexes will only be built if they are restored to the exact same index node that they were backed up from. If the index node they were backed up from does not exist then the indexes will be restored in round-robin fashion among the current indexer nodes. These indexes will be created but not built, and will require the administrator to build them manually. We do this because we cannot know the optimal index topology ahead of time. Because the indexes are not built, the administrator can move each index between nodes and build them once the index topology is judged to be optimal.

OPTIONS

Below is a list of required and optional parameters for the restore command.

Required

-a,--archive <archive_dir>

The directory containing the backup repository to restore data from.

-r,--repo <repo_name>

The name of the backup repository to restore data from.

-c,--cluster <hostname>

The hostname of one of the nodes in the cluster to restore data to. See the Host Formats section below for hostname specification details.

-u,--username <username>

The username for cluster authentication. The user must have the appropriate privileges to take a backup.

-p,--password <password>

The password for cluster authentication. The user must have the appropriate privileges to take a backup. If no password is supplied to this option then you will be prompted to enter your password.

Optional

--start <backup>

The name of the first backup in the backup repository to restore or an index value which references an incremental backup. Valid index values are any positive integer, "oldest", and "latest". If a positive integer is used then it should reference the index of the incremental backup starting from the oldest to the most recent backup. For example, "1" corresponds to the oldest backup, "2" corresponds to the second oldest backup, and so on. Specifying "oldest" means that the index of the oldest backup should be used and specifying "latest" means the index of the most recent backup should be used. If this flag is not specified then the restore will start with the oldest backup in the backup repository.

--end <backup>

The name of the last backup in the backup repository to restore or an index value which references an incremental backup. Valid index values are any positive integer, "oldest", and "latest". If a positive integer is used then it should reference the index of the incremental backup starting from the oldest to the most recent backup. For example, "1" corresponds to the oldest backup, "2" corresponds to the second oldest backup, and so on. Specifying "oldest" means that the index of the oldest backup should be used and specifying "latest" means the index of the most recent backup should be used. If this flag is not specified then the restore will end with the most recent backup in the backup repository.
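The index values accepted by --start and --end can be pictured with a small sketch. It only illustrates the numbering described above and is not how cbbackupmgr resolves indexes internally. The helper name resolve_backup and the BACKUPS variable are hypothetical, and the backup names are the ones used in the EXAMPLES section.

```shell
# Backup names in the repository, oldest first (taken from the EXAMPLES section).
BACKUPS="2016-03-08T14_41_10.757145596-08_00
2016-03-09T14_42_24.024494032-08_00
2016-03-10T14_42_58.743250296-08_00"

# resolve_backup <name|index|oldest|latest>
# Hypothetical helper: prints the backup name that the given value refers to.
resolve_backup() {
  case "$1" in
    oldest)        printf '%s\n' "$BACKUPS" | head -n 1 ;;
    latest)        printf '%s\n' "$BACKUPS" | tail -n 1 ;;
    ''|*[!0-9]*)   printf '%s\n' "$1" ;;                      # already a backup name
    *)             printf '%s\n' "$BACKUPS" | sed -n "${1}p" ;; # positive integer, 1 = oldest
  esac
}

resolve_backup 2        # second oldest backup
resolve_backup latest   # most recent backup
```

Under this numbering, passing --start 2 --end latest would select the same range as naming the second and third backups explicitly.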

--exclude-buckets <bucket_list>

Restores all buckets in a backup that are not specified in <bucket_list>. This flag cannot be specified at the same time as the --include-buckets flag. Takes a comma-separated list of bucket names.

--include-buckets <bucket_list>

Restores only buckets in a backup that are specified in <bucket_list>. This flag cannot be specified at the same time as the --exclude-buckets flag. Takes a comma-separated list of bucket names.

--filter-keys

Only restore data where the key matches a particular regular expression.

--filter-values

Only restore data where the value matches a particular regular expression.

--enable-bucket-config

Enables restoring the bucket configuration.

--disable-views

Skips restoring view definitions for all buckets.

--disable-gsi-indexes

Skips restoring GSI index definitions for all buckets.

--disable-ft-indexes

Skips restoring full-text index definitions for all buckets.

--disable-data

Skips restoring all key-value data for all buckets.

--force-updates

Forces data in the Couchbase cluster to be overwritten even if the data in the cluster is newer. By default, updates are not forced and all updates use Couchbase’s conflict resolution mechanism to ensure that newer data on the cluster is not overwritten by older restore data.

--map-buckets <bucket_mapping>

Specified when you want to restore a backup to a destination bucket that has a different name than the bucket that was originally backed up. This parameter takes a list of mappings since multiple buckets may be restored at the same time. Each mapping has the form <source>=<destination>, and multiple mappings should be comma-separated. For example, to restore bucket-1 and bucket-2 to renamed-1 and renamed-2, denote the mapping as "bucket-1=renamed-1,bucket-2=renamed-2".
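As a rough sketch of how such a mapping string might be assembled in a script, the helper below joins source and destination names into the format that --map-buckets expects. The function name build_bucket_map is hypothetical, and the final line only prints the restore command so it can be reviewed; remove the leading echo to run it.

```shell
# build_bucket_map src1 dst1 [src2 dst2 ...]
# Hypothetical helper: prints "src1=dst1,src2=dst2". A trailing unpaired name is ignored.
build_bucket_map() {
  map=""
  while [ "$#" -ge 2 ]; do
    map="${map:+$map,}$1=$2"   # prepend a comma only when map is non-empty
    shift 2
  done
  printf '%s\n' "$map"
}

MAPPING=$(build_bucket_map bucket-1 renamed-1 bucket-2 renamed-2)

# Dry run: print the command instead of executing it.
echo cbbackupmgr restore -a /data/backups -r example \
  -c couchbase://127.0.0.1 -u Administrator -p password \
  --map-buckets "$MAPPING"
```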

--no-ssl-verify

Skips the SSL verification phase. Specifying this flag will allow a connection using SSL encryption, but will not verify the identity of the server you connect to. You are vulnerable to a man-in-the-middle attack if you use this flag. Either this flag or the --cacert flag must be specified when using an SSL encrypted connection.

--cacert <cert_path>

Specifies a CA certificate that will be used to verify the identity of the server being connected to. Either this flag or the --no-ssl-verify flag must be specified when using an SSL encrypted connection.

-t,--threads <num>

Specifies the number of concurrent clients to use when restoring data. Fewer clients means restores will take longer, but fewer cluster resources will be used to complete the restore. More clients means faster restores, but at the cost of more cluster resource usage. This parameter defaults to 1 if it is not specified. It is recommended that it not be set higher than the number of CPUs on the machine where the restore is taking place.
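One way to follow the CPU recommendation in a script is to cap the requested thread count at the number of online CPUs. This is only a sketch: cap_threads is a hypothetical helper, getconf _NPROCESSORS_ONLN is assumed to be available on the restore machine (the sketch falls back to 1 otherwise), and the last line prints the command rather than running it.

```shell
# Number of online CPUs on the machine running the restore (falls back to 1).
CPUS=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# cap_threads <wanted> <cpus>
# Hypothetical helper: prints the smaller of the two values.
cap_threads() {
  if [ "$1" -gt "$2" ]; then
    echo "$2"
  else
    echo "$1"
  fi
}

THREADS=$(cap_threads 4 "$CPUS")

# Dry run: print the command instead of executing it.
echo cbbackupmgr restore -a /data/backups -r example \
  -c couchbase://127.0.0.1 -u Administrator -p password \
  --threads "$THREADS"
```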

--no-progress-bar

By default, a progress bar is printed to stdout so that the user can see how long the restore is expected to take, the amount of data that is being transferred per second, and the amount of data that has been restored. Specifying this flag disables the progress bar and is useful when running automated jobs.

HOST FORMATS

When specifying a host for the cbbackupmgr command the following formats are expected:

  • couchbase://<addr>

  • <addr>:<port>

  • http://<addr>:<port>

It is recommended to use the couchbase://<addr> format for standard installations. The other two formats accept a port number, which is needed for non-default installations where the admin port has been set up on a port other than 8091.
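The choice between these formats could be scripted roughly as follows. The helper cluster_arg is hypothetical and simply applies the recommendation above: use couchbase://<addr> when the admin port is the default 8091, and an explicit http://<addr>:<port> otherwise. The addresses and the port 9000 are illustrative only.

```shell
# cluster_arg <addr> [admin_port]
# Hypothetical helper: prints a value suitable for the --cluster flag.
cluster_arg() {
  port="${2:-}"
  if [ -z "$port" ] || [ "$port" = "8091" ]; then
    echo "couchbase://$1"      # standard installation
  else
    echo "http://$1:$port"     # non-default admin port
  fi
}

cluster_arg 127.0.0.1
cluster_arg 10.0.0.5 9000
```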

EXAMPLES

The restore command can be used to restore a single backup or a range of backups in a backup repository. In the examples below, we will look at a few different ways to restore data from a backup repository. All examples assume that the backup archive is located at /data/backups and that all backups are located in the "example" backup repository.

The first thing to do when getting ready to restore data is to decide which backups to restore. The easiest way to do this is to use the list command to see which backups are available to restore.

$ cbbackupmgr list --archive /data/backups --repo example
Size      Items          Name
2.24GB    -              + example
1.11GB    -                  + 2016-03-08T14_41_10.757145596-08_00
1.11GB    -                      + default
295B      0                          bucket-config.json
1.11GB    983797                     + data
1.11GB    983797                         shard_0.fdb
2B        0                          full-text.json
128B      0                          gsi.json
2B        0                          views.json
430.52MB  -                  + 2016-03-09T14_42_24.024494032-08_00
430.52MB  -                      + default
295B      0                          bucket-config.json
430.52MB  334400                     + data
430.52MB  334400                         shard_0.fdb
2B        0                          full-text.json
128B      0                          gsi.json
2B        0                          views.json
728.72MB  -                  + 2016-03-10T14_42_58.743250296-08_00
728.72MB  -                      + default
295B      0                          bucket-config.json
728.72MB  607500                     + data
728.72MB  607500                         shard_0.fdb
2B        0                          full-text.json
128B      0                          gsi.json
2B        0                          views.json

From listing the backup repository we can see that the "example" backup repository contains three backups that we can restore. If we just want to restore one of them, we set the --start and --end flags in the restore command to the same backup name and specify the cluster that we want to restore the data to. In the example below we will restore only the oldest backup.

$ cbbackupmgr restore -a /data/backups -r example \
 -c couchbase://127.0.0.1 -u Administrator -p password \
 --start 2016-03-08T14_41_10.757145596-08_00 \
 --end 2016-03-08T14_41_10.757145596-08_00

If we want to restore only the two most recent backups then we specify the --start and --end flags with different backup names in order to specify the range we want to restore.

$ cbbackupmgr restore -a /data/backups -r example \
 -c couchbase://127.0.0.1 -u Administrator -p password \
 --start 2016-03-09T14_42_24.024494032-08_00 \
 --end 2016-03-10T14_42_58.743250296-08_00

If we want to restore all of the backups in the "example" backup repository then we can omit the --start and --end flags since their default values are the oldest and most recent backup in the backup repository.

$ cbbackupmgr restore -a /data/backups -r example \
 -c couchbase://127.0.0.1 -u Administrator -p password

DISCUSSION

The restore command works by replaying the data recorded in backup files. During a restore each key-value pair backed up by cbbackupmgr will be sent to the cluster as either a "set" or "delete" operation. The restore command replays data from each file in order of backup time to guarantee that older backup data does not overwrite newer backup data. The restore command uses Couchbase’s conflict resolution mechanism by default to ensure this behavior. The conflict resolution mechanism can be disabled by specifying the --force-updates flag when executing a restore.

Starting in Couchbase 4.6 each bucket can have a different conflict resolution mechanism. cbbackupmgr will back up all metadata used for conflict resolution, but since each conflict resolution mechanism is different, cbbackupmgr will prevent restores to a bucket when the source and destination conflict resolution methods differ. This is done because by default cbbackupmgr will use the conflict resolution mechanism of the destination bucket to ensure an older value does not overwrite a newer value. If you want to restore a backup to a bucket with a different conflict resolution type, you can do so by using the --force-updates flag. This is allowed because forcing updates means that cbbackupmgr will skip conflict resolution on the destination bucket.

Also keep in mind that unlike backups, restores cannot be resumed if they fail.

ENVIRONMENT AND CONFIGURATION VARIABLES

CB_CLUSTER

Specifies the hostname of the Couchbase cluster to connect to. If the hostname is supplied as a command line argument then this value is overridden.

CB_USERNAME

Specifies the username for authentication to a Couchbase cluster. If the username is supplied as a command line argument then this value is overridden.

CB_PASSWORD

Specifies the password for authentication to a Couchbase cluster. If the password is supplied as a command line argument then this value is overridden.

CB_ARCHIVE_PATH

Specifies the path to the backup archive. If the archive path is supplied as a command line argument then this value is overridden.
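As a hedged illustration of how these variables might be used together, the sketch below exports all four and then prints a restore command that names only the repository. It assumes, based on the descriptions above, that each variable stands in for its corresponding command line flag when that flag is omitted. The values are the ones used in the EXAMPLES section, and the command is printed rather than executed.

```shell
# Values taken from the EXAMPLES section; adjust for your environment.
export CB_CLUSTER=couchbase://127.0.0.1
export CB_USERNAME=Administrator
export CB_PASSWORD=password
export CB_ARCHIVE_PATH=/data/backups

# Dry run: with the variables set, only the repository is named on the command line.
# Any flag passed explicitly would take precedence over the matching variable.
echo cbbackupmgr restore -r example
```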

CBBACKUPMGR

Part of the cbbackupmgr suite