cbmigrate

Use the cbmigrate command-line tool to migrate your data from other platforms.

Description

The cbmigrate tool migrates your existing data to Couchbase Server from the following platforms: MongoDB, DynamoDB, and Hugging Face.

Installation

  1. Download the latest version of the cbmigrate package from its GitHub repository.

  2. Unpack the downloaded package to its own directory.

  3. Execute the tool by running the following from the command line:

    $ ./cbmigrate [command] [flags]
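
Before attempting a migration, it can help to confirm that the unpacked binary is present and executable. The following is a minimal sketch: the helper name check_cbmigrate is hypothetical and not part of the package, and it relies only on the --version flag shown in the Syntax section.

```shell
# check_cbmigrate is a hypothetical helper, not part of the cbmigrate package.
# It confirms that the unpacked binary is executable, then prints its version.
check_cbmigrate() {
  bin=${1:-./cbmigrate}   # path to the unpacked binary; defaults to the current directory
  if [ ! -x "$bin" ]; then
    echo "cbmigrate not found or not executable: $bin" >&2
    return 1
  fi
  "$bin" --version
}

check_cbmigrate ./cbmigrate || echo "Unpack the package into this directory, then retry."
```

Run the check from the directory that holds the unpacked package, or pass the path to the binary as the first argument.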

Syntax

$ cbmigrate [--version] [--help HELP]
$ cbmigrate [command] [flags]

Command options

cbmigrate takes one of three optional commands, one for each source platform listed below. Depending on the command used, the tool also accepts a range of flags that supply the additional information required for its execution.

  • MongoDB

  • DynamoDB

  • Hugging Face

Table 1. Command options
Command Flags

mongo

Migrate the data from a MongoDB installation to Couchbase Server.

--mongodb-uri string

The MongoDB connection string.

--mongodb-database string

The name of the database that you wish to migrate.

--mongodb-collection string

The name of the collection within the database you are migrating.

--cb-username string

The username granting access to the target cluster.

--cb-password string

The password (attached to --cb-username) for accessing the target cluster.

--cb-cluster string

The URL of the target cluster node for the import.

--cb-bucket string

The name of the target bucket.

--cb-scope string

The target scope for the migration.

--cb-collection string

The target collection name for the import.

--cb-generate-key string

Specifies a key expression used for generating a unique key for each imported document. It allows for the creation of document keys by combining static text, field values (denoted by %fieldname%), and custom generators (like #UUID#). For example, using a combination of static text, field names, and custom generators, you can generate a unique key of the form: "key::%name%::#UUID#"
(Default: "%_id%")

--cb-cacert string

Specifies a CA certificate that will be used to verify the identity of the server being connected to. Either this flag or the --cb-no-ssl-verify flag must be specified when using an SSL encrypted connection.

--cb-no-ssl-verify

Skips the SSL verification phase. Specifying this flag will allow a connection using SSL encryption but will not verify the identity of the server you connect to.

You are vulnerable to a man-in-the-middle attack if you use this flag.

Either this flag or the --cb-cacert flag must be specified when using an SSL encrypted connection.

--cb-client-cert string

The path to a client certificate used to authenticate when connecting to a cluster. May be supplied with --cb-client-key as an alternative to the --cb-username and --cb-password flags.

--cb-client-cert-password string

The password for the certificate provided to the --cb-client-cert flag. When using this flag, the certificate/key pair is expected to be in the PKCS#12 format.

--cb-client-key string

The path to the client private key whose public key is contained in the certificate provided to the --cb-client-cert flag. May be supplied with --cb-client-cert as an alternative to the --cb-username and --cb-password flags.

--cb-client-key-password string

The password for the key provided to the --cb-client-key flag. When using this flag, the key is expected to be in the PKCS#8 format.

--cb-buffer-size int

An integer value denoting the size of the memory buffer used during the import. (Default: 10000)

--cb-batch-size int

The number of documents processed as a batch during the import. (Default: 200)

--copy-indexes

Copy indexes for the collection. (Default: true)

--hash-document-key string

Hash the Couchbase document key. Can be sha256 or sha512.

--keep-primary-key

Keep the non-composite primary key in the document. By default, if the key is a non-composite primary key, it is deleted.

--help

Help for the MongoDB migration parameters and flags.

--debug

Enable debug output.
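
As an illustration of how the flags in this table combine, the sketch below assembles a mongo migration command and prints it rather than running it, so the flag set can be reviewed first. All values (URIs, database, bucket, scope, collection names, and credentials) are placeholders, and the couchbase://localhost form for --cb-cluster is an assumption. The --cb-generate-key expression is the one from the flag description: static text, the name field of each document, and a generated UUID.

```shell
# Dry-run sketch: prints the command instead of running it.
# Every value below is a placeholder; substitute your own before use.
# Note that echo drops the quoting, so values containing spaces must be
# re-quoted if the printed command is copied into a terminal.
preview_mongo_migration() {
  echo ./cbmigrate mongo \
    --mongodb-uri "mongodb://localhost:27017" \
    --mongodb-database "sales" \
    --mongodb-collection "customers" \
    --cb-cluster "couchbase://localhost" \
    --cb-username "Administrator" \
    --cb-password "password" \
    --cb-bucket "sales" \
    --cb-scope "_default" \
    --cb-collection "customers" \
    --cb-generate-key "key::%name%::#UUID#" \
    --cb-batch-size 200
}

preview_mongo_migration
```

Removing the leading echo inside the function turns the preview into a real invocation.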

Table 2. Command options
Command Flags

dynamodb

Migrate the data from a DynamoDB installation to Couchbase Server.

--aws-access-key-id string

Your AWS access key ID.

--aws-ca-bundle string

The CA certificate bundle to use when verifying SSL certificates. Overrides config/env settings.

--aws-endpoint-url string

Overrides the default AWS endpoint URL with the given URL.

--aws-no-verify-ssl

By default, cbmigrate uses SSL when communicating with AWS services. For each SSL connection, cbmigrate will verify SSL certificates. This option overrides the default behavior of verifying SSL certificates.

--aws-profile string

Use a specific AWS profile from your credential file.

--aws-region string

The region to use. Overrides config/env settings.

--aws-secret-access-key string

The AWS secret access key.

--dynamodb-limit int

Specifies the maximum number of items to retrieve per page during a scan operation. Use this option to control the amount of data fetched in a single request, helping to manage memory usage and API call rates during scanning.

--dynamodb-segments int

Specifies the total number of segments to divide the DynamoDB table into for parallel scanning. Each segment is scanned independently, allowing multiple threads or processes to work concurrently for faster data retrieval. Use this option to optimize performance for large tables. By default, the entire table is scanned sequentially without segmentation. (Default: 1)

--dynamodb-table-name string

The name of the DynamoDB table to migrate. You can also provide the Amazon Resource Name (ARN) of the table in this parameter.

--cb-username string

The username granting access to the target cluster.

--cb-password string

The password (attached to --cb-username) for accessing the target cluster.

--cb-cluster string

The URL of the target cluster node for the import.

--cb-bucket string

The name of the target bucket.

--cb-scope string

The target scope for the migration.

--cb-collection string

The target collection name for the import.

--cb-generate-key string

Specifies a key expression used for generating a unique key for each imported document. It allows for the creation of document keys by combining static text, field values (denoted by %fieldname%), and custom generators (like #UUID#). For example, using a combination of static text, field names, and custom generators, you can generate a unique key of the form: "key::%name%::#UUID#"
(Default: "%_id%")

--cb-cacert string

Specifies a CA certificate that will be used to verify the identity of the server being connected to. Either this flag or the --cb-no-ssl-verify flag must be specified when using an SSL encrypted connection.

--cb-no-ssl-verify

Skips the SSL verification phase. Specifying this flag will allow a connection using SSL encryption but will not verify the identity of the server you connect to.

You are vulnerable to a man-in-the-middle attack if you use this flag.

Either this flag or the --cb-cacert flag must be specified when using an SSL encrypted connection.

--cb-client-cert string

The path to a client certificate used to authenticate when connecting to a cluster. May be supplied with --cb-client-key as an alternative to the --cb-username and --cb-password flags.

--cb-client-cert-password string

The password for the certificate provided to the --cb-client-cert flag. When using this flag, the certificate/key pair is expected to be in the PKCS#12 format.

--cb-client-key string

The path to the client private key whose public key is contained in the certificate provided to the --cb-client-cert flag. May be supplied with --cb-client-cert as an alternative to the --cb-username and --cb-password flags.

--cb-client-key-password string

The password for the key provided to the --cb-client-key flag. When using this flag, the key is expected to be in the PKCS#8 format.

--cb-buffer-size int

An integer value denoting the size of the memory buffer used during the import. (Default: 10000)

--cb-batch-size int

The number of documents processed as a batch during the import. (Default: 200)

--copy-indexes

Copy indexes for the collection. (Default: true)

--hash-document-key string

Hash the Couchbase document key. Can be sha256 or sha512.

--keep-primary-key

Keep the non-composite primary key in the document. By default, if the key is a non-composite primary key, it is deleted.

--help

Help for the DynamoDB migration parameters and flags.

--debug

Enable debug output.
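
The following sketch shows one way the dynamodb flags above might be combined, including a parallel scan split into four segments with a page limit of 500 items. It prints the command rather than running it. The table name, region, profile, bucket names, and credentials are placeholders, and the couchbase://localhost form for --cb-cluster is an assumption.

```shell
# Dry-run sketch: prints the command instead of running it.
# Every value below is a placeholder; substitute your own before use.
preview_dynamodb_migration() {
  echo ./cbmigrate dynamodb \
    --dynamodb-table-name "Orders" \
    --aws-region "us-east-1" \
    --aws-profile "default" \
    --dynamodb-segments 4 \
    --dynamodb-limit 500 \
    --cb-cluster "couchbase://localhost" \
    --cb-username "Administrator" \
    --cb-password "password" \
    --cb-bucket "orders" \
    --cb-scope "_default" \
    --cb-collection "orders"
}

preview_dynamodb_migration
```

Raising --dynamodb-segments increases scan parallelism, while --dynamodb-limit bounds how much data each request fetches; suitable values depend on the table size and available memory.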

Table 3. Command options
Command Flags

hugging-face

Migrate the data from a Hugging Face dataset to Couchbase Server.

--path string

The path or name of the dataset. (Required)

--name string

Configuration name of the dataset. (Optional)

--data-files string

Path(s) to the source data file(s). (Optional)

--split string

The split of the data to load. (Optional)

--cache-dir string

The cache directory to store the datasets. (Optional)

--download-config string

Specific download configuration parameters. (Optional)

--download-mode reuse_dataset_if_exists | force_redownload

Specifies whether to reuse existing downloaded data or force a fresh download. (Optional)

--verification-mode no_checks | basic_checks | all_checks

Sets the level of verification during the migration. (Optional)

--keep-in-memory

Use this flag to keep the migrated dataset in memory.

--save-infos

Save the dataset information. (Default: false)

--revision string

The version of the dataset script to load. (Optional)

--token string

Authentication token for private datasets. (Optional)

--no-streaming

Disable streaming mode for dataset loading. (Default: false)

--num-proc int

Number of processes to use for the migration. (Optional)

--storage-options string

Storage options for remote filesystems. (Optional)

--trust-remote-code

Allow loading arbitrary code from the dataset repository. (Optional)

--id-fields string

Comma-separated list of field names to use as the document ID.

--cb-url string

The URL for the target Couchbase cluster (e.g., couchbase://localhost).

--cb-username string

The username granting access to the target cluster.

--cb-password string

The password (attached to --cb-username) for accessing the target cluster.

--cb-bucket string

The name of the target bucket.

--cb-scope string

The target scope for the migration.

--cb-collection string

The target collection name for the import.

--cb-batch-size int

The number of documents to insert per batch. (Default: 1000)

--help

Show the help screen for the Hugging Face migration.
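
As a sketch of how the hugging-face flags fit together, the example below assembles a command that loads one split of a dataset and uses a single field as the document ID. It prints the command rather than running it. The dataset path my-org/my-dataset, the id field name, and all Couchbase values are hypothetical placeholders.

```shell
# Dry-run sketch: prints the command instead of running it.
# The dataset path, field name, and Couchbase values are placeholders.
preview_hugging_face_migration() {
  echo ./cbmigrate hugging-face \
    --path "my-org/my-dataset" \
    --split "train" \
    --id-fields "id" \
    --cb-url "couchbase://localhost" \
    --cb-username "Administrator" \
    --cb-password "password" \
    --cb-bucket "datasets" \
    --cb-scope "_default" \
    --cb-collection "my_dataset"
}

preview_hugging_face_migration
```

For a private dataset, the --token flag would be added to the same command.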