cbexport json

      Exports JSON data from Couchbase

      SYNOPSIS

      cbexport json [--cluster <url>] [--bucket <bucket_name>] [--format <data_format>]
                    [--output <path>] [--username <username>] [--password <password>]
                    [--cacert <path>] [--no-ssl-verify] [--threads <num>] [--log-file <path>]
                    [--include-key <key>] [--include-data <collection_string_list>]
                    [--exclude-data <collection_string_list>] [--verbose]
                    [--scope-field <scope_field>] [--collection-field <collection_field>]

      DESCRIPTION

      Exports JSON data from Couchbase. The cbexport-json command supports exporting JSON documents to a file with one document per line, or to a file that contains a JSON list in which each element is a document. The output file format is specified with the --format flag. See the DATASET FORMATS section below for more details on the supported file formats.

      OPTIONS

      Below is a list of required and optional parameters for the cbexport-json command.

      Required

      -c,--cluster <url>

      The hostname of a node in the cluster to export data from. See the HOST FORMATS section below for details about hostname specification formats.

      -u,--username <username>

      The username for cluster authentication. The user must have the appropriate privileges to read from the bucket that data will be exported from.

      -p,--password <password>

      The password for cluster authentication. The user must have the appropriate privileges to read from the bucket that data will be exported from. Specifying this option without a value will allow the user to type a non-echoed password to stdin.

      -b,--bucket <bucket_name>

      The name of the bucket to export data from.

      -f,--format <format>

      The format of the exported dataset (lines or list). See the DATASET FORMATS section below for more details on the formats supported by cbexport.

      -o,--output <path>

      The path to the location of the file that JSON documents from Couchbase should be exported to. This may be an absolute or relative path, but must point to a file. The file does not have to exist when the command is invoked.

      Optional

      --include-key <key>

      Couchbase stores data as key-value pairs, where the value is a JSON document and the key is an identifier for retrieving that document. By default, cbexport exports only the value portion of the document. To include the key in each exported document, specify this option and pass the name of the field that the key should be stored under. If that field already exists in the document, it will be overwritten with the key. If the JSON document is not an object, it will be turned into one and the original value added to a field named 'value'. If the field name passed is 'value' then the key will not be written. cbexport will display a warning for any document it has converted into an object.
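
      The rules above can be modelled in a few lines of Python. This is only an illustrative sketch of the documented behavior, not cbexport's implementation; the function name embed_key is hypothetical, and the handling of a field named 'value' follows one reading of the text (the key is skipped only when a non-object document has been wrapped).

```python
def embed_key(key, doc, field):
    """Model the documented --include-key behavior for a single document."""
    if isinstance(doc, dict):
        out = dict(doc)
        out[field] = key  # an existing field with the same name is overwritten
        return out
    # Non-object documents are wrapped, with the original stored under 'value'.
    out = {"value": doc}
    if field != "value":  # assumed reading: the key is not written on a name clash
        out[field] = key
    return out
```

      For instance, embed_key("mykey1", 42, "id") yields an object holding both the wrapped value and the key.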

      --include-data <collection_string_list>

      A comma-separated list of collection strings to include when exporting from the bucket. Only the scopes/collections in this list will be included in the output file. The expected format is scope1.collection1,scope2.collection2. This argument is mutually exclusive with --exclude-data.

      --exclude-data <collection_string_list>

      A comma-separated list of collection strings to exclude when exporting from the bucket. Any scopes/collections in this list will be left out of the output file. The expected format is scope1.collection1,scope2.collection2. This argument is mutually exclusive with --include-data.
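
      As a rough illustration of how the two lists behave, the sketch below parses a collection string list and decides whether a given scope/collection pair would be exported. The function names are hypothetical, and treating a bare scope name (no ".collection" part) as matching every collection in that scope is an assumption, not something this page states.

```python
def parse_collection_strings(spec):
    """Split 'scope1.collection1,scope2.collection2' into (scope, collection) pairs."""
    pairs = []
    for item in spec.split(","):
        scope, _, collection = item.strip().partition(".")
        if not scope:
            raise ValueError(f"invalid collection string: {item!r}")
        pairs.append((scope, collection or None))
    return pairs


def is_exported(scope, collection, include=None, exclude=None):
    """Return True if the pair passes the include or exclude list."""
    if include is not None and exclude is not None:
        raise ValueError("--include-data and --exclude-data are mutually exclusive")

    def matches(pairs):
        # Assumption: a bare scope (collection is None) matches all its collections.
        return any(s == scope and c in (None, collection) for s, c in pairs)

    if include is not None:
        return matches(parse_collection_strings(include))
    if exclude is not None:
        return not matches(parse_collection_strings(exclude))
    return True  # with neither list, everything is exported
```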

      --no-ssl-verify

      Skips the SSL verification phase. Specifying this flag will allow a connection using SSL encryption, but will not verify the identity of the server you connect to. You are vulnerable to a man-in-the-middle attack if you use this flag. Either this flag or the --cacert flag must be specified when using an SSL encrypted connection.

      --cacert <cert_path>

      Specifies a CA certificate that will be used to verify the identity of the server being connected to. Either this flag or the --no-ssl-verify flag must be specified when using an SSL encrypted connection.

      --scope-field <scope_field>

      When exporting from a collection aware cluster this field will be created in each JSON document; it will be used to store the name of the scope the document came from. This flag is required when exporting from a bucket that has non-default scopes and collections.

      --collection-field <collection_field>

      When exporting from a collection aware cluster this field will be created in each JSON document; it will be used to store the name of the collection the document came from. This flag is required when exporting from a bucket that has non-default scopes and collections.

      -t,--threads <num>

      Specifies the number of concurrent clients to use when exporting data. Fewer clients means the export will take longer but use fewer cluster resources. More clients means a faster export at the cost of higher cluster resource usage. This parameter defaults to 1 if it is not specified. It is recommended that it not be set higher than the number of CPUs on the machine where the export is taking place.
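
      A wrapper script could apply the CPU recommendation before passing a value to --threads. The helper below is a hypothetical sketch that clamps a requested thread count to the range from 1 to the number of CPUs reported by the operating system.

```python
import os


def capped_threads(requested):
    """Clamp a requested --threads value to between 1 and the local CPU count."""
    cpus = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, min(requested, cpus))
```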

      -l,--log-file <path>

      Specifies a log file for writing debugging information about cbexport execution.

      -v,--verbose

      Specifies that logging should be sent to stdout. If this flag is specified along with the -l/--log-file option then the verbose option is ignored.

      HOST FORMATS

      When specifying a host/cluster for a command using the -c/--cluster flag, the following formats are accepted:

      • <addr>:<port>

      • http://<addr>:<port>

      • https://<addr>:<port>

      • couchbase://<addr>:<port>

      • couchbases://<addr>:<port>

      • couchbase://<srv>

      • couchbases://<srv>

      • <addr>:<port>,<addr>:<port>

      • <scheme>://<addr>:<port>,<addr>:<port>

      The <port> portion of the host format may be omitted, in which case the default port will be used for the scheme provided. For example, http:// and couchbase:// both default to 8091, whereas https:// and couchbases:// default to 18091. When connecting to a host/cluster using a non-default port, the <port> portion of the host format must be specified.

      Connection Strings (Multiple nodes)

      The -c/--cluster flag accepts multiple nodes in the format of a connection string. This is a comma-separated list of <addr>:<port> strings in which <scheme> only needs to be specified once. The main advantage of supplying multiple hosts is that in the event of a failure, the next host in the list will be used.

      For example, all of the following are valid connection strings:

      • localhost,[::1]

      • 10.0.0.1,10.0.0.2

      • http://10.0.0.1,10.0.0.2

      • https://10.0.0.1:12345,10.0.0.2

      • couchbase://10.0.0.1,10.0.0.2

      • couchbases://10.0.0.1:12345,10.0.0.2:12345
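
      The default-port rule and the connection string format can be sketched as a small parser. This is an illustration only, not the parser cbexport uses; the function name is hypothetical, SRV lookups are not attempted, and treating a string with no scheme as http:// is an assumption made here so that a default port can be chosen.

```python
DEFAULT_PORTS = {"http": 8091, "couchbase": 8091, "https": 18091, "couchbases": 18091}


def parse_connection_string(conn, default_scheme="http"):
    """Return (scheme, [(addr, port), ...]) for a -c/--cluster value."""
    scheme, sep, rest = conn.partition("://")
    if not sep:
        # Assumption: a missing scheme is treated as http://.
        scheme, rest = default_scheme, conn
    if scheme not in DEFAULT_PORTS:
        raise ValueError(f"unsupported scheme: {scheme!r}")
    hosts = []
    for item in rest.split(","):
        if item.startswith("["):
            # Bracketed IPv6 address, for example [::1] or [::1]:8091.
            addr, _, tail = item[1:].partition("]")
            port = tail[1:] if tail.startswith(":") else ""
        else:
            addr, _, port = item.partition(":")
        hosts.append((addr, int(port) if port else DEFAULT_PORTS[scheme]))
    return scheme, hosts
```

      Each host that omits a port receives the default for the scheme, while hosts with an explicit port keep it.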

      SRV Records

      The -c/--cluster flag accepts DNS SRV records in place of a host/cluster address; the SRV record will be resolved into a valid connection string. The following rules must be followed when supplying an SRV record:

      • The <scheme> portion must be either couchbase:// or couchbases://

      • The <srv> portion should be a hostname with no port

      • The <srv> portion must not be a valid IP address

      For example, both of the following are valid connection strings using an SRV record:

      • couchbase://hostname

      • couchbases://hostname

      DATASET FORMATS

      The cbexport command supports the formats listed below.

      LINES

      The lines format specifies a file that contains one JSON document on every line in the file. This format is specified by setting the --format option to "lines". Below is an example of a file in lines format.

      {"key": "mykey1", "value": "myvalue1"}
      {"key": "mykey2", "value": "myvalue2"}
      {"key": "mykey3", "value": "myvalue3"}
      {"key": "mykey4", "value": "myvalue4"}
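
      Because each line is a complete JSON document, a file in lines format can be processed one document at a time without loading the whole file. The sketch below shows one way to do that in Python; the function name is hypothetical, and skipping blank lines is a convenience added here rather than part of the format.

```python
import json


def iter_lines_format(lines):
    """Yield one parsed document per non-empty line of a lines-format export."""
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate a trailing blank line
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {number}: {exc}") from exc

# Usage: pass an open file, which iterates line by line.
# with open("/data/lines.json", encoding="utf-8") as fh:
#     for doc in iter_lines_format(fh):
#         print(doc)
```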

      LIST

      The list format specifies a file which contains a JSON list where each element in the list is a JSON document. The file may only contain a single list, but the list may be specified over multiple lines. This format is specified by setting the --format option to "list". Below is an example of a file in list format.

      [
        {
          "key": "mykey1",
          "value": "myvalue1"
        },
        {"key": "mykey2", "value": "myvalue2"},
        {"key": "mykey3", "value": "myvalue3"},
        {"key": "mykey4", "value": "myvalue4"}
      ]
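
      The two formats carry the same documents, so a list-format file can be rewritten as lines format after the export. The following is a minimal sketch with a hypothetical function name; it reads the whole list into memory, which may not suit very large exports.

```python
import json


def list_to_lines(text):
    """Convert the text of a list-format export into lines format."""
    docs = json.loads(text)
    if not isinstance(docs, list):
        raise ValueError("expected a single top-level JSON list")
    # One compact JSON document per line, in the original order.
    return "".join(json.dumps(doc, ensure_ascii=False) + "\n" for doc in docs)
```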

      EXAMPLES

      To export data to /data/lines.json in the lines format using 4 threads, run the following command.

      $ cbexport json -c couchbase://127.0.0.1 -u Administrator -p password \
       -b default -o /data/lines.json -f lines -t 4

      To export data to /data/list.json in the list format, run the following command.

      $ cbexport json -c couchbase://127.0.0.1 -u Administrator -p password \
       -b default -o /data/list.json -f list

      To export data from a collection-aware cluster, with each document's scope and collection names stored in the fields given by --scope-field and --collection-field, run the following command. A sample of the exported file is shown after the command.

      $ cbexport json -c couchbase://127.0.0.1 -u Administrator -p password \
       -b default -o /data/list.json -f list --scope-field scope --collection-field collection
      
      [
        {
          "scope": "myscope1",
          "collection": "mycollection1",
          "key": "mykey1",
          "value": "myvalue1"
        }
      ]
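
      Once the scope and collection names are stored in each document, a consumer can use them to regroup the export. The sketch below is one hypothetical way to do this in Python; the default field names match the ones passed to --scope-field and --collection-field in the example above.

```python
from collections import defaultdict


def group_by_collection(docs, scope_field="scope", collection_field="collection"):
    """Group exported documents by the (scope, collection) pair they carry."""
    groups = defaultdict(list)
    for doc in docs:
        location = (doc.get(scope_field), doc.get(collection_field))
        # Drop the two bookkeeping fields so only the original content remains.
        body = {k: v for k, v in doc.items() if k not in (scope_field, collection_field)}
        groups[location].append(body)
    return dict(groups)
```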

      ENVIRONMENT AND CONFIGURATION VARIABLES

      CB_CLUSTER

      Specifies the hostname of the Couchbase cluster to connect to. If the hostname is supplied as a command line argument then this value is overridden.

      CB_USERNAME

      Specifies the username for authentication to a Couchbase cluster. If the username is supplied as a command line argument then this value is overridden.

      CB_PASSWORD

      Specifies the password for authentication to a Couchbase cluster. If the password is supplied as a command line argument then this value is overridden.
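
      When cbexport is launched from a script, these variables keep the credentials off the command line. The sketch below builds the environment and argument list separately; the function names are hypothetical, and it assumes that cbexport is on the PATH and that the -c, -u and -p flags can be left out when the corresponding variables are set.

```python
import os


def cbexport_env(cluster, username, password, base=None):
    """Return a copy of the environment with the CB_* variables set."""
    env = dict(os.environ if base is None else base)
    env["CB_CLUSTER"] = cluster
    env["CB_USERNAME"] = username
    env["CB_PASSWORD"] = password
    return env


def cbexport_argv(bucket, output, fmt):
    """Return the argument list for an export that relies on the CB_* variables."""
    if fmt not in ("lines", "list"):
        raise ValueError(f"unsupported format: {fmt!r}")
    return ["cbexport", "json", "-b", bucket, "-o", output, "-f", fmt]

# Usage (not executed here):
# import subprocess
# subprocess.run(cbexport_argv("default", "/data/lines.json", "lines"),
#                env=cbexport_env("couchbase://127.0.0.1", "Administrator", "password"),
#                check=True)
```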

      SEE ALSO

      CBEXPORT

      Part of the cbexport suite