February 16, 2025

How to import documents into Couchbase.
This guide is for Couchbase Server.

Introduction

You can import data from the Couchbase Server UI, with the cbimport command-line tool shipped with Couchbase Server, or by scripting the process with an SDK.

A data load consists of the following steps:

  1. Prepare the data in a well-known format, such as comma-separated values (.csv) or JSON.

  2. Parse this data, and iterate over each document.

  3. Connect to your Couchbase instance.

  4. Connect to the appropriate bucket, scope, and collection.

  5. Decide on the key for this document (could be an ID, a sequence number, or some combination of fields).

  6. Do any additional processing if required.

  7. Insert the document.

Couchbase Clients

Clients access data by connecting to a Couchbase cluster over the network. The most common type of client is a Couchbase SDK, which is a full programmatic API that lets applications take the best advantage of Couchbase. This developer guide focuses on the most commonly used SDKs, but full explanations and reference documentation are available for all SDKs.

The command line clients also provide a quick and streamlined interface for simple access and are suitable if you just want to access an item without writing any code. For this guide, we are especially interested in the cbimport tool.

With some editions, the command line clients are provided as part of the installation of Couchbase Server. Assuming a default installation, you can find them in the following location, depending on your operating system:

Linux

/opt/couchbase/bin

Windows

C:\Program Files\Couchbase\Server\bin

macOS

/Applications/Couchbase Server.app/Contents/Resources/couchbase-core/bin

The Couchbase Server UI also offers a graphical interface to cbimport.

Preparing the Data

To prepare the data, extract or generate your data in an appropriate data format.

The following formats are widely supported by export tools, by cbimport, and by the library ecosystems of all Couchbase SDKs.

Comma-separated values (.csv) files are easily exported from many spreadsheet and database applications.

Ensure that the first row is a header row containing the names of the columns within the document.

id,type,name
20001,airline,CSV-air-1
20002,airline,CSV-air-2
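
To make the role of the header row concrete, the following sketch maps each data row to a set of named fields, using only the .NET standard library. The class and method names (`CsvHeaderSketch`, `ToRecords`) are hypothetical, and the naive comma split assumes that no field is quoted or contains a comma. For real files, a CSV library is the safer choice.

```csharp
using System.Collections.Generic;
using System.IO;

public static class CsvHeaderSketch
{
    // Maps each data row to a field-name -> value dictionary, using the header row for names.
    // Assumption: no quoted fields and no commas inside values.
    public static List<Dictionary<string, string>> ToRecords(TextReader input)
    {
        var records = new List<Dictionary<string, string>>();

        // The first line is the header row that names the fields.
        var headerLine = input.ReadLine();
        if (headerLine == null) return records;
        string[] names = headerLine.Split(',');

        while (input.ReadLine() is string line)
        {
            if (line.Length == 0) continue; // skip blank lines

            string[] values = line.Split(',');
            var record = new Dictionary<string, string>();
            for (int i = 0; i < names.Length && i < values.Length; i++)
            {
                record[names[i]] = values[i];
            }
            records.Add(record);
        }
        return records;
    }
}
```

Passing the three sample lines above through `ToRecords` (for example via a `StringReader`) gives two records, each with the fields `id`, `type`, and `name`.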

Using cbimport

Using cbimport is straightforward. Ensure that the directory containing the Couchbase Server command-line clients is on your PATH.

You can import all of the data formats described above.

To import a CSV file using cbimport csv:

  1. Use the --dataset argument to specify the CSV file.

  2. Use the --cluster, --username, and --password arguments to specify your connection details.

  3. Use the --bucket and --scope-collection-exp arguments to specify the bucket, scope, and collection as required.

  4. Use the --generate-key argument to specify a key expression that generates the ID of each imported document.


The following example imports a local CSV file, generating IDs such as airline_1234.

cbimport csv \
  --dataset file://./import.csv \
  --cluster localhost --username Administrator --password password \
  --bucket travel-sample --scope-collection-exp inventory.airline \
  --generate-key %type%_%id%
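
To illustrate what a key expression such as `%type%_%id%` does, the following C# sketch substitutes each `%field%` placeholder with the matching field value from a record. It only imitates the shape of the expression and is not cbimport's own implementation. `KeyTemplateSketch` and `BuildKey` are hypothetical names, and the sketch handles plain field placeholders only.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class KeyTemplateSketch
{
    // Replaces each %field% placeholder with the value of that field in the record.
    // Hypothetical helper for illustration; not part of cbimport or any SDK.
    public static string BuildKey(string template, IReadOnlyDictionary<string, string> record)
    {
        return Regex.Replace(template, "%([^%]+)%", match =>
        {
            string field = match.Groups[1].Value;
            if (!record.TryGetValue(field, out var value))
            {
                throw new KeyNotFoundException($"Record has no field '{field}'");
            }
            return value;
        });
    }
}
```

For a record with `type` set to `airline` and `id` set to `20001`, `BuildKey("%type%_%id%", record)` returns `airline_20001`.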

Importing Using an SDK

While cbimport accomplishes all the necessary steps in a single command, as above, using an SDK gives you more flexibility and control. However, the same considerations apply, so let us look at each in turn.

Parsing the Import into an Array or Stream of Records

The details of how to parse the import data vary depending on the chosen input format, and the most appropriate library for your SDK.

Parsing CSV and TSV Data

To parse CSV and TSV data, use the CsvHelper library.

csharp
using System.Globalization;
using System.IO;
using System.Threading.Tasks;
using CsvHelper;
using CsvHelper.Configuration;
csharp
// Read a CSV file with a header row and upsert each row as a document.
public async Task importCSV(string filename)
{
    using (var reader = new StreamReader(filename))
    using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
    {
        var records = csv.GetRecords<dynamic>();
        foreach (dynamic record in records)
        {
            await upsertDocument(record);
        }
    }
}
csharp
// Read a tab-separated file; the same as importCSV apart from the delimiter.
public async Task importTSV(string filename)
{
    using (var reader = new StreamReader(filename))
    using (var tsv = new CsvReader(reader,
        new CsvConfiguration(CultureInfo.InvariantCulture) { Delimiter = "\t" }))
    {
        var records = tsv.GetRecords<dynamic>();
        foreach (dynamic record in records)
        {
            await upsertDocument(record);
        }
    }
}

Parsing JSON and JSONL Data

To parse JSON and JSONL data, use the Newtonsoft.Json library.

csharp
using System.IO;
using System.Threading.Tasks;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;
csharp
// Read a file containing a single JSON array and upsert each element.
public async Task importJSON(string filename)
{
    using (var reader = new StreamReader(filename))
    {
        var jsonReader = new JsonTextReader(reader);
        JArray arr = (JArray)JToken.ReadFrom(jsonReader);
        foreach (JObject record in arr)
        {
            await upsertDocument(record);
        }
    }
}
csharp
// Read a JSON Lines file (one JSON object per line) and upsert each object.
public async Task importJSONL(string filename)
{
    using (var reader = new StreamReader(filename))
    {
        // SupportMultipleContent lets the reader continue past the end of each object
        var jsonlReader = new JsonTextReader(reader)
        {
            SupportMultipleContent = true
        };
        while (jsonlReader.Read())
        {
            var record = (JObject)JToken.ReadFrom(jsonlReader);
            await upsertDocument(record);
        }
    }
}

Connecting to the Couchbase Server

First, you need the connection details for Couchbase Server.

Next, decide which bucket, scope, and collection you want to import into, and create them if they don't already exist.

csharp
// Requires: using Couchbase;
var cluster = await Cluster.ConnectAsync(
    "couchbase://your-ip",
    "Administrator",
    "password");
var bucket = await cluster.BucketAsync("travel-sample");
var scope = await bucket.ScopeAsync("inventory");
var _collection = await scope.CollectionAsync("airline");

For more information, refer to Managing Connections.

Inserting the Documents

Having processed each imported document, you can insert it into the keyspace. Couchbase is a key-value store, and the document is the value, so before you can insert the document, you need to determine the key.

To insert an imported document into the keyspace:

  1. Specify the key. This could be as simple as extracting the id field from the document, or using an incrementing sequence number.

  2. Do any additional processing, for example calculating fields, or adding metadata about the importer.

  3. Finally, use an upsert operation to store the document.

Use upsert rather than insert so that the document is stored even if the target key already has a value. If an error occurs, you can then make any required tweaks to the import file and re-run the whole import.
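
The difference between the two operations can be sketched with an in-memory stand-in for a collection. This is not the SDK: `InMemoryCollectionSketch` and its methods are hypothetical, and the exception type was chosen only for illustration. The sketch just contrasts the two behaviors when the same key is written twice.

```csharp
using System;
using System.Collections.Generic;

// In-memory stand-in for a collection, used only to contrast insert and upsert.
public class InMemoryCollectionSketch
{
    private readonly Dictionary<string, string> _docs = new Dictionary<string, string>();

    public int Count => _docs.Count;

    // Insert fails if the key already exists.
    public void Insert(string key, string document)
    {
        if (_docs.ContainsKey(key))
        {
            throw new InvalidOperationException($"Document '{key}' already exists");
        }
        _docs.Add(key, document);
    }

    // Upsert creates the document, or overwrites it if the key already exists.
    public void Upsert(string key, string document) => _docs[key] = document;

    public string Get(string key) => _docs[key];
}
```

Running the same import twice with `Upsert` leaves one document per key, holding the latest content. With `Insert`, the second run fails on the first key that already exists.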

To store the data, hook the prepared data into an upsert routine.

Because CsvHelper and Newtonsoft.Json produce different record types, the example provides an upsertDocument overload for each.
csharp
// CsvHelper emits `dynamic` records
public async Task upsertDocument(dynamic record)
{
    // define the key
    string key = record.type + "_" + record.id;

    // do any additional processing
    record.importer = ".NET SDK";

    // upsert the document
    await _collection.UpsertAsync(key, record);

    // any required logging
    Console.WriteLine(key);
}

// Newtonsoft.Json.Linq emits `JObjects`
public async Task upsertDocument(JObject record)
{
    // define the key
    string key = record["type"] + "_" + record["id"];

    // do any additional processing
    record["importer"] = ".NET SDK";

    // upsert the document
    await _collection.UpsertAsync(key, record);

    // any required logging
    Console.WriteLine(key);
}

For more information, refer to Data Operations.

Reference and information:

  • The Couchbase Server UI offers a graphical view of documents, to check your imports interactively.
