A newer version of this software is available
You are viewing the documentation for an older version of this software. To find the documentation for the current version, visit the Couchbase documentation home page.
This guide provides information for developers who want to use the Couchbase Python SDK to build applications that use Couchbase Server.
The following sections demonstrate how to get started using Couchbase with the Python SDK. We’ll first show how to install the SDK and then demonstrate how it can be used to perform some simple operations.
Install the following packages to get started using the Python SDK:
Download, install, and start Couchbase server. Come back here when you are done.
Download and install the C library. If you are using Microsoft Windows, you can skip this step because starting with version 1.0 Beta, the C library is bundled with the Python SDK.
If you already have the C SDK installed, ensure the installed version is at least 2.1.0. To determine the version, either query your package version, or run the cbc command:
shell> cbc version
cbc built from: libcouchbase 2.1.3_14_g7684ab7 (rev. 7684ab7832628d520cd2803f3a3d0822e993c3f3)
using libcouchbase: 2.1.3 (libevent)
Note that depending on how your libcouchbase was installed, the cbc command may not be available, and is not required for the Python SDK.
Check your Python version. It must be at least version 2.6 (Python version 3.x is supported as well). To check your Python version:
shell> python -V
Python 2.6.6
Install the Python development files.
For EL5-based RPM distributions (e.g. CentOS 5, RHEL 5), do
yum install python26-devel
For EL6-based RPM distributions (e.g. CentOS 6, RHEL 6), do
yum install python-devel
For Debian-based systems (including Ubuntu), do
apt-get install python-dev
For OS X and Windows systems, Python development files are already included in the Python distribution.
Install the Python SDK. The easiest way to do this is via the pip tool:
shell> pip install couchbase --quiet
If all went well, you should not see any errors printed to the screen.
Alternatively, you can manually download one of the packages at PyPI.
Verify that your Python SDK is available and working by entering the following command:
shell> python -c 'import couchbase'
If this does not print any errors or exceptions, your Python SDK is properly installed!
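The version and import checks from the steps above can also be run from inside Python. The following is a minimal sketch, not part of the SDK: the helper names python_is_supported and module_is_importable are made up for illustration, and the 2.6 minimum is the one stated above.

```python
import sys


def python_is_supported(version_info=sys.version_info, minimum=(2, 6)):
    """Return True if the interpreter meets the minimum version stated above."""
    return tuple(version_info[:2]) >= minimum


def module_is_importable(name):
    """Return True if `import <name>` succeeds, False if it raises ImportError."""
    try:
        __import__(name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    print("Python version OK: %s" % python_is_supported())
    print("couchbase importable: %s" % module_is_importable("couchbase"))
```

If the second line reports False, the SDK (or the C library it depends on) is probably not installed for the interpreter you are running.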
To follow the tradition of programming tutorials, we’ll start with “Hello Couchbase.” This example works with the “beer-sample” bucket that is provided with the default install.
from couchbase import Couchbase
from couchbase.exceptions import CouchbaseError

c = Couchbase.connect(bucket='beer-sample', host='localhost')

try:
    beer = c.get("aass_brewery-juleol")
except CouchbaseError as e:
    print "Couldn't retrieve value for key", e
    # Rethrow the exception, making the application exit
    raise

doc = beer.value

# Because Python 2.x will complain if an ASCII format string is used
# with Unicode format values, we make the format string unicode as well.
print unicode("{name}, ABV: {abv}").format(name=doc['name'], abv=doc['abv'])

doc['comment'] = "Random beer from Norway"

try:
    result = c.replace("aass_brewery-juleol", doc)
    print result
except CouchbaseError as e:
    print "Couldn't replace key"
    raise
The following points explain each step in the example:
Connecting
The Couchbase.connect class method constructs a new couchbase.connection.Connection object. This object represents a connection to a single bucket within the cluster. Arguments passed to connect are passed to the constructor (see API documentation on the Connection object for more details and options).
A bucket represents a logical namespace for a key. All keys must be unique within a single bucket, but multiple buckets can have keys with the same names and they will not conflict. A new connection object must be created for each bucket that you want to interact with in your application. This example creates one connection to the beer-sample bucket.
The constructor is passed the bucket name, which is beer-sample, and a node on the cluster to connect to. You can pass any node that is a member of the cluster. This example uses the local cluster instance.
Retrieving Data
The get method retrieves the value for the key requested. If the key exists, a Result object that contains the value of the key as well as additional metadata is returned. To get the actual value of the object, you can access the Result object’s value property.
If the key does not exist on the server, an exception of type CouchbaseError is thrown. This exception object can be caught and examined or printed to see more details about why the operation failed. See the API documentation for more details.
We treat the value as a dict object. Because Couchbase is a document-oriented database, values stored to the server are considered to be JSON by default, and when retrieved from the server are interpreted as JSON (and deserialized into a Python dict). It is possible to use formats other than the default JSON, however. The set methods accept a format keyword argument that indicates the conversion type to be used. The default is couchbase.FMT_JSON, but you can also use couchbase.FMT_BYTES, couchbase.FMT_UTF8, or couchbase.FMT_PICKLE instead. If none of these are sufficient, you can write your own custom Transcoder object to handle conversion on your own.
Starting from version 1.1.0, you may also use the couchbase.FMT_AUTO flag, which guesses a suitable format based on the Python datatype being passed.
Storing Data
To store documents in the server, you can use one of the set family of methods. Here we use replace, which enforces the constraint that a previous value of the document must already exist. This can be thought of as an update operation in terms of CRUD (create, read, update, delete).
The storage methods also return a Result object that contains metadata about the value stored.
Now we’re ready to run our first Couchbase program. The first line of its output shows the name field of the document (followed by its ABV), and the second line shows the Result object of the replace operation.
A document in Couchbase server consists of a key, value, and metadata.
Key
A key is a unique identifier for your data. Each document must have a unique key. The key can be any valid Unicode string.
Value
The value is your own application data that exists under the key. The format of the value can be anything. By default, only JSON-serializable objects are supported: Python str, unicode, dict, list, tuple, int, long, float, bool, and None types. In short, anything that the standard json.dumps accepts. JSON is the default format because it allows you to later query the database based on value contents, as will be explained later. You can also store arbitrary Python objects using the FMT_PICKLE value for the format option.
Metadata
The metadata contains information concerning the format of the value, that is, whether it’s JSON, Pickle, or something else. It also contains revision information such as the CAS, which we’ll read about later.
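To make the difference between the JSON and Pickle formats concrete, here is a minimal sketch that uses only the standard json and pickle modules. It does not contact a server and does not show how the SDK encodes values internally; the sample document and the helper is_json_serializable are made up for illustration.

```python
import json
import pickle


def is_json_serializable(value):
    """Return True if the standard json.dumps accepts the value."""
    try:
        json.dumps(value)
    except (TypeError, ValueError):
        return False
    return True


# Hypothetical document, not taken from the beer-sample bucket.
doc = {"name": "Sample Ale", "tags": ("dark", "seasonal"), "abv": 5.9, "retired": None}

# JSON round trip: tuples have no JSON equivalent, so they come back as lists.
json_copy = json.loads(json.dumps(doc))

# A set is rejected by json.dumps but survives a pickle round trip unchanged.
pickle_copy = pickle.loads(pickle.dumps(set([1, 2, 3])))

print("dict is JSON-serializable: %s" % is_json_serializable(doc))
print("set is JSON-serializable: %s" % is_json_serializable(set([1, 2, 3])))
```

The trade-off is the one described above: a pickled value can hold more Python types, but only JSON values can later be queried by their contents.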
You can store documents by providing the unique key under which the document will be stored, and the value which contains the actual document. You can retrieve documents either by directly specifying the unique key under which the document was stored or by querying views that retrieve documents based on specific criteria.
This section provides a bit more insight on how to store documents. This is a prerequisite to demonstrate how to retrieve documents because there must be something to retrieve.
There are additional storage methods beyond those described here; they are covered in Advanced Usage. These include manipulating numeric counters, setting expiration times for documents, and appending/prepending to existing values.
The Connection object provides the following store operations, which conform to the CRUD model:
set(key, value)
Stores the document value under the key. If the key did not previously exist, it is created. If the key already exists, its existing value is overwritten with the new contents of value.
add(key, value)
Stores the document value under the key, but only if key does not already exist. If key already exists, an exception is thrown.
replace(key, value)
Replace is the inverse of add. It sets the contents of key to value, but only if the key already exists. If the key does not already exist, an exception is thrown.
delete(key)
Deletes the key from the bucket. Future attempts to access this key via get raise an exception until something is stored again for this key using one of the set methods.
The following code demonstrates the store operations.
from couchbase import Couchbase
from couchbase.exceptions import CouchbaseError

key = "demo_key"
value = "demo_value"

# We use the 'default' bucket.
c = Couchbase.connect(bucket='default', host='localhost')

print "Setting key {0} with value {1}".format(key, value)
result = c.set(key, value)
print "...", result

print ""
print "Getting value for key {0}".format(key)
result = c.get(key)
print "...", result

print ""
print "Creating new key {0} with value 'new_value'".format(key)
print "This will fail as '{0}' already exists".format(key)
try:
    c.add(key, "another value")
except CouchbaseError as e:
    print e

print "Replacing existing key {0} with new value".format(key)
result = c.replace(key, "new value")
print "...", result

print ""
print "Getting new value for key {0}".format(key)
result = c.get(key)
print "...", result

print ""
print "Deleting key", key
result = c.delete(key)
print "...", result

print ""
print "Getting value for key {0}. This will fail as it has been deleted".format(key)
try:
    c.get(key)
except CouchbaseError as e:
    print e

print ""
print "Creating new key {0} with value 'added_value'".format(key)
result = c.add(key, "added_value")
print "...", result

print "Getting the new value"
result = c.get(key)
print "...", result
Output:
Setting key demo_key with value demo_value
... OperationResult<RC=0x0, Key=demo_key, CAS=0x2fbbf239b45e0400>
Getting value for key demo_key
... ValueResult<RC=0x0, Key=demo_key, Value=u'demo_value', CAS=0x2fbbf239b45e0400, Flags=0x0>
Creating new key demo_key with value 'new_value'
This will fail as 'demo_key' already exists
<Key=u'demo_key', RC=0xC[Key exists (with a different CAS value)], Operational Error, Results=1, C Source=(src/multiresult.c,148)>
Replacing existing key demo_key with new value
... OperationResult<RC=0x0, Key=demo_key, CAS=0xa047003ab45e0400>
Getting new value for key demo_key
... ValueResult<RC=0x0, Key=demo_key, Value=u'new value', CAS=0xa047003ab45e0400, Flags=0x0>
Deleting key demo_key
... OperationResult<RC=0x0, Key=demo_key, CAS=0xa147003ab45e0400>
Getting value for key demo_key. This will fail as it has been deleted
<Key=u'demo_key', RC=0xD[No such key], Operational Error, Results=1, C Source=(src/multiresult.c,148)>
Creating new key demo_key with value 'added_value'
... OperationResult<RC=0x0, Key=demo_key, CAS=0xe91f0c3ab45e0400>
Getting the new value
... ValueResult<RC=0x0, Key=demo_key, Value=u'added_value', CAS=0xe91f0c3ab45e0400, Flags=0x0>
Couchbase provides two ways to fetch your documents: you can retrieve a document by its key, or you can retrieve a set of documents that match some constraint by using Views. Because views are more complex, we’ll first demonstrate getting documents by their keys.
To get a single document, simply supply the key as the first argument to the get method. It returns a Result object on success that can then be used to extract the value.
client.set("my list", [])
result = client.get("my list")
doc = result.value
To get multiple documents, you can use the more efficient get_multi method. You pass it an iterable sequence of keys, and it returns a MultiResult object whose keys are the keys passed to get_multi and whose values are the Result objects for each key. MultiResult is a subclass of dict.
client.set_multi({
    'sheep_counting' : ['first sheep', 'second sheep'],
    'famous_sheep' : {'sherry lewis' : 'Lamb Chop'}
})
keys = ('sheep_counting', 'famous_sheep')
results = client.get_multi(keys)
for key, result in results.items():
    doc = result.value
If a document does not exist, a couchbase.exceptions.NotFoundError (which is a subclass of couchbase.exceptions.CouchbaseError) is thrown.
You can change this behavior by using the quiet keyword parameter and setting it to True (to suppress exceptions for a specific get call) or by setting the Connection.quiet property on the Connection object (which suppresses exceptions on get for subsequent calls).
When using quiet, you can still determine if a key was retrieved successfully by examining the success property of the Result object.
result = client.get("non-exist-key", quiet=True)
if result.success:
    print "Got document OK"
else:
    print ("Couldn't retrieve document. "
           "Result was received with code"), result.rc

client = Couchbase.connect(bucket='default', quiet=True)
result = client.get("non-exist-key")
if result.success:
    print "Got document OK"
else:
    print "Couldn't retrieve document"
The rc property of the Result object contains the error code received on failure (on success, its value is 0). You can also obtain the exception class which would have been thrown by using
>>> CouchbaseError.rc_to_exctype(result.rc)
<class 'couchbase.exceptions.NotFoundError'>
This class method is passed an error code and produces the appropriate exception class.
When using get_multi with the quiet option enabled, you can immediately determine whether all the keys were fetched successfully by examining the all_ok property of the returned MultiResult object.
results = client.get_multi(("i exist", "but i don't"), quiet=True)
if not results.all_ok:
    print "Couldn't get all keys"
In addition to fetching documents by keys, you can also employ Views to retrieve information using secondary indexes. This guide gets you started on how to use them from the Python SDK. If you want to learn more about views, see the chapter in the Couchbase Server 2.0 documentation.
First, create your view definition by using the web UI (though you may also do this directly from the Python SDK, as will be shown later).
You can then query the view results by calling the query method on the Connection object. Simply pass it the design and view name.
view_results = client.query("beer", "brewery_beers")
for result in view_results:
    print "Mapped key: %r" % (result.key,)
    print "Emitted value: %r" % (result.value,)
    print "Document ID: %s" % (result.docid,)
The query method returns a couchbase.views.iterator.View object, which is an iterator. You can simply iterate over it to retrieve the results for the query. Each object yielded is a ViewRow, which is a simple object containing the key, value, document ID, and optionally the document itself for each of the results returned by the view.
In addition to the design and view name, the query method accepts additional keyword arguments that control the behavior of the results returned. You can thus use it as follows:
results = client.query("beer", "brewery_beers", opt1=value1, opt2=value2, ...)
for result in results:
    # do something with result..
    pass
Here are some of the available parameters for the query method. A full listing can be found in the API documentation.
include_docs
This boolean parameter indicates whether the corresponding document should be retrieved for each row fetched. If this is true, the doc property of the ViewRow object yielded by the iterator returned by query contains a Result object that contains the document for the key.
reduce
This boolean parameter indicates whether the server should also pass the results to the view’s reduce function. An exception is raised if the view does not have a reduce method defined.
limit
This numeric parameter indicates the maximum number of results to fetch from the query. This parameter is handy if your query can produce a lot of results.
descending
This boolean parameter indicates that the results should be returned in reverse order.
stale
This boolean parameter controls the tradeoff between performance and freshness of data.
debug
This boolean parameter fetches low-level debugging information from the view engine.
streaming
This boolean parameter indicates whether the view results should be decoded in a streaming manner. When enabled, the iterator internally fetches chunks of the response as required.
As this is less efficient than fetching all results at once, it is disabled by default, but can be very useful if you have a large dataset because it prevents the entire view from being buffered in memory.
results = client.query("beer", "brewery_beers",
                       include_docs=True, limit=5)
for result in results:
    print "key is %r" % (result.key,)
    doc = result.doc.value
    if doc['type'] == "beer":
        print "Got a beer. It's got %0.2f ABV" % (doc['abv'],)
The default encoding format for the Python SDK is JSON. This means you can pass any valid object that is accepted by the standard json.dumps library function and you will receive it back when you retrieve it.
# -*- coding: utf-8 -*-
import pprint
from couchbase import Couchbase

client = Couchbase.connect(bucket='default', host='localhost')

value = {
    "this is a" : "dictionary",
    "and this is a list" : ["with", "some", "elements"],
    "and this is a tuple" : ("with", "more", "elements"),
    "you can also use floats" : 3.14,
    "integers" : 42,
    "strings" : "hello",
    "unicode" : "שלום!",
    "blobs" : "\x00",
    "or a None" : None
}

client.set("a_key", value)
result = client.get("a_key")
pprint.pprint(result.value)
print result.value['unicode']
Which then prints
{u'and this is a list': [u'with', u'some', u'elements'],
u'and this is a tuple': [u'with', u'more', u'elements'],
u'blobs': u'\x00',
u'integers': 42,
u'or a None': None,
u'strings': u'hello',
u'this is a': u'dictionary',
u'unicode': u'\u05e9\u05dc\u05d5\u05dd!',
u'you can also use floats': 3.14}
שלום!
To view the document you just created, go to localhost:8091 in your browser, type in your administrative credentials, go over to the Data Buckets pane, click on the Documents button for the default bucket, and type in the ID for the document (in this case, it’s a_key). The document can now be indexed and queried against using views.
While JSON is the default format, you might want to use other formats. For example, if you want to store complex custom Python objects and classes and don’t require that they be indexed with views, you can use the pickle serialization format. It allows you to store types that are not accepted by JSON:
import pprint
from couchbase import Couchbase, FMT_PICKLE
c = Couchbase.connect(bucket='default')
c.set("a_python_object", object(), format=FMT_PICKLE)
c.set("a_python_set", set([1,2,3]), format=FMT_PICKLE)
pprint.pprint(c.get("a_python_object").value)
pprint.pprint(c.get("a_python_set").value)
Outputs:
<object object at 0x7fa7d0ad80e0>
set([1, 2, 3])
You can also store arbitrary strings of bytes by using FMT_BYTES.
In Python 2.6 and 2.7, bytes and str are the same type; in Python 3, however, a str is a Unicode text string (i.e. Python 2’s unicode), while bytes is a sequence of bytes that must be explicitly decoded before it can be used with text operations.
import pprint
from couchbase import Couchbase, FMT_BYTES
c = Couchbase.connect(bucket='default')
c.set("blob", b"\x01\x02\x03\x04", format=FMT_BYTES)
pprint.pprint(c.get("blob").value)
Outputs
b'\x01\x02\x03\x04'
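The distinction between text and bytes can be seen without the SDK. The following minimal sketch uses only built-in types and reuses the greeting from the JSON example earlier in this guide; it runs unchanged on Python 2.6+ and Python 3.3+.

```python
# Text (unicode in Python 2, str in Python 3): a sequence of characters.
text = u"\u05e9\u05dc\u05d5\u05dd!"

# Bytes: what is actually stored. The conversion must be explicit.
raw = text.encode("utf-8")

# Each of the four Hebrew letters takes two bytes in UTF-8; "!" takes one.
print("characters: %d, bytes: %d" % (len(text), len(raw)))

# Decoding with the same encoding recovers the original text.
round_tripped = raw.decode("utf-8")
```

A value stored with FMT_BYTES is returned as bytes, so any decoding back to text is up to the application.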
Alternatively, you can use FMT_UTF8 to store a unicode object represented as UTF-8.
While JSON is also capable of storing strings and Unicode, the JSON specification mandates that all strings begin and end with a quote ( " ). This uses additional space and costs extra processing power to decode and encode your JSON string. Therefore, you can save on performance by using FMT_UTF8 for simple strings.
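The space overhead of JSON for plain strings can be estimated with the standard json module. This is a rough sketch only: the helper encoded_sizes is made up for illustration, and it uses the default settings of json.dumps, which may differ from the encoder settings the SDK uses internally.

```python
import json


def encoded_sizes(text):
    """Return (size as a JSON string, size as raw UTF-8), both in bytes."""
    as_json = json.dumps(text).encode("utf-8")
    as_utf8 = text.encode("utf-8")
    return (len(as_json), len(as_utf8))


# ASCII text: JSON adds the two surrounding quotes.
print("hello: %d bytes as JSON, %d bytes as UTF-8" % encoded_sizes(u"hello"))

# Non-ASCII text: json.dumps escapes the character as \uXXXX by default,
# while UTF-8 stores it directly.
print("U+03EE: %d bytes as JSON, %d bytes as UTF-8" % encoded_sizes(u"\u03ee"))
```

For the single character used in the FMT_UTF8 example, the default JSON encoding is several times larger than the raw UTF-8 form.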
It is possible to encode your data by using encodings other than UTF-8. However, because the view engine operates using UTF-8, we selected this as the default. If you need a different encoding, consider using the Transcoder interface.
from couchbase import Couchbase, FMT_UTF8
c = Couchbase.connect(bucket='default')
c.set("EXCALIBUR", u"\u03EE", format=FMT_UTF8)
print c.get("EXCALIBUR")
Outputs
ValueResult<RC=0x0, Key=EXCALIBUR, Value=u'\u03ee', CAS=0x6552956a925e0400, Flags=0x4>
You can set the default format for the value type you use most by setting the default_format property on the connection object, either during construction or afterwards:
c = Couchbase.connect(bucket='default', default_format=FMT_UTF8)
Or
c.default_format = FMT_PICKLE
In this chapter we build on the foundations of the Getting Started guide and build a simple web application. Make sure you have the beer-sample bucket installed, because we’ll be using it. The sample application will allow you to edit and manage various beers and breweries.
The sample application is not entirely complete, and there are some features that remain to be implemented. Implementing them is an exercise for the reader.
The full source code for the sample application is available through couchbaselabs on GitHub.
Note that the sample application provides more content than described in this tutorial, but it should be simple to navigate while reading this tutorial.
Ensure you have Flask installed. You can either install it via your distribution or use pip install Flask.
Download Couchbase Server and install it. Make sure to install the beer-sample dataset when you run the wizard because this tutorial application works with it.
Clone the repository and cd into the directory:
shell> git clone git://github.com/couchbaselabs/beersample-python
Cloning into 'beersample-python'...
#…
shell> cd beersample-python
Some views need to be set up. You can set up the views manually via the Web UI, or invoke the design_setup.py script located in the beersample-python directory.
In the beer design document, create a view called by_name:
function (doc, meta) {
    if (doc.type && doc.type == "beer") {
        emit(doc.name, null);
    }
}
Create a design document called brewery and add a view called by_name:
function (doc, meta) {
    if (doc.type && doc.type == "brewery") {
        emit(doc.name, null);
    }
}
Invoke the beer.py script:
shell> python beer.py
* Running on http://0.0.0.0:5000/
* Restarting with reloader
Navigate to localhost:5000 and enjoy the application!
In this section we’ll talk a bit about setting up your directory layout and adding some views in the server before we start dealing with the Python SDK and Flask itself.
Create a project directory named beer:
shell> mkdir beer
shell> cd beer
shell> mkdir templates
shell> mkdir templates/beer
shell> mkdir templates/brewery
shell> mkdir static
shell> mkdir static/js
shell> mkdir static/css
Showing your directory contents displays something like this:
shell> find . -type d
./static
./static/js
./static/css
./templates
./templates/brewery
./templates/beer
To make the application look pretty, we’re incorporating jQuery and Twitter Bootstrap. You can either download the libraries and put them in their appropriate css and js directories (under static), or clone the project repository and use it from there. If you followed the Quickstart steps, you already have the files in your beersample-python directory. Either way, make sure you have the following files in place:
static/css/beersample.css
static/css/bootstrap.min.css (the minified Twitter Bootstrap library)
static/css/bootstrap-responsive.min.css (the minified responsive layout classes from Bootstrap)
static/js/beersample.js
static/js/jquery.min.js (the jQuery JavaScript library)
From here on, you should have a bare-bones web application configured that has all the dependencies included. We’ll now move on and configure the beer-sample bucket the way we need it.
The beer-sample bucket comes with a small set of predefined views, but to make our application function correctly we need some more. This is also a good chance to explore the view management possibilities inside the Web UI.
Because we want to list beers and breweries by name, we need to define one view for each. Head over to the Web UI and click on the Views menu. Select beer-sample from the drop-down list to switch to the correct bucket. Now click on Development Views and then Create Development View to define your first view. You need to give it the name of both the design document and the actual view. Insert the following names:
Design Document Name: _design/dev_beer
View Name: by_name
The next step is to define the map and (optional) reduce functions. In our examples, we won’t use the reduce functions at all but you can play around and see what happens. Insert the following map function (that’s JavaScript) and click Save.
function (doc, meta) {
    if (doc.type && doc.type == "beer") {
        emit(doc.name, null);
    }
}
Every map function takes the full document (doc) and its associated metadata (meta) as the arguments. You are then free to inspect this data and emit a result when you want to have it in your index. In our case, we emit the name of the beer (doc.name) when the document both has a type field and the type is beer. We don’t need to emit a value, which is why we are using null here.
It’s always advisable to keep the index as small as possible. Resist the urge to include the full document through emit(meta.id, doc), because it will increase the size of your view indexes. If you need to access the full document (or large parts), then use include_docs in the query method, which returns ViewRow objects together with their documents. You can also call cb.get(row.docid) to get the individual doc for a single row. The resulting retrieval of the document might be slightly out of sync with your view, but it will be fast and efficient.
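To see what the map function contributes to the index, it can help to mimic it in plain Python. The sketch below is only an analogy: the real view engine runs JavaScript on the server, updates the index incrementally, and sorts keys with its own collation rules. The function names and the sample documents are made up for illustration.

```python
def beer_by_name(doc, meta):
    """Python analogue of the JavaScript map function: yield (key, value) rows."""
    if doc.get("type") == "beer":
        yield (doc["name"], None)


def build_index(documents, map_function):
    """Apply map_function to every document and return the rows sorted by key."""
    rows = []
    for doc_id, doc in documents.items():
        for key, value in map_function(doc, {"id": doc_id}):
            rows.append({"key": key, "value": value, "id": doc_id})
    return sorted(rows, key=lambda row: (row["key"], row["id"]))


# Hypothetical documents; only those with type "beer" end up in the index.
documents = {
    "brewery_a-stout": {"type": "beer", "name": "Stout"},
    "brewery_a": {"type": "brewery", "name": "Brewery A"},
    "brewery_a-amber": {"type": "beer", "name": "Amber"},
    "misc": {"name": "no type field"},
}

index = build_index(documents, beer_by_name)
for row in index:
    print("%(key)s -> %(id)s" % row)
```

Each row carries only the emitted key, the emitted value (None here, mirroring null), and the document ID, which is why emitting null keeps the index small.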
Now we need to define a view for our breweries. You already know how to do this — here is all the information you need to create a brewery view:
Design Document Name: _design/dev_brewery
View Name: by_name
Map Function:
function (doc, meta) {
    if (doc.type && doc.type == "brewery") {
        emit(doc.name, null);
    }
}
The final step is to push the design documents to production. While the design documents are in development, the index is applied only on the local node. Because we want to have the index on the whole dataset, click the Publish button on both design documents (and accept any pop-up windows that warn you about overriding the old design documents).
For more information about using views for indexing and querying from Couchbase Server, see the following helpful resources in the Couchbase Server Manual :
General information: Views and Indexes.
Sample patterns: View and Query Pattern Samples.
Time stamp patterns: Many developers ask about extracting information based on date or time. To find out more, see Date and Time Selection.
We’ll be showing bits and pieces of the web app as it pertains to specific sections. The entire file is less than 300 lines long, and you can inspect it by looking into the beer.py file in the repository.
First, our imports. We need some extra imports to be able to handle exceptions properly and let us build better view queries.
from collections import namedtuple
import json

from flask import Flask, request, redirect, abort, render_template

from couchbase import Couchbase
from couchbase.exceptions import KeyExistsError, NotFoundError
from couchbase.views.iterator import RowProcessor
from couchbase.views.params import UNSPEC, Query
Then, we want to set some constants for our application:
DATABASE = 'beer-sample'
HOST = 'localhost'
ENTRIES_PER_PAGE = 30
The ENTRIES_PER_PAGE variable is used later on to configure how many beers and breweries to show in the search results.
Now, we’re ready to create our Flask application instance:
app = Flask(__name__, static_url_path='')
app.config.from_object(__name__)
The first line creates a new Flask application. The first argument is the module in which the application is defined. Because we’re using only a single file as our application, we can use __name__, which expands to the name of the current file being executed (minus the .py suffix).
The second argument instructs Flask to treat unrouted URLs as being requests for files located in the static directory we created earlier. This allows our templates to load the required .js and .css files.
The second line creates a configuration object for our app. The argument is the name of the module to scan for configuration directives. Flask scans this module for variable names in UPPER_CASE and places them in the app.config dictionary.
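The scanning step can be pictured with a few lines of plain Python. This is a sketch of the idea only, not Flask’s implementation: the helper uppercase_settings and the stand-in module fake_settings are made up for illustration, and Flask itself is not imported.

```python
import types


def uppercase_settings(obj):
    """Collect the UPPER_CASE attributes of obj into a plain dict."""
    return dict(
        (name, getattr(obj, name))
        for name in dir(obj)
        if name.isupper()
    )


# Hypothetical stand-in for the application module.
settings_module = types.ModuleType("fake_settings")
settings_module.DATABASE = "beer-sample"
settings_module.HOST = "localhost"
settings_module.ENTRIES_PER_PAGE = 30
settings_module.helper = lambda: None  # lower-case name, so it is skipped

config = uppercase_settings(settings_module)
for name in sorted(config):
    print("%s = %r" % (name, config[name]))
```

This is why the three constants defined above become available as app.config['DATABASE'], app.config['HOST'], and app.config['ENTRIES_PER_PAGE'], while functions and lower-case variables in the same file do not.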
Then, define a function to give us a database connection:
def connect_db():
    return Couchbase.connect(
        bucket=app.config['DATABASE'],
        host=app.config['HOST'])

db = connect_db()
You already know how to connect to a Couchbase cluster, so we’ll skip the explanation here.
The module-level db variable is set to be the Connection object. In larger applications this is not a good idea, but we can get away with it here because this is a simple app.
The first route we will implement is that of the welcome page, that is, the page that is displayed when someone goes to the root of your site. Because there is no Couchbase interaction involved, we just tell Flask to render the template.
@app.route('/')
def welcome():
    return render_template('welcome.html')

app.add_url_rule('/welcome', view_func=welcome)
The welcome.html template is actually a Jinja template inside the templates directory. It looks like this:
{% extends "layout.html" %}
{% block body %}
<div class="span6">
  <div class="span12">
    <h4>Browse all Beers</h4>
    <a href="/beers" class="btn btn-warning">Show me all beers</a>
    <hr />
  </div>
  <div class="span12">
    <h4>Browse all Breweries</h4>
    <a href="/breweries" class="btn btn-info">Take me to the breweries</a>
  </div>
</div>
<div class="span6">
  <div class="span12">
    <h4>About this App</h4>
    <p>Welcome to Couchbase!</p>
    <p>This application helps you to get started on application
       development with Couchbase. It shows how to create, update and
       delete documents and how to work with JSON documents.</p>
  </div>
</div>
{% endblock %}
The template simply provides some links to the brewery and beer pages (which are shown later).
An interesting thing about this template is that it “inherits” from the common layout.html template. All pages in the beer app have a common header and footer, with only their body differing. This is the layout.html template.
<!DOCTYPE HTML>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Couchbase Python Beer Sample</title>
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <meta name="description" content="The Couchbase Java Beer-Sample App">
  <meta name="author" content="Couchbase, Inc. 2013">
  <link href="/css/bootstrap.min.css" rel="stylesheet">
  <link href="/css/beersample.css" rel="stylesheet">
  <link href="/css/bootstrap-responsive.min.css" rel="stylesheet">
  <!-- HTML5 shim, for IE6-8 support of HTML5 elements -->
  <!--[if lt IE 9]>
    <script src="http://html5shim.googlecode.com/svn/trunk/html5.js"></script>
  <![endif]-->
</head>
<body>
  <div class="container-narrow">
    <div class="masthead">
      <ul class="nav nav-pills pull-right">
        <li><a href="/welcome">Home</a></li>
        <li><a href="/beers">Beers</a></li>
        <li><a href="/breweries">Breweries</a></li>
      </ul>
      <h2 class="muted">Couchbase Beer Sample</h2>
    </div>
    <hr>
    <div class="row-fluid">
      <div class="span12">
        {% block body %}{% endblock %}
      </div>
    </div>
    <hr>
    <div class="footer">
      <p>© Couchbase, Inc. 2013</p>
    </div>
  </div>
  <script src="/js/jquery.min.js"></script>
  <script src="/js/bootstrap.min.js"></script>
  <script src="/js/beersample.js"></script>
</body>
</html>
If you start your app now, you should be able to navigate to localhost:5000 and see the welcome page. You’ll get a 404 error if you try to visit any links, though, because we haven’t implemented them yet. Let’s do that now!
In this section we’ll show the construction of the web app with respect to managing beers. We’ll be able to list, inspect, edit, create, search, and delete beers.
Now we’re finally getting into the cooler stuff of this tutorial. First, we’ll implement several classes for our pages to use.
class Beer(object):
    def __init__(self, id, name, doc=None):
        self.id = id
        self.name = name
        self.brewery = None
        self.doc = doc

    def __getattr__(self, name):
        # Called only for attributes that are not set on the object itself;
        # fall back to the fields of the underlying document.
        if not self.doc:
            return ""
        return self.doc.get(name, "")

class BeerListRowProcessor(object):
    """
    This is the row processor for listing all beers (with their brewery IDs).
    """
    def handle_rows(self, rows, connection, include_docs):
        ret = []
        by_docids = {}

        for r in rows:
            b = Beer(r['id'], r['key'])
            ret.append(b)
            by_docids[b.id] = b

        keys_to_fetch = [ x.id for x in ret ]
        docs = connection.get_multi(keys_to_fetch, quiet=True)

        for beer_id, doc in docs.items():
            if not doc.success:
                # The document could not be fetched; drop its row from the list.
                ret.remove(by_docids[beer_id])
                continue
            beer = by_docids[beer_id]
            beer.brewery_id = doc.value['brewery_id']

        return ret
First, we declare a simple Beer
object. This app isn’t too fancy and we
could’ve just used a simple dict
. However, it allows us to demonstrate the use
of the RowProcessor
interface.
In the beer listing page, we want to display each beer along with a link to the
brewery that produces it. However, we’ve defined the beer/by_name
view to
return only the name of the beer. To obtain the brewery, we need to fetch each
beer document and examine it. The document contains the Brewery ID that we need
later.
The BeerListRowProcessor
is an implementation of the RowProcessor
interface
that operates on the returned view rows.
For each raw JSON row, it creates a new Beer
object. The first argument is the
document ID, which is used to provide a link to display more information about
the beer. The second argument is the name of the beer itself, which we use in
the beer list on the webpage.
We also create a local variable called by_docids
that allows us to get a
Beer
object by its document ID.
After we’ve created all the beers, we create a list of document IDs to fetch by
using list comprehension. We pass this list to get_multi
(passing
quiet=True
, because there might be some inconsistencies between the view
indexes and the actual documents).
While we could have made this simpler by performing an individual get
on each
beer.id
, that would be less efficient in terms of network usage.
Now that we have the beer documents, it’s time to set each beer’s brewery_id
to its relevant value.
We first check to see that each document was successful in being retrieved; then
we look up the corresponding Beer
object by getting it from the by_docids
dictionary using the beer_id
as the key.
Then, we extract the brewery_id
field from the document and place it into the
Beer
object.
Finally, we return the list of populated beers. The View
object (returned by
the query
function) now yields results as we iterate over it.
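The batching idea can be tried in isolation with a small in-memory stand-in. In the sketch below, `FakeResult` and `FakeConnection` are hypothetical substitutes for the SDK's result and connection objects (they are not part of the SDK), and the sample rows are made up. Only the overall shape follows the row processor above.

```python
class FakeResult(object):
    """Hypothetical stand-in for an SDK result: a success flag plus a value."""
    def __init__(self, value=None, success=True):
        self.value = value
        self.success = success


class FakeConnection(object):
    """Hypothetical in-memory connection exposing only get_multi."""
    def __init__(self, docs):
        self.docs = docs
        self.round_trips = 0

    def get_multi(self, keys, quiet=False):
        # One call fetches every key, mimicking a single batched request.
        self.round_trips += 1
        return dict(
            (k, FakeResult(self.docs[k]) if k in self.docs
                else FakeResult(success=False))
            for k in keys)


def attach_brewery_ids(rows, connection):
    """Return (id, name, brewery_id) tuples for rows whose document exists."""
    docs = connection.get_multi([r['id'] for r in rows], quiet=True)
    out = []
    for r in rows:
        doc = docs[r['id']]
        if doc.success:  # skip rows whose document is missing
            out.append((r['id'], r['key'], doc.value['brewery_id']))
    return out


conn = FakeConnection({'b1': {'brewery_id': 'brew_a'}})
rows = [{'id': 'b1', 'key': 'Pale Ale'}, {'id': 'gone', 'key': 'Old Stout'}]
beers = attach_brewery_ids(rows, conn)
```

However many rows the view returns, the stand-in connection is contacted once, which is the point of preferring `get_multi` over one `get` per row.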
Before we forget, let’s put this all together:
@app.route('/beers')
def beers():
    rp = BeerListRowProcessor()
    rows = db.query("beer", "by_name",
                    limit=ENTRIES_PER_PAGE,
                    row_processor=rp)
    return render_template('beer/index.html', results=rows)
We tell Flask to route requests to /beers
to this function. We create an
instance of the BeerListRowProcessor
class we just defined.
We then execute a view query using the query
method, passing it the name of
the design and view ( beer
and by_name
, respectively).
We set the limit
directive to the aforementioned ENTRIES_PER_PAGE
constant,
to avoid flooding a single webpage with many results.
We finally tell the query
method to use our own BeerListRowProcessor
for
processing the results.
We then direct the template engine to render the beer/index.html
template,
setting the template variable results
to the iterable returned by the query
function.
Here is the beer/index.html
template:
{% extends "layout.html" %}
{% block body %}
<h3>Browse Beers</h3>
<form class="navbar-search pull-left">
<input id="beer-search" type="text" class="search-query" placeholder="Search for Beers">
</form>
<table id="beer-table" class="table table-striped">
<thead>
<tr>
<th>Name</th>
<th>Brewery</th>
<th></th>
</tr>
</thead>
<tbody>
{% for beer in results %}
<tr>
<td><a href="/beers/show/{{beer.id}}">{{beer.name}}</a></td>
<td><a href="/breweries/show/{{beer.brewery_id}}">To Brewery</a></td>
<td>
<a class="btn btn-small btn-warning" href="/beers/edit/{{beer.id}}">Edit</a>
<a class="btn btn-small btn-danger" href="/beers/delete/{{beer.id}}">Delete</a>
</td>
</tr>
{% endfor %}
</tbody>
</table>
<div>
<a class="btn btn-small btn-success" href="/beers/create">Add Beer</a>
</div>
{% endblock %}
We’re using Jinja {% for %}
blocks to iterate and emit a fragment of HTML
for each Beer
object returned by the query.
Navigate to localhost:5000/beers
, to see a listing of beers. Each beer has To
Brewery
, Edit
, and Delete
buttons.
On the bottom of the page, you can also see an Add Beer
button, which allows
you to define new beers.
Let’s implement the Delete
button next!
Due to the simplicity of Couchbase and Flask, we can implement a single method to delete both beers and breweries.
@app.route('/<otype>/delete/<id>')
def delete_object(otype, id):
    try:
        db.delete(id)
        return redirect('/welcome')
    except NotFoundError:
        return "No such {0} '{1}'".format(otype, id), 404
Here we tell Flask to route any URL that has as its second component the string
delete
to this method. The paths in <angle brackets>
are routing tokens that
Flask passes to the handler as arguments. This means that URLs such as
/beers/delete/foobar
and /foo/delete/whatever
are all routed here.
When we get an ID, we try to delete it by using the delete
method in a try
block. If successful, we redirect to the welcome page, but if the key does not
exist, we return with an error message and a 404
status code.
You can now access this page by going to
localhost:5000/beers/delete/nonexistent
and get a 404 error. Or you can delete
a beer by clicking on one of the Delete
buttons in the /beers
page!
If you find that a beer is still displayed after you click the delete button, you can refresh the browser page to verify that the beer has been deleted.
Another way to verify that a beer has been deleted is by clicking the delete button again and getting a 404 error.
Here we demonstrate how you can display the beers. In this case, we display a page showing all the fields and values of a given beer.
@app.route('/beers/show/<beer_id>')
def show_beer(beer_id):
    doc = db.get(beer_id, quiet=True)
    if not doc.success:
        return "No such beer {0}".format(beer_id), 404
    return render_template(
        'beer/show.html',
        beer=Beer(beer_id, doc.value['name'], doc.value))
As with the delete
action, we first check to see that the beer exists. We are
passed the beer ID as the last part of the URL - this is passed to us as the
beer_id
.
In order to display the information for the given beer ID, we simply call the
connection’s get
method with the beer_id
argument. We also pass the quiet
parameter so that we don’t receive an exception if the beer does not exist.
We then check to see that the success
property of the returned Result
object
is true. If it isn’t, we return an HTTP 404
error.
If the beer exists, we construct a new Beer
object; passing it the ID and the
name
field within the value dictionary.
We then pass this beer to the templates/beer/show.html
template which we’ll
show here:
{% extends "layout.html" %}
{% block body %}
{% set display = beer.doc %}
{% set brewery_id = display['brewery_id'] %}
<h3>Show Details for Beer "{{beer.name}}"</h3>
<table class="table table-striped">
<tbody>
<tr>
<td><strong>brewery_id</strong></td>
<td><a href="/breweries/show/{{brewery_id}}">{{brewery_id}}</a></td>
</tr>
{% for k, v in display.items() if not k == "brewery_id" %}
<tr>
<td><strong>{{k}}</strong></td>
<td>{{v}}</td>
</tr>
{% endfor %}
</tbody>
</table>
<a class="btn btn-medium btn-warning"
href="/beers/edit/{{beer.id}}">Edit</a>
<a class="btn btn-medium btn-danger"
href="/beers/delete/{{beer.id}}">Delete</a>
{% endblock %}
Here we define the display
variable using a {% set %}
directive. This
makes dealing with the rest of the code simpler.
The next thing we do is extract the brewery_id
, and create a special entry
with a link pointing to the page to display the actual brewery.
Then we iterate over the rest of the fields (omitting the brewery ID), printing out the key and value of each.
Finally, we provide links at the bottom to Edit
and Delete
the beer.
def normalize_beer_fields(form):
    doc = {}
    for k, v in form.items():
        name_base, fieldname = k.split('_', 1)
        if name_base != 'beer':
            continue
        doc[fieldname] = v

    if not 'name' in doc or not doc['name']:
        return (None, ("Must have name", 400))
    if not 'brewery_id' in doc or not doc['brewery_id']:
        return (None, ("Must have brewery ID", 400))
    if not db.get(doc['brewery_id'], quiet=True).success:
        return (None,
                ("Brewery ID {0} not found".format(doc['brewery_id']), 400))
    return doc, None

@app.route('/beers/edit/<beer>', methods=['GET'])
def edit_beer_display(beer):
    bdoc = db.get(beer, quiet=True)
    if not bdoc.success:
        return "No Such Beer", 404
    return render_template('beer/edit.html',
                           beer=Beer(beer, bdoc.value['name'], bdoc.value),
                           is_create=False)

@app.route('/beers/edit/<beer>', methods=['POST'])
def edit_beer_submit(beer):
    doc, err = normalize_beer_fields(request.form)
    if not doc:
        return err
    db.set(beer, doc)
    return redirect('/beers/show/' + beer)
We define two handlers for editing. The first handler is the GET
method for
/beers/edit/<beer>
, which displays a nice HTML form that we can use to edit
the beer. It passes the following parameters to the template: the Beer
object
and a Boolean that indicates this is not a new beer (because the same template
is also used for the Create Beer
form).
The second handler is the POST
method, which validates the input. The post
handler calls the normalize_beer_fields
function, which converts the form
fields into properly formed names for the beer document, checks to see that the
beer has a valid name
, and checks to see that a brewery_id
is specified and
that it indeed exists. If all the checks pass, the function returns a tuple of (
doc
, None
). The POST
handler checks whether the first element of the
returned tuple is empty. If it is, validation failed, and the second element is
a tuple holding the error message and the HTTP status code, which the handler
returns as the response. Otherwise, the first element is the
document, which the handler stores in Couchbase by using the set
method.
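The validation step can be exercised without a server. The following is a minimal sketch, not the application's code: `normalize_fields` is a hypothetical variant that takes a `brewery_exists` callable in place of the database lookup, and the form values are made up.

```python
def normalize_fields(form, brewery_exists):
    """Strip the 'beer_' prefix from form fields and validate the result.

    Returns (doc, None) on success or (None, (message, status)) on failure.
    `brewery_exists` is a hypothetical callable standing in for the
    database lookup performed by the real handler.
    """
    doc = {}
    for k, v in form.items():
        prefix, _, fieldname = k.partition('_')
        if prefix != 'beer' or not fieldname:
            continue  # ignore fields that do not belong to the beer form
        doc[fieldname] = v
    if not doc.get('name'):
        return None, ("Must have name", 400)
    if not doc.get('brewery_id'):
        return None, ("Must have brewery ID", 400)
    if not brewery_exists(doc['brewery_id']):
        return None, ("Brewery ID {0} not found".format(doc['brewery_id']), 400)
    return doc, None


known = set(['brew_a'])
ok_doc, ok_err = normalize_fields(
    {'beer_name': 'Pale Ale', 'beer_brewery_id': 'brew_a', 'csrf': 'x'},
    lambda bid: bid in known)
bad_doc, bad_err = normalize_fields({'beer_name': ''}, lambda bid: True)
```

The sketch uses `str.partition` rather than `split('_', 1)` so that a field name without an underscore is skipped instead of raising an error when it is unpacked.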
The template is rather wordy because we enumerate all the possible fields with a nice description.
{% extends "layout.html" %}
{% block body %}
{% if is_create %}
<h3>Create Beer</h3>
{% else %}
<h3>Editing {{beer.name}}</h3>
{% endif %}
<form method="post" action="">
<fieldset>
<legend>General Info</legend>
<div class="span12">
<div class="span6">
<label>Name</label>
<input type="text" name="beer_name" placeholder="The name of the beer." value="{{beer.name}}">
<label>Description</label>
<input type="text" name="beer_description" placeholder="A short description." value="{{beer.description}}">
</div>
<div class="span6">
<label>Style</label>
<input type="text" name="beer_style" placeholder="Bitter? Sweet? Hoppy?" value="{{beer.style}}">
<label>Category</label>
<input type="text" name="beer_category" placeholder="Ale? Stout? Lager?" value="{{beer.category}}">
</div>
</div>
</fieldset>
<fieldset>
<legend>Details</legend>
<div class="span12">
<div class="span6">
<label>Alcohol (ABV)</label>
<input type="text" name="beer_abv" placeholder="The beer's ABV" value="{{beer.abv}}">
<label>Bitterness (IBU)</label>
<input type="text" name="beer_ibu" placeholder="The beer's IBU" value="{{beer.ibu}}">
</div>
<div class="span6">
<label>Beer Color (SRM)</label>
<input type="text" name="beer_srm" placeholder="The beer's SRM" value="{{beer.srm}}">
<label>Universal Product Code (UPC)</label>
<input type="text" name="beer_upc" placeholder="The beer's UPC" value="{{beer.upc}}">
</div>
</div>
</fieldset>
<fieldset>
<legend>Brewery</legend>
<div class="span12">
<div class="span6">
<label>Brewery</label>
<input type="text" name="beer_brewery_id" placeholder="The brewery" value="{{beer.brewery_id}}">
</div>
</div>
</fieldset>
<div class="form-actions">
<button type="submit" class="btn btn-primary">Save changes</button>
</div>
</form>
{% endblock %}
The template first checks the is_create
variable. If it’s False
, then we’re
editing an existing beer, and the caption is filled with that name. Otherwise,
it’s titled as Create Beer
.
Creating beers is largely the same as editing beers:
@app.route('/beers/create')
def create_beer_display():
    return render_template('beer/edit.html', beer=Beer('', ''), is_create=True)

@app.route('/beers/create', methods=['POST'])
def create_beer_submit():
    doc, err = normalize_beer_fields(request.form)
    if not doc:
        return err
    id = '{0}-{1}'.format(doc['brewery_id'],
                          doc['name'].replace(' ', '_').lower())
    try:
        db.add(id, doc)
        return redirect('/beers/show/' + id)
    except KeyExistsError:
        return "Beer already exists!", 400
Here we display the same form as the one for editing beers, except we set the
is_create
parameter to True, and pass an empty Beer
object. This is
necessary because the template still tries to populate the form fields with
existing values.
In the POST
handler, we call normalize_beer_fields
as above when editing
beers.
Because we’re creating a new beer, we use the add
method instead. This
raises an exception if the beer already exists. We catch this and display it to
the user.
If everything went well, the user is redirected to the beer display page for the newly created beer.
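The document ID scheme used by the create handler is easy to check on its own. In this small sketch, `make_beer_id` is a hypothetical helper name and the brewery and beer names are example values.

```python
def make_beer_id(brewery_id, name):
    """Build a document ID the same way the create handler does."""
    return '{0}-{1}'.format(brewery_id, name.replace(' ', '_').lower())


beer_id = make_beer_id('21st_amendment_brewery_cafe', 'Bitter American')
same_id = make_beer_id('21st_amendment_brewery_cafe', 'bitter american')
```

Names that differ only in letter case map to the same ID, which is one reason the handler stores the document with add and reports a conflict instead of silently overwriting an existing beer.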
In the beer listing page above, you might have noticed a search box at the top. We can use it to dynamically filter our table based on user input. We’ll use Javascript at the client layer to perform the querying and filtering, and views with range queries at the server (Flask) layer to return the results.
Before we implement the Python-level search method, we need to put the following
in the static/js/beersample.js
file (if it’s not there already) to listen on
search box changes and update the table with the resulting JSON (which is
returned from the search method):
$(document).ready(function() {
/**
* AJAX Beer Search Filter
*/
$("#beer-search").keyup(function() {
var content = $("#beer-search").val();
if(content.length >= 0) {
$.getJSON("/beers/search", {"value": content}, function(data) {
$("#beer-table tbody tr").remove();
for(var i=0;i<data.length;i++) {
var html = "<tr>";
html += "<td><a href=\"/beers/show/"+data[i].id+"\">"+data[i].name+"</a></td>";
html += "<td><a href=\"/breweries/show/"+data[i].brewery+"\">To Brewery</a></td>";
html += "<td>";
html += "<a class=\"btn btn-small btn-warning\" href=\"/beers/edit/"+data[i].id+"\">Edit</a>\n";
html += "<a class=\"btn btn-small btn-danger\" href=\"/beers/delete/"+data[i].id+"\">Delete</a>";
html += "</td>";
html += "</tr>";
$("#beer-table tbody").append(html);
}
});
}
});
});
The code waits for keyup events on the search field, and if they happen, it issues an AJAX query on the search function within the app. The search handler computes the result (using views) and returns it as JSON. The JavaScript then clears the table, iterates over the results, and creates new rows.
The search handler looks like this:
# Needed by the search handler (if not already imported at the top of the file)
import json
from couchbase.views.params import Query

def return_search_json(ret):
    response = app.make_response(json.dumps(ret))
    response.headers['Content-Type'] = 'application/json'
    return response

@app.route('/beers/search')
def beer_search():
    # Default to an empty string so the concatenation below cannot fail
    value = request.args.get('value', '')
    q = Query()
    q.mapkey_range = [value, value + Query.STRING_RANGE_END]
    q.limit = ENTRIES_PER_PAGE

    ret = []

    rp = BeerListRowProcessor()
    res = db.query("beer", "by_name",
                   row_processor=rp,
                   query=q,
                   include_docs=True)

    for beer in res:
        ret.append({'id' : beer.id,
                    'name' : beer.name,
                    'brewery' : beer.brewery_id})

    return return_search_json(ret)
The beer_search
function first extracts the user input by examining the query
string from the request.
It then creates a Query
object. The Query
object’s mapkey_range
property
is set to a list of two elements; the first is the user input, and the second is
the user input with the magic STRING_RANGE_END
string appended to it. This
form of range indicates that all keys that start with the user input ( value
)
are returned. If we just provided a single element, the results would also
contain matches that are lexically greater than the user input; if we just
provided the same value for the second and first elements, only items that match
the string exactly are returned.
The special STRING_RANGE_END
is actually a u"\u0FFF"
Unicode character, which
for the view engine means “end here.” You need to get used to it a bit, but it’s
actually very neat and efficient.
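The effect of such a range can be imitated on a plain sorted list. This is only a sketch of the idea: it uses Python's code point ordering, whereas the view engine applies its own collation, and `prefix_range` and the sample names are made up for illustration.

```python
import bisect

STRING_RANGE_END = u"\u0FFF"  # the code point described in the text


def prefix_range(sorted_keys, prefix):
    """Return the keys k with prefix <= k <= prefix + STRING_RANGE_END."""
    lo = bisect.bisect_left(sorted_keys, prefix)
    hi = bisect.bisect_right(sorted_keys, prefix + STRING_RANGE_END)
    return sorted_keys[lo:hi]


names = sorted([u"abbaye", u"abbey ale", u"amber", u"ab", u"aa", u"stout"])
matches = prefix_range(names, u"ab")
exact_only = [k for k in names if u"ab" <= k <= u"ab"]
```

Here `matches` holds every name that starts with `ab`, while `exact_only` shows what happens when both ends of the range are the same value: only the exact key is kept.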
We re-use our BeerListRowProcessor
class to filter the results here (because
the data required is the same as that of the beer listing ( beer/index.html
)
page).
However we need to return a JSON array of
{ "id" : "beer_id", "name" : "beer_name", "brewery" : "the_brewery_id" }
so we need to convert the rows into JSON first. This is done by the
return_search_json
function.
Now your search box should work nicely.
While this is implemented in the repository above, it is left as an exercise to the reader to work out some more details.
The tutorial presents an easy approach to start a web application with Couchbase Server as the underlying data source. If you want to dig a little bit deeper, the full source code in the couchbaselabs repository on GitHub has more code to learn from. This code might be extended and updated from time to time.
Of course, this is only the starting point for Couchbase, but together with the Getting Started Guide, you should now be well equipped to start exploring Couchbase Server on your own. Have fun working with Couchbase!
Some things are not implemented in the example. Here are some ideas for features you might add while learning to use the SDK:
When deleting a brewery, ensure it has no beers dependent on it.
Provide a search where one can query beers belonging to a given brewery.
Handle concurrent updates to a beer and/or brewery.
Implement a like feature, where one can like a beer or a brewery; likewise, they can unlike one as well!
This section only gives an introduction to the available APIs. The actual API reference features more options for each of the APIs described here and is always more up-to-date than the documentation here.
This section goes a level lower than the Getting Started guide and features the aspects of the APIs offered by the SDK.
You can connect to a bucket using the simple Couchbase.connect()
class method:
from couchbase import Couchbase
client = Couchbase.connect(bucket='default')
Sometimes it is beneficial to let the client know beforehand about multiple nodes. For example, when you have a cluster with several nodes and some nodes might not be up, if you pass only one node the client’s constructor might raise an exception. Instead, you can pass multiple nodes as a list so that the constructor tries each node until it gets a successful connection or the timeout is reached.
c = Couchbase.connect(
    bucket='default',
    host=['foo.com', 'bar.com', 'baz.com']
)
The client uses timeouts so that your application does not wait too long if the cluster is overloaded or there are connectivity issues. By default, this timeout value is 2.5 seconds.
You can adjust this value by setting it in the constructor:
c = Couchbase.connect(bucket='default', timeout=5.5)
Or by setting the timeout
property:
c.timeout = 4.2
If your bucket is password protected, you can pass the Simple Authentication and
Security Layer (SASL) password by using the password
keyword parameter in the
constructor:
c = Couchbase.connect(bucket='default', password='s3cr3t')
Threads will be discussed later on in more detail, but the Connection
object
is fully thread-safe out of the box by default. You can tune some parameters
that sacrifice thread-safety for performance.
Before we discuss the individual sections of the API, we’ll discuss the common
return value, which is the Result
object.
Typically, a subclass of this object appropriate to the executed operation is returned.
All Result
objects have the following properties:
success
A boolean property that indicates whether this operation was successful.
rc
This is the low level return code as received from the underlying libcouchbase
layer. This is 0 on success and nonzero on failure. Typically this is useful on
operations in which quiet
was set to True
. Normally you’d use it like this:
result = client.get("key", quiet=True)
if not result.success:
    print("Got error code {0}".format(result.rc))
__str__
While this isn’t really a property, printing out the result object will yield interesting metadata to aid in debugging this particular operation.
These methods, if successful, set the value of their respective keys. If they
fail, they raise an exception (and are not affected by the quiet
property).
These methods can accept a format
property, which indicates the format in
which the value will be stored on the server, as well as a ttl
property which
indicates the lifetime of the value. After the ttl
lifetime is reached, the
value is deleted from the server.
Starting from version 1.1.0,
these methods can also accept persist_to
and replicate_to
arguments,
specifying the synchronization of application code with replication and
persistence events. See the API documentation for more details.
client.set(key, value, **kwargs)
Sets the key unconditionally.
client.add(key, value, **kwargs)
Sets the key to the specified value, but only if the key does not already exist (an exception is raised otherwise).
client.replace(key, value, **kwargs)
Replaces an existing key with a new value. This raises an exception if the key does not already exist.
These methods operate on 64-bit integer counters. They provide efficient
mutation and retrieval of values. You can use these in place of the set
family
of methods when working with numeric values. For example:
# Without counters: a non-atomic read-modify-write cycle
key = "counter"
try:
    result = c.get("counter")
    c.set(key, result.value + 1)
except NotFoundError:
    c.add(key, 10)

# With counters: a single atomic operation
key = "counter"
c.incr(key, initial=10)
These methods accept the ttl
argument to set the expiration time for their
values, as well as an amount
value that indicates by what amount to modify
their values. Additionally, an initial
keyword argument is available to
provide the default value for the counter if it does not yet exist. If an
initial
argument is not provided and the key does not exist, an exception is
raised.
The value for the counter stored must either not exist (if initial
is used) or
should be a “Number,” that is, a textual representation of an integer.
If using the default FMT_JSON
, then your integers are already compliant.
If the existing value is not already a number, the server raises an exception
(specifically, a DeltaBadvalError
).
Arithmetic methods return a ValueResult
object (subclass of Result
). The
value
property can be used to obtain the current value of the counter.
c.incr(key, amount=1, ttl=0)
Increments the value stored under the key.
c.decr(key, amount=1, ttl=0)
Decrements the value stored under the key. In this case, amount
is how much to
subtract from the key.
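The role of the initial argument can be illustrated with an in-memory stand-in. `FakeCounterStore` below is hypothetical and not part of the SDK. It assumes that when the key is missing the counter is created with the initial value as-is (the amount is not applied), and it raises a plain `KeyError` where the SDK would raise its own exception.

```python
class FakeCounterStore(object):
    """Hypothetical in-memory stand-in that mimics incr semantics."""

    def __init__(self):
        self.data = {}

    def incr(self, key, amount=1, initial=None):
        if key not in self.data:
            if initial is None:
                raise KeyError(key)   # no default given for a missing counter
            self.data[key] = initial  # created with the initial value as-is
        else:
            self.data[key] += amount
        return self.data[key]


store = FakeCounterStore()
first = store.incr("counter", initial=10)   # key is created
second = store.incr("counter", initial=10)  # key exists, so it is incremented

try:
    store.incr("missing")
    missing_raised = False
except KeyError:
    missing_raised = True
```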
These operations act on the stored values and append or prepend additional data
to it. They treat existing values as strings and only work if the existing
stored data is a string (that is, FMT_UTF8
or FMT_BYTES
).
The format
argument is still available, but the value must be either
FMT_UTF8
or FMT_BYTES
. If not specified, it defaults to FMT_UTF8
.
Otherwise, they are part of the set
family of methods:
c.set("greeting", "Hello", format=FMT_UTF8)
c.append("greeting", " World!")
c.get("greeting").value == "Hello World!"
c.prepend("greeting", "Why, ")
c.get("greeting").value == "Why, Hello World!"
c.append(key, data_to_append, **kwargs)
Appends data to an existing value.
c.prepend(key, data_to_prepend, **kwargs)
Prepends data to an existing value.
Ensure that you only append or prepend to values that were initially stored as
FMT_UTF8
or FMT_BYTES
. It does not make sense to append to a JSON or
pickle string.
Consider:
c.set("a_dict", { "key for" : "a dictionary" })
The key a_dict
now looks like this on the server:
{"key for": "a dictionary"}
Now, prepend the following to it:
c.prepend("a_dict", "blah blah blah")
The value for a_dict
looks like this now:
blah blah blah{"key for": "a dictionary"}
Now, when you try to get it back, you see this happen:
>>> c.get("a_dict")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "couchbase/connection.py", line 376, in get
return _Base.get(self, key, ttl, quiet, replica, no_format)
File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
return _default_decoder.decode(s)
File "/usr/lib/python2.7/json/decoder.py", line 365, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python2.7/json/decoder.py", line 383, in raw_decode
raise ValueError("No JSON object could be decoded")
couchbase.exceptions.ValueFormatError:
<Failed to decode bytes, Results=1, inner_cause=No JSON object could be decoded,
C Source=(src/convert.c,215), OBJ='blah blah blah{"key for": "a dictionary"}'>
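The same failure can be reproduced with the standard json module alone, without a server. This sketch only imitates the effect of a raw prepend on a JSON-encoded value; it does not call the SDK.

```python
import json

stored = json.dumps({"key for": "a dictionary"})  # what a JSON-format set stores
corrupted = "blah blah blah" + stored             # effect of a raw prepend

try:
    json.loads(corrupted)
    decoded_ok = True
except ValueError:  # json.JSONDecodeError is a ValueError subclass on Python 3
    decoded_ok = False
```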
This consists of a single method that is used to update the expiration time of a given key. It is passed two arguments, a key and an expiration time. If the expiration time is greater than zero, the key receives the new expiration time.
The expiration time can be expressed as either an offset in seconds or a Unix
time stamp. If the value of the expiration time is smaller than 60*60*24*30
(that is, one month in seconds) it is considered to be an offset in seconds.
Larger values are considered to be Unix time stamps.
If the expiration time is zero, then any existing expiration time is cleared and the value remains stored indefinitely unless explicitly deleted or updated with expiration at a later time.
This is a lightweight means by which to ensure entities “stay alive” without the overhead of having to reset their value or fetch them.
c.touch(key, ttl)
Updates the given key with the specified ttl
.
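The rule for interpreting the expiration value can be written down as a small function. This is a sketch of the rule as described above, not SDK code: `describe_expiry` is a hypothetical helper, and treating a value of exactly 30 days as a timestamp is an assumption, because the text only covers smaller and larger values.

```python
THIRTY_DAYS = 60 * 60 * 24 * 30


def describe_expiry(ttl, now):
    """Return the absolute Unix time at which a key expires, or None if never.

    0 clears the expiry, values smaller than 30 days are offsets from `now`,
    and larger values are taken as absolute Unix time stamps.
    """
    if ttl == 0:
        return None
    if ttl < THIRTY_DAYS:
        return now + ttl
    return ttl


now = 1400000000
relative = describe_expiry(60, now)          # one minute from now
absolute = describe_expiry(1500000000, now)  # already a Unix time stamp
never = describe_expiry(0, now)
```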
client.delete(key, quiet=False)
Removes a key from the server. If quiet
is specified, an exception is not
raised if the key does not exist.
client.get(key, quiet=False, ttl=0)
Retrieves a key from the server. If the key does not exist, an exception is
raised unless quiet
is set to True.
If ttl
is specified, this also modifies, in-situ, the expiration time of the
key when retrieving it. This is also known as Get and Touch.
This returns a ValueResult
object (subclass of Result
) that can be used to
obtain the actual value via the value
property.
In production deployments, it is possible that you will have more than a single instance of your application trying to modify the same key. In this case a race condition can occur in which a modification made by one instance is immediately overwritten by another.
Consider this code:
def add_friend(user_id, friend):
    result = c.get("user_id-" + user_id)
    result.value['friends'][friend] = { 'added' : time.time() }
    c.set("user_id-" + user_id, result.value)
In this case, friends
is a dictionary of friends the user has added, with the
keys being the friend IDs, and the values being the time when they were added.
When the friend has been added to the dictionary, the document is stored again on the server.
Assume that two application instances add a friend to the same user at the same time. In this case, there is a race condition in which only one version of the friends dictionary ultimately wins.
Couchbase provides two means by which to solve this problem. The first is called Opportunistic Locking (more commonly known as optimistic locking) and the second is called Pessimistic Locking.
Both forms of locking involve using a CAS value. This value indicates the state of a document at a specific time. Whenever a document is modified, this value changes. The contents of this value are not significant to the application; however, it can be used to ensure consistency. You can pass the CAS of the value as it is known to the application and have the server make the operation fail if the current (server-side) CAS value differs.
The opportunistic locking functionality can be employed by using the cas
keyword argument to the set
family of methods.
Note that the cas
value itself may be obtained by inspecting the cas
property of the Result
object returned by any of the API functions.
We can now modify the add_friend
example so that it handles concurrent
modifications gracefully:
def add_friend(user_id, friend):
    while True:
        result = c.get("user_id-" + user_id)
        result.value['friends'][friend] = { 'added' : time.time() }
        try:
            c.set("user_id-" + user_id, result.value, cas=result.cas)
            break
        except KeyExistsError:
            print("It seems someone tried to modify our user at the same time!")
            print("Trying again")
This is called opportunistic locking, because if the CAS is not modified during the first loop, the operation succeeds without any additional steps.
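The retry loop can be watched in action with an in-memory stand-in. Everything in this sketch is hypothetical: `FakeCasStore` and `CasMismatch` imitate the server's CAS check (the SDK raises `KeyExistsError` instead), and the `before_write` hook exists only to simulate a second writer slipping in between the read and the write.

```python
import itertools


class CasMismatch(Exception):
    """Hypothetical error; the SDK raises KeyExistsError in this situation."""


class FakeCasStore(object):
    """Hypothetical in-memory store in which every write changes the CAS."""

    def __init__(self):
        self._data = {}
        self._next_cas = itertools.count(1)

    def get(self, key):
        value, cas = self._data[key]
        return dict(value), cas  # hand out a copy, like a decoded document

    def set(self, key, value, cas=0):
        if cas and key in self._data and self._data[key][1] != cas:
            raise CasMismatch(key)
        self._data[key] = (dict(value), next(self._next_cas))


def add_friend(store, user, friend, before_write=None):
    """Retry the read-modify-write cycle until the CAS check passes."""
    attempts = 0
    while True:
        attempts += 1
        doc, cas = store.get(user)
        doc['friends'] = dict(doc['friends'], **{friend: True})
        if before_write and attempts == 1:
            before_write()  # simulate another writer sneaking in
        try:
            store.set(user, doc, cas=cas)
            return attempts
        except CasMismatch:
            continue


store = FakeCasStore()
store.set("u1", {'friends': {}})
attempts = add_friend(
    store, "u1", "bob",
    before_write=lambda: store.set("u1", {'friends': {'alice': True}}))
final_doc, _ = store.get("u1")
```

The first attempt fails because the competing write changed the CAS. The second attempt re-reads the document, so both friends end up in the stored value.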
Pessimistic locking is useful for highly contended resources; that is, if the key being accessed has a high likelihood of being contended. While this method may be more complex, it is much more efficient for such resources.
We can use pessimistic locking by employing the lock
and unlock
functions.
The lock
method locks the key on the server for a specified amount of time.
After the key is locked, further attempts to access the key without passing the
proper CAS fail with a TemporaryFailError
exception until the key is either
unlocked or the lock timeout is reached.
c.lock(key, ttl=0)
This has the same behavior as get
(that is, it returns the value on the
server), but the ttl
argument now indicates how long the lock should be held
for. By default, the server-side lock timeout is used, which is 15 seconds.
Returns a ValueResult
.
c.unlock(key, cas)
Unlocks the key. The key must have been previously locked and must have been
locked with the specified cas
. The cas
value can be obtained from the
Result
object’s cas
property.
Calling any of the set
methods with a valid CAS implicitly unlocks the key,
and thus makes an explicit call to unlock
unnecessary. Calling unlock
on a
key that is not currently locked raises an exception.
We can rewrite our add_friend
example using the lock functions:
def add_friend(user_id, friend):
    while True:
        try:
            result = c.lock("user_id-" + user_id)
            break
        except TemporaryFailError:
            # Someone else has locked the key..
            pass
    try:
        result.value['friends'][friend] = { 'added' : time.time() }
        c.set("user_id-" + user_id, result.value, cas=result.cas)
    except:
        # We want to unlock if anything happens, rather than waiting
        # for it to time out
        c.unlock(result.key, result.cas)
        # then, raise the exception
        raise
Opportunistic locking is more convenient and sometimes more familiar to users. Additionally, it does not require an explicit unlock phase.
However, during a CAS mismatch, the full value is still sent to the server in the case of opportunistic locking. For highly contended resources this has impacts on network I/O, because the value must be sent multiple times before it is actually stored.
Pessimistic locking does not retrieve its value unless the operation was successful, however.
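The lock life cycle can also be sketched with an in-memory stand-in. `FakeLockStore` and `Locked` are hypothetical and not part of the SDK, and lock timeouts are deliberately left out to keep the example short.

```python
class Locked(Exception):
    """Hypothetical error; the SDK raises its own exception for locked keys."""


class FakeLockStore(object):
    """Hypothetical in-memory store with lock, unlock and CAS-checked set."""

    def __init__(self, data):
        self._data = dict(data)
        self._locks = {}  # key -> CAS handed to the lock holder
        self._cas = 0

    def lock(self, key):
        if key in self._locks:
            raise Locked(key)
        self._cas += 1
        self._locks[key] = self._cas
        return self._data[key], self._cas

    def set(self, key, value, cas=0):
        if key in self._locks:
            if self._locks[key] != cas:
                raise Locked(key)
            del self._locks[key]  # a write with the right CAS also unlocks
        self._data[key] = value

    def unlock(self, key, cas):
        if self._locks.get(key) != cas:
            raise Locked(key)
        del self._locks[key]


store = FakeLockStore({"u1": ["alice"]})
value, cas = store.lock("u1")
try:
    store.set("u1", ["intruder"])          # no CAS: rejected while locked
    intruder_blocked = False
except Locked:
    intruder_blocked = True
store.set("u1", value + ["bob"], cas=cas)  # holder writes and releases
value_after, cas_after = store.lock("u1")  # the key can be locked again
```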
This section provides a bit more information on how to work with views from the Python SDK. If you are new to views, you should read the chapter about views in the Couchbase Server Manual.
To use views, you must have already set up design documents containing one or more view queries you have defined. You can execute these queries from the Python SDK and retrieve their results.
Couchbase Server comes with two pre-defined sample buckets which can be installed from the “Sample Buckets” section in the “Settings” pane.
The basic interface for views is as follows:
client.query(design_name, view_name)
This returns an iterable object that
yields ViewRow
objects.
ViewRow
objects are simple named tuples with the following fields:
vr.key
The key emitted by the view’s map
function (i.e. first argument to emit
).
vr.value
The value emitted by the view’s map
function (i.e. second argument to emit
).
vr.id
The document ID of this row. The ID can be passed to get
and set
.
vr.doc
A Result
object containing the actual document, if the query
method was
passed the include_docs
directive.
The object returned by query is a class that defines only __iter__; it does not have a __len__ or items() method. You can convert it to a list by using a list comprehension:
rows_as_list = [row for row in c.query("beer", "brewery_beers")]
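The row fields and the iterable-only behavior can be tried without a server. The following is a minimal sketch: the ViewRow defined here is a local stand-in with the four fields listed above, FakeView is a hypothetical class, and the sample keys and IDs are made up for illustration.

```python
from collections import namedtuple

# Local stand-in with the four fields described above; the SDK's own
# ViewRow class is not imported here.
ViewRow = namedtuple("ViewRow", ["key", "value", "id", "doc"])


class FakeView(object):
    """Iterable-only object: defines __iter__ but no __len__."""

    def __init__(self, rows):
        self._rows = rows

    def __iter__(self):
        for key, value, docid in self._rows:
            # doc stays None, as it would without include_docs
            yield ViewRow(key, value, docid, None)


# Made-up sample rows (key, value, document ID)
view = FakeView([
    (["brewery_a"], None, "brewery_a"),
    (["brewery_a", "brewery_a-ale"], None, "brewery_a-ale"),
])

rows_as_list = list(view)                # same as [row for row in view]
ids = [row.id for row in rows_as_list]   # fields are accessed by name
```

Calling len(view) directly would raise a TypeError, which is why the conversion to a list is needed when a length or index is required.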
You can also pass options to the query
method. The list of available options
is documented in the Query
class in the API documentation.
from couchbase.views.params import Query
client.query(design_name,
             view_name,
             limit=3,
             mapkey_range=["abbaye", "abbaye" + Query.STRING_RANGE_END],
             descending=True)
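The mapkey_range value in the example selects every key that begins with a given prefix by appending a sentinel that sorts after ordinary text. The following is a minimal sketch of that idea only: RANGE_END is an assumed value chosen for illustration (this guide does not show the actual value of Query.STRING_RANGE_END), keys_in_range is a hypothetical helper, and plain Python string ordering stands in for the server's collation.

```python
from bisect import bisect_left, bisect_right

# Assumption for illustration: a sentinel that sorts after ordinary text.
# Query.STRING_RANGE_END plays this role in the SDK.
RANGE_END = u"\u0fff"


def keys_in_range(sorted_keys, start, end):
    """Return the keys k with start <= k <= end (inclusive range)."""
    lo = bisect_left(sorted_keys, start)
    hi = bisect_right(sorted_keys, end)
    return sorted_keys[lo:hi]


keys = sorted(["abbaye", "abbaye_de_leffe", "abbey", "abita", "zebra"])
matches = keys_in_range(keys, "abbaye", "abbaye" + RANGE_END)
```

Here matches contains only the keys that start with "abbaye"; "abbey" falls outside the range because it sorts after the sentinel-terminated end key.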
The include_docs
directive can be used to fetch the
documents along with each ViewRow
object. Note that while it is possible to
simply call c.get(vr.id)
, the client handles the include_docs
directive by
actually performing a batched ( get_multi
) operation.
You can also pass options for the server itself to handle. These options can be
passed as either an encoded query string, a list of key-value parameters, or a
Query
object.
client.query("beer", "brewery_beers", query="limit=3&skip=1&stale=false")
Passing an encoded query string is the most efficient way to pass options because they do not need to be re-encoded for each invocation.
However, the SDK cannot verify the inputs, so it is suggested that you use a raw string only once your query has been refined and optimized.
client.query("beer", "brewery_beers", limit=3, skip=1, stale=False)
Passing key-value parameters allows simple and idiomatic construction of query options.
from couchbase.views.params import Query
q = Query()
q.limit = 3
q.skip = 1
q.stale = False
client.query("beer", "brewery_beers", query=q)
The Query
object makes it simple to programmatically construct a query, and
provides the most maintainable option. When using key-value pairs, the SDK
actually converts them to a Query
object before processing.
Query
objects also have named properties, making query construction easy to
integrate if using an IDE with code completion.
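To make the relationship between the keyword form and the encoded string form concrete, here is a minimal sketch that turns keyword options into a query string using only the Python 3 standard library. The helper encode_view_options is hypothetical and is not how the SDK itself encodes options; it only illustrates why a pre-encoded string saves work on repeated calls.

```python
from urllib.parse import urlencode


def encode_view_options(**options):
    """Encode keyword options as a query string (hypothetical helper)."""
    encoded = []
    for name in sorted(options):
        value = options[name]
        if isinstance(value, bool):
            value = "true" if value else "false"   # JSON-style booleans
        encoded.append((name, value))
    return urlencode(encoded)


query_string = encode_view_options(limit=3, skip=1, stale=False)
```

The resulting string has the same shape as the raw query string shown earlier, and it could be computed once and reused for every invocation.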
Here are some common parameters used for views. They are available either as
keyword options to the query
method, or as properties on the Query
object
mapkey_range = [ "start", "end" ]
Sets the start and end key range for keys emitted by the map
function.
startkey = "start"
Sets the start key.
endkey = "end"
Sets the end key.
descending = True
Inverts the default sort order.
stale = False
Possible values are True
, False
, or the string update_after
.
limit = 10
Limits the number of rows returned by the query.
These are only available as options to the query
method, and should not be
used on the Query
object.
include_docs = True
Fetches corresponding documents along with each row.
streaming = True
Fetches results incrementally. Don’t buffer all results in memory at once.
Often, view results can be large. By default the client reads all the results
into memory and then returns an iterator over that result set. You can change
this behavior by specifying the streaming
option to the query
method. When
used, results are fetched incrementally.
Using streaming
does not have any impact on how the rows are returned.
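The difference between the default behavior and streaming can be pictured with generators. The following is a minimal sketch in which fetch_pages is a hypothetical stand-in for the network layer; it records each simulated fetch so that the two modes can be compared.

```python
def fetch_pages(total_rows, page_size, log):
    """Stand-in for the network: yields rows one page at a time."""
    for start in range(0, total_rows, page_size):
        log.append("fetch %d" % start)
        for n in range(start, min(start + page_size, total_rows)):
            yield n


def buffered(total_rows, page_size, log):
    """Read everything into memory first, then hand back an iterator."""
    return iter(list(fetch_pages(total_rows, page_size, log)))


def streaming(total_rows, page_size, log):
    """Hand back rows as they arrive."""
    return fetch_pages(total_rows, page_size, log)


buffered_log, streaming_log = [], []
first_buffered = next(buffered(6, 2, buffered_log))
first_streaming = next(streaming(6, 2, streaming_log))
```

After reading a single row, buffered_log records all three fetches while streaming_log records only the first; the rows themselves are identical in both modes.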
The Python Couchbase SDK provides means by which you can manage design documents, including all phases of design document development. You can:
Create a development design.
Publish a development design to a production design.
Retrieve a design document.
Delete a design document.
Note that all design creation methods take a syncwait
argument, which is an
optional amount of time to wait for the operation to be complete. By default the
server (and thus the SDK) only schedules a design document operation. This
means that if you try to use the view right after you created it, you might get
an error because the operation has not yet completed. Using the syncwait
parameter polls for this many seconds and either returns successfully or raises
an exception.
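The polling behavior described for syncwait can be sketched as a generic wait loop. This is an illustration only: poll_until and its interval parameter are hypothetical, and the SDK's own polling logic is not shown in this guide.

```python
import time


def poll_until(check, syncwait, interval=0.01):
    """Call check() until it returns True or syncwait seconds elapse."""
    deadline = time.time() + syncwait
    while True:
        if check():
            return True
        if time.time() >= deadline:
            raise RuntimeError("operation did not complete in time")
        time.sleep(interval)
```

In this sketch, check would be a function that asks whether the design document is usable yet. The call either returns successfully or raises once the allotted time has passed.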
An additional argument that can be provided is the use_devmode
parameter. If
on, the name of the design will be prepended with dev_
(if it does not already
start with it).
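The naming rule for use_devmode amounts to a conditional prefix. A minimal sketch, using a hypothetical helper named design_name that covers only the behavior described above:

```python
def design_name(name, use_devmode):
    """Return the design name, adding the dev_ prefix when requested."""
    if use_devmode and not name.startswith("dev_"):
        return "dev_" + name
    return name
```

For example, design_name("beer", True) gives "dev_beer", and a name that already starts with dev_ is left unchanged.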
All these operations return an HttpResult
object which contains the decoded
JSON payload in its value
property.
c.design_create(name, design, use_devmode=True, syncwait=0)
Creates a new design document. name
is the name of the design document (for
example, "beer"
). design
is either a Python dictionary representing the
structure of the design or a valid string (that is, encoded JSON) to be passed
to the server.
c.design_get(name, use_devmode=True)
Retrieves the design document.
c.design_publish(name, syncwait=0)
Converts a development-mode view into a production mode view. This is equivalent to pressing the Publish button on the web UI.
c.design_delete(name, use_devmode=True)
Deletes a design document.
This covers advanced topics and builds on the Using the APIs section.
Most API functions have both single and multi-key (batched) variants. The
batched variant has the same name as the single-key variant, but its method name
has _multi
appended to it.
The batched operations are significantly quicker and more efficient, especially when dealing with many small values, because they allow pipelining of requests and responses, saving on network latency.
Batched operations tend to accept an iterable of keys (or a dict of keys, depending on the method) and return a dictionary of the following format:
c.foo_multi(["key1", "key2", "key3"])
{
    "key1" : FooResult(...),
    "key2" : FooResult(...),
    "key3" : FooResult(...)
}
Sometimes a single key in a batched operation fails, resulting in an exception.
It is still possible to retrieve the full result set of the failed batched
operation by using the all_results
property of the thrown exception (assuming
it is of type CouchbaseError
).
from couchbase.exceptions import CouchbaseError

c.set("foo", "foo value")
try:
    c.add_multi({
        "foo" : "foo value",
        "bar" : "bar value",
        "baz" : "baz value"
    })
except CouchbaseError as exc:
    for k, res in exc.all_results.items():
        if res.success:
            # Handle successful operation
            pass
        else:
            print("Key {0} failed with error code {1}".format(k, res.rc))
            print("Exception {0} would have been thrown".format(
                CouchbaseError.rc_to_exctype(res.rc)))
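The pattern of an exception that carries the complete result set can be reproduced with plain Python objects. The following is a minimal sketch: FakeResult, BatchError, and this add_multi function are hypothetical stand-ins that operate on a dictionary, not the SDK classes.

```python
class FakeResult(object):
    """Stand-in for a per-key result object."""

    def __init__(self, key, success):
        self.key = key
        self.success = success


class BatchError(Exception):
    """Stand-in for an SDK exception that carries all_results."""

    def __init__(self, message, all_results):
        Exception.__init__(self, message)
        self.all_results = all_results


def add_multi(store, items):
    """Add every key; raise once at the end if any key already existed."""
    results = {}
    for key, value in items.items():
        ok = key not in store
        if ok:
            store[key] = value
        results[key] = FakeResult(key, ok)
    if not all(r.success for r in results.values()):
        raise BatchError("one or more keys failed", results)
    return results


store = {"foo": "foo value"}
try:
    add_multi(store, {"foo": "x", "bar": "y", "baz": "z"})
    failed = []
except BatchError as exc:
    failed = sorted(k for k, r in exc.all_results.items() if not r.success)
```

Only "foo" ends up in failed; the other two keys were stored even though the call as a whole raised an exception.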
The Connection
object by default is thread safe. To do so, it uses internal
locks and explicitly locks and unlocks the Python GIL to ensure that a fatal
error is not thrown by the application.
The locking and unlocking has a slight performance impact, with the guarantee that things will not crash if an application is using threads.
If you are not using threads in your application (at all), you can pass the
unlock_gil=False
option to the connect
method like so:
c = Couchbase.connect(unlock_gil=False, bucket='default')
This disables all locking/unlocking (not to be confused with the lock
and
unlock
features that operate on keys in the server) functionality. If your
application does use threads, those threads will be blocked while the
Connection
object waits for the server to respond.
In addition to locking and unlocking the GIL, upon entry to each function the
Connection
object locks itself (using the equivalent of Lock.acquire
) and
unlocks itself after it leaves. This is to ensure that multiple threads are not
using the same Connection
object at once; and thus access is serialized.
You may disable this behavior with the following two options:
Don’t lock at all
The Connection
object is not locked at all. If your application tries to use
the Connection
object from multiple threads at once, strange errors might
happen and your program will eventually core dump.
If you’re sure you’re not going to use it from more than one thread, you can use
the lockmode = LOCKMODE_NONE
in the constructor:
from couchbase import Couchbase, LOCKMODE_NONE
c = Couchbase.connect(bucket='default', lockmode=LOCKMODE_NONE)
Throw an exception if concurrent access is detected
This is helpful for debugging an application where multiple threads should not
be accessing the Connection
object (but for some reason, they are). You can
use the lockmode = LOCKMODE_EXC
for this:
from couchbase import Couchbase, LOCKMODE_EXC
c = Couchbase.connect(bucket='default', lockmode=LOCKMODE_EXC)
The default lockmode is couchbase.LOCKMODE_WAIT
, which waits silently if
concurrent access is detected.
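The three lock modes can be pictured with a standard threading.Lock. The following is a minimal sketch: GuardedObject and the MODE_* constants are hypothetical names for this illustration, and the SDK implements its locking inside the C extension rather than like this.

```python
import threading

# Hypothetical constants for this sketch; the SDK exports its own
# LOCKMODE_NONE, LOCKMODE_EXC and LOCKMODE_WAIT values.
MODE_NONE, MODE_EXC, MODE_WAIT = range(3)


class GuardedObject(object):
    """Object whose methods would call enter() first and leave() last."""

    def __init__(self, lockmode=MODE_WAIT):
        self.lockmode = lockmode
        self._lock = threading.Lock()

    def enter(self):
        if self.lockmode == MODE_NONE:
            return                              # no protection at all
        if self.lockmode == MODE_EXC:
            if not self._lock.acquire(False):   # non-blocking attempt
                raise RuntimeError("concurrent access detected")
            return
        self._lock.acquire()                    # MODE_WAIT: block until free

    def leave(self):
        if self.lockmode != MODE_NONE:
            self._lock.release()
```

In the exception mode a second entry fails immediately instead of waiting, which makes unintended sharing between threads visible during debugging.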
While the Python SDK offers numerous options for converting your data to be
suitable for storing on the server, it might sometimes not be enough. For this,
the Transcoder
interface is used.
The Transcoder
interface allows you to define an object that is called with
each value together with the format
arguments passed to it.
The value passed for the format
parameter is actually a flag that is stored on
the server. Each key has a small amount of metadata which is stored along with
it on the server. The Python SDK stores the format
value to the metadata when
you store a value (using set
) and then reads it when retrieving the value
(using get
). If the flag is equal to FMT_JSON
, then it attempts to decode
it as JSON; if the flag is equal to FMT_PICKLE
, then it attempts to decode it
as Pickle; and so on.
You may pass the FMT_AUTO
flag to set
, in which case it will guess the
suitable format for your data based on the Python datatype being used.
Additionally, you can pass the no_format
option to get
to suppress automatic
conversion of the value to a native Python type (the bytes
type will be used
instead). This may be useful when conversion is not needed, or if other issues
arise when decoding a format (for example, a Pickle version mismatch).
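The flag-driven conversion described above can be sketched with the standard json and pickle modules. This is an illustration under stated assumptions: the F_* flag values are invented for the sketch, and guess_format, encode, and decode are hypothetical helpers rather than SDK functions. The code is written for Python 3.

```python
import json
import pickle

# Hypothetical flag values for this sketch; the SDK defines its own
# FMT_JSON, FMT_PICKLE, FMT_BYTES and FMT_UTF8 constants.
F_JSON, F_PICKLE, F_BYTES, F_UTF8 = 0, 1, 2, 4


def guess_format(value):
    """Pick a format from the Python type, in the spirit of FMT_AUTO."""
    if isinstance(value, bytes):
        return F_BYTES
    if isinstance(value, str):
        return F_UTF8
    if isinstance(value, (dict, list, tuple, int, float, bool)) or value is None:
        return F_JSON
    return F_PICKLE


def encode(value, flag):
    """Return (stored_bytes, flag), the pair a store operation would keep."""
    if flag == F_JSON:
        return json.dumps(value).encode("utf-8"), flag
    if flag == F_PICKLE:
        return pickle.dumps(value), flag
    if flag == F_UTF8:
        return value.encode("utf-8"), flag
    return bytes(value), flag


def decode(stored, flag, no_format=False):
    """Reverse encode(); with no_format the raw bytes are returned."""
    if no_format or flag == F_BYTES:
        return stored
    if flag == F_JSON:
        return json.loads(stored.decode("utf-8"))
    if flag == F_PICKLE:
        return pickle.loads(stored)
    return stored.decode("utf-8")
```

Because the flag travels with the value, the reader does not need to know in advance how the value was written; passing no_format=True skips the dispatch entirely.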
You can write a custom transcoder that allows Zlib compression. Here’s a snippet:
import zlib
from couchbase.transcoder import Transcoder
from couchbase import FMT_MASK

# We'll define our own flag.
FMT_ZLIB = (FMT_MASK << 1) & ~FMT_MASK

class ZlibTranscoder(Transcoder):
    def encode_value(self, value, format):
        converted, flags = super(ZlibTranscoder, self).encode_value(value, format & FMT_MASK)
        if (format & FMT_ZLIB):
            flags |= FMT_ZLIB
            converted = zlib.compress(converted)
        return (converted, flags)

    def decode_value(self, value, flags):
        if (flags & FMT_ZLIB):
            value = zlib.decompress(value)
            flags &= FMT_MASK
        return super(ZlibTranscoder, self).decode_value(value, flags)
In the example, the ZlibTranscoder
class is defined as a subclass of the
provided couchbase.transcoder.Transcoder
class. The latter is a wrapper class
that defaults to use the default conversion methods in the SDK (note that the
library does not use any Transcoder
object by default, but the provided one
wraps the built-in converters).
For encode_value
we are passed the user-specified value, which is any Python
object, and a format
value, which also can be any valid Python object (though
the default transcoder accepts only the FMT_JSON
, FMT_UTF8
, FMT_BYTES
and
FMT_PICKLE
values).
We define an additional format flag called FMT_ZLIB
. We make this one higher
than FMT_MASK
, which is the bitmask for the built-in formatting flags.
In encode_value
we first call our parent’s encode_value
(passing only the
relevant bits of the format
) and receive the converted value and output flags
back (in reality, output flags will typically be the same as the format flags).
Then we compress the already-converted value with zlib. We then
OR the flags with our FMT_ZLIB
bit, and return both. The value and flags
returned from the encode_value
method are stored as is on the server.
We do the converse when reading data back from the server in decode_value
. In
this method we are passed the value as it is stored on the server, along with
the numeric flags as they are stored in the key’s metadata. We check to see
first whether there is any special FMT_ZLIB
flag applied, and if so,
decompress the data and strip those bits from the flag. Then we dispatch it to
the default decode_value
to handle any further encapsulation formats.
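The bit arithmetic behind the custom flag can be checked in isolation. The following is a minimal sketch that assumes a 3-bit mask and a raw-bytes flag of 0x2 purely for illustration; MASK, F_BYTES, and F_ZLIB are local names, and the base transcoder is replaced by a pass-through so the example runs without the SDK.

```python
import zlib

# Assumed values for illustration only: a 3-bit mask for the built-in
# formats and a raw-bytes flag inside that mask.
MASK = 0x7
F_BYTES = 0x2
F_ZLIB = (MASK << 1) & ~MASK     # first bit above the mask


def encode(value, fmt):
    flags = fmt & MASK               # what the base encoder would see
    if fmt & F_ZLIB:
        value = zlib.compress(value)
        flags |= F_ZLIB              # OR the bit in; AND would clear it
    return value, flags


def decode(value, flags):
    if flags & F_ZLIB:
        value = zlib.decompress(value)
        flags &= MASK                # strip the custom bit
    return value, flags


payload = b"long value" * 1000
stored, stored_flags = encode(payload, F_BYTES | F_ZLIB)
restored, base_flags = decode(stored, stored_flags)
```

With these assumed values the custom bit lands just above the mask, so it never collides with a built-in format, and the round trip returns the original payload with only the built-in flag left.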
This can all be used like so from Python:
# assuming the ZlibTranscoder class is defined above
from couchbase import Couchbase, FMT_BYTES
c = Couchbase.connect(transcoder=ZlibTranscoder(), bucket='default')
c.set("foo", "long value" * 1000, format=FMT_BYTES|FMT_ZLIB)
c.get("foo")
If you are having difficulties with reading some value from the server (possibly
because it was stored using a different client with different flag semantics)
then you can disable conversion (when retrieving) entirely by using the
Connection
object’s data_passthrough
property. This is a Boolean, and when
enabled does not reconvert the value (that is, it does not call decode_value)
but simply interprets the value as a sequence of bytes and returns them as part
of the Result
object’s value
property.
c.set("a_dict", {"foo":"bar"})
c.data_passthrough = True
c.get("a_dict").value == b'{"foo":"bar"}'
The Item
API, introduced in version 1.1, provides a more streamlined way to
deal with Couchbase documents and key-value entries.
The Item
API represents a persistent object in your application which may be
passed by your application to any of the SDK _multi
APIs. When this object is
passed, the result of the operation is stored within the Item
object itself,
rather than creating a new result object.
This allows developers to avoid some common boilerplate methods such as
converting a Result
object to their internal application structure. In fact,
the Item
object itself is a subclass of ValueResult.
All methods accepting Items
should be passed a collection of items. These
collections must be the special classes of either an ItemOptionDict
or an
ItemSequence
. The former can be used to specify additional arguments for
each individual item (such as CAS
, format
, and the like), while the latter
may be used if no specific options are required.
#!/usr/bin/python
from pprint import pprint
from couchbase import Couchbase, FMT_UTF8
from couchbase.items import Item, ItemSequence, ItemOptionDict
# The Item API allows subclassing.
class MyItem(Item):
    def print_value(self):
        pprint(self.value, indent=4)
cb = Couchbase.connect(bucket='default')
# Create blank item
itm = MyItem()
# Must set a key before we do anything with it
itm.key = "simple_key"
# We'll first store it
itm.value = "Hello World"
# Now, we need a sequence for the item.
itmcoll = ItemSequence([itm])
cb.set_multi(itmcoll, format=FMT_UTF8)
itm.print_value()
# Let's append some data to it. We use the special 'append_items' method
# However, append requires a 'fragment' option for each item, so we need
# to make 'itmcoll' an ItemOptionDict
itmcoll = ItemOptionDict()
itmcoll.add(itm, fragment=" ..again")
cb.append_items(itmcoll, format=FMT_UTF8)
itm.print_value()
# Let's get the key using a normal 'get' this time
print(cb.get(itm.key).value)
Outputs
'Hello World'
'Hello World ..again'
Hello World ..again
Persistence requirements are provided to allow an application developer to
ensure that a given mutation operation has been performed and
propagated to more than one node in the cluster. By default, operations are
considered successful if they are stored in the RAM of a single node. Using
the persist_to
and replicate_to
arguments in the set
family of methods
you may specify stricter requirements, for example to replicate to at least
two nodes, and to have it persisted on at least one node’s disk.
Note that Couchbase will always eventually store each item on all of its associated replicas, in both disk and memory. The persistence requirements are there as a means for application developers to synchronize their code so that the application does not continue until those requirements are met.
See the API reference for more information on how to use these parameters.
Sometimes it may be useful to retrieve the value for an item, even if the value may not be the most recent version present on the server. Such a scenario may happen for example when trying to retrieve a “tag cloud” entry or similar, where the presence of any data is more important than no data at all.
Data outages can happen for brief periods of time between the physical disruption of a node (i.e. the node loses immediate network connectivity) and the end of the period the cluster waits before auto-failover (if it is enabled).
If your cluster is configured with support for one or more replicas, then any item stored will typically be available on a replica, even if the original vBucket master is unavailable. However, there is a slight chance that the master stored a newer version of the item immediately before it went down and before it had a chance to replicate the item to the replicas.
To retrieve an item from a replica, simply use the replica
boolean parameter
in get
or get_multi
. Typically this is done in a try
-except
block, where
the try
block attempts to fetch a key through normal means, and the except
block attempts to use the replica:
try:
    rv = cb.get(key)
except CouchbaseError:
    rv = cb.get(key, replica=True)
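Because a replica may hold an older version of the item, it can help to record where a value came from. The following is a minimal sketch of the same fallback that also reports its source; NodeDown and get_with_fallback are hypothetical names, and the two getter functions stand in for the normal and replica reads.

```python
class NodeDown(Exception):
    """Stand-in for the error raised when the master cannot be reached."""


def get_with_fallback(get_master, get_replica, key):
    """Return (value, from_replica); try the master before a replica."""
    try:
        return get_master(key), False
    except NodeDown:
        return get_replica(key), True


def broken_master(key):
    raise NodeDown(key)


def replica(key):
    return "possibly stale value for " + key


value, from_replica = get_with_fallback(broken_master, replica, "tags")
```

The application can then decide whether a value flagged as coming from a replica is acceptable, as in the "tag cloud" case mentioned above.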
There is also the rget
method which is able to get an item from a specific
replica. See the API documentation for more details.
This section explains how to uncover bugs in your application (or in the SDK itself).
To debug anything, you must be able to identify in which domain a problem is found. Specifically, the following components participate in typical Couchbase operations:
Couchbase Server
This is the server itself, which stores your data. Errors can happen here if your data does not exist, or if there are connectivity issues with one or more nodes in the server. Note that while Couchbase Server is scalable and fault tolerant, there are naturally some conditions that would cause failures (for example, if all nodes are unreachable).
libcouchbase
This is the underlying layer that handles network communication and protocol handling between a client and a Couchbase node. Network connectivity issues tend to happen here.
Python C Extension Layer
This is the C code that provides the bulk of the SDK. It interfaces with the
libcouchbase component, creates Result
objects, performs input validation, and
encoding and decoding of keys and values.
Python Layer
This is written in pure Python. For simple key-value operations, it normally just dispatches to the C layer. Most of the view option and row code is handled here as well, with the C layer just performing the lower-level network handling.
When something goes wrong, an exception of type CouchbaseError
is typically thrown.
The exception object contains a lot of information that can be used to find out
what went wrong.
from couchbase import Couchbase
from couchbase.exceptions import CouchbaseError
c = Couchbase.connect(bucket='default')
try:
    # Will fail because 'object' is not JSON-serializable
    c.set("key", object())
except CouchbaseError as e:
    print(e)
Printing the exception object typically produces something like this:
# line breaks inserted for clarity
<Couldn't encode value,
inner_cause=<object object at 0x7f873cf220d0> is not JSON serializable,
C Source=(src/convert.c,131),
OBJ=<object object at 0x7f873cf220d0>
>
The exception object consists of the following properties:
message
This is the message, if any, that indicates what went wrong. It is always a string.
>>> e.message
"Couldn't encode value"
inner_cause
If this exception is triggered by another exception, this field contains it. In the above example, the inner exception is the TypeError raised while encoding the value as JSON:
>>> e.inner_cause
TypeError('<object object at 0x7f873cf220d0> is not JSON serializable',)
csrc_info
If present, contains the source code information where the exception was raised. This is only present for exceptions raised from within the C extension.
>>> e.csrc_info
('src/convert.c', 131)
objextra
Contains the Python object that likely caused the exception. If present, it means the object was of an invalid type or format.
>>> e.objextra
<object object at 0x7f873cf220d0>
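The way an outer exception keeps its cause and the offending object can be reproduced with the standard json module. The following is a minimal sketch: EncodeError and encode_json are hypothetical stand-ins that mirror the message, inner_cause, and objextra fields described above, not the SDK's own exception classes.

```python
import json


class EncodeError(Exception):
    """Stand-in exception carrying the fields described above."""

    def __init__(self, message, inner_cause=None, objextra=None):
        Exception.__init__(self, message)
        self.message = message
        self.inner_cause = inner_cause
        self.objextra = objextra


def encode_json(value):
    try:
        return json.dumps(value)
    except TypeError as exc:
        # Keep the original exception and the offending object for debugging.
        raise EncodeError("Couldn't encode value",
                          inner_cause=exc, objextra=value)


bad = object()
try:
    encode_json(bad)
    caught = None
except EncodeError as exc:
    caught = exc
```

Inspecting caught.inner_cause and caught.objextra gives the same kind of information as the properties listed above: what failed underneath, and which object triggered it.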
Because this is a C extension, some fatal errors might result in an application crash. On Unix-based systems, these typically look like this:
python: src/callbacks.c:132: get_common_objects: Assertion `PyDict_Contains((PyObject*)*mres, hkey) == 0' failed.
Aborted
Or simply:
Segmentation Fault
While the actual cause might be in the application code or in the SDK itself, there is often less information available in debugging it.
The SDK should never crash under normal circumstances, and any application crash ultimately indicates a bug in the SDK itself (invalid user input should result in a Python exception being thrown).
To better help us fix the SDK, a C backtrace is needed. To generate a helpful
backtrace, Python must be available with debugging symbols (this can be done by
installing python-dbg
or python-debuginfo
from your distribution). Likewise,
libcouchbase
itself must also be installed with debugging symbols (this can be
done by installing libcouchbase2-dbg
or libcouchbase2-debuginfo
, depending
on your distribution).
You also need gdb
(this is also available on any distribution).
When you have the desired debugging symbols, invoke gdb
as follows. For this
example, we assume python
is a Python interpreter, and crash.py
is a script
that can trigger the crash.
shell> gdb --args python crash.py
GNU gdb (GDB) 7.4.1-debian
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/bin/python...Reading symbols from /usr/lib/debug/usr/bin/python2.7...done.
done.
This brings you to the gdb
prompt. Run the program by typing r
and then
pressing enter.
(gdb) r
Starting program: /usr/bin/python crash.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
python: src/callbacks.c:132: get_common_objects: Assertion `PyDict_Contains((PyObject*)*mres, hkey) == 0' failed.
Program received signal SIGABRT, Aborted.
0x00007ffff6fc9475 in *__GI_raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
Often in the case of web servers, it is difficult to invoke the script directly.
In this case, you need to debug an already-running application. This can be done
with gdb
by determining the process ID of the already-running process. In this
case, you can attach gdb
to the running process like so:
shell> gdb -p 29342
.....
(gdb) continue
Once gdb
is attached, you can type continue
(instead of r
) to continue
the application.
This shows us that an application crashed. When this happens, gdb
will print
the location of the crash. This is not enough, however, as we need the full trace
of the crash. To do this, type bt
and then press Enter to obtain the trace:
(gdb) bt
#0 0x00007ffff6fc9475 in *__GI_raise (sig=<optimized out>)
at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x00007ffff6fcc6f0 in *__GI_abort () at abort.c:92
#2 0x00007ffff6fc2621 in *__GI___assert_fail (assertion=assertion@entry=
0x7ffff67f6f68 "PyDict_Contains((PyObject*)*mres, hkey) == 0",
file=<optimized out>, file@entry=0x7ffff67f6e0d "src/callbacks.c",
line=line@entry=132, function=function@entry=
0x7ffff67f6fe0 "get_common_objects") at assert.c:81
#3 0x00007ffff67f000c in get_common_objects (cookie=<optimized out>,
key=<optimized out>, nkey=<optimized out>, err=err@entry=LCB_KEY_ENOENT,
conn=conn@entry=0x7fffffffd328, res=res@entry=0x7fffffffd330,
restype=restype@entry=2, mres=mres@entry=0x7fffffffd338)
at src/callbacks.c:132
#4 0x00007ffff67f0623 in get_callback (instance=<optimized out>,
cookie=<optimized out>, err=LCB_KEY_ENOENT, resp=0x7fffffffd3e0)
at src/callbacks.c:216
#5 0x00007ffff65cf861 in lcb_server_purge_implicit_responses ()
from /sources/libcouchbase/inst/lib/libcouchbase.so.2
#6 0x00007ffff65d0f1b in lcb_proto_parse_single ()
from /sources/libcouchbase/inst/lib/libcouchbase.so.2
#7 0x00007ffff65cfef5 in lcb_server_v0_event_handler ()
from /sources/libcouchbase/inst/lib/libcouchbase.so.2
#8 0x00007ffff58b9ccc in event_base_loop ()
from /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5
#9 0x00007ffff65d50f0 in lcb_wait ()
---Type <return> to continue, or q <return> to quit---
Python traces can be rather long; continue pressing enter until the last line
( ---Type <return>...
) is no longer present.
After you have a backtrace, send the information (along with the script to reproduce, if possible) to your desired support venue.
You can also debug a crash using Valgrind, but the process is significantly more involved and requires a slightly modified build of Python. See Contributing for more details.
This section describes how you can contribute to this SDK.
The latest source code for the Python SDK can be found on GitHub. It is located at https://github.com/couchbase/couchbase-python-client.
If you want to contribute to the C extension itself, it might be worthwhile to use a debug build of Python.
You can skip this section if you do not intend to contribute to the C part of the SDK.
The instructions here have been tested on Python 2.6.7 and Python 3.2.4. They will likely work for any version of Python.
To generate a debug build of Python, you need to compile it from source. To do
this, you need to modify some Python source files as instructed in the
Misc/README.valgrind
file within the Python source distribution.
Additionally, if you want to have your Python be useful for installing other
packages (for example, nose
), you need to have pip
and distribute
installed. These themselves depend on several core modules that might not be
built by default on some systems.
The Modules/Setup
file can be modified using the following diff as a guideline:
--- ../../tmp/Python-2.6.7/Modules/Setup.dist 2008-11-27 02:15:12.000000000 -0800
+++ Setup.dist 2013-05-15 15:58:30.559170619 -0700
@@ -162,7 +162,7 @@
# it, depending on your system -- see the GNU readline instructions.
# It's okay for this to be a shared library, too.
-#readline readline.c -lreadline -ltermcap
+readline readline.c -lreadline -ltermcap
# Modules that should always be present (non UNIX dependent):
@@ -215,6 +215,7 @@
#_ssl _ssl.c \
# -DUSE_SSL -I$(SSL)/include -I$(SSL)/include/openssl \
# -L$(SSL)/lib -lssl -lcrypto
+_ssl _ssl.c -DUSE_SSL -lssl -lcrypto
# The crypt module is now disabled by default because it breaks builds
# on many systems (where -lcrypt is needed), e.g. Linux (I believe).
@@ -248,14 +249,14 @@
# Message-Digest Algorithm, described in RFC 1321. The necessary files
# md5.c and md5.h are included here.
-#_md5 md5module.c md5.c
+_md5 md5module.c md5.c
# The _sha module implements the SHA checksum algorithms.
# (NIST's Secure Hash Algorithms.)
-#_sha shamodule.c
-#_sha256 sha256module.c
-#_sha512 sha512module.c
+_sha shamodule.c
+_sha256 sha256module.c
+_sha512 sha512module.c
# SGI IRIX specific modules -- off by default.
@@ -460,7 +461,7 @@
# Andrew Kuchling's zlib module.
# This require zlib 1.1.3 (or later).
# See http://www.gzip.org/zlib/
-#zlib zlibmodule.c -I$(prefix)/include -L$(exec_prefix)/lib -lz
+zlib zlibmodule.c -I$(prefix)/include -L$(exec_prefix)/lib -lz
# Interface to the Expat XML parser
#
Note that on some distributions (specifically Debian) you might get a build
failure when building the ssl
module. If so, you likely need to modify the
Modules/_ssl.c
file like so:
--- ../../tmp/Python-2.6.7/Modules/_ssl.c 2010-08-03 11:50:32.000000000 -0700
+++ _ssl.c 2013-05-15 15:58:03.471170217 -0700
@@ -302,8 +302,6 @@
self->ctx = SSL_CTX_new(TLSv1_method()); /* Set up context */
else if (proto_version == PY_SSL_VERSION_SSL3)
self->ctx = SSL_CTX_new(SSLv3_method()); /* Set up context */
- else if (proto_version == PY_SSL_VERSION_SSL2)
- self->ctx = SSL_CTX_new(SSLv2_method()); /* Set up context */
else if (proto_version == PY_SSL_VERSION_SSL23)
self->ctx = SSL_CTX_new(SSLv23_method()); /* Set up context */
PySSL_END_ALLOW_THREADS
After the source tree is prepared, you can do something like:
shell> ./configure --without-pymalloc --prefix=/source/pythons/py267
shell> make install
If you’ve made changes to the library, you need to run the test suite to ensure that nothing broke with your changes.
To run the tests, you need to have the nose
package installed (the tests may also
work with the unittest
module, but this is less tested).
Additionally, you need a real cluster to test against. The test might modify the buckets specified, so be sure not to point it to a production server!
Note that the views test might fail if you have made changes to the
beer-sample
bucket.
To tell the test about your cluster setup, copy the file tests/test.ini.sample
to tests/test.ini
and modify as needed.
To run the tests, from within the root of the SDK source simply do:
To build the documentation, you will need sphinx
and numpydoc
installed. Simply do:
shell> make -C docs html
After you do this, the built HTML should be in
docs/build/html
, and you can begin browsing by opening
docs/build/html/index.html
in your browser.
For the Python code, a loose adherence to PEP-8 should be used. For the C extension code, a somewhat stricter adherence to PEP-7 should be used.
These rules are meant to be broken; this just reflects some guidelines to use.
In general:
Use spaces, not tabs.
Lines should never be longer than 80 columns.
Code should be compatible with Python versions 2.6 up to the latest 3.x.
Python-Specific:
Doc strings should be readable by Sphinx.
Methods should not have more than three positional arguments.
Avoid using string literals in code.
If a new object makes use of a dictionary, consider converting this dictionary
to a proper Python object by using a namedtuple
, and so on.
Avoid dependencies not in Python’s standard library.
You can add conditional functionality depending on whether a specific library is installed.
Don’t use threads.
While threads are a useful construct in application code, they do not belong in library code without good reason.
C-Specific:
Use of goto
is better than deeply nested blocks.
Return type and storage specifiers should be on their own line.
For example, use the following format
static PyObject*
do_something(PyObject *self, PyObject *args, …)
{
/** … **/
}
rather than this format
static PyObject *do_something(PyObject *self, PyObject *args)
{
/** … **/
}
Code must compile with the following flags (for GCC or clang):
-std=c89 -pedantic -Wall -Wextra -Werror -Wno-long-long -Wno-missing-field-initializers
Non-static functions must have a pycbc_
prefix.
Functions exposed as Python methods must be named using the following pattern:
pycbc_<Object>_<Method>
.
In this pattern, <Object>
is the name of the class in the SDK and <Method>
is the name of the method. For example, if you add a get
method to the
Connection
class, name the new method pycbc_Connection_get
.
Code should be portable to Win32.
Therefore, include only standard library headers and use PyOS_*
functions when
needed.
The following sections provide release notes for individual release versions of Couchbase Client Library Python. To browse or submit new issues, see Couchbase Client Library Python Issues Tracker.
This is the second release. It adds new APIs and improvements on top of the 1.0.0 release.
New Features and Behavior Changes in 1.1.0
Durability/Persistence requirements
This feature allows the application to wait until keys have been
stored to more than one node. This is done through the additional
persist_to
and replicate_to
parameters in the set()
family of methods.
A standalone method, endure()
is provided as well.
These features use the new Durability API implemented in libcouchbase.
Issues: PYCBC-37
Experimental gevent support
gevent is a cooperative multitasking framework that allows the creation of greenlets, which are cooperatively scheduled entities. This feature allows the couchbase.Connection object to optionally use a different pure-python non-blocking I/O implementation that can cooperatively yield to other greenlets when needed.
Item API
Item API allows an application to use its own objects and pass them into the various methods of couchbase.Connection where they will be populated with the relevant key status/value information. This allows an application developer to create custom subclasses of the Item object which can be seamlessly used across both the SDK and the application code.
Issues: PYCBC-156
Automatic storage formatting via FMT_AUTO
New format specifier that allows the SDK to guess the appropriate value storage format depending on the value type.
Issues: PYCBC-157
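To make the idea of type-based format guessing concrete, here is a small sketch. The mapping rules and the string labels below are assumptions chosen for illustration; the SDK uses its own format constants and its exact rules are not stated in this note.

```python
def guess_format(value):
    """Pick a storage format label from the value's type.

    Illustrative rules only; not the SDK's actual implementation.
    """
    if isinstance(value, str):
        return "utf8"
    if isinstance(value, (bytes, bytearray)):
        return "bytes"
    if value is None or isinstance(value, (bool, int, float, list, tuple, dict)):
        return "json"
    # Anything else falls back to a general-purpose serializer.
    return "pickle"


for sample in ("text", b"raw", {"a": 1}, set([1, 2])):
    print(type(sample).__name__, "->", guess_format(sample))
```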
no_format argument for get()
Allows a value to be retrieved in its raw representation (that is, as it is stored on the server without any conversion applied). This feature is provided as an additional keyword argument, no_format, to the get() family of methods.
Replica Read
Allow reading from a replica server. This can be used if a normal get() fails. This functionality is exposed via the replica parameter to the get() method and the rget() method.
Issues: PYCBC-38
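The fallback pattern described for replica reads could look roughly like the sketch below. It assumes that cb is a connected couchbase.Connection object; the function name is hypothetical, and catching Exception is a simplification that real code should narrow to the specific SDK exception it expects.

```python
def get_with_replica_fallback(cb, key):
    """Read a key normally and fall back to a replica if that fails.

    `cb` is assumed to be a connected couchbase.Connection object.
    """
    try:
        return cb.get(key)
    except Exception:
        # Simplification for this sketch: any failure triggers the
        # replica read via the replica parameter.
        return cb.get(key, replica=True)
```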
Fixes In 1.1.0
Minor documentation errata fixes
Add OBS_LOGICALLY_DELETED, which was referenced but not defined. Used by observe.
Fix potential crash if HTTP view request cannot be scheduled
Fix view queries with many keys. Previously this would return a server error indicating the URI was too long.
Issues: PYCBC-193
lockmode parameter not propagated from Couchbase.connect constructor. This would result in the lockmode always being set to LOCKMODE_EXC despite it being overridden in the arguments.
Issues: PYCBC-192
Known Issues in 1.1.0
syncwait parameter for view creation will time out if new design document only contains spatial views.
Issues: PYCBC-173
This is the first general availability (GA) release.
New Features and Behavior Changes in 1.0.0
Enable Transcoder to be a class as well as an instance. Fixes some common misuses. If a class is passed, a new instance is created.
Issues : PYCBC-135
Fixes in 1.0.0
Creating and destroying many Couchbase instances crashes the program. This happened because of an extra refcount decrement.
Issues : PYCBC-140
Observe crashes when used against clusters with replicas.
Issues : PYCBC-146
This version builds upon previous APIs. It provides new APIs for querying views and managing design documents.
New Features and Behavior Changes in 1.0.0
New query method and paginated view iterator. View querying is now more efficient and streamlined with the new query method. The older _view method has been removed.
Low-level observe command wraps the libcouchbase implementation.
Issues : PYCBC-36
The Connection object can now be used safely across multiple threads by default. The behavior among multiple threads can be tuned with the lockmode constructor option.
Issues : PYCBC-125
An API to change default JSON and Pickle converters is provided. This allows the user to select a more performant implementation. Note that this affects only the JSON and Pickle conversion performed with key-value operations, not with views.
Issues : PYCBC-124
A new set of design document methods provides simple methods for creating, fetching, and removing design documents. These replace the older _design method.
Fixes in 1.0.0
The quiet parameter was not being used if passed in the constructor.
Issues : PYCBC-136
Multi-arithmetic (incr_multi, decr_multi) does not change the value. These functions appeared to work and did not return an error, but did not actually modify the value.
Issues : PYCBC-138
unlock_multi does not raise an exception on missing CAS. This has been fixed because unlock must have a CAS.
Issues : PYCBC-
Crash when specifying host:port in a single string. This is still illegal, but it now raises an exception and does not crash.
Issues : PYCBC-130
A more informative message is returned if an empty string is passed to get. Previously, it returned InvalidArgument without more information. The client now checks for this and displays more helpful information.
Issues : PYCBC-131
Client crashes if an invalid hostname is passed. Certain variants of bad
hostnames (such as those with illegal characters) will have libcouchbase
return an error code the client cannot handle and will thus crash the program.
Issues : PYCBC-128
Lock without TTL fails with an erroneous ‘Encoding Error’. The client now throws a more informative exception.
Issues : PYCBC-132
Client crashes if duplicate keys are passed in *_multi methods.
Issues : PYCBC-134
This version maintains the 0.10.0 API, while adding some new features and fixing some additional bugs.
New Features and Behavior Changes in 0.11.1
Basic HTTP/Views support (experimental): view and design methods are now available.
Issues : PYCBC-105
Add bucket attribute to show bucket name.
Issues : PYCBC-119
Allow Python 3 dict_keys objects to be passed into the *_multi methods.
Issues : PYCBC-107
Add touch command.
Issues : PYCBC-115
Add unlock and lock commands.
User-defined transcoder classes can be used to override or supplement the default serialization of keys and values.
Issues : PYCBC-92
Expose list of nodes
Issues : PYCBC-106
Pretty print repr and str for common objects.
Fixes in 0.11.1
Passing a negative time to live (TTL) now throws an exception. Previously, this resulted in an integer underflow.
Issues : PYCBC-109
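The underflow mentioned above can be illustrated with a short sketch. It assumes, for illustration only, that the TTL ends up in an unsigned 32-bit integer on the C side; the helper names are hypothetical and not part of the SDK.

```python
def as_uint32(n):
    """Reinterpret an integer as an unsigned 32-bit value, as a C cast would."""
    return n & 0xFFFFFFFF


def check_ttl(ttl):
    """Reject negative TTLs before they reach C code (hypothetical helper)."""
    if ttl < 0:
        raise ValueError("ttl must not be negative: {0}".format(ttl))
    return ttl


# Without validation, a TTL of -1 would wrap around to a very large value.
print(as_uint32(-1))  # 4294967295
```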
Memory leak on each storage operation.
Issues : PYCBC-123
JSON format is now more efficient for Unicode values. Previously, JSON would be encoded in ASCII-safe mode. 0.11 allows encoding in UTF-8 mode.
Issues : PYCBC-108
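The difference between the two encoding modes can be seen with the standard json module alone. This sketch does not use the SDK; it only shows why UTF-8 output is more compact than ASCII-safe output for non-ASCII text.

```python
import json

value = {"name": "Zo\u00eb"}

# ASCII-safe mode escapes every non-ASCII character as \uXXXX.
ascii_safe = json.dumps(value).encode("ascii")

# UTF-8 mode keeps the character and encodes it directly.
utf8 = json.dumps(value, ensure_ascii=False).encode("utf-8")

print(len(ascii_safe), len(utf8))  # 20 16
```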
Memory leak on some exceptions. If bad arguments were received, the exception handling code would leak memory.
Issues : PYCBC-111
This version rewrites the client again, still using libcouchbase, but this time using the native Python C API. Specifically this means:
Cython is no longer required for the build.
Code works on Visual Studio compilers (and can be used on Microsoft Windows).
Additionally, the API was changed with respect to return values. Currently most API functions return a Result (or a subclass thereof).
The rewrite is filed as PYCBC-103.
Known Issues in 0.10.0
Support for byte values.
Issues : PYCBC-103
Formal specification for key types: keys should currently be Python Unicode objects.
Client-side timeout can be set via the timeout property.
Issues : PYCBC-58
This version is the initial rewrite of the Python library using libcouchbase. The installation now requires a C compiler and libcouchbase.
New Features and Behavior Changes in 0.9.0
Basic key-value functionality, including:
get
set
add
replace
delete
append
prepend
incr
decr
Fixes in 0.9.0
Most of the issues fixed by this release are older bugs closed out from the 0.8.x client and were implicitly fixed by the rewrite.
Issues : PYCBC-6, PYCBC-11, PYCBC-24, PYCBC-26, PYCBC-28, PYCBC-29, PYCBC-56, PYCBC-62, PYCBC-73, PYCBC-75
Known Issues in 0.9.0
Known Issues in 0.8.0
View queries on authenticated buckets are not currently supported.
“id” values from view rows must be converted to strings to be used with Memcached API.
view = bucket.view("design/beer/view/by_name")
for row in view:
    # Row IDs must be converted to strings before use with the Memcached API.
    doc_id = str(row["id"])
    beer = bucket.get(doc_id)
    # do something with beer
Exception is thrown on key not found errors with unified client.
from couchbase.exception import MemcachedError

try:
    bucket.get("key_that_does_not_exist")
except MemcachedError:
    # Raised when the key is not found.
    pass
Fixes in 0.7.2
Installation from PyPI failed because of a missing version number. This is now fixed.
Issues : PYCBC-51
Install dependencies via setup.py rather than including them in the source distribution
Issues : PYCBC-52
Improve bucket creation defaults:
ramQuotaMB is now set to 100 by default
authType is set to sasl by default
Several assertions are included for better error catching and reporting on invalid combinations or values.
Server-side validation is now being done to check for port duplicates, name taken, and similar checks.
Issues : PYCBC-53
This is the latest release of the Couchbase Python SDK. It is written from the ground up based on the Couchbase C library, libcouchbase.
This release is considered beta software, use it at your own risk; let us know if you run into any problems, so we can fix them.
New Features and Behavior Changes in 0.7.1
Fixes in 0.7.1
Temporarily removed the unimplemented multi-get until a full implementation is available. This will be re-addressed in PYCBC-49 in a future release.
This is the latest release of the Couchbase Python SDK. It is written from the ground up based on the Couchbase C library, libcouchbase.
This release is considered beta software, use it at your own risk; let us know if you run into any problems, so we can fix them.
New Features and Behavior Changes in 0.7.0
SDK now requires Python 2.6.
SDK can now handle server restarts and warmups. It can work with a functioning Couchbase Server that is still loading data from disk after a restart.
Introduced VBucketAwareClient which extends MemcachedClient with Membase/Couchbase specific features.
Fixes in 0.7.0
Deprecated Server() in favor of Couchbase() for the unified client name
Added Apache License headers to all files
Globally, logging is no longer disabled; fixes PYCBC-31.
Issues : PYCBC-31
Renamed VBucketAwareCouchbaseClient to CouchbaseClient.
Set() now works with integer values; fixes PYCBC-15.
Issues : PYCBC-15
Added memcached level flush() command to unify client with other SDKs. Please note this only works with 1.8.0 without changing settings. See the release notes for Couchbase 1.8.1 and 2.0.0 for how to enable memcached flush().
This operation is deprecated as of the 1.8.1 Couchbase Server, to prevent accidental, detrimental data loss. Use of this operation should be done only with extreme caution, and most likely only for test databases as it will delete, item by item, every persisted record as well as destroy all cached data.
Third-party client testing tools may perform a flush_all() operation as part of their test scripts. Be aware of the scripts run by your testing tools and avoid triggering these test cases/operations unless you are certain they are being performed on your sample/test database.
Inadvertent use of flush_all() on production databases, or other data stores you intend to use, will result in permanent loss of data. Moreover, the operation as applied to a large data store will take many hours to remove persisted records.
Deprecating getMulti for pep8-compliant multi-get.
Fixed .save() method; fixes MB-5609.
Deprecated get_view as it was a duplicate of view_results.
SDK now working with mixed clusters, including clusters with memcached type buckets.
Can now create memcached buckets
Set() now returns a proper status in the unified Couchbase() client 0.7.0.
Better handling of topology changes; fixes PYCBC-4.
init_cluster function has been removed.
SDK can now create memcached buckets.
Greater than 50% of SDK covered by unit tests; fixes PYCBC-46.
Issues : PYCBC-46