Database Creation

Two configuration parameters control the sharding topology of a BigCouch database. The defaults are specified in the [cluster] block of the server configuration file and may be overridden at database creation time. N specifies the number of replicas of each document that are stored, while Q fixes the number of partitions of the database. A command to create a database with 32 partitions, where each document is stored 3 times, would be:

curl -X PUT 'http://loadbalancer:5984/test_db?n=3&q=32'
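The same request can be parameterized with a small shell helper. This is only a minimal sketch: the function name create_db_url is a hypothetical helper introduced for illustration, while the host, port, and the n and q parameters are taken from the command above.

```shell
#!/bin/sh
# Hypothetical helper: build the database-creation URL from a database
# name, a replica count (n), and a partition count (q).
create_db_url() {
    db="$1"; n="$2"; q="$3"
    printf 'http://loadbalancer:5984/%s?n=%s&q=%s\n' "$db" "$n" "$q"
}

# Usage (needs a reachable cluster, so it is left commented out):
# curl -X PUT "$(create_db_url test_db 3 32)"
```

Keeping the URL construction separate from the curl call makes it easy to print and inspect the request before sending it.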

Document Updates

curl -X PUT http://loadbalancer:5984/test_db/doc_1 -H content-type:application/json -d '{"a":1,"b":2}'

BigCouch accepts a w query-string parameter on updates, which overrides the default write quorum for the database. When BigCouch writes the N copies of each document, it responds to the client after W of them have been committed successfully (the operations to commit the remaining copies continue in the background). If W copies cannot be committed successfully, BigCouch responds with either a 202 Accepted if a copy is saved, or a 409 Conflict if all hosts conclude that the update is based on an outdated revision. W defaults to the simple majority of N, and this default is the recommended choice for most applications.
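To make the default concrete, the simple majority of N can be computed as floor(N/2) + 1. The sketch below assumes that reading of "simple majority"; the helper names majority_quorum and update_url are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Simple majority of N copies, computed as floor(N/2) + 1 (assumed
# interpretation of "simple majority"; helper name is hypothetical).
majority_quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Hypothetical helper: build an update URL that sets an explicit write
# quorum through the w query-string parameter.
update_url() {
    printf 'http://loadbalancer:5984/%s/%s?w=%s\n' "$1" "$2" "$3"
}

# Usage (needs a reachable cluster, so it is left commented out):
# curl -X PUT "$(update_url test_db doc_1 3)" \
#      -H content-type:application/json -d '{"a":1,"b":2}'
```

With N=3, majority_quorum 3 prints 2, so by default the client receives a response once two of the three copies are committed. Passing w=3 instead waits for every copy.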

Document Reads

curl http://loadbalancer:5984/test_db/doc_1

As with updates, there is an r query-string parameter that sets the quorum for reads. When BigCouch reads a document, it issues requests to all N copies of the partition hosting the document and responds to the client when R matching success responses are received. The default quorum is the simple majority of N, and this default is the recommended choice for most applications.
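A read with an explicit quorum might look like the sketch below. The helper name read_url is a hypothetical addition; only the r parameter, host, and port come from the text above.

```shell
#!/bin/sh
# Hypothetical helper: build a read URL that sets an explicit read
# quorum through the r query-string parameter.
read_url() {
    printf 'http://loadbalancer:5984/%s/%s?r=%s\n' "$1" "$2" "$3"
}

# Usage (needs a reachable cluster, so it is left commented out):
# curl "$(read_url test_db doc_1 1)"   # respond after one success response
# curl "$(read_url test_db doc_1 3)"   # wait for three matching responses
```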

The differences between BigCouch and CouchDB

When CouchDB runs standalone it listens on some port, typically 5984. BigCouch can run this way, but usually several instances are run as part of a cluster of nodes. Each node can be viewed more or less as a single CouchDB instance, but it actually listens on two ports. Here's what a typical config file on a programmer's machine might look like:

[chttpd]
port = 5984
docroot = /Users/bitdiddle/emacs/bigcouch/rel/dev1/share/www

[httpd]
port = 5986

The chttpd stanza specifies the front-end port, which supports the user API, and the httpd stanza specifies the back-door, or admin, port. So, e.g., when adding nodes to a cluster or querying membership, one might make calls like:

curl -X PUT -d {}

against the admin port. Usually a load balancer would be used in front of BigCouch, and it would pass calls to the 5984 ports. Each node in the cluster therefore has a config file that names these ports. Since BigCouch makes use of distributed Erlang, there are also some key parameters in the vm.args file, in particular -name and -setcookie, that are important to set correctly. The -name must be the name of a node as specified above, e.g. dev2@, and -setcookie sets the cookie shared by all Erlang nodes in a cluster. More details are documented in the comments in the vm.args file.
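Because a missing -name or -setcookie entry is easy to overlook, a quick sanity check on vm.args can help. The sketch below assumes each flag sits at the start of its own line followed by a value; the function name check_vm_args is hypothetical, and the check does not verify that the values match across nodes.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given vm.args file contains
# both a -name line and a -setcookie line, each followed by a value.
# Assumes one flag per line, starting in the first column.
check_vm_args() {
    file="$1"
    grep -Eq '^-name[[:space:]]+[^[:space:]]+' "$file" &&
    grep -Eq '^-setcookie[[:space:]]+[^[:space:]]+' "$file"
}

# Usage (the path is an assumption about the local layout):
# check_vm_args rel/dev1/etc/vm.args || echo "vm.args is missing -name or -setcookie"
```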


BigCouch also supports zones. Enabling them is as easy as adding the number of zones to the cluster config:

z = 3

and then editing the node entries for each node in a zone to add a zone field, e.g.:

{"_id": "dev1@", "zone": "parts_unknown"}

API differences

BigCouch embeds CouchDB and strives to maintain API compatibility, but differences do arise, often due to fundamental constraints of programming distributed systems. In this section we outline those differences.