One of CouchDB’s strengths is the ability to synchronize two copies of the same database. This enables users to distribute data across several nodes or datacenters, and also to move data closer to clients.
Replication involves a source and a destination database, which can be on the same or on different CouchDB instances. The aim of replication is that, at the end of the process, all active documents in the source database are also in the destination database, and all documents that were deleted in the source database are also deleted in the destination database (if they ever existed there).
A replication is triggered by storing a replication document in the replicator database. Its status can be inspected through the active tasks API (see GET /_active_tasks and Replication Status). A replication can be stopped by deleting the document, or by updating it with its cancel property set to true.
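As a minimal sketch, a replication document could be built and prepared for storage as shown here. The `source` and `target` field names and the `_replicator` database name follow CouchDB's replicator conventions; the host, the database names, and the document ID `a_to_b` are hypothetical placeholders. The request is only constructed, never sent.

```python
import json
import urllib.request


def build_replication_doc(source, target):
    """Return the body of a replication document as a plain dict."""
    return {"source": source, "target": target}


def build_store_request(base_url, doc_id, doc):
    """Build (but do not send) the PUT request that would store the
    document in the replicator database."""
    return urllib.request.Request(
        url=f"{base_url}/_replicator/{doc_id}",
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical host and database names, for illustration only.
doc = build_replication_doc(
    "http://localhost:5984/db_a",
    "http://localhost:5984/db_b",
)
req = build_store_request("http://localhost:5984", "a_to_b", doc)
```

Passing `req` to `urllib.request.urlopen` would submit it to a running instance; authentication is omitted from this sketch.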
During replication, CouchDB compares the source and the destination database to determine which documents differ between them. It does so by following the Changes Feed on the source and comparing the documents to the destination. Changes are submitted to the destination in batches, where they can introduce conflicts. Documents that already exist on the destination in the same revision are not transferred. As the deletion of a document is represented by a new revision, a document deleted on the source will also be deleted on the destination.
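The comparison step can be illustrated with a simplified in-memory model. This is a toy sketch, not CouchDB's actual implementation: the function names, the `(doc_id, rev)` tuples, and the revision strings are all assumptions made for illustration.

```python
def missing_revisions(changes, destination_revs):
    """Return the (doc_id, rev) pairs from the source changes feed that
    the destination does not already hold.

    changes: list of (doc_id, rev) tuples in changes-feed order.
    destination_revs: dict mapping doc_id to the set of revisions
    already present on the destination.
    """
    return [
        (doc_id, rev)
        for doc_id, rev in changes
        if rev not in destination_revs.get(doc_id, set())
    ]


def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Hypothetical data: "a" is already up to date, "b" has a newer
# revision on the source, and "c" is unknown to the destination.
changes = [("a", "1-x"), ("b", "2-y"), ("c", "2-z")]
destination_revs = {"a": {"1-x"}, "b": {"1-q"}}

to_send = missing_revisions(changes, destination_revs)
for batch in batches(to_send, 1):
    print(batch)
```

A deletion fits the same model: it appears in the feed as one more revision of the document, so it is transferred like any other missing revision.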
A replication task will finish once it reaches the end of the changes feed. If its continuous property is set to true, it will wait for new changes to appear until the task is cancelled. Replication tasks also create checkpoint documents on the destination to ensure that a restarted task can continue from where it stopped, for example after it has crashed.
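The role of checkpoints can be sketched with a toy model in which a task records the last processed sequence after each batch, so that a restart skips what was already transferred. The integer sequence numbers, the `last_seq` key, and the `fail_after` parameter are simplifying assumptions for illustration and do not reflect CouchDB's internal checkpoint format.

```python
def replicate_once(changes, checkpoint, batch_size=2, fail_after=None):
    """Process changes newer than the checkpoint, in batches.

    changes: list of (seq, doc_id) tuples ordered by seq.
    checkpoint: dict with a 'last_seq' key, updated in place after
    every completed batch.
    fail_after: if set, simulate a crash after that many batches.
    Returns the doc ids transferred during this run.
    """
    pending = [c for c in changes if c[0] > checkpoint["last_seq"]]
    transferred = []
    for n, start in enumerate(range(0, len(pending), batch_size), start=1):
        batch = pending[start:start + batch_size]
        transferred.extend(doc_id for _, doc_id in batch)
        checkpoint["last_seq"] = batch[-1][0]  # record progress
        if fail_after is not None and n >= fail_after:
            break  # simulated crash
    return transferred


changes = [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e")]
checkpoint = {"last_seq": 0}

first_run = replicate_once(changes, checkpoint, fail_after=1)
second_run = replicate_once(changes, checkpoint)  # resumes after seq 2
```

In this model the second run transfers only the changes after the recorded checkpoint, rather than starting again from the beginning of the feed.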
When a replication task is initiated on the sending node, it is called push replication; when it is initiated by the receiving node, it is called pull replication.
A single replication task transfers changes in one direction only. To achieve master-master replication, it is possible to set up two replication tasks in opposite directions. When a change is replicated from database A to B by the first task, the second task will discover that the new change on B already exists in A and will wait for further changes.
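A two-way setup could be sketched as a pair of replication documents, one per direction. The `source`, `target`, and `continuous` field names follow CouchDB's replication document conventions; the helper name, the document IDs, and the database URLs are hypothetical.

```python
def bidirectional_docs(db_a, db_b, continuous=True):
    """Return two replication documents, keyed by a hypothetical
    document ID, that together replicate in both directions."""
    return {
        "a_to_b": {"source": db_a, "target": db_b, "continuous": continuous},
        "b_to_a": {"source": db_b, "target": db_a, "continuous": continuous},
    }


# Hypothetical database URLs, for illustration only.
docs = bidirectional_docs(
    "http://node1.example.com:5984/db",
    "http://node2.example.com:5984/db",
)
```

Each of the two documents would be stored in the replicator database separately, and each resulting task runs independently of the other.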
There are two ways to control which documents are replicated and which are skipped. Local documents are never replicated (see Local (non-replicating) Document Methods).
Additionally, filter functions can be used in a replication document (see Replication Settings). The replication task will then evaluate the filter function for each document in the changes feed. A document will only be replicated if the filter returns true.
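The effect of a filter can be modeled with a small sketch. In CouchDB itself, filter functions are stored in design documents and are typically written in JavaScript; the Python predicate below only imitates that behavior. The `filter` field in the replication document uses the `designdoc/filtername` form, while the names `app/only_orders`, the `type` field, and the database URLs are hypothetical.

```python
def apply_filter(changed_docs, filter_fn):
    """Keep only the documents for which the filter returns a true value."""
    return [doc for doc in changed_docs if filter_fn(doc)]


def only_orders(doc):
    """Hypothetical filter: replicate documents whose type is 'order'."""
    return doc.get("type") == "order"


changed_docs = [
    {"_id": "1", "type": "order"},
    {"_id": "2", "type": "log"},
    {"_id": "3", "type": "order"},
]
replicated_ids = [doc["_id"] for doc in apply_filter(changed_docs, only_orders)]

# A replication document that refers to a filter by name.
filtered_replication = {
    "source": "http://localhost:5984/db_a",
    "target": "http://localhost:5984/db_b",
    "filter": "app/only_orders",
}
```

Because the filter is evaluated for every document in the changes feed, a filtered replication does more work per change than an unfiltered one.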