Masoud Kalali's Blog

My thoughts on software engineering and beyond…

By

Getting started with CouchDB and MongoDB, Part I: CRUD operations

In a series of blog I am going to write to cover CouchDB and MongoDB as two popular document databases. In each one of the entries I will show how can you do similar tasks with each one of these two so that you get some understanding of the differences and similarities between them. I am not going to do any comparison here but reading the series you can draw conclusion on which one seems more suitable for your case based judging by how different tasks can be accomplished using these two.

Some technicalities:

  • All of the source codes for these series is available at https://github.com/kalali/nosql which you can get and run the samples.
  • All sample codes assume default port numbers unless I mention otherwise in the blog entry itself.
  • All the blog entries use CouchDB 1.3.1 and MongoDB 2.4.5 and thus I expect them to work fine with this version and any newer ones.

A little about these two databases:

  • MongoDB and CouchDB: Both of them are document databases which lets you store freeform data into the storage in form of a document with key/value fields of data. Just look at JSON to see how a document would look like.
  • Apache CouchDB: Provides built-in support for reading/writing JSON documents over a REST interface. Result of a query is JSON and the data you store is in form of JSON all through the HTTP interface. There are tens of access API implemented, each with its own unique characteristics, around the REST interface in different languages which one can use to avoid using the REST interface directly. List of these implementations are available at: Related Projects. Apache CouchDB has more emphasis on availability rather than consistency if you look at it from CAP theorem perspective. CouchDB is cross platform and developed in ErLang.
  • MongoDB: MongoDB does not provide a built-in REST interface to interact with the document store and it does not provide direct support for storing and retrieving JSON documents however it provides a fast native protocol which can be access using the MongoDB Drivers and Client Libraries which are available for almost any language you can think of, and also there are some bridges which one can use to create a REST interface on top of the native interface. MongoDB has more emphasis on consistency rather than availability if you look at it from CAP theorem perspective. MongoDB is cross platform and developed in C++.

Downloading and starting:

  • MongoDB: You can download MongoDB from  the download page and after extracting it you can start it by navigating to bin directory and running ./mongo to start the document store. The default administration port is 28017 and the access port number is 27017 by default.
  • Apache CouchDB: You can download Apache CouchDB from the download page or if you are running some Linux distribution you can get it from the package repositories or you can just build it from the code. After installation run sudo couchdb to start the server.

The basic scenario for this blog entry is to create a database, add a document, find that document back, update the document and read it back from the store, delete the document and finally to delete the database to make the sample code’s execution repeatable.

The CouchDB sample code for the sample scenario:

I used JAX-RS and Some JSON-P in the code to add some salt to it rather than using plan http connection and plain-text content, but that aside the most important thing that you may want to remember is that we are using HTTP verbs for different actions, for example, PUT to create, DELETE to delete, POST to update and GET to query and read. The other important factor that you may have noticed is the _rev or the revision attribute of the document. You see that in order to update the document I specified the revision string, _rev in the document, also to delete the document I use the _rev id. This is required because at each point in time there might be multiple revisions of the document living in the database and you may read an older revision and that is the revision you can update (you can update a document revision because someone else might have updated the document while you were doing the update and thus your revision is old and you are updating your own revision).

The POM file for the couchDB sample is as follow:

Now, doing a clean install for the CouchDB project sample we get:

If you pay attention you see that the _rev attribute has changed after updating the document, that is because by default read operation reads the latest revision of the document and in the second read after we updated the document the revision is changed and document that is read is the latest revision.

The MongoDB code for the sample scenario:

As you can see in the code we are using the Mongo API rather than using direct HTTP communication and that is because MongoDB does not provide a built-in REST interface but rather it has its own native protocol as explained above. The other important part which will be in comparison with CouchDB is absence of revision or _rev or an attribute with similar semantic role. That is because MongoDB does not keep multiple version of the document Look at CAP theorem for more information on consistency and availability attributes of these two databases.

Now, the POM file for the MongoDB sample is as follow:

The output for doing a clean install on the MongoDB sample will be as shown below:

The sample code for this blog entry is available at GutHub:Kalali

This blog entry is wrote based on CouchDB 1.3.1 and MongoDB 2.4.5 and thus the descriptions are valid for these versions and they may differ in other versions. The sample codes should work with the mentioned versions and any newer release.


5 Responses to Getting started with CouchDB and MongoDB, Part I: CRUD operations

  1. Arash says:

    Thank you very much, Masoud. It was indeed really helpful. I look forward to see the next parts.

  2. Arash says:

    Forgot to ask…

    Assume a mission-critical Java EE application in its initial design phase. Would you recommend utilizing these databases (or other NoSQL data stores) this way? I mean, would you consent to calling MongoClient constructor which open a network socket directly in JEE application code, or would you implement a custom JCA adapter (or something else) first?

  3. Masoud Kalali says:

    The Java EE specification discourage having a server socket in an EJB but an EJB can act as a network socket client to interact with, for example, MongoDB running somewhere else.

    If you create a singleton service with one Mongo object (provided that you know about security, etc) then that single Mongo object will use connection pooling underneath, which is one of the features that JCA can bring in, and that gives you a performance edge on establishing connections.

    Now, having a JCA gives you some advantages, like connection management, tx support, lifecycle management, etc but it is not a requirement in this case.

  4. Pingback: CouchDB and MongoDB, Part II: Beginning with Security | Masoud Kalali's Blog

  5. Pingback: Securing MongoDB and CouchDB, From NoSQL series of blogs | Masoud Kalali's Blog

Leave a Reply

Your email address will not be published. Required fields are marked *


− 4 = one

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code class="" title="" data-url=""> <del datetime=""> <em> <i> <q cite=""> <strike> <strong> <pre class="" title="" data-url=""> <span class="" title="" data-url="">