Friday, July 12, 2013

Getting Started with MongoDB – Part 2

In the previous post we explored the basics of MongoDB. In this post we are going to dig deeper into MongoDB.

Indexing

Whenever a new collection is created, MongoDB automatically creates an index on the _id field. These indexes can be found in the system.indexes collection. You can show all indexes in the database using db.system.indexes.find(). Most queries will include more fields than just the _id, so we need to create indexes on those fields.

Before creating more indexes, let’s see how a sample query performs with no indexes other than the automatically created one for _id. Create the following function to generate random phone numbers.

populatePhones = function (area, start, stop) {
  for (var i = start; i < stop; i++) {
    // random country code between 1 and 8 ("<< 0" truncates to an integer)
    var country = 1 + ((Math.random() * 8) << 0);
    var num = (country * 1e10) + (area * 1e7) + i;
    db.phones.insert({
      _id: num,
      components: {
        country: country,
        area: area,
        prefix: (i * 1e-4) << 0,
        number: i
      },
      display: "+" + country + " " + area + "-" + i
    });
  }
}

Run the function with a three-digit area code (like 800) and a range of seven-digit numbers (5,550,000 to 5,650,000):

populatePhones( 800, 5550000, 5650000 )
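
To make the shape of the generated documents concrete, here is a minimal sketch in plain JavaScript that builds a single phone document without touching the database. The helper name buildPhone and the fixed country code are assumptions made for illustration; the field layout follows the generator function above.

```javascript
// Hypothetical helper: builds one document the way the generator does,
// but takes the country code as an argument instead of picking it at random.
function buildPhone(country, area, i) {
  return {
    _id: (country * 1e10) + (area * 1e7) + i,
    components: {
      country: country,
      area: area,
      prefix: (i * 1e-4) << 0, // "<< 0" truncates to an integer
      number: i
    },
    display: "+" + country + " " + area + "-" + i
  };
}

var sample = buildPhone(1, 800, 5551213);
console.log(JSON.stringify(sample));
```

Because the _id is derived from the country, area, and number, every generated phone gets a unique, predictable key.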

Now we expect to see a new index created for our new collection.

> db.system.indexes.find()
{ "v" : 1, "key" : { "_id" : 1 }, "ns" : "newdb.towns", "name" : "_id_" }
{ "v" : 1, "key" : { "_id" : 1 }, "ns" : "newdb.countries", "name" : "_id_" }
{ "v" : 1, "key" : { "_id" : 1 }, "ns" : "newdb.phones", "name" : "_id_" }

Now let’s check the query without an index. The explain() method is used to output details of a given operation and can help us here.

> db.phones.find( { display : "+1 800-5650001" } ).explain()
{
        "cursor" : "BasicCursor",
        "isMultiKey" : false,
        "n" : 0,
        "nscannedObjects" : 100000,
        "nscanned" : 100000,
        "nscannedObjectsAllPlans" : 100000,
        "nscannedAllPlans" : 100000,
        "scanAndOrder" : false,
        "indexOnly" : false,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "millis" : 134,
        "indexBounds" : {

        },
        "server" : "ESOLIMAN:27017"
}

Just to keep things simple, we will look only at the millis field, which gives the milliseconds needed to complete the query. Here it is 134.

Now we are going to create an index and see how it improves our query execution time. We create an index by calling ensureIndex(fields, options) on the collection. The fields parameter is an object containing the fields to be indexed. The options parameter describes the type of index to make. In production environments, creating an index on a large collection can be slow and resource-intensive, so you should create indexes at off-peak times. In our case we are going to build a unique index on the display field and drop duplicate entries.

> db.phones.ensureIndex(
... { display : 1 },
... { unique : true, dropDups : true }
... )

Let’s try explain() on find() again and see the new value of the millis field. Query execution time improved from 134 down to 16.

> db.phones.find( { display : "+1 800-5650001" } ).explain()
{
        "cursor" : "BtreeCursor display_1",
        "isMultiKey" : false,
        "n" : 0,
        "nscannedObjects" : 0,
        "nscanned" : 0,
        "nscannedObjectsAllPlans" : 0,
        "nscannedAllPlans" : 0,
        "scanAndOrder" : false,
        "indexOnly" : false,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "millis" : 16,
        "indexBounds" : {
                "display" : [
                        [
                                "+1 800-5650001",
                                "+1 800-5650001"
                        ]
                ]
        },
        "server" : "ESOLIMAN:27017"
}

Notice the cursor changed from a Basic to a B-tree cursor. MongoDB is no longer doing
a full collection scan but instead walking the tree to retrieve the value.
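
As a rough analogy only (this is not how MongoDB implements its indexes), the following plain JavaScript sketch compares how many values a full scan inspects against a binary search over sorted keys, which stands in for walking a tree. The function names fullScan and sortedSearch are made up for this example.

```javascript
// Count how many values we inspect with a full scan versus a binary search
// over sorted keys (a simplified stand-in for walking a B-tree index).
function fullScan(values, target) {
  var inspected = 0;
  for (var i = 0; i < values.length; i++) {
    inspected++;
    if (values[i] === target) break;
  }
  return inspected;
}

function sortedSearch(sortedValues, target) {
  var lo = 0, hi = sortedValues.length - 1, inspected = 0;
  while (lo <= hi) {
    var mid = (lo + hi) >> 1;
    inspected++;
    if (sortedValues[mid] === target) break;
    if (sortedValues[mid] < target) lo = mid + 1; else hi = mid - 1;
  }
  return inspected;
}

// 100,000 consecutive numbers, like the range used above.
var numbers = [];
for (var n = 5550000; n < 5650000; n++) numbers.push(n);

var missing = 5650001; // not in the range, so the scan must look at everything
var scanned = fullScan(numbers, missing);
var searched = sortedSearch(numbers, missing);
console.log("full scan inspected:", scanned, "sorted search inspected:", searched);
```

The scan has to touch every value before it can say the number is missing, while the sorted search needs only a handful of comparisons. That is the same kind of difference the nscanned field shows in the two explain() outputs.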

Mongo can build your index on nested values: db.phones.ensureIndex({ "components.area": 1 }, { background : 1 })

Aggregations

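count() counts the number of matching documents. It takes a query and returns a number. (The results in this section reflect a second run of populatePhones for area code 855, which brings the collection to 200,000 documents.)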

> db.phones.count({'components.number': { $gt : 5599999 } })
100000

distinct() returns each matching value where one or more exists.

> db.phones.distinct('components.number', {'components.number': { $lt : 5550005 } })
[ 5550000, 5550001, 5550002, 5550003, 5550004 ]

group() groups documents in a collection by the specified keys and performs simple aggregation functions such as computing counts and sums. It is similar to GROUP BY in SQL. It accepts the following parameters:

  • key – Specifies one or more document fields to group by.
  • reduce – Specifies a function that the group operation performs on the documents during grouping, such as computing a sum or a count. The function takes two arguments: the current document and the aggregate result for the previous documents in the group.
  • initial – Initializes the aggregation result document.
  • keyf – Optional. Alternative to the key field. Specifies a function that creates a “key object” for use as the grouping key. Use keyf instead of key to group by calculated fields rather than existing document fields.
  • cond – Optional. Specifies the selection criteria to determine which documents in the collection to process. If you omit the cond field, db.collection.group() processes all the documents in the collection for the group operation.
  • finalize – Optional. Specifies a function that runs on each item in the result set before db.collection.group() returns the final value. This function can either modify the result document or replace it as a whole.

> db.phones.group({
... initial : { count : 0 },
... reduce : function(phone, output) { output.count++; },
... cond : { 'components.number' : { $gt : 5599999 } },
... key : { 'components.area' : true }
... })
[
        {
                "components.area" : 800,
                "count" : 50000
        },
        {
                "components.area" : 855,
                "count" : 50000
        }
]

The first thing we did here was set an initial object with a field named count set to 0; fields created here will appear in the output. Next we described what to do with this field by declaring a reduce function that adds one for every document we encounter. Then we gave group a condition restricting which documents to reduce over, and finally a key telling it to group the results by area code.
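
To see how initial, reduce, cond, and key fit together, here is a small in-memory sketch in plain JavaScript. It is not MongoDB's implementation: groupSketch is a hypothetical helper, and for simplicity cond and key are functions here, whereas the real group() takes a query document and a key document.

```javascript
// In-memory stand-in for group(): filter with cond, bucket by key,
// then fold each document into its bucket with reduce.
function groupSketch(docs, options) {
  var buckets = {};
  docs.filter(options.cond).forEach(function (doc) {
    var keyValue = options.key(doc);
    if (!(keyValue in buckets)) {
      // Start every bucket from a fresh copy of the initial document.
      buckets[keyValue] = JSON.parse(JSON.stringify(options.initial));
      buckets[keyValue].area = keyValue;
    }
    options.reduce(doc, buckets[keyValue]);
  });
  return Object.keys(buckets).map(function (k) { return buckets[k]; });
}

var phones = [
  { components: { area: 800, number: 5599999 } }, // filtered out by cond
  { components: { area: 800, number: 5600000 } },
  { components: { area: 855, number: 5600001 } },
  { components: { area: 855, number: 5600002 } }
];

var grouped = groupSketch(phones, {
  initial: { count: 0 },
  reduce: function (phone, output) { output.count++; },
  cond: function (phone) { return phone.components.number > 5599999; },
  key: function (phone) { return phone.components.area; }
});
console.log(JSON.stringify(grouped));
```

Each bucket starts from its own copy of initial, which is why the counts for the two area codes do not interfere with each other.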

Server-Side Commands

All the queries and operations we have run so far were driven from the client: the shell evaluates our JavaScript and sends each operation to the server. The db object provides a function named eval(), which passes the given function to the server to run there. This can greatly reduce the communication between client and server. It is similar to stored procedures in SQL.

There is also a set of prebuilt commands that can be executed on the server. Use db.listCommands() to get a list of these commands. To run any command on the server, use db.runCommand(), as in db.runCommand({ "count" : "phones" })

Although it is not recommended, you can store a JavaScript function on the server for later reuse.

MapReduce

MapReduce is a framework for parallelizing problems. Generally speaking, the parallelization happens in two steps:

  • "Map" step: The master node takes the input, divides it into smaller sub-problems, and distributes them to worker nodes. A worker node may do this again in turn, leading to a multi-level tree structure. The worker node processes the smaller problem, and passes the answer back to its master node.
  • "Reduce" step: The master node then collects the answers to all the sub-problems and combines them in some way to form the output – the answer to the problem it was originally trying to solve.

To show the MapReduce framework in action, let’s build on the phones collection that we created previously. Let’s generate a report that counts, for each country, the phone numbers that contain the same set of digits.

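First we create a helper function that extracts an array of all distinct digits in a number (this step is not a MapReduce step). For the server to find the helper inside db.eval() and inside the map function, it has to be stored on the server, for example with db.system.js.save({ _id : 'distinctDigits', value : distinctDigits }).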

> distinctDigits = function(phone) {
... var
... number = phone.components.number + '',
... seen = [],
... result = [],
... i = number.length;
... while(i--) {
...  seen[+number[i]] = 1;
...  }
... for (i=0; i<10; i++) {
...  if (seen[i]) {
...   result[result.length] = i;
...   }
...  }
... return result;
... }

> db.eval("distinctDigits(db.phones.findOne({ 'components.number' : 5551213 }))")
[ 1, 2, 3, 5 ]

Now let’s find the distinct digits for each country. Since we need to query by country later, we will use the distinct digits array and the country together as a compound key. For each distinct digits array in each country, we emit a count field that holds the value 1.

> map = function() {
... var digits = distinctDigits(this);
... emit( { digits : digits, country : this.components.country } , { count : 1 } );
... }

The reducer function will sum all these 1s that have been emitted from the map function.

> reduce = function(key, values) {
... var total = 0;
... for(var i=0; i<values.length; i++) {
...  total += values[i].count;
...  }
...  return { count : total };
... }
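
Before running this against the server, it can help to trace the flow on a few documents. The following plain JavaScript sketch repeats the helper, map, and reduce logic so that it runs on its own, and replaces MongoDB's built-in emit() with a hypothetical in-memory version. One simplification to note: this sketch calls reduce for every key, while MongoDB only calls it for keys that have more than one emitted value.

```javascript
// Same logic as the helper above, repeated so the sketch is self-contained.
function distinctDigits(phone) {
  var number = phone.components.number + '', seen = [], result = [];
  for (var i = 0; i < number.length; i++) seen[+number[i]] = 1;
  for (var d = 0; d < 10; d++) if (seen[d]) result.push(d);
  return result;
}

var emitted = {}; // serialized key -> { key, values }

// Hypothetical stand-in for the emit() that MongoDB provides to map functions.
function emit(key, value) {
  var id = JSON.stringify(key);
  if (!emitted[id]) emitted[id] = { key: key, values: [] };
  emitted[id].values.push(value);
}

var map = function () {
  emit({ digits: distinctDigits(this), country: this.components.country }, { count: 1 });
};

var reduce = function (key, values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) total += values[i].count;
  return { count: total };
};

var phones = [
  { components: { country: 1, number: 5551213 } },
  { components: { country: 1, number: 5553121 } }, // same digits as the first
  { components: { country: 2, number: 5551213 } }
];

// Map phase: run map once per document, with the document bound to "this".
phones.forEach(function (phone) { map.call(phone); });

// Reduce phase: one output document per distinct key.
var report = Object.keys(emitted).map(function (id) {
  return { _id: emitted[id].key, value: reduce(emitted[id].key, emitted[id].values) };
});
console.log(JSON.stringify(report));
```

The first two phones share both digits and country, so they collapse into a single key with a count of 2, while the third phone ends up under its own key.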

Now it is time to put all pieces together and start the whole thing (the input collection, map function, reduce function, output collection).

> results = db.runCommand({
... mapReduce : 'phones',
... map : map,
... reduce : reduce,
... out : 'phones.report'
... })
{
        "result" : "phones.report",
        "timeMillis" : 21084,
        "counts" : {
                "input" : 200000,
                "emit" : 200000,
                "reduce" : 48469,
                "output" : 3489
        },
        "ok" : 1
}

Now you can query the output collection like any other collection

> db.phones.report.find()
{ "_id" : { "digits" : [  0,  1,  2,  3,  4,  5,  6 ], "country" : 1 }, "value" : { "count" : 37 } }
{ "_id" : { "digits" : [  0,  1,  2,  3,  4,  5,  6 ], "country" : 2 }, "value" : { "count" : 23 } }
{ "_id" : { "digits" : [  0,  1,  2,  3,  4,  5,  6 ], "country" : 3 }, "value" : { "count" : 17 } }
{ "_id" : { "digits" : [  0,  1,  2,  3,  4,  5,  6 ], "country" : 4 }, "value" : { "count" : 29 } }
{ "_id" : { "digits" : [  0,  1,  2,  3,  4,  5,  6 ], "country" : 5 }, "value" : { "count" : 34 } }
{ "_id" : { "digits" : [  0,  1,  2,  3,  4,  5,  6 ], "country" : 6 }, "value" : { "count" : 35 } }
{ "_id" : { "digits" : [  0,  1,  2,  3,  4,  5,  6 ], "country" : 7 }, "value" : { "count" : 33 } }
{ "_id" : { "digits" : [  0,  1,  2,  3,  4,  5,  6 ], "country" : 8 }, "value" : { "count" : 32 } }
{ "_id" : { "digits" : [  0,  1,  2,  3,  5 ], "country" : 1 }, "value" : { "count" : 5 } }
{ "_id" : { "digits" : [  0,  1,  2,  3,  5 ], "country" : 2 }, "value" : { "count" : 7 } }
{ "_id" : { "digits" : [  0,  1,  2,  3,  5 ], "country" : 3 }, "value" : { "count" : 3 } }
{ "_id" : { "digits" : [  0,  1,  2,  3,  5 ], "country" : 4 }, "value" : { "count" : 6 } }
{ "_id" : { "digits" : [  0,  1,  2,  3,  5 ], "country" : 5 }, "value" : { "count" : 5 } }
{ "_id" : { "digits" : [  0,  1,  2,  3,  5 ], "country" : 6 }, "value" : { "count" : 10 } }
{ "_id" : { "digits" : [  0,  1,  2,  3,  5 ], "country" : 7 }, "value" : { "count" : 5 } }
{ "_id" : { "digits" : [  0,  1,  2,  3,  5 ], "country" : 8 }, "value" : { "count" : 7 } }
{ "_id" : { "digits" : [  0,  1,  2,  3,  5,  6 ], "country" : 1 }, "value" : { "count" : 95 } }
{ "_id" : { "digits" : [  0,  1,  2,  3,  5,  6 ], "country" : 2 }, "value" : { "count" : 104 } }
{ "_id" : { "digits" : [  0,  1,  2,  3,  5,  6 ], "country" : 3 }, "value" : { "count" : 108 } }
{ "_id" : { "digits" : [  0,  1,  2,  3,  5,  6 ], "country" : 4 }, "value" : { "count" : 113 } }
Type "it" for more

or

> db.phones.report.find({'_id.country' : 8})
{ "_id" : { "digits" : [  0,  1,  2,  3,  4,  5,  6 ], "country" : 8 }, "value" : { "count" : 32 } }
{ "_id" : { "digits" : [  0,  1,  2,  3,  5 ], "country" : 8 }, "value" : { "count" : 7 } }
{ "_id" : { "digits" : [  0,  1,  2,  3,  5,  6 ], "country" : 8 }, "value" : { "count" : 127 } }
{ "_id" : { "digits" : [  0,  1,  2,  3,  5,  6,  7 ], "country" : 8 }, "value" : { "count" : 28 } }
{ "_id" : { "digits" : [  0,  1,  2,  3,  5,  6,  8 ], "country" : 8 }, "value" : { "count" : 27 } }
{ "_id" : { "digits" : [  0,  1,  2,  3,  5,  6,  9 ], "country" : 8 }, "value" : { "count" : 29 } }
{ "_id" : { "digits" : [  0,  1,  2,  3,  5,  7 ], "country" : 8 }, "value" : { "count" : 10 } }
{ "_id" : { "digits" : [  0,  1,  2,  3,  5,  8 ], "country" : 8 }, "value" : { "count" : 7 } }
{ "_id" : { "digits" : [  0,  1,  2,  3,  5,  9 ], "country" : 8 }, "value" : { "count" : 8 } }
{ "_id" : { "digits" : [  0,  1,  2,  4,  5 ], "country" : 8 }, "value" : { "count" : 3 } }
{ "_id" : { "digits" : [  0,  1,  2,  4,  5,  6 ], "country" : 8 }, "value" : { "count" : 121 } }
{ "_id" : { "digits" : [  0,  1,  2,  4,  5,  6,  7 ], "country" : 8 }, "value" : { "count" : 25 } }
{ "_id" : { "digits" : [  0,  1,  2,  4,  5,  6,  8 ], "country" : 8 }, "value" : { "count" : 27 } }
{ "_id" : { "digits" : [  0,  1,  2,  4,  5,  6,  9 ], "country" : 8 }, "value" : { "count" : 17 } }
{ "_id" : { "digits" : [  0,  1,  2,  4,  5,  7 ], "country" : 8 }, "value" : { "count" : 4 } }
{ "_id" : { "digits" : [  0,  1,  2,  4,  5,  8 ], "country" : 8 }, "value" : { "count" : 4 } }
{ "_id" : { "digits" : [  0,  1,  2,  4,  5,  9 ], "country" : 8 }, "value" : { "count" : 7 } }
{ "_id" : { "digits" : [  0,  1,  2,  5 ], "country" : 8 }, "value" : { "count" : 14 } }
{ "_id" : { "digits" : [  0,  1,  2,  5,  6 ], "country" : 8 }, "value" : { "count" : 162 } }
{ "_id" : { "digits" : [  0,  1,  2,  5,  6,  7 ], "country" : 8 }, "value" : { "count" : 95 } }
Type "it" for more

The unique emitted keys are under the field _id, and all of the data returned from the reducers is under the field value. If you prefer that mapReduce just return the results rather than writing them to a collection, you can set the out value to { inline : 1 }, but bear in mind there is a limit to the size of a result you can output (16 MB).

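In some situations MongoDB calls the reducer more than once for the same key, feeding the output of one reduce call into another. The reducer therefore has to handle both kinds of input carefully: values emitted by map and values returned by an earlier reduce. The simplest way to do that is to give both the same shape.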
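
A quick way to check that a reducer is safe to chain is to compare a single pass against a staged pass. The sketch below is plain JavaScript with made-up sample batches; naive is a deliberately broken reducer added for contrast.

```javascript
// The reducer from this post: sums the count field of every value.
var reduce = function (key, values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) total += values[i].count;
  return { count: total };
};

var key = { digits: [1, 2, 3, 5], country: 1 };
var firstBatch = [{ count: 1 }, { count: 1 }, { count: 1 }];
var secondBatch = [{ count: 1 }, { count: 1 }];

// One pass over everything...
var allAtOnce = reduce(key, firstBatch.concat(secondBatch));
// ...must agree with reducing the partial results of two earlier passes.
var inStages = reduce(key, [reduce(key, firstBatch), reduce(key, secondBatch)]);

// A reducer that counts values instead of summing count breaks when chained.
var naive = function (key, values) { return { count: values.length }; };
var naiveStaged = naive(key, [naive(key, firstBatch), naive(key, secondBatch)]);

console.log(allAtOnce.count, inStages.count, naiveStaged.count);
```

Our reducer gives the same total either way because its output has the same { count : n } shape as the values emitted by map. The naive version silently undercounts once its own output is fed back in.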

MongoDB has many more features that we didn’t even mention here. We will continue working through them in later posts.

Getting Started with MongoDB – Part 1


MongoDB (from "humongous") is an open source document-oriented database system developed and supported by 10gen (founded by Dwight Merriman). It was first publicly released in 2009, and since then it has been a rising star in the NoSQL world. MongoDB stores structured data as JSON-like documents with dynamic schemas (technically data is stored in a binary form of JSON known as BSON), making the integration of data in certain types of applications easier and faster.

Installation

  1. Download the latest MongoDB version from the MongoDB download page.
  2. Extract the archive to your preferred location (in my case C:\mongodb). MongoDB is self-contained and does not have any other system dependencies, so you can install and run it from any folder you choose.
  3. MongoDB requires a data folder to store its files (default is C:\data\db ). You may specify a different path with the --dbpath option when launching mongod.exe.

Starting the Server

To start MongoDB, open a command prompt window and run mongod.exe from the bin directory (specify the data path if needed), for example mongod.exe --dbpath C:\mongodb\data
The "waiting for connections" message in the console output indicates that the mongod.exe process is running successfully and waiting for connections on port 27017.

Connecting to the Server

To connect to the server, open another command prompt window and run mongo.exe from the bin directory.

Run MongoDB as a Windows Service

  1. Create a log file for MongoDB (in my case c:\mongodb\log\mongo.log ).
  2. Create a data directory for MongoDB (in my case c:\mongodb\data ).
  3. Open the command prompt window as an administrator.
  4. Run the following command: C:\mongodb\bin>mongod.exe --install --rest --master --logpath "c:\mongodb\log\mongo.log"
  5. Run regedit from start menu.
  6. Go to HKEY_LOCAL_MACHINE >> SYSTEM >> CurrentControlSet >> services
  7. Find the MongoDB directory & edit the ImagePath key.
  8. Set value as c:\mongodb\bin\mongod --service  --rest  --master  --logpath=C:\mongodb\log\mongo.log  --dbpath=C:\mongodb\data
  9. Save and exit registry editor
  10. Open Component Services from Start menu >> Run
  11. Locate the MongoDB service, and right-click >> Properties.
  12. Set the Startup Type to Automatic. Then start the service.
    1. To run the MongoDB service from command window, use net start MongoDB
  13. Check http://localhost:28017/ in a browser; MongoDB should return its stats.
In case you want to remove the MongoDB service, run C:\mongodb\bin\mongod.exe --remove

Data Model

Data in MongoDB has a flexible schema.
  • A database consists of a set of collections.
  • A collection consists of a set of documents.
    • Documents in the same collection do not need to have the same set of fields or structure.
    • Common fields in a collection’s documents can hold different types of data.
Based on what we mentioned, you could say that MongoDB is schema-less. Still, it is worth reading up on data modeling before doing real work with MongoDB.

CRUD operations

  • When you start the mongo shell, it connects to the test database by default. To create a new database or switch to another database, use use newdb
  • To show a list of databases, use show dbs (a database is created when you insert the first values into it; a database that you switched to but never inserted anything into does not really exist yet).
  • To confirm the current session database, use db
  • To create a collection, just insert an initial record to it. Since Mongo is schema-less, there is no need to define anything up front. The following code creates/inserts a towns collection:
db.towns.insert({
name: "New York",
population: 22200000,
last_census: ISODate("2009-07-31"),
famous_for: [ "statue of liberty", "food" ],
mayor : {
name : "Michael Bloomberg",
party : "I"
}
})

    • brackets like {...} denote an object with key-value pairs.
    • brackets like [...] denote an array.
    • You can nest these values to any depth.
  • To show a list of collections, use show collections.
  • To list the contents of a collection, use db.towns.find()
    • You will see a system-generated field _id (composed of a timestamp, client machine ID, client process ID, and a 3-byte incremented counter). You can override this system-generated value by supplying your own _id.
  • MongoDB commands are JavaScript functions.
    • db is a JavaScript object that contains information about the current database. Try typeof db
    • db.x is a JavaScript object that represents a collection named x within the current database. Try typeof db.towns
    • db.x.help() will list available functions related to the given object. Try typeof db.towns.insert
    • If you want to inspect the source code of a function, call it without parameter or parentheses.
  • You can create your own JavaScript functions and call them in the mongo shell, like:
function insertCity( name, population, last_census, famous_for, mayor_info) {
db.towns.insert({
name:name,
population:population,
last_census: ISODate(last_census),
famous_for:famous_for,
mayor : mayor_info
});
}

insertCity("Punxsutawney", 6200, '2008-01-31', ["phil the groundhog"], { name : "Jim Wehrle" } )
insertCity("Portland", 582000, '2007-09-20', ["beer", "food"], { name : "Sam Adams", party : "D" } )

Now we have three towns in our collection.
  • To get a specific document, pass its _id to find() as an ObjectId (findOne() works the same way but retrieves only one matching document). A string can be converted to an ObjectId using the ObjectId(str) function.
 db.towns.find({ "_id" : ObjectId("51def56c1cf66f4c40bb7f4a") })
    • The find() function also accepts an optional second parameter: a fields object we can use to filter which fields are retrieved. If we want only the town name (along with _id), pass in name with a value resolving to 1 (or true).
db.towns.find({ "_id" : ObjectId("51def56c1cf66f4c40bb7f4a") }, { name : 1})
    • To retrieve all fields except name, set name to 0 (or false or null).
db.towns.find({ "_id" : ObjectId("51def56c1cf66f4c40bb7f4a") }, { name : 0})
    • You can retrieve documents based on criteria other than _id. You can use regular expressions or any operator.
db.towns.find( { name : /^P/, population : { $lt : 10000 } }, { name : 1, population : 1 } )
We said before that MongoDB commands are JavaScript functions, which means we can construct criteria the same way we construct any other object. In the following query, we build a criteria object where the population must be between 10,000 and 1 million. Ranges also work on dates.
> var population_range = {}
> population_range['$lt'] = 1000000
1000000
> population_range['$gt'] = 10000
10000
> db.towns.find( {name : /^P/, population : population_range }, {name: 1})
{ "_id" : ObjectId("51df08e72476b99608460870"), "name" : "Portland" }

    • You can also query based on values in nested arrays, either matching exact values or matching partial values or all matching values or the lack of matching values.
> db.towns.find( { famous_for : 'food' }, { _id : 0, name : 1, famous_for : 1 } )
{ "name" : "New York", "famous_for" : [  "statue of liberty",  "food" ] }
{ "name" : "Portland", "famous_for" : [  "beer",  "food" ] }
> db.towns.find( { famous_for : /statue/ }, { _id : 0, name : 1, famous_for : 1 } )
{ "name" : "New York", "famous_for" : [  "statue of liberty",  "food" ] }
> db.towns.find( { famous_for : { $all : ['food', 'beer'] } }, { _id : 0, name:1, famous_for:1 } )
{ "name" : "Portland", "famous_for" : [  "beer",  "food" ] }
> db.towns.find( { famous_for : { $nin : ['food', 'beer'] } }, { _id : 0, name : 1, famous_for : 1 } )
{ "name" : "Punxsutawney", "famous_for" : [  "phil the groundhog" ] }

    • You can query a sub-document by giving the field name as a string separating nested layers with a dot.
> db.towns.find( { 'mayor.party' : 'I' }, { _id : 0, name : 1, mayor : 1 } )
{ "name" : "New York", "mayor" : { "name" : "Michael Bloomberg", "party" : "I" } }

    • To match documents in which a field does not exist, use $exists:
> db.towns.find( { 'mayor.party' : { $exists : false } }, { _id : 0, name : 1, mayor : 1 } )
{ "name" : "Punxsutawney", "mayor" : { "name" : "Jim Wehrle" } }

  • elemMatch

$elemMatch specifies that a document counts as a match, and is returned, only if a single document or nested document in an array satisfies all of our criteria. We can use any of the advanced operators within these criteria. To show it in action, let’s insert some data into a new collection, countries.
> db.countries.insert({ _id : "us",
... name : "United States",
... exports : {
...  foods : [
...   { name : "bacon", tasty : true },
...   { name : "burgers" }
...          ]
...            }
... })
> db.countries.insert({ _id : "ca",
... name : "Canada",
... exports : {
...  foods : [
...   { name : "bacon", tasty : false },
...   { name : "syrup", tasty : true }
...          ]
...             }
... })
> db.countries.insert({ _id : "mx",
... name : "Mexico",
... exports : {
...  foods : [
...   { name : "salsa", tasty : true, condiment : true }
...          ]
...           }
... })
> print( db.countries.count() )
3

Now if we need to select countries that export tasty bacon, we should use $elemMatch in a query like:
> db.countries.find(
... {
...   'exports.foods' : {
...    $elemMatch : {
...      name : "bacon",
...      tasty : true
...     }
...   }
... },
... { _id : 0, name : 1 }
... )
{ "name" : "United States" }

If we didn’t use $elemMatch and wrote a query like the following, it would return countries that export bacon and also export some tasty food, not necessarily tasty bacon:
> db.countries.find(
... { 'exports.foods.name' : 'bacon' , 'exports.foods.tasty' : true },
... { _id : 0, name : 1 }
... )
{ "name" : "United States" }
{ "name" : "Canada" }
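
The difference between the two queries is easier to see when written as ordinary predicates. The sketch below is plain JavaScript over an in-memory copy of the three countries; sameElement and anyElement are names invented for this illustration and only approximate the matching rules.

```javascript
var countries = [
  { name: "United States", exports: { foods: [{ name: "bacon", tasty: true }, { name: "burgers" }] } },
  { name: "Canada", exports: { foods: [{ name: "bacon", tasty: false }, { name: "syrup", tasty: true }] } },
  { name: "Mexico", exports: { foods: [{ name: "salsa", tasty: true, condiment: true }] } }
];

// $elemMatch style: one array element must satisfy every condition.
function sameElement(country) {
  return country.exports.foods.some(function (food) {
    return food.name === "bacon" && food.tasty === true;
  });
}

// Dotted-field style: each condition may be satisfied by a different element.
function anyElement(country) {
  var foods = country.exports.foods;
  return foods.some(function (f) { return f.name === "bacon"; }) &&
         foods.some(function (f) { return f.tasty === true; });
}

function names(list) { return list.map(function (c) { return c.name; }); }

var withElemMatch = names(countries.filter(sameElement));
var withDottedFields = names(countries.filter(anyElement));
console.log(withElemMatch, withDottedFields);
```

Canada passes the second predicate because its bacon satisfies the name condition and its syrup satisfies the tasty condition, even though no single food satisfies both.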

  • Boolean Operators

$or is a prefix for criteria that returns documents matching either condition1 or condition2. There are a lot of other operators you can use.
> db.countries.find({
... $or : [ { _id : "mx" } , { name : "United States" } ] },
... {_id : 1} )
{ "_id" : "us" }
{ "_id" : "mx" }

Update

The find() function took two parameters: criteria and a list of fields to return. The update() function also takes two parameters. The first is a criteria object (the same kind you would use to retrieve the document through find()). The second is either an object whose fields will replace the matched document, or a modifier operation ($set to set a field value, $unset to delete a field, $inc to increment a field value by a number).
The following query will set the field state to the string OR for the matching document.
db.towns.update( { _id : ObjectId("4d0ada87bb30773266f39fe5") }, { $set : { "state" : "OR" } } )
But the following query will replace the whole matching document with a new document { state : "OR" }:
db.towns.update( { _id : ObjectId("4d0ada87bb30773266f39fe5") }, { state : "OR" } )
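
Because mixing up these two forms can wipe out a document, here is a small plain JavaScript sketch of the difference. applyUpdate is a hypothetical in-memory helper that handles only $set and whole-document replacement, and the sample town with a numeric _id is a simplified assumption, not the real document.

```javascript
// Hypothetical in-memory stand-in for update() on a single document.
function applyUpdate(doc, update) {
  var result;
  if (update.$set) {
    // Modifier: copy the document, then overwrite only the listed fields.
    result = JSON.parse(JSON.stringify(doc));
    Object.keys(update.$set).forEach(function (field) {
      result[field] = update.$set[field];
    });
  } else {
    // No modifier: the new document replaces everything except _id.
    result = JSON.parse(JSON.stringify(update));
    result._id = doc._id;
  }
  return result;
}

var portland = { _id: 1, name: "Portland", population: 582000 };
var withSet = applyUpdate(portland, { $set: { state: "OR" } });
var replaced = applyUpdate(portland, { state: "OR" });
console.log(JSON.stringify(withSet), JSON.stringify(replaced));
```

With $set the name and population survive and state is added. Without it, only _id and state remain.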

References

Although MongoDB is schema-less, you can make one document reference another using a construct like { $ref : "collection_name", $id : "reference_id" }. In the following query we link the town New York with the country US. Notice the new country field in the New York document.
> db.towns.update(
... { _id : ObjectId("51def56c1cf66f4c40bb7f4a") },
... { $set : { country : { $ref : "countries", $id : "us" } } }
... )
> db.towns.find( { _id : ObjectId("51def56c1cf66f4c40bb7f4a") } )
{ "_id" : ObjectId("51def56c1cf66f4c40bb7f4a"), "country" : DBRef("countries", "us"), "famous_for" : [  "statue of liberty",  "food" ], "last_census" : ISODate("2009-07-31T00:00:00Z"), "mayor" : { "name" : "Michael Bloomberg", "party" : "I" }, "name" : "New York", "population" : 22200000 }
Now we can retrieve New York from the towns collection, then use it to retrieve its country:
> var NY = db.towns.findOne( { _id : ObjectId("51def56c1cf66f4c40bb7f4a") } )
> db.countries.findOne( { _id : NY.country.$id })
{
        "_id" : "us",
        "name" : "United States",
        "exports" : {
                "foods" : [
                        {
                                "name" : "bacon",
                                "tasty" : true
                        },
                        {
                                "name" : "burgers"
                        }
                ]
        }
}

or in a different way > db[ NY.country.$ref ].findOne( { _id : NY.country.$id} )
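
The second form works because a reference is just data: a collection name and an id. The following plain JavaScript sketch mimics that lookup with an in-memory object in place of db. The collections object and the resolve helper are assumptions for illustration, and the reference is a plain object here rather than a real DBRef.

```javascript
// In-memory stand-in for the database: collection name -> array of documents.
var collections = {
  countries: [
    { _id: "us", name: "United States" },
    { _id: "mx", name: "Mexico" }
  ]
};

var town = { name: "New York", country: { $ref: "countries", $id: "us" } };

// Follow a reference: pick the collection by name, then match on _id.
function resolve(ref) {
  var matches = collections[ref.$ref].filter(function (doc) {
    return doc._id === ref.$id;
  });
  return matches.length ? matches[0] : null;
}

var country = resolve(town.country);
console.log(country.name);
```

Note that following a reference is a second lookup done by the client, which matches the two-step query shown above.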

Delete

Removing documents from a collection is simple: call remove() with your criteria and all matching documents will be removed. It’s a recommended practice to build your criteria in an object, run find() with it first to check that the matching documents are the expected ones, and then pass the same object to remove().
> var bad_bacon = { 'exports.foods' : {
... $elemMatch : { name : 'bacon', tasty : false }
... } }
> db.countries.find ( bad_bacon)
{ "_id" : "ca", "name" : "Canada", "exports" : { "foods" : [    {       "name" : "bacon",       "tasty" : false },      {       "name" : "syrup",       "tasty" : true } ] } }
> db.countries.remove( bad_bacon)
> db.countries.count()
2

Reading with Code

You can ask MongoDB to run a decision function across documents.
db.towns.find(  function() {
return this.population > 6000 && this.population < 600000;
} )
or in a short-hand format db.towns.find("this.population > 6000 && this.population < 600000")
or combine custom code with other criteria using $where
db.towns.find( {
$where : "this.population > 6000 && this.population < 600000",
famous_for : /groundhog/
} )
Custom code should be your last resort due to the following:
  1. Custom code queries run slower than regular queries.
  2. Custom code queries can’t be indexed.
  3. MongoDB can’t optimize custom code queries.
  4. If your custom code assumes the existence of a particular field and this field is missing in a single document, the entire query will fail.
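
The last point is easy to reproduce outside the database. This plain JavaScript sketch filters an in-memory array with a predicate that assumes a field exists and with a guarded version; the second town is a made-up document without a mayor field.

```javascript
var towns = [
  { name: "New York", mayor: { name: "Michael Bloomberg", party: "I" } },
  { name: "Ghost Town" } // hypothetical document with no mayor field
];

// Assumes every document has a mayor; throws a TypeError on the second one.
function unsafe(town) { return town.mayor.party === "I"; }

// Checks that the field exists before reading from it.
function guarded(town) { return !!town.mayor && town.mayor.party === "I"; }

var failed = false;
try {
  towns.filter(unsafe);
} catch (e) {
  failed = e instanceof TypeError;
}

var matches = towns.filter(guarded).map(function (t) { return t.name; });
console.log(failed, matches);
```

One document with a missing field is enough to abort the unguarded filter, so custom code should check for a field before using it.
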
In this post we explored the basics of MongoDB, a rising star in the NoSQL family and the most common document database. We saw how to install and configure it. We saw how we can store nested structured data as JSON objects and query that data at any depth. We also saw how we can update and delete data. In the next post we are going to dig deeper into MongoDB.