PERFORMANCE

Storage Engines: Introduction

As you saw above, the drivers communicate with the server. The server communicates with the disks via the storage engine. The storage engine is like a car engine: the more efficient it is, the faster the car goes. The same holds for queries.

The storage engine also controls how memory is used when reading from and writing to the disks.

Two storage engines are available:

  • MMAPv1 (default)
  • WiredTiger (acquired by MongoDB in 2014; created by the team behind Sleepycat's Berkeley DB)

Storage Engines: MMAPv1

  • Automatically allocates power-of-two-sized record space when new documents are inserted.
  • Is built on top of the mmap system call, which maps files into memory.
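
The power-of-two rule can be illustrated with a few lines of plain JavaScript. This is only a sketch of the rounding idea: the function name nextPowerOfTwoSize is made up for illustration, and the real MMAPv1 allocator has further details (minimum sizes, handling of very large documents) that are not modeled here.

```javascript
// Hypothetical helper: round a document size (in bytes) up to the next
// power of two, as a power-of-two allocation strategy would.
function nextPowerOfTwoSize(docBytes) {
  let size = 1;
  while (size < docBytes) {
    size *= 2; // keep doubling until the document fits
  }
  return size;
}

// A 100-byte document gets a 128-byte slot. The 28 spare bytes let the
// document grow a little without having to be moved.
const exampleRecordSize = nextPowerOfTwoSize(100); // 128
```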

Storage Engines: WiredTiger

The WiredTiger storage engine has two main features:

  • Document-level concurrency
  • Compression of data and indexes

In most cases, WiredTiger (WT) is more efficient than MMAPv1.

To use WT, first kill all mongod instances:

killall mongod

Create a new directory:

mkdir WT

Then tell mongod to use that directory, and use the storageEngine option to choose the engine you want:

mongod --dbpath WT --storageEngine wiredTiger

To check whether you are using WT, look at the statistics of a collection. With WT, the output contains a wiredTiger section:

db.foo.stats()
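
Looking for the wiredTiger section of that output can be wrapped in a small helper. The sketch below assumes the stats document is passed in as a plain object. The helper name usesWiredTiger and the two sample objects are hypothetical, hand-written stand-ins, not real server output.

```javascript
// Hypothetical helper: true when a collection stats document carries
// a "wiredTiger" section.
function usesWiredTiger(stats) {
  return stats !== null && typeof stats === "object" && "wiredTiger" in stats;
}

// Hand-written stand-ins for stats output, trimmed to the relevant part.
const sampleWtStats = { ns: "test.foo", wiredTiger: {} };
const sampleMmapStats = { ns: "test.foo" };

usesWiredTiger(sampleWtStats);   // true
usesWiredTiger(sampleMmapStats); // false
```

In the mongo shell, the same helper would be called as usesWiredTiger(db.foo.stats()).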

Indexes

Reads are much faster with indexes, but writes to a document are slower, because every index on the collection has to be kept up to date. Combined operations such as updates and deletes, where you first find the document you want and then perform a write, benefit from the index during the query stage, but may be slowed by the index during the write stage.

Usually you're still better off having an index, but there are some special cases where this may not be true.
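
The trade-off can be made concrete with a toy model in plain JavaScript. ToyCollection is a made-up class for illustration, not MongoDB code: it keeps a Map as its only index, and counts how many documents a lookup touches and how many extra index updates the inserts cost.

```javascript
// Toy in-memory collection with an optional index on one field.
// Illustration only; this is not how MongoDB stores data.
class ToyCollection {
  constructor(indexedField = null) {
    this.docs = [];
    this.indexedField = indexedField;
    this.index = new Map(); // field value -> positions in this.docs
    this.indexWrites = 0;   // extra work caused by the index on writes
  }

  insert(doc) {
    const position = this.docs.push(doc) - 1;
    if (this.indexedField !== null) {
      const key = doc[this.indexedField];
      if (!this.index.has(key)) this.index.set(key, []);
      this.index.get(key).push(position);
      this.indexWrites += 1; // every insert also has to update the index
    }
  }

  // Returns the matching documents and how many documents were looked at.
  find(field, value) {
    if (field === this.indexedField) {
      const positions = this.index.get(value) || [];
      return {
        docs: positions.map((p) => this.docs[p]),
        docsExamined: positions.length,
      };
    }
    // No usable index: every document has to be checked.
    const docs = this.docs.filter((d) => d[field] === value);
    return { docs, docsExamined: this.docs.length };
  }
}

const plainCollection = new ToyCollection();
const indexedCollection = new ToyCollection("student_id");
for (let i = 0; i < 1000; i++) {
  plainCollection.insert({ student_id: i });
  indexedCollection.insert({ student_id: i });
}

plainCollection.find("student_id", 5).docsExamined;   // 1000
indexedCollection.find("student_id", 5).docsExamined; // 1
indexedCollection.indexWrites;                        // 1000
```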

Indexes in MongoDB are stored as B-trees. This is true for MMAPv1 (and therefore for MongoDB prior to 3.0), but it depends on your storage engine. For example, when you are using WiredTiger, as of MongoDB 3.0, indexes are implemented as B+ trees.

Creating Indexes

The following command adds an index to a collection named students, with the index key being class (ascending) and student_name (descending):

db.students.createIndex({class: 1, student_name: -1})
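
To see what the 1 and -1 mean for the order of the index entries, here is a small sketch in plain JavaScript. The comparator name compareByIndexKey and the sample students are invented for illustration. It only mimics the ordering, not how the index is stored.

```javascript
// Comparator mirroring the key pattern {class: 1, student_name: -1}:
// class ascending first, then student_name descending.
function compareByIndexKey(a, b) {
  if (a.class !== b.class) return a.class < b.class ? -1 : 1;
  if (a.student_name !== b.student_name) {
    return a.student_name < b.student_name ? 1 : -1; // reversed for -1
  }
  return 0;
}

// Made-up sample documents.
const sampleStudents = [
  { class: 2, student_name: "Ann" },
  { class: 1, student_name: "Bob" },
  { class: 1, student_name: "Zoe" },
  { class: 2, student_name: "Max" },
];

const orderedStudents = [...sampleStudents].sort(compareByIndexKey);
// Order: (1, Zoe), (1, Bob), (2, Max), (2, Ann)
```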

The explain method gives us a lot of information about a query, such as the following find:

db.students.explain().find({student_id:5})

The result is a document describing the query plan. Note that in winningPlan, the value of the stage property is COLLSCAN. That means the query scanned the whole collection, because there is no index on student_id. If a suitable index exists, the stage property will be different and an indexName property will appear below it.

Also, explain can take true as an argument:

db.students.explain(true).find({student_id:5})

It returns the same result as before and adds execution statistics, including the docsExamined property (totalDocsExamined at the top level). This useful property shows how many documents were examined by the query.

Discovering (and Deleting) Indexes

The following command lists the indexes that already exist on a collection:

db.students.getIndexes()

The following command drops a specific index:

db.students.dropIndex({student_id:1})

Multikey Indexes

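The following command creates a compound index on a and b. It becomes a multikey index as soon as a document is inserted in which one of the indexed fields holds an array: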

db.foo.createIndex( { a:1, b:1 } )

Note that the following insert is rejected, because at most one of the fields of a compound multikey index may hold an array in any given document:

db.foo.insert( { a : [ 1, 2, 3 ], b : [ 5, 6, 7 ] } )
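
The reason for this restriction is easier to see when looking at the index keys a document produces. The sketch below is plain JavaScript with a made-up function name (compoundIndexKeys) and its own error message. It is a simplified model, not MongoDB's implementation. One array field gives one key per array element; two array fields would need every combination of elements, which is what gets refused.

```javascript
// Sketch: build the keys a compound index on `fields` would need for one
// document. Throws when more than one indexed field holds an array.
function compoundIndexKeys(doc, fields) {
  const arrayFields = fields.filter((f) => Array.isArray(doc[f]));
  if (arrayFields.length > 1) {
    throw new Error("cannot index parallel arrays: " + arrayFields.join(", "));
  }
  if (arrayFields.length === 0) {
    return [fields.map((f) => doc[f])]; // a single key
  }
  const arrayField = arrayFields[0];
  // One key per array element; the other fields are repeated in each key.
  return doc[arrayField].map((element) =>
    fields.map((f) => (f === arrayField ? element : doc[f]))
  );
}

compoundIndexKeys({ a: [1, 2, 3], b: 5 }, ["a", "b"]);
// [[1, 5], [2, 5], [3, 5]]

// compoundIndexKeys({ a: [1, 2, 3], b: [5, 6, 7] }, ["a", "b"]) throws.
```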

Dot Notation and Multikey

Suppose you have a collection called people in the database earth with documents of the following form:

{
    "_id" : ObjectId("551458821b87e1799edbebc4"),
    "name" : "Eliot Horowitz",
    "work_history" : [
        {
            "company" : "DoubleClick",
            "position" : "Software Engineer"
        },
        {
            "company" : "ShopWiki",
            "position" : "Founder & CTO"
        },
        {
            "company" : "MongoDB",
            "position" : "Founder & CTO"
        }
    ]
}

The following command creates an index on company, descending:

db.people.createIndex({"work_history.company": -1})
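
Because work_history is an array, the dotted path reaches into every element, so this index is multikey. The plain JavaScript sketch below shows which values such a path picks out of a document. The function name valuesAtPath is hypothetical, the sample document is a trimmed copy of the one above, and the traversal is a simplified model of dot notation.

```javascript
// Sketch: collect every value reachable through a dotted path, stepping
// into arrays of sub-documents along the way.
function valuesAtPath(doc, path) {
  let current = [doc];
  for (const part of path.split(".")) {
    const next = [];
    for (const value of current) {
      // Visit each array element on its own.
      const items = Array.isArray(value) ? value : [value];
      for (const item of items) {
        if (item !== null && typeof item === "object" && part in item) {
          next.push(item[part]);
        }
      }
    }
    current = next;
  }
  // If the last field is itself an array, each element counts separately.
  return current.flatMap((v) => (Array.isArray(v) ? v : [v]));
}

const samplePerson = {
  name: "Eliot Horowitz",
  work_history: [
    { company: "DoubleClick", position: "Software Engineer" },
    { company: "ShopWiki", position: "Founder & CTO" },
    { company: "MongoDB", position: "Founder & CTO" },
  ],
};

valuesAtPath(samplePerson, "work_history.company");
// ["DoubleClick", "ShopWiki", "MongoDB"]
```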

Index Creation Option, Unique

The following command creates a unique index on student_id and class_id, both ascending, for the collection students:

db.students.createIndex({student_id: 1, class_id: 1}, {unique: true})

A unique index enforces the constraint that the indexed key values are unique within the collection.
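
For a compound key, it is the combination of values that has to be unique, not each field on its own. A toy model in plain JavaScript shows the idea. ToyCompoundUniqueIndex and its error message are invented for this sketch and are not part of MongoDB.

```javascript
// Toy unique index over several fields: remembers every key combination
// seen so far and refuses a second document with the same combination.
class ToyCompoundUniqueIndex {
  constructor(fields) {
    this.fields = fields;
    this.keys = new Set();
  }

  insert(doc) {
    const key = JSON.stringify(this.fields.map((f) => doc[f]));
    if (this.keys.has(key)) {
      throw new Error("duplicate key: " + key);
    }
    this.keys.add(key);
  }
}

const enrollmentIndex = new ToyCompoundUniqueIndex(["student_id", "class_id"]);
enrollmentIndex.insert({ student_id: 1, class_id: 10 }); // accepted
enrollmentIndex.insert({ student_id: 1, class_id: 11 }); // accepted: new pair
// Inserting { student_id: 1, class_id: 10 } again would throw.
```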

Index Creation, Sparse

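If you want to create a unique index on a field that is not present in all documents of the collection, you have to use the sparse option. Without it, every document missing the field is indexed with a null value, so the second such document violates the unique constraint: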

db.students.createIndex({student_id: 1, class_id: 1}, {unique: true, sparse: true})
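
The effect of sparse can be sketched with a toy single-field index in plain JavaScript. ToySparseUniqueIndex is a made-up class for illustration. It models only the rule that a missing field counts as null in a regular index and is skipped in a sparse one, and it deliberately ignores compound keys.

```javascript
// Toy unique index on one field, with an optional sparse mode.
class ToySparseUniqueIndex {
  constructor(field, { sparse = false } = {}) {
    this.field = field;
    this.sparse = sparse;
    this.keys = new Set();
  }

  // Returns true if the document was indexed, false if it was skipped.
  insert(doc) {
    const present = Object.prototype.hasOwnProperty.call(doc, this.field);
    if (!present && this.sparse) {
      return false; // sparse: documents without the field are left out
    }
    // Non-sparse: a missing field is indexed as null.
    const key = JSON.stringify(present ? doc[this.field] : null);
    if (this.keys.has(key)) {
      throw new Error("duplicate key: " + key);
    }
    this.keys.add(key);
    return true;
  }
}

const sparseIndex = new ToySparseUniqueIndex("student_id", { sparse: true });
sparseIndex.insert({ name: "no id, first" });  // false: skipped
sparseIndex.insert({ name: "no id, second" }); // false: skipped, no conflict

const denseIndex = new ToySparseUniqueIndex("student_id");
denseIndex.insert({ name: "no id, first" });   // true: indexed as null
// A second document without student_id would now throw.
```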

Index Creation, Background

There are two ways to create indexes:

  • Foreground (default)
    • fast
    • blocks writers and readers in the database
  • Background
    • slower
    • does not block writers and readers in the database

When creating an index on a production server, build it in the background to avoid performance issues:

db.students.createIndex({student_id: 1}, {background: true})

Note that although the database server will continue to take requests, a background index creation still blocks the mongo shell that you are using to create the index.

Using Explain

The explain() command gives us information about how a query is executed. It lets us find out which indexes were used, the number of documents returned, the number of documents examined, etc.

Explain command example:

db.example.explain().find( { a : 1, b : 2 } )
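
Explain output is a nested document, so it can help to boil it down to the few values mentioned above. The sketch below is plain JavaScript. The function name summarizeExplain is hypothetical, and sampleExplain is a hand-written, heavily trimmed stand-in rather than real server output. It assumes the plan is a simple chain linked through inputStage and does not handle plans that branch.

```javascript
// Sketch: pull the stage names, index names and counters out of an
// explain document.
function summarizeExplain(explainDoc) {
  const stages = [];
  const indexNames = [];
  let stage = explainDoc.queryPlanner.winningPlan;
  while (stage) {
    stages.push(stage.stage);
    if (stage.indexName) indexNames.push(stage.indexName);
    stage = stage.inputStage; // follow the chain down to the leaf stage
  }
  // executionStats is only present when execution statistics were requested.
  const stats = explainDoc.executionStats || {};
  return {
    stages,
    indexNames,
    nReturned: stats.nReturned,
    totalKeysExamined: stats.totalKeysExamined,
    totalDocsExamined: stats.totalDocsExamined,
  };
}

// Hand-written example, not copied from a server.
const sampleExplain = {
  queryPlanner: {
    winningPlan: {
      stage: "FETCH",
      inputStage: { stage: "IXSCAN", indexName: "a_1_b_1" },
    },
  },
  executionStats: { nReturned: 1, totalKeysExamined: 1, totalDocsExamined: 1 },
};

const explainSummary = summarizeExplain(sampleExplain);
// explainSummary.stages is ["FETCH", "IXSCAN"]
// explainSummary.indexNames is ["a_1_b_1"]
```

In the shell, a document of this shape can be obtained with db.example.find( { a : 1, b : 2 } ).explain("executionStats") and passed to the helper.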