Archive | August, 2011

Optimize slow MongoDB queries using MongoLab

You’ve launched your application, driven some traffic to it (maybe it’s even “gone viral”!), accumulated a meaningful volume of data and now it feels like some of your queries have become sluggish. What now? Time for query profiling!  MongoLab can help.

Step 1: Turn on the profiler

To turn on the query profiler log in to your MongoLab account and navigate to your database’s home screen. From there select the “Tools” tab and the “commands” sub-tab. Then select the “profile (log slow)” command.

The default threshold for slow queries is 100ms. To change this threshold, pass a different value for “slowms”:

{
   "profile" : 1,
   "slowms" : 50
}

Step 2: Examine the ‘system.profile’ collection

The profiler deposits all profile data into a collection named ‘system.profile’. After running your application for a bit (and hopefully accessing the code paths that you believe might be executing slow queries), go into the ‘system.profile’ collection and take a look.

The resulting documents represent operations taking more than 100ms (or whatever you set the threshold to be). Each profile entry has the following form:

{
   "ts" : <timestamp>,
   "info" : <detailed-info-on-the-operation>,
   "millis" : <how-long-it-took>,
}

Step 3: Diagnose and take action

If your slow operations are queries, the most likely problem is that you are doing needless “collection scans” on large collections. This means that the database is iterating through each document in the collection to compute the query result instead of using an index.

You can identify queries doing collection scans by examining the “info” field of the profile entries. If the ‘nscanned’ (number of documents scanned) value is significantly greater than the ‘nreturned’ (number of documents returned) value, you should consider creating an index to optimize the query. You can see an example of this in the second profile entry in the screenshot above.

The next step in the diagnosis would be to look at the query itself. Is it filtering on a field that could be indexed? If that field has a decent range of values (vs. say two values, like with booleans) an index might help a great deal. Is the query performing a sort on an unindexed field? If so, indexing that field is a good idea. Finally, consider compound indexes if you are filtering and sorting by different fields in the same query. You can add indexes for a collection in MongoLab by going to the “Indexes” tab on any collection home screen. See MongoDB’s documentation on indexes here.

Of course, all of this applies to any operation with a query or sort component. This can include updates, deletes, and the use of findAndModify.

Other possible reasons why your queries are slow:

  • You not limiting a large result set (i.e. you are shipping a million records back to your app instead of using skip() and limit() to page through the results)
  • Your documents are simply enormous. Consider masking some of the fields so that only a subset of fields are returned.
  • You are executing a count() on a query that filters by one or more fields that have not been indexed
  • You are using server-side javascript execution (can be slow)

Step 4: Turn off the profiler

You don’t want to leave the profiler on if you are not using it. While not huge, there is some overhead to leaving it on (i.e. it can slow things down a tad). To turn it off simply go back to the same screen you used to turn in on in Step 1, select the “profile (turn off)” command from the menu, and run the command.

Further reading

{ "comments": 16 }

MongoDB: it’s not just about big data

MongoDB has a great story around big data and horizontal scalability, but it is also a great general-purpose database and often the right choice even when you don¹t have a lot of data. I gave a talk at a local San Francisco NoSQL meetup about this very topic.  The talk received a ton of good feedback – maybe I’ll give it again!

Here are the slides:
http://www.slideshare.net/willshulman/mongodb-its-not-just-about-big-data

Enjoy!

{ "comments": 4 }

SF Startup Crawl this Thursday August 11 starting at 5 pm!

Great opportunity to meet our team and hang out with many MongoDB-powered startups this Thursday evening!

We’ll be hosting a table at the first stop along with 10gen (MongoDB), RocketSpace and Zaarly… if you don’t catch us there we’ll see you on the rest of the crawl!

Join us at 5 pm on Thursday August 11:
181 Fremont St 94105
(between Howard & Mission)

Register here:
http://startupcrawl.zaarly.com/

You’ll be emailed if you get on the list with more details and a schedule on Tuesday.  Can’t wait!

{ "comments": 5 }