
Telemetry Series: Page Faults

A key component of optimizing application performance is tuning the performance of the database that supports it. Each post in our Telemetry series discusses an important metric used by developers and database administrators to tune the database and describes how MongoLab users can leverage Telemetry, MongoLab’s monitoring interface, to effectively review and take action on these metrics.

Page Faults

Databases are designed to work with data stored on disk, but they cache as much of that data as possible in RAM so that they access disk as infrequently as possible. However, because it is cost-prohibitive to hold in RAM all the data an application accesses, the database must eventually go to disk. And because disk is much slower than RAM, those accesses incur a significant time cost.

Effectively tuning a database deployment commonly involves assessing how often the database accesses disk with an eye towards reducing the need to do so. To that end, one of the best ways to analyze the RAM and disk needs of a MongoDB deployment is to focus on what are called Page Faults.

What is a Page Fault?

MongoDB manages documents and indexes in memory by using an OS facility called MMAP, which translates data files on disk to addresses in virtual memory. The database then accesses disk blocks as though it were accessing memory directly. Meanwhile, the operating system transparently keeps as much of the mapped data cached in RAM as possible, only going to disk to retrieve data when necessary.

When MMAP receives a request for a page that is not cached, a Page Fault occurs, indicating that the OS had to read the page from disk into memory.

What do Page Faults mean for my cluster?

The frequency of Page Faults indicates how often the OS goes to disk to read data. Operations that cause Page Faults are slower because they necessarily incur disk latency.

Page Faults are one of the most important metrics to look at when diagnosing poor database performance because they suggest the cluster does not have enough RAM for what you’re trying to do. Analyzing Page Faults will help you determine if you need more RAM, or need to use RAM more efficiently.

How does Telemetry help me interpret Page Faults?

Select a deployment and then look back through Telemetry over months or even years to determine the normal level of Page Faults. In instances where Page Faults deviate from that norm, check application and database logs for operations that could be responsible. If these deviations are transient and infrequent they may not pose a practical problem. However, if they are regular or otherwise impact application performance you may need to take action.

A burst in Page Faults corresponding to an increase in database activity.

If Page Faults are steady but you suspect they are too high, consider the ratio of Page Faults to Operations. If this ratio is high it could indicate unindexed queries or insufficient RAM. The definition of “high” varies across deployments and requires knowledge of the history of the deployment, but consider taking action if any of the following are true:

  • The ratio of Page Faults to Operations is greater than or equal to 1.
  • Effective Lock % is regularly above 15%.
  • Queues are regularly above 0.
  • The app seems sluggish.
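As a rough sketch of the first check above, the ratio can be computed from deltas of the counters that serverStatus reports (extra_info.page_faults and the opcounters totals). The helper function and the sample numbers below are invented for illustration:

```python
# Sketch: Page Faults-to-Operations ratio from two serverStatus samples.
# The counters mirror serverStatus fields (extra_info.page_faults,
# opcounters); the numbers themselves are made up for illustration.

def faults_per_op(faults_delta, ops_delta):
    """Page faults per operation over a sampling interval."""
    if ops_delta == 0:
        return 0.0
    return faults_delta / ops_delta

# Hypothetical deltas between two samples taken a minute apart:
# page_faults grew by 1200 while total opcounters grew by 1000.
ratio = faults_per_op(1200, 1000)
print("faults/op =", ratio)  # above 1, so consider more RAM or better indexes
```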

Note: Future Telemetry blog posts will cover additional metrics, such as Effective Lock % and Queues. See MongoDB’s serverStatus documentation for more information.

How do I reduce Page Faults?

How you reduce Page Faults depends on their source. There are three main reasons for excessive Page Faults.

  1. Not having enough RAM for the dataset. In this case, the solution is to add more RAM to the deployment by scaling either vertically to machines with more RAM, or horizontally by adding more shards to a sharded cluster.
  2. Inefficient use of RAM due to lack of appropriate indexes. The most inefficient queries are those that cause collection scans. When a collection scan occurs, the database is iterating over every document in a collection to identify the result set for a query. During the scan, the whole collection is read into RAM, where it is inspected by the query engine. Page Faults are generally acceptable when obtaining the actual results of a query, but collection scans cause Page Faults for documents that won’t be returned to the app. Worse, these unnecessary Page Faults are likely to evict “hot” data, resulting in even more Page Faults for subsequent queries.
  3. Inefficient use of RAM due to excess indexes. When the indexed fields of a document are updated, the indexes that include those fields must be updated. Likewise, when a document is moved on disk, every index that contains the document must be updated. All of these affected indexes must be brought into RAM to be updated, which, as above, can lead to memory thrashing.
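To make the cost of a collection scan concrete, here is a toy model in plain Python. It illustrates only the access pattern, not MongoDB internals; all names and data are invented:

```python
# Toy model: collection scan vs. indexed lookup.
# Illustrates the access pattern only -- not MongoDB internals.

docs = [{"_id": i, "user": "user%d" % (i % 100)} for i in range(10000)]

def collection_scan(docs, user):
    # Every one of the 10,000 documents is inspected (and, in a real
    # database, paged into RAM), even though only 100 match.
    return [d for d in docs if d["user"] == user]

# A simple "index": user value -> list of matching documents.
index = {}
for d in docs:
    index.setdefault(d["user"], []).append(d)

def indexed_lookup(index, user):
    # Only the matching documents are touched.
    return index.get(user, [])

assert collection_scan(docs, "user7") == indexed_lookup(index, "user7")
```

Both paths return the same 100 documents, but the scan touches all 10,000; in a real deployment each untouched-but-scanned document is a potential Page Fault that evicts hot data.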

Note: For assistance determining what indexes your deployment needs, MongoLab offers a Slow Query Analyzer that provides index recommendations to Shared and Dedicated plan users.

Have questions or feedback?

We’d love to hear from you as this Telemetry blog series continues. What topics would be most interesting to you? What types of performance problems have you struggled to diagnose?

Email us at support@mongolab.com to let us know your thoughts, or to get our help tuning your MongoLab deployment.

MongoDB version 3.0 now GA on MongoLab

We’re excited to announce that MongoDB 3.0 is now available on all MongoLab plans. Since the release of version 3.0 was announced in March, we’ve done extensive testing to ensure that it is production-ready for MongoLab users. For those looking to upgrade or create a new MongoDB 3.0 plan, you can do so through our self-service UI. Version 3.0 offers several valuable improvements, including collection-level locking; a new, more secure user authentication mechanism (SCRAM-SHA-1); and the WiredTiger storage engine. Each of these three improvements is described in detail below.

There are two important items to note, as you consider upgrading to version 3.0:

1) A driver upgrade may be required when upgrading your database to 3.0. You can find a matrix of 3.0-compatible drivers in the MongoDB 3.0 release notes.

2) Our release of support for version 3.0 comes with the default MMAPv1 storage engine. Support for the new WiredTiger storage engine will come later, most likely with the release of MongoDB 3.2, where it is expected to become the default storage engine for MongoDB. For more information about our support for storage engines, please read the section entitled “WiredTiger storage engine” below.

Collection-level locking

The default storage engine in MongoDB 3.0 is MMAPv1, which experienced MongoDB users may recognize as the same storage engine underlying previous versions of MongoDB. Although the name has stayed the same, MongoDB now offers collection-level locking; in prior versions of MongoDB, the database-level lock was the finest-grain lock.

How will this impact you?  In versions of MongoDB prior to 3.0, database-level locking would lock the entire database any time an operation that required the write lock (e.g. insert, update, delete) was issued. With collection-level locking, a write operation on one collection will not block the database from servicing reads and writes on other collections.

The effects of collection-level locks on your database deployment will vary depending on your data model, but generally you should see performance improvements, particularly in write-heavy workloads that target more than one collection.

SCRAM-SHA-1 authentication

In MongoDB 3.0, SCRAM-SHA-1 has now replaced MONGODB-CR as the default authentication mechanism. For the security buffs, MongoDB has written an interesting blog post that speaks to the advantages of SCRAM (short for “Salted Challenge Response Authentication Mechanism”). Two notable benefits include improved security against “malicious servers,” and heightened resistance to “replay attacks.”

Depending on your driver version, you may need to upgrade to a 3.0- (or SCRAM-) compatible driver. If you're unsure whether your current driver supports SCRAM, check MongoDB's release notes. Be sure to verify your driver version before you upgrade the database; otherwise your driver will start throwing errors and you will experience downtime!
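For reference, a SCRAM-capable driver can request the mechanism explicitly through a standard connection-string option. The user, password, host, and database in this example URI are placeholders:

```
mongodb://exampleUser:examplePass@example.host:27017/exampleDb?authMechanism=SCRAM-SHA-1
```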

WiredTiger storage engine

MongoDB 3.0 ships with two storage engines: the default MMAPv1 engine (with collection-level locking), and the new WiredTiger storage engine (with document-level locking). We're very excited about WiredTiger and have already begun testing it internally. We look forward to supporting WiredTiger for MongoLab production plans; it is expected to become the default storage engine in version 3.2.

Questions?

As you will discover, there are numerous changes and enhancements in 3.0. We recommend that you explore the full list of changes and improvements in the MongoDB 3.0 release notes. If you have any questions along the way, drop us a line at support@mongolab.com and we'd be happy to help! For example, if your MongoLab deployment experiences high write loads and you would like to discuss how best to leverage collection-level locking to enhance your performance, we'd love to talk it through with you.

MongoLab Telemetry supports custom MongoDB metric alerts

Telemetry Alerts

We’re excited to announce that you can now use MongoLab Telemetry to configure per-metric alerts for your MongoLab deployments! These custom alerts allow you to stay updated on your database’s performance even when you’re not actively working with the database.

For each metric in your Telemetry dashboard you may define custom threshold values and alerting methods (email, PagerDuty, etc.).

For a Quick-start Guide and full docs, visit our documentation on Telemetry Alerts.

SSL now in public beta on MongoLab

Our team at MongoLab is excited to announce the public beta of SSL-enabled* (Secure Sockets Layer) MongoDB connections on Dedicated deployments**. This feature adds an extra level of security by encrypting the communication between the application and database. It also allows clients to authenticate the identity of their database servers in order to mitigate spoofing attacks.

SSL on production-ready MongoDB

MongoLab has partnered with DigiCert, one of the most respected certificate issuers in the industry, to automate the certificate management process.

What’s unique about this solution is that we allow you to control the scope of your SSL certificates. You can issue unique certificates for each deployment or choose to share a single certificate amongst all of the SSL-enabled deployments in your account.

SSL domain scopes

We are currently offering two domain scopes: account and deployment. The narrower deployment scope maximizes security at increased cost, whereas the broader account scope balances security with cost. You can visit our documentation for more information on available domain scopes.

SSL configuration for clients (drivers)

For instructions on how to connect your driver to an SSL-enabled MongoDB deployment, you can visit the MongoLab documentation.
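As a general sketch, many drivers can enable SSL through a connection-string option. The URI below uses placeholder credentials and hostname, and your driver's exact syntax may differ; see the documentation linked above for the specifics:

```
mongodb://exampleUser:examplePass@example.host:27017/exampleDb?ssl=true
```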

Pricing and availability

SSL support is currently in beta on MongoLab. For more details on pricing and the beta status you can visit our SSL documentation. If you’re interested in trying out SSL on MongoLab, you can create a new Dedicated deployment or upgrade an existing Dedicated deployment.

*Currently, all deployments with SSL enabled use the preferSSL net.ssl.mode.

**We are only offering SSL on MongoDB 2.6.

Respondly explains why devs love Meteor and MongoDB

With the recent release of Meteor 1.0 (and the huge buzz around it), developers may be wondering why the Meteor framework is so popular. Or perhaps, for developers new to web programming, what is Meteor?

To help you better understand Meteor from a developer’s perspective, we’ve asked Tim Haines, founder of team email and twitter inbox service Respondly, to share his experience working with both Meteor and MongoDB.



Run SQL Queries on MongoLab

This is a guest post by John A. De Goes, CTO of SlamData. SlamData is the commercial company behind the open source project of the same name. John is the original author and an active contributor to the SlamData project.

SlamData is a relatively new open source project that lets you write and execute SQL queries against a MongoDB instance. We just launched 1.0 of the product, after many engineer-years of effort.

In this post, I'll talk a little bit about what SlamData is useful for, and how you can begin using SlamData with your MongoLab account.


Using Fluentd and MongoDB serverStatus for real-time metrics

As developers, we often look for tools to make our work and processes more efficient. Sometimes we have to search for what we’re looking for and sometimes we’re lucky enough that it finds us! When our friends over at Treasure Data wrote to me about Fluentd, an open-source logging daemon written in Ruby that they created and maintain, I immediately saw value for MongoDB users looking for a quick way to collect data streams and store information in MongoDB.

Intro to Fluentd

Fluentd is an open source data collector designed to simplify and scale log management. Open-sourced in October 2011, it has gained traction steadily over the last 2.5 years: today, Fluentd has a thriving community of ~50 contributors and 1,900+ stargazers on GitHub with companies like Slideshare and Nintendo deploying it across hundreds of machines in production.

Fluentd has broad use cases: Slideshare integrates it into their company-wide infrastructure monitoring system, and Change.org uses it to route their log streams into various backends.

Most relevant to MongoDB developers, many folks use Fluentd to aggregate logs into MongoDB. The MongoDB community was one of the first to take notice of Fluentd, and the MongoDB plugin is one of the most downloaded Fluentd plugins to date.
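As a sketch of that setup, the fluent-plugin-mongo output plugin routes matching log events into a MongoDB collection with a match block like the following. The host, database, collection, and credentials here are placeholders:

```
<match app.**>
  type mongo
  host example.host
  port 27017
  database fluentd_logs
  collection events
  user exampleUser
  password examplePass
  flush_interval 10s
</match>
```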

MongoDB and the new PHP on Heroku

Update 5/7/14 5:15PM: Post has been rewritten to reflect Heroku’s PHP Buildpack changes

Update 5/5/14 10:45PM: Heroku’s default PHP buildpack now allows Composer to install the MongoDB extension. You do not need to use the develop branch. To enable, add “ext-mongo” to the composer.json file as described in the Solution section.
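Concretely, that addition to composer.json looks like the following; the wildcard version constraint is just an example:

```json
{
  "require": {
    "ext-mongo": "*"
  }
}
```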

One of our close Platform-as-a-Service (PaaS) partners, Heroku, recently announced its official public beta for new PHP features. This announcement was met with much excitement from the developer community, and the MongoLab team looks forward to working with PHP developers on Heroku who want to power their apps with fully-managed MongoDB databases.

This post will cover how to install the PHP MongoDB extension on Heroku.
