Update 8/18/2014: Added section about refresh_interval, removed Mongoid 4 references
Many of the support requests we get at MongoLab are questions about how to properly configure and use particular MongoDB drivers.
This blog post is the first of a series where we plan to cover each of the major MongoDB drivers in depth. The driver we’ll be covering today is Mongoid, developed by Durran Jordan (@modetojoy).
In this post:
- A simple Mongoid example
- Production-ready connection settings
- Mongoid tips & tricks
  - Lower the refresh_interval option to quickly discover changes to a replica set node’s state
  - By default, Mongoid reads from secondaries
  - The timeout setting in Mongoid configuration is doubled
  - Adjust connection timeouts to mitigate errors and ensure proper failover handling
  - Use the no_timeout option on long operations
  - Be careful managing connections when using Sidekiq
  - Build the indexes required by DelayedJob Mongoid Backend
- You’re all set!
Mongoid is a Ruby Object Document Mapper (ODM). For relational folks, ODMs are the MongoDB equivalent of Object Relational Mappers (ORMs). One major reason developers use ODMs like Mongoid is that they make it possible to define a schema for documents, which can then be used to map documents to objects in the programming language. For Ruby on Rails developers used to working with ActiveRecord, this feature makes Mongoid an easy transition.
Mongoid, along with the lower-level Moped driver it is built on, is one of the few major language drivers not written by MongoDB, Inc. However, because it is the most popular Ruby driver among MongoLab users, we get a lot of support requests on how to configure and use it properly with MongoLab.
This post will help you understand how to configure and use Mongoid effectively in your MongoDB application.
Note: The official MongoDB Ruby driver backed by MongoDB, Inc. can be found here: http://docs.mongodb.org/ecosystem/drivers/ruby/#ruby-driver
A simple Mongoid example
Production-ready connection settings
We often see that users have problems connecting to MongoLab using the Mongoid driver. The root cause is almost always incorrect configuration of the driver, particularly around timeouts.
In Mongoid, a mongoid.yml file holds the driver configuration settings. The following are the MongoLab-recommended settings for version 3 of Mongoid:
Additional options can be found here.
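As a minimal sketch of what such a file might contain, the snippet below embeds a candidate mongoid.yml in Ruby and parses it with the standard library, so the values can be checked before deploying. It assumes the Mongoid 3 `sessions` layout; the URI is a placeholder, and the `max_retries` and `retry_interval` values are illustrative assumptions rather than settings quoted from this post.

```ruby
require "yaml"

# Candidate mongoid.yml for Mongoid 3 (a sketch, not an official template).
# The URI is a placeholder; max_retries and retry_interval are assumed values.
MONGOID_YML = <<~CONFIG
  production:
    sessions:
      default:
        uri: "mongodb://dbuser:dbpassword@host1:port1,host2:port2/dbname"
        options:
          consistency: :strong   # send reads to the primary
          max_retries: 30        # assumed value: attempts before giving up
          retry_interval: 1      # assumed value: seconds between attempts
          timeout: 15            # doubled by Mongoid 3, so 30 seconds in effect
          refresh_interval: 10   # seconds to cache node state (post suggests 5-10)
CONFIG

# Symbols such as :strong must be explicitly permitted when loading safely.
MONGOID_OPTIONS = YAML.safe_load(MONGOID_YML, permitted_classes: [Symbol])
                      .dig("production", "sessions", "default", "options")

MONGOID_OPTIONS.each { |name, value| puts "#{name}: #{value.inspect}" }
```

In an application the same YAML would live in config/mongoid.yml; the parsing step here only serves to confirm that the file is well-formed and carries the intended values.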
Mongoid tips & tricks
Lower the refresh_interval option to quickly discover changes to a replica set node’s state
A little-known Mongoid option is the refresh_interval setting. This option sets the number of seconds to cache information about a node – the default time is 300 seconds, or 5 minutes. You will likely want to set the refresh_interval to a lower value to better discover and handle any changes to a replica set node’s state. This option isn’t well-documented, but you can find it in the Mongoid source code. We recommend a value of 5-10 seconds.
By default, Mongoid reads from secondaries
Mongoid only supports two MongoDB Read Preference Modes, primary (consistency: :strong) and secondary (consistency: :eventual).
If you use Mongoid’s default consistency setting, Mongoid will direct reads to secondaries. This is often not desirable for a number of reasons. In particular, secondary reads can often result in:
- Reading stale data. Replication from the primary to the secondary is not instantaneous. Therefore, queries to a secondary may not reflect changes from preceding database operations and can return unexpected results. You must think about whether eventual consistency semantics are appropriate for your application before using secondary reads.
- Slow or failed reads during index builds. MongoDB only supports foreground index builds on secondaries, which block other operations while the indexes are building. If you initiate an index build on a running deployment, queries you send to the secondary will hang.
- Slow or failed reads during backups or snapshots. Most backup techniques involve locking or stopping a secondary MongoDB instance. If you are using secondary reads, queries you send to the secondary will hang.
- An overloaded primary, if a secondary node goes down or becomes unreachable. Directing reads to the secondary increases read throughput and reduces load on the primary. However, when a secondary node fails, all reads and writes are issued to the primary. These extra operations can significantly increase load and topple the database.
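The stale-read risk in the first point can be illustrated with a toy in-memory model. This is only a sketch: `ToyReplicaSet` and its methods are hypothetical names, and the model does not reflect MongoDB's actual replication protocol. It only shows why a read with :eventual consistency can lag behind a write that a :strong read already sees.

```ruby
# Toy in-memory model of asynchronous replication (hypothetical, for illustration).
class ToyReplicaSet
  def initialize
    @primary = {}
    @secondary = {}
    @pending = [] # writes accepted by the primary but not yet replicated
  end

  # Writes always go to the primary and are queued for replication.
  def write(key, value)
    @primary[key] = value
    @pending << [key, value]
  end

  # :strong reads from the primary, :eventual reads from the secondary,
  # mirroring the two consistency modes described above.
  def read(key, consistency:)
    store = consistency == :strong ? @primary : @secondary
    store[key]
  end

  # Apply queued writes to the secondary, as replication eventually would.
  def replicate!
    @pending.each { |key, value| @secondary[key] = value }
    @pending.clear
  end
end

set = ToyReplicaSet.new
set.write("plays", 1)
set.replicate!
set.write("plays", 2) # not yet replicated

puts set.read("plays", consistency: :strong)   # 2
puts set.read("plays", consistency: :eventual) # 1 (stale)
```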
The timeout setting in Mongoid configuration is doubled
In the mongoid.yml file example above with the timeout set to 15, the actual timeout value applied by Mongoid will be 30 seconds. You can find details at https://github.com/mongoid/mongoid/issues/3445.
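The arithmetic is simple, but it is easy to forget when choosing a value. The helpers below are a small sketch of it; the method names are hypothetical, and the factor of two is taken from the behavior described in the linked issue.

```ruby
# Factor by which Mongoid 3 multiplies the configured timeout (per the linked issue).
TIMEOUT_MULTIPLIER = 2

# Seconds actually applied for a given mongoid.yml timeout value.
def effective_timeout(configured_seconds)
  configured_seconds * TIMEOUT_MULTIPLIER
end

# Value to put in mongoid.yml to end up with the desired effective timeout.
def configured_timeout_for(desired_seconds)
  desired_seconds / TIMEOUT_MULTIPLIER.to_f
end

puts effective_timeout(15)      # 30
puts configured_timeout_for(30) # 15.0
```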
Adjust connection timeouts to mitigate errors and ensure proper failover handling
A common error that we see with the Mongoid driver is ConnectionFailure. For example:
Moped::Errors::ConnectionFailure: Could not connect to a primary node for replica set <Moped::Cluster nodes=[<Moped::Node resolved_address="ipAddress">, <Moped::Node resolved_address="ipAddress">]>
This error is thrown whenever Mongoid has difficulty making a connection. It’s normal to see this error thrown multiple times in succession as Mongoid will retry the connection creation based on your mongoid.yml settings – specifically the retry_interval.
In many cases the default timeout value of 5 seconds will be adequate. However, users running their app on a Platform-as-a-Service (PaaS) such as Heroku often find that the driver needs longer to establish a connection to the database, particularly when dynos are just starting up. For this reason we recommend allowing up to 30 seconds for connections to become established.
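To see why the error can appear several times in a row, it helps to picture a generic retry loop. The sketch below is not Moped's implementation: `SketchConnectionFailure` and `with_retries` are hypothetical names, and the parameters only borrow the option names `max_retries` and `retry_interval` to show how the two values together bound the time spent reconnecting.

```ruby
# Hypothetical stand-in for Moped::Errors::ConnectionFailure.
class SketchConnectionFailure < StandardError; end

# Generic retry loop (an illustration, not Moped's code). The block is tried
# once, then retried up to max_retries times, waiting retry_interval seconds
# between attempts. The sleeper is injectable so the loop can be exercised
# without actually waiting.
def with_retries(max_retries:, retry_interval:, sleeper: ->(seconds) { sleep(seconds) })
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue SketchConnectionFailure
    raise if attempts > max_retries # retries exhausted: surface the error
    sleeper.call(retry_interval)
    retry
  end
end

# Simulate a primary that becomes reachable on the fourth attempt.
waits = []
result = with_retries(max_retries: 30, retry_interval: 1,
                      sleeper: ->(seconds) { waits << seconds }) do |attempt|
  raise SketchConnectionFailure, "no primary yet" if attempt < 4
  :connected
end

puts result.inspect # :connected
puts waits.length   # 3
```

Under this model, the worst-case time spent waiting between attempts is max_retries multiplied by retry_interval, on top of whatever each individual attempt takes to time out.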
Use the no_timeout option on long operations
Most MongoDB drivers have separate settings for connection timeouts (how long to wait to establish a connection) and socket timeouts (how long to wait for a response from the database when issuing operations, like a query). Mongoid’s single timeout setting covers both, which can be problematic: you usually want a short, finite connection timeout, but you will often want to wait much longer (sometimes indefinitely) for query results.
With our recommended connection timeout setting of 30 seconds, you may find your driver prematurely giving up on queries that take longer than 30 seconds.
Luckily, Moped has a no_timeout option that can be used for these situations. Revisiting our simple example, the find query can be rewritten with no_timeout to ensure that the cursor will not time out.
Be careful managing connections when using Sidekiq
When using Sidekiq (for background processing jobs) in conjunction with Mongoid, it’s important to be aware that each job will prompt the creation of a new connection. At lower numbers this behavior is fine, but once you get into thousands of connections you need connection pooling, which Mongoid does not directly support. We’ve directed our users to this thread on the issue, which discusses the problems that arise and proposes a solution for reusing connections.
Build the indexes required by DelayedJob Mongoid Backend
The delayed_job_mongoid module issues a findAndModify command on the “delayed_backend_mongoid_jobs” collection and queries on the “run_at” field. As with all MongoDB queries, it’s important to make sure the right indexes are in place for optimal performance. Be sure to follow the delayed_job_mongoid installation instructions to create the required indexes.
You’re all set!
We hope this post helps shed light on some of Mongoid’s quirks. If you have any tips and tricks that you’d like to share, please post in the comments or write to us at firstname.lastname@example.org so we can pass on the knowledge. Be sure to check back when Mongoid 4 is released!