Tutorial: Scaling Meteor with MongoDB oplog tailing

Ever since Meteor 0.7.0 first introduced oplog tailing, we’ve had a lot of users asking us about using the MongoDB oplog with their Meteor applications. As a result, we thought a step-by-step tutorial would help folks get started.

Meteor Oplog Tailing Overview

If you’re still feeling your way around the Meteor framework, you may not know about oplog tailing just yet. The Meteor team released an improvement to observeChanges that tails MongoDB’s oplog instead of repeatedly polling the database, which significantly reduces the number of queries needed to pick up the latest changes to your data.

The “local” database holds the MongoDB oplog

MongoDB’s operations log keeps a rolling record of all inserts, updates and deletes. This log is stored in a special database called “local” which exists on each member of a replica set and does not replicate.

From MongoDB’s documentation on the oplog: “MongoDB applies database operations on the primary and then records the operations on the primary’s oplog. The secondary members then copy and apply these operations in an asynchronous process. All replica set members contain a copy of the oplog.”
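To give a sense of what Meteor is reading, an insert into a “players” collection produces an oplog entry shaped roughly like this (the values below are illustrative, not from a real deployment):

{
  "ts": Timestamp(1404173716, 1),            // when the operation happened
  "h": NumberLong("2454075392461898000"),    // unique operation id
  "v": 2,
  "op": "i",                                 // "i" = insert, "u" = update, "d" = delete
  "ns": "mydb.players",                      // database.collection the op applies to
  "o": { "_id": ObjectId("53b2f3e1e4b0a1b2c3d4e5f6"), "name": "Ada", "score": 5 }
}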

The Meteor framework smartly uses the oplog to keep track of changes to your data, minimizing the number of queries it needs to run against the database.

Meteor Oplog Tailing Tutorial

We’ve hacked up our own example using Meteor’s oplog tailing with a database hosted on MongoLab (don’t worry, you can still use this tutorial even if you’re not using MongoLab). Here’s a link to this example project’s repo on GitHub.

To give you an idea of what we’re trying to accomplish, we’ll be setting up a simple Meteor app that displays a list of players and their scores. The app will also have an input form for inserting new players and scores to show real-time updates in the client view.

When our project is complete, we’ll observe real-time metrics from two otherwise identical apps, one with oplog tailing configured and one without. The output should look like the following:

Meteor serverFacts output: oplog tailing enabled on the left, no oplog tailing on the right.

Pre-requisites

In this tutorial we’ll assume that you have:

  • a MongoDB database (with access to the “local” database)
    • your own MongoDB deployment OR
    • any for-pay subscription with MongoLab (starts at $15/mo)
  • Node.js installed
  • Meteor installed

Set up the project

First, create a Meteor project.

> meteor create app

Then navigate into your app to start modifying the project files.

> cd app

Now we’ll remove the “autopublish” package, which is not recommended for production use.

> meteor remove autopublish

We’ll also want to add the “facts” package, which reports real-time information about our Meteor server. This will help us determine whether oplog tailing is working.

> meteor add facts

Configure the view

First we’ll define a view to display relevant content to the client. We’ll edit our app.html file to look like the following:
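(The full markup lives in the example repo; below is a minimal, illustrative sketch. The page title and input names are placeholders of our own, not necessarily what the original project uses.)

<head>
  <title>Oplog tailing demo</title>
</head>

<body>
  {{> players}}
  {{> form}}
  {{> serverFacts}}
</body>

<template name="players">
  <h2>Players</h2>
  <ul>
    {{#each players}}
      <li>{{name}}: {{score}}</li>
    {{/each}}
  </ul>
</template>

<template name="form">
  <h2>Add a player</h2>
  <form>
    <input type="text" name="playerName" placeholder="Name">
    <input type="text" name="playerScore" placeholder="Score">
    <input type="submit" value="Add player">
  </form>
</template>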

You’ll notice that the file contains conventional HTML code and Meteor’s templating language, Spacebars (inspired by Handlebars). In the body, we’ll lay out what we want the user to see: a list of players with their scores, a form to add new players, and “serverFacts” metrics.

The “serverFacts” template is a report auto-generated by Meteor that contains real-time information for the current server. This is how we will later verify that our application has oplog tailing enabled.

Now that we have the templates in place, we need to configure them. We’ll configure the “players” template to iterate over all the players and list their names and scores. The “form” template is straightforward as well – configure it as you would an HTML form.

Create the model

Now we’ll create a new file, models.js, in our project directory to specify a collection to store our documents.
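A single line is all models.js needs (Meteor.Collection was the constructor name in the Meteor versions this tutorial targets; newer releases call it Mongo.Collection):

// models.js: loaded on both the client and the server
Players = new Meteor.Collection("players");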

Our Players model maps to a “players” collection in your MongoDB database. You don’t need to create this collection ahead of time, as MongoDB will lazily create it for you (if it doesn’t already exist) once you insert a document.

Link together the client, server, and MongoDB

In order for the client (browser) view to display all the players, we need to query for them on our database and pass the cursor to the client. We’ll replace the contents of the default app.js file with the following:
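(As before, the complete file is in the example repo; the sketch below is a minimal version that uses our placeholder field names. Facts.setUserIdFilter, from the “facts” package, is what we use here to expose the server metrics to every connected client.)

if (Meteor.isServer) {
  // Publish the "playerData" record set. Returning a cursor lets Meteor
  // run observeChanges on it and push any changes down to subscribers.
  Meteor.publish("playerData", function () {
    return Players.find();
  });

  // Allow every client to see the metrics gathered by the "facts" package.
  Facts.setUserIdFilter(function () {
    return true;
  });
}

if (Meteor.isClient) {
  // Ask the server for the "playerData" record set published above.
  Meteor.subscribe("playerData");

  // Template helper: hands the "players" template a cursor to iterate over.
  Template.players.helpers({
    players: function () {
      return Players.find({}, { sort: { score: -1 } });
    }
  });

  // Insert a new player when the form is submitted.
  // A real app would validate this input first.
  Template.form.events({
    "submit form": function (event) {
      event.preventDefault();
      Players.insert({
        name: event.target.playerName.value,
        score: Number(event.target.playerScore.value)
      });
      event.target.reset();
    }
  });
}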

Starting from the top, we have our server side code. This uses Meteor’s publish method to link together the client and server; we’ll cover the intricacies of the publish method in the next section. We also set up the code needed to publish server metrics to the “serverFacts” template that we created back in our app.html file.

We then have our client side code. Similar to the publish method in the server side code, here we use Meteor’s subscribe method, which we’ll also cover in the next section.

Next we create a template helper function that queries on the Players model and passes a cursor to the client (browser) view. This allows the “players” template from our view (app.html file) to iterate through all the Players that are returned and display them in the view.

Finally we create a handler for the “form” template that fires when the form is submitted. Typically you’d also put your data validation code here, but for simplicity our example inserts the data directly into the database.

Explore Meteor’s publish and subscribe methods

In order to fully understand what Meteor’s publish and subscribe methods do, it’s important to note that unlike applications built on other frameworks (Rails, Django, etc.), Meteor applications live on both the server and the client. This architecture allows Meteor to send raw data to the client (“data on the wire”) and access that data instantly without waiting for a round trip to the server. Meteor ensures that the client and server data (that you specify) stay in sync.

In the server code we copied above, we use Meteor’s publish method to help implement oplog tailing. From the Meteor wiki: “Whenever a cursor is returned from the publish function, Meteor calls observeChanges on the cursor and provides it with callbacks which publish the query’s changes to clients.” This means that Meteor continues to watch the published query and calls a callback function when results change.

Previously, Meteor’s only strategy for implementing observeChanges was to re-run the query frequently and calculate the difference between each set of results. With the introduction of the OplogObserveDriver class, Meteor can now read changes from the oplog.

Back to our example: we publish a record set with the name “playerData”. Once you’ve published the query to the “playerData” record set, you need to listen, or subscribe, to that record set on the client side. The subscribe method in our client code tells the server to send this particular set of records to the client, where it is stored in the client-side database called Minimongo.

Now when the data in our MongoDB changes for our players query, our view should reflect those changes in real-time and with minimal cost on the database.

Set up the app

We’re almost there! Next, you’ll need to bundle and then extract your project to set up the application.

> meteor bundle app.tgz

> tar -zxvf app.tgz

Create a “local” database user

You’ll need to create a user for your local database so that your Meteor app can access the oplog. You will need to give the Meteor app your user credentials when you run the application. We recommend creating a user with read-only access.
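If you manage your own MongoDB deployment, the 2.4-era mongo shell syntax for this looks roughly like the following (the username and password are placeholders; on newer MongoDB versions the usual approach is instead a user in the admin database with a read role on local, created with db.createUser):

> use local
> db.addUser({ user: "oplogger", pwd: "<password>", roles: [ "read" ] })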

If you’re using MongoLab, you can visit our docs to find instructions on creating a “local” database user.

Once you’re done, be sure to copy the MongoDB URI for your “local” database for the next step.

Run the app

Once you’re set with credentials, you can run your application with the following command:

> PORT=3000 MONGO_URL=<your_uri> MONGO_OPLOG_URL=<your_local_uri> node bundle/main.js

The MONGO_URL points to the MongoDB database that your application reads and writes to, whereas the MONGO_OPLOG_URL should point to your “local” database (which contains the oplog).
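For example, with a MongoLab deployment the two URIs typically differ only in the credentials and the database name at the end (the host, port, and credentials below are made up):

> PORT=3000 MONGO_URL=mongodb://dbuser:dbpass@ds012345.mongolab.com:12345/myappdb MONGO_OPLOG_URL=mongodb://oploguser:oplogpass@ds012345.mongolab.com:12345/local node bundle/main.js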

See the difference in real-time!

Once your application is running, you can verify whether oplog tailing is working. Again, we highly recommend reading the Meteor wiki on the Oplog Observe Driver so you understand the underlying details.

To check if oplog tailing is enabled, you’ll want to verify that the observe-drivers-oplog metric is rendered and the observe-drivers-polling metric is at 0 or not rendered at all. This difference is subtle, but very important!

For extra fun, clone your app and run multiple copies at once to see real-time updates between concurrent clients. We recommend running one copy of your app with oplog tailing and one without. To run the second copy without oplog tailing, simply leave out the MONGO_OPLOG_URL variable and give it its own port:

> PORT=3001 MONGO_URL=<your_uri> node bundle/main.js

Get the most out of your MongoDB on Meteor

In addition to this tutorial, we highly recommend watching David Glasser’s Devshop 10 talk on oplog tailing. He clearly articulates the background story, problem and solution and provides excellent visual examples.

We hope this tutorial helps you leverage all the tools at your disposal so that you can get the most out of your MongoDB on Meteor. We’re excited to see what you hack up!

Reporting back from MongoDB World 2014, NYC, Planet JSON

Closely approaching the one year mark of when I first joined MongoLab (and the MongoDB community), I had the pleasure of attending the inaugural MongoDB World conference put together by the incredible MongoDB team. Second only to the excitement around major MongoDB feature announcements was the collective disbelief that this was MongoDB’s first multi-day conference ever. A big congratulations to all those that worked hard to put on such a massive (did you see the Intrepid!?) event. All this planning would have been for naught if MongoDB leaders and engineers failed to deliver announcements and features that would meet and exceed expectations. From major public cloud announcements to the reveal of document-level locking in version 2.8, developers and conference goers had plenty to be excited about. There was a lot to digest from the conference… we’ll cover the major highlights in case you missed them.


Production-ready MongoDB Replica Sets on Google Cloud Platform

Great news, Google Cloud users!

Today Google, MongoDB Inc., and MongoLab are announcing the arrival of fully-managed, production-ready MongoDB replica set plans on the Google Cloud Platform (GCP). These plans are hosted on Google Compute Engine (GCE) and managed by MongoLab. You can get started for free!

By leveraging MongoLab’s MongoDB-as-a-Service platform on GCP, Google developers running MongoDB can focus on product development and not get bogged down by database administration and operations. Automated provisioning, multi-zone data replication, backups and monitoring are all provided by the platform, so the only things developers need to worry about are their schema and their code (ok, we can help you a little with that too).


Using Fluentd and MongoDB serverStatus for real-time metrics

As developers, we often look for tools to make our work and processes more efficient. Sometimes we have to search for what we’re looking for and sometimes we’re lucky enough that it finds us! When our friends over at Treasure Data wrote to me about Fluentd, an open-source logging daemon written in Ruby that they created and maintain, I immediately saw value for MongoDB users looking for a quick way to collect data streams and store information in MongoDB.

Intro to Fluentd

Fluentd is an open source data collector designed to simplify and scale log management. Open-sourced in October 2011, it has gained traction steadily over the last 2.5 years: today, Fluentd has a thriving community of ~50 contributors and 1,900+ stargazers on GitHub with companies like Slideshare and Nintendo deploying it across hundreds of machines in production.

Fluentd has broad use cases: Slideshare integrates it into their company-wide infrastructure monitoring system, and Change.org uses it to route their log streams into various backends.

Most relevant to MongoDB developers, many folks use Fluentd to aggregate logs into MongoDB. The MongoDB community was one of the first to take notice of Fluentd, and the MongoDB plugin is one of the most downloaded Fluentd plugins to date.

MongoDB driver tips & tricks: PHP

A large proportion of support requests to MongoLab are questions about how to properly configure and use a particular MongoDB driver.

This blog post is the third of a series where we are covering each of the major MongoDB drivers in depth. The driver we’ll be covering here is the PHP driver, developed and maintained by the MongoDB, Inc. team (primarily @derickr, @bjori and @jmikola).


MongoDB and the new PHP on Heroku

Update 5/7/14 5:15PM: Post has been rewritten to reflect Heroku’s PHP Buildpack changes

Update 5/5/14 10:45PM: Heroku’s default PHP buildpack now allows Composer to install the MongoDB extension. You do not need to use the develop branch. To enable, add “ext-mongo” to the composer.json file as described in the Solution section.
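For reference, declaring the extension requirement in composer.json looks like this (the "*" constraint is just an example; merge the entry into your existing require block):

{
  "require": {
    "ext-mongo": "*"
  }
}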

One of our close Platform-as-a-Service (PaaS) partners, Heroku, recently announced its official public beta for new PHP features. This announcement was met with much excitement from the developer community, and the MongoLab team looks forward to working with PHP developers on Heroku who want to power their apps with fully-managed MongoDB databases.

This post will cover how to install the PHP MongoDB extension on Heroku.


Analyze MongoLab Data with Hadoop in Mortar

The following is a guest post by Doug Daniels, CTO of Mortar Data Inc.

Today, we’re excited to announce integration between MongoLab and Mortar, the Hadoop platform for high-scale data science. If you have one of the 100,000+ databases at MongoLab, you can now seamlessly use Hadoop to:

  • Run advanced algorithms (like recommendation engines)
  • Build reports that run quickly in parallel against large collections
  • Join multiple collections (and outside data) together for analysis
  • Store results to Google Drive, back to MongoLab, or many other destinations

In this article we’ll show you how to connect your MongoLab database to Hadoop, and then use Hadoop to do something simple but very useful: gather schema information from an entire collection, including histograms of common values, data types, and more. Mortar handles all deployment, monitoring and cluster management, so no prior knowledge of Hadoop is required.

Announcing New MongoDB Instances on Microsoft Azure

The following is a guest blog post by Brian Benz, Senior Technical Evangelist at Microsoft Open Technologies, Inc.

Since the previous release of production-ready MongoLab plans on Azure, we’ve seen demand increase significantly. The MongoLab and Microsoft teams have been working together to develop a solution for your growing requirements and are excited to announce the arrival of our newest high-memory MongoDB database plans, with virtual machine choices that now provide up to 56GB of RAM per node, available in all eight Azure datacenters worldwide.
