<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>MongoLab: MongoDB-as-Service</title>
	<atom:link href="http://blog.mongolab.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.mongolab.com</link>
	<description>Blog for MongoDB cloud hosting service MongoLab</description>
	<lastBuildDate>Wed, 15 May 2013 00:36:37 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>MongoLab now supports Google Cloud Platform!</title>
		<link>http://blog.mongolab.com/2013/05/mongolab-now-supports-google-cloud-platform/</link>
		<comments>http://blog.mongolab.com/2013/05/mongolab-now-supports-google-cloud-platform/#comments</comments>
		<pubDate>Wed, 15 May 2013 00:36:21 +0000</pubDate>
		<dc:creator>will</dc:creator>
				<category><![CDATA[general]]></category>
		<category><![CDATA[partnerships]]></category>

		<guid isPermaLink="false">http://blog.mongolab.com/?p=2052</guid>
		<description><![CDATA[This week at Google I/O we are launching support for MongoLab&#8216;s fifth cloud provider &#8211; Google Cloud Platform. You can now use MongoLab to provision and manage MongoDB deployments on Google Compute Engine (GCE)! So far we are very impressed with the capabilities of the GCE infrastructure.  In particular: The network is fast. I mean [...]]]></description>
				<content:encoded><![CDATA[<p dir="ltr"><a href="https://cloud.google.com/"><img class="alignnone size-full wp-image-2057" style="border-color: #ddd;" alt="01-digital_google_cloud_platform_logo_lockup-03" src="http://blog.mongolab.com/wp-content/uploads/2013/05/01-digital_google_cloud_platform_logo_lockup-03.png" width="371" height="95" /></a></p>
<p dir="ltr">This week at <a href="https://developers.google.com/events/io/" target="_blank">Google I/O </a>we are launching support for <a href="http://mongolab.com">MongoLab</a>&#8216;s fifth cloud provider &#8211; <a href="https://cloud.google.com/" target="_blank">Google Cloud Platform</a>. You can now use MongoLab to provision and manage MongoDB deployments on <a href="https://cloud.google.com/products/compute-engine" target="_blank">Google Compute Engine</a> (GCE)!</p>
<p>So far we are very impressed with the capabilities of the GCE infrastructure.  In particular:</p>
<ul>
<li dir="ltr">
<p dir="ltr">The network is fast. I mean really fast. Some of the <a href="http://bit.ly/YDwq2n" target="_blank">bandwidth and latency benchmark scores</a> are astounding. Since I/O is king for databases, this will be great for connecting your GCE-hosted application to a MongoDB instance hosted by MongoLab.</p>
</li>
<li dir="ltr">
<p dir="ltr">GCE has a global private network connecting GCE regions across the world. This will be great for global multi-region clusters. We don&#8217;t support this quite yet, but when we do, GCE will provide a high-speed private backbone upon which to build a great solution.</p>
</li>
<li dir="ltr">
<p dir="ltr">The API is clean, and VMs spin up fast. This is key for automation, and we like to automate.</p>
</li>
</ul>
<p dir="ltr">For now we are in an early access beta, supporting only our free Sandbox database plans in GCE’s us-central1 region. We will be launching support for the rest of our product line in subsequent releases.</p>
<p dir="ltr">We will have a <a href="https://developers.google.com/events/io/developer-sandbox#t-google-cloud-platform" target="_blank">Developer Sandbox</a> (a.k.a. &#8220;booth&#8221;) at the conference on Friday, May 17th. If you are at Google I/O and into MongoDB, come visit us!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mongolab.com/2013/05/mongolab-now-supports-google-cloud-platform/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>MongoSF 2013 : scaling the hyperbola of evolution with MongoDB</title>
		<link>http://blog.mongolab.com/2013/05/mongosf-2013-scaling-the-hyperbola-of-evolution-with-mongodb/</link>
		<comments>http://blog.mongolab.com/2013/05/mongosf-2013-scaling-the-hyperbola-of-evolution-with-mongodb/#comments</comments>
		<pubDate>Mon, 13 May 2013 17:44:15 +0000</pubDate>
		<dc:creator>dampier</dc:creator>
				<category><![CDATA[events]]></category>
		<category><![CDATA[general]]></category>
		<category><![CDATA[MongoDB]]></category>
		<category><![CDATA[open source]]></category>

		<guid isPermaLink="false">http://blog.mongolab.com/?p=1912</guid>
		<description><![CDATA[You know, I attend a fair number of MongoDB events, and frankly I keep expecting them to get stale. But after being at MongoSF this past Friday, I&#8217;m happy to say it hasn&#8217;t happened yet. The growth and vigor of the Mongo ecosystem was everywhere apparent, and it has never been more encouraging. Our sincere thanks go out [...]]]></description>
				<content:encoded><![CDATA[<div id="attachment_1972" class="wp-caption alignleft" style="width: 176px"><a href="http://www.sfpalace.com" target="_blank"><img class=" wp-image-1972 " alt="Palace Hotel lobby, c. 1930" src="http://blog.mongolab.com/wp-content/uploads/2013/05/history4_lg-237x300.jpg" width="166" height="210" /></a><p class="wp-caption-text">Palace Hotel, c. 1930</p></div>
<p>You know, I attend a fair number of MongoDB events, and frankly I keep expecting them to get stale. But after being at <a title="mongoSF 2013" href="http://www.10gen.com/events/mongodb-san-francisco-2013" target="_blank">MongoSF</a> this past Friday, I&#8217;m happy to say it hasn&#8217;t happened yet. The growth and vigor of the Mongo ecosystem was everywhere apparent, and it has never been more encouraging. Our sincere thanks go out to the 10gen team for putting together another fabulous and informative event.</p>
<p>If you were there and managed to stop by <a href="https://twitter.com/mongolab/status/332978181734817794" target="_blank">MongoLab&#8217;s table</a> in the exhibit hall of the super-elegant <a href="http://www.sfpalace.com/" target="_blank">Palace Hotel</a>, then thanks! It was nice to meet you and/or see you again! Hope you got as much out of the day as we did. If you didn&#8217;t — or if you&#8217;d just like my personal take on the whole thing — well, please read on.</p>
<h1>Ecosystem predicts viability</h1>
<p>Setting aside any of the relative merits of MongoDB as a database for just a moment, I have to say my top takeaway continues to be amazement at the size and enthusiasm of the community around MongoDB.</p>
<blockquote><p>[I]f an organism or aggregate of organisms sets to work with a focus on its own survival and thinks that that is the way to select its adaptive moves, its “progress” ends up with a destroyed environment. If the organism ends up destroying its environment, it has in fact destroyed itself. &#8230; The unit of survival is a flexible organism-in-its-environment. <a href="#footnote1" name="ref1">[1]</a></p></blockquote>
<p>History is littered with the scarcely recognizable fossils of good ideas, clever inventions, and even superior products that might have flourished save for one thing: <em>adoption</em>. The modern proving ground for technological species looks less and less like the traditional &#8220;marketplace&#8221; with pockets of asymmetric information and discrete &#8220;deals.&#8221;  Today, the landscape has evolved to include open-source transparency, synergies of technologies and ideas, and a globally interconnected (and often, informed) fabric of opinion. A vigorous ongoing conversation (and overlap!) among diversified populations of users and developers is now the surest predictor, I believe, of long-term survival.</p>

<p>So, more than the database technology (which is impressive) or the well-capitalized company devoted to developing it (which is formidable), it is the <em>people</em> and the strength of this community that inspire my confidence that MongoDB will continue to thrive, improving and growing in popularity as a viable or even preëminent database for an ever increasing number of applications.</p>
<h1>MongoDB: the Next Generation</h1>
<p>Eliot Horowitz, 10gen CTO &amp; Co-founder, kicked things off on a strong note, clearly articulating his focus for the immediate future of MongoDB. In my opinion, these are exactly the right priorities for taking the platform to the next level:</p>
<ul>
<li>Maturity</li>
<li>Innovation</li>
<li>Operations</li>
</ul>
<p>If you peer into its internals today, you&#8217;ll see the evolutionary legacy of MongoDB: steadily improving and expanding functionality, accreted around a core of pragmatic and sometimes downright scrappy engineering — just what you might expect from a small, clever team with a product rapidly establishing itself in the marketplace. But many of the expedients that accelerate a large piece of software in the short term can eventually bog down development and become obstacles to its further progress. You want a larger team to be able to add and maintain a growing number of features, without commensurate increases in code complexity. At some point, once experience has shown where the grain boundaries lie, there comes a time to refactor (<a title="things you should never do" href="http://www.joelonsoftware.com/articles/fog0000000069.html" target="_blank">not</a> <a title="second-system effect" href="http://c2.com/cgi/wiki?SecondSystemEffect" target="_blank">reinvent!</a>) the core, teasing out clear and minimal abstraction contracts that the new implementations of existing and future features can target.</p>
<p>This engineering story arc is not lost on Eliot. Cleaner factoring, he explains, will be a key enabler to efficiently deliver capabilities that MongoDB has needed for a long time, to make it a more &#8220;mature,&#8221; fully-featured general purpose database. It will also form the groundwork for innovating and building on the strengths of MongoDB as a data substrate for modern applications. Specific examples Eliot mentioned included:</p>
<ul>
<li>non-constant query constraints — <em>e.g.</em>, find all documents where the values of fields &#8220;<tt>a</tt>&#8221; and &#8220;<tt>b</tt>&#8221; are equal.</li>
<li>inline aggregation operations — <em>e.g.</em>, update each document to set its &#8220;<tt>total</tt>&#8221; field to the sum of the &#8220;<tt>dollarAmt</tt>&#8221; field of each element of its &#8220;<tt>lineItems</tt>&#8221; array.</li>
<li>index intersections — <em>e.g.</em>, optimize a query like <tt>{a: 3, b: 6}</tt> by dynamically combining an index on &#8220;<tt>a</tt>&#8221; with an index on &#8220;<tt>b</tt>&#8221; to yield performance comparable to what today would require an explicit compound index comprising both fields.</li>
</ul>
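<p>To make the index-intersection idea concrete, here is a toy sketch in plain Python. It is not how MongoDB implements anything; the helper names (<tt>build_index</tt>, <tt>find_by_intersection</tt>) and the sample documents are made up for illustration. Each single-field index maps a value to the set of documents holding it, and the query <tt>{a: 3, b: 6}</tt> is answered by intersecting the two sets.</p>

```python
from collections import defaultdict


def build_index(docs, field):
    """Map each value of `field` to the set of positions of documents holding it."""
    index = defaultdict(set)
    for pos, doc in enumerate(docs):
        if field in doc:
            index[doc[field]].add(pos)
    return index


def find_by_intersection(docs, index_a, index_b, a_value, b_value):
    """Answer {a: a_value, b: b_value} by intersecting two single-field indexes."""
    candidates = index_a.get(a_value, set()).intersection(index_b.get(b_value, set()))
    return [docs[pos] for pos in sorted(candidates)]


# Hypothetical sample data, only for the demonstration.
docs = [
    {"a": 3, "b": 6},
    {"a": 3, "b": 7},
    {"a": 4, "b": 6},
    {"a": 3, "b": 6, "note": "second match"},
]
index_a = build_index(docs, "a")
index_b = build_index(docs, "b")
matches = find_by_intersection(docs, index_a, index_b, 3, 6)
print(matches)
```

<p>Only the documents present in both index entries survive, which is the effect a compound index on both fields would otherwise be needed for.</p>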
<p>So that&#8217;s the broad story around Maturity and Innovation — right on. What about the third item: Operations? This of course refers to the realities of keeping a database running and available behind a production system of any kind. Happily, there is another three-item list here:</p>
<ul>
<li>Monitoring</li>
<li>Backups</li>
<li>Management</li>
</ul>
<p>Eliot spoke to 10gen&#8217;s efforts on each of these facets: MMS, which became available some 18 months ago; the remote backup service, which is in Limited Release now; and a suite of management tools to be announced later this year.</p>
<p>Of course, the topic of production-class operations is near to our hearts: seamlessly handling these three facets for our customers is what <a href="http://mongolab.com" target="_blank">MongoLab</a> is all about!</p>
<h1>You got your lagerstätten in my Burgess Shale!</h1>
<div class="wp-caption alignright" style="width: 260px"><a href="http://en.wikipedia.org/wiki/Cambrian_explosion"><img class=" " alt="Opabinia" src="http://upload.wikimedia.org/wikipedia/commons/thumb/a/a2/Opabinia_BW2.jpg/250px-Opabinia_BW2.jpg" width="250" height="188" /></a><p class="wp-caption-text">Opabinia, c. 505,000,000 BC</p></div>
<p>Max Schireson, the 10gen CEO who claims to have been <a title="Codd on the line (1970)" href="http://www.seas.upenn.edu/~zives/03f/cis550/codd.pdf" target="_blank">born the same year as the relational database</a>, followed up with a pointedly evolutionary perspective on database technologies. He compared today&#8217;s landscape to the early part of the <a href="http://en.wikipedia.org/wiki/Cambrian_explosion" target="_blank">Cambrian Explosion</a>, in which biodiversity increased by orders of magnitude in a small fraction of the total history of life on earth up to that point. Of course, the unstated implication was that hitherto more &#8220;established&#8221; databases (Oracle, MySQL) were the long-dominant single-celled organisms in this analogy, whereas MongoDB would be perhaps more like a <a href="http://en.wikipedia.org/wiki/Opabinia" target="_blank">sighted predator of some kind</a>.</p>
<p>Schireson quoted some consumption figures from the top of the food chain (<em>e.g.</em>, 3 of the top 10 global investment banks use MongoDB) and noted some recent shifts in environmental pressures (<em>e.g.</em>, developer-driven decision making).  He also cited an amusing factoid: prior to this year&#8217;s report, the last <a href="http://www.gartner.com/technology/research.jsp" target="_blank">Gartner Research</a> update on database technology came out in 2003. That&#8217;s right: <em>a full ten years ago</em>. Something new must be going on. (Can you guess what?)</p>
<p>In short, Schireson made it sound like a pretty exciting time to be in databases, with MongoDB figuring prominently on the changing landscape.</p>
<h1>Okay, now back to your niche&#8230;</h1>
<p>After this inspiring keynote, of course, there followed a full day of stimulating talks and sessions at all levels of the mongo-guru ladder — oceans of fresh, insightful, useful stuff.</p>
<p>My personal favorite was probably the session led by <a href="http://www.linkedin.com/pub/charity-majors/5/b76/826" target="_blank">Charity Majors</a>, who is responsible for the MongoDB servers <a title="Parse.com Ops blog" href="http://blog.parse.com/category/ops/" target="_blank">at the heart of Parse.com</a>. If you were lucky enough to catch her outstanding talk on the care and feeding of a grown-up mongo deployment, you&#8217;ll know that there&#8217;s a whole host of operational issues that you&#8217;d just rather not worry about — or at the very least, you&#8217;d very much like an experienced hand at the helm when you do. Why do I say her talk was outstanding? Because that stuff is our bread and butter. It&#8217;s what we do all day every day here at MongoLab: hook you up with the database of tomorrow, so you can use more of your energy to dominate YOUR product&#8217;s ecological niche today, and still get a good night&#8217;s sleep (assuming your species isn&#8217;t nocturnal).</p>
<p>There&#8217;s never been a richer ecosystem, or a better time to be a database consumer. And there are more reasons than ever today for your consumption preferences to be of the MongoDB phylum. Yummy! Why not <a href="https://mongolab.com/newdb" target="_blank">try one right now</a>?</p>
<p style="text-align: right;"><a title="yup, that's me" href="https://twitter.com/t0dampier/" target="_blank">T. Dampier</a>, 2013-05-11</p>
<h4>Notes</h4>
<p><a href="#ref1" name="footnote1">[1]</a> Source: Gregory Bateson, “Form, Substance and Difference”, 19th Annual Alfred Korzybski Memorial Lecture, 9 January 1970, Oceanic Institute, Hawaii. From the book Ecology and Consciousness, edited by Richard Grossinger, North Atlantic Books, 1978. p. 32.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mongolab.com/2013/05/mongosf-2013-scaling-the-hyperbola-of-evolution-with-mongodb/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Introducing flip-flop: a MongoDB Replica Set demonstration and experimentation service</title>
		<link>http://blog.mongolab.com/2013/04/introducing-flip-flop-a-mongodb-replica-set-demonstration-and-experimentation-service/</link>
		<comments>http://blog.mongolab.com/2013/04/introducing-flip-flop-a-mongodb-replica-set-demonstration-and-experimentation-service/#comments</comments>
		<pubDate>Tue, 30 Apr 2013 15:00:29 +0000</pubDate>
		<dc:creator>dave</dc:creator>
				<category><![CDATA[education]]></category>
		<category><![CDATA[general]]></category>

		<guid isPermaLink="false">http://blog.mongolab.com/?p=1424</guid>
		<description><![CDATA[Greetings adventurers! A lot of our users upgrade from single-node databases to replica set clusters without fully understanding how their driver, and therefore their application, will react to failover. In fact, we get so many questions about best practices with MongoDB replica sets that we thought it could be cool to host a replica set [...]]]></description>
				<content:encoded><![CDATA[<p>Greetings adventurers!</p>
<p>A lot of our users upgrade from single-node databases to replica set clusters without fully understanding how their driver, and therefore their application, will react to failover. In fact, we get so many questions about best practices with <a href="http://docs.mongodb.org/manual/core/replication/" target="_blank">MongoDB replica sets</a> that we thought it could be cool to host a replica set that <em>anyone</em> can connect to using their MongoDB driver of choice.</p>
<p>Today we invite you to check out <a href="http://mongolab.org/flip-flop" target="_blank"><strong>flip-flop</strong></a>, a MongoDB Replica Set demonstration and experimentation service.  The flip-flop service consists of:</p>
<ul>
<li>A live replica set that fails over (i.e. &#8220;flips&#8221; and &#8220;flops&#8221;) every 60 seconds.  This cluster is always running and available to all at the following address:
<pre class="brush: plain; title: Connection URI:; notranslate">mongodb://testdbuser:testdbpass@flip.mongolab.com:53117,flop.mongolab.com:54117/testdb</pre>
</li>
</ul>
<ul>
<li>A <a href="http://mongolab.org/flip-flop" target="_blank">real-time visualization of the cluster</a> flippin&#8217; and floppin&#8217; with streaming database server logs</li>
</ul>
<ul>
<li>A set of example client scripts (currently just <a href="https://gist.github.com/mongolab-org/5347810" target="_blank">in Python</a>) that simulate client interactions with the cluster that you can use as a starting point for your own experimentation</li>
</ul>
<p>The flip-flop service is also great for those of you working on third-party drivers. Gustavo Niemeyer, author of <a href="http://labix.org/mgo" target="_blank">mgo</a>, a MongoDB driver for the Go language, told us flip-flop helped him find and quickly fix a small bug in the driver: &#8220;This is brilliant. I actually managed to find an edge case coding a trivial example against it due to the timing of the server re-election.&#8221; Pretty cool!</p>
<p><span id="more-1424"></span></p>
<h3>How to get started with flip-flop:</h3>
<ol>
<li>Open <a href="http://mongolab.org/flip-flop" target="_blank">http://mongolab.org/flip-flop</a> in one window</li>
<li>Download and run our <a href="https://gist.github.com/mongolab-org/5347810/raw/766f8266db377e462f90e5c1389be6211fbd2db6/watch-flip-flop.py" target="_blank">sample Python script</a> (requires pymongo 2.4+) in a second window
<pre class="brush: plain; title: ; notranslate">
sudo pip install pymongo
curl https://gist.github.com/mongolab-org/5347810/raw/766f8266db377e462f90e5c1389be6211fbd2db6/watch-flip-flop.py &gt; watch-flip-flop.py
python watch-flip-flop.py
</pre>
</li>
<li>Watch this script gracefully recover from failover as the cluster flips and flops every 60 seconds.</li>
</ol>
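<p>If you would rather start from scratch than from our sample script, the core of surviving a failover is usually a small retry loop around each database call. The sketch below is a generic illustration, not the contents of the gist: <tt>call_with_retry</tt>, <tt>SimulatedFailover</tt> and <tt>make_flaky_insert</tt> are hypothetical names, and the failover is faked so the snippet runs without a driver. With pymongo you would pass <tt>pymongo.errors.AutoReconnect</tt> as the exception to retry on.</p>

```python
import time


def call_with_retry(operation, retry_on, attempts=5, delay_seconds=1.0, sleep=time.sleep):
    """Run operation(), retrying when it raises one of the `retry_on` exceptions.

    Re-raises the last error if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise
            sleep(delay_seconds)  # give the replica set time to elect a new primary


class SimulatedFailover(Exception):
    """Stand-in for a driver's reconnect error (hypothetical, for the demo only)."""


def make_flaky_insert(failures):
    """Return a fake write that fails `failures` times, then succeeds."""
    state = {"remaining": failures}

    def insert():
        if state["remaining"]:
            state["remaining"] -= 1
            raise SimulatedFailover("primary stepped down")
        return "ok"

    return insert


# Two simulated failures, then success; sleeping is stubbed out for the demo.
result = call_with_retry(make_flaky_insert(2), SimulatedFailover, sleep=lambda s: None)
print(result)
```

<p>How many attempts and how long to wait are judgment calls; against flip-flop, a delay of a second or so between attempts is a reasonable place to start experimenting.</p>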
<p>We hope you all find this service informative and useful. As you play around with flip-flop, please do <a href="mailto:support@mongolab.com" target="_blank">let us know what you think</a>!</p>
<p>Sincerely,</p>
<p>Your friends at MongoLab</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mongolab.com/2013/04/introducing-flip-flop-a-mongodb-replica-set-demonstration-and-experimentation-service/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Backup your MongoDB databases with MongoLab</title>
		<link>http://blog.mongolab.com/2013/04/backup-your-mongodb-databases-with-mongolab/</link>
		<comments>http://blog.mongolab.com/2013/04/backup-your-mongodb-databases-with-mongolab/#comments</comments>
		<pubDate>Thu, 25 Apr 2013 14:11:00 +0000</pubDate>
		<dc:creator>angela</dc:creator>
				<category><![CDATA[announcement]]></category>
		<category><![CDATA[general]]></category>

		<guid isPermaLink="false">http://blog.mongolab.com/?p=1629</guid>
		<description><![CDATA[Last year, a lot of folks asked us if they could use MongoLab&#8217;s admin tools on databases not hosted with MongoLab. We thought this was a cool idea, and released Remote Connections, a feature that allows you to point MongoLab&#8217;s web interface at any cloud MongoDB instance. Since then, this feature has received great response. Today&#8230; Remote Connections [...]]]></description>
				<content:encoded><![CDATA[<p dir="ltr">Last year, a lot of folks asked us if they could use <a href="http://mongolab.com" target="_blank">MongoLab&#8217;s</a> admin tools on databases <em>not</em> hosted with MongoLab. We thought this was a cool idea, and <a href="http://blog.mongolab.com/2012/01/mongodb-gui-admin-tool-for-all/" target="_blank">released Remote Connections</a>, a feature that allows you to point MongoLab&#8217;s web interface at any cloud MongoDB instance. Since then, this feature has received great response.</p>
<p dir="ltr">Today&#8230; Remote Connections got even better! On remote databases, you can now use exactly the same backup tools that users who host with MongoLab know, love, and trust.</p>
<p dir="ltr">MongoLab&#8217;s backup system makes it extremely easy to schedule and manage backups. You can use the system to perform one-time backups or create recurring Backup Plans with custom schedules and retention policies. Backups can be stored in MongoLab&#8217;s own secure cloud containers or in a container at the cloud storage provider of your choice (e.g. Amazon S3).</p>
<p><span id="more-1629"></span></p>
<h1><strong>How it works</strong></h1>
<h3>Step 1. Create a Remote Connection</h3>
<p>After you have <a href="http://mongolab.com/signup" target="_blank">signed up</a> for a MongoLab account, create a Remote Connection to your database by providing us with the <a href="http://docs.mongodb.org/manual/reference/connection-string/" target="_blank">MongoDB connection URI</a> for your database, server, or Replica Set cluster.</p>
<p dir="ltr"><a href="http://blog.mongolab.com/wp-content/uploads/2013/04/CreateRemoteConnection1.png"><img class="alignnone  wp-image-1648" alt="CreateRemoteConnection" src="http://blog.mongolab.com/wp-content/uploads/2013/04/CreateRemoteConnection1.png" width="526" height="294" /></a></p>
<h3>Step 2. Kick off a one-time backup or create a recurring Backup Plan</h3>
<p dir="ltr"><a href="http://blog.mongolab.com/wp-content/uploads/2013/04/CreateRCBackupPlan.png"><img class="alignnone  wp-image-1655" alt="CreateRCBackupPlan" src="http://blog.mongolab.com/wp-content/uploads/2013/04/CreateRCBackupPlan.png" width="526" /></a></p>
<h3 dir="ltr">Step 3. Sit back and relax</h3>
<p dir="ltr">Why implement and manage backups yourself when you can ask our trusty robots to do it for you?</p>
<p>Not only do we take the headache out of taking backups, but we make sure backups happen as they should. Our comprehensive backup auditing system is continuously monitoring all Backup Plans to ensure that every backup is happening when it is supposed to, and without error.  You will be alerted when any backups do not occur as planned.</p>
<h1><strong>Free during beta</strong></h1>
<p>For now, this feature is in beta. During this beta period, we are offering this functionality for free with a database data size limit of 5 GB. When we come out of beta, Remote Connection backups will support larger database sizes and be priced on a per-run basis.</p>
<p>If you run into any trouble and/or have any feedback, please don&#8217;t hesitate to send us an email at <a href="mailto:support@mongolab.com">support@mongolab.com</a>.</p>
<p>Thank you in advance for trying out backup tools for Remote Connections; we look forward to hearing what you think!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mongolab.com/2013/04/backup-your-mongodb-databases-with-mongolab/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>[“Thinking”, “About”, “Arrays”, “In”, “MongoDB”]</title>
		<link>http://blog.mongolab.com/2013/04/thinking-about-arrays-in-mongodb/</link>
		<comments>http://blog.mongolab.com/2013/04/thinking-about-arrays-in-mongodb/#comments</comments>
		<pubDate>Thu, 18 Apr 2013 21:08:41 +0000</pubDate>
		<dc:creator>eric</dc:creator>
				<category><![CDATA[education]]></category>
		<category><![CDATA[general]]></category>
		<category><![CDATA[MongoDB]]></category>

		<guid isPermaLink="false">http://blog.mongolab.com/?p=1485</guid>
		<description><![CDATA[Greetings adventurers! The growing popularity of MongoDB means more and more people are thinking about data in ways divergent from traditional relational models. For this reason alone, it’s exciting to experiment with new ways of modelling data. However, with additional flexibility comes the need to properly analyze the performance impact of data model decisions. Embedding [...]]]></description>
				<content:encoded><![CDATA[<p>Greetings adventurers!</p>
<p>The growing popularity of MongoDB means more and more people are thinking about data in ways divergent from traditional relational models. For this reason alone, it’s exciting to experiment with new ways of modelling data. However, with additional flexibility comes the need to properly analyze the performance impact of data model decisions.</p>
<p>Embedding arrays in documents is a great example of this. MongoDB’s versatile array operators ($push/$pull, $addToSet, $elemMatch, etc.) offer the ability to manage data sets within documents. However, one must be careful. Data models that call for very large arrays, or arrays with high rates of modification, can often lead to performance problems.</p>
<p><span id="more-1485"></span></p>
<p>That’s because at high levels of scale, large arrays require relatively high CPU overhead, which leads to longer insert/update/query times than desired.</p>
<p>Let’s use car dealerships as an example and discuss why an array of cars in a dealer document isn’t necessarily the ideal data model.</p>
<p>Here’s our dealer document:</p>
<pre>{ "_id" : 1234,
  "dealershipName": "Eric’s Mongo Cars",
  "cars": [
           {"year": 2013,
            "make": "10gen",
            "model": "MongoCar",
            "vin": 3928056,
            "mechanicNotes": "Runs great!"},
           {"year": 1985,
            "make": "DeLorean",
            "model": "DMC-12",
            "vin": 8056309,
            "mechanicNotes": "Great Scott!"}
  ]
}

</pre>
<p>Now for some concerns with this model.</p>
<h3>Querying Cars</h3>
<p>One of the advantages of MongoDB is its rich query language, which supports accessing documents by the contents of an array. If we want to locate cars by make and model using a query like {"make": MAKE, "model": MODEL}, we need a specific feature of MongoDB’s query language, $elemMatch. This operator, while optimizable by indexes, still needs to traverse the entire “cars” array of every eligible dealer document in order to execute the query. If we query documents with “cars” arrays that contain thousands of car entries, we are basically doing mini collection-scans. As a result, we’ll notice high CPU utilization and slow query execution.</p>
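<p>Why $elemMatch rather than a plain query on the two fields? The plain-Python sketch below mimics the difference; the two function names are made up, and this is a model of the matching semantics, not of the server. Without $elemMatch, the make and the model may be satisfied by <em>different</em> cars in the array; with it, one car must satisfy both. Either way, every element of the array may need to be visited.</p>

```python
def matches_across_elements(dealer, make, model):
    """True if some car has the make and some (possibly different) car has the model."""
    cars = dealer.get("cars", [])
    has_make = any(car.get("make") == make for car in cars)
    has_model = any(car.get("model") == model for car in cars)
    return has_make and has_model


def matches_same_element(dealer, make, model):
    """True only if a single car has both the make and the model ($elemMatch-style)."""
    return any(
        car.get("make") == make and car.get("model") == model
        for car in dealer.get("cars", [])
    )


dealer = {
    "_id": 1234,
    "dealershipName": "Eric's Mongo Cars",
    "cars": [
        {"year": 2013, "make": "10gen", "model": "MongoCar", "vin": 3928056},
        {"year": 1985, "make": "DeLorean", "model": "DMC-12", "vin": 8056309},
    ],
}

# There is no 10gen DMC-12 on the lot, yet the looser match says there is.
across = matches_across_elements(dealer, "10gen", "DMC-12")
same = matches_same_element(dealer, "10gen", "DMC-12")
print(across, same)
```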
<p>What about more complex computation? The Aggregation framework — with the $unwind operator — could support many of the queries we’d like to perform, but indexes may not be used to their full effectiveness.</p>
<p>Finally, dealer documents can get very large with this data model. Although projections and atomic update operators cut down the size of the documents transmitted over the wire, there may still be scenarios where the system is hauling more baggage around than it should.</p>
<h3>Adding and updating Cars</h3>
<p>Adding and modifying car entries can require a scan of much or all of each array being updated, resulting in slow operations. For example, $addToSet is a way of adding new elements to arrays that requires the database to scan through every array item to make sure the new element does not already exist. $pull can be similarly inefficient.</p>
<p>Furthermore, sometimes modifications to the “cars” array can grow a dealer document in size such that it must be moved in memory. These moves can be very expensive, particularly when the collection is heavily indexed, since each index bucket that points to the document being relocated must also be updated to point to its new memory location.</p>
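<p>To see why set-style additions get slower as the array grows, consider this toy model in plain Python. It is an illustration of the scanning cost only, not the server’s actual implementation, and <tt>add_to_set</tt> is a made-up helper that also reports how many existing items it had to compare.</p>

```python
def add_to_set(array, element):
    """Append element only if absent; return how many existing items were compared."""
    comparisons = 0
    for item in array:
        comparisons += 1
        if item == element:
            return comparisons  # already present, nothing added
    array.append(element)
    return comparisons


# A hypothetical dealer with 1000 cars already in its array.
cars = [{"vin": n} for n in range(1000)]

cost_new = add_to_set(cars, {"vin": 1000})  # scans all 1000 items, then appends
cost_dup = add_to_set(cars, {"vin": 1000})  # scans 1001 items before finding the duplicate
print(cost_new, cost_dup, len(cars))
```

<p>The work per addition grows with the length of the array, whereas inserting a document into its own collection does not need to look at its siblings at all.</p>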
<h3>Can we fix it?</h3>
<p>Sometimes a data model that entails very large arrays can be reformulated into a data model that is much more efficient. In our case we could alternatively model dealers and cars like this:</p>
<p><strong>Dealers</strong></p>
<pre>{ "_id": 3423, "dealershipName": "Eric’s Mongo Cars" }

</pre>
<p><strong>Cars</strong></p>
<pre>{ "_id" : 1234,
  "dealership": 3423,
  "year": 2013,
  "make": "10gen",
  "model": "Mongos",
  "vin": 3928056,
  "mechanicNotes": "Runs great!"},
{ "_id" : 54321,
  "dealership": 3423,
  "year": 1985,
  "make": "DeLorean",
  "model": "DMC-12",
  "vin": 8056309,
  "mechanicNotes": "Great Scott!"}

</pre>
<p>In this data model we avoid the excessively large arrays and, with the right indexes, perform efficient queries on dealerships with even the largest of inventories. By keeping cars in their own collection, Eric’s Mongo Cars is ready to move inventory with crazy low prices, without fear that our volume is going to bring down the system for Eddie’s Junker Shack down the road. We love those guys.</p>
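<p>As a rough picture of what “the right indexes” buys us, here is a plain-Python sketch of a compound index over the cars collection. The choice of fields (dealership, make, model), the helper name <tt>build_compound_index</tt>, and the third car belonging to a second dealer are assumptions for the example; a real deployment would create the index in MongoDB itself rather than in application code.</p>

```python
from collections import defaultdict

cars = [
    {"_id": 1234, "dealership": 3423, "year": 2013,
     "make": "10gen", "model": "Mongos", "vin": 3928056},
    {"_id": 54321, "dealership": 3423, "year": 1985,
     "make": "DeLorean", "model": "DMC-12", "vin": 8056309},
    # Hypothetical car at another dealer, added to show the lookup is selective.
    {"_id": 777, "dealership": 9999, "year": 1985,
     "make": "DeLorean", "model": "DMC-12", "vin": 1111111},
]


def build_compound_index(documents, fields):
    """Map each tuple of field values to the _ids of the documents holding it."""
    index = defaultdict(list)
    for doc in documents:
        key = tuple(doc[field] for field in fields)
        index[key].append(doc["_id"])
    return index


index = build_compound_index(cars, ("dealership", "make", "model"))

# One lookup finds the matching car without touching any other dealer's inventory.
found = index.get((3423, "DeLorean", "DMC-12"), [])
print(found)
```

<p>The lookup cost no longer depends on how many cars any single dealer has, which is the property the embedded-array model could not give us.</p>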
<h3>Conclusion</h3>
<p>Storing information in document arrays is an exciting capability available in MongoDB, but we want to avoid it under the following conditions:</p>
<ul>
<li>The arrays can get very large, even if only in some documents</li>
<li>Individual elements must be regularly queried and computed on</li>
<li>Elements are added and removed often</li>
</ul>
<p>None of the pitfalls described above are deal-breakers in and of themselves. It’s just that when summed together, the total overhead can become noticeable. So, be wary. In these high-volume cases, it is appropriate for us to use collections, not arrays, to store data. When we store such data using collections,</p>
<ul>
<li>Regular computation is performed using simple, efficient methods</li>
<li>Adding and removing elements are simple insert/remove operations</li>
<li>Each element is accessible using simpler queries that can be effectively indexed to scale well</li>
</ul>
<p>There’s a lot more detail under the hood, but if you’d like to discuss it, we’ll have to get there in the comments section below.</p>
<p>Thanks for reading, and good luck out there!</p>
<p>Sincerely,<br />
Eric@MongoLab</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mongolab.com/2013/04/thinking-about-arrays-in-mongodb/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>How to use MongoDB on RedHat OpenShift with MongoLab</title>
		<link>http://blog.mongolab.com/2013/04/how-to-use-mongodb-on-redhat-openshift-with-mongolab/</link>
		<comments>http://blog.mongolab.com/2013/04/how-to-use-mongodb-on-redhat-openshift-with-mongolab/#comments</comments>
		<pubDate>Thu, 11 Apr 2013 18:10:17 +0000</pubDate>
		<dc:creator>eric</dc:creator>
				<category><![CDATA[education]]></category>
		<category><![CDATA[MongoDB]]></category>
		<category><![CDATA[partnerships]]></category>

		<guid isPermaLink="false">http://blog.mongolab.com/?p=1201</guid>
		<description><![CDATA[Hey RedHat fans &#8211; we&#8217;ve got your MongoDB hosting needs covered! In today&#8217;s post we&#8217;ll be presenting a quick-start guide on how to connect OpenShift, the free RedHat auto-scaling Platform-as-a-Service (PaaS), with our popular MongoDB Database-as-a-Service (DBaaS), MongoLab. For demonstration purposes, we&#8217;ll be using a Node.js application that we&#8217;ve written (available for download here). All [...]]]></description>
				<content:encoded><![CDATA[<p>Hey RedHat fans &#8211; we&#8217;ve got your MongoDB hosting needs covered!</p>
<p>In today&#8217;s post we&#8217;ll be presenting a quick-start guide on how to connect <a href="http://www.openshift.com" target="_blank">OpenShift</a>, the free RedHat auto-scaling Platform-as-a-Service (PaaS), with our popular MongoDB Database-as-a-Service (DBaaS), <a href="http://www.mongolab.com">MongoLab</a>.</p>
<p>For demonstration purposes, we&#8217;ll be using a Node.js application that we&#8217;ve written (available for download <a href="https://github.com/mongolab/mongolab-openshift-quickstart/blob/master/server.js" target="_blank">here</a>). All it takes to connect your OpenShift application is five easy steps!</p>
<p><span id="more-1201"></span></p>
<h3>Step 1. Create an OpenShift account and application</h3>
<p>Create an account at <a href="http://openshift.redhat.com/" target="_blank">http://openshift.redhat.com</a> and install the <strong>rhc</strong> command-line tool on your development machine. For more info about rhc, see <a href="https://openshift.redhat.com/community/developers/rhc-client-tools-install" target="_blank">https://openshift.redhat.com/community/developers/rhc-client-tools-install</a>.</p>
<p>Once rhc is installed, create a <strong>nodejs-0.6</strong> application, passing the path to this repository as the <code>--from-code</code> argument and replacing <code>&lt;app name&gt;</code> with your desired application name:</p>
<pre class="brush: plain; title: ; notranslate">
% rhc app create &lt;app name&gt; nodejs-0.6 \
       --from-code https://github.com/mongolab/mongolab-openshift-quickstart
% cd &lt;app name&gt;
</pre>
<p>rhc initializes your application using this repository as a baseline.</p>
<h3>Step 2. Create a MongoLab account and database</h3>
<ol>
<li>Sign up for an account at <a href="http://www.mongolab.com">http://www.mongolab.com</a>. After you&#8217;ve successfully created your MongoLab account, you&#8217;ll see a &#8220;Databases&#8221; header and a &#8220;Create new&#8221; button.</li>
<li>Click on the &#8220;Create new&#8221; button to create a database. Be sure to specify a database user name and password. These credentials are <strong>not</strong> the same as your MongoLab account credentials.</li>
<li>Click on your database. The database landing page provides a MongoDB URI connection string of the form:<br />
<code>mongodb://&lt;db user&gt;:&lt;db password&gt;@&lt;host&gt;:&lt;port&gt;/&lt;db name&gt;</code></li>
<li>Copy this value somewhere helpful and replace placeholders with your <strong>database user</strong> credentials.</li>
</ol>
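<p>If your database password contains characters such as <code>@</code> or <code>:</code>, they need to be percent-encoded before they go into the URI. Here is a minimal sketch of filling in the placeholders; <code>buildMongoUri</code> is a hypothetical helper of ours (not part of MongoLab or any driver), and the host, port and credentials are made-up values.</p>
<pre class="brush: jscript; title: ; notranslate">
// Hypothetical helper: assembles a mongodb:// URI from its parts and
// percent-encodes the credentials so that characters such as '@' or ':'
// in a password do not break the URI.
function buildMongoUri(parts) {
  var user = encodeURIComponent(parts.user);
  var password = encodeURIComponent(parts.password);
  return 'mongodb://' + user + ':' + password + '@' +
    parts.host + ':' + parts.port + '/' + parts.db;
}

// Placeholder values for illustration only.
var uri = buildMongoUri({
  user: 'appuser',
  password: 'p@ss:word',
  host: 'ds012345.example.com',
  port: 12345,
  db: 'mydb'
});
console.log(uri);
</pre>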
<h3>Step 3. Commit and deploy the app</h3>
<p>When you&#8217;ve created your app, the rhc command line client automatically initialized a git repo with a remote link to OpenShift. The code is also already deployed to your app gear.</p>
<p>If you don&#8217;t make any changes, you can skip this step. However, if you make any modifications (now or later), perform the following to update the code on the gear:</p>
<pre class="brush: plain; title: ; notranslate">
% git add .
% git commit -m &quot;my first commit&quot; -a
% git push
</pre>
<h3>Step 4. Configure environment variables on the app gear</h3>
<p>The <a href="https://github.com/mongolab/mongolab-openshift-quickstart/blob/master/server.js" target="_blank">example code</a> uses <code>mongodb://localhost:27017/test</code> when the MONGOLAB_URI environment variable is not available. This is sufficient for testing with a MongoDB database running on your local machine, but not for production.</p>
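<p>The fallback described above can be sketched roughly as follows. This is our own simplified illustration, so the function and constant names are assumptions and the real <code>server.js</code> may be organized differently.</p>
<pre class="brush: jscript; title: ; notranslate">
// Assumption: mirrors the fallback behaviour described in the text.
var DEFAULT_URI = 'mongodb://localhost:27017/test';

// Use MONGOLAB_URI when it is set and non-empty, otherwise fall back
// to the local test database.
function resolveMongoUri(env) {
  return env.MONGOLAB_URI || DEFAULT_URI;
}

console.log(resolveMongoUri(process.env));
</pre>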
<p><strong>Note:</strong> We find that configuring this value outside of the code (and not storing it in a repository) allows for maximum security and flexibility. However, there are repository-driven alternatives for configuring this environment variable that may meet your requirements. See <a href="https://openshift.redhat.com/community/kb/kb-e1072-how-to-create-and-use-environment-variables-on-the-server" target="_blank">how to create and use environment variables on the server</a> for more information.</p>
<p>To configure your environment variable without placing credentials in a repository:</p>
<ol>
<li>Log in to <a href="http://openshift.redhat.com/" target="_blank">http://openshift.redhat.com</a></li>
<li>Click <strong>My Apps</strong>.</li>
<li>Click the <strong>&gt;</strong> next to your app name to reach your application page.</li>
<li>Click <strong>Want to log in to your application?</strong></li>
<li>Copy the provided ssh shell command to a shell window and press enter to ssh to your app gear.</li>
<li>Open your gear&#8217;s .bash_profile in your text editor of choice. It is located at <code>~/app-root/data/.bash_profile</code>.</li>
<li>Add the line <code>export MONGOLAB_URI=&lt;db uri&gt;</code> where <strong>db uri</strong> is the mongodb URI you obtained in Step 2, with your database user credentials added.</li>
<li>After editing the file, run <code>source ~/app-root/data/.bash_profile</code></li>
<li>Use <code>echo $MONGOLAB_URI</code> to confirm success. The value you added should be displayed at the console.</li>
<li>Restart your app by running <code>ctl_all stop</code> then <code>ctl_all start</code>.</li>
</ol>
<h3>Step 5. View the app</h3>
<p>Visit your deployed app at:</p>
<pre><code>    http://&lt;app name&gt;-&lt;app namespace&gt;.rhcloud.com

</code></pre>
<p>And there you have it &#8211; just five steps to get your OpenShift deployed application connected to your MongoLab database!</p>
<p>If you ever have any questions, don&#8217;t hesitate to get in touch with us at <a href="mailto:support@mongolab.com">support@mongolab.com</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mongolab.com/2013/04/how-to-use-mongodb-on-redhat-openshift-with-mongolab/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Weekend Project: Send sensor data from Arduino to MongoDB</title>
		<link>http://blog.mongolab.com/2013/03/sensor-data-arduino-mongodb/</link>
		<comments>http://blog.mongolab.com/2013/03/sensor-data-arduino-mongodb/#comments</comments>
		<pubDate>Fri, 29 Mar 2013 13:00:28 +0000</pubDate>
		<dc:creator>benwen</dc:creator>
				<category><![CDATA[education]]></category>
		<category><![CDATA[general]]></category>
		<category><![CDATA[open source]]></category>

		<guid isPermaLink="false">http://blog.mongolab.com/?p=1326</guid>
		<description><![CDATA[Arduino is an open-source electronics platform that can acknowledge and interact with its environment through a variety of sensor types.  It&#8217;s great for hardware prototyping and one-off projects. I just got an Arduino Board from our friends at SendGrid, who also gave me a little tutorial in the art of Arduino hacking. Inspired by the tutorial and [...]]]></description>
				<content:encoded><![CDATA[<p><a href="http://blog.mongolab.com/wp-content/uploads/2013/03/mongolab-motion-layout.png"><img class="alignleft  wp-image-1329" alt="mongolab-motion-layout" src="http://blog.mongolab.com/wp-content/uploads/2013/03/mongolab-motion-layout.png" width="308" height="519" /></a></p>
<p><a href="http://arduino.cc/" target="_blank">Arduino</a> is an open-source electronics platform that can acknowledge and interact with its environment through a variety of sensor types.  It&#8217;s great for hardware prototyping and one-off projects.</p>
<p>I just got an Arduino Board from our friends at <a href="http://sendgrid.com" target="_blank">SendGrid</a>, who also gave me a little tutorial in the art of Arduino hacking. Inspired by the tutorial and armed with this new board, I bought a passive infrared (PIR) motion sensor from my local Radio Shack. Now I was ready to play; in particular, I wanted to be able to collect that continuous stream of hardware sensor data into a <a href="http://mongodb.org" target="_blank">MongoDB database</a> for logging, trend analysis, system event correlation, etc.</p>
<p><span id="more-1326"></span></p>
<p>To this end, I created the demo project &#8220;mongodb-motion&#8221;, which I&#8217;ve made public on GitHub. In the <a href="https://github.com/benzenwen/mongolab-motion" target="_blank">&#8220;mongodb-motion&#8221; GitHub repo</a>, you will find an Arduino project that writes motion sensor data to a cloud MongoDB database at <a href="http://mongolab.com" target="_blank">MongoLab</a> and sends alerts via email based on certain criteria. I built this demo using <a href="http://nodejs.org/" target="_blank">Node.js</a> and the <a href="https://support.mongolab.com/entries/20433053-REST-API-for-MongoDB" target="_blank">MongoLab REST API</a>.</p>
<p>Below, I&#8217;ll go through exactly what hardware you need to make your own &#8220;mongodb-motion&#8221; project a success, and how the code actually works.</p>
<h2>What You Need</h2>
<p>The hardware used in this demo includes: an <a href="http://arduino.cc/en/uploads/Main/ArduinoUno_R3_Front.jpg" target="_blank">Arduino UNO R3</a> and a <a href="http://www.parallax.com/tabid/768/productid/83/default.aspx" target="_blank">Parallax PIR motion sensor</a>.</p>
<h2>How the Code Works</h2>
<p>You can use a variety of motion sensors with the Arduino. In this particular experiment, I used a PIR motion sensor. The PIR motion sensor behaves like a switch, with &#8216;down&#8217; events emitted on motion detection and &#8216;up&#8217; events a few seconds after motion ceases to be detected.</p>
<p>On the receiving side, I used <a href="https://github.com/rwldrn/johnny-five" target="_blank">JohnnyFive</a>, an appropriately named Node.js package that accepts sensor events and sends messages to the Arduino board.</p>
<p>With the two ends set, I&#8217;ll move on to the project&#8217;s configuration file. In this demo, I&#8217;ve included a configuration file, <a href="https://github.com/benzenwen/mongolab-motion/blob/master/config-sample.js" target="_blank">config-sample.js<strong>,</strong></a> where credentials for the MongoLab REST API and for the email SMTP server can be added. In my case, I used the SendGrid SMTP service.</p>
<p>The configuration file also has two callbacks that determine when an email is emitted, one for each type of event &#8211; &#8220;detect&#8221; and &#8220;ceased&#8221;. I&#8217;ve used this feature to automatically send an email alert if an event timestamp is between 7:00pm and 8:00am, ostensibly when my office should be motionless&#8230; I&#8217;m out there watching you, office!</p>
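<p>An after-hours rule like the one just described might look something like this. It is only a sketch: <code>isAfterHours</code> is a hypothetical name, and the actual callbacks in config-sample.js may take different arguments.</p>
<pre class="brush: jscript; title: ; notranslate">
// Hypothetical predicate: returns true when the event falls between
// 7:00pm and 8:00am in the local time zone of the machine running it.
function isAfterHours(timestamp) {
  var hour = new Date(timestamp).getHours();
  return hour >= 19 || 8 > hour;
}

console.log(isAfterHours(new Date(2013, 2, 29, 23, 15))); // late evening
console.log(isAfterHours(new Date(2013, 2, 29, 12, 0)));  // midday
</pre>
<p>Keeping the rule in a small function of its own makes it easy to swap in a different schedule later.</p>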
<p>Once you&#8217;ve customized this config-sample.js file, be sure to rename it to config.js in order for it to be usable.</p>
<p>If you inspect the <a href="https://github.com/benzenwen/mongolab-motion/blob/master/app.js" target="_blank">project code</a>, you&#8217;ll notice that the <a href="https://support.mongolab.com/entries/20433053-REST-API-for-MongoDB" target="_blank">MongoLab REST API</a> is called in the <code>logMsg()</code> function, using an <code>https.request</code>.</p>
<p>Building this little demo has given me some new ideas for hardware hacking the cloud. I hope you give it a try too.</p>
<p>Thanks to the Arduino, Node.js and JavaScript communities, and special thanks to Rick Waldron for Johnny Five, SendGrid for the UNO board, and a big shout out to <a href="http://twitter.com/swiftalphaone" target="_blank">@swiftalphaone</a> for the Waza tutorial.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mongolab.com/2013/03/sensor-data-arduino-mongodb/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>MongoLab at Overdriver.com</title>
		<link>http://blog.mongolab.com/2013/03/mongolab-at-overdriver-com/</link>
		<comments>http://blog.mongolab.com/2013/03/mongolab-at-overdriver-com/#comments</comments>
		<pubDate>Mon, 25 Mar 2013 16:56:56 +0000</pubDate>
		<dc:creator>benwen</dc:creator>
				<category><![CDATA[announcement]]></category>
		<category><![CDATA[general]]></category>
		<category><![CDATA[MongoDB]]></category>
		<category><![CDATA[partnerships]]></category>

		<guid isPermaLink="false">http://blog.mongolab.com/?p=1311</guid>
		<description><![CDATA[We&#8217;re excited to partner with Overdriver, a unique Platform-as-a-Service for gaming that is showcasing their beta service for the first time at this year&#8217;s Game Developers Conference (GDC)! This exciting partnership means that MongoLab will serve as the core data storage provider for Overdriver&#8217;s online game developer community. You can find out more right here. [...]]]></description>
				<content:encoded><![CDATA[<p><a title="Overdriver Home Page" href="http://overdriver.com" target="_blank"><img class="alignnone size-medium wp-image-1320" style="border: 0px;" alt="Overdriver_Logo" src="http://blog.mongolab.com/wp-content/uploads/2013/03/Overdriver_Logo-300x90.png" width="300" height="90" /></a></p>
<p>We&#8217;re excited to partner with <a title="Overdriver Home Page" href="http://overdriver.com/" target="_blank">Overdriver</a>, a unique Platform-as-a-Service for gaming that is showcasing their beta service for the first time at this year&#8217;s <a title="Game Developers Conference" href="http://www.gdconf.com/" target="_blank">Game Developers Conference (GDC)</a>!</p>
<p>This exciting partnership means that <a href="http://mongolab.com/">MongoLab</a> will serve as the core data storage provider for Overdriver&#8217;s online game developer community. You can find out more right <a title="Overdriver SDK" href="http://overdriver.com/resources/sdk" target="_blank">here</a>.</p>
<p>By marrying dynamic cloud scaling with gaming-specific tooling for features like virtual goods and social functionality, Overdriver addresses the biggest demand issue that game studios face: bringing titles to market quickly but with little operational risk.</p>
<p>MongoDB naturally supports the object-oriented information model in games, capturing concepts such as user characters, possessions, game pieces, and state of play. MongoLab&#8217;s robust and performant MongoDB-as-a-Service lets Overdriver game developers give their full attention to designing compelling games.</p>
<p>We are fascinated by Overdriver and honored to be an inaugural partner in their ground-breaking specialized PaaS approach.</p>
<p>If you&#8217;re in San Francisco this week stop by their booth, <strong>#1838</strong>, at GDC and meet the Overdriver team. Also, be sure to sign up for an account at <a title="Overdriver Home Page" href="http://overdriver.com" target="_blank">Overdriver.com</a>. FYI, creating a new environment at Overdriver automatically creates a new MongoLab account and database.</p>
<p>We can&#8217;t wait to play with what you build!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mongolab.com/2013/03/mongolab-at-overdriver-com/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Replication Lag &amp; The Facts of Life</title>
		<link>http://blog.mongolab.com/2013/03/replication-lag-the-facts-of-life/</link>
		<comments>http://blog.mongolab.com/2013/03/replication-lag-the-facts-of-life/#comments</comments>
		<pubDate>Mon, 18 Mar 2013 15:00:07 +0000</pubDate>
		<dc:creator>dampier</dc:creator>
				<category><![CDATA[MongoDB]]></category>

		<guid isPermaLink="false">http://blog.mongolab.com/?p=1207</guid>
		<description><![CDATA[So you&#8217;re checking in on your latest awesome application one day — it&#8217;s really getting traction! You&#8217;re proud of its uptime record, thanks in part to the MongoDB replica set underneath it. But now … something&#8217;s wrong. Users are complaining that some of their data has gone missing. Others are noticing stuff they deleted has [...]]]></description>
				<content:encoded><![CDATA[<p>So you&#8217;re checking in on your latest awesome application one day — it&#8217;s really getting traction! You&#8217;re proud of its uptime record, thanks in part to the MongoDB replica set underneath it. But now … something&#8217;s wrong. Users are complaining that some of their data has gone missing. Others are noticing stuff they deleted has suddenly reappeared. What&#8217;s going on?!?</p>
<p>Don&#8217;t worry&#8230; we&#8217;ll get to the bottom of this! In doing so, we&#8217;ll examine a source of risk that&#8217;s easy to overlook in a MongoDB application: <em>replication lag —</em> what it means, why it happens, and what you can do about it.</p>
<p><span id="more-1207"></span></p>
<p>Continuing this cautionary tale&#8230; Seriously, wtf?! You were doing everything right!</p>
<p>Using MongoDB with a well-designed schema and lovingly-tuned indexes, your application back-end has been handling thousands of transactions per second without breaking a sweat. You&#8217;ve got multiple nodes arranged in a replica set with no single point of failure. Your application tier&#8217;s Mongo driver connections are aware of the replica set and can follow changes in the PRIMARY node during failover. All critical writes are “safe” writes. Your app has been up without interruption for almost six months now! <em>How could this have happened?</em></p>
<p>This unsettling situation has the hallmarks of an insidious foe in the realm of high-availability data stewardship: <strong>unchecked replication lag</strong>.</p>
<p>Closely monitoring a MongoDB replica set for replication lag is critical.</p>
<h2>What is replication lag?</h2>
<p>As you probably know, like many data stores MongoDB relies on <em>replication</em> — making redundant copies of data — to meet design goals around availability.</p>
<p><a href="http://en.wikipedia.org/wiki/The_Facts_of_Life_%28TV_series%29" target="_blank"><img class="alignright size-medium wp-image-1281" alt="The Facts of Life" src="http://blog.mongolab.com/wp-content/uploads/2013/03/titlecard12-ytv-wherearetheynow-factsoflife-jpg_203702-240x300.jpg" width="240" height="300" /></a>In a perfect world, data replication would be instantaneous; but in reality, thanks to pesky laws of physics, some delay is inevitable — it&#8217;s a <strong>fact of life</strong>. We need to be able to reason about how it affects us so as to manage around the phenomenon appropriately. Let&#8217;s start with definitions&#8230;</p>
<p>For a given secondary node, <strong><em>replication lag</em></strong> is the delay between the time an operation occurs on the primary and the time that same operation gets applied on the secondary.</p>
<p>For the replica set as a whole, replication lag is (for most purposes) the smallest replication lag found among all its secondary nodes.</p>
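<p>To make the two definitions concrete, here is a small sketch that works on data shaped like the <code>members</code> array returned by <code>rs.status()</code>. Only the <code>name</code>, <code>stateStr</code> and <code>optimeDate</code> fields are used; the function names, host names and timestamps are made up for illustration.</p>
<pre class="brush: jscript; title: ; notranslate">
// Lag of each secondary: primary's last op time minus the secondary's.
function lagSecondsBySecondary(members) {
  var primary = members.filter(function (m) {
    return m.stateStr === 'PRIMARY';
  })[0];
  var lags = {};
  members.forEach(function (m) {
    if (m.stateStr === 'SECONDARY') {
      lags[m.name] = (primary.optimeDate - m.optimeDate) / 1000;
    }
  });
  return lags;
}

// Lag of the set as a whole: the smallest lag among its secondaries.
function replicaSetLagSeconds(members) {
  var lags = lagSecondsBySecondary(members);
  var values = Object.keys(lags).map(function (name) { return lags[name]; });
  return Math.min.apply(null, values);
}

// Made-up sample data.
var members = [
  { name: 'a0.example.com:27017', stateStr: 'PRIMARY',
    optimeDate: new Date('2013-03-05T15:48:19Z') },
  { name: 'a1.example.com:27017', stateStr: 'SECONDARY',
    optimeDate: new Date('2013-03-05T15:48:17Z') },
  { name: 'a2.example.com:27017', stateStr: 'SECONDARY',
    optimeDate: new Date('2013-03-05T15:47:19Z') }
];
console.log(lagSecondsBySecondary(members));
console.log(replicaSetLagSeconds(members));
</pre>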
<p>In a smoothly running replica set, all secondaries closely follow changes on the primary, fetching each group of operations from its <a href="http://docs.mongodb.org/manual/reference/glossary/#term-oplog" target="_blank">oplog</a> and replaying them approximately as fast as they occur. That is, replication lag remains as close to zero as possible. Reads from any node are then reasonably consistent; and, should the current primary become unavailable, the secondary that assumes the PRIMARY role will be able to serve to clients a dataset that is almost identical to the original.</p>
<p>For a variety of reasons, however, secondaries may fall behind. Sometimes elevated replication lag is transient and will remedy itself without intervention. Other times, replication lag remains high or continues to rise, indicating a systemic problem that needs to be addressed. In either case, the larger the replication lag grows and the longer it remains that way, the more exposure your database has to the associated risks.</p>
<h2>Why is lag problematic?</h2>
<p>Significant replication lag creates failure modes that can be problematic for a MongoDB database deployment that is meant to be highly available.  Here&#8217;s why:</p>
<ul>
<li>If your replica set fails over to a secondary that is significantly behind the primary, a lot of un-replicated data may be on the original primary that will need to be manually reconciled. This will be painful or impossible if the original primary is unrecoverable.</li>
</ul>
<ul>
<li>If the failed primary cannot be recovered quickly, you may be forced to run on a node whose data is not up-to-date, or forced to take down your database altogether until the primary can be recovered.</li>
</ul>
<ul>
<li>If you have only one secondary, and it falls farther behind than the earliest history retained in the primary&#8217;s oplog, your secondary will require a full resynchronization from the primary.
<ul>
<li>During the resync, your cluster will lack the redundancy of a valid secondary; the cluster will not return to high availability until the entire data set is copied.</li>
<li>If you only take backups from your secondary (which we highly recommend), backups must be suspended for the duration of the resync.</li>
</ul>
</li>
</ul>
<ul>
<li>Replication lag makes it more likely that results of any read operations distributed across secondaries will be inconsistent.</li>
</ul>
<ul>
<li>A “safe” write with &#8216;w&#8217; &gt; 1 — i.e., requiring multiple nodes acknowledge the write before it returns — will incur latency proportional to the current replication lag, and/or may time out.</li>
</ul>
<p>Strictly speaking, the problem of replication lag is distinct from the problem of data durability. But as the last point above regarding multi-node write concern illustrates, the two concepts are most certainly linked. Data that has not yet been replicated is not completely protected from single-node failure; and client writes specified to be safe from single-node failure must block until replication catches up to them.</p>
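<p>The full-resync risk in the list above boils down to one comparison: has the secondary fallen behind the oldest entry still held in the primary&#8217;s oplog? A simplified sketch, with a hypothetical function name and made-up timestamps:</p>
<pre class="brush: jscript; title: ; notranslate">
// Hypothetical check: a secondary can only catch up if the ops it still
// needs are in the primary's oplog, i.e. its syncedTo time is not older
// than the oplog's first event time.
function needsFullResync(oplogFirstEventTime, secondarySyncedTo) {
  return oplogFirstEventTime.getTime() > secondarySyncedTo.getTime();
}

var oplogFirst = new Date('2013-03-05T14:15:19Z');
// Secondary is inside the oplog window.
console.log(needsFullResync(oplogFirst, new Date('2013-03-05T15:48:19Z')));
// Secondary has fallen off the end of the oplog.
console.log(needsFullResync(oplogFirst, new Date('2013-03-05T13:00:00Z')));
</pre>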
<h2>What causes a secondary to fall behind?</h2>
<p>In general, a secondary falls behind on replication any time it cannot keep up with the rate at which the primary is writing data. Some common causes:</p>
<h4>Secondary is weak</h4>
<p>To have the best chance of keeping up, a secondary host should match the primary host&#8217;s specs for CPU, disk IOPS, and network I/O. If it&#8217;s outmatched by the primary on any of these specs, a secondary may fall behind during periods of sustained write activity. Depending on load this will, at best, create brief excursions in replication lag and, at worst, cause the secondary to fall irretrievably behind.</p>
<h4>Bursty writes</h4>
<p>In the wake of a burst of write activity on the primary, a secondary may not be able to fetch and apply the ops quickly enough. If the secondary is underpowered, this effect can be quite dramatic. But even when the nodes have evenly matched specs, such a situation is possible. For example, a command like:</p>
<pre class="brush: plain; title: ; notranslate"> db.coll.update({x: 7}, {$set: {y: 42}}, {multi: true})</pre>
<p>can place an untold number of separate “update” ops in the primary&#8217;s oplog. To keep up, a secondary must fetch those ops (max 4MB at a time for each <code>getMore</code> command!), read into RAM any index and data pages necessary to satisfy each <code>_id</code> lookup (remember: each oplog entry references a single target document by <code>_id</code>; the original query about “x” is never directly reflected in the oplog), and finally perform the update op, altering the document and placing the corresponding entry into its oplog; and it must do all this in the same amount of time that the primary does merely the last step. Multiplied by a large enough number of ops, that disparity can amount to a noticeable lag.</p>
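<p>Here is a toy, in-memory model of that fan-out. The entry shape is loosely based on oplog update entries and is an assumption for illustration, not a faithful copy of the server format; <code>expandMultiUpdate</code> is a hypothetical name.</p>
<pre class="brush: jscript; title: ; notranslate">
// One multi-update becomes one entry per matching document, each keyed
// by _id. The original query on "x" does not appear in any entry.
function expandMultiUpdate(docs, matches, change) {
  return docs.filter(matches).map(function (doc) {
    return { op: 'u', o2: { _id: doc._id }, o: change };
  });
}

var docs = [
  { _id: 1, x: 7 },
  { _id: 2, x: 3 },
  { _id: 3, x: 7 }
];
var entries = expandMultiUpdate(
  docs,
  function (doc) { return doc.x === 7; },
  { $set: { y: 42 } }
);
console.log(entries.length);
console.log(JSON.stringify(entries[0]));
</pre>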
<h4>Map/reduce output</h4>
<p>A specific type of the extreme write burst scenario might be a command like:</p>
<pre class="brush: plain; title: ; notranslate"> db.coll.mapReduce( ... { out: other_coll ... })</pre>
<p>From the point of view of the oplog, the entire output collection basically materializes at once, from which point the replication to the secondary plays out as above.</p>
<h4>Index build</h4>
<p>It may surprise you to learn that, even if you build an index in the background on the primary, it will be built in the foreground on each secondary. There is currently no way to build indexes in the background on secondary nodes (cf. <a href="https://jira.mongodb.org/browse/SERVER-2771" target="_blank">SERVER-2771</a>). Therefore, whenever a secondary builds an index, it will <strong>block all other operations</strong>, including replication, for the duration. If the index builds quickly, this may not be a problem; but long-running index builds can swiftly manifest as significant replication lag.</p>
<h4>Secondary is locked for backup</h4>
<p>One of the <a href="http://docs.mongodb.org/manual/administration/backups/#replica-set-backups" target="_blank">suggested methods for backing up data in a replica set</a> involves explicitly locking a secondary against changes while the backup is taken. Assuming the primary is still conducting business as usual, of course replication lag will climb until the backup is complete and the lock is released.</p>
<h4>Secondary is offline</h4>
<p>Similarly, if the secondary is not running or cannot reach the primary for whatever reason, it cannot make progress against the replication backlog. When it rejoins the replica set, the replication lag will naturally reflect the time spent away.</p>
<h2>How do I measure lag?</h2>
<h4>Run the <tt>db.printSlaveReplicationInfo()</tt> command</h4>
<p>To determine the current replication lag of your replica set, you can use the <a href="http://docs.mongodb.org/manual/mongo/" target="_blank"><code>mongo</code> shell</a> and run the <code><b>db.printSlaveReplicationInfo()</b></code> command.</p>
<pre class="brush: jscript; title: ; notranslate">
rs-ds046297:PRIMARY&gt; db.printSlaveReplicationInfo()

source: ds046297-a1.mongolab.com:46297
syncedTo: Tue Mar 05 2013 07:48:19 GMT-0800 (PST)
      = 7475 secs ago (2.08hrs)
source: ds046297-a2.mongolab.com:46297
syncedTo: Tue Mar 05 2013 07:48:19 GMT-0800 (PST)
      = 7475 secs ago (2.08hrs)
</pre>
<p>More than 2 hours — whoa, isn&#8217;t that a lot? Maybe!</p>
<p>See, those “syncedTo” times don&#8217;t have much to do with the clock on the wall; they&#8217;re just the timestamp on the last operation that the replica has copied over from the PRIMARY. If the last write operation on the PRIMARY happened 5 minutes ago, then yes: 2 hours is a lot. On the other hand, if the last op was 2.08 hours ago, then this is golden!</p>
<p>To fill in that missing piece of the story, we can use the <code><b>db.printReplicationInfo()</b></code> command.</p>
<pre class="brush: jscript; title: ; notranslate">
rs-ds046297:PRIMARY&gt; db.printReplicationInfo()

configured oplog size:   1024MB
log length start to end: 5589secs (1.55hrs)
oplog first event time:  Tue Mar 05 2013 06:15:19 GMT-0800 (PST)
oplog last event time:   Tue Mar 05 2013 07:48:19 GMT-0800 (PST)
now:                     Tue Mar 05 2013 09:53:07 GMT-0800 (PST)
</pre>
<p>Let&#8217;s see &#8230; PRIMARY&#8217;s “oplog last event time” – SECONDARY&#8217;s “syncedTo” = 0.0. Yay.</p>
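<p>Spelled out with the timestamps from the two outputs above, the arithmetic looks like this (a sketch; <code>secondsBetween</code> is just a hypothetical helper):</p>
<pre class="brush: jscript; title: ; notranslate">
function secondsBetween(later, earlier) {
  return (later.getTime() - earlier.getTime()) / 1000;
}

// Timestamps copied from the shell output above (PST is UTC-8).
var now = new Date('2013-03-05T09:53:07-08:00');
var oplogLastEvent = new Date('2013-03-05T07:48:19-08:00');
var syncedTo = new Date('2013-03-05T07:48:19-08:00');

// Time since the last replicated op: roughly 2.08 hours.
console.log(secondsBetween(now, syncedTo));
// Actual replication lag: last op on the primary minus syncedTo.
console.log(secondsBetween(oplogLastEvent, syncedTo));
</pre>
<p>The first figure only tells us how long ago the last replicated write happened; the second is the lag itself.</p>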
<p>As fun as that subtraction may be, it&#8217;s seldom called for. If there is a steady flow of write operations, the last op on the PRIMARY will usually have been quite recent. Thus, a figure like &#8220;2.08 hours&#8221; should probably raise eyebrows; you would expect to see a nice low number there instead — perhaps as high as a few seconds. And, having seen a low number, there would be no need to qualify its context with the second command.</p>
<h4>Examine the &#8220;repl lag&#8221; graph in MMS</h4>
<p>You can also view recent and historical replication lag using the <a href="http://www.10gen.com/products/mongodb-monitoring-service" target="_blank">MongoDB Monitoring Service</a> (MMS) from 10gen. On the Status tab of each SECONDARY node, you&#8217;ll find the <strong>repl lag</strong> graph:</p>
<p><a href="http://blog.mongolab.com/wp-content/uploads/2013/03/Screen-Shot-2013-03-10-at-12.13.22-PM.png"><img class="alignnone size-full wp-image-1208" alt="Screen Shot 2013-03-10 at 12.13.22 PM" src="http://blog.mongolab.com/wp-content/uploads/2013/03/Screen-Shot-2013-03-10-at-12.13.22-PM.png" width="396" height="238" /></a></p>
<h2>How do I monitor for lag?</h2>
<p>It is critical that the replication lag of your replica set(s) be monitored continuously.   Since you have to sleep occasionally, this is a job best done by robots.  It is essential that these robots be reliable, and that they notify you promptly whenever a replica set is lagging too far behind.</p>
<p>Here are a couple of ways you can make sure this is taken care of:</p>
<ul>
<li>If <a href="https://mongolab.com" target="_blank">MongoLab</a> is hosting your replica set, relax! For any multi-node, highly-available replica set we host for you, you can monitor replication lag in our UI and by default you will receive automated alerts whenever the replication lag exceeds 10 minutes.</li>
</ul>
<ul>
<li>You can also set up an alert using the MMS system. Its <a href="http://blog.10gen.com/post/41442945582/announcing-new-mms-alerts" target="_blank">exciting new features</a> allow you to configure a replication lag alert:</li>
</ul>
<p style="padding-left: 30px;"><a style="margin-left: 5px;" href="http://blog.mongolab.com/wp-content/uploads/2013/03/Screen-Shot-2013-03-10-at-1.18.56-PM.png"><img class="alignnone wp-image-1211" alt="Screen Shot 2013-03-10 at 1.18.56 PM" src="http://blog.mongolab.com/wp-content/uploads/2013/03/Screen-Shot-2013-03-10-at-1.18.56-PM.png" width="573" height="369" /></a></p>
<h2>What can I do to minimize lag?</h2>
<p>Out of courtesy (for them or for ourselves), we would like to make those lag-monitoring automata&#8217;s lives as boring as possible. Here are some tips:</p>
<h3>Tip #1: Make sure your secondary has enough horsepower</h3>
<p>It&#8217;s not uncommon for people to run under-powered secondaries to save money — this can be fine if the write load is light. But in scenarios where the write load is heavy, the secondary might not be able to keep up with the primary. To avoid this, you should beef up your secondary so that it&#8217;s as powerful as your primary.</p>
<p>Specifically, a SECONDARY node should have enough network bandwidth that it can retrieve ops from the PRIMARY&#8217;s oplog at roughly the rate they&#8217;re created and also enough storage throughput that it can apply the ops — i.e., read any affected documents and their index entries into RAM, and commit the altered documents back to disk — at that same rate. CPU rarely becomes a bottleneck, but it may need to be considered if there are many index keys to compute and insert for the documents that are being added or changed.</p>
<h3>Tip #2: Consider adjusting your write concern</h3>
<p>Your secondary may be lagging simply because your primary&#8217;s oplog is filling up faster than it can be replicated. Even with an equally-brawny SECONDARY node, the PRIMARY will always be capable of depositing 4MB in its memory-mapped oplog in a fraction of the time those same 4MB will need to make it across a TCP/IP connection.</p>
<p>One viable way to apply some back-pressure to the primary might be to adjust your <a href="http://docs.mongodb.org/manual/core/write-operations/#write-concern" target="_blank">write concern</a>.</p>
<p>If you are currently using a write concern that does not acknowledge writes (aka “fire-and-forget” mode), you can change your write concern to require an acknowledgement from the primary (<code>w:1</code>) and/or a write to the primary&#8217;s journal (<code>j:true</code>). Doing so will slow down the rate at which the concerned connection can generate new ops needing replication.</p>
<p>Other times it may be appropriate to use a &#8216;w&#8217; value greater than 1, or to set &#8216;w&#8217; to &#8220;<code>majority</code>&#8221;, to ensure that each write to the cluster is replicated to more than one node before the command returns. Requiring confirmation that a write has replicated to secondaries will effectively guarantee that those secondaries have caught up (at least up to the timestamp of this write) before the next command on the same connection can produce more ops in the backlog.</p>
<p>As previously alluded to, choosing the most appropriate write concern for the data durability requirements of your application — or for particular critical write operations within the application — is something you must give thought to irrespective of the replication lag issue we&#8217;re focusing on here. But you should be aware of the interrelationship: just as the durability guarantee of w&gt;1 can be used as a means of forcing a periodic “checkpoint” on replication, excessive replication lag can show up as a surprisingly high latency (or timeout) for that very occasional critical write operation where you&#8217;ve used “<code>w: majority</code>” to make sure it&#8217;s truly committed.</p>
<h5><strong><em>Adjust to taste</em></strong></h5>
<p>Having servers acknowledge every write can be a big hit to system throughput. If it makes sense for your application, you can amortize that penalty by doing inserts in batches, requiring acknowledgement only at the end of each batch. The smaller the batch, the greater the back-pressure on the PRIMARY&#8217;s data creation rate, and the greater the potential adverse impact on overall throughput.</p>
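<p>The batching idea can be sketched in a few lines of Python. This is only an illustration: <code>send_unacknowledged</code> and <code>send_acknowledged</code> are hypothetical callbacks standing in for whatever unacknowledged and acknowledged insert calls your driver provides, and the demo uses in-memory lists rather than a real connection.</p>

```python
from itertools import islice


def batches(docs, batch_size):
    """Yield successive lists of at most `batch_size` documents."""
    iterator = iter(docs)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        yield chunk


def insert_in_batches(docs, batch_size, send_unacknowledged, send_acknowledged):
    """Insert docs, requesting an acknowledgement only for the last doc of each batch.

    Both callbacks are hypothetical stand-ins for driver calls.
    Returns the number of acknowledgements requested.
    """
    acks = 0
    for chunk in batches(docs, batch_size):
        for doc in chunk[:-1]:
            send_unacknowledged(doc)
        # With a real driver, this is the call that would wait on the server.
        send_acknowledged(chunk[-1])
        acks += 1
    return acks


# Demo with in-memory stand-ins instead of a real connection.
fire_and_forget, acknowledged = [], []
ack_count = insert_in_batches(
    [{"n": i} for i in range(10)], 4, fire_and_forget.append, acknowledged.append
)
```

<p>Ten documents in batches of four produce three acknowledged writes instead of ten; shrinking <code>batch_size</code> raises that count and, with it, the back-pressure.</p>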
<h5><strong><em>Don&#8217;t overdo it</em></strong></h5>
<p>Using a large value for &#8216;w&#8217; can itself be problematic. It represents a demand that <em>w</em> nodes finish working through their existing backlog before the command returns. So, if replication lag is high (in the sense of there being a large volume of data waiting to copy over) when the write command is issued, the command will suffer a proportionally high latency. Also, if enough nodes go offline that &#8216;w&#8217; cannot be satisfied, every write issued with that write concern will block until the nodes come back (or until its <code>wtimeout</code> expires, if you set one); you have effectively locked up your database. This is basically the opposite of “high availability.”</p>
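<p>To make the availability trade-off concrete, here is a small Python sketch that checks whether a given &#8216;w&#8217; could be satisfied with the nodes currently reachable. It is a deliberately simplified model: the function names are hypothetical, and it ignores arbiters, hidden members and tag sets.</p>

```python
def majority(voting_members):
    """Smallest number of members that is more than half of the set."""
    return voting_members // 2 + 1


def write_can_complete(w, reachable_data_nodes, voting_members):
    """Return True if enough data-bearing nodes are reachable to satisfy `w`."""
    required = majority(voting_members) if w == "majority" else int(w)
    return reachable_data_nodes >= required


# Hypothetical three-member set with one secondary offline.
ok_with_majority = write_can_complete("majority", reachable_data_nodes=2, voting_members=3)
ok_with_w3 = write_can_complete(3, reachable_data_nodes=2, voting_members=3)
```

<p>In this example a &#8220;majority&#8221; write still completes with one node down, while a write demanding all three nodes cannot.</p>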
<h3>Tip #3: Plan for index builds</h3>
<p>As mentioned earlier, an index build on a secondary is a foreground, blocking operation. If you&#8217;re going to create an index that is sizeable, perhaps you can arrange to do it during a period of low write activity on the primary. Alternatively, if you have more than one secondary, you can follow the steps <a href="http://docs.mongodb.org/manual/administration/indexes/#index-building-replica-sets" target="_blank">here</a> to minimize the impact of building large indexes.</p>
<h3>Tip #4: Take backups without blocking</h3>
<p>Earlier we discussed the technique of locking the secondary to do a backup. There are alternatives to consider here, including filesystem snapshots and “point-in-time” backups using <a href="http://docs.mongodb.org/manual/reference/mongodump/#cmdoption-mongodump--oplog" target="_blank">the “<tt>--oplog</tt>” option of <tt>mongodump</tt></a> without locking. These are preferable to locking the secondary during a period of active writes if there&#8217;s any chance you&#8217;ll use the secondary for anything other than backups.</p>
<h3>Tip #5: Be sure capped collections have an <tt>_id</tt> field &amp; a unique index</h3>
<p>Reliable replication is not possible unless there is a unique index on the <code>_id</code> field. Before <a href="http://docs.mongodb.org/manual/release-notes/2.2/#id-indexes-capped-collections" target="_blank">MongoDB version 2.2</a>, <a href="http://docs.mongodb.org/manual/core/capped-collections/" target="_blank">capped collections</a> did not have an <code>_id</code> field or index by default. If you have a collection like this, you should create an index on the <code>_id</code> field, specifying <code>unique: true</code>. Failing to do this can, in certain situations, cause replication to <strong>halt entirely</strong>. So &#8230; this should not be regarded as optional.</p>
<h3>Tip #6: Check for replication errors</h3>
<p>If you see that replication lag is only increasing (and never falling), your replica set could be experiencing replication errors. To check for errors, run <code>rs.status()</code> and look at the <code>errmsg</code> field in the result. Additionally, check the log file of your secondary and look for error messages there.</p>
<p>One specific example: if you see “<tt>RS102 too stale to catch up</tt>” in the secondary&#8217;s <tt>mongodb.log</tt> or in the <code>errmsg</code> field when running <code>rs.status()</code>, it means that the secondary has fallen so far behind that there is not enough history retained by the primary (its “oplog size”) to bring it up to date. In this case, your secondary will require a full resynchronization from the primary.</p>
<p>In general, though, what you do in response to an error depends on the error. Sometimes you can simply restart the <tt>mongod</tt> process for your secondary; but the majority of the time you will need to understand the root cause of the error before you can fix the problem.</p>
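<p>If you want to automate the check described in this tip, the following Python sketch shows one way to summarize lag and errors from a document shaped like the output of <code>rs.status()</code>. The host names and timestamps in <code>sample_status</code> are made up for illustration, and the helper name <code>replication_report</code> is hypothetical; with a real deployment you would feed it the status document returned by your driver.</p>

```python
from datetime import datetime


def replication_report(status):
    """Return (name, lag_seconds, errmsg) for each non-primary member with an optime."""
    members = status["members"]
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    report = []
    for member in members:
        # Skip the primary itself and members that report no optime (e.g. arbiters).
        if member["stateStr"] == "PRIMARY" or "optimeDate" not in member:
            continue
        lag = (primary["optimeDate"] - member["optimeDate"]).total_seconds()
        report.append((member["name"], lag, member.get("errmsg")))
    return report


# Made-up status document using the field names that rs.status() reports.
sample_status = {
    "members": [
        {"name": "db1.example.com:27017", "stateStr": "PRIMARY",
         "optimeDate": datetime(2013, 3, 10, 12, 0, 0)},
        {"name": "db2.example.com:27017", "stateStr": "SECONDARY",
         "optimeDate": datetime(2013, 3, 10, 11, 58, 30)},
        {"name": "db3.example.com:27017", "stateStr": "RECOVERING",
         "optimeDate": datetime(2013, 3, 10, 10, 0, 0),
         "errmsg": "RS102 too stale to catch up"},
        {"name": "arb.example.com:27017", "stateStr": "ARBITER"},
    ]
}

report = replication_report(sample_status)
```

<p>Here the first secondary is 90 seconds behind with no error, while the second is two hours behind and reports the stale-secondary message discussed above.</p>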
<h2>Don&#8217;t let replication lag take you by surprise.</h2>
<p>At the end of the day, replication lag is just one more source of risk in any high-availability system that we need to understand and design around. Striking the right balance between performance and “safety” of write operations is an exercise in risk management, and the “right” balance will differ from one situation to the next. For an application on a tight budget with occasional spikes in write volume, for example, you might decide that a large replication lag in the wake of those spikes is acceptable given the goals of the application, and so an underpowered secondary makes sense. At the opposite extreme, for an application where every write is precious and sacred, the required “majority” write concern will mean you have essentially no tolerance for replication lag above the very minimum possible. The good news is that MongoDB makes this all very configurable, even on an operation-by-operation basis.</p>
<p>We hope this article has given you some insight into the phenomenon of replication lag that will enable you to reason about the risk it poses for a high-availability MongoDB application, and armed you with some tools for managing it. As always, <a href="mailto:support@mongolab.com" target="_blank">let us know if we can help</a>!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mongolab.com/2013/03/replication-lag-the-facts-of-life/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Object Modeling in Node.js with Mongoose</title>
		<link>http://blog.mongolab.com/2013/03/object-modeling-in-node-js-with-mongoose/</link>
		<comments>http://blog.mongolab.com/2013/03/object-modeling-in-node-js-with-mongoose/#comments</comments>
		<pubDate>Mon, 11 Mar 2013 19:02:41 +0000</pubDate>
		<dc:creator>angela</dc:creator>
				<category><![CDATA[education]]></category>
		<category><![CDATA[MongoDB]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[partnerships]]></category>

		<guid isPermaLink="false">http://blog.mongolab.com/?p=1214</guid>
		<description><![CDATA[Check it out! We&#8217;ve just updated our Heroku Dev Center tutorial on object modeling in Node.js using Mongoose, a MongoDB ODM library. Mongoose gives your collections structure and simplifies Node&#8217;s callback patterns to make using MongoDB with Node.js even easier. Learn more and download the sample Node.js app right here at the Heroku Dev Center.]]></description>
				<content:encoded><![CDATA[<p>Check it out! We&#8217;ve just updated our <a href="http://bit.ly/YeeCL3" target="_blank">Heroku Dev Center tutorial</a> on object modeling in Node.js using Mongoose, a MongoDB ODM library. Mongoose gives your collections structure and simplifies Node&#8217;s callback patterns to make using MongoDB with Node.js even easier.</p>
<p>Learn more and download the sample Node.js app right <a href="http://bit.ly/YeeCL3" target="_blank">here</a> at the Heroku Dev Center.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mongolab.com/2013/03/object-modeling-in-node-js-with-mongoose/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
