How big is your MongoDB?

Update 11/10/14: The next blog post in this series is on managing disk space in MongoDB.

As your MongoDB grows in size, information from the db.stats() diagnostic command (or the database “Stats” tab in our management portal) becomes increasingly helpful for evaluating hardware requirements.

We frequently get questions about the dataSize, storageSize and fileSize metrics, so we want to help developers better understand how MongoDB storage works and what these particular metrics mean.

MongoDB storage structure basics

First, we’ll go over the basics of how MongoDB stores your data.

Data files

Every MongoDB instance consists of a namespace file, journal files and data files. For this discussion we'll focus only on data files, since that is where all of the data and indexes for your database reside.

Data files store BSON documents, indexes, and MongoDB-generated metadata in structures called extents. Each data file is made up of multiple extents.


Extents are logical containers within data files used to store documents and indexes.

[Figure: data files and extents]

The above diagram illustrates the relationship between data files and extents. Note:

  • Data and indexes are each contained in their own sets of extents; no extent will ever contain content for more than one collection
  • Data and indexes are never contained within the same extent
  • The data and indexes for a collection will usually span multiple extents
  • When a new extent is needed, MongoDB first tries to use available space within the current data files. If it cannot find enough space, it creates new data files.
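To make the last point concrete, here is a toy sketch of that "reuse space first, otherwise add a file" decision. It is not MongoDB's actual allocator: the function name `allocateExtent`, the `size`/`used` fields and all the numbers are made up for illustration.

```javascript
// Toy model only: NOT MongoDB's real allocation logic.
// Each data file is { size, used }, in arbitrary units (hypothetical shape).
function allocateExtent(dataFiles, extentSize, newFileSize) {
  // First, look for enough free space in an existing data file.
  for (const file of dataFiles) {
    if (file.size - file.used >= extentSize) {
      file.used += extentSize;
      return file;
    }
  }
  // No existing file has room: create a new data file for the extent.
  const file = { size: Math.max(newFileSize, extentSize), used: extentSize };
  dataFiles.push(file);
  return file;
}

const toyFiles = [{ size: 64, used: 60 }];
allocateExtent(toyFiles, 4, 128);  // fits in the remaining space of the first file
allocateExtent(toyFiles, 16, 128); // no room left, so a second file is created
console.log(toyFiles);
```

Note that the second data file is mostly empty after the call. That kind of allocated-but-unused file space is what separates the metrics discussed in the rest of this post.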

Metrics from db.stats()

Now that we understand the basics of how MongoDB storage is organized, we can explore metrics commonly examined with db.stats(): dataSize, storageSize and fileSize.
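As a quick sketch of how you might pull these three numbers out and read them in megabytes, the snippet below works on a plain object shaped like the db.stats() result. The helper name `sizesInMB` and the sample values are our own inventions for illustration; in the mongo shell you would pass the real `db.stats()` result instead.

```javascript
// Hypothetical helper: convert the three size metrics from bytes to megabytes.
// `stats` is assumed to carry dataSize, storageSize and fileSize in bytes,
// as described in this post.
function sizesInMB(stats) {
  const MB = 1024 * 1024;
  const toMB = (bytes) => Math.round((bytes / MB) * 100) / 100;
  return {
    dataSizeMB: toMB(stats.dataSize),
    storageSizeMB: toMB(stats.storageSize),
    fileSizeMB: toMB(stats.fileSize),
  };
}

// Made-up sample values, not output from a real database.
const sampleStats = {
  dataSize: 52428800,    // 50 MB
  storageSize: 73400320, // 70 MB
  fileSize: 201326592,   // 192 MB
};

console.log(sizesInMB(sampleStats));
```

db.stats() also accepts a scale argument, for example `db.stats(1024 * 1024)`, if you would rather have the shell do the conversion.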


[Figure: MongoDB dbStats dataSize]

The dataSize metric is the sum of the sizes (in bytes) of all the documents and padding stored in the database.

While dataSize does decrease when you delete documents, dataSize does not decrease when documents shrink because the space used by the original document has already been allocated (to that particular document) and cannot be used by other documents.

Similarly, if a user updates a document with more data, dataSize will remain the same as long as the new document fits within its originally allocated, padded space.
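The behavior described above can be sketched with a toy model in which every document owns a padded slot and dataSize is the sum of those slots. The class name `ToyCollection`, the padding values and the simplified "move" case are assumptions made for illustration, not MongoDB internals.

```javascript
// Toy model of padded records; sizes are in bytes and purely illustrative.
class ToyCollection {
  constructor() {
    this.records = new Map();
  }

  insert(id, docSize, padding) {
    // The allocated slot covers the document plus its padding.
    this.records.set(id, { docSize, allocated: docSize + padding });
  }

  update(id, newDocSize) {
    const rec = this.records.get(id);
    rec.docSize = newDocSize;
    if (newDocSize > rec.allocated) {
      // Too big for its slot: model a move to a new, larger slot
      // (assumed here to have no extra padding).
      rec.allocated = newDocSize;
    }
    // Otherwise the document still fits, so the slot is unchanged.
  }

  remove(id) {
    this.records.delete(id);
  }

  // dataSize counts the allocated slots (documents plus padding).
  dataSize() {
    let total = 0;
    for (const rec of this.records.values()) total += rec.allocated;
    return total;
  }
}

const coll = new ToyCollection();
coll.insert("a", 100, 20);
coll.insert("b", 200, 40);
console.log(coll.dataSize()); // 360
coll.update("a", 50);         // document shrinks, slot stays allocated
console.log(coll.dataSize()); // 360
coll.update("a", 115);        // document grows but still fits its padded slot
console.log(coll.dataSize()); // 360
coll.remove("b");             // deleting a document does reduce dataSize
console.log(coll.dataSize()); // 120
```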


[Figure: MongoDB dbStats storageSize]

The storageSize metric is equal to the size (in bytes) of all the data extents in the database. This number is larger than dataSize because it includes yet-unused space (in data extents) and space vacated by deleted or moved documents within extents.

The storageSize does not decrease as you remove or shrink documents.


[Figure: MongoDB dbStats fileSize]

The fileSize metric is equal to the size (in bytes) of all the data extents, index extents and yet-unused space (in data files) in the database. This metric represents the storage footprint of your database on disk. fileSize is larger than storageSize because it includes index extents and yet-unused space in data files.

While fileSize does decrease when you delete a database, fileSize does not decrease as you remove collections, documents or indexes.
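Putting the three definitions together, you can estimate where the unused space sits. The sketch below assumes that the indexSize field reported by db.stats() corresponds to the index extents mentioned above. The function name `unusedSpace` and the sample numbers are hypothetical.

```javascript
// Hypothetical helper: estimate unused space from db.stats()-style fields (bytes).
// Assumption: indexSize equals the total size of the index extents.
function unusedSpace(stats) {
  return {
    // Space inside data extents not currently occupied by documents and padding.
    unusedInDataExtents: stats.storageSize - stats.dataSize,
    // Space inside data files not yet assigned to any data or index extent.
    unusedInDataFiles: stats.fileSize - stats.storageSize - stats.indexSize,
  };
}

// Made-up sample values, not output from a real database.
const exampleStats = {
  dataSize: 52428800,    // 50 MB
  storageSize: 73400320, // 70 MB
  indexSize: 10485760,   // 10 MB
  fileSize: 201326592,   // 192 MB
};

console.log(unusedSpace(exampleStats));
// { unusedInDataExtents: 20971520, unusedInDataFiles: 117440512 }
```

With these sample numbers, 20 MB is free inside data extents and 112 MB of file space is not yet assigned to any extent.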

What now?

That’s it! The next time someone asks you how big your database is, you’ll know what to tell them.

  • lyon

    Great post!

  • tarunjaiswal

    Great explanation, Chris!

  • Raji

    For two sample collections, one with 1K of data plus 120 bytes of index and one with 64K of data plus 256 bytes of index, what will the storage size and file size be? What is the best way to calculate these?

  • Kajsa Anderson

    So if I want to see how much space has been allocated but is not used, do I want fileSize – dataSize – indexSize?

  • André Badenhorst

    Just a quick question. Which of the dataSize, storageSize or fileSize is considered for the 500MB limitation on the MongoLab Sandbox option?

    • Chris Chang

      file size

      • André Badenhorst

        Amazingly quick response, thank you! It appears, with the creation of the MongoLab Sandbox db, that the file size is already set to 580MB, probably reserved and limited to that size. So I guess the storageSize will now be the one to watch.

  • Nico Vazquez

    Our game is growing. Of the 2.2 million users who have installed the app, only 50,000 are active and play daily. The other users have not connected in more than three months, and we no longer regard them as active. But even inactive users could connect at any time, and they must be able to recover their last session state and automatically become active users again.

    Today we run a MongoDB replica set with 7 GB of RAM, which is very costly for the number of active users the game has.

    We want to reduce the cost of the database while still running the game properly, with the same response times as on the current server.

    We know that if we move the database as it is to a cheaper server, query response times will increase because of the number of indexes and records it holds.

    Sorry for the wording but I’m Spanish. Any recommendation?