How to set up and optimize MongoDB in the public cloud

Long gone are the days when MongoDB was the ‘new kid on the block’. These days, it has evolved to become the go-to solution for many people who are leaving relational databases behind. In this article, I’m not going to make the argument for why you should use MongoDB over {insert other NoSQL database}, or even why you should use a NoSQL database over a relational database. I’m just going to assume that you’ve done your due diligence and decided that MongoDB is the best fit for you.

So, before we begin, let’s take a step back and look at MongoDB’s various components. We’ll keep things simple and stay away from advanced topics, such as sharding.

When you are rolling out MongoDB in production, you really should use something called a Replica Set. A replica set is MongoDB’s equivalent of a Master/Slave setup in the relational world, but in contrast, it is painless to set up, as everything is built in.

Planning your MongoDB Cloud Server Cluster

In order to set up a Replica Set, you need three nodes (or alternatively, two nodes and an arbiter). These nodes can run more or less any operating system you’d like. In this guide we’re going to use Ubuntu 14.04, but most of these tips apply to any Linux distribution. It is also important that you give the nodes the same amount of resources, because any of them can become the primary (which is MongoDB language for the ‘master’).

Since the whole point with a Replica Set is that the cluster should survive a single node going down, it would be rather pointless if all your servers resided on the same physical host. Fortunately, we offer something called availability groups. What this means is that you can instruct our system to group all three of your servers into different groups. By doing so, they will never reside on the same physical host. More information about this can be found here.

It is also important that you use a 64-bit version of Linux. The reason is simply that MongoDB doesn’t play well with 32-bit systems (more about that here).
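A quick way to confirm this before installing anything is to look at the machine architecture. The snippet below is only a sketch: the helper name `is_64bit_arch` is my own, and the list of architecture strings is an assumption that covers the common 64-bit values reported by `uname -m`, not an exhaustive list.

```bash
# Return success if the given `uname -m` value names a 64-bit architecture.
# The list of names is an assumption covering common cases, not exhaustive.
is_64bit_arch() {
  case "$1" in
    x86_64|amd64|aarch64|arm64|ppc64le|s390x) return 0 ;;
    *) return 1 ;;
  esac
}

if is_64bit_arch "$(uname -m)"; then
  echo "64-bit system detected"
else
  echo "This does not look like a 64-bit system"
fi
```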

Installing MongoDB in the Cloud

This section is pretty straightforward. Either use one of the pre-configured Ubuntu 14.04 images, or install it yourself.

The CPU, RAM and disk configuration depends entirely on your load. For a smallish installation, 4 GHz CPU, 4 GB RAM and a 40 GB disk (for the system) should be sufficient. When you attach your drives, make sure you are using VirtIO. If you use IDE, performance will suffer significantly. Also, since we’re creating a Replica Set, we need all nodes (and app servers) to be on the same VLAN.

Contrary to many other cloud vendors, there’s no need to configure your storage with RAID10 or similar to improve performance. Since we don’t use any magnetic disks (only SSD), you’ll get amazing performance out-of-the-box.

We do, however, recommend that you keep your MongoDB data on a separate drive. The reason is simply that we need to make some file system optimizations that you wouldn’t want to apply to your entire file system.

With this in mind, it’s easiest to just add this drive after you’ve set up your servers. For now, just focus on the system installation. If you’re installing yourself (instead of using the pre-configured systems), I’d recommend that you press F4 in the boot menu and select ‘Install a minimal virtual machine’.

After the installation, you can attach your data drive. The right size depends heavily on your usage, but for a small system, 20 GB should probably be sufficient. However, since it’s sometimes hard to predict how much data you will store, we will use LVM. This will allow us to simply add another drive later on and expand the volume without having to start over. Alternatively, you can use a single drive and scale it up later with resize2fs.

With the drive attached to the system (where it will appear as /dev/vdb), we’re going to set it up in an optimal fashion. First we need to partition the drive. Use your tool of choice; I prefer fdisk. With fdisk, you’d simply run:

[bash] $ sudo fdisk /dev/vdb
[/bash]

Navigating within fdisk is pretty straightforward too:

Command (m for help): n <enter>
Select (default p): <enter>
Partition number (1-4, default 1): <enter>
First sector (x-y, default x): <enter>
Last sector, +sectors or +size{K,M,G} (x-y, default y): <enter>
Command (m for help): w

With the disk partitioned, we need to create the LVM pool. This is pretty straightforward. Replace ‘N’ in the ‘lvcreate’ command with the size of your disk in gigabytes.

[bash] $ sudo pvcreate /dev/vdb1
$ sudo vgcreate mongodb /dev/vdb1
$ sudo lvcreate -n db -L Ng mongodb
[/bash]

After running the above commands, you will have a new device called /dev/mongodb/db. Now we need to format this device with ext4 (don’t use ext3, since it’s much slower). To do this, simply run:

[bash] $ sudo mkfs.ext4 /dev/mongodb/db
[/bash]
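Because the volume lives on LVM, growing it later is a matter of adding a second drive to the volume group and resizing. The sketch below shows what that could look like, assuming the new drive has been partitioned the same way and shows up as /dev/vdc1 (a hypothetical name). The `run` and `extend_mongodb_volume` helpers are my own, and the script only prints the commands unless you set DRY_RUN=0.

```bash
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 to actually run them (requires root and a real second drive).
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Grow the 'db' logical volume in the 'mongodb' volume group using a new partition.
extend_mongodb_volume() {
  new_part=$1                                       # e.g. /dev/vdc1 (hypothetical)
  run sudo pvcreate "$new_part"                     # register the partition with LVM
  run sudo vgextend mongodb "$new_part"             # add it to the volume group
  run sudo lvextend -l +100%FREE /dev/mongodb/db    # hand all free space to the volume
  run sudo resize2fs /dev/mongodb/db                # grow the ext4 file system to match
}

extend_mongodb_volume /dev/vdc1
```

Printing first and executing second is a deliberate choice here, since a typo in a device name is not something you want to discover on a production node.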

Now, the only thing left is to create the mount point, add the disk to fstab, and finally mount the disk. Here we go:

[bash] $ sudo mkdir /mongodb
$ echo -e '/dev/mongodb/db\t/mongodb\text4\tdefaults,auto,noatime,noexec,nodiratime\t0\t0' | sudo tee -a /etc/fstab
$ sudo mount /mongodb
[/bash]
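An fstab entry is six whitespace-separated fields: device, mount point, file system type, options, dump flag and fsck order. If you’d rather not rely on escape sequences in echo, the same entry can be built with printf. This is just a sketch, and the `fstab_entry` helper name is my own.

```bash
# Build the fstab entry with printf so each tab separator is explicit.
fstab_entry() {
  printf '%s\t%s\t%s\t%s\t%s\t%s\n' \
    /dev/mongodb/db /mongodb ext4 \
    defaults,auto,noatime,noexec,nodiratime 0 0
}

# Show the entry and confirm it has the six fields fstab expects.
fstab_entry
fstab_entry | awk -F'\t' '{ print "fields:", NF }'
```

To apply it, pipe the output into `sudo tee -a /etc/fstab`. Once the drive is mounted, `df -h /mongodb` should list the new volume.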

Installing MongoDB

With the system prepared, let’s move on to installing MongoDB. While Ubuntu does offer a version of MongoDB in their own repository, we recommend that you instead use the official MongoDB version. The Ubuntu repository is pretty far behind in releases, so if you want to get the most out of MongoDB, you’ll have to turn to the official releases.

Since MongoDB offer their own repository, we can simply add this to our system and then install MongoDB as normal:

[bash] $ sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
$ echo "deb http://repo.mongodb.org/apt/ubuntu "$(lsb_release -sc)"/mongodb-org/3.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.0.list
$ sudo apt-get update
$ sudo apt-get install -y mongodb-org
[/bash]

Assuming you didn’t run into any problems with the above commands, you should now have MongoDB installed on your system. Now we need to configure it to store the data on the drive we created above.

[bash] $ sudo service mongod stop
$ sudo mkdir /mongodb/data
$ sudo chown -R mongodb:mongodb /mongodb/data
$ sudo sed -i 's|dbpath=/var/lib/mongodb|dbpath=/mongodb/data|g' /etc/mongod.conf
$ sudo service mongod start
[/bash]

You should now have MongoDB up and running, with the data being on the drive. If you are expecting heavy load and/or a lot of connections, you may need to raise the ulimit values.
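To get a feel for where you stand, you can compare the current limit with a target value. The sketch below checks the open-files limit of the current shell. The `check_limit` helper is my own, and 64000 is an assumed target that you should adjust to your workload.

```bash
# Compare a current limit against a recommended minimum.
# Prints a one-line verdict; "unlimited" always passes.
check_limit() {
  name=$1
  current=$2
  recommended=$3
  if [ "$current" = "unlimited" ] || [ "$current" -ge "$recommended" ]; then
    echo "$name: $current is OK"
  else
    echo "$name: $current is below $recommended"
  fi
}

# 64000 is an assumed target; adjust it to your workload.
check_limit "open files" "$(ulimit -n)" 64000
```

Keep in mind that this reports the limit of your login shell, not of the running database process. On Linux, the limits that actually apply to the daemon can be read from /proc/PID/limits, where PID is the process ID of the MongoDB server.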

If you want to gain more insight into your data, you might also want to sign up for MongoDB’s MMS, which is a free cloud-based monitoring service for MongoDB.

Creating the Replica Set for your MongoDB Cloud

Before we can start setting up the replica set, we need to make sure all three nodes can communicate with each other on the internal network. Setting this up varies depending on distribution and the structure of your network. The instructions for doing this on Ubuntu can be found here.

I’m just going to assume that you’ve set up your network properly for now. For simplicity, let’s assume you named the servers mongo0, mongo1 and mongo2. I’m also going to assume that you either have a DNS set up, or have added the entries to /etc/hosts, such that they resolve and ping each other by name.
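If you go the /etc/hosts route, the entries could look something like the output of the sketch below. The 10.0.0.x addresses are placeholders that I made up for illustration, so substitute the internal IPs of your own servers. The `hosts_entries` helper name is also my own.

```bash
# Print /etc/hosts lines for the three nodes.
# The 10.0.0.x addresses are placeholders; substitute your own internal IPs.
hosts_entries() {
  printf '%s\t%s\n' \
    10.0.0.10 mongo0 \
    10.0.0.11 mongo1 \
    10.0.0.12 mongo2
}

hosts_entries
# To apply on each node: hosts_entries | sudo tee -a /etc/hosts
```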

If you’ve activated the firewall (which you really should), make sure that the nodes can send and receive TCP traffic on ports 27017 and 28017 on the internal interface.
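One way to verify the firewall rules is to try opening a TCP connection to each port from every node. The sketch below does that with bash’s built-in /dev/tcp pseudo-device and the coreutils `timeout` command. The helper names are my own, and the host list assumes the mongo0, mongo1 and mongo2 names used in this guide.

```bash
# Return success if a TCP connection to host:port can be opened within 2 seconds.
# Relies on bash's /dev/tcp pseudo-device and the coreutils `timeout` command.
port_open() {
  timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Check both MongoDB ports on every node in the (assumed) host list.
check_cluster_ports() {
  for host in mongo0 mongo1 mongo2; do
    for port in 27017 28017; do
      if port_open "$host" "$port"; then
        echo "$host:$port open"
      else
        echo "$host:$port unreachable"
      fi
    done
  done
}

# Usage on each node:
#   check_cluster_ports
```

A port is also reported as unreachable when nothing is listening on it, so run this once MongoDB is up. As far as I know, 28017 only answers when MongoDB’s HTTP status interface is enabled, so an unreachable 28017 does not necessarily point to a firewall problem.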

With the above things sorted, we now need to configure MongoDB to be used as a Replica Set. There are two pieces to this. First, we need to tell MongoDB that it is part of a Replica Set (and provide a name), and second, initiate the Replica Set.

Let’s start by enabling the Replica Set feature, and call our replica set ‘MyReplSet’:

[bash] $ echo -e 'replSet = MyReplSet' | sudo tee -a /etc/mongod.conf
$ sudo service mongod restart
[/bash]
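Since a typo in the set name on one node is easy to miss, it can be worth reading the value back from the config file on each node. The sketch below assumes the ini-style `replSet = name` line used above. The `replset_name` helper is my own, and the demonstration runs against a throwaway file rather than your real config.

```bash
# Print the replica set name configured in an ini-style MongoDB config file.
replset_name() {
  sed -n 's/^[[:space:]]*replSet[[:space:]]*=[[:space:]]*//p' "$1"
}

# Demonstrate on a throwaway file; on a real node, pass the path of
# your MongoDB config file instead.
conf=$(mktemp)
echo 'replSet = MyReplSet' > "$conf"
replset_name "$conf"
rm -f "$conf"
```

All three nodes should print the same name.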

MongoDB is now aware that it is part of a Replica Set. This needs to be done on all nodes in the cluster.

With that done, we need to initiate the Replica Set within the interpreter. You can do this from any node, but in this example, we will do it from ‘mongo0’.

[bash] $ mongo
[/bash]

Within the interpreter, we need to initiate the Replica Set and then add the two other nodes to it:

> rs.initiate()
> rs.add('mongo1')
> rs.add('mongo2')

You can then monitor the status using the command rs.status().
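The full rs.status() output is fairly verbose. If all you want is a quick overview from the command line, something like the sketch below could do. It assumes the `mongo` shell is on your PATH and that the Replica Set has already been initiated. The `replset_members` helper name is my own.

```bash
# Print each replica set member with its current state, one per line.
# Requires the `mongo` shell on the PATH and a running, initiated replica set.
replset_members() {
  mongo --quiet --eval \
    'rs.status().members.forEach(function (m) { print(m.name + " " + m.stateStr); })'
}

# Usage on any node:
#   replset_members
```

In a healthy three-node set you would expect to see one member in the PRIMARY state and two in the SECONDARY state.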

That’s really it. You should now be up and running with your MongoDB cluster on our blazing fast cloud.

About Viktor Petersson

Former VP of Business Development at CloudSigma. Currently CEO at WireLoad and busy making a dent in the Digital Signage industry with Screenly. Viktor is a proud geek and loves playing with the latest technologies.

How to setup & optimise MongoDB on public cloud servers - March 24, 2015