So far we’ve looked at securing the network and securing access to cloud infrastructure. In part 3 we look at how to ensure your data storage is robust, secure and kept private in IaaS clouds. Keeping data private and secure is a key concern for many looking to move to a public cloud. It’s important to separate the real dangers, and how to avoid them, from natural psychological reactions to moving data away from in-house provision.
We see data storage in the cloud breaking down into three distinct areas: keeping data private and secure, vendor transparency, and data portability.
Each area is critical to understanding the issues surrounding data in the cloud, and customers must appreciate all three in order to form meaningful data handling policies that fit their specific needs.
Fundamentally, when moving to a public cloud there are two big changes for customers and their data. Firstly, the data will be stored remotely from the customer’s location; this can have legal implications which we unpack a bit more closely later in this post. Secondly, the data is usually moving from a single tenant to a multi-tenant environment and that’s where the problem of data leakage comes in.
Data leakage is simply the unintended movement of data from one customer to another. Each user in the cloud should only have access to their own data, never the data of others. We’ve already looked at cloud networking and seen how this is achieved securely through traffic separation and by giving customers the control they need to apply networking policies that directly address their needs. For storage, client data is held in virtual block devices: essentially virtual drives sitting on larger storage arrays (see cloud storage and the future for more info), which are then accessed by the CPU/RAM of each cloud server.
The data leakage problem arises when a customer deletes their drive and a new customer then creates a new one. The areas on the physical disks used for the old and new drives can overlap, so it’s possible for the new customer to image off data previously written by other customers. That, in a nutshell, is the problem, and it’s one that many IaaS clouds are exposed to today. For the most part, customers using those platforms don’t actually appreciate the danger. For us that’s a little scary, which is why, prior to ever launching, we took steps to make customers aware of the issue and to provide them with tools that protect them against data leakage.
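The mechanics are easy to see in miniature. The sketch below is a toy model, not any real storage stack: a shared byte array stands in for the physical array, the "drive" is just an allocation of blocks, and the allocator releases only metadata on delete and never zeroes blocks before reuse — which is exactly the gap a new tenant can exploit.

```python
# Toy model of cross-tenant data leakage on shared storage.
# All names and sizes are illustrative; the array is deliberately
# tiny (two 8-byte blocks) so block reuse is guaranteed.
DISK = bytearray(16)
BLOCK = 8
free_blocks = [0, 1]

def create_drive(n_blocks):
    # Hand out blocks to a new tenant WITHOUT zeroing them first.
    return [free_blocks.pop(0) for _ in range(n_blocks)]

def delete_drive(blocks):
    # "Delete" releases the allocation metadata only; bytes stay on disk.
    free_blocks.extend(blocks)

def write(blocks, data):
    for i, b in enumerate(blocks):
        DISK[b * BLOCK:(b + 1) * BLOCK] = data[i * BLOCK:(i + 1) * BLOCK]

def read(blocks):
    return b"".join(bytes(DISK[b * BLOCK:(b + 1) * BLOCK]) for b in blocks)

# Tenant A writes a secret, then deletes the drive.
drive_a = create_drive(2)
write(drive_a, b"top-secret-key!!")
delete_drive(drive_a)

# Tenant B's new drive lands on the same physical blocks...
drive_b = create_drive(2)
leaked = read(drive_b)
print(leaked)  # ...and simply reading it recovers tenant A's data
```

Real allocators are less predictable than this, but the principle is the same: unless blocks are zeroed or encrypted, deletion alone doesn’t make the data go away.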
This problem can be addressed in a couple of key ways. The first approach is to make sure that any confidential data is stored encrypted within the operating system, or that the entire operating system/file structure itself is fully encrypted. This can be done using dm-crypt/LUKS (often layered with LVM) under most Linux distributions, or with products like TrueCrypt for Windows environments. The good news is that it works. Encryption doesn’t avoid the issue of data leakage; it just ensures that any data that does leak is completely meaningless and unusable to others.
Performing encryption in this way from within a cloud server does, however, have a couple of key drawbacks. Firstly, it relies on customers making an explicit effort to use encryption, which isn’t realistic in a dynamic cloud environment where servers are created and destroyed with high regularity. Secondly, if a server using encryption in this way crashes, on reboot it will require manual intervention from the customer to input the password needed to unlock the encrypted data. In reality such an approach just isn’t feasible for most users and can result in severe disruption to service.
The second way of solving this problem is at the vendor level. It’s possible to store the virtual block devices, i.e. the virtual hard drives, fully encrypted; this can be achieved below the level of the cloud server. As a result, drives can be stored fully encrypted implicitly within the system, with the data decrypted and served on the fly to customer cloud servers as it is accessed. This approach needs no manual intervention or set-up on the part of the customer and is completely robust to server crashes, restarts etc.
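The layering can be sketched as follows. This is purely illustrative and not CloudSigma’s actual implementation: a thin wrapper encrypts each block on write and decrypts it on read, so whatever sits above it (the cloud server) only ever sees plaintext while the backing store only ever holds ciphertext. A SHA-256 counter-mode keystream stands in here for a real cipher such as AES, and all names are assumptions.

```python
# Sketch of vendor-side transparent drive encryption: the guest needs
# no configuration, and data at rest is never stored in the clear.
# SHA-256 in counter mode is a STAND-IN for a real cipher like AES.
import hashlib

BLOCK = 32

def _keystream(key, block_no):
    # One 32-byte keystream chunk per block, derived from key + block number.
    return hashlib.sha256(key + block_no.to_bytes(8, "big")).digest()

class EncryptedDrive:
    def __init__(self, key, n_blocks):
        self.key = key
        self.store = bytearray(n_blocks * BLOCK)  # ciphertext at rest

    def write_block(self, n, data):
        # Encrypt on the way down to the backing store.
        data = data.ljust(BLOCK, b"\x00")
        ks = _keystream(self.key, n)
        self.store[n * BLOCK:(n + 1) * BLOCK] = bytes(a ^ b for a, b in zip(data, ks))

    def read_block(self, n):
        # Decrypt on the fly as the cloud server reads.
        ks = _keystream(self.key, n)
        raw = self.store[n * BLOCK:(n + 1) * BLOCK]
        return bytes(a ^ b for a, b in zip(raw, ks))

drive = EncryptedDrive(key=b"per-customer-secret", n_blocks=4)
drive.write_block(0, b"confidential record")
assert drive.read_block(0).rstrip(b"\x00") == b"confidential record"
assert b"confidential" not in bytes(drive.store)  # plaintext never at rest
```

Because the key lives with the vendor’s storage layer rather than inside the guest, reboots and crashes need no password prompt — which is exactly why this approach is robust where in-guest encryption is not.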
This is how CloudSigma has enabled customers to avoid data leakage. When a customer creates a new drive (via our API or web console), they can choose whether or not to store the data encrypted. We use a triple 256-bit AES encryption cascade to encrypt the complete drive images marked by customers. The impact on performance is limited to a 10–15% reduction in most cases.
From a usability standpoint it’s invisible to the cloud server, yet it solves the data leakage issue: the cloud server sees the drive as unencrypted, so no on-server modification is necessary by customers. We always advise customers to store data they wish to keep confidential on an encrypted drive. The ability to attach multiple drives to our cloud servers also means you can create a system drive and a data drive, then encrypt just the data drive. That way you can easily control your confidential data and ensure it is stored encrypted and safe from any data leakage. We see this as a fundamental requirement of any public cloud, so encryption and related services are all included and implicit in our pricing. Anyone considering a public cloud service should ask, and clearly understand, how the vendor addresses the issue of data leakage to protect confidential data.
Moving data to the cloud can also result in a loss of transparency about where it is stored and, more importantly, who has access to it. This is less of a problem for IaaS clouds than for some global SaaS products, which are distributed across many different locations and jurisdictions.
As a customer of an IaaS cloud, it’s important to understand:
- where your data is physically stored, and in which legal jurisdiction(s);
- who within the vendor’s organisation can access your data, and under what controls.
The answers to these questions will go some way to determining the legal implications of moving your data to that cloud vendor. For ourselves:
You could argue data portability isn’t strictly related to security, but the ability to get your data in and out of a cloud has direct implications for data management and control. Before placing any data in a public cloud, first establish what procedures are in place to allow you to migrate your data out. Key characteristics to look at are:
- whether export uses a standard, established protocol and format;
- whether your data can be extracted without modifying its structure;
- what outgoing data transfer costs;
- how quickly data can be transferred out.
Before making the investment to migrate into a cloud, understand properly the investment needed to migrate back out again!
Just to address these points for CloudSigma, our main migration path is via our drive image FTP over SSL gateway. This gives our customers a direct connection to their private library of drive images in our cloud, with the ability to upload or download whole drive images in RAW ISO format. Customers can thus extract their entire data from our cloud using a standard, established protocol without modifying data structures. The RAW ISO format means the drive image can then be used with open source or proprietary solutions, or even written back onto physical hardware. We charge our standard flat rate per gigabyte for outgoing data transfer (incoming is free): CHF 0.065 / US$0.0585 / EUR 0.0455 per GB. For example, a very large 1 terabyte drive image would be free to upload to our cloud and would cost just $59.90 to migrate out. Best of all, it’s possible to transfer the drive image out of our cloud via FTP directly to another hosting provider or cloud at high connectivity speeds.
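As a quick sanity check on that figure, flat per-GB pricing makes egress cost trivial to work out (the function name here is illustrative, not part of any API):

```python
# Flat-rate egress pricing from the post: US$0.0585 per GB out, free in.
RATE_USD_PER_GB = 0.0585

def egress_cost_usd(gigabytes, rate=RATE_USD_PER_GB):
    # Only outgoing transfer is billed; uploads cost nothing.
    return round(gigabytes * rate, 2)

print(egress_cost_usd(1024))  # a 1 TB (1024 GB) drive image: ~$59.90
```

The point of a flat rate is that there are no tiers or minimums to model — the cost of leaving is known exactly before you ever commit data.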
Customer education and more vendor openness are definitely needed to allow a much more transparent debate on data storage in a cloud environment. Issues do exist, but solutions are available that achieve real data security in the cloud. Customers need to ask intelligent, reasonable questions about data handling and storage, and vendors need to be prepared to give full and frank answers.