“The Cloud”. Used in this popular singular form with the definite article, the term suggests that there really is a single, nebulous entity where computing and storage magically take place. Of course, the reality is that “The Cloud” is a network of datacenters, and within those, a network of servers and storage nodes. So when you put data in “the cloud”, where is it exactly?
Some companies are very wary of placing data in anything but their own datacenters. In that case they know exactly where their data is, and a private cloud doesn’t change that. This blog is not for them.
However, there are many reasons why even enterprises in regulated industries want to use external cloud services, and such a move doesn’t necessarily mean that they lose sight and control of their data. They just need to ask six questions.
You can view this as a series of increasingly contained questions that peel back layer by layer where the data is actually stored:
In which region is my data stored?
Many cloud providers have datacenters deployed across multiple regions. The physical distance between the user and the datacenter still matters, even over high-bandwidth connections, because of network latency: the minimum time it takes a bit of information to get from one place to another. High latency can cause data transfers to time out and applications to become sluggish or fail, so it is generally preferable to have the data stored in one’s own region.
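To put a rough number on this, here is a minimal sketch that estimates the best-case round-trip time from distance alone. It assumes signals travel through optical fiber at about 200,000 km per second (roughly two-thirds of the speed of light in a vacuum). The distances are illustrative assumptions, not measurements, and real-world latency is higher once routing, switching and processing are added.

```python
# Best-case network round-trip time from physical distance alone.
# Assumption: signal speed in optical fiber is about 200,000 km/s.
SPEED_IN_FIBER_KM_PER_S = 200_000


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip time in milliseconds over a direct fiber path."""
    one_way_seconds = distance_km / SPEED_IN_FIBER_KM_PER_S
    return 2 * one_way_seconds * 1000


# Illustrative distances (assumed): a nearby datacenter vs. one across an ocean.
for label, km in [("same metro area", 50), ("another continent", 6000)]:
    print(f"{label}: at least {min_round_trip_ms(km):.1f} ms per round trip")
```

An application that makes hundreds of sequential requests pays this minimum on every one of them, which is why a distant region can feel slow even on a fast link.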
There are additional considerations here for companies that are spread out over multiple regions (e.g. North America and Europe), who would want data to be replicated across multiple regions. This is also sometimes done for redundancy or disaster recovery purposes.
Many cloud service providers automatically send the data to the closest datacenter. However, there may be cases where an organization would prefer to limit certain types of data to one region, such as the European Union, for legal or regulatory reasons.
In which country is my data stored?
The issue of data sovereignty has come to the forefront as cloud services become more popular. In some countries, including many European countries as well as Canada, certain types of data are not supposed to leave the country. Sometimes this isn’t a legal requirement but a privacy commitment that a company would like to convey to its customers.
In which country is my cloud provider registered?
Data residency, or where the data is physically stored, is not the only aspect influencing sovereignty. Even if the data is stored in your own country, if the provider hosting it is a company subject to foreign laws, your data may be accessible to foreign governments under various information disclosure laws, or it may be disclosed to certain parties in case of a lawsuit. Check your service provider’s legal status if it’s important to you not to have your data exposed to such disclosures.
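The last two questions can be treated as a simple policy check. The sketch below is only an illustration: the data classes, country codes and the `POLICY` table are hypothetical, and a real policy would come from your legal and compliance teams. It checks both where the data physically resides and where the provider is registered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageLocation:
    datacenter_country: str  # where the data physically resides
    provider_country: str    # where the cloud provider is registered


# Hypothetical policy: data class -> allowed countries (None means unrestricted).
POLICY = {
    "customer_records": {"DE", "FR"},
    "marketing_assets": None,
}


def sovereignty_violations(data_class: str, location: StorageLocation) -> list[str]:
    """Return a list of policy problems; an empty list means the location is acceptable."""
    allowed = POLICY[data_class]
    if allowed is None:
        return []
    problems = []
    if location.datacenter_country not in allowed:
        problems.append(f"data stored in {location.datacenter_country}")
    if location.provider_country not in allowed:
        problems.append(f"provider registered in {location.provider_country}")
    return problems


# Data sits in Germany, but the provider is registered elsewhere.
print(sovereignty_violations("customer_records", StorageLocation("DE", "US")))
```

The point of keeping the two fields separate is that a location can pass the residency test and still fail on the provider's jurisdiction.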
In what type of datacenter(s) is my data stored?
You want to know that industry-standard security best practices are applied to your data storage, and that includes both IT security and physical security measures. Things like 24/7 surveillance, fire suppression systems and multi-factor authentication for entry are to be expected. Various certification programs audit datacenters to verify compliance with these best practices, so ask which certifications your provider’s facilities hold.
In what type of tenancy model is my data stored?
There’s a difference between the multi-tenancy model of public cloud and the model now popularly called Virtual Private Cloud, or VPC. The difference in data storage and access is significant. In the normal multi-tenancy model, your data is stored in the same logical system or “bucket” as other organizations’ data, and access to it is governed by access control mechanisms.
With a VPC, your data is stored in logically separate, fenced-off infrastructure that can be made accessible only via your VPN. This brings the VPC much closer to the security of your own private datacenter environment: even if access control mechanisms fail, your data is not mixed with other organizations’ data. It also means that, by default, you can encrypt your data with your own keys and control every aspect of encryption policy. We explained this in depth in a previous blog post.
In what storage format is my data stored?
Common storage formats (in and out of the cloud) include block, file and object storage. Cloud storage typically includes flavors of all three, but they are used for different purposes and applications. For most file services such as backup and file sync & share, as well as for storage of large amounts of unstructured data, object is the preferred format, for both its massive scalability and its cost-efficiency. AWS S3 is an object storage system, as is OpenStack Swift. The price per GB of object storage is roughly half that of most file-based systems. CTERA supports nearly all object storage vendors.
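For readers who have not worked with object storage, the toy model below may help. It is an in-memory sketch of the general idea only, not the API of S3, Swift or any other product, and the class and method names are made up for illustration. Each object is a blob of bytes plus metadata, stored under a unique key in a flat namespace, with no real directories.

```python
class ToyObjectStore:
    """Illustrative in-memory model of object storage (hypothetical, not a real API)."""

    def __init__(self):
        # Flat namespace: key -> (data, metadata). Slashes in keys are just characters.
        self._objects = {}

    def put(self, key: str, data: bytes, **metadata) -> None:
        """Store or overwrite a whole object together with its metadata."""
        self._objects[key] = (data, dict(metadata))

    def get(self, key: str) -> bytes:
        """Return the object's data; objects are read as a whole."""
        return self._objects[key][0]

    def metadata(self, key: str) -> dict:
        return self._objects[key][1]

    def list_keys(self, prefix: str = "") -> list[str]:
        """'Folders' are emulated by filtering on a key prefix."""
        return sorted(k for k in self._objects if k.startswith(prefix))


store = ToyObjectStore()
store.put("backups/2024/db.dump", b"...", owner="it", retention_days=90)
store.put("backups/2024/files.tar", b"...", owner="it")
store.put("media/logo.png", b"...", owner="marketing")
print(store.list_keys("backups/"))
```

Because there is no directory tree to keep consistent, this kind of flat key space is easier to spread across many storage nodes, which is part of what makes the format scale.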
Normally, this is not something cloud users need to concern themselves with – but if you do have cost concerns and scalability expectations and are using one of the lesser-known cloud providers, it is wise to verify the suitability of storage format to your needs.
How Deep Is Your Love? For Your Data, That Is.
If you made it this far in this blog post, congratulations! You deeply care about your data.
Seriously, not everyone needs to dig through all these layers, but it’s good to know they exist. Cloud providers are not responsible for your data – you are. Demystifying cloud storage to understand where your data really resides and how it affects data governance, security, integrity, sovereignty, and compliance can remove some of the obstacles in adopting “The Cloud” in an optimal way.