Common use cases for multi-cloud databases

Krzysztof Ksiazek

Published May 30, 2023

“Multi-cloud database” has become a common term in the IT world, but what does it really mean? What does it entail, and why is it an important topic of discussion? Is it something every organization needs?

In this blog, we’ll dive into these questions and explore various use cases that might point organizations toward a multi-cloud infrastructure.

What is a multi-cloud database?

Let us start by trying to understand what a multi-cloud database is.

Multi-cloud means more than one cloud, so a multi-cloud database is a database that spans multiple cloud providers. Most cloud service providers (CSPs) offer some kind of DBaaS (database-as-a-service) solution, such as Amazon RDS. The problem is that these solutions interoperate poorly with other cloud providers. Typically you can set up some form of replication, but that feature exists mainly to import data into the DBaaS, not to build a scalable database, and there are significant limits on what you can do with it.

If DBaaS is not an option, we have to use compute instances as the building blocks for our multi-cloud database. This is the most common pattern. Compute instances give you the most flexibility in how to approach inter-cloud connectivity, how to configure the databases, and how to implement automated recovery for the cluster. This makes it possible to build different types of environments, either relying to some extent on the tools made available by the CSP or relying solely on open-source software to build a cloud-agnostic setup.

Is building and managing a multi-cloud database easy?

The short answer is no. The mere fact that you must work with multiple nodes separated by WAN links makes the process challenging. How do you deal with network splits? How do you handle the failure of one or more data centers?
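One widely used starting point for the network-split question is a majority quorum: a group of nodes keeps accepting writes only while it can reach more than half of the cluster. The sketch below illustrates the rule, assuming one database node per cloud. The function name and the three-cloud layout are illustrative and not taken from any particular database product.

```python
def has_quorum(reachable_nodes: int, cluster_size: int) -> bool:
    """Return True when the reachable nodes form a strict majority of the cluster."""
    if not 0 <= reachable_nodes <= cluster_size:
        raise ValueError("reachable_nodes must be between 0 and cluster_size")
    return reachable_nodes > cluster_size // 2


# Hypothetical cluster: one database node in each of three clouds.
CLUSTER_SIZE = 3

# A WAN split isolates one cloud from the other two.
majority_side = has_quorum(2, CLUSTER_SIZE)  # this side may keep serving writes
isolated_side = has_quorum(1, CLUSTER_SIZE)  # this side must stop accepting writes
```

An even split, such as two nodes on each side of a four-node cluster, leaves neither side with a majority, which is one reason odd node counts are common.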

Those questions are not easy to answer. They require knowledge and experience in building WAN-spanning networks and databases, and operating such a database is a challenge of its own. So why are we even talking about such a concept? What advantages would outweigh the disadvantages and challenges?

Multi-cloud databases and their use cases

Let’s talk about some of the reasons why organizations around the world go to the trouble of building these complex environments. As you might expect, there are many.

Disaster recovery and survivability

Disaster recovery is, by far, the most common reason why people decide to go multi-cloud.

Data is one of the most important assets of any organization, if not the most important, so its safety matters a great deal. We want the data to be safe, and we want to be able to recover it should something happen.

People set up replication to mitigate the risk of hardware failure. We use multiple availability zones to ensure that the infrastructure (power, cooling, network) is redundant and that a failure of one of its elements will not affect the availability of the organization’s data. Then we use multiple regions to protect the data from even the most serious hazards (hurricanes flooding the data center, uncontrolled fires, or more mundane problems such as an excavator cutting the main fiber lines leading to the data center).

This escalation ladder goes further. At one of the highest levels, we finally have an environment spanning multiple cloud providers, ensuring that even a complete closure of a single CSP will not impact the availability of your data. Of course, the more protection you want, the more expensive it will be. It all depends on how critical and valuable your data is. In cases where the infrastructure must be available all the time, this is a viable option.

Data sovereignty

Another very important reason to implement multi-cloud environments is to have complete control over where your data is stored.

As you may know, today’s world is full of regulations that govern where and how particular types of data can be stored. If your organization deals with sensitive data, you may have to comply with standards and regulations like HIPAA, PCI DSS, or GDPR that define what you can and cannot do with your data. This typically involves knowing where the data is allowed to be located. You may not be allowed to store data that belongs to an organization located in a European country in a data center located in another country (the United States, for example).

In some cases, the problem becomes even larger. You may be forbidden to store your data in a European data center that belongs to an American company. In that case, you are practically banned from using the infrastructure of the main hyperscalers like Amazon Web Services, Google Cloud Platform, or Azure, even if your infrastructure sits in one of their data centers in the European Union.

Country-level law might be even stricter. If you are a government entity or work closely with one, you may not be authorized to store the data outside the country. This realistically forces you to use a service provider that is based in that particular country and has data centers within it.

Those regulations pose some challenges. Let’s assume that your organization provides services to multiple customers from different countries and works with data of varying levels of confidentiality. In such a case, you probably cannot use a one-size-fits-all solution. If you go ahead and build your infrastructure on AWS, you won’t be able to provide services to some of your customers (or you will limit the pool of potential customers that would be legally allowed to use your services).

That doesn’t mean you cannot use AWS at all. It might be a perfectly good solution for some of your customers. For the others, though, you will probably need some other cloud service providers, smaller and more local, that allow you to build services that meet your clients’ security requirements.

When building such a setup, you must always know how and where you process your data. Connecting your AWS infrastructure with the local “branches” in particular countries is perfectly fine. The challenge is that software running in AWS cannot process the “local” data. In most cases, it is also perfectly fine to build a “control plane” for your software solution and host it with one of the big cloud providers. You can then use it to manage the rest of your solution, as long as the data stays in the local “branches” and is never transferred outside the data center where it is stored.
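The placement rule described above can be sketched as a small routing function: restricted data goes only to the local cluster for its country, and anything without a restriction may go to the hyperscaler. This is a minimal illustration, not a compliance tool. The cluster names, country codes, and the `Record` type are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical mapping from a residency country to the only cluster
# allowed to store that country's restricted data.
LOCAL_CLUSTERS = {
    "DE": "de-local-provider",
    "PL": "pl-local-provider",
}

# Hypothetical hyperscaler-hosted cluster for unrestricted data.
GLOBAL_CLUSTER = "hyperscaler-global"


@dataclass(frozen=True)
class Record:
    customer_id: str
    residency: Optional[str]  # country code, or None when unrestricted


def storage_target(record: Record) -> str:
    """Pick the cluster a record may be stored in, refusing when none qualifies."""
    if record.residency is None:
        return GLOBAL_CLUSTER
    if record.residency not in LOCAL_CLUSTERS:
        # Fail closed: never fall back to the global cluster for restricted data.
        raise ValueError(f"no local cluster for residency {record.residency!r}")
    return LOCAL_CLUSTERS[record.residency]
```

The function fails closed on purpose: if no local provider exists for a restricted record, it raises an error instead of silently storing the data with the hyperscaler.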

Cost-awareness

Reducing cloud infrastructure costs is another common reason for using a multi-cloud setup. Large hyperscalers provide a huge variety of services, different types of data stores, and numerous solutions for processing data. This allows organizations to quickly build complex environments tailored to particular data processing needs. There is a darker side, though: CSP lock-in and the price tag attached to those services. Vendor lock-in is a topic for another discussion, but you can avoid it by skipping provider-specific services and using compute resources to build your data pipeline with open-source software.

No matter what service you use, you will often find that large CSPs are expensive. Sure, managed services are never cheap, but even simple VMs cost more than at smaller competitors. Organizations try to use this to their advantage. It is not uncommon to see a service used in one particular cloud because, let’s say, there is no easy way to build an alternative with open-source technology, or because the team lacks the knowledge and an alternative would require a complex setup and expensive maintenance. In that case, it might be perfectly reasonable to use a managed service for that particular technology while reducing expenses by using a cheaper cloud service provider for the compute resources behind the rest of the data processing infrastructure.

Another common reason to utilize multiple cloud services is to have the option to quickly and easily migrate between them in case it makes sense financially. Prices change over time, and what was a good decision a year ago may be a wrong decision to stick to in the long run. Building your environment across multiple clouds lets you move the resources from one CSP to another if it helps you to reduce expenses.
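Whether a move “makes sense financially” depends on more than the monthly price difference, since the migration itself costs something. A rough way to reason about it is a break-even calculation like the one below. The function and all figures are made up for illustration and do not reflect real CSP pricing.

```python
from typing import Optional


def months_to_break_even(
    current_monthly: float,
    candidate_monthly: float,
    migration_cost: float,
) -> Optional[float]:
    """Months until the one-off migration cost is repaid by the monthly saving.

    Returns None when the candidate provider is not cheaper.
    """
    monthly_saving = current_monthly - candidate_monthly
    if monthly_saving <= 0:
        return None
    return migration_cost / monthly_saving


# Made-up figures: 10,000 per month today, 8,000 per month at the candidate
# CSP, and a one-off migration effort estimated at 6,000.
months = months_to_break_even(10_000.0, 8_000.0, 6_000.0)
```

With these example numbers the saving is 2,000 per month, so the migration pays for itself after three months.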

Scale out

Finally, let’s talk about one more reason to go multi-cloud. If you are a large organization, you are probably running a large fleet of compute instances, possibly thousands of them. Let’s say that, for some reason, you expect a significant increase in load on your systems. It could be an event that brings you more traffic, or it could be a marketing effort. What’s important is that you have to scale out and do it fast.

The problem is that the cloud, despite what marketing says, is just someone else’s computer. A CSP may be able to spin up a couple of thousand VMs, but it’s not something you can take for granted, and smaller CSPs in particular may struggle to provide that much infrastructure on request. Organizations that have built their environment across multiple clouds are in a better position to deal with such limits. The more clouds your environment spans, the easier it is to scale out substantially in a short time frame, because more CSPs mean more resources available to you.

To sum up, while building a multi-cloud environment is not easy, it offers several benefits that may outweigh the drawbacks and challenges. It is something every organization should consider at the planning stage. It comes with a great deal of flexibility and presents you with more options to choose from should you encounter technical and financial challenges.

Wrapping up

Multi-cloud databases are complex systems with many operational challenges, yet organizations are adopting them due to benefits like improved disaster recovery, data sovereignty, and compliance. Building a multi-cloud environment allows greater control over data storage, security, and legal compliance. Despite challenges and costs, the potential advantages make this a crucial topic in the IT world.

If you’re considering moving towards a multi-cloud implementation, check out how to address some of the common challenges of multi-cloud architectures, and download our free multi-cloud guide for a more in-depth look at the what, why, and how of building a multi-cloud setup.

To stay in the loop on all things multi-cloud, don’t forget to subscribe to our newsletter and follow us on LinkedIn and Twitter, as we’ll be sharing more great content in the coming weeks. Stay tuned!