Analyst Chat

Analyst Chat #67: Ensuring Business Continuity for the Cloud


As organizations go through digital transformation, they increasingly turn to using cloud services. One aspect of the digital transformation plan that is often forgotten is ensuring business continuity. Mike Small joins Matthias to explain why business continuity is essential for cloud services, especially in light of current events.

Welcome to the KuppingerCole Analyst Chat. I'm your host. My name is Matthias, and I'm lead advisor and senior analyst at KuppingerCole Analysts. My guest today is a first-time guest. It's Mike Small. He is senior analyst with KuppingerCole, working out of Stockport, UK. Am I right? Hi, Mike.
That's correct, yes. Stockport, a suburb of the city of Manchester.
Great to have you today. It's a shame that we have not had you on this podcast before, but this time we really have to, because we want to talk about business continuity and the cloud. And we want to talk about that for a very good reason: a recent incident that happened just yesterday.
Yes, that's correct. And I think it really comes down to this: a lot of people believe, erroneously, that if they are using a cloud service, that service is pretty much guaranteed under all circumstances. Now, from time to time, the unusual or unexpected happens. Only yesterday, there was a fire at a major data center of a European cloud provider called OVHcloud. We're very grateful to have heard that no one was injured and that the fire service was able to intervene speedily and prevent the damage from spreading further. However, it did have an impact on the users of the cloud, to the extent that, according to some of my students, some online games that they were expecting were no longer available. What this needs to be, and what I keep telling our clients, is a wake-up call to the possibility that the cloud may not be there when they want it, and they have to factor that into their business continuity plans.
And this is not the first example of cloud services not being available. It has happened for a variety of reasons. On one occasion, there was an argument between the cloud service provider and the hosting organization that basically led to the servers being cut off. On another occasion, a small cloud service provider in the UK went into administration, and their clients received what was effectively a demand from the administrator saying that unless you pay this sum of money by Friday afternoon, your service will disappear. So there are many reasons why the cloud service may no longer be there when you need it.
Right. And note that many organizations are actually moving towards the cloud for availability reasons, among all the other reasons that are there. You nevertheless have to plan for the cloud service not being available, although it is to be expected that it is much more available than when an organization provides IT services on its own. So even though they are relying on proven, reliable infrastructure, you need to factor in business continuity planning. If we talk about business continuity planning, where would you start? Where would you begin looking at cloud services, at criticality and availability? What would be your starting point?
Well, when you are making plans for business continuity, where you start is to understand the impact on the business of applications or pieces of data not being available. Most organizations need to go through this kind of process in order to identify what their recovery times need to be. There are some applications in some industries where, as soon as the service disappears, you start to lose revenue. In effect, that is the example I gave earlier of online gaming: as soon as it's gone, you can no longer gain any revenue from it. So you need to do this triage of what your systems and their business criticality are, so that you can divide the work of what you have to do according to what is going to hurt your business badly in an hour, in a day and in a week, because there are always things that you can live without for a longer period of time.
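The triage described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Application` record, the `max_tolerable_outage_hours` field and the tier labels are hypothetical names, and the hour/day/week boundaries simply mirror the ones mentioned in the conversation.

```python
from dataclasses import dataclass

# Hypothetical tiers mirroring the "an hour, a day, a week" triage (limits in hours).
TIERS = [
    ("hour", 1),
    ("day", 24),
    ("week", 24 * 7),
]


@dataclass
class Application:
    name: str
    # Assumed input: how long the business can tolerate this application being down.
    max_tolerable_outage_hours: float


def triage(apps):
    """Group application names by the first tier that covers their tolerable outage."""
    groups = {label: [] for label, _ in TIERS}
    groups["longer"] = []
    for app in apps:
        for label, limit_hours in TIERS:
            if app.max_tolerable_outage_hours <= limit_hours:
                groups[label].append(app.name)
                break
        else:
            # Nothing matched: the business can live without it for more than a week.
            groups["longer"].append(app.name)
    return groups
```

Applications that land in the "hour" group are the ones whose recovery needs to be designed and tested first.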
And that is the first critical step that you have to take. Once you've identified the applications and the systems that you depend upon, you then have to start to work out how you will achieve the availability that you actually need. That can be through a variety of different techniques. Cloud services do indeed, in this respect, give you these kinds of capabilities, but you have to use them. For example, it is fairly common for a cloud service to be delivered from what are called multiple availability zones. What that allows you to do is to deliver the service concurrently from two different data centers, in such a way that if one of these data centers ceases to function correctly, your application will continue. Or you can have various varieties of failover, from concurrent operation through to hot failover or through to a standby where you can bring the application up in a given period of time.
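As a minimal sketch of the failover idea, assume a list of zones in priority order and a health check supplied by the caller. The function name `pick_active_zone` and the zone names are hypothetical; real cloud platforms provide their own load balancing and health checking, which this does not model.

```python
def pick_active_zone(zones, is_healthy):
    """Return the first healthy zone in priority order, or None if every zone is down.

    `zones` is an ordered list of zone names and `is_healthy` is a caller-supplied
    function that reports whether a given zone can currently serve traffic.
    """
    for zone in zones:
        if is_healthy(zone):
            return zone
    return None


# Usage: the primary zone has failed, so traffic moves to the second one.
health = {"zone-a": False, "zone-b": True}
active = pick_active_zone(["zone-a", "zone-b"], lambda zone: health[zone])
```

A `None` result is the case a business continuity plan has to cover by other means, such as a standby in a different region or provider.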
Now all of that needs thought-out planning and it needs testing. If you don't test it, then you're not going to do very well. And unfortunately, many organizations only pay lip service to that degree of testing. But to go back to the cloud again, what is happening with people and organizations that use the cloud is that often they do not fully recognize how responsibility for business continuity is shared. The cloud service provider will guarantee a certain level of availability of their services, and they will do this, as you mentioned earlier, by having multiple sources of power, by having things in multiple locations, by having multiple connections to the internet and so forth. But their responsibility ends, if you will, at a given level. And so whilst they will say, "We are 99.99% available," if that remaining 0.01% is going to have a business impact, then you have to do something about it.
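To make an availability percentage concrete, it can be converted into a yearly downtime budget. The sketch below assumes a 365-day year and a figure such as 99.99%; the function name is hypothetical, and real service level agreements define their own measurement periods and exclusions.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability_percent):
    """Minutes per year a service may be down while still meeting the stated availability."""
    return MINUTES_PER_YEAR * (100 - availability_percent) / 100
```

For 99.99% this comes to roughly 52.6 minutes a year, and for 99.9% to roughly 8.8 hours. Whether that is acceptable depends on the business impact of those minutes, which is the tenant's question to answer, not the provider's.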
You have to make sure that your data is available. Once again, the cloud service provider will take extreme steps to make sure that your data is preserved, but it may not be there when you need it. They may say, "Well, we've got it, but the service is down, so you can't get at it." Or it may be that you deleted it by mistake, in which case the cloud service provider will just say, "Sorry." So you have to take steps to protect your own data. And that may mean taking a copy of your critical data, either into an alternative cloud service, or into alternative nodes within the service that you are using, or even offline.
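Taking your own copy only helps if the copy is intact, so a backup step is usually paired with a verification step. The following sketch uses a local directory as a stand-in for "an alternative cloud service or offline" storage, which is an assumption made purely for illustration; the function names are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, destination_dir):
    """Copy `source` into `destination_dir` and confirm the copy matches the original."""
    destination_dir = Path(destination_dir)
    destination_dir.mkdir(parents=True, exist_ok=True)
    target = destination_dir / Path(source).name
    shutil.copy2(source, target)
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"backup copy of {source} does not match the original")
    return target
```

In practice the destination would be storage that does not share the fate of the primary service, and restores from it would be rehearsed as well.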
You've mentioned already that this duty of the tenant of a cloud service is often underestimated or just not understood. You've also mentioned an administrator having a bad day and deleting data. Apart from that, and apart from a catastrophe such as this one, what other causes do you expect to happen in the future, or have already happened, that bring down cloud services so that business continuity planning is required on the tenant side?
Well, I think it's important to understand that it may not need to bring down the whole of the cloud service. It only has to make your use of it unavailable. This can range from ransomware, and I think nearly everybody has heard about the impact of ransomware on Norsk Hydro. Would you believe that ransomware could bring aluminium smelting to a halt? But it did. The point really is that everything that you depend upon to use that cloud service, end to end, has to be working for your service to be delivered where you need it, when you need it, to whoever needs it. Its loss of availability could be simply because the internet connection to the point where it's being used has failed, or because your end user devices have been infected, and so on and so forth. So depending upon the criticality of the data and the service, you need to take more rigorous steps.
Right. And you've mentioned that the service does not have to come down completely. That was the case in this week's event as well, because there were four data centers and only one was immediately involved in that fire, but as a subsequent step it also affected other sites within the cloud service provider's premises. Nevertheless, it brought down several services, and they are still working on recovering from that. We don't expect that the next incident will be a fire again, so we need to prepare for all types of events. When somebody really wants to be cautious and prepared for that, what would be your advice to tenants who are running cloud services in general? What would be your advice to our customers?
Well, I think I'd like to respond to that in several stages. First of all, what you see in that incident is how consequential damage results from what is effectively a relatively localized incident. If you have a series of co-located data centers and they are all powered from the same supplies, then if you have a fire in one, you need for safety reasons to disconnect the power, because there are megawatts of power going into that data center which, if uncontrolled, are going to make the fire worse. And so by having four buildings which are very closely located and depend upon the same power supplies, you find that the first thing the fire service does when it comes is to turn off all the power to everything. That is fairly typical. Now, in terms of the data center itself, in my experience there is a temptation. I'm not saying that this applies to any particular cloud service provider, but it certainly was the case for individual organizations that the accountants would like the data centers to be in the cheapest possible place.
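The co-location point can be expressed as a simple check: list what each data center depends on and look for anything they all share. The sketch below is illustrative only, and the dependency names are hypothetical labels rather than details of any real provider.

```python
def common_failure_points(replicas):
    """Return the dependencies that every replica relies on.

    `replicas` maps a replica name to the set of things it depends on,
    for example a power feed, a building or a network link. Anything in
    the result takes all replicas down at once if it fails.
    """
    dependency_sets = [set(deps) for deps in replicas.values()]
    if not dependency_sets:
        return set()
    return set.intersection(*dependency_sets)


# Usage: two buildings on one campus sharing a power feed (hypothetical names).
campus = {
    "building-1": {"power-feed-1", "site-a"},
    "building-2": {"power-feed-1", "site-a"},
}
shared = common_failure_points(campus)  # both the feed and the site are shared
```

An empty result does not prove independence, since it only covers the dependencies that were listed, but a non-empty one is a clear warning.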
And usually the cheapest possible real estate is in places that I would consider to be more hazardous. They are in a flood plain, they are under an aircraft approach to an airfield, they are next to a large petrochemical plant. There is a famous story, for those of you that like to look it up, at Buncefield, where there was a data center next to a petrochemical plant near London in the 2000s, and the petrochemical plant blew up. So those are the kinds of problems with the data center. Now, in terms of the organization, one of the things that has been happening is that organizations have been turning to the cloud because they see the cloud as an essential tool of digital transformation. This is good, except that what they don't factor in correctly is that the more digitally transformed you become, the more dependent you become upon those IT services that you've created and run in the cloud.
And so you cannot do digital transformation without doing proper business continuity planning. That has to take account of the fact that you have now become more and more dependent upon these cloud-based IT systems. So you have got to take all of the kinds of steps that you would have taken if you were going to build it in your own data center. The benefit of the cloud is that it potentially makes it easier for you to do this, because most of the large cloud service providers do provide you with multiple data centers that are physically far apart. So the first thing you should be doing is designing your new systems so that they can run in a kind of running failover mode, where you can effectively load balance between two different data centers. You should make sure that you have control over your data, that you are yourself taking backups of critical data, and that your business continuity plan is thought through for how you would respond if there is an incident. We have a workshop and lots of material at KuppingerCole on all that you need to have. Having architected your systems correctly so that they can do that, you then need to test that it in fact all works when something happens, because when something bad happens, that is the worst time in the world to discover that your carefully thought-through plan doesn't actually work in practice.
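One way to keep testing honest is to compare the recovery times measured in a drill against the recovery times the business asked for. The sketch below assumes both are recorded in hours per application; the function and application names are hypothetical.

```python
def drill_failures(measured_hours, objective_hours):
    """Return applications that missed their recovery objective or were never tested.

    Both arguments map application names to hours. An application with an
    objective but no measurement is reported with None, because an untested
    recovery plan cannot be assumed to work.
    """
    failures = {}
    for app, objective in objective_hours.items():
        measured = measured_hours.get(app)
        if measured is None or measured > objective:
            failures[app] = measured
    return failures
```

Anything this returns is a gap to close before a real incident, whether by changing the architecture, the plan or the objective itself.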
Right. Very interesting, and good to hear these different facets of this topic. If some of our audience have become a bit nervous while listening to you right now, I know that there is quite some research material available that has been written by you. Maybe I want to hint at one document, which is an advisory note that is already two years old or so, but is still very current, as we can see in the light of the recent events. It's called cloud services and security, and that is something that our audience can find at the KuppingerCole website. Is there more material around which you could recommend to look at on our site?
Well, yes. I think that looking at our Market Compass on cloud backup and recovery gives you a good hint as to what you should be looking out for in data backup and recovery processes. You should yourself be making sure that your cloud service providers can convince you, through independent verification of their claims, that their service will be able to meet your needs. There is also a lot of research material that we have on how best to respond to incidents. And when we come to security, you have to remember that security is about three objectives: confidentiality, integrity and availability. A failure of any one of those can have a deleterious impact on your digital transformation, and the one that usually stands out most immediately is availability. Although we all agonize about data breaches, which are important, if your systems suddenly become unavailable and you have nothing to substitute for them, that is a showstopper. That is something that every board of directors of every enterprise needs to have thought about. Most of them have thought about this from the perspective of what would happen if there was a terrorist attack on the building, but perhaps fewer have thought carefully about the loss of their critical IT applications.
Yeah, I fully agree. And I think many organizations are currently also thinking of a multicloud strategy, so really spreading the workload, and thus the availability risks, across more than one cloud service provider. But here too it is important to invest properly in doing things right, because a multicloud strategy per se, just subscribing to three services, does not make a platform and infrastructure more sustainable, more reliable or more available. This needs to be done properly, and that's really an architectural point of view. That is something where we as KuppingerCole, of course, can also support in designing that. Any final thoughts that you want to add, Mike?
I think, you know, it's the famous words: planning and preparation prevent problems. And that's what you need to do.
Yeah. Great summary for today's call. That was a very interesting episode, and I'm glad that we finally made it and that you are my guest here as well. I'm looking forward to having you here again soon. For today, thank you very much, Mike, for being my guest.
And thank you very much for inviting me.
Thanks. And bye-bye.


