I enjoy watching airplanes take off and land. It still amazes me that machines weighing almost one million pounds can defy gravity and take off with such grace and ease. As a matter of fact, one of my favorite vacations was to the island of St. Maarten. It has one of the most unusual airports in the world, where you can sit at a bar right at the end of the runway and watch the planes fly overhead, less than 50 feet above the ground. Of course, if you are brave enough, you can experience an even more exciting takeoff by standing right behind a jet and trying to withstand the blast from a 747 (hopefully without getting blown off the beach and into the water).
Regardless, the question I’d like to address is, “What are the similarities between flying a 747 and operating a Cloud data center?” To answer it, I need to identify the characteristics these two technologies have in common.
First, both offer a “shared service”. Unlike a car, which is used to transport individuals (yes, I realize that you can have more than one person in a car), an airplane really is a “shared service” because it is used by a large number of people at once. As you move towards “shared services”, you tend to adopt more “cookie-cutter” resources and services. The seats, the meals (or whatever they serve on flights these days), and the baggage allowances are all constrained by the limited number of choices you are given. Similarly, in the Cloud, one deploys “cookie-cutter” services, and users of cloud services are constrained in what they can tune or configure. This is in stark contrast to applications deployed in a traditional data center, where nearly everything was configured differently. Services deployed in the cloud are largely pre-configured in a service catalog by the cloud admin, and users have only a limited set of knobs and dials they can turn. Effectively, the cloud introduces an environment with a high level of governance. This controlled environment creates interesting opportunities from a monitoring and operations perspective.
Second, both are designed to offer high levels of reliability. Unlike a car, which has just one engine and therefore a single point of failure, a 747 has four engines. Not only do they provide the thrust needed to lift almost a million pounds off the ground, they also offer very high levels of reliability. I was surprised when I learned that, once airborne, a 747 could fly on just one engine. Actually, the 747 can “almost” fly even after losing all four engines: it can glide 15 km horizontally while losing only 1 km in altitude. This is not just theory! Despite all this redundancy, a British Airways flight did lose all four engines over the Indian Ocean and glided for almost 16 minutes before the pilots were able to restart the engines.
There have been many other instances where airplanes had to fly on less than full “service capacity”. Similarly, cloud deployments are characterized by resources that are deployed as clusters of “pooled resources”. While pooling definitely provides more compute power, it also provides some level of redundancy and reliability. In particular, because Virtual Machines can move from one resource to another within a “resource pool”, the right cloud architecture can deliver high levels of protection against both planned and unplanned downtime. As a result, IT Operations has to evolve the way cloud resources are monitored. While the thresholds and alerts from a single resource may still be important, the performance and availability behavior of the “pooled resources” as a whole is more critical.
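To make the difference between single-resource and pool-level monitoring concrete, here is a minimal Python sketch. It is only an illustration, not how any particular monitoring product works: the `Host` class, the `pool_status` function, the threshold values, and the assumption that every host in the pool has equal capacity are all mine.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    cpu_used: float       # fraction of this host's CPU in use, 0.0 to 1.0
    healthy: bool = True  # False if the host is down or in maintenance


def pool_status(hosts, host_threshold=0.9, pool_threshold=0.75):
    """Summarize a resource pool instead of alerting on each host alone.

    Assumes (for illustration) that all hosts have equal capacity.
    """
    healthy = [h for h in hosts if h.healthy]
    if not healthy:
        return {"pool_utilization": None, "hot_hosts": [],
                "survives_one_failure": False, "alert": True}

    # Per-host view: the traditional threshold check.
    hot_hosts = [h.name for h in healthy if h.cpu_used > host_threshold]

    # Pool view: average utilization across the healthy hosts.
    total_used = sum(h.cpu_used for h in healthy)
    pool_utilization = total_used / len(healthy)

    # "Engine-out" check: if one healthy host failed, could the
    # remaining hosts absorb the whole load?
    survives_one_failure = len(healthy) > 1 and total_used <= len(healthy) - 1

    alert = pool_utilization > pool_threshold or not survives_one_failure
    return {"pool_utilization": pool_utilization, "hot_hosts": hot_hosts,
            "survives_one_failure": survives_one_failure, "alert": alert}


pool = [Host("a", 0.95), Host("b", 0.40), Host("c", 0.35), Host("d", 0.30)]
status = pool_status(pool)
print(status["hot_hosts"], status["alert"])
```

In this made-up pool, host "a" would trip a per-host threshold, yet the pool as a whole is only about half used and could lose a host without running out of capacity, so the pool-level check stays quiet. That is the shift in emphasis described above: a hot host is worth noting, but the behavior of the pool decides whether anyone needs to be paged.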
The third similarity is around capacity planning and optimization. When you offer “shared services”, having the right capacity for your flight and all your operations is critical. If you don’t plan your flight capacity appropriately, you end up giving free round-trip tickets to every passenger you bump off the flight, and in the worst case you have ticked-off customers who will think twice about flying on your airline again. Not only is the planning of a given flight very important, the overall optimization of the entire network of flights is critical too. Otherwise you will have passengers who have flown from city A to an intermediate stop, only to find there are not enough seats for the rest of their journey. Similarly, when operating a cloud, capacity planning and optimization at every layer of the stack is extremely critical.
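Here is a rough Python sketch of what a capacity check “at every layer of the stack” might look like. The layer names, the numbers, the 20% headroom, and the `layers_short_on_capacity` function are all hypothetical and chosen only for illustration.

```python
def layers_short_on_capacity(capacity, projected_demand, headroom=0.2):
    """Return the layers whose projected demand eats into the reserved headroom.

    capacity and projected_demand map a layer name to a number in the
    same unit for that layer. headroom is the fraction of capacity kept
    in reserve (an assumed policy, not a recommendation).
    """
    short = []
    for layer, cap in capacity.items():
        usable = cap * (1 - headroom)
        if projected_demand.get(layer, 0) > usable:
            short.append(layer)
    return short


# Hypothetical figures for a small cloud.
capacity = {"compute_vcpus": 400, "storage_tb": 100, "network_gbps": 40}
projected = {"compute_vcpus": 300, "storage_tb": 90, "network_gbps": 20}

print(layers_short_on_capacity(capacity, projected))
```

With these made-up numbers, compute and network have room to spare but storage does not. Like the passenger stranded at an intermediate stop, a service is only as well provisioned as its tightest layer, which is why the check runs across all layers rather than on compute alone.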
Stay tuned for future posts that drill into each of these aspects in more detail. I’ll also cover how BMC is providing cloud operations solutions that not only help you maintain your existing investment in monitoring and operating the data center, but also enable you to evolve your approach to monitoring and operating the cloud, with all of its unique aspects.