When we talk to IT operations teams, we often hear the question: “What do I need to change in order to adapt to operations in a cloud world?” After all, operating a cloud once all the pieces are in place isn’t exactly like slicing through butter. In fact, the complexity of operating clouds has helped give rise to a new set of practices that the industry calls “DevOps,” a blend of development and operations. Looking at what’s different in a cloud environment versus a traditional one helps us understand how to adapt traditional operations to a cloud world.
Relinquish control over underlying infrastructure
Ops teams are used to being able to dig into the underlying hardware in order to debug issues, and many problems do originate in the hardware layer. In a hybrid cloud environment, ops teams may or may not be able to get at that layer. If an application is running on an external cloud, the steps ops takes to remediate a problem will differ from those used on an internal or private cloud. Rather than looking for a hardware issue, the solution may be to simply relocate the application to another virtual server in the public cloud. Adapting procedures to this reality is one of the biggest changes ops can expect.
Lack of workload permanence
In a cloud environment, workloads are provisioned and de-provisioned at a faster rate than in traditional or even virtualized environments. What’s more, this is typically automated, with no manual intervention. So not only is there no guarantee of where a workload is running, nobody is necessarily going to tell the ops team that a new application has been provisioned. Because ops is expected to support applications that are provisioned on demand, it needs tools that can keep up with the automated nature of cloud. This goes beyond simply discovering that the application is out there, because each application might have a different SLA that ops is expected to support. Ops may also get little notice when an app is de-provisioned, so when an application disappears it may not be clear whether it was supposed to.
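One way to picture the discovery problem described above is as a reconciliation between what ops already tracks and what the cloud currently reports. The sketch below is only illustrative: the Workload type, the sla_tier labels, and the reconcile function are hypothetical names, and it assumes the two inventories are already available as dictionaries rather than fetched from any particular cloud API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str
    sla_tier: str  # hypothetical label, e.g. "gold" or "bronze"


def reconcile(known, observed, expected_removals):
    """Compare the inventory ops tracks with what the cloud reports.

    known:             dict of name -> Workload that ops already supports
    observed:          dict of name -> Workload currently reported as running
    expected_removals: set of names whose de-provisioning was announced
    """
    # Workloads that appeared without ops being told about them.
    newly_provisioned = sorted(set(observed) - set(known))
    # Workloads ops tracks that are no longer running.
    missing = set(known) - set(observed)
    return {
        "new": newly_provisioned,
        "removed_expected": sorted(missing & expected_removals),
        "removed_unexpected": sorted(missing - expected_removals),
    }


known = {
    "billing": Workload("billing", "gold"),
    "reports": Workload("reports", "bronze"),
    "batch": Workload("batch", "bronze"),
}
observed = {
    "billing": Workload("billing", "gold"),
    "checkout": Workload("checkout", "gold"),
}
result = reconcile(known, observed, expected_removals={"batch"})
```

In this example, "checkout" is a new application that ops still needs to pick up along with its SLA, "batch" went away as announced, and "reports" disappeared without notice and deserves a closer look.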
Operate on pools of resources, not servers
Traditional ops tools focus on the performance of an individual server or virtual machine. That’s not good enough in the cloud. Ops has to start thinking in terms of the health of pools of resources. Supporting pools is not as straightforward as it might seem. It means ensuring that workloads are balanced across the pool so that no single resource is overloaded. Further, if one application is misbehaving, it could impact the entire pool, so it’s important to quickly isolate problems with an application before it impacts other applications. Finally, even though cloud lets us think in terms of pools, it’s still important for ops to be able to drill down to the individual units of compute. If a single server is misbehaving, but the pool is still healthy, ops can quickly remove the server from the pool and reallocate workloads to new servers.
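To make the pool-level view concrete, here is a minimal sketch of a health check that looks at the pool as a whole while still letting ops drill down to individual servers. It assumes CPU utilization is already collected as a fraction between 0 and 1 per server; the assess_pool function and both threshold values are hypothetical choices for illustration, not recommendations.

```python
from statistics import mean


def assess_pool(cpu_by_server, overload_threshold=0.85, pool_threshold=0.70):
    """Summarize pool health and flag individual overloaded servers.

    cpu_by_server: dict of server name -> CPU utilization (0.0 to 1.0)
    Both thresholds are illustrative assumptions.
    """
    if not cpu_by_server:
        raise ValueError("pool has no servers to assess")
    pool_load = mean(cpu_by_server.values())
    # Servers that are individually overloaded, even if the pool is fine.
    overloaded = sorted(
        name for name, load in cpu_by_server.items()
        if load > overload_threshold
    )
    return {
        "pool_load": pool_load,
        "pool_healthy": pool_load <= pool_threshold,
        "overloaded": overloaded,
    }


status = assess_pool({"web-1": 0.30, "web-2": 0.40, "web-3": 0.95})
```

Here the pool as a whole is within its threshold, but "web-3" is flagged, which matches the case where ops would pull a single server from the pool and reallocate its workloads.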
Ops will play a critical role in helping organizations make the transition to cloud computing. Ops will have to adapt some of their tools and procedures to the new reality that cloud presents, but the overall goal of ensuring that IT delivers on the service levels the business requires is just as intact as ever.