It is clear that a data center full of computers running at 90% capacity is far more power efficient than the same workload running on many discrete machines that each average, say, 3% utilization, even if those smaller machines have smaller power supplies.
Example (and I'll show this is conservative in a sec): one machine with a 1000 watt power supply and 30 VMs on it. That power supply is probably only averaging 500 watts, but let's use .67 as the factor, just to keep it on the high side. That is 670 watts, or about 22 watts per OS image.
Now the same thing running on 30 smaller machines, each with a 200 watt power supply and each using 100 watts (a .5 factor, which still favors the small machines). You have 670 watts versus 3000. The single host uses about 4.5 times less power.
Those numbers are high in our world though. In our current blade world (documented at length in the "Go Big to Get Small" series), we are running 150 VMs per Dell M630 blade, and each blade is pulling less than 500 watts on average. That works out to roughly 3.3 watts per VM, or less.
Clearly virtualization is 'Green', not to mention that it saves tons of DC space, power, and cooling, and all of that equals money.
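For anyone who wants to rerun the arithmetic with their own numbers, here is a minimal sketch in Python. It uses only the figures quoted above; the function names are mine, and the load factors are the same rough guesses used in the text, not measurements.

```python
# Back-of-the-envelope power per OS image, using the figures quoted above.

def average_draw(supply_watts, load_factor):
    """Average draw of one machine: rated supply times an assumed load factor."""
    return supply_watts * load_factor

def watts_per_image(total_watts, images):
    """Total average draw divided by the number of OS images it carries."""
    return total_watts / images

# One big host: 1000 W supply at a 0.67 factor, carrying 30 VMs.
big_host_total = average_draw(1000, 0.67)

# Thirty small machines: 200 W supply at a 0.5 factor, one OS image each.
small_fleet_total = 30 * average_draw(200, 0.5)

# Blade case: 500 W is the stated upper bound on average draw, with 150 VMs.
blade_per_vm = watts_per_image(500, 150)

print(f"big host:    {big_host_total:.0f} W, "
      f"{watts_per_image(big_host_total, 30):.1f} W per image")
print(f"small fleet: {small_fleet_total:.0f} W, "
      f"{watts_per_image(small_fleet_total, 30):.1f} W per image")
print(f"fleet / host power ratio: {small_fleet_total / big_host_total:.1f}x")
print(f"blade:       {blade_per_vm:.1f} W per VM at most")
```

Swap in your own supply ratings, load factors, and VM counts to see how the ratio moves.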
All is NOT Golden in Virtual Land
Virtualizing seems a no-brainer for most things that do NOT use up the entire physical hardware footprint in a single application / instance. Even that statement has caveats before I even GET to the major issues.
The main problem I come across with virtualization is databases. Example: Oracle won't certify their RDB for any virtualization platform other than their own OVM, and the reason is I/O. Over in MS land, Hyper-V with Server 2016 is just now becoming good for SQL Server workloads.
Virtual I/O as a bottleneck is a well understood problem, and Intel and AMD long ago extended their virtualization assists so that PCI devices can be dedicated to particular instances. One of those extensions is Intel's VT-d. If you are a mainframer, this is not unlike the 'attach' or 'dedicate' command/directive in z/VM: with it you give one virtual image complete control over the device, and that image is the only one that can do I/O to it. It undoes some of the flexibility of virtualization, but it decreases virtual I/O overhead. Here is a cookbook for how to do it with KVM for example.
It’s the classic virtual versus physical tradeoff, but it allows you, in theory, to dedicate something like a Fibre Channel card to a database server, and get the hypervisor out of its way for I/O. You can instantly get into trouble with stuff like this, because this is your important database! You can't have a single point of failure. You have to dedicate two cards! Which means all the other images now need a complete OTHER set of Fibre Channel cards to run through. And what if you want to do the same thing with your Test / Dev / QA instances, to be sure you are keeping your environment apples to apples?
Lots of dedicated cards, and therefore lots of single use I/O slots on your server frame.
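To put a number on how fast the slots disappear, here is a tiny sketch. Everything in it is an assumption made for illustration: the function name, the idea that every passthrough guest gets a redundant pair of cards, and the idea that the remaining guests share one more redundant pair through the hypervisor.

```python
# Hypothetical slot count for PCI passthrough with redundant cards.

def slots_needed(passthrough_guests, cards_per_guest=2, shared_cards=2):
    """Slots used by dedicated cards plus the shared pair for everyone else."""
    return passthrough_guests * cards_per_guest + shared_cards

# Production database only: its own pair, plus the shared pair.
prod_only = slots_needed(1)

# Production plus Test, Dev and QA treated as three more dedicated guests.
all_environments = slots_needed(4)

print(f"production only:  {prod_only} slots")
print(f"all environments: {all_environments} slots")
```

Under those assumptions a single host goes from 4 slots to 10 just by keeping the lower environments identical to production.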
As you work your way through the things you can do to get rid of the most overhead for the least amount of effort, you sooner or later have to arrive at the OS itself. How much overhead a particular hypervisor has is about as complicated a question as you could care to have. You have so many variables.
- What's the Hypervisor itself? KVM? OVM/Xen? VMware? Hyper-V? Acropolis?
- Is this full on Virtualization, or Paravirtualization?
- What's the hardware platform architecture? (Mainframe Virtualization is decades older / more mature still, though many articles on virtualization forget it was not invented by VMware.)
- Which generation of chipset / Microcode is in play, and are you fully set up to take advantage of everything available?
Containerization asks a different question or two: How much of that OS do you actually need to do what you want? Also, how isolated do you really need to be between the hosting OS / Platform, and the applications?
If the answer to this is "Not much and not very" then Containers change the math of efficiency / overhead a great deal. Real world example: When we were consolidating one of our data centers a few years ago, we went to move as many of the physical Sun / Solaris systems as we could into Zones. When we did the math to figure out how to size the hosts, we computed we could put about fifty of the OS images on each host. There were variables that we had to take scientific guesses for:
- How much faster the new hardware was than the old
- What was the actual, real combined overhead of the Zone
We used VMware consolidation style calculations. In the end, we ended up with hosts that easily could have held twice as many OS images as we planned for. We consolidated hundreds of systems into a few, but it could have been half as many still again. We were after 10:1 space reductions, and we could have gotten 11:1. More importantly, we could have spent less money on the Sun blades to absorb the workload if we had known the real number ahead of time. In the scope of the larger project, again, this was not much, but still. What we ended up with was excellent performance for all the new guests, and no need to buy any new capacity for years. It was like Y2K all over again.
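A rough sketch of that kind of sizing math shows how sensitive the answer is to the two guesses. This is not the calculation we actually ran. The formula, the function name, and every input value below are hypothetical, picked only so the cautious case lands near fifty images per host.

```python
import math

# Hypothetical consolidation sizing: how many old OS images fit on one new host.

def images_per_host(speedup, overhead, old_util, target_util=0.7):
    """
    speedup     : how many old machines one new host equals at full load (a guess)
    overhead    : fraction of host capacity lost to the Zone layer (a guess)
    old_util    : average utilization of each old machine, 0..1
    target_util : how full we are willing to run the new host, 0..1
    """
    usable = speedup * (1 - overhead) * target_util
    return math.floor(usable / old_util)

# Cautious guesses: modest speedup, VM-style overhead.
planned = images_per_host(speedup=8, overhead=0.10, old_util=0.10)

# Friendlier guesses: faster hardware, container-style overhead.
realistic = images_per_host(speedup=15, overhead=0.02, old_util=0.10)

print(f"planned:   {planned} images per host")
print(f"realistic: {realistic} images per host")
```

With these made-up inputs, a better speedup estimate and a much smaller overhead figure roughly double the images per host, which is the shape of the surprise described above.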
That same logic applies to Linux containers today, and companies have come along to make it even MORE attractive by bundling up Containers into libraries that you can check out and personalize. Need an Apache web server? Just spin it up from the library. Docker is a great example of adding value to that by creating an internet registry of such things, and all sorts of management tools around that.
Fast to deploy. Low overhead to run. Portability from host to host. Low virtualization I/O overhead. Higher application density per host. What's not to like?
Other than Application Sprawl of course.
If you thought keeping track of your CLM cloud was fun, wait till you have all those tiny little containers running all over the place. Seems we are always trading ease and lowered cost in for sprawl. Have a mainframe? It's central and expensive and too controlled for your taste? Do client server. Spread it around. Lower the acquisition cost. Now there are a zillion little pools of compute, and a zillion applications. Crud! We need to get our arms around that. Let's make great big DCs, and rack them all where we can see them. Crud! We have cabinets full of tiny systems. Let's consolidate and make everything virtual. Cool! It's smaller and we know where it all is, and it's all on supportable hardware. But… what are all those zillions of VMs doing? Anyone know who owns all those things? OK. Let's corral all that, and get it all under Cloud Lifecycle Management or something similar. Now we have names. And expirations. But… what about all those heavy OSes and the RDBs that need better I/O… Hey! Let's Containerize!
None of that even counts all the hidden computing costs going on out there in the public clouds and running off the employee company credit cards.
Round and round we go…
We seek efficiency and manageability and flexibility. We want to enable everyone, but also be sure we know what's going on so we can stay in front of the next zero-day that comes into our life, not to mention have some idea where our corporate data might be currently living.
Coda to What's Next
What got me thinking about all of this was my pondering on what the DC will look like next. Part of that is determined by what is going to go into it. We are not just reaching the end of the line for things like Moore's Observation. We are reaching the end of the line for what we can do to strip out certain kinds of waste. We have gotten rid of underused computers. We have pooled storage together so that it can be more efficient. We are de-duping and compressing. We are stripping out all the unrequired parts of the OS.
What we are NOT doing is going back to writing things in assembly language so that programs are as efficient as possible. We are staying high level. Getting ever more abstract. Software will continue to get bigger even as we reach the lower limits of the ability of the hardware to get smaller. That will be an interesting inflection point.
We joked for years about how you always needed to upgrade the hardware to run the latest, ever fatter version of MS Windows. The reality there is that Windows 10 runs fine on Windows 7 spec hardware. It's NOT the OSes getting fatter any more. The software target is getting larger because of all the OTHER code we are ginning up. Software defining everything!
The data center will be as small and dense as we can make it. The world that runs inside it, and that it will be connected to? Managing it? That is a whole other thing.