Blades and Going Big
Before I dive into Sun / Oracle blades, a blade overview. A very generic, non-vendor specific overview.
Way back in the day, when blades first arrived on the scene, they did not have the density to replace regular rack mount systems. Not in compute, and even less so in memory. To even begin to be a serious candidate for virtualization, you have to have RAM, and lots of it.
Early blades just did not have that many DIMM sockets, and the density you could put in a DIMM slot was low. Later versions of blades let you get to higher density DIMMs, but to get near what you'd need, the DIMMs cost a king's ransom. Totally making these numbers up, but it was something like this: if a 1 GB DIMM was $10, a 2 GB DIMM was $25, a 4 GB $100, an 8 GB $500, and so on.
The point is only this: the price went up way faster than the density.
Today's units have more DIMM slots, better memory management in the chipset (more slots per socket), use more commodity priced DIMMs like DDR3, and so forth. Today's blade lets you get to useful RAM amounts.
As noted in "Go Big to Get Small", we are a "Virtual First" shop. That has already helped us make huge inroads in the reduction of power and CO2 emissions. Taking this to the next level required a vendor by vendor look at their various compute solutions to see what we could use for a big increase in virtualization, as well as re-hosting older virtualization solutions on the latest / greatest / dense-est footprints.
Blades have come of age, by and large. With some vendors, we could actually still be denser in rack mount, but there is also value in standardization of form factor and process.
With the massive virtual server density, we wanted to make sure we were not introducing choke points outside the blades, so everything needed to be able to connect up with 10 Gb Ethernet, and 8 Gb Fibre Channel for SAN. The SAN would need to be fast.
To enable fast re-hosting, and HA, everything needed to be able to boot from SAN. A blade in a chassis had to be interchangeable with any other blade like it. Carry the same workload. Boot from the same place. No internal disks in the blades.
So, fast network, fast disk access on a SAN, lots of RAM, boot from SAN, interchangeable, backwards compatible.
About that last one: As I have mentioned innumerable times over the years, we are an R&D shop. We are not just working on the latest and greatest, but also on copies of all sorts of older versions of things. We try, like any software company, to limit how far back we have to go, because the iterations and versions math is crazy. It's not just the OS, but the application or applications, various dependencies like databases... on and on. Still, worst case, you'd at least like the option of going back more than one version on something.
As a general rule then, the less backwards compatible something is, the less useful it is in this shop.
Rule Of Thumb for Blades (This generation of tech...)
ROTs are useful things, and as blades go, here is one of mine: 128 GB of RAM per socket, assuming that socket has 8 cores, and generally performs for its designated workload the way a Xeon 7000 does in AMD64 space.
I base that on watching our BCO data for VMware and other x86 virtualization environments. At that ratio, and with our workload (your workload is not my workload), in general 80% of the workload runs out of RAM before it runs out of CPU, and 20% of the workload runs out of CPU before it runs out of RAM. By run out, I mean we are well into the 90th percentiles. We are starting to clip the peaks.
Pre-Nehalem, when all the virtual assists were still young, and the cores-per-socket were lower, the rule was 64 GB per socket, and for a pre-T4 chipped Sun it was more like 32 GB per socket.
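As a rough illustration of how that rule of thumb turns into a sizing number, here is a minimal Python sketch. The per-socket figures come from the paragraphs above; the dictionary keys and the function name are made up for this example, and your workload may well call for different ratios.

```python
# Rule-of-thumb RAM per socket in GB, as quoted in this post.
# The generation labels are hypothetical names for this sketch only.
RAM_PER_SOCKET_GB = {
    "current_8core": 128,  # 8-core socket, Xeon 7000 class performance
    "pre_nehalem": 64,     # fewer cores, younger virtualization assists
    "pre_t4_sun": 32,      # Sun gear before the T4
}


def blade_ram_target_gb(generation: str, sockets: int) -> int:
    """Return the rule-of-thumb RAM target for one blade, in GB."""
    if sockets < 1:
        raise ValueError("a blade needs at least one socket")
    return RAM_PER_SOCKET_GB[generation] * sockets


# A single-socket blade of the current generation, then a two-socket one.
print(blade_ram_target_gb("current_8core", 1))
print(blade_ram_target_gb("current_8core", 2))
```

Past the target, extra RAM tends to sit idle because the CPU becomes the bottleneck first, at least with a workload shaped like ours.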
That brings me to...
Sun Blade 6000
In an Oracle press release a few years back, I was quoted talking about how we use T class servers to save power. Reading that reminds me just how long we have been on this journey to save power, CO2, and floor space, and to modernize the server footprint where possible.
Sun's current blade offering is the 6000. The new announcements about the T5 chipset, specifically the T5-1B, were of great interest to me because the T4-1B was head and shoulders faster in virtualization workloads than what came before it. However, its Achilles heel is that the blade only has 1 socket. Most of the time, that means anything more than 128 GB in a blade is not going to be fully utilized.
The T5 is, on paper, 2.3 times faster than a T4, and from what I have read of the tech specs, I think that we are set for 256 GB per blade with this generation. You could put 256 GB in a T4-1B: It was just not clear how well used it would be. The CPU would bottleneck first.
All I have in house at this writing is T4-1Bs. But I do have a few of them:
For HA reasons, there are two chassis. Each chassis has 5 T4-1Bs. Each T4-1B has an 8 Gb FC card in it. The 10 Gb Ethernet is shared through a switch-like device on the back called the NEM.
Bottom Line: This set of two will replace 250 Sun systems when we are done.
All kinds. V240s. V100s. V880s and V890s. Ultra 5s. On and on. A list out of history and time.
The secret sauce is that we can virtualize Solaris 8, 9, 10, and 11. Solaris 8 and 9 have to run in zones rather than LDoms (Yeah, yeah: Oracle VM Server for SPARC.).
I promised numbers last post, so here goes:
We have retired 250 computers from all over the line. A Netra nameplates at 108 watts. A V240 pulls about 550 watts nameplate. A V880, 2,900. Big spread. We mostly have the V100s through V280s running around, but one or three of just about everything came off the line. A bunch of E450s.
The average nameplate wattage was 474, and the average height was 4U.
I was very careful to call that nameplate, because a computer does not use all of the power its power supply is rated to put out. According to measurements I have done comparing the UPS readings to known nameplate values, we use about 0.57 to 0.6 of the nameplate figure. This power factor is lower than the planning number you normally use for such things, which is 0.67. But that leads to me claiming less savings, which is fair. I am using averages too, so going lowside keeps it real.
- 250 Suns @ 474 watts = about 118,000 watts, or 118 kW
- Apply the power factor of 0.6, and that means about 70,000 watts of gear was replaced by those two chassis and their 10 blades. I rounded down at every turn there, to make this as low as reasonably possible.
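For anyone who wants to rerun that arithmetic against their own fleet, here is a small Python sketch. The inputs are the figures quoted above; the function name is just for illustration, and because nothing is rounded down along the way the results land a little above the numbers in the list.

```python
def estimated_draw_watts(systems, avg_nameplate_watts, utilization):
    """Return (nameplate total, estimated real draw), both in watts.

    utilization is the fraction of nameplate actually drawn,
    the 0.57 to 0.6 figure measured at the UPS in this post.
    """
    nameplate_total = systems * avg_nameplate_watts
    return nameplate_total, nameplate_total * utilization


nameplate, draw = estimated_draw_watts(250, 474, 0.6)
print(f"nameplate total: {nameplate:,} W")   # 118,500 before rounding down
print(f"estimated draw:  {draw:,.0f} W")     # about 71,100 before rounding down
```

Swapping in 0.57 for the utilization gives the low end of the measured range.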
Here is a place where the Sun chassis is very different from other blade vendors. Rather than having six power supplies like Dell or IBM, for example, they have two. Either one can supply the whole chassis, so it's a doozy: 5,740 watts each. AC input power to the chassis is 6,272 watts. See the spec sheet for details.
We have two of them, each half full, so about the same power consumption as one chassis with a bit of overhead. Call it 7,000 watts.
The Magic Number
We have now arrived at a number I have seen popping up over and over in this. Giving away the end of this series here, but 70,000 watts versus 7,000 watts. 10 to 1. We see that number emerge time and again for us. Partly it is a function of the age of the gear we are retiring. I see it over and over though.
That is also one tenth of the CO2, because CO2 is directly related to wattage.
The average server was 4U, and we got rid of 250 of them. 1000U now sits in 20 U. Way more than 10 to 1. 1000 U racked tightly (say 40 U per rack) is 25 racks, now sitting in one rack. The reality is far more organic (read: not that dense, but full of historical reasons for why it was the way it was) than that, and I'll get to what we have hauled out of the DC later.
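The power and space ratios above are easy to check, so here is a short Python sketch of the math. All the inputs are the rounded figures from this post; the variable names are only for illustration, and the 40 U per rack is the same "racked tightly" assumption used in the text.

```python
import math

# Power: estimated draw of the retired gear versus the two half-full chassis.
old_watts = 70_000
new_watts = 7_000
power_ratio = old_watts / new_watts

# Space: 250 retired servers averaging 4U, now sitting in 20U.
servers = 250
avg_u = 4
new_u = 20
u_per_rack = 40  # tightly racked; real rooms are rarely this dense

old_u = servers * avg_u
space_ratio = old_u / new_u
old_racks = math.ceil(old_u / u_per_rack)

print(f"power: {power_ratio:.0f} to 1")
print(f"rack units: {old_u} U -> {new_u} U, {space_ratio:.0f} to 1")
print(f"racks if tightly packed: {old_racks}")
```

The rack-unit ratio comes out well above the power ratio, which is why the floor space story is a separate one from the wattage story.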
Shelling out from Sun for a second about space: phase one of this project involves taking 37,000 square feet of DC, and making it into 11,000 square feet, and having enough capacity in that DC to go after other DCs in later phases. This time it was about 3 to 1. But now we are set for absorbing other, smaller DCs, and later, even shrinking this one again. We are after 10 to 1. Power, CO2, and Space.
Next time: Another blade server.