I am traveling back from the Red Hat Summit 2010, and have a couple-hour layover in Detroit, so I thought I would try to get some thoughts about it down here while they were still fresh. Never mind the pages upon pages of notes I took on the iPad, which I will be editing next week for an internal presentation while going "What in the world did I mean when I wrote 'rhev next advantage T.P.'?" (Oh yeah: Red Hat Enterprise Virtualization version 2.2 (the next version) will take advantage of transparent huge pages in EL 6.)
If you have never been to a Red Hat Summit, it is probably worth talking about what it is like. In some ways, it is like many of the very best tech conferences I have been to, reminding me favorably of SHARE and BMC UserWorld, mostly because it is a non-stop, never-ending drink from a fire hose of information. Like any good conference, one person cannot cover it. In fact, two of us went and could not cover it, and we did not even try to get to anything about JBoss. Just the Red Hat and Cloud tracks were packed with things I wanted to attend, and I was very often choosing between two different talks that I very much wanted to be at.
The energy and drive of the whole thing was palpable, and the geekery was off the scale. Over and over I heard, whenever a GUI was being described, that we should not worry because the command line was still there, and everything was still scriptable. Competence and confidence. There were also quite a number of decision makers there: it was an interesting melange of the corporate and the creative. IBM even sponsored dinner one night, which means IBM bought us all a beer! The HP booth was right next to the Red Hat booth, and if there were any ill feelings about the fact that Red Hat is pulling Itanium support with RHEL 6, they were not in evidence there. Of course, with Nehalem-EX and AMD's Opteron 6000 series, there is not much reason to miss Itanium either.
One of the cognitive dissonances for me came from all the references to the new memory and processor addressing limits of RHEL 6 over RHEL 5.5. I live in what they call the "upstream" kernel space so much on the desktop that I forget that RHEL 5.5 starts with kernel 2.6.18 and then backports things from later kernels. In fact, the kernel version number really does not have much meaning in RHEL. Some of the basic limits that do appear to have been in place, though, were those surrounding how many processors, how much memory, and how well NUMA was handled. I imagine those would have been just too hard to pull out of the upstream kernel, because the changes are everywhere. With 2.6.32 as the new starting place, all sorts of new capabilities are enabled.
The last paragraph is *not* a criticism of Red Hat, though I can see how it might read as such. That complication is the cost of providing enterprise-level code, tested in every possible way, so that it can be supported for the most critical applications, be they medical or something that underlies the very heart of a stock market computing process. This is not even just about support per se, but how long it is supported and supportable. RHEL 6 will have a seven-year life span! Sure, it supports 4096 cores now, and that seems like a lot, but seven years from now? One session I went to was also very careful to point out theoretical limits versus tested ones. In truth, we don't really know how many CPUs Linux can support in practice, because the kernel does not really put a limit on it. Only money does, and I cannot buy a 4096-core system from anyone yet... and probably not for a few years.
The "tickless" kernel is another example of a huge change for RHEL 6 that we have actually had upstream for a while. Sure, it changes all sorts of things, not the least of which is how one might go about figuring out time-related problems. The old days of the 1000-times-a-second timer loop are gone, and the scheduling quantum that was based on it is too. It's a brand new world of scheduled interrupts for everything. For mainframe or VMS folks that is old news, but it is a huge change with subtle effects all through the kernel. I went to two back-to-back sessions about problem determination in this new world, and they barely scratched the surface of all the new things this and other changes in RHEL 6 bring to the table.
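To make the difference concrete, here is a toy Python sketch of my own (an illustration of the concept, not kernel code) contrasting a fixed 1000 Hz periodic tick with a tickless scheme that only wakes up when a timer is actually due:

```python
# Toy model: how many timer interrupts fire during one second of idle time
# when only a handful of timers are actually pending?
# This illustrates the concept; the kernel's implementation is far richer.

TICK_HZ = 1000  # the classic periodic tick: 1000 interrupts per second

def periodic_interrupts(duration_s: float) -> int:
    """A periodic-tick kernel wakes up HZ times a second, no matter what."""
    return int(duration_s * TICK_HZ)

def tickless_interrupts(duration_s: float, timer_deadlines_s: list) -> int:
    """A tickless kernel programs one-shot interrupts only for pending timers."""
    return sum(1 for t in timer_deadlines_s if t <= duration_s)

# Three timers due within the next second: at 100 ms, 250 ms, and 900 ms.
deadlines = [0.100, 0.250, 0.900]
print(periodic_interrupts(1.0))            # 1000 wakeups, even when idle
print(tickless_interrupts(1.0, deadlines)) # only 3 wakeups
```

The point is exactly the one the sessions made: when wakeups happen only on demand, the old assumption that time advances in fixed quanta disappears, and so do the debugging habits built on it.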
One stat alone should tell the tale, at least for the permutation-minded among you: RHEL 5 was about 1500 or so packages. RHEL 6 is 3000+.
That is not a simple doubling of packages. That is orders of magnitude more permutations that need to be looked at and tested.
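A quick back-of-the-envelope calculation (my own numbers, using the rough package counts above) shows why the testing burden grows much faster than the package count:

```python
import math

# Rough package counts quoted above.
rhel5_packages = 1500
rhel6_packages = 3000

# Even looking only at two-package interactions, the space quadruples:
pairs5 = math.comb(rhel5_packages, 2)
pairs6 = math.comb(rhel6_packages, 2)
print(pairs5)  # 1124250 possible package pairs in RHEL 5
print(pairs6)  # 4498500 possible package pairs in RHEL 6

# The number of possible installed-package subsets (2^n) grows by a factor
# of 2^1500 -- astronomically more permutations than anyone could ever test.
```

Pairwise interactions alone go from about 1.1 million to 4.5 million, and the space of possible installed sets is beyond counting.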
My main focus while at the Summit was KVM, and to a lesser extent, VDI. I had been looking at these for a while, but I viewed the Summit as a chance to take a deeper dive. While there I got a chance to talk to some of the people who helped write KVM and the management tools around it, and I am impressed. I have been a virtualization guy since 1980, and there are things here that reminded me very much of my roots (such as AMD's and Intel's microcoded page table assists) as well as things that leverage the modern, such as the data-deduplication-like Kernel Samepage Merging (KSM). It was also clear that we still think in more mainframe-like terms with our virtualization servers, since we just switched to Dell R810's with 256 GB of RAM, while folks I talked to at the conference were saying their sweet spot was 32 GB of RAM. And a PS about KSM: one person at Red Hat mentioned that calling KSM "Kernel Samepage Management" was a bit of a struggle, and that others used different expansions, like "Kernel Storage Management".
Either way, KSM is one of the big, big wins of RHEL 6 / RHEV. It just makes too much sense: if the page is the same, just keep one copy. It does not matter what type of guest VM it is; if the page matches, keep one. This is huge for guests like Windows, which zeros memory at boot, leaving lots of pages that are all zeros. But Windows isn't the only guest that can have pages like that, and it should not matter... and to KVM and RHEV, it does *not* matter. It is simple, and it is elegant. It should also help safely overcommit memory without a lot of jumping through hoops with special memory managers, guest device drivers, modified guests, and so on.
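As a toy illustration of the idea (my own sketch; the real KSM lives in the kernel, scans candidate regions, and merges identical pages copy-on-write), here is same-page sharing in miniature:

```python
# Toy model of samepage sharing: identical "pages" from different guests
# collapse to a single stored copy. Real KSM works on 4 KB pages inside the
# kernel and breaks shares copy-on-write; this only shows the accounting.

PAGE_SIZE = 4096

class SharedPageStore:
    def __init__(self):
        self._pages = {}        # page content -> single canonical copy
        self.pages_sharing = 0  # guest pages that piggyback on a shared copy

    def insert(self, page: bytes) -> bytes:
        if page in self._pages:
            self.pages_sharing += 1   # another page now shares the one copy
            return self._pages[page]
        self._pages[page] = page      # first time we see this content
        return page

store = SharedPageStore()
zero_page = bytes(PAGE_SIZE)

# Three freshly booted guests, each with two zeroed pages and one unique page.
for guest in range(3):
    store.insert(zero_page)
    store.insert(zero_page)
    store.insert(bytes([guest + 1]) * PAGE_SIZE)

print(len(store._pages))    # 4 distinct copies kept (the zero page + 3 unique)
print(store.pages_sharing)  # 5 guest pages share the single zero-page copy
```

Nine guest pages, four stored copies: the zeroed pages from every guest, whatever its OS, fold down to one, which is exactly why memory overcommit gets so much easier.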
I will be starting on a test deployment of KVM in the near future, and that will be my next "Adventure" in Linux. Read about it here...