The Tru64/TruCluster-based fileserver has served us well as R&D's highly available central fileserver. But with the Tru64 Unix OS now on limited support from HP and the hardware seven years old, it's time to retire the system and replace it with current technology.
Today I want to start on a new series, and a different approach to some of the Internal R&D Support projects I post about in this blog. What I have in mind is to start talking about an in-flight project, warts and all, so that what I end up with is truly an open conversation about this project. Previously I have always reported things once we knew the outcome. This way, some of the process itself will be under open discussion.
The other thing that is new... well, new-ish, is that most of this post comes from the person who is actually doing the work. In this case, our "Master Abuser of Network Protocols", Dan Goetzman.
There are about a zillion posts about the server we are replacing here in "Adventures", going all the way back to the beginning. Way too many to link here without creating something that looks like a table of contents. The summary below gives some of the history, so hopefully that will serve to level-set things.
Dan started by defining the project on our Wiki thus:
- Support NFS file serving
- Support CIFS/SMB file serving
- High Availability
- High Performance
- Cost Effective
Leveraging the Past
For several years now, R&D Support has been building cost-effective file servers on commodity hardware and Linux. We have evolved these into the most recent "generation 2" design, nicknamed "Snapple". A summary of the "Snapple"-based design:
- Linux - Currently using Fedora Core 6 as it has proven to be a stable NFS server
- OS and DATA separation - OS is mirrored on internal hard drives, DATA is on the shared SAN storage
- XFS filesystems for user data - High performance and scales well
- Sun X2200 servers
- SAN shared storage
- Apple XServe Raid storage subsystems
The only item not addressed by the Snapple [Sun and Apple hardware: Snapple. We are so punny - Steve] based fileservers is high availability. The Snapple configuration allows a spare server on the SAN to be manually switched in to take over for a failed server, allowing rapid recovery of services. But that falls short of a highly available solution.
Designing the Future
To move from manual service failover to automatic failover, it seems logical to look at Linux cluster technology. As the central R&D fileservers are considered a production service, it also seems to make sense to look at the "enterprise" Linux distros: Red Hat Enterprise Linux, or in the case of this project, the CentOS variant of the Red Hat EL distribution. The new design starts to look like:
- CentOS 5.0 with Cluster Suite and Cluster Storage Support
- OS boot drives will remain simple "md" mirrors on the internal disks in the server heads, not under any logical volume manager.
- DATA filesystems will be GFS, as the cluster is simplified if a cluster "parallel" filesystem is used (a rough sketch of the storage layout follows this list).
- Shared SAN for storage, using dual SAN switches operating as a single fabric.
- Apple XSR storage, decent performance at a great price for SAN based storage.
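To make that concrete, here is a rough sketch of how one of the zoned LUNs might get carved up on a cluster node. The device, volume group, and cluster names are invented for illustration, and clvmd would need to be running on all three nodes:
- pvcreate /dev/sdb - Label the zoned SAN LUN for LVM
- vgcreate -c y datavg /dev/sdb - Create the volume group, flagged as clustered
- lvcreate -L 500G -n datalv datavg - Carve out a logical volume for user data
- gfs_mkfs -p lock_dlm -t rndcluster:data -j 3 /dev/datavg/datalv - Make the GFS filesystem with lock_dlm locking and one journal per server head ("rndcluster" stands in for the cluster name from cluster.conf)
- mount -t gfs /dev/datavg/datalv /export/data - Mount it on each node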
The NFS service will become a cluster service that the CentOS cluster will make highly available. The Samba CIFS/SMB service can be another cluster service, configured to run on a different cluster node by default.
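Under the Cluster Suite's rgmanager, moving those services between nodes should come down to a command or two. The service and node names here are hypothetical, just to show the shape of it:
- clustat - Show cluster membership and which node currently owns each service
- clusvcadm -e nfs-svc - Enable (start) the hypothetical NFS service on the cluster
- clusvcadm -r nfs-svc -m node2 - Relocate that service to another node, for example before taking node1 down for maintenance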
Optional / Later
Storage Virtualization. With the shared storage on the SAN, using a true parallel cluster filesystem, the next step is to look at virtualizing the storage. Advantages of storage virtualization include:
- Remote replication of data at the SAN block level
- Mirror data at the SAN block level
- Create storage classes that can align data with class of service on the storage farm.
We have an IBM SVC (Storage Virtualization Controller) subsystem available to us that we can insert into the SAN to test and qualify the storage virtualization option. After initial testing using normal "zoned" SAN storage LUNs, we will insert the SVC, virtualize the storage, and then compare functionality and performance.
Wikis are great for this kind of thing. I can see what is happening when I am ready, and I can fix spelling errors if I notice them. Unfortunately for both Dan and me, English is not our native language. Neither is anything else, either. We do what we can....
A quick note about the storage virtualization bit: we pulled that out of the initial pass to minimize variables. Once we know we have a working solution, we'll layer that in, because this is how we plan to enable some advanced features like block replication across the WAN. All that comes later, though.
Once the project was defined, the "Server Beater Most Excellent" (Dan: he has a lot of titles) went to work. We bought the hardware, assembled it, had some internal discussions, and decided that the first pass at this new server would be CentOS 5 based, leveraging Cluster LVM and GFS to make fail-over between the three Sun X2200's easy.
Well, we had hoped it would be easy. The Wiki problem tracking page for the project currently looks like this:
NFS test CentOS Cluster - Problem Tracking
NFS V2 "STALE File Handle" with GFS filesystems
Only using NFSV2 over a GFS filesystem!
NFSV3 over GFS is OK. NFSV2 over XFS is also OK.
From any NFSV2 client we could duplicate this:
- cd /data/rnd-clunfs-v2t - To trigger the automount
- ls - Locate one of the test directories, a simple folder called "superman"
- cd superman - Step down into the folder
- ls - Attempt to look at the contents, returns the error:
ls: cannot open directory .: Stale NFS file handle
Note: This might be the same problem as in Red Hat bugzilla #229346
Not sure, and it appears to be in a status of ON_Q, so it is not yet released as an update. If this is the same problem, it's clearly a problem in the GFS code.
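Since NFSV3 over GFS behaves, the obvious stopgap while that bug is open is to pin clients to version 3. The server and path names below are illustrative only:
- mount -o vers=3 server:/export/data /mnt - Force a v3 mount on a Linux client instead of letting it fall back to v2
- nfsstat -m - Confirm which NFS version the mount actually negotiated
The same vers=3 option can also go into the automounter map entry rather than a manual mount.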
NFS V2 Mount "Permission Denied" on Solaris clients
This problem was detected on a previous test/evaluation of Red Hat AS 5 and was expected with CentOS 5.
Certain Solaris clients (Solaris 7, 8, and maybe 9) fail to mount using NFSV2. Apparently the problem is a known issue in Solaris where the NFS server (in this case CentOS) offers NFS ACL support; Solaris attempts to use NFS ACLs even with NFSV2, where they are NOT supported.
The correct behavior is for the Solaris clients NOT to try to use NFS ACLs on version 2. This has been fixed in more recent versions of Solaris (some 9 releases and 10+).
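For clients that cannot be patched right away, and that do not specifically need v2, one workaround to try would be forcing the Solaris mount to version 3, which sidesteps the v2 ACL negotiation. The server and path names here are made up for illustration:
- mount -F nfs -o vers=3 server:/export/data /mnt - Explicit NFSv3 mount on the Solaris side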
And that is where we are! We don't have all the answers, and in this case, not even all the answers on what will happen next. Lots of questions, though.
We know that the bugzilla bug on GFS is still open, and that probably means we'll have to take GFS out of the equation, at least for now. That is not good, since it means we'll have to script the NFS and CIFS failover ourselves. Yuch.
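For the record, "script the failover" would mean something roughly like the following running on the surviving node, with every name below (volume group, mount point, floating IP, interface) invented for illustration:
- vgchange -ay datavg - Activate the data volume group on this node
- mount /dev/datavg/datalv /export/data - Mount the data filesystem
- ip addr add 192.168.1.50/24 dev eth0 - Bring up the floating service IP the clients point at
- service nfs start - Start the NFS services (this exports everything in /etc/exports)
- service smb start - Start Samba for the CIFS/SMB shares
Plus the matching teardown on the failed node, if it is still reachable, so a non-cluster filesystem is never mounted on two nodes at once.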
More on this as it unrolls. Let me know if you find this approach of talking about the project while it is in flight interesting (rather than just summarizing it at the end, once everything is known and decided).