We can divide DB2 DBAs into three categories: those who have come to DB2 later in the life of our favourite mainframe DBMS, those who worked with DB2 back in the heady days of versions 1 and 2, and some who (like me) fall into both camps. We remember the old days, but we also see how our world of databases has changed.
Not only has DB2 itself changed, but the infrastructure within which DB2 resides has also changed – and in some ways, it’s changed dramatically. For example, the way DASD storage is provided and how it is recognized and managed by the operating system.
When I started my career as a DB2 DBA, you could actually go into the machine room (if you were allowed) and look at boxes of DASD. We could literally SEE the volumes that we were used to looking at with LISTCAT or with ISPF 3.4. Because this allowed us to visualize how our DB2 data was laid out, we could foresee problems that could be caused by unthinking placement of data sets.
Do you remember the “rules” that suggested NEVER putting a table space AND its indexes on the same volume? This was to prevent “head contention” as DB2 tried to read index leaf pages and then table space data pages all at the same time. We tried very hard to ensure that tables and indexes never coexisted on the same DASD volume.
We also had to worry about the number of extents in which a data set existed. The maximum number of extents was fairly limited in those days, and it was startlingly easy to run out of extents and get a DB2 extend error as DB2 tried, and failed, to find more space to store data. As the company I worked for was growing at an alarming rate, I found myself being called out to fix data set extend errors far too often (and why did they always happen in the middle of the night?). So I developed a trick to keep myself ahead of DB2. If a data set needs to be extended, DB2 tries to extend BEFORE the space is actually needed; if it can’t be extended, a warning message is issued. I used our automated operations tool to look for those messages, extract the jobname from the message (it always seemed to be a batch job that was responsible), and change the job into a performance class that was permanently swapped out. This stopped the job in its tracks before it could fail and gave me the time to move other things around on the volume to make space so that the extend could complete. Then I’d swap the job back in and no one was any the wiser.
If you are relatively new to DB2, you are probably wondering if I am making this stuff up. But I suspect that those of you who have been working with DB2 for years are gradually realising where some of our more esoteric DASD management rules have come from.
DASD doesn’t work like this anymore. From a z/OS (or DB2 perspective) things don’t look any different. We still give DASD volumes “meaningful” names, and we can still look at data sets with LISTCAT and ISPF 3.4. But what we are looking at is an illusion.
What is sitting in the machine room now doesn’t look like a collection of 3390 volumes. It’s now an intelligent array of inexpensive (relative term) disks that are pretending to be the DASD that we are used to seeing. These arrays use advanced management techniques (like RAID) to ensure that they can recover from most hardware failures. (How many readers have experienced a head crash?) The arrays provide significant amounts of cache memory to aid in the speed and responsiveness of the devices. I used to aim for around a 20 millisecond response time from my DASD – now people should be aiming for a fraction of that. And these virtual devices can have significantly higher capacity than the 3390s of old – in excess of 50 times the capacity for some models!
As z/OS has evolved, the maximum number of extents a data set is allowed to have has also been creeping forever upwards. I used to be limited to 15 extents, but now you can have hundreds of extents (and data sets that can happily go multi-volume as well).
But – wait a minute! Didn’t we just say that the physical DASD doesn’t exist in the way that we are viewing it from z/OS? What does that mean to extents and sizes? And what about contention? Does all that caching relieve us from the need to spread our data sets around?
What we see from a z/OS view today is actually a figment of the operating system’s imagination. A DASD administrator will define “z/OS volumes” and how those volumes map to his storage array devices. Any one volume will, in all likelihood, be spread across a large number of those inexpensive disks. And, for redundancy, multiple z/OS volumes may share space on the same disks. So our first problem would be to determine exactly which z/OS volumes are sharing space in the disk array. It would make no sense to separate our table spaces and indexes if the VIRTUAL 3390s we put them on ultimately mapped to the same disks in the storage array. In reality, it would be impossible to do that; from the z/OS side, there is no control of where a specific data set goes, and it may move over time. Of course, because of the caching that happens, we probably don’t have a performance problem to worry about anyway.
Some of you are probably thinking, “So, do extents matter anymore?”
As far as performance and DASD management goes, they probably don’t. The extents you see in LISTCAT or ISPF 3.4 are virtual. They are a description of how z/OS thinks the data sets are laid out on disks that are only pretending to be 3390s. From a physical perspective, it is not possible to see from z/OS exactly HOW the data sets are laid out – and it hardly matters. Granted, if you have a data set in hundreds of extents, you may see a small performance degradation, but for normal numbers of extents it would be hardly noticeable. What does matter, though, is that the total number of extents for a data set is still limited – even though they are only virtual extents. In other words, it’s still possible for a dataset to still fail to extend because it has reached the limit of how many extents it can grow to, and this is something that you do need to keep an eye on.
Both z/OS and DB2 have been trying to make the management and sizing of data sets more of a science than an art. I remember using my Supercalc (remember that?) spreadsheet to calculate DB2 table space sizes based on column sizes, nullability attributes, and the like (with the usual healthy guesswork on the expected number of rows). Even then, I found it irritating that I had to create data sets of a certain size when work I was doing on my PC just need me to create a file and it would just
z/OS is making allocation of new extents work more intelligently. I got a little lost in this sentence. Would this alternate wording work? Who has looked at a LISCAT of a table space to discover that the first extent is not the same size as the primary quantity defined and that the secondary extends are all different sizes? When z/OS needs to allocate a new extent for a data set and that extent is EXACTLY adjacent to the prior extent, the current highest extent is just made larger. This works with both the primary and secondary allocations. (But z/OS is still playing with VIRTUAL extents that just look as if they are adjacent). Because of this extent consolidation, it is possible for a data set to grow to a larger size than a simple calculation of (Primary Quantity + (n times Secondary Quantity)) would suggest. It also makes it possible to copy a data set from one volume to another and have the target data set run out of extents, even though the source data set is just fine.
So…just how many extents can a data set extend to these days? Well, in the best traditions of DB2 “it depends” – on how you create the table space/index space in DB2 and what SMS attributes are defined for the classes that the pageset is created in.
DASD has changed, and we need to change along with it. Don’t try to manage DASD the same way we did back in the old days. Instead, understand the implications of the way disk storage is created and managed now and revise your DB2 physical management rules accordingly.
I'd be interested in hearing from you if you have modified your DB2 pageset management rules based on the new world of virtual DASD