Track-It! Database 101: Transaction Logs and Recovery Model

Version 2

    Sometimes we'll recommend, or be asked about, shifting the Track-It! database's Recovery Model from Full Recovery (the SQL Server default) to Simple Recovery.  I thought I'd put a simplified explanation of it here.  (I know any DBAs reading this are probably going to be cringing by the end of the discussion, but it's close enough to get the point across!)


    Your "database" is made up of two files: a data file and a log file.


    Think of the data file as holding the tangible data as it appears right now, and the log file as holding the changes that took place to get it to this point.  The SQL engine locks these files and keeps them synchronized whenever the database is mounted for use.  When you do a full database backup, SQL puts an index mark in the log file to say "a full backup occurred here".
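    For instance, a full backup is a single T-SQL command.  (The database name TRACKIT and the backup path below are examples only; substitute your own.)

```sql
-- Full database backup; database name and path are examples only.
BACKUP DATABASE [TRACKIT]
TO DISK = N'D:\Backups\TRACKIT_Full.bak'
WITH INIT, NAME = N'TRACKIT full backup';
```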


    Under the default Full Recovery model, pretty much every transaction is recorded in the transaction log and held there.  In the Full Recovery model, the transaction log can grow to a huge size if you don't periodically back it up (we've seen logs of over 100 GB!).  Effectively, backing up the transaction log also truncates it, which purges everything prior to the last "full backup occurred here" index mark.
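    A transaction log backup is just as simple.  (Again, the database name and path are examples, not your actual values.)

```sql
-- Back up (and thereby truncate) the transaction log.
-- Database name and path are examples; adjust for your environment.
BACKUP LOG [TRACKIT]
TO DISK = N'D:\Backups\TRACKIT_Log.trn';
```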


    It should be noted that the actual disk space taken up by the transaction log will not be reduced after the transaction log backup is performed unless you physically shrink the file, but the size of the transaction log alone isn't necessarily an indication of a problem.  It is the amount of free space within the transaction log that matters.  For example, just because you have a 5 GB transaction log does not necessarily mean SQL Server's performance will suffer; if 99% of it is free space, SQL will be able to use the log's space efficiently.  It should also be noted that it is always better to size the log yourself rather than letting SQL Server grow it automatically, because autogrowth can be very expensive and can eventually lead to poor performance.  The appropriate size of the transaction log varies with each environment depending on the type of work being done, but the typical recommendation is to set it to around 25% of the size of the database.
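    You can check the size and percent-used of your logs, and size the log file yourself, with the commands below.  (The database name TRACKIT and the logical file name TRACKIT_log are assumptions; query sys.database_files to find yours.)

```sql
-- Report the size and percent-used of every database's transaction log.
DBCC SQLPERF(LOGSPACE);

-- Manually size the log file rather than relying on autogrow.
-- The logical file name (TRACKIT_log) is an example; find yours with
-- "SELECT name FROM sys.database_files;" run in the TRACKIT database.
ALTER DATABASE [TRACKIT]
MODIFY FILE (NAME = TRACKIT_log, SIZE = 1GB);
```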


    Shifting to the Simple Recovery model alleviates the requirement of doing regular transaction log backups. The transaction log can still grow in size depending on the work being done, but SQL Server will automatically truncate the log at a checkpoint, freeing the space within it again. The drawback to using the Simple Recovery model is that you cannot do "point-in-time" data recovery in a disaster recovery situation.
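    The switch itself is a one-line change.  (Database name is an example.)

```sql
-- Switch the recovery model to Simple.
ALTER DATABASE [TRACKIT] SET RECOVERY SIMPLE;

-- Confirm the change.
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = N'TRACKIT';
```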



    So, let's discuss disaster recovery when using the Full Recovery model:


    In theory, you could take last night's full database backup and restore it into a functional SQL system.  Then you could take the transaction log from right now (or just prior to the catastrophe), start at that index mark from last night's full backup, and "fast forward" right up to how your data looks at this very moment.  This is a point-in-time recovery, and using this method can allow you to recover from any catastrophe, bringing your data right up to the way it was before the catastrophe.
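    As a sketch, that two-step restore looks like this.  (Database name, file paths, and the STOPAT timestamp are all placeholders for illustration.)

```sql
-- 1. Restore last night's full backup, but leave the database
--    able to accept more restores (NORECOVERY).
RESTORE DATABASE [TRACKIT]
FROM DISK = N'D:\Backups\TRACKIT_Full.bak'
WITH NORECOVERY;

-- 2. "Fast forward" the log to just before the catastrophe,
--    then bring the database online (RECOVERY).
RESTORE LOG [TRACKIT]
FROM DISK = N'D:\Backups\TRACKIT_Log.trn'
WITH STOPAT = N'2024-06-01T09:45:00', RECOVERY;
```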



    Now, let's contrast that to the Simple Recovery model:  Since the transaction log does not contain everything that happened, it is, for all intents and purposes, worthless to you as a recovery tool.  You're limited to restoring your data from last night's backup, with no ability to "fast forward" to the point of the catastrophe.  Any new data that's been added since the backup was made is not in the database, and will have to be manually re-created.



    So, the logical choice would seem to be: stick to Full Recovery!  Except, if the catastrophe that takes out SQL is a drive failure, and your transaction logs are on that drive, you're right back to using only last night's full backup and having to manually re-create everything that occurred since then.  (You DO store your nightly backups on a separate system, right?)

    Now, you can establish "transaction log shipping", which has the SQL Agent automatically make backups of the transaction log at intervals (say, every 10 or 15 minutes) and move those off to a NAS device or network path.  This will allow you to still have those logs, plus last night's backup, and you can recover right back to the point-in-time of the catastrophe.

    This process of transaction log shipping will require that you have a fairly robust SQL server, as it's doing a lot more than just hosting your database now.  Since transaction log shipping uses the SQL Agent, you must use a full SQL edition (Enterprise, Standard, etc.); you can't do this in the free Express edition, as it does not have an SQL Agent.  Transaction log shipping also requires that you have adequate NAS or network storage, and a reliable connection between that and your SQL server.  And finally, it adds a significant layer of complexity to both your regular backup process, and your restore process should it ever become necessary.


    Now, if you're a government agency or financial corporation, and you have databases that contain data that's absolutely business-critical and subject to review from outside agencies, this point-in-time recovery is mandatory.  You already have this established for other databases, so adding Track-It! to that process is a minor effort.  (You also have DBAs who make very large salaries and are laughing at this discussion, if they made it this far.)


    However, for the average Track-It! user, Full Recovery and point-in-time restorations may impose time and equipment costs that are not justified.  In this case, shifting to the Simple Recovery model wouldn't come with any practical negative trade-offs; in the event of a catastrophe, you restore last night's backup, and manually recover anything that occurred since.


    One last word of caution: The data and log files are locked by the SQL engine when they are "spun up" or mounted.  This means that if you use a tape backup process that does not explicitly contain an "SQL module" or an "open file module", it probably does not understand the critical timing and synchronization that takes place between the two files, and making a tape backup of the drive that contains the database (or right-clicking on the files and having Windows do a "copy" of them) will probably NOT result in a backup that you can restore.  If there's any doubt, dust off the backup that you made last night, and restore it under a temporary name.  If it restores and SQL doesn't give you any errors, then you should be OK.
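    One way to do that test restore, sketched below.  (All names and paths are examples; the MOVE clauses relocate the restored files so they don't collide with the live database's files.)

```sql
-- Restore last night's backup under a temporary name to prove it's usable.
-- Logical file names (TRACKIT, TRACKIT_log) and paths are examples only.
RESTORE DATABASE [TRACKIT_TEST]
FROM DISK = N'D:\Backups\TRACKIT_Full.bak'
WITH MOVE N'TRACKIT'     TO N'D:\Temp\TRACKIT_TEST.mdf',
     MOVE N'TRACKIT_log' TO N'D:\Temp\TRACKIT_TEST.ldf',
     RECOVERY;

-- Clean up when done.
DROP DATABASE [TRACKIT_TEST];
```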