Checkpointing Logical Disc Driver

 

CLDD is still in its early days: it is experimental code and it may well bite you. On the other hand, if you want to try it, you can see what progress has been made on one of the problems every sysadmin knows only too well.

And what is this famous problem?

It concerns backups. Do you throw all the users off while you back up the system? Maybe your application offers a nicer solution, such as making access to the databases read-only, but in most cases you must choose:

Either limit user activity in order to obtain a consistent and complete backup, or run the risk that the backup is inconsistent because you are backing up a moving target. In practice, backups made while users are on the system are usually useless in a real emergency: some files will be missing, file A will not agree with file B, and so on. Sorting out the mess takes so long that you may wonder why you made such backups at all. Many sites resort to one complete backup a week, taken at the weekend with all users off the system, and risk "moving target" backups during the week.

CLDD is at least a partial answer. It interposes itself between the user applications and the real disc driver.

The first step in using CLDD is to associate it with a real disc partition. From then on all disc access is via the cldd device. So the next step would be to create an ext2 filesystem on the cldd device. Use this in the same way you would use the real disc partition.

In fact CLDD creates two devices which we call "master" and "slave". So far in this description we have only used the "master" device.

When a consistent backup is required you still have to get all users to stop working. But now instead of backing up the data (which of course takes time) you umount the "master" device, enable the "slave" device, mount the "master" and allow the users on again. This should take only a few seconds.

The "slave" device now gives you an image of the "master" device as it was at the moment the "slave" was enabled, i.e. while the "master" was unmounted. You can back up the "slave" device while the users continue their work. When the backup is finished (and checked) you disable the "slave".
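
To make the master/slave idea concrete, here is a toy in-memory model written in Python. It is only a sketch of one plausible scheme (save the old contents of a block the first time it is overwritten after the checkpoint) and is not taken from the CLDD source; the class and method names are invented for illustration.

```python
class CheckpointDisk:
    """Toy model of a master/slave checkpointing device (hypothetical names)."""

    def __init__(self, n_blocks, block_size=4):
        self.block_size = block_size
        # Live data, as seen through the "master" device.
        self.blocks = [bytes(block_size) for _ in range(n_blocks)]
        # Block number -> contents at the time the slave was enabled.
        self.saved = {}
        self.slave_enabled = False

    def enable_slave(self):
        """Take a checkpoint: old contents are preserved from now on."""
        self.saved.clear()
        self.slave_enabled = True

    def disable_slave(self):
        """Drop the checkpoint and release the preserved blocks."""
        self.saved.clear()
        self.slave_enabled = False

    def master_write(self, n, data):
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        # Preserve the old block only on the first write after the checkpoint.
        if self.slave_enabled and n not in self.saved:
            self.saved[n] = self.blocks[n]
        self.blocks[n] = data

    def master_read(self, n):
        return self.blocks[n]

    def slave_read(self, n):
        if not self.slave_enabled:
            raise RuntimeError("slave device is not enabled")
        # Unchanged blocks are read straight from the live data.
        return self.saved.get(n, self.blocks[n])


disk = CheckpointDisk(n_blocks=4)
disk.master_write(0, b"AAAA")
disk.enable_slave()              # users are briefly off the system here
disk.master_write(0, b"BBBB")    # users carry on working
print(disk.master_read(0))       # b'BBBB'
print(disk.slave_read(0))        # b'AAAA'
```

The backup program would read only through slave_read, so it sees a frozen image however busy the master is.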

What are the disadvantages? Besides the fact that it is still experimental, the main problem is that only half the disc space is usable. If you have a 2 GB partition then only 1 GB is available for the ext2 filesystem. The reason, of course, is that in the worst case the user applications could change every single block while the slave is enabled, and CLDD needs to store the changes somewhere. At least disc space is much less expensive than it was a few years ago.
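
The worst-case argument can be checked with a few lines of Python. This is a back-of-the-envelope sketch, assuming one preserved block is needed per distinct block written while the slave is enabled; the function names are made up for the example and say nothing about how CLDD lays out its reserve area.

```python
def blocks_to_preserve(writes):
    """Blocks that must be kept for a sequence of block numbers written
    while the slave is enabled: one per distinct block touched."""
    return len(set(writes))


def usable_bytes(partition_bytes):
    """Space left for the filesystem when half the partition is reserved."""
    return partition_bytes // 2


n_blocks = 8                          # size of the usable area, in blocks
light_load = [1, 1, 1, 5]             # repeated writes to a block count once
worst_case = list(range(n_blocks))    # every block rewritten

print(blocks_to_preserve(light_load))   # 2
print(blocks_to_preserve(worst_case))   # 8, as many as the usable area
print(usable_bytes(2 * 1024**3))        # 1073741824, i.e. 1 GB of a 2 GB partition
```

In the worst case the reserve has to be as large as the usable area itself, which is where the factor of two comes from.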

 

Download CLDD version 0.2

 

August 1999 - Allan Latham alatham@flexsys-group.com