Backups, rsync, and –link-dest not working

Backups, rsync, and –link-dest not working

I use Retrospect to backup most of the machines at GAEL. You may wonder why do I use a commercial tool that still shows it’s OS 9 roots, instead of open source alternatives. Well, Retrospect has some cool advantages (namely the very good support of laptops that may be disconnected abruptely from the network while a backup is in progress). Also, when I first did this setup, Amanda and other tools did not work reliably with Mac OS X file format.

While this works to backup all the desktop workstations and laptops of GAEL members, I have a problem with our xServe. It runs Mac OS X Server, and Retrospect will not backup machines with the Server version of Mac OS X with the license we have. To do that, we would have to buy a much more expensive license.

No problem. A server, due to it’s nature, doesn’t have the “sudden disappearing” problem of the laptops, so I can use a “classic” UNIX approach – and my choice was rsync and the –link-dest option. You may read about this option in the rsync manpage, but in case you don’t know, what it does is the following: instead of synchronizing a directory in the usual way, it will create a new directory with a new file tree. But, to save space, it won’t copy the non-updated files from the old tree to the new one. Instead, it creates hard links, so that both entries in the file system point to the same data on the hard drive (to the same inode), thus saving space. So, everytime you update your backup, you will create a new tree, but you will only waste the space required by the files that were updated since the last backup, and some more space for the filesystem structures that support the directory tree. You can use a command like this:

// rotate old dirs
rm -rf /Volumes/Storage/test/test.5
mv /Volumes/Storage/test/test.4 /Volumes/Storage/test/test.5
mv /Volumes/Storage/test/test.3 /Volumes/Storage/test/test.4
mv /Volumes/Storage/test/test.2 /Volumes/Storage/test/test.3
mv /Volumes/Storage/test/test.1 /Volumes/Storage/test/test.2
mv /Volumes/Storage/test/test.0 /Volumes/Storage/test/test.1

/usr/bin/rsync --rsync-path=/usr/bin/rsync -az -E -e ssh --exclude=/dev/* --exclude=/private/tmp/* --exclude=/Network/* --exclude=/Volumes/* --exclude=/private/var/run/* --exclude=/afs/* --exclude=/automount/* --exclude=/.Spotlight-V100/* --link-dest="/Volumes/Storage/test/test.1" "" "/Volumes/Storage/test/test.0/"

Side note: the -E option (capital E) is an option present on Mac OS X rsync version, that forces rsync to copy all the extended Mac file system attributes, including resource forks. It only exists in Mac OS X 10.4 (Tiger) or newer versions. If you are still using 10.3 (Panther) or older, use rsyncx. Do not use rsyncx with Tiger.

Until about a week ago, my backup machine (an old PowerMac G4) had an external SCSI Raid with 640 GB, and an internal RAID 0 (2 * 80 GB drives), besides the boot disk. All the Retrospect backups were being placed on the external RAID, and the server backups were going to the internal RAID 0. Now, I know it’s living on the edge to backup to a RAID 0. But there was really no more space, and it was a temporary situation, because the new drives for the external RAID were already ordered.

When the new drives arrived, I stored all the backups where I could for some days (640 GB was huge when we purchased the RAID, but today is relatively managable), switched the drives and created a new fresh RAID 5. Formatted it in the HFS+ file system, and copied back all the backups, including the server backups and finally trashed the internal RAID 0.

Some path adjustements on my server backup scripts, and we are back in business. But the RAID free space was getting dramatically shorter every day. I used the ‘ls -i’ command to compare the inodes of files that were supposed to be unchanged from the backup of a day to the other in the next day, and as I suspected, rsync was duplicating all the files, instead of hard-linking them.

After Googling a lot, I could not find answers for this. I tried to see if ‘cp -la’ would successfully create hard links, but to my surprise, I found out that the Mac OS X built-in ‘cp’ command would not support the “l” option. Nice. Before installing the GNU ‘cp’ version (and because I’m lazy and I didn’t want to do that) I started thinking about everything I had done since the new drives arrived. The OS was the same, the rsync command was the same, it worked before, so it had to work now. The only reason why it could be not working was because rsync, somehow, thought that all the files were changing, even when they did not.

Suddenly, the solution poped up in my head. Mac OS X has an option, associated to every HFS+ volume, called “Ignore ownership on this volume”. This is turned off by default on the boot drive, but it’s turned on by default on all the external drives you format. There’s a good reason for this: Mac OS X is a consumer product. And average users want to buy an external drive, store data on it, bring it to another Mac, and read their data. They don’t care if their UID is the same on both machines or not.

But this causes serious problems to rsync. Althought the file system will store the owner of the files, it probably won’t report it to the applications who try to read it (or will mask them to the user who’s trying to access them). Somewhere this information is filtered, between the file system and application layers. So, rsync was not getting the real UID of the files. As the files that came from the server had real UIDs, both UIDs wouldn’t match, and rsync would create a new copy because, from it’s point of view, the file had been changed.

The solution was simple – just going to the machine console, and “Get Info” of the external volume. I turned off the “Ignore ownership on this volume” setting, and rsync started operating normally again.