rsync+hfsmode

last updated: 18Dec2005

rsync+hfsmode is a patch for rsync to enable recognition of Mac OS X HFS+ resource forks and Finder metadata, and to copy them to a remote filesystem. The destination system can be any OS and filesystem that supports rsync, so you can use rsync to archive Mac OS X files to servers running Linux, Solaris, et cetera.

Q: OK, what do I do with it?
A: Marion Bates has written a great practical explanation of how to use rsync+hfsmode in a backup strategy: http://whoopis.com/howtos/rsync-hfs-howto/

Update (27Mar2008): Mainline rsync (3.0.0) has added support for ACLs, extended attributes, filename character-set conversions, and resource forks! See the rsync-3.0.0 release notes, search for "xattrs".

I haven't tested it, but reports are that it has the same limitations as rsync_hfs and Apple's recent rsync patches, namely that the receiving side must be HFS+ or HFSX to store the metadata and additional data. If true, then this hfsmode patch is still relevant -- but it will require significant updates to take advantage of these new features (if false, of course, this patch becomes happily obsolete!). The ETA for any such updates, if performed by me, is murky.

The patch against rsync-2.6.3 should continue to work properly with newer versions of rsync, though this too is presently untested.

If you have any data to share, please let me know, and I'll summarize here.

Thank you!


Files Available

rsync-2.6.3+hfsmode-1.2b2 (stable)
rsync-2.6.6+hfsmode-1.3a1 (alpha! not recommended!)

To apply the patch and build rsync-2.6.6 (requires Mac OS X Xcode Tools!):

   bash$ tar xzf rsync-2.6.6.tar.gz
   bash$ patch -p0 -u < rsync-2.6.6+hfsmode-1.3a1.diff
   bash$ cd rsync-2.6.6
   bash$ LDFLAGS="-framework CoreServices" ./configure
   bash$ make proto
   bash$ make man
   bash$ make

NOTE: "make man" will fail, unless by some coincidence you have yodl2man installed. Don't worry about it.


The Problem

Mac OS X uses the HFS+ filesystem, by default. HFS+ files are often composed of a data fork, a resource fork, and Finder metadata. The data and resource forks contain what you normally think of when you think of file information: data, program code, etc. Finder metadata includes information like file type and creator, comments, modification dates, locked and invisible status, and Finder colors.

Traditional UNIX filesystems only store a single stream of file data (the HFS+ equivalent of the data fork). Mac OS X (or Darwin, more precisely) is a genuine BSD UNIX, but with a nontraditional filesystem. Because of this, standard UNIX tools can only see certain portions of OS X files.

The difficulties caused by HFS/HFS+ aren't new. Early Mac users of BBSes and the Internet had the same problems when uploading files from a Mac for storage by other operating systems. Since the foreign OS had no way to store the additional HFS data, the uploads would be incomplete/useless. To solve this problem, Apple and others invented conversion formats that collected the full set of file information into a single data stream. Examples (some with varying and/or additional design goals) include AppleSingle, AppleDouble, MacBinary, BinHex, and Stuffit.

The relatively new development is OS X. Now that Mac users can run decades of software written for traditional UNIX machines, some of that software needs to be updated to work properly with HFS+ files.

One such standard UNIX tool is rsync, an excellent file synchronization utility which is also great for use as filesystem backup software. Rsync builds and runs without errors on OS X, but because it is unaware of resource forks and other metadata, it creates incomplete (and therefore corrupt) backups.


Possible Solutions

There are a few ways to approach the problem.

rsync_hfs and RsyncX, by Kevin Boyd, send the whole file (both forks and metadata) to the destination system, but have no provisions for saving the "extra" data unless the destination filesystem is also HFS+. This is great, but it requires OS X on both sides, which doesn't fit my needs.

Another method frequently suggested on mailing lists and web forums is to run a script which copies the resource forks (and sometimes metadata) into new filenames on the source filesystem before beginning the synchronization. This works, but it doubles the disk space required for resource forks on the source machine. Also, most of the methods suggested don't preserve Finder metadata.

I think extensive pre-processing and source filesystem modification are awkward solutions to a simple problem -- there's no justification for a backup process that requires a significant amount of free disk space on the source system to run.


My Solution

This patch will make rsync HFS+ metadata-aware. Resource forks and Finder metadata are assembled on the sender into an ephemeral file in standard AppleDouble format, before being sent to the destination.

This method preserves disk space on both sides, with zero redundant data and only a small amount of overhead per file (~100 bytes of AppleDouble headers for each file that has a resource fork and/or HFS+ metadata). It works with any destination filesystem and operating system (tested with Solaris and Linux), and even with older or unpatched versions of rsync.

This is currently a one-way, backup-only process. To restore from the backup to an HFS+ filesystem, you will have to reassemble the component files (see instructions below).

Currently, this patch will not transfer metadata for directories! This may be added in a future revision, but I don't consider it high priority. Let me know if you do, and I might reconsider. Directories don't have types or creators, so the only metadata is the Finder flags (and only a few of them are relevant for directories). A description of the flags data structure can be found in Apple's Finder Flags Reference.


Filename mapping details

For the sake of simplicity, and by common request, this patch uses the Netatalk naming scheme by default.

For a source file named filename, the destination filesystem will store two files: filename, containing the source file's data fork, and ._filename, containing the resource fork and Finder metadata, in AppleDouble format.

There are at least four naming "standards" for AppleDouble files, but the Netatalk scheme seems to be preferred.
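
The mapping above can be sketched in a few lines of shell. This is only an illustration of the naming rule, not code from the patch, and sidecar_path is a hypothetical helper name.

```shell
# Print the AppleDouble companion name for a data-fork path,
# following the ._filename convention described above.
# Assumes the path does not end in a slash.
sidecar_path() {
    case $1 in
        */*) printf '%s/._%s\n' "${1%/*}" "${1##*/}" ;;  # keep the directory part
        *)   printf '._%s\n' "$1" ;;                      # bare filename
    esac
}

# Example:
#   sidecar_path /Users/me/report.doc    prints /Users/me/._report.doc
```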

Available hfs-modes

Each hfs-mode is listed with its destination filename(s) / format, followed by the metadata transferred.

appledouble (preferred)
   filename / (none/various)
     • Data Fork
   ._filename / AppleDouble v2 RFC1740
     • Resource Fork
     • Finder info: type/creator, flags (color, locked, stationery, etc.)
   These AppleDouble files are identical to those generated by Apple's SplitForks, and compatible with FixupResourceForks. See below.

appledoublex
   filename / (none/various)
     • Data Fork
   ._filename / AppleDouble v2 RFC1740
     • Resource Fork
     • Finder info: type/creator, flags (color, locked, stationery, etc.)
     • Finder comment
     • Finder create/modify/backup/access dates
     • Finder native filesystem filename
   These AppleDouble files are NOT compatible with FixupResourceForks. See below.

Reassembly of restored files

Apple's /System/Library/CoreServices/FixupResourceForks will reassemble two AppleDouble component files into a single HFS+ file. However, it only understands certain "entries" in an AppleDouble file: those created by /Developer/Tools/SplitForks, which, for compatibility, are the same entries this patch writes in appledouble mode.

When FixupResourceForks encounters a valid AppleDouble file with entries it doesn't understand, it will not reassemble the files! It will print an error message and exit. I haven't found a standalone tool that will handle all AppleDouble entries (or even the five I think are still relevant in OS X and useful enough to include in mode appledoublex). I'll probably write something to fix this some day, but for now you should use appledouble mode unless you have an alternate plan (and if you do, please tell me about it!).

   bash$ ls -a
      ._filename
      filename
   bash$ /System/Library/CoreServices/FixupResourceForks filename
   bash$ ls -a
      filename

NOTE: FixupResourceForks works recursively, so to reassemble a large number of files just specify the directory name (or ".") in place of filename, above. FixupResourceForks will reassemble each pair of AppleDouble files beneath that point.
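
Before restoring, it can be useful to confirm that a ._ file really is an AppleDouble file. The following is a minimal sketch, assuming the header layout from RFC 1740: a 4-byte magic number 0x00051607 followed by a 4-byte version 0x00020000. The helper name is_appledouble is hypothetical and not part of the patch. It checks only the header, not which entries the file contains, so it cannot tell whether FixupResourceForks will accept the file.

```shell
# Return success if the file starts with the AppleDouble v2 header
# (magic 00 05 16 07, version 00 02 00 00), per RFC 1740.
is_appledouble() {
    [ -f "$1" ] || return 1
    # Dump the first 8 bytes as hex and strip all whitespace.
    header=$(od -An -tx1 -N8 < "$1" | tr -d ' \t\n')
    [ "$header" = "0005160700020000" ]
}

# Example use on a restored tree:
#   is_appledouble ._filename && echo "looks like AppleDouble v2"
```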


Caveats

If your backups are important, test them regularly! This is good advice regardless of your choice of backup software, and especially so in this case.
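
One cheap check on the destination is to look for ._ files whose data-fork file is missing, since each ._filename should sit next to a filename. This is a sketch only: find_orphan_sidecars is a hypothetical helper, it assumes filenames contain no newlines, and it does not replace a real test restore.

```shell
# List ._ files under a backup directory that have no matching
# data-fork file next to them.
find_orphan_sidecars() {
    find "$1" -type f -name '._*' | while IFS= read -r sidecar; do
        dir=${sidecar%/*}          # directory holding the ._ file
        base=${sidecar##*/}        # the ._filename itself
        data="$dir/${base#._}"     # expected data-fork path
        [ -e "$data" ] || printf '%s\n' "$sidecar"
    done
}

# Example use on the fileserver:
#   find_orphan_sidecars /backups/username
```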

Although rsync has proven itself to be a reliable and essential UNIX tool, it has seen less use on Mac OS X, for reasons which should be obvious by now. New problems may be discovered, such as the /dev/fd issue mentioned below. Historically, rsync bugs tend to get fixed quickly.

I'm not an official rsync developer and this is not an official part of the mainline build. I've learned a lot about the internals of rsync, but there's no substitute for an expert.

Doom and gloom aside, I fully expect this patch to work for you. It works for me, and reports from the field say it's working for other people too. I use it every day to back up a few Mac OS X desktops and PowerBooks to Linux and Solaris fileservers, and I trust it enough that it is my only backup method.

Hopefully, at some point in the future, the official rsync release will be able to handle HFS+ (and NTFS) metadata, but until then I hope to keep this patch useful and updated to a reasonably current version of rsync.

Some rsync operation modes (e.g. batch-mode) probably need special (unwritten) handling for the hfs-mode switch. I don't use them, so I haven't tested them.

Combining --hfs-mode with --hard-links would cause file corruption on the destination filesystem. The current version of the patch will print an error message and exit if you try. Future versions might attempt to solve this problem more elegantly.


Examples

To back up your home directory:

   bash$ rsync --archive --delete --verbose \
               --hfs-mode=appledouble \
               --delete-excluded \
               --exclude=.Trash \
               --exclude=Cache \
               --exclude=Caches \
               /Users/username fileserver:/backups/username

To back up your whole machine:

   bash$ rsync --archive --delete --verbose \
               --hfs-mode=appledouble \
               --delete-excluded \
               --exclude=/dev/fd \
               --exclude=/.Trashes \
               --exclude=/Network \
               --exclude=.Trash \
               --exclude=Cache \
               --exclude=Caches \
               / fileserver:/backups/machinename

NOTE: Most of the exclusions above are arbitrary, but the --exclude=/dev/fd is important! Rsync has trouble with some of the special files in that directory, when the source directory is /, and gets stuck in an infinite loop while creating the sender file list. The problem doesn't occur when any other source directory (even /dev!) is used. The fix has nothing to do with hfs-mode specifically, so it might require a change in mainline rsync. Until it's fully resolved, the exclude above should work.
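
If the exclusion list grows, it may be easier to keep it in a file and pass it with rsync's standard --exclude-from option. The sketch below assumes a patched rsync (unpatched builds do not know --hfs-mode). The names write_excludes, backup_machine and the RSYNC variable are hypothetical and used only for illustration.

```shell
# Write the exclusion patterns from the whole-machine example to a file.
write_excludes() {
    cat > "$1" <<'EOF'
/dev/fd
/.Trashes
/Network
.Trash
Cache
Caches
EOF
}

# Run the whole-machine backup.
#   $1 = exclude file, $2 = destination (e.g. fileserver:/backups/machinename)
# Set RSYNC=echo to print the command line instead of running it.
backup_machine() {
    "${RSYNC:-rsync}" --archive --delete --verbose \
        --hfs-mode=appledouble \
        --delete-excluded \
        --exclude-from="$1" \
        / "$2"
}
```

For example, "write_excludes excludes.txt" followed by "backup_machine excludes.txt fileserver:/backups/machinename" should behave like the whole-machine command above, with the patterns read from excludes.txt.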


To Do


Special Thanks To


Contact Info

If you like this (or hate it), send a message and let me know: reynhout@quesera.com

thanks