rsync+hfsmode is a patch for rsync to enable recognition of Mac OS X HFS+ resource forks and Finder metadata, and to copy them to a remote filesystem. The destination system can be any OS and filesystem that supports rsync, so you can use rsync to archive Mac OS X files to servers running Linux, Solaris, et cetera.
Q: OK, what do I do with it?
A: Marion Bates has written a great practical explanation
of how to use rsync+hfsmode in a backup strategy: http://whoopis.com/howtos/rsync-hfs-howto/
Update (27Mar2008): Mainline rsync (3.0.0) has added support for ACLs, extended attributes, filename character-set conversions, and resource forks! See the rsync-3.0.0 release notes, search for "xattrs".
I haven't tested it, but reports are that it has the same limitations as rsync_hfs and Apple's recent rsync patches, namely that the receiving side must be HFS+ or HFSX to store the meta and additional data. If true, then this hfsmode patch is still relevant -- but it will require significant updates to take advantage of these new features (if false, of course, this patch becomes happily obsolete!). The ETA for any such updates, if performed by me, is murky.
The patch against rsync-2.6.3 should continue to work properly with newer versions of rsync, though this too is presently untested.
If you have any data to share, please let me know, and I'll summarize here.
Thank you!
bash$ tar xzf rsync-2.6.6.tar.gz
bash$ patch -p0 -u < rsync-2.6.6+hfsmode-1.3a1.diff
bash$ cd rsync-2.6.6
bash$ LDFLAGS="-framework CoreServices" ./configure
bash$ make proto
bash$ make man
bash$ make
NOTE: make man will fail, unless by some coincidence you have yodl2man installed. Don't worry about it.
Mac OS X uses the HFS+ filesystem by default. HFS+ files are often composed of a data fork, a resource fork, and Finder metadata. The data and resource forks contain what you normally think of when you think of file information: data, program code, etc. Finder metadata includes information like file type and creator, comments, modification dates, locked and invisible status, and Finder colors.
Traditional UNIX filesystems only store a single stream of file data (the HFS+ equivalent of the data fork). Mac OS X (or Darwin, more precisely) is a genuine BSD UNIX, but with a nontraditional filesystem. Because of this, standard UNIX tools can only see certain portions of OS X files.
The difficulties caused by HFS/HFS+ aren't new. Early Mac users of BBSes and the Internet had the same problems when uploading files from a Mac for storage by other operating systems. Since the foreign OS had no way to store the additional HFS data, the uploads would be incomplete/useless. To solve this problem, Apple and others invented conversion formats that collected the full set of file information into a single data stream. Examples (some with varying and/or additional design goals) include AppleSingle, AppleDouble, MacBinary, BinHex, and Stuffit.
The relatively new development is OS X. Now that Mac users can run decades of software written for traditional UNIX machines, some of that software needs to be updated to work properly with HFS+ files.
One such standard UNIX tool is rsync, an excellent file synchronization utility which is also great for use as filesystem backup software. Rsync builds and runs without errors on OS X, but because it is unaware of resource forks and other metadata, it creates incomplete (and therefore corrupt) backups.
There are a few ways to approach the problem.
rsync_hfs and RsyncX, by Kevin Boyd, send the whole file (both forks and metadata) to the destination system, but have no provisions for saving the "extra" data unless the destination filesystem is also HFS+. This is great, but it requires OS X on both sides, which doesn't fit my needs.
Another method frequently suggested on mailing lists and web forums is to run a script which copies the resource forks (and sometimes metadata) into new filenames on the source filesystem before beginning the synchronization. This works, but it doubles the disk space required for resource forks on the source machine. Also, most of the methods suggested don't preserve Finder metadata.
I think extensive pre-processing and source filesystem modification are awkward solutions to a simple problem -- there's no justification for a backup process that requires a significant amount of free disk space on the source system to run.
This patch will make rsync HFS+ metadata-aware. Resource forks and Finder metadata are assembled on the sender into an ephemeral file in standard AppleDouble format, before being sent to the destination.
This method preserves disk space on both sides, with zero redundant data and only a small amount of overhead per file (~100 bytes of AppleDouble headers for each file that has a resource fork and/or HFS+ metadata). It works with any destination filesystem and operating system (tested with Solaris and Linux), and even with older or unpatched versions of rsync.
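To see whether a file on the destination really is one of these AppleDouble files, you can look at its first four bytes. The sketch below is not part of the patch: is_appledouble is a hypothetical helper name, and the magic number 0x00051607 comes from the AppleSingle/AppleDouble format description rather than from anything on this page.

```shell
#!/bin/sh
# Hypothetical helper (not part of rsync or the hfsmode patch).
# Succeeds if the file begins with the AppleDouble magic number
# 0x00051607, which is the first field of an AppleDouble header.
is_appledouble() {
    # Dump the first 4 bytes as hex, then strip od's whitespace.
    magic=$(od -An -tx1 -N4 < "$1" | tr -d ' \t\n')
    [ "$magic" = "00051607" ]
}
```

For example, `is_appledouble ._filename && echo "AppleDouble"` prints a line only when the header matches. It checks the magic number only, not the entries that follow it.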
This is currently a one-way, backup-only process. To restore from the backup to an HFS+ filesystem, you will have to reassemble the component files (see instructions below).
Currently, this patch will not transfer metadata for directories! This may be added in a future revision, but I don't consider it high priority. Let me know if you do, and I might reconsider. Directories don't have types or creators, so the only metadata is the Finder flags (and only a few of them are relevant for directories). A description of the flags data structure can be found in Apple's Finder Flags Reference.
For the sake of simplicity, and by common request, this patch uses the Netatalk naming scheme by default.
For a source file named filename, the
destination filesystem will store two files:
filename, containing the source file's data
fork, and ._filename, containing the resource
fork and Finder metadata, in AppleDouble format.
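If you script against the backup, the mapping from a file to its "._" companion is easy to compute. This is only a sketch of the naming rule described above; appledouble_companion is a made-up name, not something the patch provides.

```shell
#!/bin/sh
# Hypothetical helper: print the path of the "._" AppleDouble companion
# for a given file path, following the Netatalk-style naming above.
appledouble_companion() {
    dir=$(dirname -- "$1")
    base=$(basename -- "$1")
    case $dir in
        /) printf '/._%s\n' "$base" ;;          # avoid a doubled slash at the root
        *) printf '%s/._%s\n' "$dir" "$base" ;;
    esac
}
```

For example, `appledouble_companion /backups/username/Report.doc` prints `/backups/username/._Report.doc`.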
There are at least four naming "standards" for AppleDouble files, but the Netatalk scheme seems to be preferred.
| hfs-mode | destination filename(s) / format | metadata transferred |
|---|---|---|
| appledouble (preferred) | filename / (none/various); ._filename / AppleDouble v2 (RFC 1740) | SplitForks, and compatible with FixupResourceForks. See below. |
| appledoublex | filename / (none/various); ._filename / AppleDouble v2 (RFC 1740) | FixupResourceForks. See below. |
Apple's /System/Library/CoreServices/FixupResourceForks
will reassemble two AppleDouble component files into a single HFS+
file. However, it only understands certain "entries" in an AppleDouble
file: the entries created by /Developer/Tools/SplitForks,
and for compatibility, the entries in appledouble mode.
When FixupResourceForks encounters a valid AppleDouble
file with entries it doesn't understand, it will not reassemble
the files! It will print an error message and exit.
I haven't found a standalone tool that will handle all AppleDouble
entries (or even the five I think are still relevant in OS X and
useful enough to include in mode appledoublex). I'll
probably write something to fix this some day, but for now you
should use appledouble mode unless you have an
alternate plan (and if you do, please tell me about it!).
bash$ ls -a
._filename
filename
bash$ /System/Library/CoreServices/FixupResourceForks filename
bash$ ls -a
filename
NOTE: FixupResourceForks works
recursively, so to reassemble a large number of files just specify
the directory name (or ".") in place of
filename, above. FixupResourceForks
will reassemble each pair of AppleDouble files beneath that point.
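Before reassembling a whole tree, it can be worth checking for "._" files whose data file is missing from the restored directory. How FixupResourceForks treats such leftovers isn't covered here, so the sketch below only lists them. find_orphan_appledouble is a hypothetical name, and the loop assumes filenames contain no newline characters.

```shell
#!/bin/sh
# Hypothetical pre-restore check: list every "._name" file under a
# directory that has no matching "name" file beside it.
# Assumes no filename contains a newline.
find_orphan_appledouble() {
    find "$1" -type f -name '._*' | while IFS= read -r ad; do
        dir=$(dirname -- "$ad")
        base=$(basename -- "$ad")
        data="$dir/${base#._}"      # strip the leading "._" to get the data file
        [ -e "$data" ] || printf '%s\n' "$ad"
    done
}
```

Running `find_orphan_appledouble .` in the restored directory prints nothing when every "._" file has its partner. The reverse case (a data file with no "._" companion) is not reported, since files without a resource fork or metadata have no companion to begin with.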
If your backups are important, test them regularly! This is good advice regardless of your choice of backup software, and especially so in this case.
Although rsync has proven itself to be a reliable and essential
UNIX tool, it has seen less use on Mac OS X, for reasons which
should be obvious by now. New problems may be discovered, such as
the /dev/fd issue mentioned below. Historically,
rsync bugs tend to get fixed quickly.
I'm not an official rsync developer and this is not an official part of the mainline build. I've learned a lot about the internals of rsync, but there's no substitute for an expert.
Doom and gloom aside, I fully expect this patch to work for you. It works for me, and reports from the field say it's working for other people too. I use it every day to back up a few Mac OS X desktops and PowerBooks to Linux and Solaris fileservers, and I trust it enough that it is my only backup method.
Hopefully, at some point in the future, the official rsync release will be able to handle HFS+ (and NTFS) metadata, but until then I hope to keep this patch useful and updated to a reasonably current version of rsync.
Some rsync operation modes (e.g. batch-mode) probably need special (unwritten) handling for the hfs-mode switch. I don't use them, so I haven't tested them.
Combining --hfs-mode with --hard-links
would cause file corruption on the destination filesystem. The
current version of the patch will print an error message and exit
if you try. Future versions might attempt to solve this problem
more elegantly.
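Since --hard-links can't be used with --hfs-mode, it may help to know whether the source tree contains hard-linked files at all (without --hard-links, rsync copies each name as a separate file, and --archive does not turn --hard-links on). A minimal sketch using standard find; list_hardlinked_files is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: list regular files under a directory that have
# more than one hard link, i.e. files that --hard-links would have
# linked together on the destination.
list_hardlinked_files() {
    find "$1" -type f -links +1
}
```

For example, `list_hardlinked_files /Users/username` prints one path per line. Empty output means there are no multiply-linked regular files under that directory.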
To back up your home directory:
bash$ rsync --archive --delete --verbose \
--hfs-mode=appledouble \
--delete-excluded \
--exclude=.Trash \
--exclude=Cache \
--exclude=Caches \
/Users/username fileserver:/backups/username
To back up your whole machine:
bash$ rsync --archive --delete --verbose \
--hfs-mode=appledouble \
--delete-excluded \
--exclude=/dev/fd \
--exclude=/.Trashes \
--exclude=/Network \
--exclude=.Trash \
--exclude=Cache \
--exclude=Caches \
/ fileserver:/backups/machinename
NOTE: Most of the exclusions above are arbitrary,
but the --exclude=/dev/fd is important!
Rsync has trouble with some of the special files in that directory,
when the source directory is /, and gets stuck in an
infinite loop while creating the sender file list. The problem
doesn't occur when any other source directory (even /dev!)
is used. The fix has nothing to do with hfs-mode specifically, so it
might require a change in mainline rsync. Until it's fully resolved,
the exclude above should work.
--disable-times will force
checksums for all files, which seems like a good idea when
using rsync as part of a backup plan, but I want something
that only forces checksums for ephemeral files. One (ugly) way
would be to send an incorrect file length in the flist,
so that ephemeral files always fail the initial date-and-size check,
but that's kind of kludgey.
If you like this (or hate it), send a message and let me know: reynhout@quesera.com