UPDATE 12/12/2010:
I noticed that this post is the first hit when you google for 'zil_disable', so I thought it was worth mentioning that 'zil_disable' has been replaced with a per-filesystem option in recent versions of ZFS. On snv 140 or later (which includes OI and the new Solaris Express), disabling the ZIL is now done with the zfs 'sync' property! The original post remains below, but that mechanism is thankfully now obsolete.
I've been fighting a major performance issue with NFS over ZFS for a while now. It hasn't been a deal breaker or anything - everything I want to do over NFS has been working adequately - but I've noticed that my performance is subpar.
I didn't realize just *how* subpar until I started dealing with small files. Large files were generally OK - writing/reading a single large file via NFS would be about 80% as fast as with local storage. Since most of my mythtv operations are, obviously, with large files, this wasn't a big deal.
I noticed the problem with small files when I started migrating home directories to ZFS. Copying over the 85 MB ~mythtv directory took, according to time, 3m37.114s! What the hell is that about?
I then performed the same operation with rsync over ssh, which gave me a more reasonable result of 0m17.847s. That's a little over 12 times faster!
I googled around a bit - well, more like a lot - and finally found out about zil_disable. Turns out that by adding 'set zfs:zil_disable=1' to /etc/system you can *dramatically* improve NFS performance on ZFS.
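For reference, here is a rough sketch of both mechanisms side by side: the /etc/system tunable from this post and the per-filesystem 'sync' property from the update at the top. The dataset name tank/pvr is made up for illustration, so substitute your own, and treat this as a starting point rather than a recipe.

```shell
# Legacy mechanism (builds before snv 140): a global tunable.
# /etc/system is read at boot, so this line only takes effect after a reboot.
# It goes in /etc/system, not in a shell:
#
#   set zfs:zil_disable=1

# Newer mechanism (snv 140 or later): the per-filesystem 'sync' property.
# "tank/pvr" is a hypothetical dataset name, not one taken from this post.

# Show the current setting (the default is 'standard').
zfs get sync tank/pvr

# Stop honoring synchronous write requests for this dataset only.
zfs set sync=disabled tank/pvr

# Go back to the default behavior.
zfs set sync=standard tank/pvr
```

The nice part of the property is that it can be limited to the one filesystem exported over NFS instead of changing behavior for every pool on the box.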
This is vaguely similar to turning on 'async' for Linux's NFSv3 server, but it impacts *all* filesystem operations, not only NFS. Effectively, NFS clients frequently request commit operations, which tell the server to write changes to stable storage immediately. This happens at least on every file close over NFS, and the server itself forces a write to storage whenever metadata is modified. You can imagine then why there's such a huge impact when dealing with thousands of files - every one of those operations requires a flush of the cache and a write to disk.
After disabling the ZIL, I re-ran the same copy operation and achieved dramatically improved results:
mingus ~ # time cp -r ~mythtv /nfs/pvr/test/
real 0m18.265s
user 0m0.076s
sys 0m1.740s
Now that's more like it!
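To put the three timings from this post in one place, here is a small back-of-the-envelope calculation. The helper names (to_seconds, speedup, throughput) are just ones I made up for the sketch, and the throughput figures assume the directory really is 85 MB.

```shell
# Convert a `time`-style duration such as 3m37.114s into seconds.
to_seconds() {
  echo "$1" | awk -F'[ms]' '{ printf "%.3f", $1 * 60 + $2 }'
}

# How many times faster is the second duration than the first?
speedup() {
  awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f", slow / fast }'
}

# Effective throughput in MB/s for a copy of the given size.
throughput() {
  awk -v mb="$1" -v secs="$2" 'BEGIN { printf "%.2f", mb / secs }'
}

size_mb=85                              # size of ~mythtv, from the post
nfs_zil=$(to_seconds 3m37.114s)         # cp over NFS, ZIL enabled
rsync_ssh=$(to_seconds 0m17.847s)       # rsync over ssh
nfs_nozil=$(to_seconds 0m18.265s)       # cp over NFS, ZIL disabled

echo "NFS, ZIL enabled:  ${nfs_zil}s, $(throughput "$size_mb" "$nfs_zil") MB/s"
echo "rsync over ssh:    ${rsync_ssh}s, $(throughput "$size_mb" "$rsync_ssh") MB/s"
echo "NFS, ZIL disabled: ${nfs_nozil}s, $(throughput "$size_mb" "$nfs_nozil") MB/s"
echo "NFS speedup from disabling the ZIL: $(speedup "$nfs_zil" "$nfs_nozil")x"
```

So the NFS copy went from roughly 0.39 MB/s to roughly 4.65 MB/s, which puts it within a whisker of the rsync-over-ssh run.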
I've also noticed a moderate improvement with large files - both over NFS *and* locally.
I guess I'm actually a little surprised that the base configuration of the system would result in such horrid performance. I'm willing to trade some performance for data consistency, but from what I read in the blog mentioned above, the risks seem tiny compared to the benefits of an order-of-magnitude performance increase.