SWY's technical notes

Relevant mostly to OS X admins

DiskWarrior: it’s good for RAID too.

Earlier this week, my production Xserve suddenly started behaving badly: massive latency, timeouts authenticating users, dismal disk performance, and lots of SBBOD (spinning beach ball of death) on the console. Checking the logs, I found many errors such as

client: 0x825200 : USER DROPPED EVENTS!
callback_client: ERROR: d2f_callback_rpc() => (ipc/send) timed out (268435460) for pid 17336

along with fseventsd errors. I also noted that CrashPlan PROe had simply halted mid-scan. I started with a reboot, which fixed the issues for that day, but by morning they’d returned, with similar log errors. I poked around, starting with Disk Utility to run checks. It reported the first two volumes I checked as healthy, but got stuck for over half an hour on a third. After finally persuading it to cancel that check, I brought over a go-to disk maintenance tool I’ve used for decades: Alsoft’s DiskWarrior. It has never harmed my data on a directory rebuild, but I have to admit that the idea of running it on two production AFP storage RAIDs (RAID 5 and RAID 6) and a RAID 1 boot volume gave me pause. But I double-checked last night’s backups, and had at it.
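If you want to gauge how often the two errors quoted above turn up in a log, something like the following would do. It is only a rough sketch, not what I ran at the time: the function name `count_errors` and the labels are made up for illustration, and the patterns are lifted straight from the log lines above.

```python
import re
from collections import Counter

# Patterns taken from the two log lines quoted above.
# The labels on the left are hypothetical names for this sketch.
PATTERNS = {
    "dropped_events": re.compile(r"USER DROPPED EVENTS"),
    "callback_timeout": re.compile(r"d2f_callback_rpc\(\).*timed out"),
}

def count_errors(lines):
    """Return a Counter of how many lines match each pattern."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
    return counts
```

Pass it any iterable of lines, for example an open file handle on whichever log you are checking. A count that climbs again after a reboot is a reasonable hint that the underlying problem is still there.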

DiskWarrior found Volume Information errors on all three RAIDs and fixed them, and in the 48 hours since, the server has been humming along as expected.

I remember using DW back on an AppleShare IP server in the pre-OS X days. Unfortunately, HFS+ has its flaws, but DW has a good shot at fixing them.

Now… where’s my native ZFS?

