SWY's technical notes

Relevant mostly to OS X admins

Why you can’t always trust the Hardware Compatibility List (HCL)

Late last year, I started on a project to replace my workplace’s main AFP/SMB/NFS file storage.  Having had good experiences with Synology gear, knowing DSM well, and knowing other macadmins happy with their Synos, I ordered storage and built a system that should have performed quite well:

  • Synology 3614RPXS unit
  • 10 HGST 7K4000 4 TB drives
  • 2 250 GB SSDs
  • A 10 GbE SFP+ card, with fiber networking back to the switches
  • An additional 16 GB of ECC RAM

All taken from the Synology HCL, so I was ready to rock.  I built it as a RAID 6, started copying files, and everything was looking good.  Once I’d moved a few TB onto the NAS, I started checking sequential read speeds from it, and found an unexpected behavior: the reads would often saturate a GigE connection, as you’d expect, until they didn’t.  Large transfers would look like this:

[graph: throughput alternating between full speed and near zero]

When it’s great, it’s great.  When it wasn’t, it REALLY wasn’t.  It would run for about 90 seconds at full tilt, then 20 seconds of nearly nothing.
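If you want to reproduce this kind of check yourself, a crude sequential-read test can be run from the shell. This is a sketch, not the exact tooling I used, and the example path is a stand-in for a large file on the NAS share:

```shell
#!/bin/sh
# Crude sequential-read test: stream a large file to /dev/null and let
# dd report elapsed time and throughput (its summary goes to stderr).
seq_read_test() {
    dd if="$1" of=/dev/null bs=1048576 2>&1 | tail -1
}

# Example (hypothetical path on the mounted NAS share):
# seq_read_test "/Volumes/NAS/some-multi-gig-file.mov"
```

Run it a few times in a row on a multi-gigabyte file; the stall pattern shows up as wildly different throughput numbers between runs.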

During these file copies, the overall Volume Utilization level would ebb and flow in an inverse relationship with speed.  When volume utilization approaches and hits 100%, the network speed plummets:

[graph: volume utilization spiking as network throughput drops]

So the question became “why does the volume utilization go so high?”  I opened a ticket with Synology on Feb 2.  I ran their tests and made the requested configuration changes: direct GigE connections to take the LAN out of the equation, SMB/AFP/NFS, disabling every service that makes the NAS a compelling product.  This stumped the U.S.-based support, so it became an issue for the .tw engineers.

If your ticket goes off to the Taiwanese engineers, the communication cycles start to rival taking pen to paper and paying the government to deliver the paper. To Taiwan.  It all runs through the US support staff, and it gets slow. Eventually, I coordinated a screen-sharing session with an engineer, where I replicated the issue.  They tested more… htop, iostat.  “Can you make a new volume?”  “If you send me disks, I can!”

Meanwhile, I’m asking the storage guys I know on Twitter (and their friends), and scouring the Synology forums for anybody who has an answer.  Eventually, I find not an answer, but someone else with the same experience.  We start collaborating.  A few days later, I find another forum post from a user with the same issues.  We start exploring ideas… amount of RAM?  Version of DSM that built the storage?  RAID level?  Then we find the overlap: we all use Hitachi HUS724040AL[AE]640 drives, with at least 10 of them in a volume.  One user was fine with 8 of them in a NAS, but when he expanded to 13, performance changed, leading to his post looking for help.

I then brought this information to Synology, and on March 27, Synology informed me they were trying to replicate the issue with that gear.  On April 16, they’d finally received drives to test.  On April 21, they agreed with my conclusion:

“The software developers have came to a conclusion on this issue. That is, the Hitachi with a HUS724040… suffix indeed has slow performance in a RAID consisted of more than 6 disks.”

Despite being on the list, and despite configuring everything properly, I still ended up with gear that did not perform as expected, as they’d not tested this number of drives in a volume.  Hitachi now tells me that they’re working on the issue with Synology, but in the meantime, I’m abandoning the Hitachi drives for WD Red Pros.


Don’t test installers from a VMware Fusion shared folder

As I prepared to upgrade my workplace’s Creative Cloud installations to CC 2014, I built a virtual machine in Fusion 6 using the same methods I would use to deploy a machine to a new creative hire, because snapshots make life easier for testing installs and configurations.  On a separate build machine, I grabbed Creative Cloud Packager and used it to build an installer for the full suite.

In order to save time, I copied the CCP output .pkg to the same external USB3 storage as the VM, and used Fusion’s Shared Folders feature to share that directory.  The VM was happy to see the installer there, and ran it from the GUI.  The install started out as normal, but unexpectedly halted with a “The installation failed, contact vendor for assistance” dialog.

I started digging into /Library/Logs/Adobe/Installers, and saw that Adobe InDesign CC 2014 10.0 <datestamp>.log was the most recently written-to log, so I examined it for details about the failure.  The following line caught my attention:

10/07/14 19:35:07:861 | [ERROR] |  | OOBE | DE |  |  |  | 63746 | DS015: Unable to read symlink target of source file “/Volumes/VMware Shared Folders/SeagateUSB3/Adobe CC 2014-all but Acrobat/Build/Adobe CC 2014-all but Acrobat_Install.pkg/Contents/Resources//Setup/IDSN10.0en_US/payloads/AdobeInDesign10AppBase-mul/OEM_/Adobe InDesign CC 2014/Plug-Ins/InCopyWorkFlow/Assignment UI.InDesignPlugin/Assignment UI”(Seq 11962)

So we couldn’t read the target of a symlink.  Digging down that path in the pkg from inside the VM, we see:

[screenshot: the pkg contents viewed from inside the VM, with broken Assignment and Resources symlinks]

Yep, Assignment and Resources are broken.  So what if we look at the same path from the mac that hosts the VM?

[screenshot: the same path viewed from the host Mac]

Assignment has the same icon, but if I follow the symlink while viewing the content in the host OS, it resolves to:

[screenshot: the symlink target resolving correctly on the host]

Resources shows the same icon difference between the two views, and it too resolves correctly on the host.

To move on, I copied the .pkg onto the VM’s boot disk, which had to be done via an AFP share, since a copy from the shared folder failed with an Error 41.  That copied pkg installed without errors.
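Before running a large installer from unusual storage, you can scan it for broken symlinks from the Terminal. This is a sketch I’d use today, not part of the original troubleshooting; the example path is hypothetical:

```shell
#!/bin/sh
# List symlinks whose targets don't resolve: -type l finds links,
# and `test -e` fails when the link's target is missing.
find_broken_symlinks() {
    find "$1" -type l ! -exec test -e {} \; -print
}

# Example (hypothetical path on a Fusion shared folder):
# find_broken_symlinks "/Volumes/VMware Shared Folders/SeagateUSB3"
```

If it prints anything for a freshly built .pkg, the copy (or the filesystem it sits on) mangled the links, and the install will likely fail the same way.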

Bottom line: things can go wrong with your testing VM if your source .pkg is stored on a VMware Fusion shared folder.

iOS8 Family Sharing with an Apple ID for a child

It’s a happy coincidence that my under-13 kid accumulated the savings to buy the iPod Touch he’s wanted right as iOS 8 with Family Sharing hit the market.  Here’s my experience with the process:

1) Upgrade a device of mine to iOS8.  Pretty straightforward process- but always make a backup anyway.

2) In Settings: iCloud, there’s now a new “Set Up Family Sharing…” link:



3) Since my kid is under 13, the “Create an Apple ID for a child” option is the right choice:



4) Yep, this seems like exactly what I want:



5) But this isn’t.  I have my own domain. I don’t need a proliferation of email addresses.  Even if email is mostly dead to young people these days, and full of spam, email still isn’t going away.  Oh well, the THOU SHALT USE ICLOUD DOMAIN rule appears to be non-negotiable, so I begrudgingly complied.



6) After the standard, mandatory Security Questions (should I answer for me? For him? Must be me, since he couldn’t yet have a favorite singer in High School), I can enable Ask To Buy.




7) With that, my kid has an Apple ID.  The next day, his iPod Touch arrives, and out of the box, we attempt to authenticate with the new Apple ID.  Being a refurb, and day 1 after iOS 8 release, it ships with iOS7.  I’m not sure if that’s the cause, but when trying to use his new Apple ID, we get the most confusing error dialog I’ve ever received from an Apple product.  And I’ve received a few.


I eventually gave up on trying to authenticate with his Apple ID; it consistently gave the above.  I signed in with my own ID, attached to iTunes, and started downloading the iOS 8 update.  Following that install, I signed out, and now it was happy to accept his Apple ID credentials on the first try.


8) With that configured, it was time to see the electronic  “please dad, may I have an app?” conversation.  The Buy button gets a new behavior in this situation:



And promptly over on my device, I see an alert from Family, which links to this page in the App Store:




9) I approve and authenticate to the App Store, and with this, the installation proceeds on his Touch:


Just because it isn’t logged…

… doesn’t prove it’s not working.

With the release of iOS 8 this week, I wanted to make use of Caching Server on work’s guest WiFi, as I figured I’d have a few early-adopter staff looking to upgrade their personal devices.  In my setup, the guest SSID is tagged with a VLAN, which is routed straight out to the internet.  My Caching Server had no connection to that VLAN, so the first step was to add that VLAN to the switch port of the OS X Server running Caching Server.

With the Guest WiFi VLAN now available to the Caching Server, it needed a new network interface associated with that VLAN.  That is done via System Preferences: Network: Gear button: Manage Virtual Interfaces:




[screenshot: Manage Virtual Interfaces in the Network preference pane]

Then the [+] to Add a new VLAN:

[screenshot: the New VLAN dialog]

Name it usefully, associate it with the proper tag and interface for your environment (probably Ethernet), and [create].

With this, my new virtual interface came up with an IP on the Guest wireless subnet, as would be expected.

Per OS X Server documentation, the default behavior for Caching Server is to listen on all interfaces.  To confirm this was happening, I put Caching Server into verbose mode via

sudo serveradmin settings caching:LogLevel = verbose

And restarted the service, while tailing /Library/Server/Caching/Logs/Debug.log .  This is where I got concerned: the log only acknowledged “registering” on the local subnet, with no mention of the VLAN network.  After some troubleshooting, I was able to confirm it really was listening on the VLAN, by noting what port the HTTP server was started on (as listed in the log), and pointing a browser from a machine on the Guest WiFi to that Caching Server:port combination.

When you do this, the client browser returns a blank page, and Debug.log on the Caching Server records an Error 400 – Bad Request from that source machine, citing a non-whitelisted URL.  This confirms the service is listening on the added VLAN, despite its absence from the verbose log.  Therefore, the documentation is correct: unless overridden, Caching Server is active on all interfaces. Don’t let the fact that the log doesn’t acknowledge multiple interfaces bother you, as I did.
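That probe can be scripted with curl. This is a sketch, not Apple-documented tooling, and the host and port in the usage line are examples; take the real port from Debug.log:

```shell
#!/bin/sh
# Print the HTTP status code a host:port returns for "/".
# A listening Caching Server should answer a non-whitelisted URL with 400.
probe_port() {
    curl -s -o /dev/null -w '%{http_code}' "http://$1:$2/"
}

# Example from a Guest WiFi client (hypothetical address and port):
# probe_port 192.0.2.10 49013
```

A 400 means the service is there and talking; a timeout or connection refusal means it really isn’t listening on that interface.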

If you wish to use Caching Server for multiple networks in this way, it’s important to make sure they both appear to the internet from the same WAN IP.  Caching Server will only be available to clients that contact Apple from the same network that the Caching Server did.


And then it turns out that Caching Server can’t/won’t/doesn’t cache iOS8.  Sometimes you just can’t get ahead of the game.

Going MAD presentation: PSUMac 2014

Slides from my talk at Penn State University on combining Munki, AutoPkg, DeployStudio, and other tools to take a new Mac from new-in-box to ready-to-use.

Going MAD

It’s also on YouTube, as part of the psumac 2014 playlist.

Sonicwall “Error: Index of the interface.: Transparent Range not in WAN subnet”

I recently needed to put an internal server in my org’s DMZ, because the service didn’t play nicely with 1:1 NAT, returning unusable data to remote IP phones.  To configure my SonicWall, I started following a blog post by guru-corner.com, as I often find outsider documentation more complete than the manufacturer’s (YMMV).  However, my efforts to set up both the X4 interface and a VLAN in transparent mode were rejected with an “Error: Index of the interface.: Transparent Range not in WAN subnet” alert.  It wasn’t until I read Dell SonicWALL’s documentation that I focused on the one key word: primary.

We have 3 WAN links here, 2 I use for traffic, and a small link only suitable for “when all else fails”.  SonicWall devices give a number of options for failover and load balancing:

  • Basic Failover: Always route traffic out primary connection, secondary quietly waits to take over iff primary fails
  • Round Robin: Cycle through the outbound links for each new connection, maintaining an approximately equal number of connections through each.
  • Spill-over: Use the first defined interface for traffic, until a certain bandwidth usage is reached.  Once that happens, use round robin logic across the remaining link(s).  If there’s only a secondary link, then once primary hits the usage threshold, all subsequent requests go out the 2nd link.  (I’ve not found it documented for what duration traffic must exceed the threshold. 1 second? 1 minute? Rolling average over $time?)
  • Ratio: Round Robin, but instead of equal distribution, it can be weighted.  I see using this when there are multiple links, you’d like to use them all, but they’re not equal bandwidth.  You could set a ratio of use proportional to their percent of their bandwidth contribution to the whole.

Since the device was brought online, I’ve defined our cable modem as the primary link, with a spill-over at 85% of inbound capacity to the 2nd link.  This has worked well, but it’s what tripped me up: our cable provides a single IP, while my 2nd link routes a /28 network.  It was one of these /28 addresses I wished to apply transparent IP mode to, but since that link wasn’t defined as the primary in my load-balancing configuration, my change was rejected. After redefining the load-balancing group with X2 as the primary and a low exceeds value, I was able to define the transparent mode as desired.

Additionally, when configuring failover criteria for SonicWall links, set up multiple conditions with “Probe succeeds when either a Main Target or Alternate Target responds”, using a very reliable external host as the Alternate Target.  I use ICMP to www.google.com, with 3 DNS servers on 3 networks configured under Network: DNS.  The default SonicWall probe that monitors “is this link up?” connects to responder.global.sonicwall.com:5000.  This is fine for one of the criteria, but consider what happens if you have multiple links, all only asking “is responder.global.sonicwall.com up?”, and something happens to that service — which has happened.  Both links simultaneously and erroneously conclude “probe failed, therefore the WAN link failed. I’m supposed to shut down”, and dutifully do so, unnecessarily taking that location offline.  Not fun.

This configuration is found under Network: Failover and LB: [expand the group]: click [configure] for each link member.


Automated builds including Office updated to 14.4.1

With Office 2011 for Mac update 14.4.1, Microsoft has again caused trouble with licensing: an automated update run at the loginwindow (as tools like Munki will do) results in a Volume License install asking the user to enter a volume license key, sign in to Office 365, or trial Office 365.  To address this, all the admin needs to do is gather the license file at /Library/Preferences/com.microsoft.office.licensing.plist and replicate it on the managed machines.  To do so, I used The Luggage, with the following steps.

  1. Make a new project folder
  2. Copy a valid com.microsoft.office.licensing.plist into it
  3. Create a makefile along these lines (TITLE and REVERSE_DOMAIN here are examples — adjust for your org):
include /usr/local/share/luggage/luggage.make

# pack-Library-Preferences-% is a stock Luggage pattern rule
TITLE=office_2011_licensing
REVERSE_DOMAIN=com.example
PACKAGE_VERSION=14.4.1
PAYLOAD=pack-Library-Preferences-com.microsoft.office.licensing.plist


Then cd into the directory from Terminal, type make pkg, and the output is a simple pkg that will drop the licensing into the proper destination.  Feed this to your package management tool as an update for your Office installer, and you’re set.  I set my munki install to check for the presence of the file by MD5 checksum with an installs key, to ensure the license key always remains.
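If you want to generate that checksum yourself rather than letting your tooling do it, you can hash the file the same way munki compares its installs-item md5checksum. A portable sketch (the OS X `md5` vs. `md5sum` switch is the only platform wrinkle):

```shell
#!/bin/sh
# Compute the md5 of a file, matching the md5checksum value munki
# stores in an installs item. Uses md5 on OS X, md5sum elsewhere.
file_md5() {
    if command -v md5 >/dev/null 2>&1; then
        md5 -q "$1"
    else
        md5sum "$1" | cut -d' ' -f1
    fi
}

# Example:
# file_md5 /Library/Preferences/com.microsoft.office.licensing.plist
```

Paste the result into the installs item's md5checksum, and munki will reinstall the license whenever the file on disk drifts.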

For an alternate approach to integrating the license fix into a new, signed 14.4.1 combined installer, see Rich Trouton’s post.

10.5 to 10.9 upgrade

The time came to upgrade my mom’s trusty iMac 8,1 to Mavericks: being stuck on Firefox 16 with no access to the App Store just wasn’t cutting it anymore.  Unfortunately, the minimum system requirement to install Mavericks is to already be on 10.6.8.  I didn’t want to sit through 2 OS upgrades, so instead I went with the following:

  1. Purchase Snow Leopard. Not a required technical step, but technically required to be legally compliant. If you opt to skip this step, Apple will never know.
  2. Partition a spare external drive into 2 volumes.  I took an unused 1TB drive, made a 50 gig partition, named it Tools, and named the rest Transfer.
  3. Use AutoDMG to make an unbooted 10.9.2 .dmg, including extra packages as needed. Since she’s a CrashPlan user, Java was appropriate.
  4. Use Disk Utility to duplicate that .dmg output to both the Transfer and Tools partitions.
  5. Boot my computer from the Tools partition, and go through the setup wizard.  I then downloaded Carbon Copy Cloner into the Tools volume.
  6. Take this disk to mom’s place, and boot from the Transfer volume.  The never-touched 10.9.2 install will start, and Migration Assistant will be happy to see the computer’s internal drive and Time Machine as sources to migrate from, importing data and settings to this temporary boot volume.  Don’t let the icons confuse you; in this case, they’re backwards, and don’t represent what’s really happening:
    [photo: Migration Assistant source selection]
  7. When this completed, I had a full copy of her stuff migrated into 10.9, without yet touching her real boot volume.  In the small possibility that something went awry, no real data was at risk.  At this point, I could confirm the new volume, see that Mail upgraded, that printer drivers downloaded, etc.
  8. Once satisfied the import to the new OS was working properly, I rebooted from the Tools volume, and used Carbon Copy Cloner to sync the internal disk to match the Transfer volume. It’s smart enough to see that there’s no recovery partition on the old 10.5.8 disk, and to handle making one.
  9. With that done, it’s all good to go, and I have one more copy of her iMac in case the old hard drive starts acting its age.

Repackaging NetExtender- updated method

While my earlier blogged method for repackaging SonicWall NetExtender gave solid results, I’d rather learn to use The Luggage, since it makes consistent results easier to repeat.  The issue to solve with NetExtender is that while Dell provides a drag-and-drop .app that’s simple to drop into the Applications folder, it’s not ready to run.  Without adjustments, on first launch, it makes this request of the user:

[screenshot: NetExtender asking permission to perform maintenance tasks]

That wasn’t going to work in 2013, and it’s still not in 2014.  Approving this request leads to an authentication dialog, and once authenticated, “magic happens”, and NetExtender is happy, probably until the next MacOS update, where the dialog will return.  Therefore, the question was to determine what sort of “magic” happens there.

Enter fseventer.  Like opensnoop, it answers the question “what file(s) are being modified?”  The answer I came up with was a group of files in /usr/sbin and in /etc/ppp (consistent with my Composer work last year).

Next was to gather these files into The Luggage, and create a makefile.  My first attempts to build the package included many more cp, chown, chmod steps than necessary, but with some help from @chilcote and @mikeymikey, the following was created:

include /usr/local/share/luggage/luggage.make

# TITLE and REVERSE_DOMAIN here are examples; adjust for your org
TITLE=NetExtender
REVERSE_DOMAIN=com.hiebing
PAYLOAD=\
    unbz2-applications-NetExtender.app \
    pack-script-postinstall \
    pack-Library-LaunchAgents-com.hiebing.netextender.plist \
    pack-usr-sbin-netExtender \
    pack-usr-sbin-nxMonitor \
    pack-usr-sbin-uninstallNetExtender \
    pack-config \
    pack-man1-netExtender.1 \
    pack-ppp \
    fix-perms

pack-config: netextender_config.sh l_Library
    @sudo mkdir -p ${WORK_D}/Library/Hiebing/Scripts
    @sudo chown -R root:wheel ${WORK_D}/Library/Hiebing
    @sudo chmod -R 755 ${WORK_D}/Library/Hiebing
    @sudo ${INSTALL} -m 755 -g wheel -o root "netextender_config.sh" ${WORK_D}/Library/Hiebing/Scripts

pack-ppp: ppp.tar.bz2 l_private_etc
    @sudo ${TAR} xjf ppp.tar.bz2 -C ${WORK_D}/private/etc
    @sudo chown -R root:wheel ${WORK_D}/private/etc/ppp
    @sudo chmod -R 755 ${WORK_D}/private/etc/ppp
    @sudo chmod 644 ${WORK_D}/private/etc/ppp/peers/sslvpn
    @sudo chmod 744 ${WORK_D}/private/etc/ppp/sslvpnroute
    @sudo chmod 666 ${WORK_D}/private/etc/ppp/netextenderppp.pid
    @sudo chmod 666 ${WORK_D}/private/etc/ppp/netextender.pid
    @sudo chmod 644 ${WORK_D}/private/etc/ppp/options

# fix-perms: set the suid bit and modes NetExtender expects
fix-perms: pack-usr-sbin-uninstallNetExtender pack-usr-sbin-nxMonitor
    @sudo chmod u+s ${WORK_D}/usr/sbin/uninstallNetExtender
    @sudo chmod 744 ${WORK_D}/usr/sbin/nxMonitor

Postinstall: a one-liner to suid /usr/sbin/pppd.  In retrospect, that could have gone in the fix-perms statement.

com.hiebing.netextender.plist: a LaunchAgent that runs a configuration script, defined in pack-config
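For illustration, such a LaunchAgent could look like this. Treat it as a sketch — the label matches the plist name above and the script path matches the pack-config target, but the real plist isn’t reproduced in this post:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.hiebing.netextender</string>
    <key>ProgramArguments</key>
    <array>
        <string>/Library/Hiebing/Scripts/netextender_config.sh</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
</dict>
</plist>
```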

netExtender, nxMonitor and uninstallNetExtender need to go in /usr/sbin, so pack-usr-sbin-<item> handles that

pack-config: the script, which checks to see if ~/.netextender exists, and if not, creates the appropriate config file, based on my account config.

pack-man1 handles the man file

pack-ppp unarchives the contents of /etc/ppp, and ensures ownership and permission match the source, as NetExtender is checking these on first launch.

fix-perms does just that.
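The config script itself isn’t reproduced here, but a minimal sketch of the ~/.netextender seeding logic might look like this. The server name and file contents are placeholders — the real format depends on your SSL-VPN setup:

```shell
#!/bin/sh
# Seed a per-user NetExtender config if one doesn't exist yet.
# The contents below are placeholders, not the real file format.
CONF="$HOME/.netextender"
if [ ! -e "$CONF" ]; then
    printf '%s\n' "server=vpn.example.com:4433" "username=$USER" > "$CONF"
fi
```

Because the LaunchAgent runs this at every login, each user gets a config on first login, and existing configs are left alone.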

After running make pkg to build the pkg, I have a nice installer to push to clients, and a LaunchAgent to configure the connection at each user login.  While this handles packaging for install, there’s one more aspect to keeping a healthy NetExtender: OS X updates may remove the suid bit on /usr/sbin/pppd.  It happened on 10.8.5, 10.9, and 10.9.2.  One way to handle this is Puppet; another is an installer in munki.  Mine has the following components:

An installcheck_script that queries the permissions on /usr/sbin/pppd:

#installcheck for /usr/sbin/pppd permissions
# expected mode string for pppd with the suid bit set; adjust if your base perms differ
proper="-rwsr-xr-x"
current=`ls -l /usr/sbin/pppd | cut -c1-10`

if [ "$current" = "$proper" ] ; then
    # permissions are intact: exit 1 tells munki no install is needed
    exit 1
else
    # suid bit lost: exit 0 tells munki to run the install
    exit 0
fi


A postinstall script that is run if the permissions vary:

chmod u+s /usr/sbin/pppd

exit 0

The net result is that after any update that alters the suid bit on pppd, the next munki run will reset it, and users will not be asked to “do maintenance tasks” — and more importantly, not asked to authenticate.

I disabled downloading VMWare tools?


And I should contact myself about it?  My VMware Fusion 5 install was done using the VMware mass-deployment pkg guidelines, and sometime after that, I discovered that I had allegedly disabled myself from downloading VMware Tools into a VM.  I know I did not decide “yes, Tools are bad, better keep those out of there”, and in studying the deployment options, I don’t see where one could enable that option.  My package was simple: only the serial number placed into the Deploy.ini file, no VMs distributed with it.  I also couldn’t google up a solution, and the inability to have Tools in my Mavericks VMs was becoming a problem, as it continued through an upgrade to Fusion 6.

With nothing to lose, I wondered whether creating a new mass-deployment pkg for Fusion 6 and installing it would overwrite the “Tools disabled” flag, and it did.  I’m still unclear where that option is stored; I presume it’s relatively well obfuscated, for the benefit of environments that really do want Fusion unable to download Tools… for whatever problem that may solve.


Update, October 2014:  Mike Solin pointed out in IRC that in Fusion 7.0.1, if deploying with a deploy.ini, disabling software updates in the .ini leads to this behavior.  I reviewed my older installer’s deploy.ini, and the only non-commented, non-section-heading line is the [Volume License] heading and the license key that follows, yet I still had the “softwareUpdates = deny” behavior.

Bottom line: if your deploy.ini says to deny softwareUpdates, expect installing VMware Tools to fail in a VM.
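For reference, a deploy.ini along the lines discussed here. The license-key line is whatever your volume-license tooling generates, and exactly where the softwareUpdates key belongs in the file is an assumption — verify against VMware’s mass-deployment docs for your Fusion version:

```ini
; sketch of a Fusion deploy.ini; key placement is an assumption
[Volume License]
; your license key line goes here, as generated by VMware's deployment tooling

; the setting that produces the missing-Tools behavior in 7.0.1:
softwareUpdates = deny
```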