Software to manage UNIX and Mac OS X filesystems.

Hashing with splash in Python

Every time I do hashing in Python, I have to look it up. I forget how to do it. That's probably a bad thing, at least compared to the shell. The shell way isn't exactly simple, but I find it something I can do by rote.

I'm going to write down how I got MD5 and SHA-1 hashes for a file — which is something I occasionally have to do when posting a download, for example — thereby making it possible to find my own perfectly-tailored how-to next time:

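>>> import hashlib # hashlib is new in Python 2.5
>>> file_reference = open('/path/to/file', 'rb').read() # read the whole file, opened in binary mode
>>> hashlib.md5(file_reference).hexdigest()
'11fb57ba7927ad04534d0a341dd9c943'
>>> hashlib.sha1(file_reference).hexdigest()
'bff8e8bcd74662ee52dde369e9387cb10d5a5ece'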

There, that wasn't so bad. I just have to remember the name of the hashlib module and how to call for a hash of some data with it. You're missing the twenty other lines I tried which didn't work, of course, but you don't really need to see those. Sigh.

Without specifying hexdigest(), the result is a hash object rather than the hash value.

>>> hashlib.md5(file_reference)
<md5 HASH object @ 0x639c0>

I compared the Python hashlib results above with the following output from OpenSSL, and they are the same:

$ openssl md5 /path/to/file
MD5(/path/to/file)= 11fb57ba7927ad04534d0a341dd9c943
$ openssl sha1 /path/to/file
SHA1(/path/to/file)= bff8e8bcd74662ee52dde369e9387cb10d5a5ece

On balance, I think I'd still like comparing that hash against another string better in Python, but getting the hashes was quite a bit more confusing to me. It was enough to interrupt my flow.
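For next time, here is a minimal sketch of the same idea wrapped in functions. The names file_hexdigest and matches are my own, not part of hashlib, and reading in 64 KB chunks is just an assumption to avoid loading a large download into memory at once.

```python
import hashlib


def file_hexdigest(path, algorithm="md5", chunk_size=64 * 1024):
    """Return the hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" at EOF
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex, algorithm="md5"):
    """Compare a file's digest with a published value, ignoring case and stray whitespace."""
    return file_hexdigest(path, algorithm) == expected_hex.strip().lower()
```

Calling matches('/path/to/file', '11fb57ba7927ad04534d0a341dd9c943') would then do the string comparison in one step, and passing algorithm="sha1" covers the SHA-1 case.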

Apple purchases the rights to CUPS

Wow, Apple bought the Common UNIX Printing System (CUPS) back in February, and the [Bad link]. (I have to wonder why the delay … perhaps it has something to do with Leopard?) The software continues to be [Bad link].

Michael Sweet, one of the principals behind Easy Software Products and developer of CUPS, is now an Apple employee.

I’m sure this all means something.

Is Entourage a client that doesn’t suck?

In reviewing Jesper's list of requirements for [Bad link], I was somewhat surprised how many of his points are already handled by Microsoft Entourage 2004.

I’d give it 19 out of 26 points. There are some places where I’m being charitable towards Entourage, partly because it can support the requirement with a little work (which does not always mean scripting — and it should be noted that Entourage is very scriptable) or I didn’t understand what Jesper meant by the requirement.

Many Mac users discount Entourage. There are a couple of reasons that may be cited:

  • Microsoft bundles it with Office 2004, so you have to buy it along with a lot of other software in order to get it. That costs more than a standalone client, even if you’re eligible for the $149 [Bad link] (which is often discounted even more).
  • It’s not Universal yet, so it works natively only on PowerPC Macs. It requires Rosetta on Intel Macs today. (But Microsoft has announced that [Bad link].)
  • It comes from Microsoft, so it must be evil somehow.
  • It has a custom monolithic database for its backend. This predates any of Apple’s Core Data-type development. It can be a completely valid issue for some — such as those with network or portable home directories, or those performing regular client backups.
  • It doesn’t look like a Mac OS X application. Rather, you might say it looks more like an application from classic Mac OS … and I’d agree, but there are some quick things you can do to spruce things up. (I had some on my old blog, but I still haven’t ported the old content over. Suffice it to say that changing a lot of the font choices to “Lucida Grande” in the preferences makes a big difference.)

That said, there are many valid concerns about Entourage. I voice many through the Microsoft feedback channels available to me.

However, I think there is a huge impediment to creating a new e-mail client today, simply because of how connected this kind of product is to your whole computing experience. Any developer should take that into consideration, and realize that it’s probably an unending effort.

Hi, I’m your Mac enterprise interviewee

Mentioning [Bad link] reminded me that I’d never tooted my own horn here regarding another press placement. Why? I was interviewed by Lisa Nadile for [Bad link] back in spring, and the article appeared quite some time ago (after I’d almost forgotten about it or thought I’d missed it).

So, read below the fold to find my quoted moments — not necessarily quotable moments, mind you — in the article: [Bad link].

The sheer awesomeness of bad version checking

The small corner of the ‘net I inhabit is already abuzz with things that have broken with the Mac OS X 10.4.10 update. Software seems to be breaking not because of significant changes in this update, but because of poor version checking routines.

Awesome — isn’t it?

Anyway, here are some reports I’ve heard:

  • [Bad link]: CA eTrust 8.1 installer thinks it’s installing on Solaris 10
  • [Bad link]: Tivoli Storage Manager 5.1 installer just thinks v10.4.10 is older than 10.4.3, which is rather pedestrian
  • [Bad link] and [Bad link]: Apple Remote Desktop 3 (of all applications … ahem!) can’t display version numbers correctly, or use the new version in Smart Lists.

Too bad you can’t always use Python distutils.version to compare Mac OS X version numbers. (For Mac OS X installers, you sometimes must use JavaScript.) It seems as if the version objects from distutils.version would make things soooo much easier.

Yeah, I know … I guess I’m becoming a Python fanboy. Apologies.
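To show why 10.4.10 trips up naive checks, here is a small sketch. It deliberately avoids distutils.version (which has since been removed from newer Python releases) and uses a hypothetical helper, parse_version, that only handles purely numeric dotted versions.

```python
def parse_version(text):
    """Split a dotted version string such as '10.4.10' into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def is_older(a, b):
    """True if version a sorts before version b when compared numerically."""
    return parse_version(a) < parse_version(b)


# Plain string comparison goes character by character, so '1' < '3' decides it.
print("10.4.10" < "10.4.3")           # True: the bug
print(is_older("10.4.10", "10.4.3"))  # False: numeric comparison gets it right
```

Comparing tuples of integers is the whole trick; anything with letters in the version (betas, build numbers) would need more handling than this sketch provides.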

Comparison of sys.path on Python 2.3 and 2.5 framework builds

From Python 2.3.5, installed by default in Mac OS X Tiger:

>>> import sys
>>> sys.path
['', '/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python23.zip', '/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3', '/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/plat-darwin', '/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/plat-mac', '/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/plat-mac/lib-scriptpackages', '/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/lib-tk', '/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/lib-dynload', '/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/site-packages', '/System/Library/Frameworks/Python.framework/Versions/2.3/Extras/lib/python']
>>> for z in sys.path:
...     if '2.3' not in z:
...         print(z)

From [Bad link]:

>>> import sys
>>> sys.path
['', '/Library/Frameworks/Python.framework/Versions/2.5/lib/python25.zip', '/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5', '/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/plat-darwin', '/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/plat-mac', '/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/plat-mac/lib-scriptpackages', '/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-tk', '/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload', '/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages']
>>> for z in sys.path:
...     if '2.5' not in z:
...         print(z)

This simply means that, apart from the empty string that stands for the current directory, there are no directories in common in sys.path (the list of directories where modules can be installed) between these two versions of Python. I find that a bit annoying, since I can’t by default count on deploying one module to a single location that will work in both the default and upgraded versions of Python.
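The comparison can also be done directly rather than by eye. This is only a sketch: the two lists below are abbreviated copies of the output above (a few entries each), and common_entries is a name I made up for the example.

```python
# Abbreviated from the sys.path listings for the two framework builds
PY23_PATH = [
    '',
    '/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/site-packages',
    '/System/Library/Frameworks/Python.framework/Versions/2.3/Extras/lib/python',
]
PY25_PATH = [
    '',
    '/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages',
]


def common_entries(first, second):
    """Return the entries of first that also appear in second, keeping first's order."""
    second_set = set(second)
    return [entry for entry in first if entry in second_set]


print(common_entries(PY23_PATH, PY25_PATH))  # [''] : only the current-directory entry
```

Pasting in the full lists gives the same answer, since every real directory in each list contains its own version number.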

That lack of a default common location also has an impact if you’re managing the filesystem with [Bad link].

Disk Utility, restoring images, and periods

Yesterday, I thought I’d come across a potential solution to a problem I’d been having with restoring disk images to target drives using Disk Utility. I keep getting error 22, “invalid argument,” when restoring images over HTTP, using Disk Utility on the original Tiger version of the Mac OS X Install DVD.

While there could be [Bad link], restoring the same images locally — over FireWire target disk mode, for example — appears to succeed reliably and repeatedly. (See also [Bad link], where the core error was also not resolved.)

Let’s say that you have your images stored on a Web server — preferably one with some access controls, because you don’t want your system images open to just anyone. That should let you use Apple Software Restore’s HTTP-based image restore feature. You have already prepped the images for ASR, creating them appropriately (possibly through the use of shadow files) and applying the volume-level checksum that is required for block-level restores.

However, let’s assume that your naming convention for system images includes more than one period. After all, you’re creating images for Mac OS X versions like 10.3.9 and 10.4.8 … it’s natural to want to use those version numbers somewhere in your system image names. There’s no sense in changing those periods to some other character, right?

Well, I surmised there might have been a reason to change or remove those periods. I expected that removing — or encoding with %2e (as noted in this [Bad link]) — the periods from a test image’s name would have a beneficial effect, to let me restore it successfully via HTTP. However, it was still a no-go with the encoded characters, so I’m now back where I started. The image restoration works until near the end of the process, and I get error 22 again. Grr.
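For the record, this is roughly how the encoded names can be generated. It is a sketch only: the image name is made up, encode_periods is my own helper, and leaving the final extension's period alone is an assumption you can switch off. Note that urllib's quote() never encodes periods by itself, so they have to be replaced by hand.

```python
import os
from urllib.parse import quote, unquote


def encode_periods(filename, keep_extension=True):
    """Percent-encode a file name for a URL, also turning each '.' into '%2e'."""
    if keep_extension:
        stem, extension = os.path.splitext(filename)
    else:
        stem, extension = filename, ""
    # quote() handles spaces and other unsafe characters but treats '.' as safe
    return quote(stem).replace(".", "%2e") + extension


name = "macosx-10.4.8-base.dmg"  # hypothetical image name
encoded = encode_periods(name)
print(encoded)                   # macosx-10%2e4%2e8-base.dmg
print(unquote(encoded) == name)  # True: a web server decodes it back to the real name
```

Because the server decodes %2e back to a period before looking up the file, the image on disk keeps its original name; only the URL handed to the restore changes.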

I’m wondering out loud now whether there’s a problem with automount, since I’ve seen some search results that seem to mention error 22 in conjunction with it. I have no clue why it would fail only for HTTP-based ASR restores using a Tiger install DVD, though.

It seems I can recall the restore-via-HTTP feature working at some point in the past, but it certainly hasn’t done so to my recollection under Tiger.

Updating a base Mac OS X system image using shadow files

When deploying system software with disk images, it is helpful to have various checkpoint images that you can revert to while you’re building up a fully-fledged template computer. This is something they teach you in school (really, I was taught it in a systems administration class) and it’s more or less encoded in the solution accelerator documentation for Microsoft’s Business Desktop Deployment 2007 for Windows.

However, if you’re updating images, keeping the base and intermediate images can strain your storage capacity. Mac OS X lacks the compelling live editing features of Microsoft’s new WIM image format — which if it had appeared first on the Mac, I’d be trumpeting loudly, so I feel compelled to at least give a nod to Microsoft here.

Since I’m always struggling with storage capacity and I prefer having an up-to-date base image, I thought about this problem a bit in the context of Mac OS X imaging and have come upon what seems to be a unique solution: the use of shadow files.

Here’s the basic idea:

  1. Create a base operating system install on a partition of your template computer’s disk
  2. Capture a compressed, read-only image of it immediately (go wild, save the extra percent or two, and compress it with the Tiger-only bzip2 scheme … you can afford to do this if you have time and are only deploying with a Tiger startup disk)
  3. Scan the base image for Apple Software Restore’s block-level checksums
  4. Attach that base image to the filesystem — honoring ownership and specifying a shadow file — so that it acts as if it is writable, using hdiutil
  5. Install the latest Mac OS X combo update on it
  6. Create a new read-only compressed image of the mounted volume
  7. Prep the new updated base image for ASR.
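As a starting point for scripting steps 4 through 7, here is a sketch that only builds and prints the commands rather than running them. All paths are placeholders, shadow_update_commands is a hypothetical helper, and the flags are as I understand them from the hdiutil, installer, and asr man pages on Tiger; check them on your own system (and expect to need root) before running anything. Creating the new image with hdiutil create -srcfolder is one assumed way of capturing the mounted volume, not necessarily the best one.

```python
import shlex


def shadow_update_commands(base_image, shadow_file, mount_point, combo_pkg, updated_image):
    """Return the command lines (as argument lists) for updating a base image via a shadow file."""
    return [
        # Step 4: attach the read-only base image so writes land in the shadow file
        ["hdiutil", "attach", base_image, "-owners", "on",
         "-shadow", shadow_file, "-mountpoint", mount_point],
        # Step 5: install the combo update onto the mounted, seemingly writable volume
        ["installer", "-pkg", combo_pkg, "-target", mount_point],
        # Step 6: capture the updated volume as a new compressed, read-only image
        ["hdiutil", "create", "-srcfolder", mount_point, "-format", "UDZO", updated_image],
        # Unmount the volume; the shadow file can be discarded afterwards
        ["hdiutil", "detach", mount_point],
        # Step 7: add the checksums Apple Software Restore needs
        ["asr", "--imagescan", updated_image],
    ]


# Placeholder paths, printed as a dry run for review before anything is executed
for argv in shadow_update_commands(
        "/Images/tiger-base.dmg", "/tmp/tiger-base.shadow",
        "/Volumes/TigerTemplate", "/Updates/MacOSXUpdCombo.pkg",
        "/Images/tiger-updated.dmg"):
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Keeping the commands as data makes it easy to review them first and hand each list to subprocess later, once the flags are confirmed.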

Congratulate yourself on this use of shadow files, because you’ve saved at least one intermediate step and the space required for a full read-write disk image — or worse, an extra local partition needed only for restoration and updating the base image.

Unmount that volume and throw away the shadow file at this point if you want, because you’ve now got two system images ready for deployment. One has the base system software, and serves as a checkpoint that you can return to later; it’s the base for all future updates of that major revision of Mac OS X. The other image has the latest version of Mac OS X. If you’re deploying that image with ASR, the result will be a more secure system because it’s closer to being fully patched — and it should take less time to update it with the additional security updates and application installs — whether you use installers or Radmind or another solution — because you’ve got the bulk of the operating system done.

Unfortunately, many updates can only be installed on the startup disk and thus cannot be included in the updated base image. Beyond the combo operating system updates, few of Apple’s other installers will work on a non-startup volume. But you can install them after deploying the updated base image, using your tool of choice. For reasons like this, Geoff doesn’t see updating the base image as valuable, but in some IT environments, it may be very worthwhile.

My next step is to script this process and tie it to a watched folder. Imagine dropping a combo update into a watched folder … and letting a script generate the new, updated image for you.

Compressed Mac OS X disk image statistics

I wanted to determine how much space I’d save, and what the compression time tradeoff might be, when compressing Mac OS X system images with gzip versus bzip2. Both are options in Mac OS X Tiger, but the bzip2 compression is new in that release (and thus not backward compatible with Panther and earlier). Given that Apple’s own disk imaging is the bedrock of almost all imaging solutions for Mac OS X, including those from third-party computer management solutions providers, this information is valuable and widely applicable on the Mac platform.

If the time were the same but I saved some disk space, I’d count that as a win for one compression choice or the other. For significant space savings, I’d even sacrifice time to a longer conversion operation because disk image compression and ASR prep can be handled in the background during downtime or operational lulls. A Folder Action or launchd task might be ideal for that, and might even be a useful option if you are just archiving images.

The two disk image formats I tested are UDZO and UDBZ, as defined by the hdiutil command and its man page.

For a UDZO version of the image (gzip compression at level 9, or best), I got the following results from hdiutil convert:

Elapsed Time: 1h 1m 50.751s
File size: 4226869141 bytes, Checksum: CRC32 $8CF3F127
Sectors processed: 20184610, 19980489 compressed
Speed: 2.6Mbytes/sec
Savings: 59.1%

I ran the hdiutil command with time (a handy utility if you haven’t used it), to get these figures:

real 61m52.133s
user 51m7.575s
sys 2m47.700s

For the same image converted to UDBZ (bzip2 compression):

Elapsed Time: 2h 16m 28.057s
File size: 3956667545 bytes, Checksum: CRC32 $8CF3F127
Sectors processed: 20184610, 19980489 compressed
Speed: 1.2Mbytes/sec
Savings: 61.7%

The time results for that were:

real 136m29.205s
user 106m1.832s
sys 8m30.739s

You may draw your own conclusions, as appropriate to the data and your situation. The source image was the same in both cases: nearly 10 GB, including Mac OS X Tiger and a suite of applications with a small number of user accounts. The source and converted files were both stored on an external FireWire 400 hard disk, and converted with a 1.5-GHz 12-inch PowerBook G4.

Although the ASR-prepped (i.e. asr --imagescan) gzip and bzip2 images were not that different in the end, the bzip2 version definitely came out smaller. It has in every test I’ve ever run. However, it took more than twice as long to create the image. So, if you’re strapped for disk space to collect your images and CPU time means less to you than storage (which is often the case for me, given that I have access to a lot of G4-class Mac hardware), I would consider bzip2. However, gzip at level 9 is perfectly acceptable (still providing almost 60% savings in this case), is faster to create, and has the benefit of working on older versions of Mac OS X, as well.
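To put numbers on that tradeoff, here is a quick back-of-the-envelope calculation using only the figures reported by hdiutil above. The one assumption is that hdiutil's "sectors" are 512 bytes each, which reproduces the reported savings percentages.

```python
SECTOR_BYTES = 512  # assumption: hdiutil counts 512-byte sectors

SOURCE_BYTES = 20184610 * SECTOR_BYTES  # "Sectors processed" was the same in both runs

# Elapsed time and output size as reported by hdiutil convert
RUNS = {
    "UDZO": {"seconds": 1 * 3600 + 1 * 60 + 50.751, "bytes": 4226869141},
    "UDBZ": {"seconds": 2 * 3600 + 16 * 60 + 28.057, "bytes": 3956667545},
}


def savings_percent(compressed_bytes, source_bytes=SOURCE_BYTES):
    """Percentage of the source size saved by compression."""
    return 100.0 * (1 - compressed_bytes / source_bytes)


time_ratio = RUNS["UDBZ"]["seconds"] / RUNS["UDZO"]["seconds"]
bytes_saved = RUNS["UDZO"]["bytes"] - RUNS["UDBZ"]["bytes"]

for name, run in RUNS.items():
    print("%s: %.1f%% savings in %.0f seconds" % (name, savings_percent(run["bytes"]), run["seconds"]))
print("bzip2 took %.2f times as long and saved a further %.1f MB" % (time_ratio, bytes_saved / 1e6))
```

In this one test, bzip2 took about 2.2 times as long in exchange for roughly 270 MB, or about 6.4% of the gzip image's size.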

I should try gzip at other settings (like levels 1 and 6, for example), because you can’t choose level 9 in the Disk Utility interface; you must use the command line -imagekey flag. (You can get bzip2 from the GUI if you enable the hidden advanced disk image formats, but I still don’t see a way to choose a higher gzip level.)
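When I get to that test, a sketch like this could generate the command lines for each level. The image paths are placeholders and convert_command is a made-up helper; the zlib-level image key is the one I understand hdiutil to accept for UDZO, so confirm it against the man page on your system. Nothing is executed here.

```python
def convert_command(source_image, output_image, zlib_level):
    """Build an hdiutil convert command line for a UDZO image at a given gzip level."""
    if not 1 <= zlib_level <= 9:
        raise ValueError("zlib level must be between 1 and 9")
    return [
        "hdiutil", "convert", source_image,
        "-format", "UDZO",
        "-imagekey", "zlib-level=%d" % zlib_level,
        "-o", output_image,
    ]


# Placeholder paths: one output image per compression level to compare
for level in (1, 6, 9):
    argv = convert_command("/Images/source.dmg", "/Images/test-gzip%d.dmg" % level, level)
    print("time " + " ".join(argv))
```

Prefixing each command with time, as in the runs above, would give comparable figures for every level.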

Programmatically changing Safari’s “open safe downloads” preference

The use of the defaults command in Mac OS X is something of a black art. For one thing, you need to delve into property lists to determine what property you want to change.

Therefore, I didn’t want to resort to it to set Safari’s “open safe downloads” preference, but it looks like that particular setting is regrettably not one that you can control with Workgroup Manager (or the Apple MCX schema for managed clients).

So, to make that settings change at the command line or through a script, you can use the following tips. (If you have to do something much more advanced with plists than this, you may want to consider using PlistBuddy, a property list manipulation tool which Apple includes in their installers.)

With defaults read, you can read the AutoOpenSafeDownloads property in the user’s domain:

% defaults read com.apple.Safari AutoOpenSafeDownloads

That will return 0 or 1, the former meaning “off” or “no” and the latter meaning “on” or “yes.”

To turn AutoOpenSafeDownloads off, use the defaults write command with the -bool option at the command prompt:

% defaults write com.apple.Safari AutoOpenSafeDownloads -bool NO

To turn it on, change the boolean option to YES:

% defaults write com.apple.Safari AutoOpenSafeDownloads -bool YES
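If you would rather drive this from Python than from the shell, a thin wrapper around the same two commands might look like the sketch below. The function names are my own, it assumes a reasonably recent Python 3, and it acts on whichever user runs it, just as the defaults command does.

```python
import shutil
import subprocess

DOMAIN = "com.apple.Safari"
KEY = "AutoOpenSafeDownloads"


def write_command(enabled):
    """Build the defaults write command that turns the preference on or off."""
    return ["defaults", "write", DOMAIN, KEY, "-bool", "YES" if enabled else "NO"]


def read_command():
    """Build the defaults read command for the same preference."""
    return ["defaults", "read", DOMAIN, KEY]


def set_auto_open_safe_downloads(enabled):
    """Write the preference, read it back, and return True if it now reads as on."""
    if shutil.which("defaults") is None:
        raise RuntimeError("the defaults command was not found; this only works on Mac OS X")
    subprocess.run(write_command(enabled), check=True)
    result = subprocess.run(read_command(), check=True, capture_output=True, text=True)
    return result.stdout.strip() == "1"
```

Calling set_auto_open_safe_downloads(False) writes the value and then reads it back, so the script can confirm the change took effect. Safari may need to be relaunched before it notices.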
