Python

Turn RSS feed items into Keynote slides with Python Appscript

I wanted to set up a kiosk system that displayed some static and some constantly-updated information. I already had a kiosk that used Apple Keynote in full-screen presentation mode to show a slideshow, but all of the slides themselves were static.

In pondering the idea of keeping the slideshow fresh with more dynamic and up-to-date information, I wondered if I could start with an RSS or ATOM feed as the source. After all, the Web is full of dynamically-updated content, thanks to blogging and content management (CMS) software. These updates are often streamed out in RSS or ATOM format. If that software is already present and the workflow is (or can be) a part of a group’s Web presence, then repurposing those same dynamic updates is an efficient use of people’s time. That the software is often reasonable for use by trained staff or dedicated volunteers is another benefit.

With that in mind, I constructed a proof of concept that let me add new slides to a Keynote presentation.

For this task, I wanted to avoid controlling Keynote with AppleScript because I’d had an interest in trying my hand at Python Appscript for a while. I found Appscript to be simple to use when controlling Keynote — and it made as much if not more sense to me than AppleScript did.

Unfortunately, Appscript is now deprecated, so while my proof of concept (below) works for now, it may not continue to do so with the inevitable march of software progress.

For this sample project, we need to have the Universal Feed Parser and Appscript installed.

Unfortunately, with the disappearance of Mark Pilgrim and his projects from the Web, the original source site for the Universal Feed Parser no longer exists. However, UFP is still open source software, and people who kept the source are interested in developing it further. When installing UFP, Python “easy_install” will fetch it from its home on Google Code.

At this point, with two key Python modules in strange circumstances, you are probably wondering what I was thinking. My only defense is that I did all of this some time ago and neither the Appscript nor the UFP situation was apparent at the time.

Install the Universal Feed Parser. This requires that you are logged in with a “sudo”-ready account, or have used “su” to switch to such a user.

$ sudo easy_install feedparser
Password:
Searching for feedparser
Reading http://pypi.python.org/simple/feedparser/
Reading https://code.google.com/p/feedparser/
Reading http://code.google.com/p/feedparser/
Best match: feedparser 5.1
Downloading https://feedparser.googlecode.com/files/feedparser-5.1.zip
Processing feedparser-5.1.zip
Running feedparser-5.1/setup.py -q bdist_egg --dist-dir /tmp/easy_install-tOBghK/feedparser-5.1/egg-dist-tmp-s77JZ2
zip_safe flag not set; analyzing archive contents...
Adding feedparser 5.1 to easy-install.pth file

Installed /Library/Python/2.7/site-packages/feedparser-5.1-py2.7.egg
Processing dependencies for feedparser
Finished processing dependencies for feedparser

Install Python Appscript.

$ sudo easy_install appscript
Password:
Searching for appscript
Reading http://pypi.python.org/simple/appscript/
Reading http://appscript.sourceforge.net
Best match: appscript 1.0.0
Downloading http://pypi.python.org/packages/source/a/appscript/appscript-1.0.0.tar.gz#md5=6619b637037ea0f391f45870c13ae38a
Processing appscript-1.0.0.tar.gz
Running appscript-1.0.0/setup.py -q bdist_egg --dist-dir /tmp/easy_install-RuEa2e/appscript-1.0.0/egg-dist-tmp-_nOlL1
zip_safe flag not set; analyzing archive contents...
Adding appscript 1.0.0 to easy-install.pth file

Installed /Library/Python/2.7/site-packages/appscript-1.0.0-py2.7-macosx-10.7-intel.egg
Processing dependencies for appscript
Finished processing dependencies for appscript

Launch Keynote and create a new presentation from the template selector. We will enter the “title” and “summary” RSS/ATOM feed data into the first Keynote presentation, so you need to have a presentation open first.

In either a Python script or the Python interpreter in Terminal, use the following sample code to download the feed data from the Apple Hot News RSS feed and enter it into the Keynote document.

I picked the Apple Hot News RSS feed because it is commonly seen on Macs that use the RSS screen saver and its defaults. Because of that, I also thought the feed was likely to be around long enough into the future to serve as a good example for others that came across this article. I am not endorsing this particular feed, don’t follow it, and don’t know its contents at any given time. You are free to substitute the RSS/ATOM feed URL of your own choosing — well, as long as the Universal Feed Parser can handle it — with the following example.

#Copyright (c) 2010-2012 by Jeremy Reichman
#All rights reserved.
#Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
# *     Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
# *     Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

# Import the feedparser module
import feedparser
# Define a feed to download and parse
d = feedparser.parse('http://images.apple.com/main/rss/hotnews/hotnews.rss')

# Import appscript
from appscript import *
# Define a variable to refer to the Keynote application
kn = app('Keynote')

# Process the items in the RSS feed represented by the variable
for entry in d.entries:
    # For each item in the RSS feed, create a new slide
    kn.slideshows[0].make(new=k.slide)
    # Add the title of the RSS feed item as the slide title
    kn.slideshows[0].current_slide.title.set(entry['title'])
    # Add the summary of the RSS feed item as the slide body
    kn.slideshows[0].current_slide.body.set(entry['summary'])

The Keynote document will be populated with new slides. Each new slide will feature the “title” of the RSS/ATOM feed item as the slide title, and the “summary” as the slide body. That’s it!
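To experiment with the feed half of this loop without Keynote or Appscript, here is a minimal sketch of the data the loop consumes. It rests on several assumptions: it uses the standard library's xml.etree.ElementTree as a stand-in for the Universal Feed Parser, it handles only plain RSS 2.0 items, and the slide_pairs function and the sample feed are made up for illustration.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up RSS 2.0 document standing in for a real feed.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <item><title>First item</title><description>First summary</description></item>
    <item><title>Second item</title><description>Second summary</description></item>
  </channel>
</rss>"""


def slide_pairs(rss_text):
    """Return a list of (title, body) tuples, one per RSS item."""
    root = ET.fromstring(rss_text)
    pairs = []
    for item in root.iter("item"):
        # Missing elements become empty strings rather than None
        title = item.findtext("title", default="")
        body = item.findtext("description", default="")
        pairs.append((title, body))
    return pairs


for title, body in slide_pairs(SAMPLE_RSS):
    print(title, "->", body)
```

Each pair corresponds to one pass through the Keynote loop above: the first element would go to the slide title and the second to the slide body.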

I’ll leave it as an exercise for another day to consider how to work a dynamic batch of RSS/ATOM-based slides into an existing static presentation.

Read preferences from a property list within a Mac OS X Python script

I had a need to read some settings from a Python script on Mac OS X recently. I wanted to be able to change selected parameters for the script — some of which could be site or implementation specific — without embedding them directly in the code. Since the script was Mac OS X-only, using a property list seemed like a good idea.

With customizable settings from a property list, a script could become useful and more customizable for a wider community — or even different internal audiences.

I thought reading preferences was going to take a lot of effort. I was, however, pleasantly surprised at how easily I was able to accomplish it.

Asking others who had been down this route before resulted in some links to Apple docs. I wanted a more concrete example of how it was done in Python, and I got a great one.

That example came from reading the source to munkilib from the open source project, Munki. Since Munki had its own preference file, it needed to read from it, and its example was very enlightening.

Frogor directed me to a specific spot in the Munki code. That spot demonstrated how to read a preferences file with CoreFoundation.

Munki also has its own internal defaults for preferences. The munkilib/munkicommon.py example showed how to implement default settings in their absence in a property list. That provides a fallback position so that you always have some value available. In Munki, setting the defaults was done within a function. It seemed that it would be more generic if those defaults were separated out of the preference-reading function.

My own example of how to read a plist is outlined below. This focuses on just what you need in order to read the preferences and provide default settings for a larger script. A single set of preferences can be shared between scripts, and each script can encode its own set of defaults. While you can create your own keys and values, for the example below, I will use the keys “StringPreference,” “BooleanPreference,” “ArrayPreference,” and “DictionaryPreference.”

  1. Create a new property list. There are several ways to do this, including the property list editor, which as of Xcode 4 is built into Xcode and is no longer a standalone application.
  2. Save the file. The name will be used later.
  3. Set up the basic shell of the script that will read the plist, and import the CoreFoundation function it needs.
    #!/usr/bin/env python

    from Foundation import CFPreferencesCopyAppValue
  4. Add in a variable for the script’s bundle ID. The bundle ID uniquely identifies your script’s preferences. It is written in reverse DNS notation, which should be familiar to almost anyone who has dealt with property lists before. It’s handy to have this defined globally for your script so that you can refer back to it as needed.
    #!/usr/bin/env python

    from Foundation import CFPreferencesCopyAppValue

    this_bundle_id = 'com.example.your-script-here'
  5. Create a function to read a preference by its key and bundle ID. The business end of the function is the use of CFPreferencesCopyAppValue to get a value for a key.
    #!/usr/bin/env python

    from Foundation import CFPreferencesCopyAppValue

    this_bundle_id = 'com.example.your-script-here'

    def get_preference(preference_key, bundle_id=this_bundle_id):
        """
        Get the preference value for a given combination of preference
        key and bundle ID. Returns the requested preference value.
        """

        # Get the specified preference key from the specified preference
        # bundle ID using CoreFoundation
        preference_value = CFPreferencesCopyAppValue(preference_key, bundle_id)
        return preference_value
  6. Tie the get_preference function together with a default_preferences object. This way, any missing preferences will fall back to the defaults encoded in your script. Any preferences that are set in a property list will override the defaults.
    #!/usr/bin/env python

    from Foundation import CFPreferencesCopyAppValue

    this_bundle_id = 'com.example.your-script-here'
    default_preferences = {
        'StringPreference': False,
        'BooleanPreference': False,
        'ArrayPreference': list(),
        'DictionaryPreference': dict(),
    }

    def get_preference(preference_key, bundle_id=this_bundle_id):
        """
        Get the preference value for a given combination of preference
        key and bundle ID. Returns the preference value.
        """

        # Get the specified preference key from the specified preference
        # bundle ID using CoreFoundation
        preference_value = CFPreferencesCopyAppValue(preference_key, bundle_id)
        # If the value is not set in the property list, get a default value
        # from the default_preferences object
        if preference_value is None:
            preference_value = default_preferences.get(preference_key)
        return preference_value

The script won’t do anything until we create a main() or other functions that call the get_preference function. Since you’d be reading preferences as part of a larger script, this is fine for now. However, with what’s written already, you can test out reading preferences interactively with the shell and Python interpreter.

  1. Start by checking the property list in the shell. In my case, I created an example property list that sets two of the four preferences.
    $ defaults read com.example.your-script-here
    {
        BooleanPreference = True;
        StringPreference = abc;
    }
  2. Open Terminal, run “python” at the command prompt, and paste the script code above at the interactive Python “>>>” prompt.
  3. Run the get_preference function and print its output.
    >>> print(get_preference('StringPreference'))
    abc
    >>> print(get_preference('BooleanPreference'))
    True
    >>> print(get_preference('ArrayPreference'))
    []
    >>> print(get_preference('DictionaryPreference'))
    {}

That’s it: the expected results were returned. Notice that the values for StringPreference and BooleanPreference are taken from the property list, while the empty array and dictionary come from the default_preferences in the script.
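The override-then-fall-back behavior can also be tried away from a Mac. The following is only a sketch under stated assumptions: it swaps in the standard library's plistlib (plistlib.loads needs Python 3.4 or later) for CFPreferencesCopyAppValue, so it parses plist data directly and does not go through the preferences system at all. The sample plist contents and the three-argument get_preference signature are made up for illustration.

```python
import plistlib

# Script-level defaults, mirroring the default_preferences idea above.
DEFAULT_PREFERENCES = {
    "StringPreference": "",
    "BooleanPreference": False,
    "ArrayPreference": [],
    "DictionaryPreference": {},
}

# Made-up plist contents that set two of the four keys.
SAMPLE_PLIST = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>BooleanPreference</key>
    <true/>
    <key>StringPreference</key>
    <string>abc</string>
</dict>
</plist>"""


def get_preference(key, stored, defaults=DEFAULT_PREFERENCES):
    """Return the stored value for key, or the default if it is unset."""
    value = stored.get(key)
    if value is None:
        value = defaults.get(key)
    return value


# Parse the plist bytes into an ordinary dictionary
stored = plistlib.loads(SAMPLE_PLIST)
for key in sorted(DEFAULT_PREFERENCES):
    print(key, "=", get_preference(key, stored))
```

As in the interactive session above, the string and Boolean values come from the plist, while the array and dictionary fall back to the defaults.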

Update: Greg pointed out that there is a certain degree of danger using #!/usr/bin/env python as the shebang line in the script. Should a system have a non-Apple installation of Python, then /usr/bin/env python might return that. Another Python is probably less likely to be able to bridge CoreFoundation, and thus wouldn’t be able to use the CFPreferencesCopyAppValue call to read preferences.

I think the overall shebang danger is small because few Mac OS X systems will have an alternative Python installed, but clearly your chances of that increase in certain situations. To eliminate this risk, do what Munki does and insert the #!/usr/bin/python shebang instead.

Install Mercurial 1.9, Dulwich, and Hg-Git on Mac OS X Lion

Hg-Git is the Mercurial extension to use if you want to connect to local or remote Git repositories. I exclusively use Mercurial and Hg-Git for all of my Github transactions, so I can personally vouch that it works.

Now that Hg-Git has been updated to better support Mercurial 1.9, let’s see if we can get an Hg toolchain working on Lion. Since I did that on Snow Leopard a few days ago, Hg-Git has made it into PyPI. The installation instructions this time are a bit more streamlined, because we can now use easy_install to get Hg-Git and its dependencies.

To get the toolchain set up, we’ll need Xcode. The Xcode suite includes tools we’ll need to make Python easy_install work, along with Subversion (a prerequisite for Hgsubversion, which I’ll talk about in a later article) and other useful tools.

The Xcode installation is a multi-step install process. Both current download methods — the developer download through connect.apple.com (if you have a paid Mac Developer Account) and the Mac App Store — give you an “Install Xcode” application. That application runs a second, real installer that you have to finish before you actually have the Xcode tools available in a ready-to-use state. This is very similar to the situation for Mac OS X Lion, so you may be developing a sense of familiarity with the situation.

To install Mercurial:

  1. Download Mercurial 1.9.2 or later. The binary packages are standard Mac OS X packages; get the one for Lion.
  2. Install Mercurial.

To add Hg-Git to Mercurial on Lion:

  1. Download Xcode 4.1 or later if you don’t already have it. You can do this through connect.apple.com or via the Mac App Store.
  2. Install Xcode if it is not already installed.
    • Open the developer disk image, run the installer inside it, and then run the “Install Xcode” application that was placed in /Applications.
    • Run the “Install Xcode” application that was placed in /Applications by the Mac App Store.
  3. Open Terminal. Run the following command, which will install hg-git and its dependencies (including dulwich, of which you’ll want version 0.8.0 or later):
    $ sudo easy_install 'hg-git>=0.3.1'
    Password:
    Searching for hg-git>=0.3.1
    Reading http://pypi.python.org/simple/hg-git/
    Reading http://hg-git.github.com/
    Best match: hg-git 0.3.1
    Downloading http://pypi.python.org/packages/source/h/hg-git/hg-git-0.3.1.tar.gz#md5=4b15867a07abb0be985177581ce64cee
    Processing hg-git-0.3.1.tar.gz
    Running hg-git-0.3.1/setup.py -q bdist_egg --dist-dir /tmp/easy_install-_Uauza/hg-git-0.3.1/egg-dist-tmp-rERQMH
    zip_safe flag not set; analyzing archive contents...
    Adding hg-git 0.3.1 to easy-install.pth file

    Installed /Library/Python/2.7/site-packages/hg_git-0.3.1-py2.7.egg
    Processing dependencies for hg-git>=0.3.1
    Searching for dulwich>=0.8.0
    Reading http://pypi.python.org/simple/dulwich/
    Reading http://samba.org/~jelmer/dulwich
    Reading http://launchpad.net/dulwich
    Best match: dulwich 0.8.0
    Downloading http://samba.org/~jelmer/dulwich/dulwich-0.8.0.tar.gz
    Processing dulwich-0.8.0.tar.gz
    Running dulwich-0.8.0/setup.py -q bdist_egg --dist-dir /tmp/easy_install-bHRaTM/dulwich-0.8.0/egg-dist-tmp-MNy6RK
    dulwich/_objects.c: In function ‘py_parse_tree’:
    dulwich/_objects.c:101: warning: implicit conversion shortens 64-bit value into a 32-bit value
    dulwich/_objects.c: In function ‘cmp_tree_item’:
    dulwich/_objects.c:148: warning: implicit conversion shortens 64-bit value into a 32-bit value
    dulwich/_objects.c:152: warning: implicit conversion shortens 64-bit value into a 32-bit value
    dulwich/_objects.c: In function ‘py_sorted_tree_items’:
    dulwich/_objects.c:192: warning: implicit conversion shortens 64-bit value into a 32-bit value
    dulwich/_objects.c:224: warning: implicit conversion shortens 64-bit value into a 32-bit value
    dulwich/_pack.c: In function ‘py_apply_delta’:
    dulwich/_pack.c:98: warning: implicit conversion shortens 64-bit value into a 32-bit value
    dulwich/_pack.c:101: warning: implicit conversion shortens 64-bit value into a 32-bit value
    zip_safe flag not set; analyzing archive contents...
    dulwich.tests.__init__: module references __file__
    dulwich.tests.test_index: module references __file__
    dulwich.tests.test_objects: module references __file__
    dulwich.tests.test_pack: module references __file__
    dulwich.tests.utils: module references __file__
    Adding dulwich 0.8.0 to easy-install.pth file
    Installing dul-daemon script to /usr/local/bin
    Installing dul-web script to /usr/local/bin
    Installing dulwich script to /usr/local/bin

    Installed /Library/Python/2.7/site-packages/dulwich-0.8.0-py2.7-macosx-10.7-intel.egg
    Finished processing dependencies for hg-git>=0.3.1
  4. Edit your ~/.hgrc to enable the Hg-Git Mercurial extension, as noted in the Hg-Git documentation.
    [extensions]
    hgext.bookmarks =
    hggit =

That’s it! Once Mercurial 1.9 plus Hg-Git 0.3.1 or later are installed and you’ve enabled Hg-Git in your ~/.hgrc, you are ready to use Mercurial with local and remote Git repositories.

Parse a vendor RSS feed to get the latest available product version

There may be times when you want to obtain the number of the latest available version — not just the latest installed version — of a software package through automated means. If the vendor or project provides a syndication feed (either RSS or Atom) that describes new releases, then you may be able to parse that data and get the newest release from it.

As an example, let’s examine the RSS feed for Group Logic’s ExtremeZ-IP. Other developers provide RSS/Atom feeds for their releases, but the EZIP feed is a good one to start a demonstration with because it is generally structured well.

We can break apart the EZIP feed with the Universal Feed Parser module for Python, which you must obtain separately.

Update: Because of the Mark Pilgrim situation (also described here), the Universal Feed Parser Web site is no longer available. There is an alternative source for the Universal Feed Parser at Google Code, and I have cloned the Universal Feed Parser repository to Bitbucket from there.

>>> import feedparser
>>> ezip_feed = feedparser.parse('https://www.grouplogic.com/ezipreleases.xml')
>>> ezip_feed['feed']['title']
u'ExtremeZ-IP Latest Releases'
>>> ezip_feed.version
'rss20'

As you can see, the “ExtremeZ-IP Latest Releases” feed is automatically recognized as RSS 2.0. I prefer to use HTTPS for fetching these feeds whenever possible, so if the developer has an HTTP feed, I try to see if it also works with HTTPS.

Next, let’s find out where the version numbers are kept in the feed. It looks like they are in the entry title, based on reading the feed in Safari RSS. I can confirm that with the Universal Feed Parser. We’ll want to examine the title of every feed item so we can better handle both current and future entries from the feed. There are more entries in the EZIP feed than I will print out.

>>> for entry in ezip_feed.entries:
...     entry['title']
...
u'ExtremeZ-IP File and Print Server - Version 7.1.1x94'
u'ExtremeZ-IP File and Print Server - Version 7.1x14'
u'ExtremeZ-IP File and Print Server - Version 7.0x41'

We get Unicode strings as output from the Universal Feed Parser. That’s why the quoted strings are preceded with a “u” character.

I’d like to strip out the version string from the title element in each entry. I’m going to do so by splitting on whitespace and getting the last group of characters from the string. (This doesn’t account for text like “Hot Fix,” as seen in the EZIP feed, but it is still a good enough starting point for my purposes.)

>>> for entry in ezip_feed.entries:
...     entry['title'].split()[-1]
...
u'7.1.1x94'
u'7.1x14'
u'7.0x41'

By stripping out the build number after “x” in the version string, you potentially lose some data. In the EZIP feed, there are entries where two consecutive version numbers are the same except for the build number after the “x.” However, depending on your needs, it may still be useful to eliminate that part of the version string, so we’ll do that next.

>>> for entry in ezip_feed.entries:
...     entry['title'].split()[-1].partition('x')[0]
...
u'7.1.1'
u'7.1'
u'7.0'

We really only need the most current or “top” item in the feed, since that should give us the newest release number. The newest version in this particular feed should be in the first entry. That’s “entries[0]” below, because Python indexes lists starting from zero.

>>> ezip_current_release = ezip_feed.entries[0].title
>>> ezip_current_release_version = ezip_current_release.split()[-1]
>>> ezip_current_release_version_stripped = ezip_current_release_version.partition('x')[0]
>>> ezip_current_release_version_stripped
u'7.1.1'

There, we now have the version number of the most current release, 7.1.1, for the product. We have drawn it straight from the developer’s syndicated feed, so it is as current as the developer makes it.

How could that be useful? The output can be compared against other data, like the currently-installed version. The comparison, in turn, could be made part of a monitoring workflow, so you could get alerts if you fall behind.

If we didn’t strip the build number after the “x,” we would be left with a complex version number. Some Python tools, like distutils, will not currently handle the trailing characters in the version number well.

I have found that you can improve upon distutils’ StrictVersion/LooseVersion version number handling by switching to parse_version in pkg_resources (“from pkg_resources import parse_version as V”). More coverage of that topic appears in PEP 386. If you are comparing the original version strings from the EZIP feed with similarly complex output from elsewhere, then I would probably use the pkg_resources module.
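If neither distutils nor pkg_resources is at hand, a comparison that keeps the build number can be sketched by hand. This is an illustration only, not what either module does internally: the parse_ezip_version function is a made-up name, and it assumes every title ends in a version shaped like “7.1.1x94,” using the sample titles shown earlier.

```python
import re


def parse_ezip_version(title):
    """Split the trailing version in a title into ((major, minor, ...), build)."""
    version_text = title.split()[-1]
    number_text, _, build_text = version_text.partition("x")
    if not re.match(r"^\d+(\.\d+)*$", number_text):
        raise ValueError("no version number found in %r" % title)
    numbers = tuple(int(part) for part in number_text.split("."))
    # A missing or non-numeric build is treated as build 0
    build = int(build_text) if build_text.isdigit() else 0
    return numbers, build


titles = [
    "ExtremeZ-IP File and Print Server - Version 7.1.1x94",
    "ExtremeZ-IP File and Print Server - Version 7.1x14",
    "ExtremeZ-IP File and Print Server - Version 7.0x41",
]

# Tuples compare element by element, so max() finds the newest release
newest = max(titles, key=parse_ezip_version)
print(parse_ezip_version(newest))
```

Because the result is an ordinary tuple, it can be compared directly against a tuple built the same way from the installed version. One caveat of this simple approach is that (7, 1) and (7, 1, 0) compare as different versions.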

Get the display resolution on Mac OS X with PyObjC

I came across this hint about display properties on StackOverflow and thought it was worthwhile to write down for later. If you want to get the screen or Desktop resolution of a Mac via Python, you can do so with PyObjC.

First, let’s get the information about the main screen:

>>> from AppKit import NSScreen
>>> print(NSScreen.mainScreen().frame())
<NSRect origin=<NSPoint x=0.0 y=0.0> size=<NSSize width=1920.0 height=1200.0>>

If you want just the horizontal and vertical resolution from that blob of data, you can pull the width and height out:

>>> print(NSScreen.mainScreen().frame().size.width)
1920.0
>>> print(NSScreen.mainScreen().frame().size.height)
1200.0
>>> width, height = NSScreen.mainScreen().frame().size.width, NSScreen.mainScreen().frame().size.height
>>> width, height
(1920.0, 1200.0)

This might be useful in situations where you don’t have any of the “hundred of portable libs in Python that give you access to that information” — such as in your stock Mac OS X Python installation. To clarify: I’m in no way meaning to belittle that there are portable libraries that would let you do the same thing, but you also have to program for your audience and its constraints. One of the reasons I appreciate Python over some scripting languages is that you get so much capability in the Standard Library. However, on Mac OS X, you don’t get modules like pygame by default (yet … and maybe never) while you do get PyObjC.
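For a Mac with more than one display, the same idea extends to every attached screen. This is a hedged sketch: NSScreen.screens() is the AppKit call that lists displays, while the screen_sizes function name and the fallback to an empty list when PyObjC is missing are my own choices for illustration.

```python
try:
    from AppKit import NSScreen  # provided by PyObjC on Mac OS X
except ImportError:
    NSScreen = None  # not on a Mac, or PyObjC is not installed


def screen_sizes():
    """Return a list of (width, height) tuples, one per attached display."""
    if NSScreen is None:
        return []
    sizes = []
    for screen in NSScreen.screens() or []:
        size = screen.frame().size
        sizes.append((size.width, size.height))
    return sizes


print(screen_sizes())
```

On a system without PyObjC this simply prints an empty list, which keeps a larger script from failing at import time.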

Get the number of pages for a PDF using the Quartz 2D Python bindings

A question came up on the AppleScript-Users mailing list and I wanted to make note of it, because I came up with a quick Python-based way to answer it. Here, I reiterate that answer on how to get the number of pages from a PDF.

This Python 2.5 sequence (shown as run interactively from the Terminal on Leopard — not as a script) works for me using both a local file and a file from an AFP-mounted volume. It uses the Quartz 2D bindings for Python, which are part of the Python install on Mac OS X (since 10.3?), so it is not pure Python.

$ python
>>> import os
>>> from CoreGraphics import *
>>> pdf_filename = '/Volumes/path/to/PDF/document.pdf'
>>> provider = CGDataProviderCreateWithFilename(pdf_filename)
>>> pdf = CGPDFDocumentCreateWithProvider(provider)
>>> pages = pdf.getNumberOfPages()
>>> pages
9
>>> print(pages)
9

Substitute your own path for the pdf_filename variable’s contents, and you can try it yourself. (Above, $ is the shell prompt, and >>> is the Python interpreter’s command prompt, so don’t copy/paste those.)

Since I was doing this in Terminal, I could also type “pdf_filename = '” and then drag and drop a file from the Finder into the Terminal window to insert its path, complete the quoting, and press Return. (This saves some time and reduces the potential for mistyping.)

You could wrap this sequence into one longish command line and run it from AppleScript with “do shell script.” Or you could save it as a Python script file and again run it with “do shell script.” Or, you could skip AppleScript and just use Python, which I prefer over AppleScript. (Python seemed very natural to me with my minor AppleScript background, and I’m pretty sure I’ve written more of it than AppleScript by this point.)

This is derived from “Example 2: Splitting a PDF File” in the Using Python with Quartz 2D on Mac OS X document at the Apple Developer Connection.

That ADC example also shows how you could specify a PDF file’s path at the command line (with sys.argv), rather than hardcoding it as I did for my example.
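Here is a rough sketch of what the command-line variant might look like as a script. It reuses only the CoreGraphics calls from the interactive session above, so it assumes Apple's bundled Python with the legacy Quartz 2D bindings. The count_pages and main function names and the usage message are made up for illustration.

```python
import sys


def count_pages(pdf_filename):
    """Return the number of pages in the PDF at pdf_filename."""
    # Imported here so the script can still print its usage message on a
    # Python that lacks Apple's legacy CoreGraphics module.
    from CoreGraphics import (CGDataProviderCreateWithFilename,
                              CGPDFDocumentCreateWithProvider)
    provider = CGDataProviderCreateWithFilename(pdf_filename)
    pdf = CGPDFDocumentCreateWithProvider(provider)
    return pdf.getNumberOfPages()


def main(argv):
    """Expect exactly one argument after the script name: the PDF path."""
    if len(argv) != 2:
        sys.stderr.write("usage: %s /path/to/document.pdf\n" % argv[0])
        return 2
    print(count_pages(argv[1]))
    return 0
```

In a saved script, you would call sys.exit(main(sys.argv)) under the usual “if __name__ == '__main__':” guard.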

There may be a shorter way to do this since I’m no expert on Quartz 2D. (I just appreciate that the bindings are there for a scripting language I happen to like a lot.) I honestly don’t know what’s happening when the provider and pdf objects are being set, but I really don’t need to know for this.

Conditionally import a Python module if it is available

I have been struggling with the issue of module availability in Python. While the “batteries included” nature of the standard library is great, there are occasionally times when I need to resort to a module that isn’t included with Python.

There are also times where I’m using modules whose status has changed. I expect that to happen more in the eventual transition to Python 2.6 and 3.0, because I’ve used modules that are being deprecated.

So I wondered how I could conditionally import a module if it was available, without stopping the flow of my scripts — and gracefully handle situations where it is missing. And here’s one basic answer: use a “try” block to catch the “ImportError” exception. For example, if I were concerned that DNSPython wasn’t going to be installed on my target system:

try:
    import dns.resolver # Import DNSPython
    dnspython_available = True
except ImportError:
    dnspython_available = False

The “except ImportError” clause could specify a different module to load, or other workarounds entirely. You could map the namespaces in the “try” block so the rest of your script doesn’t notice the change in module functions, if you have a way to work around the missing module. Perhaps you could even try to obtain and install the module, at least for temporary use by your script.
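The namespace mapping idea can be sketched with a small helper. This assumes a Python with importlib.import_module (2.7 or later). The optional_import function is a made-up name, and simplejson is just an example of a third-party module that has a standard library fallback.

```python
import importlib


def optional_import(module_name):
    """Return the named module, or None if it cannot be imported."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


# Prefer a third-party module when present, else fall back to the
# standard library. Either way, the rest of the script just uses "json".
json = optional_import("simplejson") or optional_import("json")

# This is None unless DNSPython is installed
dns_resolver = optional_import("dns.resolver")

print(json.dumps({"dnspython_available": dns_resolver is not None}))
```

The rest of the script can then test “dns_resolver is None” wherever it needs to choose a workaround.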

Thanks to authors of the article Python modules – how do they work? for the assist. The information under heading 2.11, “Is my module available?” answered my question and has given me something to think about.

Python to remove commands module and some Mac-specific modules

Drat! I’ve learned that the commands module for Python, which I use, is deprecated and will be removed in Python 3.0, according to PEP 3108. That means I can no longer safely rely on commands.getstatusoutput(). I’ve frequently used this call in the past because it seemed the sanest, easiest way to call for a shell utility and get both its output and return status (for success or error).

I’ll have to find some other way to perform the same function — preferably one that will work on Python 2.3 from Mac OS X Tiger, Python 2.5.1 from Leopard, and future Pythons. The stated replacement for a number of similar modules (including popen2, which frankly kind-of frightened me off with its name) is the subprocess module from PEP 324, but I don’t know if that will work for my purposes.
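For what it’s worth, here is a minimal sketch of how subprocess might cover the same ground, assuming Python 2.4 or later (so it would not help on Tiger’s Python 2.3). The get_status_output function is a made-up name. Unlike commands.getstatusoutput(), it takes the command as a list of arguments rather than a shell string, and it returns the plain exit code rather than a wait()-style status.

```python
import subprocess
import sys


def get_status_output(args):
    """Run a command given as a list; return (exit_status, combined_output)."""
    process = subprocess.Popen(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # fold stderr into stdout, as commands did
        universal_newlines=True,    # return text rather than bytes
    )
    output, _ = process.communicate()
    return process.returncode, output.rstrip("\n")


# Run the current Python interpreter as a stand-in for a shell utility
status, output = get_status_output([sys.executable, "-c", "print('hello')"])
print(status, output)
```

Passing a list avoids shell quoting problems, at the cost of not supporting pipes or wildcards in the command itself.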

There are also a bunch of Mac-specific modules being removed. I don’t use any of them right now, but that doesn’t mean they wouldn’t have been useful.

This kind of thing is spirit-crushing to me for some reason. I’m especially annoyed that Python has been around for so long and it is still reorganizing the ways it calls shell commands. Just settle on something! It seems hard to take it seriously as a system administration scripting language when things like this happen.

On the other hand, I love so much of the Python Standard Library, which has afforded me a lot for system administration …

Format numbers with the Python locale module

I’m constantly astounded by the breadth of features available in the Python Standard Library. Although the functions I find there are not always easy to grasp, it is almost always worth searching around a bit for a function or method in the standard library before I write my own code to do something someone else has probably had to do before.

Take the “locale” module. It lets you format certain kinds of data based on your locale and its customs. Numbers (including currency) happen to be one of its specialties. I needed to output long numbers whose digits were grouped with commas, which makes them easier to read, and locale.format() does just that. Even better, it’s internationalized and formats them for your system’s own locale.

>>> import locale
>>> a = {'size': 123456789, 'unit': 'bytes'}
>>> print(locale.format("%(size).2f", a, 1))
123456789.00
>>> locale.setlocale(locale.LC_ALL, '') # Set the locale for your system
'en_US.UTF-8'
>>> print(locale.format("%(size).2f", a, 1))
123,456,789.00

In trying to use it in the bundled Python 2.5.1 on Mac OS X Leopard, I noticed that the default for scripts and the interpreter doesn’t format numbers as I expected for my locale. I found that I needed to set a locale to get the expected formatting. To do this, I run locale.setlocale(), as above. I’m not sure if this is required for other Python installations, but it’s worth mentioning.

One difficulty I created for myself was when I tried mixing more into my format string. For example, locale.format() would fail to reformat a number when I added in the string for the unit from my original dictionary:

>>> print(locale.format("%(size).2f %(unit)s", a, 1))
123456789.00 bytes

In retrospect, this makes total sense, but it took Mark looking over my code to discover my error.
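The locale module also has a format_string() function that accepts format strings with more than one specifier. The sketch below assumes that function is available in your Python. The format_size function name and the fallback to the “C” locale are my own choices for illustration.

```python
import locale


def format_size(size, unit, locale_name=""):
    """Format size with locale-aware digit grouping, followed by its unit."""
    try:
        # An empty name selects the locale from the user's environment
        locale.setlocale(locale.LC_ALL, locale_name)
    except locale.Error:
        # Fall back to the plain "C" locale, which applies no grouping
        locale.setlocale(locale.LC_ALL, "C")
    values = {"size": size, "unit": unit}
    return locale.format_string("%(size).2f %(unit)s", values, grouping=True)


print(format_size(123456789, "bytes"))
```

What gets printed depends on the locale in effect: a US English locale groups the digits with commas, while the “C” locale leaves the number ungrouped.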

Get Python installation information from Distutils

The Distutils Python module includes functions to obtain information about the Python installation. This may be useful for system administrators, and it certainly caught my eye when I read about it.

The results below are from Apple’s bundled build of Python 2.5.1 in Mac OS X Leopard. Credit for the comments describing each function comes from the distutils.sysconfig documentation.

>>> import distutils.sysconfig
>>> distutils.sysconfig.get_python_version() # Get the major Python version without patchlevel
'2.5'
>>> distutils.sysconfig.get_python_lib(standard_lib=True) # Return the directory containing the Python library; if 'standard_lib' is true, return the directory containing standard Python library modules
'/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5'
>>> distutils.sysconfig.get_python_lib() # Otherwise, return the directory for site-specific modules
'/Library/Python/2.5/site-packages'
>>> distutils.sysconfig.get_python_lib(plat_specific=True) # Return any platform-specific modules from a non-pure-Python module distribution
'/Library/Python/2.5/site-packages'

You’d install your own modules for system-wide use in the directory returned by distutils.sysconfig.get_python_lib().
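Similar information is available from the standalone sysconfig module, assuming a Python new enough to have it (2.7, or 3.2 and later). This is a sketch of roughly equivalent calls rather than a drop-in replacement, and the paths it prints will differ from the Leopard results above.

```python
import sys
import sysconfig

# A dictionary of installation directories for the running Python
paths = sysconfig.get_paths()

print(sysconfig.get_python_version())  # major.minor, without patchlevel
print(paths["stdlib"])    # standard library modules
print(paths["purelib"])   # site-specific, pure-Python modules
print(paths["platlib"])   # site-specific, platform-specific modules
```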
