Jesper Noehr explains why {l,r}strip are considered harmful for removing extensions from filenames with Python. I think he’s absolutely right on that score, and I would agree. The lstrip() and rstrip() methods shouldn’t be used for this purpose.
However, like the only commenter on that post, I’d also recommend os.path.splitext() as the proper tool for the extension-removing job.
Let’s take some example filenames you might come across on Mac OS X Snow Leopard:
If we had a list of filenames (or file paths) like this — perhaps created by os.walk() or some other generator-based process — we couldn’t easily use Jesper’s recommended solution. The replace() string method would give us a much harder time dealing with the range of filenames and extensions in that list. In the case where you don’t know the filename extensions in advance, replace() breaks down. The replace() method would have to be looped with many possible filename extensions.
What we need is a way to split filename from extension, even if we don’t know the extension beforehand. The os.path.splitext() alternative does just that, returning a tuple. Here, I’ll import the os module and then use a list comprehension to run os.path.splitext() through every filename in the list above.
It becomes a simple matter to get just the filename from the tuple, as I do here by modifying the list comprehension to just get the zeroth item from it:
Note that several interesting conditions are handled by os.path.splitext():
Mac OS X Snow Leopard does not include Core Graphics bindings (CGBindings) for 64-bit Python.
The SWIG-based Python CGBindings originally shipped with Mac OS X 10.3, which bundled Python 2.3. Since that time, these bindings — specific to the system’s bundled framework build of Python — had allowed access to Core Graphics objects and commands from within scripts.
They were one of the reasons I decided to use Python in the first place. I thought they would be fun to learn and use, particularly with the then-new PDF Services feature of Mac OS X. The Core Graphics bindings also provided much, much more power than the command line sips tool and had an advantage over other alternatives by being bundled with the operating system. I thought they offered the possibility of growing with Mac OS X’s graphics hardware acceleration. I even found a way to use them to create better screenshots with drop shadows, a task where I’d previously employed Ambrosia’s Snapz Pro X.
Here’s an example of what you’ll see on Snow Leopard if you try to “import CoreGraphics” in 64-bit Python:
With 32-bit Python on Snow Leopard:
While the CGBindings are still available to 32-bit Python in Snow Leopard, you must use PyObjC to replace their functionality for 64-bit Python. Since 64-bit Python is the default in Snow Leopard, it makes sense to transition from the bindings to PyObjC as soon as possible. This means there is some porting work for scripts that used the Core Graphics bindings. I guess I’m glad I didn’t do as much with them as I’d planned.
I see this change as something of a loss. (Is this what Carbon developers are experiencing? Hm.) The Core Graphics bindings were relatively easy to use and felt reasonably Pythonic, even if the documentation was almost nonexistent. PyObjC feels more foreign to me when I attempt to use it — even though it’s clearly the future.
The default installation of Python on Mac OS X Snow Leopard is version 2.6.1. According to the man page for Python on Snow Leopard, Python 2.6 executes as a 64-bit application by default.
If, for some reason, you need to run it as a 32-bit application, this can be changed at the command line:
The preference can be set in either the User or Local filesystem domain in Mac OS X, following the normal precedence rules. To unset it, presumably you would change the boolean to “no” — or perhaps even delete the “Prefer-32-Bit” key.
There is also an environment variable that can override this preference.
A question came up on the AppleScript-Users mailing list and I wanted to make note if it, because I came up with a quick Python-based way to answer it. Here, I reiterate that answer on how to get the number of pages from a PDF.
This Python 2.5 sequence (shown as run interactively from the Terminal on Leopard — not as a script) works for me using both a local file and a file from an AFP-mounted volume. It uses the Quartz 2D bindings for Python, which are part of the Python install on Mac OS X (since 10.3?), so it is not pure Python.
Substitute your own path for the pdf_filename variable’s contents, and you can try it yourself. (Above, $ is the shell prompt, and >>> is the Python interpreter’s command prompt, so don’t copy/paste those.)
Since I was doing this in Terminal, I could also type "pdf_filename = '" then drag and drop a file from the Finder into Terminal window to insert its path, complete the quoting, and pressed Return. (Saves some time and potential for mistyping.)
You could wrap this sequence into one longish command line and run it from AppleScript with “do shell script.” Or you could save it as a Python script file and again run it with “do shell script.” Or, you could skip AppleScript and just use Python, which I prefer over AppleScript. (Python seemed very natural to me with my minor AppleScript background, and I'm pretty sure I've written more of it than AppleScript by this point.)
This is derived from “Example 2: Splitting a PDF File” in the Using Python with Quartz 2D on Mac OS X document at the Apple Developer Connection.
That ADC example also shows how you could specify a PDF file’s path at the command line (with sys.argv), rather than hardcoding it as I did for my example.
There may be a shorter way to do this since I’m no expert on Quartz 2D. (I just appreciate that the bindings are there for a scripting language I happen to like a lot.) I honestly don’t know what’s happening when the provider and pdf objects are being set, but I really don’t need to know for this.
I wanted to find a way to do syntax highlighting of code snippets on my Drupal blog. I came across the GeSHi Filter module, which lets Drupal sites take advantage of the apparently well-regarded GeSHi Generic Syntax Highlighter library that’s meant for just this purpose.
However, I ran into some roadblocks implementing it on my site. Here’s the short story of what I settled on after some trial and error.
My existing code snippets are in <code> blocks, and the initial GeSHi Filter settings applied badly to them. I made the decision to only use GeSHi on <blockcode> blocks, since I wasn’t using that tag yet and it wouldn’t conflict with the snippets already posted.
I most commonly write Bash/Zsh, Python, and AppleScript snippets on my blog. However, the Bash code I was using as part of my trial and error simply wasn’t highlighting; it was coming through as the default (and boring) plain text — but was at least boxed off from the rest of the blog post.
I thought that GeSHi wasn't correctly discovering that the code was written in UNIX shell syntax. I couldn’t find a way to specify the language for that blockcode tag, until I did some searching on the ’net. To change my blockquotes to choose a certain language — at least for the purposes of this Drupal module, if not for GeSHI in general — I needed to add the “lang=lang” style to the tag. For Bash, I could use “lang=bash,” for Python, “lang=python,” and for AppleScript, “lang=applescript.” That made sense.
However, my code was still not being syntax highlighted. I discovered that the Drupal module came with an initial set of languages enabled. The others were all turned off, but that could be changed in the module settings. Without turning them on, even properly-tagged <blockcode> sections did not get the benefit of syntax highlighting.
I changed the GeSHi Filter options to enable some of the languages that were initially disabled, and then disabled the ones I didn’t anticipate using. This allowed me to add Bash and AppleScript syntax highlighting support, as both had been turned off by default. After that, I saw the results I’d hoped for: a syntax-highlighted code snippet.
It took some work, but now that it’s done, I should be all set.
Irrational Exuberance has An Epic Introduction to PyObjC and Cocoa. Skimming over it, it really does seem to start at the ground level — and that’s what I need. I have high hopes it will help me figure out PyObjC.
While trying to troubleshoot what I’d done to mess up the Mercurial repositories managing my Drupal installations last weekend, I really would have liked a way to see what files had changes in specific revisions. Each revision to a Mercurial repository affects some files, of course, but it seems awfully hard to figure what files changed in that check-in.
I have since found a way to do that by customizing the output of Mercurial. To customize output, you can create templates on the command line (with --template) or for more powerful reformatting, create an output style file.
I struggled for a while to figure out how to use style files, and eventually came up with something that works for me so far.
Since I’ve installed Mercurial from Lee Cantey’s standard binary package for Mac OS X Leopard, I created the file “map-cmdline.changedfiles” at the “/Library/Python/2.5/site-packages/mercurial/templates” path. (Where you put the file may vary depending on where Mercurial is installed, and I’m sorry but I don’t know where it gets installed on other systems.) The contents of “map-cmdline.changedfiles” are below, along with my possibly inept description of what each line is doing:
# Get all of the files in the selected revision
# and stringify them, whatever that means
# but do not 'tabindent' or wrap them to 68/76 columns
# Without first setting changeset to the list of files
# you won't get output from subsequent lines
changeset = '{files|stringify}'
# List modified files, one per line
# preceded by M to mimic `hg status`
file = 'M {file}\n'
last_file = 'M {file}\n'
# List added files, one per line
# preceded by A to mimic `hg status`
file_add = 'A {file_add}\n'
last_file_add = 'A {file_add}\n'
# List deleted files, one per line
# preceded by ! to mimic `hg status`
file_del = '! {file_del}\n'
last_file_del = '! {file_del}\n'
I don’t know why the “map-cmdline.” portion of the filename is there, but as long as I have it, I can call the style file from the command line with what follows the period. So, I can call the style with “--style changedfiles” — and that tiny bit of voodoo seems reasonable enough to me. (The other styles in the directory above, many of which end in “.tmpl” extensions, seem related to the Mercurial Web server, hgweb. I tried, but I couldn’t use their names at the command line, with or without their extensions. Plus, their contents looked HTML-ish.)
With the “map-cmdline.changedfiles” style file saved in that location, I can call Mercurial’s “log” command:
$ hg log --style changedfiles -r tip
… which gives me a list of the files changed in the “tip” (or latest revision) of the repository. I could substitute in any revision identifier for “tip.”
I haven’t actually seen the “file_add” and “file_del” keywords in action; every time I’ve used this style file in the manner described, I’ve only seen files marked as “M” — even if I’m looking at a revision where new files were first checked into the repo. I’m confused by that, but I’m not going to let it sour my day at this point.
There might have been an easier way to do this but I didn’t find one last weekend. It took me some time to figure even this bit out, and I hope writing this post saves someone new to Mercurial from future frustration.
I have been struggling with the issue of module availability in Python. While the “batteries included” nature of the standard library is great, there are occasionally times when I need to resort to a module that isn’t included with Python.
There are also times where I’m using modules whose status has changed. I expect that to happen more in the eventual transition to Python 2.6 and 3.0, because I’ve used modules that are being deprecated.
So I wondered how I could conditionally import a module if it was available, without stopping the flow of my scripts — and gracefully handle situations where it is missing. And here’s one basic answer: use a “try” block to catch the “ImportError” exception. For example, if I were concerned that DNSPython wasn’t going to be installed on my target system:
try:
import dns.resolver # Import DNSPython
dnspython_available = True
except ImportError:
dnspython_available = False
The “except ImportError” clause could specify a different module to load, or other workarounds entirely. You could map the namespaces in the “try” bock so the rest of your script doesn’t notice the change in module functions, if you have a way to work around the missing module. Perhaps, you could even try to obtain and install the module, at least for temporary use by your script.
Thanks to authors of the article Python modules – how do they work? for the assist. The information under heading 2.11, “Is my module available?” answered my question and has given me something to think about.
Mac OS X has a type of URL specifically for opening UNIX man pages. For me, using one of these URLs opens a new Terminal window to display the man page. Just put the name of the man page after the “x-man-page://” URL scheme to create one of these URLs.
For example, a link to the rsync man page would look like “x-man-page://rsync” when written out.
This is a handy way to refer to UNIX man pages with other Mac OS X users in e-mail correspondence, on mailing lists, or on the Web. Because of that, I wanted a quicker way to create these man page URLs. I wrote the Mac OS X Service named “Man Page URL for Command” to satisfy that desire. The “Man Page URL for Command” Service is provided “as-is” with no warranty.
The Service was wrapped up with ThisService by Peter Hosey, as with my earlier Mac OS X Service to shorten a URL with Bit.ly.
To install the Service:
To use it:
Thanks to Nigel for the inspiration.
| Attachment | Size |
|---|---|
| ManPageUrlForCommand.zip | 34.45 KB |
Since the Bit.ly URL-shortening service is all the rage lately, and I hadn’t seen anyone create a Mac OS X Service for it yet, I decided to try my hand at it.
Here’s the result. The core is a relatively simple Python script and requires Mac OS X 10.5 (or Python 2.5 if you have an earlier version of Mac OS X). The Service was wrapped up with ThisService by Peter Hosey. It’s my first attempt at creating a Mac OS X Service — with or without ThisService — and I hope it works for you. However, it is provided “as-is” with no warranty.
That said, if you have comments or suggestions, please feel free to contact me.
Also, I’m not counting this as an endorsement of Bit.ly; I just looked at info on their site and thought I could probably script it and, as you can see, I did.
To install the Service:
To use it:
Of special note, however, is that if you shorten URLs with Bit.ly this way rather than through your browser, you probably won’t see them show up in your history (the most recent 15 URLs you’ve shortened). The script just shortens URLs for you and does so outside of your browser, so whatever cookies or other tracking Bit.ly is doing to generate your history, it doesn’t appear to carry over when using this simple little Service.
| Attachment | Size |
|---|---|
| ShortenUrlWithBitly.zip | 34.17 KB |