Jesper Noehr explains why {l,r}strip are considered harmful for removing extensions from filenames with Python. I think he’s absolutely right on that score: the lstrip() and rstrip() methods shouldn’t be used for this purpose, because they strip any run of the given characters rather than a literal prefix or suffix.
However, like the only commenter on that post, I’d also recommend os.path.splitext() as the proper tool for the extension-removing job.
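To make the pitfall concrete, here is a small sketch. The filename is a made-up example of mine, not one from Jesper’s post. It shows rstrip() treating its argument as a set of characters, while os.path.splitext() splits at the last dot.

```python
import os.path

# rstrip() removes any trailing characters found in the set {'.', 't', 'x'},
# not the literal suffix ".txt", so it can eat into the filename itself.
stripped = "test.txt".rstrip(".txt")

# os.path.splitext() splits at the last dot and leaves the root intact.
root, ext = os.path.splitext("test.txt")

print(stripped)   # tes
print(root, ext)  # test .txt
```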
Let’s take some example filenames you might come across on Mac OS X Snow Leopard:
If we had a list of filenames (or file paths) like this, perhaps created by os.walk() or some other generator-based process, we couldn’t easily use Jesper’s recommended solution. The replace() string method breaks down when you don’t know the filename extensions in advance: it would have to be looped over every possible extension in that list.
What we need is a way to split filename from extension, even if we don’t know the extension beforehand. The os.path.splitext() alternative does just that, returning a (root, extension) tuple. Here, I’ll import the os module and then use a list comprehension to run os.path.splitext() over every filename in the list above.
It becomes a simple matter to get just the filename from the tuple, as I do here by modifying the list comprehension to just get the zeroth item from it:
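A minimal sketch of both list comprehensions follows. The filenames are hypothetical stand-ins that I chose to exercise a few edge cases (multiple dots, a leading dot, no extension at all).

```python
import os.path

# Hypothetical stand-ins for the kinds of names you might meet on a Mac.
filenames = [
    "/Users/me/Documents/report.final.pdf",
    "archive.tar.gz",
    ".bash_profile",
    "README",
]

# Each result is a (root, extension) tuple.
pairs = [os.path.splitext(name) for name in filenames]

# Keep only the zeroth item of each tuple: the name without its extension.
roots = [os.path.splitext(name)[0] for name in filenames]

for name, root in zip(filenames, roots):
    print("%s -> %s" % (name, root))
```

Only the last extension is removed, and a name that starts with a dot and has no other dot is treated as having no extension.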
Note that several interesting conditions are handled by os.path.splitext():
Mac OS X Snow Leopard does not include Core Graphics bindings (CGBindings) for 64-bit Python.
The SWIG-based Python CGBindings originally shipped with Mac OS X 10.3, which bundled Python 2.3. Since that time, these bindings — specific to the system’s bundled framework build of Python — had allowed access to Core Graphics objects and commands from within scripts.
They were one of the reasons I decided to use Python in the first place. I thought they would be fun to learn and use, particularly with the then-new PDF Services feature of Mac OS X. The Core Graphics bindings also provided much, much more power than the command line sips tool and had an advantage over other alternatives by being bundled with the operating system. I thought they offered the possibility of growing with Mac OS X’s graphics hardware acceleration. I even found a way to use them to create better screenshots with drop shadows, a task where I’d previously employed Ambrosia’s Snapz Pro X.
Here’s an example of what you’ll see on Snow Leopard if you try to “import CoreGraphics” in 64-bit Python:
With 32-bit Python on Snow Leopard:
While the CGBindings are still available to 32-bit Python in Snow Leopard, you must use PyObjC to replace their functionality for 64-bit Python. Since 64-bit Python is the default in Snow Leopard, it makes sense to transition from the bindings to PyObjC as soon as possible. This means there is some porting work for scripts that used the Core Graphics bindings. I guess I’m glad I didn’t do as much with them as I’d planned.
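While porting, a script can guard the import so it fails gracefully where the bindings are missing. This is only a sketch: the helper name load_coregraphics is my own, and the fallback message stands in for whatever PyObjC code path you end up writing.

```python
def load_coregraphics():
    """Return the CoreGraphics module, or None if it cannot be imported."""
    try:
        import CoreGraphics  # SWIG-based bindings; absent for 64-bit Python
    except ImportError:
        return None
    return CoreGraphics


cg = load_coregraphics()
if cg is None:
    print("CoreGraphics bindings unavailable; use PyObjC or 32-bit Python")
else:
    print("CoreGraphics bindings loaded")
```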
I see this change as something of a loss. (Is this what Carbon developers are experiencing? Hm.) The Core Graphics bindings were relatively easy to use and felt reasonably Pythonic, even if the documentation was almost nonexistent. PyObjC feels more foreign to me when I attempt to use it — even though it’s clearly the future.
The default installation of Python on Mac OS X Snow Leopard is version 2.6.1. According to the man page for Python on Snow Leopard, Python 2.6 executes as a 64-bit application by default.
If, for some reason, you need to run it as a 32-bit application, this can be changed at the command line:
The preference can be set in either the User or Local filesystem domain in Mac OS X, following the normal precedence rules. To unset it, presumably you would change the boolean to “no” — or perhaps even delete the “Prefer-32-Bit” key.
There is also an environment variable that can override this preference.
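To confirm which mode the interpreter is actually running in after changing the preference, a short check like this sketch should do. It relies only on the standard library; the variable names are mine.

```python
import struct
import sys

# Size of a C pointer in bits: 64 for a 64-bit process, 32 for a 32-bit one.
pointer_bits = struct.calcsize("P") * 8

# sys.maxsize (available since Python 2.6) exceeds 2**32 only in 64-bit mode.
is_64bit = sys.maxsize > 2**32

print("Running as a %d-bit application" % pointer_bits)
```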
After using Acquia Drupal for a while, I took advantage of a trial subscription to the Acquia Network. The network’s services showed me that I had files present in my install that the agent could not account for.
I suspected this was happening because of the way I manage my Acquia Drupal installation with Mercurial. So, I’ve modified my previous process (and updated my instructions) to extract the downloaded tar archive with the --recursive-unlink option. This option appears to successfully remove the contents of every directory before putting new files back into them.
When the archive is extracted in this way, my repository’s working directory shows modified, unknown, and deleted files. This allows me to treat each category of files individually before I commit the changes for a Drupal update as a revision.
$ hg status
The modified files will be tracked normally because they’ve already been added to the Mercurial repository, so I don’t need to do anything special for them.
The unknown files are ones that are completely new, and have not appeared in the same position in a previous revision. They have yet to be tracked by Mercurial, so I have to add them to the repository. To add just those unknown files, then, I have to pick them out from the status listing:
$ hg status --unknown
In order to operate just on those files to add them to the repository, I run a for loop:
This changes the “?” status to “A,” because the files are now being tracked by Mercurial.
I use the “--no-status” flag on the “status” command so that just the file paths are printed; the actual status code is not, which is appropriate for the target of the “add” command in the loop.
I do the same basic steps with deleted files. These are files that were in the previous revisions but have been deleted by the --recursive-unlink option from the tar extraction and not replaced with the extraction of the new Acquia Drupal tar archive. If the deleted files had been replaced by the tar extraction, they would either be unchanged (which would not show up in the “status” output) or marked as modified.
To remove the files that are marked as deleted from the repository’s working directory:
However, that may be the same as simply using the following, which I have to explore further:
$ hg remove --after
So, to follow all of these changes in the repository, I run the loop for the unknown files and the loop for the deleted files. The modified files are already tracked, so I don’t need to do anything additional for them. After that, a “commit” will record all of the changes (modifications, additions, and deletions) in the repo.
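The three categories can also be picked out of the status listing programmatically. The sketch below is an illustration rather than the shell loops I actually ran: it groups lines of “hg status” output by their one-character code (“M” for modified, “?” for unknown, “!” for deleted). The function name and the sample paths are made up.

```python
def group_status(status_output):
    """Group `hg status` lines by their one-character status code."""
    groups = {"M": [], "?": [], "!": []}
    for line in status_output.splitlines():
        if len(line) < 3:
            continue  # skip blank or malformed lines
        code, path = line[0], line[2:]
        if code in groups:
            groups[code].append(path)
    return groups


# Hypothetical status output, only for illustration.
sample = """M modules/node/node.module
? modules/new/new.module
! modules/old/old.module
"""

groups = group_status(sample)
print("modified: %s" % groups["M"])
print("unknown:  %s" % groups["?"])
print("deleted:  %s" % groups["!"])
```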
These commands are based on my current understanding of Mercurial, and they do work for me right now. There could certainly be another better way to do this in one fell swoop — or at least fewer steps. I would welcome that, so if you’re aware of a way, feel free to comment or contact me.
Update: I found that the “hg addremove” command cleanly replaces all of the shell loops I mentioned above. Therefore, I recommend using it instead of the “for” loops I described.
Here is a sequence of commands and output that show how I keep the Acquia Drupal open source content management system up to date with Mercurial, the open source distributed version control system.
In the example below, my Mercurial repositories for Drupal are located in the “drupal” subdirectory of my “repo” folder. Once I’ve moved into that directory, I download the Acquia Drupal distribution with curl and then extract it into my previously-created Mercurial working directory, “acquia_drupal,” using tar.
(Update: I added the --recursive-unlink option after I noticed that the Acquia Network control panel keeps track of extra — possibly unneeded — files and folders you have in your install. The recursive unlink option seems to avoid having stray files from old versions of modules hanging around in your repository after you install updates.)
After extracting Acquia Drupal into my Mercurial working directory, I get the status of the repository. It shows there are changes from the last version I checked in, including new files, denoted by a “?” at the beginning of their line.
Since there are new files, I have to add them so they’ll be tracked by the repository. I only need to add in the parent directory for any changed files, and any new files within it will also be added for tracking.
Excellent; the new files have been added. After this, I just need to handle the deleted files that no longer need to be tracked (a side effect of using the “--recursive-unlink” option on tar). For that, see my newer instructions.
Now that the right files are being tracked, I need to commit the changes — modified, added, and deleted files — to the repository. This will create a new revision in the repository’s history, which I’ll tag with the text “Acquia Drupal 1.2.0.”
Once this revision is checked in, I can use it to propagate changes to other repositories. I keep the main Acquia Drupal distribution in its own repository, and then use the “hg fetch” command to pull its changes into one where I track contributed modules. That second repository is then pulled into a third repository which stores just the changes for my production Web site. The use of three repositories in this way modularizes and isolates the updates.
Radmind transcripts with symlinks will be damaged when edited in the Radmind Transcript Editor. I have confirmed this with RTE version 0.7.7 used in conjunction with the version 1.13.0 Radmind command line tools.
The problem appears to be an interaction between RTE 0.7.7 (which is old) and newer Radmind tools, according to posts on the Radmind-Users mailing list. It apparently relates to any version of the Radmind tools greater than 1.12.0, which introduced symlink ownership, when used in combination with RTE 0.7.7. This issue is in the Radmind bug report tracker and has been fixed in the CVS version of RTE. To use that newer version of RTE, you have to build the GUI tools from CVS.
You only see the problem — assuming you are using the right combination of versions — if you edit and save a transcript or create a new transcript within RTE (either by drag and drop or the “Add Item to Transcript” command). So, using the RTE to simply view the transcript file — and then editing with a different editor (which is an inconvenience) — is a workaround.
To get a count of the affected transcripts (on your Radmind server), use the following command:
You can simplify the grep search to only return the path of each match, and then process that with Awk to get just the basename of the file. Here’s how to use that technique to get the list of affected transcript files on a Radmind client:
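The same search can be sketched in Python. This is a rough equivalent, not the grep and Awk commands themselves, and it rests on two assumptions of mine: that all transcripts sit directly in one directory, and that a symlink entry is a line whose first field is the letter “l”. The function name is hypothetical.

```python
import os


def transcripts_with_symlinks(transcript_dir):
    """Return sorted basenames of transcripts that contain a symlink entry."""
    affected = []
    for name in os.listdir(transcript_dir):
        path = os.path.join(transcript_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path, "r", errors="replace") as handle:
            # Assumption: a symlink entry is a line whose first field is "l".
            if any(line.split()[:1] == ["l"] for line in handle):
                affected.append(name)
    return sorted(affected)
```

Calling len() on the result gives the count of affected transcripts; the list itself gives their names.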
As for actually fixing the damaged transcripts, it appears that the best way to do so is to recreate them from scratch.
A question came up on the AppleScript-Users mailing list and I wanted to make note of it, because I came up with a quick Python-based way to answer it. Here, I reiterate that answer on how to get the number of pages from a PDF.
This Python 2.5 sequence (shown as run interactively from the Terminal on Leopard — not as a script) works for me using both a local file and a file from an AFP-mounted volume. It uses the Quartz 2D bindings for Python, which are part of the Python install on Mac OS X (since 10.3?), so it is not pure Python.
Substitute your own path for the pdf_filename variable’s contents, and you can try it yourself. (Above, $ is the shell prompt, and >>> is the Python interpreter’s command prompt, so don’t copy/paste those.)
Since I was doing this in Terminal, I could also type "pdf_filename = '" then drag and drop a file from the Finder into the Terminal window to insert its path, complete the quoting, and press Return. (This saves some time and reduces the potential for mistyping.)
You could wrap this sequence into one longish command line and run it from AppleScript with “do shell script.” Or you could save it as a Python script file and again run it with “do shell script.” Or, you could skip AppleScript and just use Python, which I prefer over AppleScript. (Python seemed very natural to me with my minor AppleScript background, and I'm pretty sure I've written more of it than AppleScript by this point.)
This is derived from “Example 2: Splitting a PDF File” in the Using Python with Quartz 2D on Mac OS X document at the Apple Developer Connection.
That ADC example also shows how you could specify a PDF file’s path at the command line (with sys.argv), rather than hardcoding it as I did for my example.
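Saved as a script, the sequence might look like the sketch below. It assumes the system’s bundled Python with the CoreGraphics bindings, so it will not work elsewhere. The function name count_pdf_pages is my own, and the calls are the ones I recall from the ADC example.

```python
def count_pdf_pages(pdf_filename):
    """Return the page count of a PDF via the bundled CoreGraphics bindings."""
    # Imported here so the script still loads where the bindings are missing.
    from CoreGraphics import (CGDataProviderCreateWithFilename,
                              CGPDFDocumentCreateWithProvider)
    provider = CGDataProviderCreateWithFilename(pdf_filename)
    pdf = CGPDFDocumentCreateWithProvider(provider)
    return pdf.getNumberOfPages()
```

To take the path from the command line instead of hardcoding it, import sys and pass sys.argv[1] to the function.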
There may be a shorter way to do this since I’m no expert on Quartz 2D. (I just appreciate that the bindings are there for a scripting language I happen to like a lot.) I honestly don’t know what’s happening when the provider and pdf objects are being set, but I really don’t need to know for this.
Nigel and Jeff present Mac OS X Laptop Deployments with Puppet in the MacIT track at Macworld Expo 2009. They are two of the first Mac system administrators I knew of using Puppet, and both had a background in Radmind.
I’ve been reading through James Turnbull’s Pulling Strings with Puppet, since our library had a copy. I had hoped to get through it during our winter break, but illness and other factors (no Puppet pun intended) conspired to get in the way. From what I’ve read about it already, Puppet is clearly interesting. Nigel was very enthusiastic about it when we talked at WWDC 2008.
To me, it seems that it would take some effort to model what you want in it and build up a repository of what you want managed. Perhaps I’m feeling like an old dog trying to learn new tricks. Grin.
One point that Nigel and Jeff made in their presentation slides that struck me is that they needed a solution that works when offline, which Puppet does. Radmind can work offline but I daresay that’s not the way that most people would think to use it (lapply with its “-n” flag would be the most basic change).
Kyle also mentioned to me that he’s been using Puppet in conjunction with Radmind. I believe he has Puppet managing configurations and Radmind managing the bulk of the filesystem.
I wanted to find a way to do syntax highlighting of code snippets on my Drupal blog. I came across the GeSHi Filter module, which lets Drupal sites take advantage of the apparently well-regarded GeSHi Generic Syntax Highlighter library that’s meant for just this purpose.
However, I ran into some roadblocks implementing it on my site. Here’s the short story of what I settled on after some trial and error.
My existing code snippets are in <code> blocks, and the initial GeSHi Filter settings applied badly to them. I made the decision to only use GeSHi on <blockcode> blocks, since I wasn’t using that tag yet and it wouldn’t conflict with the snippets already posted.
I most commonly write Bash/Zsh, Python, and AppleScript snippets on my blog. However, the Bash code I was using as part of my trial and error simply wasn’t highlighting; it was coming through as the default (and boring) plain text — but was at least boxed off from the rest of the blog post.
I thought that GeSHi wasn’t correctly discovering that the code was written in UNIX shell syntax. I couldn’t find a way to specify the language for that blockcode tag, until I did some searching on the ’net. To make my blockcode blocks use a certain language (at least for the purposes of this Drupal module, if not for GeSHi in general), I needed to add a “lang=language” attribute to the tag. For Bash, I could use “lang=bash,” for Python, “lang=python,” and for AppleScript, “lang=applescript.” That made sense.
However, my code was still not being syntax highlighted. I discovered that the Drupal module came with an initial set of languages enabled. The others were all turned off, but that could be changed in the module settings. Without turning them on, even properly-tagged <blockcode> sections did not get the benefit of syntax highlighting.
I changed the GeSHi Filter options to enable some of the languages that were initially disabled, and then disabled the ones I didn’t anticipate using. This allowed me to add Bash and AppleScript syntax highlighting support, as both had been turned off by default. After that, I saw the results I’d hoped for: a syntax-highlighted code snippet.
It took some work, but now that it’s done, I should be all set.
You can find out if an SSL certificate has expired with the command below. I’ve found it useful to be able to check for expired certificates in my use of Radmind, where you can uniquely identify clients to the server with them.
$ openssl x509 -in /path/to/cert.pem -noout -checkend 0
I mention this command primarily because I reviewed the OpenSSL x509 man page (“man x509”) that comes with Mac OS X Leopard, and it didn’t show the “checkend” option for the command. That was odd, because that option was just what I needed.
I did, however, find it documented in the usage statement-style help for the command:
$ openssl x509 --help
In that usage statement, the “checkend” option is described (with little punctuation) as a way to “check whether the cert expires in the next arg seconds [sic] exit 1 if so, 0 if not.” So, using zero seconds shows you if the certificate has already expired, while an integer greater than zero shows whether it will expire within that many seconds. No matter how many seconds you check against, you must examine the exit code (the “$?” shell variable) to see whether the certificate has expired or is about to.
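If you would rather test the exit code from Python than from the shell, a sketch along these lines could work. The function names are mine, and it assumes the openssl binary is on your PATH.

```python
import subprocess


def checkend_command(cert_path, seconds=0):
    """Build the openssl command line for a -checkend test."""
    return ["openssl", "x509", "-in", cert_path,
            "-noout", "-checkend", str(seconds)]


def expires_within(cert_path, seconds=0):
    """Return True if openssl exits 1, meaning the cert expires in time."""
    # subprocess.call() returns the exit code that "$?" would hold.
    return subprocess.call(checkend_command(cert_path, seconds)) == 1
```

A nonzero exit code can also mean that openssl could not read the file at all, so treat a True result for a missing or unreadable certificate with care.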
I find this is tremendously useful knowledge when dealing with certificates in Radmind, where an expired certificate can mean the failure of a client to connect to the Radmind server. It could be beneficial in other circumstances, of course — but I don't have those circumstances.
Taking this further, you could check for certificate expiration on a Radmind server — if your certificates are stored in the Radmind special directory for each hostname of a managed client. (Substitute one of your own managed clients’ hostnames for “hostname” in the path below.)
$ openssl x509 -in /var/radmind/special/hostname/private/var/radmind/cert/cert.pem -noout -checkend 0
Since you can do it for one client certificate, you could also loop through all of the certificates on a Radmind server. In this example, I’ll continue to use the path of /var/radmind even though, on Mac OS X, I’d generally prefer to specify the full /private/var/radmind; your Radmind server may not be on Mac OS X even if your clients are. Also, you may need to modify the “depth” parameter on your search to accommodate the paths on your server. Finally, I’ll change the “checkend” parameter to 604800, for seven days (60*60*24*7=604800). That produces something along the lines of:
Change the last line to “done | grep expiring” if you only want to see the expiring certificates.
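The discovery half of that loop can be sketched in Python as well. It assumes, as above, that each client’s certificate is a file named cert.pem somewhere under that client’s folder in the special directory. The function name and the hostname-from-path logic are my own.

```python
import os

SEVEN_DAYS = 60 * 60 * 24 * 7  # 604800 seconds


def find_client_certs(special_dir, cert_name="cert.pem"):
    """Return sorted (hostname, cert_path) pairs found under special_dir."""
    found = []
    for dirpath, dirnames, filenames in os.walk(special_dir):
        if cert_name in filenames:
            cert_path = os.path.join(dirpath, cert_name)
            # The first path component below special_dir is the hostname.
            relative = os.path.relpath(cert_path, special_dir)
            hostname = relative.split(os.sep)[0]
            found.append((hostname, cert_path))
    return sorted(found)
```

Each returned path can then be handed to the openssl “checkend” test with SEVEN_DAYS as the number of seconds.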
It’s great to get just the CN of the certificate in these circumstances, since it’s likely you’ll want to act on just those that need attention. One way to do this relatively cleanly is to use OpenSSL x509’s “subject” and “nameopt” options, and then parse the output. Below, I’ll use awk for that. (Again, substitute one of your own managed clients’ hostnames for “hostname” in the path below.)
$ openssl x509 -in /private/var/radmind/special/hostname/private/var/radmind/cert/cert.pem -noout -subject -nameopt sep_multiline | awk '/CN/ {split($1,elements,"=") ; print elements[2] ;}'
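For comparison, the parsing step could be sketched in Python like this. The function name and the sample subject text are hypothetical. Unlike the awk pattern, it only accepts a field that begins with “CN=”.

```python
def common_name(subject_output):
    """Return the CN value from `-subject -nameopt sep_multiline` output."""
    for line in subject_output.splitlines():
        fields = line.split()
        # Like awk's $1: look only at the first whitespace-separated field.
        if fields and fields[0].startswith("CN="):
            return fields[0].split("=", 1)[1]
    return None


# Hypothetical output, only for illustration.
sample = """subject=
    C=US
    O=Example Org
    CN=client.example.com
"""

print(common_name(sample))
```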
Beyond checking for expiration on the server, it may be valuable to do so in your Radmind client scripts, especially if you favor SSL connections. If you find an expired certificate, you can take some remedial action right away that might allow the client to communicate with the server.
I thought about this a while, and the easiest way I came up with (after having already developed more complex logic) was to simply rename or remove the expired certificate from its normal path. Then, allow the client to connect with another authorization level where the client certificate is unnecessary. (Use of a client certificate implies Radmind’s “-w2” authorization level, while a lesser level would mean you’re performing hostname/DNS rather than certificate verification.) This would probably mean you have multiple Radmind server processes running, each on its own port, to accept such incoming requests on the server.
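The renaming step itself is small. Here is a sketch, where the function name and the “.expired” suffix are arbitrary choices of mine and not anything Radmind requires:

```python
import os


def set_aside(cert_path, suffix=".expired"):
    """Rename a certificate out of its normal path and return the new path."""
    new_path = cert_path + suffix
    os.rename(cert_path, new_path)
    return new_path
```

A client script would call this only after the expiration check fails, then retry the connection at the lesser authorization level.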