After starting to use Drush, I wanted to switch my Drupal cron jobs over to it. I’d previously been running these jobs the standard way, loading a URL for each of my Drupal sites with curl. That curl command was run by the cron; Site5 allows editing of your crontab directly or through its Backstage Web interface.
I started with the basics. Drush was installed in my home directory, with an alias under ~/bin in my Site5 account. I replaced my curl command one-for-one with the following:
This didn’t work out well. I was getting a huge number of mail messages, indicating problems with the cron jobs. The messages typically contained “tput: No value for $TERM and no -T specified.” Needless to say, this was rather frustrating, so after some trial and error, I modified the command as follows, and that has been working better for me.
The biggest change is that I specify PHP, rather than Drush, directly. This was done so that I could increase the PHP memory limit for the cron job significantly, to 64M, while running Drush. I’m sure that this increase was needed partly due to the number of modules I have installed on my main site (which has an even larger default PHP memory limit in its php.ini). My research indicated that I needed to do this for Drush since it doesn’t access the Drupal sites in the same way loading a page does.
The other noticeable change is that I provide the path to the drush.php file rather than pointing to the Drush alias.
Apache Solr provides a Web service front end for the Apache Lucene indexing and search engine library. Both Solr and Lucene (upon which Solr depends) are Java-based, which has implications for shared Web hosting.
Drupal is an open source CMS, and I happen to use it on a shared Web hosting provider as of this writing. Drupal is gaining support for Apache Solr through a module that has had a lot of input from Acquia (the “Red Hat” of Drupal).
Dries Buytaert of Acquia has some interesting perspective on search for the Web and CMSes in some recent articles on his site. Specifically, he talks about Acquia Search, a Solr-based search service that is being offered to Drupal sites on the Acquia Network. He discusses the advantages afforded by good search capabilities for both visitors to a Drupal Web site and for site administrators.
I’ve used Acquia Search (in beta), and it has been great. It’s very fast compared to the core Drupal Search module. The ability to perform faceted searches, word stemming, spell checking, and more is all tremendous. (You can see it in action in the search field in the site sidebar, as long as my Acquia Network subscription from the beta lasts.)
But Acquia Search part of a larger service offering — the Acquia Network — which ultimately makes it too expensive for me on my personal sites. It’s priced out of reach for me — more costly for one year than two years of Web hosting, domain registrations, and separate e-mail hosting for my domains are today. I think it’s clear that Acquia is aiming at a different market, and that’s fine.
My idle thought, however, is that search by itself is a compelling feature even for small Web sites like mine. It’s as compelling as hosting files, like HTML or PHP or images, or serving databases, like MySQL and PostgreSQL.
If it’s important as Dries notes in his posts — that the search market is so large and growing, and of such universal importance — then great search is a compelling feature to have for many levels of sites. After serving the files and serving the database, it may be the next big service that a Web hosting provider could offer. And today, Web hosting offers a range of pricing (and service levels) to meet various needs.
I could see advertising that for some monthly fee, a Web host offers 55 GB of storage and 550 GB of monthly data transfer and unlimited MySQL databases — and oh, by the way, some reasonable level of indexing/search with Apache Solr and/or Sphinx or whatever. Although I hate to suggest it, search could even be an optional add-on, as many providers treat dedicated IP addresses or SSL or the like.
There may be an additional win, in that separate servers could be optimized for search to offload that processing from the Web server. It could even be something that a Web host contracts out or partners with another to provide — maybe even with a company like Acquia that’s already set up their infrastructure to scale on Amazon EC2.
Especially if other CMSes, such as WordPress, get Solr integration — as with this WordPress Solr plugin — then the case for Web hosts offering something like Solr search gets convincing.
I’ve been struggling with my Site5 Web hosting account for two years. In many respects, it has been great — good service at a price I was willing to pay. However, my biggest single aggravation has been that there is a primary domain name associated with the hosting account, and that primary domain could not be hosted in a subdirectory of the account cleanly. URL rewriting in an .htaccess file had been my workaround for a long time, but it never really did everything I wanted — and it was complicated.
The hosting account was up for renewal today. I had a decision to make: keep the account and the hassle, keep the account and find a solution to the hassle, or back up my data and move on.
I’m happy to say that I’m keeping the account and it appears that all of the frustration regarding the subdirectory has been eliminated.
The CEO of Site5 helped me out, after seeing my complaints on Twitter. He suggested that I request changing the primary domain to a dummy, nonexistent domain name. With that done, I could then create a domain pointer for my former primary domain (this one, actually), linking it to a subfolder of my account’s public_html directory.
I made the support request, which was fulfilled promptly. The change was made and it works.
Now, it looks like some issues I’d been having have cleared up. Namely, the Global Redirect module I use in Drupal correctly redirects from URLs like /node/300 to their human-friendly paths.
In my Mercurial-based workflow for updating Drupal sites, there is a sequence of commands I need whenever a new version of Drupal comes out. I have a hard time remembering the options for “tar” in this sequence — and my original source for the instructions differs from what I need to do on my Web host — so I need to help my memory. The tar command, as constructed below, places its output into the specified destination directory.
Here it is, with tar’s “--strip-path=1” and “-C” options:
$ cd path/to/repository/parent/directory
$ curl -O http://ftp.drupal.org/files/projects/drupal-5.12.tar.gz
$ tar --strip-path=1 -C drupal_source -zxv -f drupal-5.12.tar.gz