Dec 29, 2006
I’m running a couple of servers full of Drupal sites hosted in a multisite configuration (one copy of Drupal used to host dozens of sites, each with its own sites/sitename directory). I’d been using sympal_scripts to automatically run Drupal’s cron.php script for each site in order to keep search indexes up to date and run other routine maintenance functions. It’s easy enough to drop a curl http://server/site/cron.php into a crontab, but as you start adding sites to the server, it becomes unwieldy to maintain a current crontab of sites to cron.
Sympal_scripts attempts to read through the sites directory, poking through each site and loading Drupal for each one in order to fire off the appropriate cron.php. It’s been adding records to the Drupal watchdog table, so I assumed it was working just fine. Except it hasn’t actually been running cron.php. It’s been failing silently.
Looks like there’s something funky in the way Drupal refers to the $base_url variable for the site. It’s set in each settings.php file, so it should be as simple as returning the content of a string variable. But it’s borking, and returning the name of the directory containing the site’s settings.php file.
Say I’ve got a server, myserver.com, with a bunch of sites all configured to be served as subdirectories of that server’s main website, such as myserver.com/site1 and myserver.com/site2.
Each site has a respective directory within the Drupal installation’s sites directory, such as myserver.com.site1 and myserver.com.site2 (the / are converted to . for use in the directory name because / would be invalid in a directory or filename).
When Drupal is initialized by sympal_scripts/cron.php, it’s getting $base_url values of http://myserver.com.site1 and http://myserver.com.site2.
So, when it goes to fire off the cron task, it’s using urls like: http://myserver.com.site1/cron.php
It works fine on sites configured to run on their own domain, as the domain matches the site directory.
WTF? The http:// shows that it’s reading the value within each settings.php file (or does it?), but why is it retaining the .site1 rather than /site1?
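The bad URLs are exactly what you get by gluing http:// onto the site directory name, which at least fits the theory that the value is coming from the directory rather than from settings.php. Here is a toy illustration in Python of that symptom, and of why the directory name alone can't be reliably turned back into a URL. The function names are made up for this sketch; none of this is Drupal or sympal_scripts code.

```python
def url_from_site_dir(site_dir):
    """Naive guess: treat the whole directory name as a host name."""
    return "http://" + site_dir


def candidate_urls(site_dir):
    """List every host/path split that could have produced this directory name."""
    parts = site_dir.split(".")
    candidates = []
    for i in range(1, len(parts) + 1):
        host = ".".join(parts[:i])      # first i pieces stay joined by dots
        path = "/".join(parts[i:])      # remaining pieces become path segments
        candidates.append("http://" + host + ("/" + path if path else ""))
    return candidates


# The naive guess reproduces the broken value seen above.
print(url_from_site_dir("myserver.com.site1"))

# Several URLs map to the same directory name, so the dots can't simply
# be swapped back to slashes without knowing where the host name ends.
for url in candidate_urls("myserver.com.site1"):
    print(url)
```

Only one of the candidates is the real site, which is why reading the explicit $base_url from settings.php looks like the safer route.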
Failing that, is there a better way to reliably run cron.php on a bunch of hosted sites? I’m thinking of writing a script that crawls the sites directory, pulls out the $base_url value for each site, and then runs curl against $base_url/cron.php for the lot of them.
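A rough sketch of what that crawler could look like, in Python rather than curl. It assumes each site sets $base_url on a single uncommented line of its settings.php, as a plain quoted string; the function names, the 10-second pause, and the timeout are all arbitrary choices for this example, not anything taken from Drupal.

```python
import re
import sys
import time
import urllib.request
from pathlib import Path

# Matches an uncommented line such as: $base_url = 'http://myserver.com/site1';
BASE_URL_RE = re.compile(
    r"""^\s*\$base_url\s*=\s*['"]([^'"]+)['"]\s*;""", re.MULTILINE
)


def find_base_urls(sites_dir):
    """Return {site directory name: base URL} for each settings.php that sets $base_url."""
    urls = {}
    for settings in sorted(Path(sites_dir).glob("*/settings.php")):
        match = BASE_URL_RE.search(settings.read_text(errors="replace"))
        if match:
            urls[settings.parent.name] = match.group(1).rstrip("/")
    return urls


def run_cron(base_url, timeout=300):
    """Request cron.php for one site and return the HTTP status code."""
    with urllib.request.urlopen(base_url + "/cron.php", timeout=timeout) as response:
        return response.status


def main(sites_dir, delay=10):
    """Hit each site's cron.php in turn, reporting failures instead of hiding them."""
    for site, base_url in find_base_urls(sites_dir).items():
        try:
            print(f"{site}: HTTP {run_cron(base_url)}")
        except OSError as error:  # URLError, HTTPError and timeouts are all OSErrors
            print(f"{site}: FAILED ({error})", file=sys.stderr)
        time.sleep(delay)  # run sites one after another, not all at once
```

Calling main() with the path to the Drupal sites directory from a single crontab entry would cover every site that defines $base_url. Sites that leave it commented out are skipped, so they would still need handling some other way.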
It’d be really cool if Drupal’s own cron.php had a command-line version, capable of operating on any (or all) configured sites. Any ideas?
Comments
5 Responses to “Trouble with cron.php in a Drupal multisite configuration”

I wrote a simple Perl script that’s run by cron. I have to manually change the script when I add a new site, but if I took an hour (only because I’m so rusty with Perl these days), it could be modified to update itself automatically by searching my multisite /sites directory for each $base_url. Here it is:
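#!/usr/bin/perl
# Run by cron: request cron.php on each Drupal site in turn.
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request::Common;

my $ua = LWP::UserAgent->new;

# Add a line here whenever a new site is created.
my @drupal_sites = (
    'http://www.domain1.org/cron.php',
    'http://www.domain2.com/main/cron.php',
    'http://www.domain3.com/cron.php',
);

foreach my $url (@drupal_sites) {
    my $response = $ua->request(GET($url));
    # Report failures so a broken cron run does not go unnoticed.
    warn "cron failed for $url: " . $response->status_line . "\n"
        unless $response->is_success;
    sleep(10);    # wait 10 seconds before the next site
}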
Using http://drupal.org/project/drush might be helpful. I haven’t gotten around to seeing how it works though.
“It’s easy enough to drop a curl http://server/site/cron.php into a crontab, but as you start adding sites to the server, it becomes unwieldy to maintain a current crontab of sites to cron.”
At my old Maricopa sites I ended up with a small pile of crons to run that grew incrementally. Like you, I loathed editing the crontab file for each addition (I am not a vi fan, and each time, I’d have to review the crontab syntax), so what I did was put in my crontab one call to a single shell script which contained all the curl or other commands I needed to issue. Editing that was a matter of just keeping a local text version and ftp-ing it.
A downside might be that all of your crons would be chugging at the same hour/minute. I guess you could have a few shell scripts to run at different intervals. OTOH, it sounds like the tool you are using should be up to the task, and let your Drupal settings control the timing.
Steve (and Alan) - I used to do something similar, having a single script added to my crontab, which in turn contained a list of URLs to curl. The trouble is, on a shared server, I’m not the only one adding sites, so it’s easy to forget to add one, or remove one, and the list gets out of sync pretty quickly. A modified version of Steve’s script that parsed out the value of $base_url for each site would be pretty close to ideal, though. Having the main script fired off as a single cron task is exactly what I want, so each site will be cronned sequentially, without having to worry about offsetting by 1 minute to avoid self-slashdotting my own server.
Chris - I’ll be checking out drush this afternoon. Thanks! Sounds pretty hopeful, but I’m not sure how I’d run it to cron all of the sites. Time to play around with it.