I’d missed the news, but the latest version of the Akismet plugin for WordPress includes some tasty stats. As with all things statistical, there are a few ways to read the numbers, and there are some anomalies (ferinstance, it claims I had a few days of over 1000 ham, i.e., valid, comments per day, and that’s just plain wrong), but the spam stats feel roughly right. They’re not dramatically different from what I was seeing under Mollom, except that nobody gets a Captcha inflicted on them with Akismet.

Akismet history graph


This post is intended to counteract the funk that I was feeling (and generated) when I made my previous post. Things aren’t quite as dire as I made them out to be. Yes, there is much room for improvement, both locally and globally, but this is statistically the best time in the history of humanity, so far. And the trends show that things are getting better, overall. See Hans Rosling’s excellent TED presentation, where he backs this up with some powerful statistical animations (and even sword swallowing).

He does end his presentation on a sombre note – all of this progress comes at the cost of increased CO2 emissions. We need to figure out ways to improve the economic foundation of all countries, without roasting the planet.

Guesstimating the size of an RSS feed audience is always a huge shot in the dark, but sometimes I get curious about how many people subscribe to this silly blog. If I were willing to surrender my feeds to Feedburner, I could get some pretty detailed stats. But I don’t want to hand that over.

So, I thought about digging into the accesslog that’s stored in Drupal’s database. I’ve set my copy to store access logs for the past 2 weeks, and it dutifully records which pages are viewed, as well as the IP address the request came from. It’s just a subset of a typical webserver log, so there isn’t any privacy issue here (if you’re really worried about being tracked online, you’re already using an anonymizing proxy…)

A quick MySQL poke-and-test session, and I came up with a quick query that spits out a WAG about the number of feed subscribers. It’s not accurate, because people might be accessing the feed from multiple locations (recording different IP addresses), and services like Bloglines might be sending many readers in under the cover of a single IP address (at the moment, Bloglines claims 334 folks are subscribing to various feeds published by this blog). Also, it makes no distinction between bots (Google and friends) and actual human-proxying-agents. Whatever.

Here’s my brain-dead-simple query. It just looks for all requests for feed-related paths, and counts up the number of unique IP addresses.

select count(distinct hostname) from accesslog where path like '%/feed%'

YMMV. IANAP. YHBH. Wait. Not the last one.

According to my Drupal log, there have been 955 unique IP addresses requesting the RSS/atom feeds on this blog in the last 2 weeks. That may be higher or lower than the number of actual humans reading the blog. Still, ballpark order-of-magnitude WAG at roughly 1,000. That kind of boggles my mind.

Update: Oops. My rudimentary query forgot subscribers of “rss.xml” – which turned out to be more than the various /feed subscribers! Also, thanks to a tip from Bér Kessels, I added the cool XStatistics module, which takes care of the guesswork. According to it, there are 2062 subscribers to the various feeds on this blog. Wow.
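
For the curious, here is a rough sketch of what the widened query might look like once rss.xml is included. It is not what XStatistics does; it is just the original query with one more path pattern. To keep it runnable anywhere, it uses an in-memory SQLite database in place of MySQL, a cut-down accesslog table with only the hostname and path columns, and a handful of made-up rows. The function names are hypothetical too.

```python
import sqlite3

# The original query plus a second pattern for rss.xml.
# Still a WAG: bots, shared IPs and multi-location readers are not handled.
FEED_QUERY = """
    SELECT COUNT(DISTINCT hostname)
    FROM accesslog
    WHERE path LIKE '%/feed%'
       OR path LIKE '%rss.xml%'
"""


def count_feed_hosts(conn):
    """Return the number of unique hostnames that requested any feed path."""
    return conn.execute(FEED_QUERY).fetchone()[0]


def make_sample_db():
    """Build a tiny stand-in for the accesslog table (invented data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accesslog (hostname TEXT, path TEXT)")
    conn.executemany(
        "INSERT INTO accesslog VALUES (?, ?)",
        [
            ("10.0.0.1", "blog/feed"),               # matched by '%/feed%'
            ("10.0.0.1", "rss.xml"),                 # same host, counted once
            ("10.0.0.2", "rss.xml"),                 # missed by the original query
            ("10.0.0.3", "node/42"),                 # ordinary page view, ignored
            ("10.0.0.4", "taxonomy/term/3/0/feed"),  # matched by '%/feed%'
        ],
    )
    return conn


if __name__ == "__main__":
    print(count_feed_hosts(make_sample_db()))
```

On the sample rows, the original /feed-only query would miss 10.0.0.2 entirely, which is the same kind of undercount the update describes.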