Friday, March 09, 2007

How many chess blogs are there anyway?

I'd been meaning to get to this topic for quite a while, but with all the caption contests, chess poems, USCL expansion news, Wamala updates, chess pie recipes and obscure references only marginally related to chess, who can find the time?

This story begins at the end of December on Susan Polgar's blog where she posted "The chess blogging boom." Susan wrote:
If you go to google and type "Chess Blog", about 2,130,000 items come up. If you type "Chess Blogs", about 1,760,000 items come up!
Those are of course staggering numbers, but just like the discussion of how many chess players there are in the world, rational thought and rigorous analysis can be brought to bear. So when one of Susan's anonymous commenters said, "I'm amazed that there are so many chess blogs", I thought it would be helpful to provide some clarity on the issue. I wrote:
Don't confuse the number of Google hits with the number of chess blogs. Many of the pages Google is listing probably don't even have the two words next to each other.

I would estimate that there are about 250-350 active chess blogs (where active = at least one post in the last 30 days).
Mr. Anonymous would have none of it:
"Don't confuse the number of Google hits with the number of chess blogs."

I believe Susan more than you.
It was all I could do to stop laughing. For this guy, Susan Polgar was the oracle of all truth and knowledge. It was posted on her blog so it must be true. To be fair, Mr. Anonymous would have no reason to know who I am, but had he done just a bit of research (e.g., checked my blogger profile and clicked over to this blog) he would have realized that I'm probably one of a very small handful of people on the entire planet who are in a good position to try to answer the question in the title of this post. Besides, anyone with a rudimentary understanding of search engines would realize that they are not a particularly useful tool for measuring things like the number of chess blogs. I posted a response to that effect:
"I believe Susan more than you."

Then I'm sorry to say that you would be wrong. As a test, go to say page 30 of the Google results for chess blogs and randomly click on one of the listings. Odds are it will not be a chess blog.
Thankfully, both Susan and another commenter came to my defense.

I didn't think much more about this encounter until Mark Weeks continued the discussion at Chess for All Ages:
A post titled The chess blogging boom ... caught my eye because it showed some typical confusion about counts on Google searches. A comment by DG ... corrected the misunderstanding and continued, 'I would estimate that there are about 250-350 active chess blogs...'

This number seemed high to me. I typically count about 75 blogs updated in the previous month plus another 40 active in the past six months. Since DG usually knows what he is talking about, I'll have to compare the BCC blogroll with my database of blogs.
It occurred to me that I needed to provide some explanation and the methodology behind my estimate. Here goes: While I maintain a database of all the blogs listed on BCC Weblog and continuously update their active/inactive status, I do not assume that I have a complete list of all chess blogs at any point in time. Why? Because new chess blogs are created every day, and on any given day if I want to find a few new ones I have several surefire ways of doing so. In fact, I could probably win a few bar bets with the following proposition, "In the next two hours, I bet I can find 5 chess blogs you've never seen before." Any takers?

OK, so if my database isn't complete then I need to estimate how much of the total universe of chess blogs it likely represents. I put boundaries on the problem -- it seems to me unlikely that there are 50% more chess blogs than I have in the database (i.e., I have found only about 2 of every 3), but reasonably possible that I'm short by 20% (i.e., I've found about 5 of every 6). One can certainly quibble with my estimates, but I think they are plausible.

Now at the time of Susan Polgar's original post I had ~ 220 active chess blogs listed. 220 + 20% = 264 and + 50% = 330, so that's how I came up with the 250-350 estimate I used in my comment.

Where do we stand today? I currently have 243 active chess blogs in the listings. Using the 20-50% methodology results in a current estimate of ~ 290-365.
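For anyone who wants to check the arithmetic, here is a minimal sketch of the 20-50% method in Python. The function names and the default gap values are just labels for this illustration, not anything taken from my database. It also prints the share of blogs each bound implies I have already found (about 83% and 67%).

```python
def estimate_range(known_active, low_gap=0.20, high_gap=0.50):
    """Return (low, high) estimates of the total number of active chess blogs.

    known_active: active blogs currently in the listings.
    low_gap / high_gap: assumed fraction of additional blogs not yet found
    (the 20% and 50% bounds from the post; both are guesses, not measurements).
    """
    low = known_active * (1 + low_gap)
    high = known_active * (1 + high_gap)
    return low, high


def implied_coverage(gap):
    """Share of all blogs already found if the total is (1 + gap) times the known count."""
    return 1 / (1 + gap)


if __name__ == "__main__":
    # 220 was the count at the time of Susan's post; 243 is the current count.
    for count in (220, 243):
        low, high = estimate_range(count)
        print(f"{count} known -> between {low:.1f} and {high:.1f} total")

    for gap in (0.20, 0.50):
        print(f"short by {gap:.0%} -> found {implied_coverage(gap):.0%} of all blogs")
```

Rounding the results to the nearest 5 or 10 gives the ranges quoted above; the precision of the inputs doesn't justify anything finer.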

Mark also suggested that another difference between his list and mine might be that he only tracks English language chess blogs. Certainly non-English blogs (and here I mean the language, not blogs written by people with good dental hygiene :) ) do constitute a sizable part of my database. Of the current 243 active chess blogs, 86 are not written in English while 157 are.

And lest you think we're finally done with this discussion, 10 days later Mark wrote a follow-up post on the topic which included the following:
How do I know when an inactive blog is reactivated? I don't. I trust that someone maintaining an active blog will post a welcome back message or refer to one of the newest posts in a dialog.
I have a solution for this problem. In Bloglines I keep a folder of RSS feeds for all the Inactive chess blogs. If they start up again, their posts pop up immediately. However, I wait until they've posted at least twice in 30 days before moving them back to Active status.
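I apply these rules by hand while reading feeds, but they are simple enough to write down. The sketch below is only an illustration: the function names are made up, the post dates are assumed to be available as a plain list, and treating the 30-day window as inclusive at both ends is my assumption.

```python
from datetime import date, timedelta

WINDOW = timedelta(days=30)  # the 30-day window used for both rules


def posts_in_window(post_dates, today):
    """Count posts dated within the last 30 days (inclusive), ignoring future dates."""
    return sum(1 for d in post_dates if timedelta(0) <= today - d <= WINDOW)


def is_active(post_dates, today):
    """Active = at least one post in the last 30 days."""
    return posts_in_window(post_dates, today) >= 1


def should_reactivate(post_dates, today):
    """An inactive blog moves back to Active only after two posts in 30 days."""
    return posts_in_window(post_dates, today) >= 2


if __name__ == "__main__":
    today = date(2007, 3, 9)
    revived = [date(2007, 2, 20), date(2007, 3, 1)]   # two recent posts
    one_off = [date(2006, 12, 1), date(2007, 3, 1)]   # only one recent post
    print(should_reactivate(revived, today))
    print(should_reactivate(one_off, today))
```

The stricter two-post threshold for reactivation keeps a single "sorry I haven't posted lately" entry from bouncing a dormant blog back onto the active list.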

Anyone still awake?
