3 Million

Aaron’s cool Wordle visualisations prompted me to have a look at our ever-growing log of OPAC keyword searches (see this blog post from 2006). We’ve been collecting the keyword searches for just over 2.5 years and, sometime within the last 7 days, the 3 millionth entry was logged.
Not that I ever need an excuse to play around with Perl and ImageMagick, but hitting the 3 million mark seemed like a good time to create a couple of images…
The only real difference between the two is the transparency/opacity of the words. In both, the word size reflects the number of times it has been used in a search and the words are arranged semi-randomly, with “a”s near the top and “z”s near the bottom.
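For anyone curious how a layout like that might be computed, here is a rough Python sketch. It is not the Perl and ImageMagick code used for the images above: the function name `layout_words`, the canvas size, the font size range and the log scaling are all assumptions made for illustration. It only works out a font size and a position for each word, and leaves the drawing to whatever image library you prefer.

```python
import math
import random


def layout_words(counts, width=800, height=600, min_pt=8, max_pt=72, seed=0):
    """Return (word, font_size, x, y) tuples for a simple word image.

    counts maps each word to the number of searches it appeared in.
    All sizes and the log scaling are illustrative assumptions.
    """
    rng = random.Random(seed)
    top = max(counts.values())
    band = height / 26  # one horizontal band per letter of the alphabet
    placed = []
    for word, n in counts.items():
        # Log scaling stops the most common words dwarfing everything else.
        scale = math.log(n + 1) / math.log(top + 1)
        size = round(min_pt + (max_pt - min_pt) * scale)
        # The first letter picks the band: "a" near the top, "z" near the bottom.
        letter = min(max(ord(word[0].lower()) - ord("a"), 0), 25)
        # Random jitter inside the band and across the width gives the
        # semi-random arrangement.
        y = round(letter * band + rng.uniform(0, band))
        x = round(rng.uniform(0, width))
        placed.append((word, size, x, y))
    return placed
```

Words that start with a digit or punctuation are simply clamped into the first or last band here, which is another arbitrary choice.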
If I get some spare time, it’ll be interesting to see if there are any trends in the data. For example, do events in the news have any impact on what students search for?
The data is currently doing a couple of things on our OPAC:
1) Word cloud on the front page, which is mostly eye candy to fill a bit of blank space
2) Keyword combination suggestions: for example, search for “gothic” and you should see some suggestions such as “literature”, “revival” and “architecture”. These aren’t suggestions based on our holdings or from our librarians, but are the most commonly used words from multi-keyword searches that included the term “gothic”.
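The suggestion idea boils down to counting which words turn up together in the same search. Here is a minimal Python sketch of that counting, assuming the log is just a list of search strings; the function names `build_cooccurrence` and `suggest` are made up for this example, and the real OPAC code may well work differently.

```python
from collections import Counter, defaultdict


def build_cooccurrence(searches):
    """Count, for each word, the other words used alongside it in a search."""
    cooccur = defaultdict(Counter)
    for query in searches:
        # Lower-case and de-duplicate so "Gothic gothic" counts once.
        words = set(query.lower().split())
        if len(words) < 2:
            continue  # single keyword searches say nothing about combinations
        for word in words:
            for other in words:
                if other != word:
                    cooccur[word][other] += 1
    return cooccur


def suggest(cooccur, term, limit=3):
    """Return up to `limit` words most often searched together with `term`."""
    counter = cooccur.get(term.lower(), Counter())
    return [word for word, _ in counter.most_common(limit)]
```

Given a log in which “gothic literature” appears three times, “gothic revival” twice and “gothic architecture” once, `suggest(build_cooccurrence(log), "gothic")` would return “literature”, “revival” and “architecture”, in that order. A real version would probably also want to strip punctuation and ignore very common words such as “the” and “of”.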
…and, just for fun, here’s the data as a Wordle:

3 thoughts on “3 Million”

  1. I’ve been wondering for some time how you’re collecting the search data? Could you give me some technical details?

  2. I thought I’d just leave this comment in the easiest place possible: your most recent entry.
    I was interested in your blog only because of its title, I’m sorry to say, but I have a couple of tips for you:
    1. Proofread your About page well. It’s often the most read page on a website, and a mistake there can be disastrous to your readership.
    2. If you’re going to link to a Google search, make the Google search properly relevant. On the About page, you have a link to the search for this blog’s title, but you don’t have the terms within quotation marks, which results in several unrelated hits.
    Ermm. interesting spam blocker…
