2008 — The Year of Making Your Data Work Harder

Quite a few of the conversations I’ve had this year at conferences and exhibitions have been about making data work harder (it’s also one of the themes in the JISC “Towards Implementation of Library 2.0 and the E-framework” study). We’ve had circ-driven borrowing suggestions on our OPAC since 2005 (were we the first library to do this?) and, more recently, we’ve used our log of keyword searches to generate keyword combination suggestions.
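For anyone wondering what “circ-driven” suggestions boil down to, here is a minimal sketch of one common approach: count how often two items turn up in the same borrower’s loan history. It is an illustration only, not the code behind the OPAC; the function name `also_borrowed` and the `(borrower_id, item)` input shape are assumptions made up for the example.

```python
from collections import Counter, defaultdict
from itertools import combinations


def also_borrowed(loans, top_n=5):
    """Build 'people who borrowed this, also borrowed...' lists.

    loans: iterable of (borrower_id, item) pairs (hypothetical input shape).
    Returns {item: [most frequently co-borrowed items]}.
    """
    # Group the distinct items each borrower has taken out.
    items_by_borrower = defaultdict(set)
    for borrower_id, item in loans:
        items_by_borrower[borrower_id].add(item)

    # Count every pair of items that shares a borrower.
    co_counts = defaultdict(Counter)
    for items in items_by_borrower.values():
        for a, b in combinations(sorted(items), 2):
            co_counts[a][b] += 1
            co_counts[b][a] += 1

    # Only item-to-item counts leave this function; borrower IDs do not.
    return {
        item: [other for other, _ in counts.most_common(top_n)]
        for item, counts in co_counts.items()
    }
```

Note that the output contains only item pairs, so the suggestions themselves carry no information about who borrowed what.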
However, I feel like this is really just the tip of the iceberg — I’m sure we can make our data work even harder for both us (as a library) and our users. I think the last two times I’ve spoken to Ken Chad, we’ve talked about a Utopian vision of the future where libraries share and aggregate usage data 😀
There’s been a timely discussion on the NGC4Lib mailing list about data and borrower privacy. In some ways, privacy is a red herring — data about a specific individual is really only of value to that individual, whereas aggregated data (where trends become apparent and individual whims disappear) becomes useful to everyone. As Edward Corrado points out, there are ways of ensuring patron privacy whilst still allowing data mining to occur.
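One simple way to keep individuals out of shared data (offered as an assumption, not necessarily what was proposed on the list) is to publish only counts of distinct borrowers per group and to drop any group that is too small to hide an individual. The function name, the input shape and the threshold below are all hypothetical.

```python
from collections import defaultdict


def aggregate_loans(loans, min_borrowers=5):
    """Aggregate loans per (course, item) and suppress small groups.

    loans: iterable of (borrower_id, course, item) tuples (hypothetical shape).
    Returns {(course, item): number_of_distinct_borrowers}, omitting any
    group with fewer than min_borrowers distinct borrowers.
    """
    borrowers = defaultdict(set)
    for borrower_id, course, item in loans:
        borrowers[(course, item)].add(borrower_id)

    # Borrower IDs are used only for counting and are never returned.
    return {
        key: len(ids)
        for key, ids in borrowers.items()
        if len(ids) >= min_borrowers
    }
```

The threshold is a judgment call: set it too low and a single person’s whims show through, set it too high and the trends disappear along with them.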
Anyway, the NGC4Lib posts spurred me on into finishing off some code primarily designed for our new Student Portal — course specific new book list RSS feeds.
The way we used to do new books was torturous… I’ve thankfully blanked most of it out of my memory now, but it involved fund codes, book budgets, Word macros, Excel and Borland Reportsmith. The way we’re trying it now is to mine our circulation data to find out what students on each course actually borrow, and use that to narrow down the Dewey ranges that will be of most interest to them.
The “big win” is that our Subject Librarians haven’t had to waste time providing me with lists of ranges for each course (and with 100 or so courses per School, that might take weeks). I guess the $64,000 question is would they have provided me with the same Dewey ranges as the data mining did?
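The general idea can be sketched in a few lines. This is not the actual code behind the feeds: the bucket width of ten, the `(course, class_number)` input shape and both function names are assumptions chosen to keep the example short.

```python
from collections import Counter, defaultdict


def dewey_bucket(class_number, width=10):
    """Map a Dewey class number such as '658.4' to a range like '650-659'."""
    start = int(float(class_number)) // width * width
    return f"{start:03d}-{start + width - 1:03d}"


def top_ranges_by_course(loans, top_n=3):
    """Find the most-borrowed Dewey ranges for each course.

    loans: iterable of (course, class_number) pairs (hypothetical shape).
    Returns {course: [ranges, most borrowed first]}.
    """
    counts = defaultdict(Counter)
    for course, class_number in loans:
        counts[course][dewey_bucket(class_number)] += 1

    return {
        course: [dewey_range for dewey_range, _ in c.most_common(top_n)]
        for course, c in counts.items()
    }
```

A new book list for a course is then just the recent acquisitions whose class numbers fall inside that course’s top ranges.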
The code is “beta”, but looks to be generating good results — you can find all of the feeds in this directory: http://library.hud.ac.uk/data/rss/courses/
If you’d like some quick examples, then try these:

Is your data working hard enough for you and your users? If not, why not?

Congratulations “City of God” DVD!

Sitting in the Short Loan collection in the main library at the University of Huddersfield, it doesn’t really stand out as being any different to the other DVDs near it, but our copy of “City of God” is officially the most borrowed item from our entire collection (which is nearly 400,000 items) in the last 3 years.
It’s not quite as popular as it once was (the number of loans in 2007 was about half of the 2005 figure), but it’s now been borrowed 157 times since it first arrived here in 2004.
The most borrowed book was one of the copies of “Research methods for business students”, which has now been borrowed 118 times since it was first placed on our shelves.
Anyway, if you were thinking of rushing here to borrow “City of God”, sorry — it’s out on loan at the moment 🙂
(if you were wondering, then “yes, that’s a Google Chart”)

Check out these trends

…sorry, but that was the best blog title I could come up with at 10pm after a long day 😉
In a previous post, I mentioned that the circulation figures were up for the year so far (when compared to 2006). That got me wondering what the long term trend was for items checked out. Unfortunately there are some sizeable gaps in the historical data (as stored on Horizon), otherwise I’d be able to go back as far as 1996.
Anyway, here’s how the number of check outs per month pans out since 2000…
CKOs per Month (2000-2007)
…or if you prefer your lines to be smoother…
CKOs per Month (2000-2007)
The CKO data for this year is in white.
There’s a marked change after 2002 in the period around May, and (if memory serves me right) the structure of our academic year changed in September 2002. The overall figures indicate that we had a slight decline around 2003, but it’s been climbing gradually since then. So, much as I’d love to take the glory for our increased CKOs this year, it’s probably just following the recent trend.
Finally, here’s the same graph, but adjusted for an “academic year” (Sep-Aug)…
CKOs per Month (academic years 1999-2007)
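Regrouping calendar months into academic years is a small shuffle of the keys. The sketch below assumes monthly totals keyed by `(year, month)` and labels each academic year by the calendar year in which it starts; both choices, and the function names, are assumptions rather than how the graph above was actually produced.

```python
from collections import defaultdict


def academic_year(year, month):
    """Return the starting calendar year of the Sep-Aug academic year."""
    return year if month >= 9 else year - 1


def by_academic_year(monthly_counts):
    """Regroup monthly check-out totals into Sep-Aug academic years.

    monthly_counts: {(year, month): checkouts} (hypothetical input shape).
    Returns {start_year: [12 totals ordered Sep, Oct, ..., Aug]}, with 0
    for any month that has no data.
    """
    result = defaultdict(lambda: [0] * 12)
    for (year, month), checkouts in monthly_counts.items():
        # September becomes position 0, August position 11.
        result[academic_year(year, month)][(month - 9) % 12] = checkouts
    return dict(result)
```

Months with no data come out as zero, so any gaps in the historical record would need handling separately before plotting.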

If you build it, they’ll come back for more!

I’m just busy putting together slides for some of the upcoming presentations and I thought it was about time I trawled through some of the OPAC usage stats to see if our students are still using some of the OPAC tweaks we’ve made.
The good news is that they are, and then some more!
First up, here’s the overall usage for 4 of the tweaks (May 2006 to July 2007):
OPAC tweak usage
At first glance, nothing too surprising — the overall trend follows the academic year, with the lull over summer.
What did leap out was the blue line (clicks on “people who borrowed this, also borrowed…” suggestions) — since this April, the usage has been higher than the “did you mean” spelling suggestions (red line). So, either our users have suddenly become better spellers, or they’re making much higher usage of the borrowing suggestions. If I was a betting man, I’d say it was the latter.
We’ve now got enough data to compare the same 3 months in 2006 and 2007 (May to July):
OPAC tweak usage
OPAC tweak usage
That second graph is why I’m sat here with a grin like a Cheshire Cat 😀
I’ve dug out the circulation stats for the same period and that reinforces the statement that the students are making much higher usage of borrowing suggestions in 2007 than in 2006. You can see that the number of check outs (bold pink) pretty much matches the number of clicks on the “did you mean” spelling suggestions (red line in the first graph). Check outs have also risen in 2007 when compared to the same months in 2006.
circulation stats
Interestingly, I don’t think we’ve ever had a student go up to a member of staff and say “I’ve found the suggestions really useful” or “thank you for adding spell checking”. I wonder how many complaints we’d get if we turned the features off?