book suggestions from “pewbot”

I’ve put together my first hack using pewbot — suggestions based on an individual’s loan history.
By running a user’s loan history against the “also borrowed” database, it’s possible to build a list of titles that should be of interest to that borrower.
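To make the idea concrete, here is a minimal Python sketch of that matching step. It is not the pewbot code: the `also_borrowed` dictionary, the item IDs and the simple "add up the co-loan counts" scoring are all assumptions made for illustration.

```python
from collections import Counter


def suggest_titles(loan_history, also_borrowed, top_n=10):
    """Rank items that were also borrowed alongside the borrower's loans.

    loan_history  -- iterable of item IDs the borrower has already taken out
    also_borrowed -- dict mapping an item ID to {other item ID: co-loan count}
                     (a hypothetical in-memory stand-in for the real database)
    """
    already_borrowed = set(loan_history)
    scores = Counter()
    for item in already_borrowed:
        for other, count in also_borrowed.get(item, {}).items():
            if other not in already_borrowed:  # don't suggest what they've had
                scores[other] += count
    # Highest score first; fall back to the ID so ties come out in a stable order
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:top_n]]


# Made-up sample data, purely for illustration
also_borrowed = {
    "perl-cookbook": {"learning-perl": 12, "mastering-regex": 7},
    "learning-perl": {"perl-cookbook": 12, "mastering-regex": 5, "css-guide": 1},
}
print(suggest_titles(["perl-cookbook", "learning-perl"], also_borrowed))
# → ['mastering-regex', 'css-guide']
```

Summing the counts means a title that sits alongside several of the borrower's loans floats to the top, which is roughly the behaviour you'd want from a "top 10" list.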
For example, if a student had borrowed the following 4 IT books:

     

…then the top 10 suggestions are:

     

     

 


Live OPAC search terms display

Another shameless hack inspired by the “Making Visible the Invisible” at SPL.
I’ve tweaked HIP to cache keyword search terms and then put together a couple of pages that display successful searches (in tasteful shades of purple and lilac) and failed searches (in gruesome greens). 
IE has a nice CSS blur, so I’ve coupled that with Ajax to provide a constantly updating web page where new terms appear at the front and then drop slowly to the back, becoming more and more blurred and darker as they recede (click to view full size versions):

Curse you Superpatron!

It’s way past my bedtime, but the Ann Arbor Superpatron has been planting ideas in my head again…

Recently Checked Out Books feed (in RSS or otherwise)

I’ve not built a feed, but I have come up with these two representations of the most recent check outs (click for larger versions):
1) The last 30 covers to walk out the door

2) Word Splat!

…that particular splat is entitled “And Treacle Challenge Yorkshire” and is now on sale for only $395,000 (serious bidders only please!)
Word Splat! is made up of words from the titles of the most recent X number of check outs (where X is roughly a handful).
I made a typo when initially coding the Word Splat!, and ended up with a random sub selection of words at the top left.  I kinda like that, so whatever you get at the top left (if anything) is officially the title of that Splat!
update…
I’ve added three more book collages:
1) 30 Overdue Books
2) 30 Most Recent Requests
3) 30 Most Borrowed Books
The “Overdue Books” are a random selection of items that were due back on the previous day, but have yet to turn up.

A Perfect Library 2.0 Day

Just relaxing with a glass of wine after a very very Library 2.0 day 🙂
With a lot of help from Iman Moradi (blog/flickr), we ran an introduction to Library 2.0 for members of our Subject Teams and Tech Services this afternoon.  Then, after a coffee break, we watched the SirsiDynix Institute Weblogs & Libraries: Communication, Conversation, and the Blog People web seminar given by Michael Stephens.
All in all, it’s given us a lot to discuss as we look towards (hopefully) implementing a Library Services or Computing & Library Services weblog.  Fingers crossed that next week’s Library 2.0 Web Seminar will be as much fun.  I’m keen to run into Stephen Abram at the upcoming SirsiDynix SuperConference in Birmingham as I want to find out what Library 2.0 things the company has in the pipeline; the API layer in the upcoming Horizon 8 release is definitely a welcome step in the right direction.
There was a lot of interest amongst staff in the new NCSU OPAC, especially as a lot of pioneering work on faceted searching was carried out here at Huddersfield by Amanda Tinker and Steve Pollit.  I’m hoping that there might be potential for us to implement some of Amanda and Steve’s research in our OPAC.
We’ve also got a plateful of potential new features to unleash on our unsuspecting students — simple renewals via email, RSS feeds, keyword search alerts, “people who borrowed this…”, and more.  I’m hoping to see if we can’t do some cool stuff with SMS as well.
2006 is already shaping up to be a busy year for the Library Systems Team — we’ll be involved in the RFID implementation and stock conversion (we’re currently out to tender on this) and we’re also implementing Talis Reading List.  One thing I can’t stand is having nothing to do, so I’m not complaining 😀
I noticed Talis have stated that both John Blyberg and myself are developing these things purely for our own patrons/students.  Whilst that’s true to an extent (after all, I work for Huddersfield not SirsiDynix), we’re both freely sharing much of our code so that other Innovative and SirsiDynix customers can play around with it if they want to.  Librarians have a long and proud tradition of sharing freely and I don’t intend to buck that trend just yet.
Speaking of which, I’ve been busy working on a Perl module to process the XML output from HIP 2.x/3.x and turn it into a simple Perl data structure.  The XML output from HIP gives you pretty much all the information you need, but the structure is a little unwieldy.  I’m hopeful the module will make it easier to quickly develop cool stuff like RSS feeds and OpenSearch interfaces from the OPAC.  Once I’ve got the module finished (and posted on this site), I’ll also use it to underpin the REST interface.  In turn, that should make the REST code more manageable and I might be able to get that code to a stage where I’d be happy to make it available to the SirsiDynix community.
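The general shape of "unwieldy XML in, simple data structure out" can be sketched in a few lines. This is Python rather than Perl, and the XML below is an invented stand-in: the element and attribute names (`searchresponse`, `hit`, `field`, `value`) are assumptions for illustration, not the real HIP output.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML, NOT the actual HIP 2.x/3.x structure
SAMPLE_XML = """
<searchresponse>
  <hit>
    <field name="title"><value>Learning Perl</value></field>
    <field name="subject"><value>Perl</value><value>Programming</value></field>
  </hit>
  <hit>
    <field name="title"><value>CSS: the definitive guide</value></field>
  </hit>
</searchresponse>
"""


def flatten_hits(xml_text):
    """Turn each <hit> into a plain dict of {field name: [values]}."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for hit in root.iter("hit"):
        record = {}
        for field in hit.findall("field"):
            # Keep every non-empty <value>, so repeatable fields stay as lists
            values = [v.text.strip() for v in field.findall("value") if v.text]
            record[field.get("name")] = values
        records.append(record)
    return records


records = flatten_hits(SAMPLE_XML)
for record in records:
    print(record)
```

Once the records are plain dicts and lists, generating an RSS item or an OpenSearch result is mostly a matter of picking out the fields you want.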
Unfortunately I’m currently suffering from a mild case of tendonitis in my right arm and hand, so I’m not doing as much coding as normal until it clears up.  Still, as long as I can lift a glass of wine and snuggle up to Bry on the sofa in front of the TV, I’m happy 🙂

“Did You Mean?” – part 2

I’ve been keeping an eye on the search terms and suggestions over the last few days and I noticed that quite a few keyword searches are failing simply because there’s nothing that matches the term.
In particular, we’ve got a lot of students searching for diuretics.  As there are no matches found, the spell checker jumps in and suggests things like dietetics, natriuretic or diabetics.  That got me wondering if there was a way of generating suggestions relevant to diuretics, rather than words that look or sound like it.
As a prototype, I’ve modified the Perl script to query the Answers.com web site and parse the response.  The text of each hyperlink is compared with the known keywords in the subject index and a tag cloud is generated (click to view larger version):

I’ve named it “Serendipity” simply because I’ve no idea what’s going to appear in there — the suggested keywords might be relevant (Hypertension and Caffeine) or they may be too broad (Medicine) to be of use.
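The "compare the link text with the subject index" step might look something like the sketch below. It is Python rather than the Perl script described above, it works on a canned chunk of HTML instead of fetching anything from Answers.com, and the sample markup, keyword list and font-size range are all made up for the example.

```python
from collections import Counter
from html.parser import HTMLParser


class LinkTextParser(HTMLParser):
    """Collect the visible text of every <a> element."""

    def __init__(self):
        super().__init__()
        self._buffer = None      # None means "not currently inside a link"
        self.link_texts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._buffer is not None:
            text = " ".join("".join(self._buffer).split())
            if text:
                self.link_texts.append(text)
            self._buffer = None


def serendipity_cloud(html, subject_keywords, min_size=100, max_size=200):
    """Return {keyword: font size in %} for link texts found in the index."""
    parser = LinkTextParser()
    parser.feed(html)
    known = {keyword.lower() for keyword in subject_keywords}
    counts = Counter(t.lower() for t in parser.link_texts if t.lower() in known)
    if not counts:
        return {}
    top = max(counts.values())
    # Scale sizes so the most frequent keyword gets max_size
    return {word: min_size + (max_size - min_size) * n // top
            for word, n in counts.items()}


# Invented sample page and subject index
SAMPLE_HTML = ('<p>A drug for <a href="/h">Hypertension</a>; see also '
               '<a href="/c">caffeine</a>, <a href="/h2">hypertension</a> '
               'and the <a href="/">Home page</a>.</p>')
SUBJECT_KEYWORDS = ["Hypertension", "Caffeine", "Medicine"]

print(serendipity_cloud(SAMPLE_HTML, SUBJECT_KEYWORDS))
# → {'hypertension': 200, 'caffeine': 150}
```

Anything that isn't in the subject index (the "Home page" link here) is simply dropped, which is what keeps the cloud limited to terms that will actually return results.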

“Did You Mean?” for HIP 2 or 3

[update: we’re now using Aspell and the Text::Aspell module]
HIP 4 contains a spellchecking “did you mean?” facility which, although not as powerful as Google’s, is certainly a step in the right direction.  One of the basic rules of designing any web-based system that supports searching or browsing is to always give the user choices, even if they have gone down a virtual one way street and hit a dead end.
Unfortunately it’s going to be another few months before SirsiDynix release the UK enhanced version of HIP 4 for beta testing, so I thought I’d have a stab at adding the facility to our existing HIP 3.04 server.
Fortunately Perl provides a number of modules for this kind of thing, including String::Approx, Text::Metaphone, and Text::Soundex.
String::Approx is good at catching simple typos (e.g. Hudersfield or embarassement) whereas the latter two modules attempt to find “sounds like” matches — for example, when given batched, Text::Metaphone suggests scratched, thatched and matched.
To set something like this up, you need to have a word list.  You could download one (e.g. a list of dictionary words), but it makes more sense to generate your own — in my case I’ve parsed Horizon’s title table to create a list of keywords and frequency.  That’s given me a list of nearly 67,000 keywords that all bring up matches in either a general or title keyword search.
Once I’d got the keyword list, I ran it through Text::Metaphone and Text::Soundex to generate the relevant phonetic values — doing that in advance means that your spellchecking code can run faster as it doesn’t need to generate the values again for each incoming request.
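Here is a rough Python sketch of that preparation step: count the keywords in a list of titles, then index them by phonetic code ahead of time. The `soundex` function is a hand-rolled version of the classic algorithm, standing in for Text::Soundex (the Perl module may differ in edge cases), and the sample titles are invented rather than taken from Horizon’s title table.

```python
import re
from collections import Counter, defaultdict

# Classic Soundex letter groups: bfpv=1, cgjkqsxz=2, dt=3, l=4, mn=5, r=6
CODES = {}
for digit, letters in enumerate(("bfpv", "cgjkqsxz", "dt", "l", "mn", "r"), start=1):
    for letter in letters:
        CODES[letter] = str(digit)


def soundex(word):
    """Return a four character Soundex code, or '' if the word has no letters."""
    letters = [c for c in word.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for c in letters[1:]:
        digit = CODES.get(c, "")
        if digit and digit != previous:
            code += digit
        if c not in "hw":  # h and w don't separate letters with the same code
            previous = digit
    return (code + "000")[:4]


def keyword_frequencies(titles):
    """Count how often each word appears across the titles."""
    counts = Counter()
    for title in titles:
        counts.update(re.findall(r"[a-z]+", title.lower()))
    return counts


def build_phonetic_index(keywords):
    """Map each Soundex code to the keywords that share it (computed once)."""
    index = defaultdict(list)
    for word in keywords:
        index[soundex(word)].append(word)
    return dict(index)


# Invented sample titles
TITLES = ["A history of Huddersfield", "Huddersfield and its mills", "Hadrian's Wall"]
frequencies = keyword_frequencies(TITLES)
index = build_phonetic_index(frequencies)
print(index.get(soundex("Hudersfield")))
# → ['huddersfield']
```

At request time the spellchecker only has to compute one code for the incoming term and do a dictionary lookup, which is the speed-up described above.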
Next up, I wrote an Apache mod_perl handler to create the suggestions from a given search term.  As String::Approx can often give the best results, the term is run against that first.  If no suggestions are found, the term is run against Text::Metaphone and then Text::Soundex in turn to find broader “sounds like” suggestions.
Assuming that one of the modules comes up with at least one suggestion, then that gets displayed in HIP:
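The "try the strictest matcher first, then fall back to looser ones" logic can be sketched as a simple cascade. This is Python, not the mod_perl code: `difflib.get_close_matches` is swapped in for String::Approx, and `consonant_key` is a deliberately crude "sounds like" key standing in for Text::Metaphone and Text::Soundex. The keyword list and the 0.8 cutoff are assumptions for the example.

```python
import difflib


def consonant_key(word):
    """Crude 'sounds like' key: first letter plus the remaining consonants."""
    word = word.lower()
    return word[:1] + "".join(c for c in word[1:] if c not in "aeiouyhw")


def approximate_matches(term, keywords):
    """Catch simple typos (stand-in for String::Approx)."""
    return difflib.get_close_matches(term, keywords, n=5, cutoff=0.8)


def sounds_like_matches(term, keywords):
    """Broader phonetic matches (stand-in for Metaphone/Soundex)."""
    key = consonant_key(term)
    return [w for w in keywords if w != term and consonant_key(w) == key]


def suggest(term, keywords, matchers=(approximate_matches, sounds_like_matches)):
    """Run each matcher in turn and return the first non-empty result."""
    term = term.lower()
    for matcher in matchers:
        suggestions = matcher(term, keywords)
        if suggestions:
            return suggestions
    return []


# Invented keyword list
KEYWORDS = ["huddersfield", "embarrassment", "manoeuvre", "library"]

print(suggest("Hudersfield", KEYWORDS))  # caught by the typo matcher
print(suggest("manuver", KEYWORDS))      # falls through to "sounds like"
print(suggest("zzz", KEYWORDS))          # nothing to suggest
```

Because the matchers are just a tuple of functions, adding a third, even broader stage is a one-line change.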

There’s still more work to do, as the suggestions only appear for a failed single keyword.  Handling two misspelled words (or more) is technically challenging — what’s the best method of presenting all the possible options to a user?  You could just give them a list of possibilities, but I’d prefer to give them something they can click on to initiate a new search.

HIP Tips!

Adding a message to your log in page
At Huddersfield, we’ve added some text to our HIP login page:

This is a really easy hack and just involves editing a single XSL page (security.xsl).
As always, make a safe backup of the file before you do any editing!
Firstly, open up security.xsl in your favourite text editor.  If you use Microsoft Notepad and it looks a mess (e.g. you get weird squares like □ appearing) then try opening the stylesheet using WordPad instead.
Go down to the end of the file, and it should look a little like this:

</center>
</td>
</tr>
</table>
</form>
</xsl:if>
</xsl:template>
</xsl:stylesheet>

Simply insert some well formatted (i.e. XHTML) code between the </form> and the </xsl:if> – for example:

</center>
</td>
</tr>
</table>
</form>
<!-- extra info for users - added by Dave -->
<p /><div align="center" style="font-size:80%">
<b>Borrower ID</b> is the 10 digits of your Campus ID Card
<br />
<b>PIN Number</b> is the 4 digits of the day and month
of your birth (e.g. 0206 for 2nd of June)
<p />
<span style="border-bottom:dashed black 1px; color:red;">
<b>Don't forget to logout when you have finished!</b>
</span>
</div>
<!-- end of extra info -->
</xsl:if>
</xsl:template>
</xsl:stylesheet>

…you should always add comments to any code you add to a stylesheet — it will help you locate your changes when it’s 3 months down the line and you can’t remember what you did, or why you did it!

All this was done using HIP 3.04 (UK release) and it should work for other versions of HIP 3.

p.s. if you like the idea of reminding a logged in user to log out when they’ve finished, check out this tip too!

HIP Tips!

Do your 856 URLs show up in a big font size that doesn’t seem to quite fit in with the rest of the text on the full bib page?
The quickest way to fix it is to fire up the Horizon table editor, select marc_map, and then locate the marc_map that you use for your 856 URLs.
In the “HTML format (Info Portal only)” field, insert class="smallAnchor" before the href. For example, if your HTML format looks like this:

<a href="$_">{<img src="$9">|$y|$_}</a>

…then change it to:

<a class="smallAnchor" href="$_">{<img src="$9">|$y|$_}</a>

Save the change, and then restart JBoss and the 856 links should pick up the formatting of the “smallAnchor” element from your HIP cascading style sheets (CSS).
And, for the more adventurous – if you’d like to know which 856 links your users are clicking on, then you can set your marc_map up to redirect to a CGI script that logs the URL and then redirects the user’s web browser to the true 856 link.
Once you’ve got your CGI script ready (in this case, I’ve called it logit.pl), you just need to change the 856 marc_map to link to the script – e.g.

<a href="http://foo.com/cgi/logit.pl?$_">{<img src="$9">|$y|$_}</a>

Once you’ve saved that and restarted JBoss, your 856 URLs look like this in HIP:

http://foo.com/cgi/logit.pl?http://www.ebooks.com/12345

Your CGI script just needs to take the contents of the QUERY_STRING environment variable (in the above, it’s http://www.ebooks.com/12345), append it to your log, and then issue a redirect to that URL.
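As a rough sketch of what such a script has to do, here is a Python version (the logit.pl mentioned above would be Perl, and isn't shown here). The log file name is hypothetical, and the check that the target is a plain http/https URL is an extra precaution added for this example so the script can't be used to bounce people to arbitrary schemes.

```python
import os
import sys
from datetime import datetime, timezone
from urllib.parse import urlsplit

LOG_FILE = "856_clicks.log"  # hypothetical location; adjust to suit your server


def handle_click(query_string, log_path=LOG_FILE):
    """Log the target URL and return the CGI response that redirects to it."""
    target = query_string.strip()
    parts = urlsplit(target)
    looks_safe = (
        parts.scheme in ("http", "https")
        and parts.netloc
        and not any(c.isspace() for c in target)
    )
    if not looks_safe:
        return ("Status: 400 Bad Request\r\n"
                "Content-Type: text/plain\r\n\r\n"
                "Bad link\n")
    # Append a timestamped line to the log, then send the browser on its way
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"{stamp}\t{target}\n")
    return f"Status: 302 Found\r\nLocation: {target}\r\n\r\n"


if __name__ == "__main__":
    # The web server puts everything after the "?" into QUERY_STRING
    sys.stdout.write(handle_click(os.environ.get("QUERY_STRING", "")))
```

Keeping the logic in a function that returns the response text makes it easy to try out from the command line before pointing your marc_map at it.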
(disclaimer: all of the above was done with Horizon 7.32 UK and HIP 3.04 – your mileage may vary depending on which versions you’ve got!)

DUG/HUG – Friday

Friday started off with a session about the RFID implementation at Middlesex, with Alan Hopkinson, Tim Pond (D-Tech Direct), and Gregor Hotz (Bibliotheca). We’re planning to implement RFID at Huddersfield during 2006.
Next up, James Castle (Freshfields Bruckhaus Deringer) and myself did a 45 minute session entitled “HIP Ideas”. James talked about the issues involved with setting up multi-language subject indexes. You can find my presentation (“Break Your HIP!”) here. Once again, many thanks to eagle-eyed Polly who spotted what had caused the “wheels to fall off” my live floor plan demo!
Finishing off the morning, Jill Osborne (Dynix) gave a series of HIP 4 demonstrations – here are my brief notes:

  • the built in spellchecker is able to offer a range of possible correct spellings (similar to Google’s “did you mean xxx?”)
  • the optional “Thesaurus Expanded Search” includes synonyms for each search term
  • printer friendly versions of search results and full bib pages are available
  • the ADA and Kids HIP profiles won’t be available until HIP 4.1

Finally, Jill ran through some of the things that won’t be carried across from a HIP 3 to HIP 4 upgrade:

  • XSL stylesheet changes
  • some look and feel options
  • any tabs or subtabs that are links

…like many at the conference, I can’t wait to get my hands on HIP 4!
Sadly, we had to rush off after Jill’s session to get back to the airport on time.
Many thanks to everyone involved with organising the conference – especially the Dynix staff!

replacement inactive tab GIFs

The default GIF images used in HIP 3.04 for the inactive tabs only really work if you stick with the grey colour. Once you change that background colour, they no longer match:

Not only that, but the two corner GIFs are actually a different colour (more olive/brown than grey):

You can call me “sad”, but things like that niggle me! 😀
So, after a few minutes with Paint Shop Pro, I’ve come up with some replacement GIFs that use transparency to match whatever background colour you use for your inactive tabs:


If you too are niggled by your default GIFs, then you can download the replacement ones here: inactive_tab_GIFs.zip
You’ll need to overwrite the existing files, which are normally located in the “appserver\jboss\server\default\deploy\hipres.war\images” directory:

…all of this assumes you’re on HIP 3.04 – so it might not work with other versions. Also, don’t forget to back up the original images (just in case!)
In a similar vein, I replaced the tab GIFs on our HIP a while ago with slightly more rounded 3D versions, although I’m not sure anyone actually noticed the change!