CODI 2005 – session links

I’ve put together a page listing each of the CODI 2005 sessions along with (hopefully!) all the PowerPoint, handout, podcast, blog, etc. links.

http://www.daveyp.com/files/stuff/codi2005links.html

Please feel free to re-use the link or to circulate it.
If you have any additions or corrections, please email them to me:

d.c.pattern [at] hud.ac.uk

using “circ_tran” to show borrowing suggestions in HIP

One of the things we’re trying to do this year at Huddersfield is to make better use of our data archives:
…as each student goes through a library turnstile, data is written away…
…as each student borrows a book, more data is quietly written away…
…as each student uses an electronic resource, data is written away…
…as each student logs onto a PC, yet another piece of data is…
…okay, enough already – you get the idea!
We’re not particularly interested in what an individual student has done, but we’d like to see the broader picture. For example, we open the Library 24/7 at certain times of the year (e.g. Easter) – we’d like to know more about the kinds of students who come in late at night and leave early the next morning:

  • are certain ethnic groups more likely to use the Library outside of the standard opening hours?
  • do we get more male or female students using the Library in the wee small hours?
  • are students coming in to use the computers, to issue/return items, or to sit quietly in a corner and study?

The answers to those kinds of questions tend to be found in several databases. The Sentry database tells us when someone entered the Library, but it doesn’t tell us if they are male or female, Asian or Caucasian – that kind of information is stored in the Student Records System. Also, the Sentry database doesn’t tell us what the student actually did – Circ transactions are in Horizon and PC usage info is stored in other databases.
So, long term we’re looking at ways of trying to combine data from all of those sources into meaningful and enlightening stats.
“What has this got to do with showing borrowing suggestions in HIP?”, I hear you ask!
Well, once I’d had a hunt around in our circ_tran table in Horizon, it seemed like a great use of all that historical Circ data would be to do an Amazon-like “patrons who borrowed this book also borrowed…”.
Before I proceed with the “how to”, I’ve got a hunch that not everyone has got a circ_tran table – it might be something that SirsiDynix needs to set for you, rather than a default table that ships with Horizon (can anyone confirm this?)
The circ_tran table contains (amongst other things) two very useful bits of information – the borrower# and the item# of the item they borrowed. You can use the item# to look up the bib# of that item (using the item table).
Once you’ve got the borrower# and bib#, you can use that to create two lists of data:

  • a list of all the bib#s that a specific borrower has ever borrowed
  • a list of all the borrower#s who have borrowed a specific bib#

To build the list of borrowing suggestions, you start with a bib# and:

  • 1) build the list of all the borrower#s who have borrowed that bib#
  • 2) for each of those borrower#s, compile all the bib#s of all the items they’ve borrowed to a single big list of bib#s
  • 3) take that big list and count how many times each bib# appears in the list
  • 4) sort your list of individual bib#s by the count of how many times they appear in the big list

…those bib#s that appear the most times in the big list are therefore the most appropriate ones to suggest.
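As a rough illustration, the four steps above can be sketched in Python. This is not the Perl script itself, just a minimal sketch that assumes the circ_tran data has already been reduced to (borrower#, bib#) pairs; the function name `build_suggestions` is made up for the example. It also makes two choices the steps above don't spell out: a borrower who took out the same bib# several times is only counted once, and the starting bib# is left out of its own suggestions.

```python
from collections import Counter, defaultdict


def build_suggestions(loans, max_suggestions=10):
    """Map each bib# to the bib#s most often borrowed by the same borrowers.

    `loans` is an iterable of (borrower, bib) pairs (an assumed input shape).
    """
    bibs_by_borrower = defaultdict(set)   # every bib# a borrower has borrowed
    borrowers_by_bib = defaultdict(set)   # every borrower# who borrowed a bib#
    for borrower, bib in loans:
        bibs_by_borrower[borrower].add(bib)
        borrowers_by_bib[bib].add(borrower)

    suggestions = {}
    for bib, borrowers in borrowers_by_bib.items():
        counts = Counter()
        for borrower in borrowers:                     # step 1
            counts.update(bibs_by_borrower[borrower])  # steps 2 and 3
        counts.pop(bib, None)  # don't suggest the book being viewed
        # step 4: highest count first, ties broken by bib# for stable output
        ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
        suggestions[bib] = [other for other, _ in ranked[:max_suggestions]]
    return suggestions
```

Building both lookup tables once up front is what makes it practical to process every bib# in a single pass, rather than re-querying the loan data for each one.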
Unfortunately those 4 steps can take some serious CPU time, so it’s not possible to do it on the fly as each of your patrons brings up a full bib page in HIP. Therefore, you need to pre-process each of your bib#s to generate a list of other suggested bib#s.
I wrote a Perl script this evening (which I’ll make available soon) that slurps up the entire circ_tran table into your PC’s memory and then processes each of the bib#s to create up to 10 other suggested bib#s. Each of those suggestions is then pumped into a MySQL database where it will sit until a patron views that bib#’s page in HIP.
A single line of JavaScript added to the fullnonmarcbib.xsl stylesheet then pulls in dynamic content from a Perl CGI script. That CGI script simply fetches the list of suggested bib#s from the MySQL database, quickly looks them up in the title table in Horizon, and then displays a random selection of them underneath the copy/holding info.
The only real drawback is that it’s not working with your circ_tran data in real time – the list of 10 possible suggestions per bib# won’t change until I run the slurping Perl script again to rebuild all of the suggestions. On our database of 2,046,180 circ_tran entries, that took about 3 hours to process. So, in theory, you could schedule it to run once a week or once a month.
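The “pre-process now, look up later” arrangement described above might look something like the following Python sketch. It is only an illustration: it uses an in-memory SQLite database as a stand-in for MySQL, the table and column names (`suggestion`, `bib`, `suggested_bib`, `seq`) are invented for the example, and it skips the title lookup in Horizon.

```python
import random
import sqlite3


def store_suggestions(conn, suggestions):
    """Replace the stored suggestions with a freshly built set."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS suggestion "
        "(bib INTEGER, suggested_bib INTEGER, seq INTEGER)"
    )
    conn.execute("DELETE FROM suggestion")  # full rebuild on every run
    rows = [
        (bib, suggested, seq)
        for bib, suggested_bibs in suggestions.items()
        for seq, suggested in enumerate(suggested_bibs)
    ]
    conn.executemany("INSERT INTO suggestion VALUES (?, ?, ?)", rows)
    conn.commit()


def pick_suggestions(conn, bib, how_many=3):
    """Return a random selection of the stored suggestions for one bib#."""
    rows = conn.execute(
        "SELECT suggested_bib FROM suggestion WHERE bib = ?", (bib,)
    ).fetchall()
    candidates = [row[0] for row in rows]
    return random.sample(candidates, min(how_many, len(candidates)))
```

The lookup side is a single indexed-style query per page view, which is why the expensive counting can be left to a weekly or monthly batch run.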

Taggytastic! (part 2)

Wow! Fame and glory – hopefully the untold riches will be just around the corner! 😉
For anyone who wants to have a go with their Horizon/HIP, I’ve uploaded the script to here:
http://www.daveyp.com/files/stuff/tags/
I’ve done a little bit of tweaking, and the final keyword list now looks like this.
You’ll need to download the Perl script and the sample config.txt file.
As with many of the other scripts I’ve uploaded, you’ll need a working ODBC connection to your Horizon database – if you’re running ReportSmith or EasyAsk, then you’ll know all about that. You’ll also need to have Perl installed, along with the DBD::ODBC module from CPAN.
The config.txt file has three columns:

  • the first column defines the range for each keyword count, and this works with the $threshold variable to select the font size & colour for each keyword in the HTML output
  • the second column defines the font size – you should be able to use any valid CSS value (e.g. 50%, 10px, or x-small, etc)
  • the final column defines the font colour – in the example file I’ve gone for a blue gradient (#006 through #77D), but if you prefer a single colour then just change all the entries to that (e.g. #00F) – again, you should be able to use any valid CSS value (#123456, red, etc)
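
To give a rough idea of how a three-column file like this can drive the styling, here is a small Python sketch. The exact layout of config.txt isn’t described above, so this assumes each line is “lower-bound font-size colour” separated by whitespace, and that a keyword gets the style of the highest lower bound its count reaches; `parse_config` and `style_for` are names made up for the example.

```python
def parse_config(text):
    """Parse lines of 'lower_bound font_size colour' (an assumed layout)."""
    bands = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        lower, size, colour = parts
        bands.append((int(lower), size, colour))
    return sorted(bands)  # lowest band first


def style_for(count, bands):
    """Return (font_size, colour) for a keyword count, or None if below all bands."""
    chosen = None
    for lower, size, colour in bands:
        if count >= lower:
            chosen = (size, colour)
    return chosen


sample_config = """\
0 80% #77D
500 100% #33A
1000 150% #006
"""
bands = parse_config(sample_config)
```

With that sample, a keyword with 1237 bibs would get the largest, darkest style and one with 381 bibs the smallest, lightest one.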

To run the script, just put it in the same directory as the config.txt file and run it (e.g. perl getsubjects.txt). The HTML output file should get created in the same directory.
There are a few variables that you can tweak:

  • $minimumBibs – this is used in the initial SQL query on the subject table, so a lower value means more subjects will be included for processing, but the query might take longer to run and/or hit your Horizon server harder
  • $threshold – once all the subjects have been processed, any whose total number of matching bibs falls below the threshold value will be excluded from the output – if you’d prefer a smaller list of keywords in the HTML output, then choose a higher value and vice versa
  • $spacing – this is a string of characters to insert between each keyword
  • $hipUrl – unless you really want to link to our HIP, then you’ll need to tweak this URL

Have fun with the script!
Jenny emailed me to ask if the script could work with other systems (e.g. Innovative), so I’m going to have a go at writing a smaller version of the script that will take a list of keywords and counts, such as the example below, and then create the same output:
1237 American poetry
381 Java
857 World Wide Web
…so, as long as you can query your system to get something in the above format, then it should work.
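As an illustration of how little is needed once the data is in that “count keyword” format, here is a minimal Python sketch (not the Perl script itself). The search URL is a placeholder rather than a real HIP address, and the `threshold` and `spacing` parameters only loosely mirror the $threshold and $spacing variables described earlier.

```python
import html
from urllib.parse import quote_plus


def parse_subjects(text):
    """Turn lines like '1237 American poetry' into (keyword, count) pairs."""
    subjects = []
    for line in text.splitlines():
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) == 2 and parts[0].isdigit():
            subjects.append((parts[1].strip(), int(parts[0])))
    return subjects


def render_keywords(subjects, threshold=0, spacing=" ",
                    search_url="http://hip.example.org/search?q="):
    """Build a string of HTML links, skipping keywords below the threshold.

    `search_url` is a hypothetical placeholder, not a real HIP URL.
    """
    links = []
    for keyword, count in sorted(subjects, key=lambda s: s[0].lower()):
        if count < threshold:
            continue
        links.append('<a href="%s%s" title="%d bibs">%s</a>' % (
            search_url, quote_plus(keyword), count, html.escape(keyword)))
    return spacing.join(links)
```

Font size and colour are left out here to keep the sketch short; they would be added as a style attribute on each link.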
[one quick sandwich and Coke later]
…and here is a more general version of the script that should work with other systems:
http://www.daveyp.com/files/stuff/tags2/
As well as the Perl script, you’ll need to download the config.txt file. Also, you’ll need to create your own subjects.txt file – I’ve included a sample one so you can get a rough idea of the layout.
As before, you can do a bit of tweaking with the variables and the config.txt file to customise the final HTML output.
Horizon users who don’t want to faff around with getting the first script to use ODBC can generate their own subjects.txt by copying the output of running the following SQL statement in SQL Advantage (or similar):
select n_bibs,processed from subject where n_bibs > 50
…however, you won’t get the advantage of the way the first script collapses sub-subjects together.

Taggytastic!

Inspired by Jenny Levine’s mock up of an OPAC with keyword tags, I’ve gone a step further and used our Horizon database (the “subject” table in particular) to generate a real page based on subject keywords with more than 10 bibs:
http://www.daveyp.com/files/stuff/subjects.html
I did a bit of tweaking so that sub-subjects (is that a real word?) are collapsed into the parent subject – if you hover your mouse pointer over one of the links, then you should get a better idea of what I mean. Once I’d got the totals for each parent subject, I excluded anything with fewer than 100 bibs.
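
The collapsing idea can be sketched in a few lines of Python. How sub-subjects are actually marked in the subject table isn’t stated here, so this assumes a “ -- ” separator between the parent subject and its subdivisions; that separator, and the name `collapse_subjects`, are assumptions made purely for illustration.

```python
from collections import defaultdict


def collapse_subjects(rows, separator=" -- ", minimum_total=100):
    """Sum bib counts per parent subject and drop small totals.

    `rows` is an iterable of (n_bibs, heading) pairs; the separator
    between parent and sub-subject is an assumption.
    """
    totals = defaultdict(int)
    for n_bibs, heading in rows:
        parent = heading.split(separator, 1)[0].strip()
        totals[parent] += n_bibs
    return {parent: total for parent, total in totals.items()
            if total >= minimum_total}
```

Note that simply adding the counts together will count a bib twice if it carries both the parent heading and one of its sub-subjects, so the totals are best treated as a rough weighting rather than an exact figure.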

CODI 2005 – Homeward Bound!

Sadly the two “spare days” after the end of CODI 2005 have flown by and tomorrow morning we’re setting off back to the UK. By the way, just in case anyone wants to know what “the sun going down on CODI 2005” looked like, here it is/was:
sun set on Wednesday evening
We spent most of yesterday in St Paul (the sibling to Minneapolis in the title “The Twin Cities”). As some of the Horizon mailing list regulars will know, we were keen to visit the Catbus exhibit at the Minnesota Children’s Museum – and just to prove we did, here’s Bryony in the front seat:
driving the Catbus
…and here’s Totoro himself:
Totoro
…and there’s more pictures here!
For lunch, we walked about a mile out of St Paul to Red’s Pizza Savoy. After the cosmopolitan Minneapolis, Red’s felt like a taste of true Americana.
We even made it back in time to go and give the Loring Park squirrels their tea.

CODI 2005 – Day Three (pm)

Planning for Hardware: It Doesn’t Have to be Hard (Tim Hyde – tim.hyde@sirsidynix.com)
Tim’s presentation covered a lot of the same ground that Jolynn’s Planning for 8.0 and 4.2 did. In fact Tim’s session was really a summary of what many of us had seen throughout the 3 days. As one of the final CODI sessions it was ideal – we didn’t want any new shocks or dropping of bombshells 🙂
Tim started off by summarising Horizon 8, and listed the main new features as:

  • state-of-the-art uPortal
  • record ownership
  • agency modelling
  • support for native open SQL databases (Oracle, DB2, MS SQL)
  • full Unicode support
  • total Java/J2EE solution
  • e-commerce
  • UniMARC, MARC21, MARCXML…
  • LDAP
  • Kerberos encryption
  • Shibboleth
  • thin client (can run on Windows, Mac, and Linux)

Tim also shed some more light on the lack of Sybase in that list of databases: apparently Sybase isn’t 100% Unicode compliant so, until Sybase resolve that, SirsiDynix won’t certify it for use with Horizon 8.0.
For those of you who are thinking about running HIP 4.0 or the Horizon 8.0 application server under Windows 2003, you need to be aware that Microsoft currently limits the Java Virtual Machine to using a maximum of 2GB RAM. In other words, if you load your hardware up with 8 GB of RAM, then HIP/Horizon ain’t going to use it all!
The official hardware recommendations won’t be available until the end of Jan 2006. However, the unofficial word is that if your current hardware is recent, isn’t being stressed out by running Horizon 7.x, and (ideally) has some room for expansion (e.g. extra CPUs or extra memory), then the chances are that it will be suitable for running Horizon 8.0.
For small to medium sized libraries, you should be able to run the application and database servers on the same box, but large libraries should look to run them on separate servers. Every session I’ve been to where that has been stated, a hand has always gone up and someone has said “can you define what you mean by small, medium and large?”…

Yeah – a medium sized library is one that’s smaller than a large one, but bigger than a small one.
(paraphrasing Tim Hyde, SirsiDynix)

Finally, clustering options won’t be available until the release of Horizon 8.1 (Q2/Q3 2006).

CODI 2005 – Day Three (am) – pt 2

Tailored Just for U: uPortal Customised for Academics (Dennis Todd)
The HIP 4 admin tool is built on the 8.0 code base and will run on any desktop that can run Java.
Dennis had prepared a useful “HIP 4.1 Customisation Parameter List” document, but it wasn’t too obvious where this was going to be available to download from.
Dennis also introduced some of the new HIP 4 terminology:

  • targets – anything that HIP 4 can search against (e.g. Horizon database, Z39.50 targets, Digital Library, etc)
  • common codes – groups together similar result/search attributes from the targets so that a single author keyword search could match against authors, composers, editors, etc – whatever was closest to the concept of an “author” for that particular database or resource (MetaLib sites will already be familiar with this “lowest common denominator” idea)

A hint from Dennis – change the quick search (in the portal properties record) so that it searches against more than just the Horizon database.
Templates:

  • profile_user
    template user – the “look & feel” that a logged in user gets
  • profile_guest
    template guest user – the “look & feel” a non-logged in user gets

…make sure that the “system admin” box is ticked for both of the above!
When you log into HIP to make changes to those templates, firstly save the layout and then click save template user – don’t do it the other way around!!!
Remember that guest users shouldn’t have a “preferences” tab.
Anyone who is set up as a “system admin” gets a “manage channels” icon so that they can add new channels to HIP.
In HIP 4, new tabs are added in HIP – not from inside the Java HIP admin tool.
Dennis stressed the importance of getting your templates set up exactly how you want them before going live. If you decide to make changes after going live, then any patrons who had already logged in won’t get to see those new changes.
That raised the question of exactly when do you make those changes – if you’re busy upgrading from HIP 3.x to HIP 4.x, then you don’t have the luxury of saying to your patrons “Hey – the PAC’s not going to be available for a couple of weeks, because we want to play around with it first and make it all pretty for you!”.
Someone suggested that you disable logins until you’ve got the “logged in” template set up – but that would mean patrons wouldn’t be able to make requests or renew items via the PAC.
The only solution seems to be that you need to plan ahead and decide in advance what you want to appear in your HIP and how it should be laid out. Then, once you’ve finished the upgrade, cross your fingers and try and set it up as quickly as possible!

CODI 2005 – Day Three (am)

Insights into Web Reporter and NarrowCast (Eileen Kontrovitz & Brian Rawlings)

Wonderful product, but the roll-out hasn’t been the best!
(Brian Rawlings, Alpha G)

Optional components (add on services) for Web Reporter…

  • OLAP – used for data mining:
    • report objects – include items included in the SQL but not included in the report (e.g. correct sorting by “reconst” fields such as title or call/class number)
    • view filters – includes items in the SQL Query but filters the results displayed in the report
    • derived metrics – create a new metric on the fly based on existing metrics on the report
    • …who benefits? – sites with large databases will benefit the most, as well as people creating “what if?” reports
    • consider purchasing OLAP only for the administrator
  • Report Services:
    • Crystal Reports type interface that can draw data from multiple grids
    • useful for creating letter-type output (e.g. invoice notice letters)
    • …who benefits? – schools, home services, anyone wanting to create form letters
    • a larger number of Report Services documents will be created in future metadata releases
    • not every user needs Report Services
  • Narrowcast:
    • pro-active, automated report delivery
    • reports can be sent to email, files, printers, or SMS devices (text messaging)
    • …who benefits? – everyone!
    • Narrowcast users are cheaper – you may have plenty already
    • savings – you can enter multiple email addresses for the same user

[Narrowcast] is the most exciting part of Web Reporter
(Brian Rawlings, Alpha G)

Brian’s general recommendations:

  • buy as few users as possible
  • buy analyst licenses rather than reporter licenses
  • compliance is based on the number of logins created
  • enable add-on services for individual users as needed

General MetaData rule: when including item attributes, look for them first in the request, circ, circ_history, burb, and burb_history folders
Narrowcast automation can…
1) save you and your staff money and time:

  • automate report delivery
  • notices (inc. pre-overdue)
  • newsletters
  • performance based alerts

2) deliver reports to a fixed group of people, or a dynamic group of people on a specified schedule
3) deliver any type of email notice:

  • dynamic subscription, dynamic content
  • hold notices, pre-overdues, overdue, billing, etc
  • html formatted email, text completely customisable
  • queries database for notice conditions, updates records after sending email
  • needs a few custom attributes if using MSTR 7.5.0

Narrowcast can keep a copy of emails sent, or it can write a block to the Horizon database.
Narrowcast Newsletters:

  • email newsletters and event calendars
  • keep your patrons informed of library events

Narrowcast can send performance based alerts – e.g. alert me when Day End did not run

CODI 2005 – Day Two (pm) – pt 2

Planning for 8.0 and 4.2 (was 4.0): Decisions You Need to Make (Jolynn Halls)
The title of this had changed subtly – with 4.0 long gone and 4.1 nearly here, plans are already afoot for HIP 4.2.
Jolynn rattled through some of the PowerPoint slides, so some of my notes aren’t complete, plus the discussion kinda jumped around a bit.
Introduction…

  • you need to look forward to 8.0/4.2 like any other upgrade and plan accordingly
  • you need to plan on getting staff involvement earlier than with other upgrades – there’s much more they need to learn
  • you need to be on the current releases (7.4/4.1) …apparently Jack has promised there will be an upgrade path from 7.3x? (Jolynn: “He’s the man!”)
  • you need to relish change 🙂
  • staff need to understand and implement the new functionality
  • take advantage of any training (web sessions available from December, although some will be chargeable)

HIP…

  • requires Java JRE 1.5 on the admin workstation – you can run different versions without any problems – use the Java App Cache (javaws.exe)
  • new indexing paradigm – all indexing done on the app server for both Horizon and HIP (instead of separate indexes for StaffPAC, etc)
  • one unified User/Patron database
  • uses filters instead of separate indexes

Hardware… (official specs released in Jan 2006)

  • Horizon 7.x architecture – two tier model
  • Horizon 8.0 architecture – three tier model (DB server, app server, clients)
    • lower client bandwidth
    • less CPU
  • app & DB can be combined onto one server (small to medium sized library)
  • for medium to large libraries, you’ll need 4 servers (app server being the beefiest)
  • for security reasons, you don’t want HIP + app + DB on single server
  • Web Reporter will be a requirement for 8.0 and would usually sit on a separate server
  • client hardware specs available by Jan 2006

Horizon 8.0

  • Database Server – DB2 V8, MS SQL Server 2000/2005, or Oracle 10g (that’s right – no Sybase!)
  • Application Server – Linux 4.0 AS/ES, Solaris 10, or Windows 2003

things that change in Horizon 8.0

  • different DB structure
  • agency vs location
  • indexing
  • record ownership
  • security (roles/staff users)
  • user interface/presentation (navigation/hot keys)
  • inheritance (sharing codes/rules)
  • library type (Horizon vs Corinthian)

Before moving to Horizon 8.0, you need to think about and understand your existing:

  • policies & procedures
  • security/roles
  • privileges/parameters
  • sorts/limits

Highlights of the 8.0 modules…

  • ACQ
    • VIP against multiple vendors
    • create and copy budgets spreadsheet
    • carry forward defaults
    • EDI from client (auto invoicing and response loading)
    • research from selections and POs
    • open with from MARC record to PO
    • approval plan loading
    • processing centers
    • quick entry of invoice lines using order ID
    • access to MARC record from PO and Selection
  • CAT
    • MARC record lists
    • items lists
    • spine label config
    • import/export profile tag action
    • import profile enhanced match points for overlay
    • import/export profile scheduling
    • MARC Editor non-MARC view & overview template
    • URL verification
    • MARC Batch Editor
    • Syntax & Validation Label exceptions
  • CIRC
    • patron photos
    • request groups
    • linked patrons
    • email patron from check in
    • batch requests (by title/patron)
    • calendar exceptions
    • circ rules/codes inheritance
    • custom blocks
    • ID patron access
    • display of student and/or outreach patron data dependent on Patron Type
    • notification preferences
  • Searching
    • broadcast searching
    • limiting on a browse search
    • multiple search tabs open at the same time
    • extractions
    • indexing (HIP & Horizon)
  • Serials
    • Serials CKI
    • Routing lists
    • Pattern setup
    • copy pattern and pub pattern templates
    • MARC Holdings support
    • Claims Management

What should you be doing now?

  • review existing Horizon policies & procedures
  • prepare for new UI
  • participate in training
  • upgrade to the most current versions
  • allocate time…
  • look at your current hardware

Finally, Jolynn cleared up the situation with TeleCirc…
Basically, Edify were slow in coming up with a version of their software (which underpins TeleCirc II) which would work with Windows 2003 Server. As Microsoft no longer support Windows 2000, Dynix were unhappy with Edify not coming up with a Windows 2003 version of the product. So, Dynix began evaluating solutions from Talking Tech. In the meantime, Edify finally came up with a new version (I think it’s v9.5) that does work with Windows 2003.
The outcome of all that is that there will be two solutions that work with Horizon 8.0 (one from Edify and a new one from Talking Tech). If you don’t already have TeleCirc, then you’d need to decide which solution to use and then buy the hardware and software.
If you already have TeleCirc, then you can either:

  • a) move to the Talking Tech solution – you will need to pay to get a new license and also replace the telephony hardware card in your TeleCirc server (as that hardware isn’t compatible with their software)
  • b) stay with TeleCirc – you will need to upgrade your server to Windows 2003 and also upgrade TeleCirc to the latest version, but you can still use your existing telephony card

The dropping of Sybase as a DB option surprised me, although at Huddersfield we’d been thinking about possibly moving to MS SQL or Oracle… I guess now we don’t have a choice about moving!
It’s going to be interesting to see what the recommended hardware specs are for the servers. At Huddersfield, we run Horizon on a top end Sun V240 with the Sybase database held on our SAN (storage area network) – even when running complex reports, the server barely breaks into a sweat. I’ve got my fingers crossed that the server will still be powerful enough to run both the database and the application servers.

CODI 2005 – Day Two (pm)

I Didn’t Know Web Reporter Could Do That! (Valerie M. Chase – valerie.chase@sirsidynix.com / eric.graham@sirsidynix.com)
As we haven’t had Web Reporter installed yet, kind of everything was “I didn’t know that”!
Anyway, hopefully these notes will act as a reminder once we’re up and running…
filters…

  • once you’ve run a report, you can run the re-prompt to reselect the filters for the report without going all the way back
  • “view filter”s limit the existing results (rather than re-running the report)
  • “qualify” works well for limiting by dates, etc
  • to create new filters, you need to use the desktop client:
    • select “new” / “filter”
    • if you want a prompt, you need to hit the “prompt” button
  • prompts can be either single select (drop down list), multi-select, check boxes, or radio buttons (use the web options “modify” button to do this)
  • right-click, “search for dependants” will show you every report that uses a specific filter

consolidations…

  • allows you to do a grouping (e.g. combine all the “reference” types together)
  • remember to enable subtotals!
  • you can use consolidation to group together months (e.g. “summer”, “winter”)

metrics…

  • e.g. create a new metric to combine phone, OPAC, etc renewals to get the total renewals

threshold…

  • add a new qualification to highlight parts of the report results (e.g. show certain results in red)

advanced formatting…

  • e.g. change year format from “2005” to “05”