HIP XML Parser (v0.01) – search parser – Self Plagiarism is Style

Okay folks – here’s the companion piece of code to the bib parser I posted a few weeks ago!
http://www.daveyp.com/files/stuff/xmlparser/search.pl
As with the previous code, this is alpha at best and should be treated as such. However, if you have any suggestions then please feed them back to me.
As well as specifying your own $url, you can also tweak the $maxResults value to control how many results you actually get back. This overrides the npp value in the URL, so you should be able to lift a keyword search URL from HIP (which might only return sets of 10 or 20 at a time) and have the script bring back as many results as you want (e.g. 100 or 1,000).
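To make the npp override a bit more concrete, here is a minimal sketch of how a URL's npp value could be swapped for $maxResults. It is not lifted from search.pl: the override_npp helper and the example URL are made up for illustration, and the real script may well do this differently.

```perl
use strict;
use warnings;

# Hypothetical helper (not taken from search.pl): force the npp
# parameter of a HIP search URL to the value of $maxResults.
sub override_npp {
    my ($url, $maxResults) = @_;

    # If the URL already carries npp=..., replace just its value
    if ($url =~ s/([?&;]npp=)[^&;#]*/$1$maxResults/) {
        return $url;
    }

    # Otherwise append npp, using "?" or "&" as appropriate
    my $sep = ($url =~ /\?/) ? '&' : '?';
    return $url . $sep . "npp=$maxResults";
}

# Illustrative URL only; substitute one copied from your own HIP
my $url = 'http://hip.example.org/ipac20/ipac.jsp?term=pullman&npp=10';
print override_npp($url, 100), "\n";
```

The regex route keeps the sketch free of non-core modules; it does not try to cope with a #fragment on a URL that has no npp yet.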

A sample Data::Dumper output of the data structure returned by the function is shown here:
http://www.daveyp.com/files/stuff/xmlparser/search_dump.txt
The dump shows you…
1) A total of 9 results were found, but $maxResults was set to 5, so only 5 results were returned.
2) Each “attribute” (e.g. title, author, ISBN) should have an array of values, and the ordering is consistent across all of the arrays: the entry at a given position in each array belongs to the same result. In other words, the second result returned by the keyword search has the following attributes:

title: The amber spyglass.
author: Pullman, Philip, 1946-
bib number: 407341
ISBN: 043999358X
publisher: Scholastic
class/call: 823.914 PUL

3) Some of the attributes (e.g. author) can return multiple values. In the sample dump, the first result has 2 authors and the fifth result has 2 titles.
4) Depending on how you’ve configured your HIP display, you’ll get a set of miscDetails. The miscHeaders will show you what they actually represent (these should match the labels you’ve set in the HIP admin). So, in the sample dump, none of the 5 results returned a value for the “format” field.
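As a rough illustration of points 2 and 3, the sketch below pulls every attribute for one result out of a set of parallel arrays. The key names (totalResults, title, author, isbn) and the use of a nested array for multiple values are assumptions on my part; check search_dump.txt for the names and shape the script really uses. Only the second result's values come from the dump; the first result's values are placeholders.

```perl
use strict;
use warnings;

# Hypothetical shape: one outer array per attribute, one inner array
# per result (so a result can hold several authors or titles).
my $results = {
    totalResults => 9,
    title  => [ ['(title of result 1)'],        ['The amber spyglass.']    ],
    author => [ ['(author A)', '(author B)'],   ['Pullman, Philip, 1946-'] ],
    isbn   => [ ['(isbn of result 1)'],         ['043999358X']             ],
};

# Gather all attributes for the result at a 0-based position
sub result_at {
    my ($data, $i) = @_;
    my %record;
    for my $attr (keys %$data) {
        # Skip plain scalars such as the total count
        next unless ref($data->{$attr}) eq 'ARRAY';
        # Join multiple values (e.g. two authors) into one string
        $record{$attr} = join '; ', @{ $data->{$attr}[$i] || [] };
    }
    return \%record;
}

# Position 1 is the second result
my $second = result_at($results, 1);
print "$_: $second->{$_}\n" for sort keys %$second;
```

Because the ordering is the same in every array, a single index is all you need to reassemble a result.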

If HIP finds just a single result for a keyword search, then it will automatically give you the full bib page (rather than a results page with just a single result). As parseSearch is expecting to parse a search page, it won’t understand a full bib page. If this happens, it will give you the following error value:
search returned a full bib page
If you want to use the data structure to create a search results page (e.g. for OpenSearch), then you’ll need to check for that error message and:
a) send the XML output ($content) through the parseBib function
b) use the data returned by parseBib to “fake” a single result.
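Steps a) and b) might look something like the sketch below. The two subs at the top are stand-in stubs so that the example runs on its own; the real parseSearch and parseBib may take different arguments and return different structures. The error and totalResults keys, and the search_or_single wrapper, are likewise assumptions for illustration.

```perl
use strict;
use warnings;

# Stand-in stubs (assumed shapes, not the real functions)
sub parseSearch { return { error => 'search returned a full bib page' } }
sub parseBib    { return { title => ['(title)'], author => ['(author)'] } }

# Try the search parser first; on the "full bib page" error, fall back
# to the bib parser and wrap its record as a one-item result set.
sub search_or_single {
    my ($content) = @_;
    my $search = parseSearch($content);

    my $error = defined $search->{error} ? $search->{error} : '';
    return $search unless $error eq 'search returned a full bib page';

    # a) send the same XML through parseBib
    my $bib = parseBib($content);

    # b) fake a single result: each attribute gets a one-element
    #    outer array holding the bib record's values
    my %faked = ( totalResults => 1 );
    $faked{$_} = [ $bib->{$_} ] for keys %$bib;
    return \%faked;
}

# $content would normally be the XML fetched from HIP
my $data = search_or_single('<xml>...</xml>');
print "results: $data->{totalResults}\n";
```

Wrapping the fallback in one function means whatever builds the results page only ever has to deal with one shape of data.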

What I hope to do during the next few weeks is to put together a CGI script that uses both of the functions (parseSearch and parseBib) to provide an OpenSearch interface. A couple of SirsiDynix sites have volunteered to let me use their HIP servers, so hopefully I’ll be able to provide them with a working OpenSearch facility!