One of the many issues we've struggled with in designing this system is that we can't provide users with the thing itself. We do have a measure of how widely held each cataloged item is, and we promote title-clusters that are widely held above items that may have a higher search-term relevancy but are less widely held. That widely-heldness, we believe, indicates how important and respected the work is, in a strong analogy to Google's link measures. In fact, since it costs money to add a book to a library collection, we expect widely-heldness to measure authority better than a link score does: how often do libraries acquire books just so that they can be refuted, compared with how often an author links to something he or she intends to refute?
But what if our ranking is always putting things in front of users that they can't get?
We proposed yesterday to use actual RedLightGreen searches to identify a sample of highly ranked items, which we would then look for in our pilot partner's catalog. To facilitate that test, I will log the top five results for each search, recording for each result the title-cluster ID, the search-term relevancy, and the widely-heldness score.
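The logging step could look something like the sketch below. It is written in Python purely for illustration; the field names (`cluster_id`, `relevancy`, `held_score`), the `search_id` column, and the CSV layout are all assumptions, not the actual RedLightGreen log format. It also assumes the results arrive already in the order they were shown to the user.

```python
import csv
import io

def log_top_results(writer, search_id, results, top_n=5):
    """Write one row per top-ranked result for a single search.

    `results` is assumed to be in display order already. Each row holds:
    search ID, rank, title-cluster ID, search-term relevancy, widely-heldness score.
    """
    for rank, result in enumerate(results[:top_n], start=1):
        writer.writerow([
            search_id,
            rank,
            result["cluster_id"],
            result["relevancy"],
            result["held_score"],
        ])

# Illustrative use with made-up values; an in-memory buffer stands in for the log file.
buffer = io.StringIO()
log_top_results(
    csv.writer(buffer),
    "search-1",
    [{"cluster_id": "tc-%d" % n, "relevancy": 0.5, "held_score": 10 * n} for n in range(7)],
)
```

Keeping the rank in each row means later analysis does not have to rely on the order of lines in the log.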
The test program would then be a Perl script that takes a list of title-cluster IDs as input. The program would query our database for the title of each title-cluster (and perhaps all the identifying numbers for the editions in that title-cluster), and then search the partners' catalogs for those items. Actually, that process might be best separated into two scripts: one to derive the search terms from our database, another to do the searches.
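Here is a rough sketch of that two-stage split, in Python rather than Perl and with stand-ins for everything external. The `lookup` and `catalog_search` callables are hypothetical placeholders for our database query and the partner catalog search; neither reflects a real interface, and a real catalog search on titles would need fuzzier matching than the exact lookup used here.

```python
def derive_search_terms(cluster_ids, lookup):
    """Stage 1: turn title-cluster IDs into the terms to search for.

    `lookup` is a hypothetical callable returning a record with a "title"
    and optional "identifiers" for a cluster ID, or None if the ID is unknown.
    """
    terms = []
    for cluster_id in cluster_ids:
        record = lookup(cluster_id)
        if record is None:
            continue  # skip IDs that are not in our data
        terms.append({
            "cluster_id": cluster_id,
            "title": record["title"],
            "identifiers": list(record.get("identifiers", [])),
        })
    return terms

def check_catalog(terms, catalog_search):
    """Stage 2: report, per title-cluster, whether the catalog has a match.

    `catalog_search` is a hypothetical callable that returns a truthy value
    when the partner catalog holds something matching the given term.
    """
    report = {}
    for entry in terms:
        by_identifier = any(catalog_search(i) for i in entry["identifiers"])
        report[entry["cluster_id"]] = bool(by_identifier or catalog_search(entry["title"]))
    return report

# Illustrative use: a dict stands in for our database, a set for the partner's holdings.
clusters = {
    "tc-1": {"title": "Example Title One", "identifiers": ["id-111"]},
    "tc-2": {"title": "Example Title Two", "identifiers": ["id-222"]},
}
partner_holdings = {"id-111"}
terms = derive_search_terms(["tc-1", "tc-2", "tc-9"], clusters.get)
report = check_catalog(terms, lambda term: term in partner_holdings)
```

Because the first stage only produces a list of terms, its output can be saved and reused against each partner's catalog without going back to our database.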
With the top five results for each search recorded in the log, I will also be able to test how often a classic title appears as a result. Hamlet, for example, has 1455 editions with a widely-heldness measure of 3597. In our current sample data set, if you search on Denmark, Hamlet is the second result, and the first result if you want an English-language item.
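Counting how often a classic turns up could be done along these lines. Again this is a Python sketch under assumptions: the log is taken to be rows of (search ID, rank, title-cluster ID), and the set of "classic" cluster IDs is something we would have to compile by hand. The IDs below are invented.

```python
def classic_hit_rate(rows, classic_ids):
    """Return the fraction of logged searches whose top results include a classic.

    `rows` is an iterable of (search_id, rank, cluster_id) tuples, an assumed
    log layout. `classic_ids` is a hand-compiled set of title-cluster IDs.
    """
    searches = set()
    searches_with_classic = set()
    for search_id, _rank, cluster_id in rows:
        searches.add(search_id)
        if cluster_id in classic_ids:
            searches_with_classic.add(search_id)
    if not searches:
        return 0.0
    return len(searches_with_classic) / len(searches)

# Illustrative use with invented IDs: one of the two searches surfaces a classic.
sample_rows = [
    ("search-1", 1, "tc-travel-guide"),
    ("search-1", 2, "tc-hamlet"),
    ("search-2", 1, "tc-cookbook"),
]
rate = classic_hit_rate(sample_rows, {"tc-hamlet"})
```

The same pass could also collect the rank at which each classic appears, which would show whether classics are crowding out the first position or merely showing up somewhere in the top five.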
Posted by judielaine at July 22, 2003 10:38 AM