Integrated Access Council
April 18, 2005
SML Room 409
Present: Katie Bauer, Carol DeNatale, Ann Green, Katherine Haskins, Julie Linden, Fred Martz, Kim Parker, Kalee Sprague, David Stern, Rich Szary
Absent: Dale Askey, Matthew Beacom, Meg Bellinger, Dan Chudnov, Kenny Marone, Jack Meyers, Bobbie Pilette, Tom Saul, Joan Swanekamp, Frank Turner
Guest: David Gewirtz
1. Review Minutes of the March 21 meeting
Ann announced that in Meg’s absence she and Fred would keep the meeting on track. She asked if there were any comments or corrections to the March 21 minutes, which contained a summary of the MetaLib presentation by Audrey Novak and Fred’s presentation on the Interoperability Diagrams. Carol asked if the minutes can be shared; Ann answered yes, and noted that they are posted on the website. Since there were no corrections, the March 21 minutes stand as written.
2. Discussion of the Interoperability Diagrams – Fred
Fred gave a quick summary of the diagrams that he presented at the last meeting. The 1st diagram is our present state of affairs – the Library’s front door acts as a directory of our holdings. The 2nd diagram shows a platform-specific federated search across collections, but it covers only a limited range of resources. This solution is desirable if it can be applied to a wider range of resources because it brings both metadata and content together in a common workspace. The 3rd diagram is focused on federated searching through MetaLib with two add-on products: MetaIndex for OAI harvesting and X-Server for feeding retrieved information to an external application such as uPortal or Sakai. The 4th diagram shifts the focus away from MetaLib federated searching toward possible uses of OAI metadata harvesting for integration of in-house Yale resources residing in multiple collections.
Fred discussed the points in “Limitations of Federated Searching” (http://www.infotoday.com/it/oct03/hane1.shtml). The most important point was that federated searching is limited by the capabilities of the target resource. Fred also referred to the article “Mad about Metasearching” (http://www.rlg.org/en/page.php?Page_ID=7721), in which Laine Farley of CDL points out that federated search tools work best for targeted searching and cannot deliver the kind of comprehensive search that Google does. OAI harvesting is more like Google in that a central store of metadata is used for searching. Harvesting also provides an opportunity to post-process the metadata in order to deliver enhanced search results.
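The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration, not a description of MetaLib or of any OAI software: the two sample titles, the two target search functions, and all function names are hypothetical. Target A stands in for a resource with limited search capabilities, and the central index stands in for a store of harvested metadata that has been post-processed (here, lowercased and stripped of punctuation) before searching.

```python
import re
from collections import defaultdict

# Hypothetical sample titles held by two independent collections.
RECORDS_A = ["Maps of New Haven"]
RECORDS_B = ["Colonial maps, 1750"]


def search_a(term):
    # Assumed target A: only exact, case-sensitive whole-word matching.
    return [t for t in RECORDS_A if term in t.split()]


def search_b(term):
    # Assumed target B: case-insensitive substring matching.
    return [t for t in RECORDS_B if term.lower() in t.lower()]


def federated_search(targets, term):
    """Send the query to every target at search time.

    Each target answers only as well as its own search function allows.
    """
    hits = []
    for name, search_fn in targets.items():
        hits.extend((name, title) for title in search_fn(term))
    return hits


def normalize(title):
    # Post-processing step: lowercase and split into alphanumeric words.
    return re.findall(r"[a-z0-9]+", title.lower())


def build_central_index(harvested):
    """Build one inverted index over metadata harvested from all sources."""
    index = defaultdict(set)
    for source, title in harvested:
        for word in normalize(title):
            index[word].add((source, title))
    return index


targets = {"a": search_a, "b": search_b}
federated_hits = federated_search(targets, "maps")

harvested = [("a", t) for t in RECORDS_A] + [("b", t) for t in RECORDS_B]
central_index = build_central_index(harvested)
central_hits = sorted(central_index["maps"])

print(federated_hits)  # only target B matches the lowercase query
print(central_hits)    # both records are found in the normalized index
```

In this toy case the federated search misses the record in target A because that target cannot match "maps" against "Maps", while the harvested index finds both records because normalization was applied once, centrally.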
There was discussion about the complexity of search results from federated searching. David S. pointed out that domain-specific native modes of searching may always be better than using OAI and harvested metadata; this is especially true when searching resources such as chemical data. He then asked to what extent SFX could serve as an aggregator. Katie said we could use it for some searching, but not for a federated search. David S. wanted to confirm that SFX is a smart agent: if you have a specific citation and want links for it, you can use the citation search. Katie said yes, but they have found that many users find SFX confusing beyond the basic search.
Fred then talked about OAI service providers. Content service providers harvest not just metadata but also the digital objects (e.g. full text and images) to which the metadata refer. In these cases, the full text could be searched as part of the service. If we implemented a digital repository that also functioned as an OAI service provider, it could serve as a combined metadata registry and content disseminator. Rich asked what the advantage of combining these two applications would be, and whether we would be complicating the situation. David Gewirtz said we probably will not have an omnibus repository, but rather there will be project-specific repositories that rely on OAI as the aggregator. You harvest metadata and then put links out to that content. Fred said that we will always have databases that need to or want to remain independent. In other cases we would want to consider the potential of a combined content service. The goal would be to manage multiple collections in a unified workspace using an OAI content service to draw from multiple collections. Basic core metadata, including extended metadata schemas like MODS or VRA, is a prerequisite for this work to succeed.
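The "harvest metadata, then link out to the content" pattern can be sketched as follows. This is a minimal sketch under stated assumptions, not a description of any system discussed at the meeting: the sample response, the two example.org repositories, and the function name harvest_links are hypothetical. The XML namespaces are those of OAI-PMH 2.0 and unqualified Dublin Core (oai_dc). A real harvester would fetch ListRecords responses over HTTP and follow resumption tokens, which is omitted here.

```python
import xml.etree.ElementTree as ET

# Namespaces used by OAI-PMH 2.0 responses carrying oai_dc metadata.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical, abbreviated ListRecords response (repositories are made up).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repo-a.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample map</dc:title>
          <dc:identifier>http://repo-a.example.org/objects/1</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header><identifier>oai:repo-b.example.org:7</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample finding aid</dc:title>
          <dc:identifier>local-call-number-7</dc:identifier>
          <dc:identifier>http://repo-b.example.org/aids/7</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def harvest_links(xml_text):
    """Extract the OAI identifier, title, and first URL from each record.

    Only metadata is stored; the content stays in its home repository
    and is reached through the link.
    """
    root = ET.fromstring(xml_text)
    index = []
    for record in root.iterfind(".//oai:record", NS):
        oai_id = record.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        title = record.findtext(".//dc:title", default="", namespaces=NS)
        # A record may carry several dc:identifier values; keep the first URL.
        link = next(
            (
                e.text
                for e in record.iterfind(".//dc:identifier", NS)
                if e.text and e.text.startswith("http")
            ),
            None,
        )
        index.append({"oai_id": oai_id, "title": title, "link": link})
    return index


index = harvest_links(SAMPLE_RESPONSE)
for entry in index:
    print(entry["title"], "->", entry["link"])
```

The resulting list is the aggregator's view: one entry per record, drawn from several project-specific repositories, each pointing back to where the object actually lives.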
How do we get that unified reader experience where you can pull content from multiple collections? It would be possible to combine images from multiple repositories to full advantage. Katherine asked if you could do this on the fly with your own templates. David replied that you could obtain temporary access to the content if you wanted to. Fred said the scenario Katherine mentioned could work. You would retrieve each image individually from various sources and gather them on your own desktop. The process would be less efficient than the OAI solution, however.
David Stern said he would like to be able to integrate information from his own local desktop with the results of a search; he felt it would be better to keep the harvester separate from the repository to avoid complications. Kim said this sounds like a Google search of one’s desktop and the web. David answered, “not quite.”
Kate asked if we could use Google Scholar as our harvester. Fred said anything exposed in OAI could be Googled, keeping in mind that Google cannot index licensed domains. David Gewirtz said it is our special collections that we want to build upon, keeping the content ourselves and sending metadata to Google.
Julie asked what IAC should do in response to these discussions. Fred said we are trying to determine options beyond MetaLib implementations by exploring goals and avenues. One step is to get OAI-enabled metadata exposed and to access collections through MetaIndex and Google. Work has begun on harvesting OAI-compliant metadata from the Finding Aids, StatCat, and the DL at Beinecke.
Rich asked if this approach is policy. Fred said it could be if IAC members agree that it should be. Kim said let’s try several things before we decide on policy, considering the amount of work involved in making all collections OAI-compliant. She suggested that the pilots go ahead first, since it is worth experimenting and gauging the impact of enabling OAI and harvesting OAI metadata. Rich asked whether there is a practical approach to making collections OAI-enabled; we need to come up with the details. In regard to public access to OAI metadata, Kalee noted that you can OAI-enable collections without having to allow external harvesting of them. Fred said the pilots would also help in prioritizing effort between internal and external needs. Kim said there is not yet agreement regarding direct access to Orbis records via Open WorldCat due to ILL issues. David said this will be on the agenda of the Public Services Management Council. Julie asked how we should proceed with the decision on repositories and OAI. Fred said OAI does not yet fully support content harvesting, so this is a long-term issue and repository planning can go forward. Rich said there are questions that need to be asked in regard to policies about enabling OAI harvesting, especially regarding domains, the impact of access, etc. There should be a standard protocol used each time collections are enabled. The protocol will help in prioritization and planning.
David Stern asked if IAC can get a general overview of OAI, including costs, etc. Fred said there would be an update on the outcome of the pilot projects now under way once they are done. David said we don’t need full finished data and would like more information sooner. Ann noted that the Metadata committee is working on core descriptive metadata recommendations and will need to consider OAI compliance. There are questions about how to connect the OAI pilots, the metadata work, public access concerns, and policies and protocol development. The library needs a group to develop and coordinate OAI policy.
Ann announced that Katie’s discussion about usability and user needs assessment will be moved to the May IAC meeting.