SCOPA Grant Proposal: 2000


Proposal for Digitally Available Electronic Resource Licenses

Background

Yale University Library licenses a growing number of electronic resources. The license documents are a record of our legal responsibilities, and they also delineate exactly who may use each electronic resource and in what fashion. While paper copies of these licenses are sometimes kept in key locations for consultation, it is logistically impossible to maintain paper copies in all library locations. A short summary of permitted uses is available on a library website, but such a summary cannot cover every use someone might want to make of an electronic resource, nor can it answer questions unrelated to permitted uses.

The obvious answer is to digitize all of our licenses and make them available over the network for consultation. It would also be extremely beneficial to be able to search across the licenses for specific terms, which means applying optical character recognition (OCR) to the scanned documents. There are two good reasons to enable searching of digitized licenses. The first arises when an enterprising person comes up with a previously unconsidered way of using resources. It would be very convenient to search across all our accepted licenses at once to see whether that use is allowed or disallowed, to identify which licenses would require renegotiation, and to decide whether renegotiating that many licenses is worthwhile.

The second reason arises during the negotiation process for a new resource. Less experienced (or even reasonably experienced) negotiators might want to find alternative phrasing to propose in place of an objectionable clause. Seeing what was accepted in past licenses would be very instructive.
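As an illustration of the kind of cross-license search described above, the following is a minimal sketch only, not a description of the system the project will build. It assumes each license has already been converted by OCR to a plain-text file in a single directory; the function name find_term, the file names, and the sample sentences are all hypothetical.

```python
import tempfile
from pathlib import Path


def find_term(license_dir, term):
    """Return {file name: [matching lines]} for a case-insensitive search.

    Assumes one OCR'd plain-text file (*.txt) per license in license_dir.
    """
    needle = term.lower()
    hits = {}
    for path in sorted(Path(license_dir).glob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="replace")
        lines = [ln.strip() for ln in text.splitlines() if needle in ln.lower()]
        if lines:
            hits[path.name] = lines
    return hits


# Hypothetical sample data standing in for OCR output; not real license text.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "vendor_a.txt").write_text(
        "Interlibrary loan is permitted.\nNo remote access.\n", encoding="utf-8"
    )
    Path(tmp, "vendor_b.txt").write_text(
        "Walk-in users may view content.\n", encoding="utf-8"
    )
    results = find_term(tmp, "interlibrary loan")

print(results)
```

A plain substring match of this kind is sensitive to OCR errors, which is one reason the trade-off between OCR clean-up and acceptable searchability matters.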

Methodology

Some preliminary work toward digitizing our electronic resource licenses has already been done. Initial sorting and analysis is under way on the licenses on file in the AUL-Collections office. This process identifies the licenses that could be digitized, records their status as confidential or non-confidential, and enters them in a controlling database for future reference.

We propose to photocopy the licenses on file and send them to RIS for scanning into TIFF format. TIFF is an appropriate first imaging format and can be used for OCR tests or for conversion to other image delivery formats. It is invaluable as a primary image format, since it retains all the original scanned information and can serve as a source for future image migration paths.

While the photocopying and scanning are under way, we propose to test various delivery format options, including different versions of OCR'd PDF documents. The tests will weigh readability, file size, and the effort of OCR clean-up against acceptable searchability, and will also examine whether web-available search engines can search across different file types with a minimum of intervention.

The project will then proceed to convert and mount the licenses, making non-confidential documents openly available and confidential documents available securely.

A second, very important aspect of this project will be examining how to sustain the effort by continuing to mount new licenses as they are received. It will be important to address licenses that arrive in electronic format from the start, and to find a workflow that mounts documents with a minimal amount of effort. This SCOPA project is thus part of a larger goal, and will lay the groundwork for continued work to manage our licenses to electronic resources efficiently and effectively.

Timeline

January-March 2000
Photocopy all extant accepted licenses in AUL-Collections files and send them to RIS for TIFF conversion. Identify any problematic document sizes and quality during this time as well.
April-June 2000
Experiment with different format delivery options and cross-format searchability options. Begin assessing new license arrival workflow.
July-October 2000
Convert scanned documents into the chosen format(s), and enable searching capabilities as appropriate.
November-December 2000
Clean up any problems and put in place ongoing routines of license mounting.

Benefit

In addition to benefits cited in the project background above, the exploration of the capabilities for online delivery of paper file documents will benefit the library as a whole, since other files of general interest undoubtedly exist elsewhere in the library.

Budget

The major expenses associated with this project would be paying RIS to scan the licenses and hiring a student assistant to perform the basic tasks of photocopying and format conversion from TIFF. We already possess the software necessary for the format conversion. The hourly rate for a student to do this type of work is ~$8/hr. The work would most likely take no more than 3 hours a week during the project year, and we anticipate no more than 32 weeks of work.
 
Student cost ($24.00 per week x 32 weeks): $768.00
RIS scanning (quote for 500 pages): $200.00
Total request: $968.00
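
The budget figures above can be checked with a short calculation. This is only a sketch of the arithmetic; the variable names are illustrative, and every number comes from the budget as stated (an hourly rate of about $8, 3 hours a week, 32 weeks, and the $200.00 RIS quote).

```python
# All figures are taken from the budget text; names are illustrative only.
HOURLY_RATE = 8.00       # approximate student hourly rate, in dollars
HOURS_PER_WEEK = 3       # upper estimate of weekly student hours
WEEKS = 32               # upper estimate of weeks of work
SCANNING_QUOTE = 200.00  # RIS quote for scanning 500 pages

weekly_cost = HOURLY_RATE * HOURS_PER_WEEK       # student cost per week
student_total = weekly_cost * WEEKS              # student cost for the project
total_request = student_total + SCANNING_QUOTE   # total amount requested

print(f"Student cost per week: ${weekly_cost:.2f}")
print(f"Student cost total:    ${student_total:.2f}")
print(f"Total request:         ${total_request:.2f}")
```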
