The obvious answer is to digitize all of our licenses and make them available over the network for consultation. It would also be extremely beneficial to be able to search across the licenses for specific terms, which means applying optical character recognition (OCR). There are a couple of good reasons to enable searching of digitized licenses. One arises when an enterprising person comes up with a previously unconsidered way of using resources. It would be very convenient to search across all our accepted licenses at once to see whether a particular use is allowed or disallowed, to identify which licenses would require renegotiation, and to decide whether renegotiating that many licenses is worthwhile for this particular use.
Another reason to search our accepted licenses for a particular term arises during the negotiation of a new resource. Less experienced (or even reasonably experienced) negotiators might want to locate alternative phrasing to propose for an objectionable clause. Seeing what was accepted in past licenses would be very instructive.
We propose to photocopy the licenses on file and send them to RIS for scanning into TIFF format. TIFF is an appropriate first imaging format and can be used to conduct OCR tests or to convert to other image delivery formats. It is invaluable as a primary image format, since it retains all the original scanned information and can serve as the source for future image migration paths.
While the photocopying and scanning are under way, we propose to test various delivery format options, including different versions of OCR'd PDF documents. The tests will balance readability, file size, and OCR clean-up effort against acceptable searchability, and will also check whether web-available search engines can search across different file types with a minimum of intervention.
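As one illustration of how searchability might be checked during these tests, the following is a minimal sketch in Python. It assumes the OCR text of each license has already been extracted to plain text; the function name `search_licenses`, the file names, and the sample license wording are all hypothetical and are not part of the proposal.

```python
import re


def search_licenses(texts, phrase, context=40):
    """Return (license_name, snippet) pairs for each case-insensitive match.

    `texts` maps a license name to its OCR'd plain text (hypothetical input).
    """
    words = phrase.split()
    if not words:
        return []
    # Allow any run of whitespace between words, since OCR output often
    # breaks a phrase across lines.
    pattern = re.compile(r"\s+".join(re.escape(w) for w in words), re.IGNORECASE)
    hits = []
    for name, text in sorted(texts.items()):
        for match in pattern.finditer(text):
            start = max(0, match.start() - context)
            end = min(len(text), match.end() + context)
            # Collapse line breaks so the snippet reads as one line.
            snippet = " ".join(text[start:end].split())
            hits.append((name, snippet))
    return hits


# Hypothetical sample data standing in for OCR output of two licenses.
sample = {
    "vendor_a.txt": "The Licensee may use the materials for interlibrary\n"
                    "loan in accordance with applicable law.",
    "vendor_b.txt": "Course packs are permitted. Remote access is limited "
                    "to authorized users.",
}

for name, snippet in search_licenses(sample, "interlibrary loan"):
    print(name, "->", snippet)
```

A phrase that OCR has garbled (for example, a misread character) would not match in this sketch, which is the kind of clean-up versus searchability trade-off the tests are meant to weigh.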
The project will then proceed to convert the licenses and mount them openly (non-confidential documents) or securely (confidential documents), as appropriate.
A second, very important aspect of this project will be examining how to sustain the effort by continuing to mount new licenses as they are received. It will be important to address licenses that arrive in electronic format from the start, and to find a workflow that mounts the documents with minimal effort. This SCOPA project is thus part of a larger goal, and will lay the groundwork for continued work to manage our licenses to electronic resources efficiently and effectively.
| Student cost per week $24.00 x 32 weeks | $768.00 |
| RIS scanning (quote for 500 pages) | $200.00 |
| Total request | $968.00 |