Background
Beinecke Library employs several database management systems to provide bibliographic control of its collections. Foremost among these is the University Library's NOTIS-based library management system, which is built upon MARC formats. Beinecke describes its printed material, its archival collections, its modern (i.e. post-1600) manuscripts, prints, photographs, and artwork in ORBIS, which it regards as the principal catalog for the library.
Beinecke does not, however, rely upon ORBIS for its accession records, nor does it regard ORBIS as a complete answer to all of the Library's descriptive needs. Three other systems provide additional administrative and descriptive control. Inmagic microcomputer-based database management products are used to maintain accession information and to provide structured, field-based access at the item level to large collections of non-archival materials such as our papyrus and stereographic photo collections. The library employs the SGML-based EAD Document Type Definition and Open Text's full-text indexing program (PAT) to permit sophisticated query and retrieval of information in its archival finding aids, virtually all of which are now encoded in SGML. Finally, Beinecke is using IBM's Digital Library (which uses either IBM's DB2 or Oracle's relational database) to manage the library's "Photonegative Digital Image Database." We expect to use IBM's product for additional digital projects, some of which may include audio and video components.
Beinecke anticipates that MARC-based cataloging will continue to be its core bibliographic control system. We are looking to more fully integrate our MARC records with the additional finding aids and databases that we have created or might create in the future. To that end, we have placed URLs for HTML and SGML versions of our archival finding aids in the 856 field of the MARC record that describes the archive. When a user telnets to ORBIS, the URL displays as text. When a user employs a web-based interface, the field functions as an active link to the finding aids. We have also demonstrated the ability to encode links within finding aids that lead to digital representations of the original material. We have not, as yet, employed the 856 field to provide a link from a bibliographic record to a digital representation of an original item in the library (such as a book or photograph), but we anticipate doing so in the future.
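The two behaviors of the 856 field described above (plain text in a telnet session, an active link in a web interface) can be illustrated with a minimal sketch. It assumes a simplified field with only subfield $u (the URI) and an optional subfield $z (public note); the class name, the method names, and the example URL are hypothetical and do not describe how ORBIS itself stores or renders the field.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Field856:
    """Simplified stand-in for a MARC 856 field (hypothetical class)."""
    url: str        # subfield $u: URI of the finding aid
    note: str = ""  # subfield $z: optional public note

    def as_text(self) -> str:
        """Render the field as plain text, as a terminal session would show it."""
        parts = [f"856 $u {self.url}"]
        if self.note:
            parts.append(f"$z {self.note}")
        return " ".join(parts)

    def as_link(self) -> str:
        """Render the field as an HTML anchor, as a web interface might."""
        label = escape(self.note or self.url)
        return f'<a href="{escape(self.url, quote=True)}">{label}</a>'


# Hypothetical URL, used for illustration only.
field = Field856(url="http://example.org/findingaids/sample.sgm", note="Finding aid")
print(field.as_text())
print(field.as_link())
```

The same stored value serves both interfaces; only the rendering differs.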
At the same time that we are pursuing the integration of finding aids, digital images, and MARC records, we are also exploring ways to provide expanded search and retrieval capabilities for our traditional bibliographic files. Open Text's database technology allowed us to build a full-text database of our finding aids that can be queried over the web. The database does not replace ORBIS, but it offers patrons a complementary tool that allows them to pursue free-text bibliographic information at a far greater depth than MARC supports.
The "Digital Photonegative Database"
Much of what we have done so far can be seen as applying new technology to well-established, conventional means of bibliographic control. Our "Digital Photonegative Database" is a more experimental approach to the challenge of cataloging, identifying, retrieving and displaying discrete items from the library's collections. These items, whether printed text, manuscript, original art, print or photograph, are frequently parts of a larger bibliographic unit such as a book, an archive, or a codex manuscript, but for many scholars they are also of interest as discrete objects. In addition, the library has several major collections of a non-archival character (such as papyrus or 19th-century photographs) in which the common format of the materials is an important organizing principle but for which individual MARC records are considered impractical. Our challenge is to find a means by which we can enable staff and patrons to sift efficiently through such material to retrieve what they need for research, teaching or administration.
To explore the issues associated with such material, Beinecke has digitized 13,000 to 15,000 copy negatives and transparencies from its "photonegative" file. These reproductions of items from our collections have been made over 35 years in response to reader requests for copy prints or as part of conservation and preservation projects undertaken by library staff. The file includes material from all six curatorial units at Beinecke and represents a wide range of originals. We chose the file for our pilot project because patrons had demonstrated an interest in its contents, because it included all areas of the library, and because it allowed us to avoid handling rare originals in our initial scanning project.
The photonegative file has an internal filing order, but its contents have never been "cataloged." Simple information about an original item, such as its call number, author and title, was recorded on a sleeve that protected the negative, slide or transparency, but the information is much briefer than a traditional catalog record. We decided not to create traditional full records but to use the information that was available on the sleeves to create a basic database. For this reason, we do not see the project's data structure as a standard for future projects. We do, however, think that our experience reveals several issues that it will be important to address in designing any future digital database.
The data structure for the project includes a record ID field; 17 fields of bibliographic description concerning the item; 5 fields that record information about the creation of the database record and the digital file; 12 fields that record the size and dimensions of the digital file and its display derivatives; and 5 "spare" fields that can be renamed and employed in the current project should we discover the need to add further structured information to the database.
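The grouping just described can be sketched as a simple mapping from group to attribute names. The attribute names are those listed in this document's attribute table; the assignment of each attribute to a group is an inference from the counts given above, and the group labels and the spare field names (Spare_1 to Spare_5) are hypothetical placeholders.

```python
# Attribute names come from the attribute table in this document.
# Group labels and the Spare_N names are hypothetical placeholders.
ATTRIBUTE_GROUPS = {
    "record_id": ["Barcode"],
    "bibliographic": [
        "Call_No", "Box", "Folder", "Page_No", "Image_No", "File_Heading",
        "Item_Author", "Item_Title", "Item_Date", "Item_Place",
        "Source_Author", "Source_Title", "Source_Date", "Source_Place",
        "Repro_Type", "Credit_Line", "Caption",
    ],
    "creation": ["Input_Date", "Initials", "Scan_Date", "Load_Date", "CD_Label"],
    "image_size": [
        f"{kind}_Image_{measure}"
        for kind in ("Full", "Zoom", "Screen", "Thumbnail")
        for measure in ("Size", "Width", "Height")
    ],
    "spare": [f"Spare_{n}" for n in range(1, 6)],
}


def group_counts(groups):
    """Return the number of attributes in each group."""
    return {name: len(fields) for name, fields in groups.items()}


counts = group_counts(ATTRIBUTE_GROUPS)
print(counts)
print("total attributes:", sum(counts.values()))
```

Under this reading, the counts agree with the figures in the text: 1 record ID, 17 bibliographic, 5 creation, 12 size, and 5 spare fields.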
IBM refers to the fields as "attributes." The table below describes them.
| Barcode | The record ID. A barcode is attached to each negative and read into the database record and IBM's object server. |
| Call_No | The Beinecke call number of the original item. |
| Box | For archival collections; the box in which the original rests |
| Folder | For archival collections; the folder in which the original rests |
| Page_No | Used for codices; the original page |
| Image_No | Used when multiple images appear on a page |
| File_Heading | The title of the folder in the photonegative file in which the negative or transparency rests. |
| Item_Author | The creator of the item digitized (when known) |
| Item_Title | The title of the item digitized (derived or supplied) |
| Item_Date | The date of creation of the item digitized (when known) |
| Item_Place | The place of creation of item digitized (when known) |
| Source_Author | Creator of the source in which item appears |
| Source_Title | Title of source in which item appears |
| Source_Date | Date of creation of source in which item appears |
| Source_Place | Place of creation of source in which item appears |
| Repro_Type | The kind of photo-duplicate which was scanned |
| Credit_Line | The curatorial collection in BRBL in which item rests |
| Caption | A brief description of image for display with thumbnail |
| Input_Date | Date on which record was created |
| Initials | Initials of person who created the record |
| Scan_Date | Date on which item was digitized |
| Load_Date | Date on which |
| CD_Label | Label of CD with the archival TIFF digital master for item |
| Full_Image_Size, Full_Image_Width, Full_Image_Height, Zoom_Image_Size, Zoom_Image_Width, Zoom_Image_Height, Screen_Image_Size, Screen_Image_Width, Screen_Image_Height, Thumbnail_Image_Size, Thumbnail_Image_Width, Thumbnail_Image_Height | These fields contain the file size and pixel arrays for the archival image and the 3 derivative files that are used for screen display. The maximum sizes for those derivatives are 200 x 200, 700 x 700 and 1400 x 1400. The scale ratio of the original is preserved in each case so most images are either. |
| 5 Spare Fields | Fields that can be renamed and employed if additional structured information is needed |
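The derivative size limits given in the table (200 x 200, 700 x 700 and 1400 x 1400, with the scale ratio of the original preserved) can be illustrated with a short sketch. It assumes that the limits correspond to the Thumbnail, Screen and Zoom derivatives respectively, and that images smaller than a limit are not enlarged; neither assumption is stated in the source, and the function names are hypothetical.

```python
from typing import Dict, Tuple

# Assumed pairing of derivative names with the maximum sizes in the table.
DERIVATIVE_LIMITS = {"Thumbnail": 200, "Screen": 700, "Zoom": 1400}


def fit_within(width: int, height: int, max_side: int) -> Tuple[int, int]:
    """Scale (width, height) to fit a max_side x max_side box.

    The aspect ratio is preserved. Images already inside the box are
    returned unchanged (an assumption; the source does not say).
    """
    if width <= 0 or height <= 0 or max_side <= 0:
        raise ValueError("dimensions must be positive")
    scale = min(max_side / width, max_side / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


def derivative_sizes(width: int, height: int) -> Dict[str, Tuple[int, int]]:
    """Compute pixel dimensions for each derivative of a master image."""
    return {
        name: fit_within(width, height, limit)
        for name, limit in DERIVATIVE_LIMITS.items()
    }


# Hypothetical 3000 x 2000 pixel master image.
for name, size in derivative_sizes(3000, 2000).items():
    print(name, size)
```

Because only the longer side reaches the limit, a non-square original yields derivatives in which one dimension is smaller than the maximum.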
It is important to distinguish the fields chosen for Beinecke's photonegative project from the capabilities of IBM's Digital Library. Our fields are not inherent to Digital Library, which can accommodate a wide variety of data structures. Indeed, one feature of Digital Library is the ability to interrogate multiple databases with distinct data structures at the same time.
The chief lesson about meta-data that we have learned thus far from the photonegative file project is the need to identify in our database not only information about a specific item but also information about its original context. Thus we have created distinct "Item" and "Source" fields. This seems to me to be a direct result of our desire to allow people to conduct a wide variety of searches against our file. In some cases they will be seeking to retrieve groups of contextually related material, but in other cases they will be seeking items without regard to their context. Present MARC and AACRII cataloging conventions do not easily accommodate this kind of strategy, nor are traditional library management systems like NOTIS particularly well equipped to do so. If we are to achieve the fullest benefit from digital image databases, it seems to me that such systems must offer the flexibility and power to support such distinct uses.
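The two search strategies described above, by context and without regard to context, can be sketched with a few in-memory records. The attribute names follow the project's data structure; the sample records and the search function are invented for illustration and do not reflect the query interface of IBM's Digital Library.

```python
# Invented sample records; the keys follow the project's attribute names.
RECORDS = [
    {"Barcode": "1001", "Item_Title": "Harbor view",
     "Source_Title": "Sample Travel Narrative"},
    {"Barcode": "1002", "Item_Title": "Map of the coast",
     "Source_Title": "Sample Travel Narrative"},
    {"Barcode": "1003", "Item_Title": "Map of the river",
     "Source_Title": "Sample Atlas"},
]


def search(records, **criteria):
    """Return records in which every named field contains the given text.

    Matching is case-insensitive; a missing field never matches.
    """
    def matches(record):
        return all(
            text.lower() in (record.get(field) or "").lower()
            for field, text in criteria.items()
        )
    return [record for record in records if matches(record)]


# Contextual search: everything reproduced from one source.
by_source = search(RECORDS, Source_Title="travel narrative")
# Item search: maps, regardless of the source they appear in.
by_item = search(RECORDS, Item_Title="map")

print([r["Barcode"] for r in by_source])
print([r["Barcode"] for r in by_item])
```

Keeping "Item" and "Source" values in separate fields is what lets the same record answer both kinds of query.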
Our data structure also points out the importance of administrative control fields within the record. Our structure includes only a few such fields; I can imagine a number of other fields (some of which would resemble the fixed fields in the MARC format) that would be valuable or essential.
Submitted by George Miles
October 28, 1998