Minutes from CDC Digital Collection TF

April 8, 2004

 

Derek Merleaux was our guest and discussed the Shoah Foundation Archive Project.  (http://www.library.yale.edu/mssa/vha/)

 

This project is part of a Mellon Grant the Shoah Foundation received to investigate how to make their digital archive available over the Internet to educational institutions.  Yale, Rice, and USC are all participating.

 

The Shoah Foundation testimonies are very large video files.  To facilitate distribution, Yale and the Shoah Foundation are using “Internet2”, a high-speed network that is used mainly by research and educational institutions and thus carries less traffic than the commercial Internet.  Yale has a large cache server that can keep about 300 testimonies on site.  If a patron needs a testimony that has not yet been digitized, Derek can request digitization, and it can be delivered in about 2 weeks.  It takes several hours to download each testimony from California, but testimonies can be “locked” onto the cache for viewing by a class or at the request of a researcher.  The testimonies can be viewed from anywhere on campus.

 

The Shoah Foundation is a very large archive with different cataloging and indexing policies from the Fortunoff Archive.  It contains 52,000 testimonies collected over the last 10 years.  The Shoah Foundation has now stopped collecting materials and is working on providing access to them.  Previously, they applied keywords and descriptors to segments of each testimony and wrote a summary of the whole testimony.  However, they recently stopped doing this and are streamlining the indexing process, so testimonies are not all cataloged in the same way.  The digitized videos are all in MPEG-2 format, a compromise between compression and viewability.  So far, 26,000 testimonies are indexed, and 7,000 of those are also digitized.

 

This project was a challenge for the Shoah Foundation.  It’s the first time researchers have used the material outside of their office.  Researchers can search all 26,000 indexed testimonies and get them on demand.  The interface the Shoah Foundation developed has not proved very intuitive, and users require a lot of support.  Yale is able to support this project because they have Derek and the user population is small.  Yale does get many outside researchers from New York and the East Coast who would like to use the material.

 

Derek is the contact person for this project.  He is currently working on making access to the material at Yale smoother and easier, using a simpler interface for searching the collections.  He is also working on what will be supported after June 30, when the grant funding ends.

 

In addition to providing access, the grant is also investigating how the Shoah material can be searched together with the Fortunoff Archive material.  The metadata for the Fortunoff Archive testimonies is much richer.  However, MSSA hopes to enable searching across both collections via an XML backend.

 

The Fortunoff material is currently not digitized.  However, a preservation grant has been awarded to the Archive to migrate old videotapes onto Beta.  In the process, a digitized copy can be made as a by-product.  A robot migrates the tapes, checks them digitally, and creates metadata about the digitized copy.  Because the digital files are very large and the storage space required will be quite expensive, only a small portion of the Fortunoff Archive material will be digitized.  This material is also much more restricted than the Shoah material, and will be available according to current policies (in the MSSA reading room only, no copies allowed).

 

This process will be very interesting for the rest of the library as there are many videos that could use digitization and/or better and easier access in the future.

 

 

For the rest of the meeting we discussed the documents we are preparing and the recommendations we will make to CDC.  Jen will forward these via email.  We will try to meet with Abe Parrish at the Map library, but Jen will follow up with him if a meeting is impossible.  We hope to have this finished by the end of April so we can meet with CDC.

 

We decided it’s not really our group’s job to set “digitization priorities”.  We currently feel that this is best done by selectors.  The new potential “Developer’s Forum” that may report to the IAC could serve as a useful place for selectors who are developing projects to get assistance, support, and information.

 

So our recommendations will include:

 

-Recommend that those considering new digital collections projects consult the Best Practices for Selection for Digitization and the information about current projects before beginning, to get a sense of the requirements they need to consider.

-Recommend another group, possibly the Developer’s Forum, as an appropriate place for selectors to get help and assistance.

-Recommend best practices/standards be developed for digitization.  This should be done with input from those who have experience and the Systems office.  Perhaps this can be done in conjunction with IAC.

-Recommend that there be more activity around funding, such as improved communication with the development office (library and central) about digital projects, increased awareness of grants and their deadlines, and potentially a grant mentoring program, where successful grant recipients can help newer grant-writers.

-Recommend that this group be re-formed, if necessary, in a few years to assess whether more centralized practices concerning selection for digitization are needed.

 

In addition to these recommendations, we will have the best practices document and the spreadsheet that summarizes different aspects of digital collections.  

 

Jen:  will work on best practices document and send draft via email

David:  will work on a paragraph for best practices document about preservation

Jae:  will work on a paragraph about the “unlocking collections” part

Katie:  volunteered to fill out the spreadsheet.

Emily:  also will help fill out the spreadsheet.