*** If a file passes all parsers but refuses to index in OpenText,
the cause could be a tag that is not properly closed. This can take
the form of
</title
in which the closing bracket is missing, or the form of
<archdesc
level="collection">
in which a line wrap breaks the tag into two sections. Parsers can
read such a tag, but it confuses OpenText.
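One way to look for both problems before indexing is to scan each line
for a "<" that is not followed by a ">" on the same line. The Python
sketch below is only an illustration and is not part of the original
workflow. The function name find_broken_tags is hypothetical, and the
sketch assumes that a literal "<" never appears in the text content
itself and that comments and declarations do not span lines.

```python
import re

# A "<" followed by non-bracket characters up to another "<" or the
# end of the line, i.e. a tag that is never closed on this line.
UNCLOSED_TAG = re.compile(r"<[^<>]*(?:<|$)")


def find_broken_tags(text):
    """Return (line_number, line) pairs for lines holding an unclosed tag.

    Hypothetical helper: flags both a missing ">" (</title) and a tag
    that wraps onto the next line (<archdesc / level="collection">).
    """
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if UNCLOSED_TAG.search(line):
            problems.append((number, line.strip()))
    return problems


if __name__ == "__main__":
    sample = '<title>Guide</title\n<archdesc\nlevel="collection">\n<p>ok</p>'
    for number, line in find_broken_tags(sample):
        print(number, line)
```

Each reported line can then be checked by hand and the tag rejoined or
closed before the file is sent to OpenText.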
1. HAVE YOUR TEST FILES, STYLE SHEET, NAVIGATOR, AND ANY AUXILIARY
FILES (INCLUDING IMAGES AND HELPER APPLICATION EXECUTABLES) IN A COMMON
DIRECTORY.
2. MODIFY THE "CATALOG" FILES IN \SOFTQUAD\PANOPRO2\CATALOG TO INCLUDE
NOTATION FOR THE EAD.DTD. THE FOLLOWING LINES SHOULD BE INSERTED AT
THE END:
PUBLIC "-//UC BERKELEY//DTD FINDING AIDS//EN" "findaid.dtd"
PUBLIC "-//Society of American Archivists//DTD ead.dtd (Encoded Archival
Description (EAD))//EN" "ead.dtd"
PUBLIC "-//Society of American Archivists//DTD ead.dtd (Encoded Archival
Description (EAD) VERSION 1.0)//EN" "ead.dtd"
PUBLIC "-//Society of American Archivists//DTD eadnotat.ent (EAD Notation
Declarations)//EN" "eadnotat.ent"
PUBLIC "-//Society of American Archivists//DTD eadbase.ent (EAD Basic
Declarations)//EN" "eadbase.ent"
PUBLIC "-//Society of American Archivists//DTD eadchars.ent (EAD Special
Characters)//EN" "eadchars.ent"
PUBLIC "-//Society of American Archivists//DTD eadtable.ent (EAD Table
Elements)//EN" "eadtable.ent"
3. COPY ALL COMPONENTS OF THE EAD.DTD TO THE DIRECTORY \SOFTQUAD\PANOPRO2\CATALOG
- THESE INCLUDE:
EADSGML.DEC
EAD.DTD
EADGRP.DTD
EADBASE.ENT
EADCHARS.ENT
EADNOTAT.ENT
EADTABLE.ENT
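Step 3 can be double-checked with a short script that compares the
contents of the catalog directory against the list above. This is a
minimal sketch, not part of the original procedure. The names
EAD_COMPONENTS, missing_components, and check_catalog_directory are
hypothetical, and file names are compared without regard to case.

```python
from pathlib import Path

# The components listed in step 3.
EAD_COMPONENTS = [
    "EADSGML.DEC",
    "EAD.DTD",
    "EADGRP.DTD",
    "EADBASE.ENT",
    "EADCHARS.ENT",
    "EADNOTAT.ENT",
    "EADTABLE.ENT",
]


def missing_components(present_names):
    """Return the listed components that are absent from present_names.

    The comparison ignores case (an assumption of this sketch).
    """
    present = {name.upper() for name in present_names}
    return [name for name in EAD_COMPONENTS if name not in present]


def check_catalog_directory(directory):
    """List the files in directory and report any missing components."""
    names = [p.name for p in Path(directory).iterdir() if p.is_file()]
    return missing_components(names)


if __name__ == "__main__":
    # Directory from step 3; adjust the drive and path to your installation.
    print(check_catalog_directory(r"\SOFTQUAD\PANOPRO2\CATALOG"))
```

An empty list means every component was found in the directory.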
Because of problems in the SGML browser, there are limits to the
size of tables.
Currently, the limit seems to be around 1500 lines (including blank lines).
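To see which parts of an instance are at risk, the lines inside each
<c01> can be counted. The sketch below is an illustration only. The
function names are hypothetical, the 1500-line figure is the
approximate limit noted above, and the code assumes that each <c01>
start tag and </c01> end tag sits on a separate line and that <c01>
elements are not nested.

```python
import re

OPEN_C01 = re.compile(r"<c01\b", re.IGNORECASE)
CLOSE_C01 = re.compile(r"</c01\s*>", re.IGNORECASE)


def c01_line_counts(text):
    """Return (start_line, line_count) for each <c01>...</c01> span.

    Blank lines are counted, since they count toward the limit too.
    """
    spans = []
    start = None
    for number, line in enumerate(text.splitlines(), start=1):
        if start is None and OPEN_C01.search(line):
            start = number
        if start is not None and CLOSE_C01.search(line):
            spans.append((start, number - start + 1))
            start = None
    return spans


def too_long(text, limit=1500):
    """Return the spans whose line count exceeds the limit."""
    return [span for span in c01_line_counts(text) if span[1] > limit]
```

Any span that too_long reports is a candidate for the splitting
technique described in these notes.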
Early in the EAD project, we discovered that many of our more complex
finding aids were "bunching up" at certain points. This was because
the tables, as declared at the <C01> level, were much too long to be
parsed. Therefore, we had to start the practice of dividing long
Series artificially by creating parallel groupings of <C01>s.
The following is an example of how to do this:
In the instance OSBMSS.SGM, there is a single series, "Documents",
which runs over 1500 lines. Because the "bunching up" appears near the
end of the series, it was safe to assume that the series could be
neatly divided in half: A-M and N-Z.
So the <C01> markup from the beginning of the series was copied
to the mid-point of the series, before the first entry starting with
"N".
Here is the beginning of the series (the important coding is the <c01> and its <did>):
<dsc type="combined">
<c01><did><unittitle>Documents</unittitle>
</did>
<thead><row><entry>Box</entry><entry>Folder</entry><entry></entry><entry>Date</entry></row></thead>
<c02><drow valign="top"><dentry><unitloc label="box">fpe</unitloc></dentry>
<dentry><unitloc>12</unitloc></dentry>
<dentry><unittitle>
Adams, Joseph Quincy, 1881-1946
ALS to Walter Wilson Greg, Cornell U,
Ithaca, NY
</unittitle></dentry>
<dentry><unitdate>1929 Aug 10</unitdate></dentry></drow>...
Here is the target half-way point in the series:
<c02><drow valign="top"><dentry><unitloc label="box">pe</unitloc></dentry>
<dentry><unitloc>191</unitloc></dentry>
<dentry><unittitle>
Needham, Francis
ALS to James M. Osborn, Worksop,
Nottinghamshire
</unittitle></dentry>
<dentry><unitdate>1937 Jun 6</unitdate></dentry></drow>
And here is the same section with the new coding copied in (the first three lines):
</c01>
<c01><did><unittitle>Documents</unittitle>
</did>
<c02><drow valign="top"><dentry><unitloc label="box">pe</unitloc></dentry>
<dentry><unitloc>191</unitloc></dentry>
<dentry><unittitle>
Needham, Francis
ALS to James M. Osborn, Worksop,
Nottinghamshire
</unittitle></dentry>
<dentry><unitdate>1937 Jun 6</unitdate></dentry></drow>
NOTE that the extra tag </C01> had to be inserted to close the
previous <C01>.
The final touch is to amend the unittitles so that they make sense
in the newly divided instance.
The first section should be changed to read: Documents
A-M; the second: Documents N-Z.
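The whole edit can be sketched as a small function that works on the
instance as a list of lines. This is a hedged illustration of the
manual steps, not a tool used in the project. The name split_series and
its parameters are hypothetical, and the caller is assumed to have
already found the index of the <c02> line that begins the second half.

```python
def split_series(lines, split_at, title, first_range, second_range):
    """Divide one <c01> series into two parallel <c01>s.

    lines        -- the instance as a list of lines
    split_at     -- 0-based index of the <c02> line starting the second half
    title        -- the series unittitle, e.g. "Documents"
    first_range  -- label added to the first half, e.g. "A-M"
    second_range -- label added to the second half, e.g. "N-Z"
    """
    old_title = f"<unittitle>{title}</unittitle>"
    head = list(lines[:split_at])

    # Amend the first occurrence of the series title in the first half.
    for i, line in enumerate(head):
        if old_title in line:
            new_title = f"<unittitle>{title} {first_range}</unittitle>"
            head[i] = line.replace(old_title, new_title, 1)
            break

    # Close the previous <c01>, then copy in the opening markup.
    new_opening = [
        "</c01>",
        f"<c01><did><unittitle>{title} {second_range}</unittitle>",
        "</did>",
    ]
    return head + new_opening + list(lines[split_at:])


if __name__ == "__main__":
    demo = [
        "<c01><did><unittitle>Documents</unittitle>",
        "</did>",
        "<c02>Adams</c02>",
        "<c02>Needham</c02>",
        "</c01>",
    ]
    print("\n".join(split_series(demo, 3, "Documents", "A-M", "N-Z")))
```

The result should still be run through a parser and viewed in the
browser, since the split point has to be chosen by eye.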
This general technique can be applied to any long series that does
not display cleanly because of its length. For very long files that
need to be cut into separate instances (e.g. Gertrude Stein Papers and
Langston Hughes Papers), it is best to examine them to get an idea of
how they were prepared. Basically, it's a matter of duplicating the
entire instance and moving one or more series out of the old finding
aid and into the new one, by themselves.