DESIGN REQUIREMENTS FOR ARCHIVAL AUTHORITY SYSTEMS
 
Paper presented at the annual meeting of the Society of American Archivists
October 2, 1988
Atlanta, Georgia

 

Richard V. Szary
Yale University
 
The question of authority control in archival information systems has become something of a red herring in discussions of how to improve the quality and effectiveness of those systems. There is an assumption that if archivists would simply curb their predilection for idiosyncratic descriptive practices by adopting standard data structures, standard descriptive practices, and one or more standardized vocabularies, our catalogs would become models of consistency that would allow users effective access to our holdings. This line of reasoning, however, ignores two very important facts concerning systems development and current archival practice: first, a system will only be effective if it has a consistent underlying rationale that can be comprehended by its users; and second, such a rationale does not now exist for archival descriptive systems.

One might argue that this rationale is implicitly recognized and implemented in archival descriptive practices, and that the call for its explicit articulation is an intellectual exercise with little practical application. Countering that position is a growing realization that the systems archival repositories are currently designing and using have major shortcomings and limitations, and that these are systemic to archival and other bibliographic practices rather than simply technical limitations of the systems within which they are implemented. In support of that assertion, one can point to the absence of user considerations in the design and evaluation of systems, ranging from user needs and expectations for record displays to an accommodation of typical research methodologies in searching and browsing capabilities. Another shortcoming is the absence of defined measures by which the effectiveness of a system can be evaluated and tracked. The difficulty of explaining the scope and structure of systems now in use suggests a fundamental absence of any organizing principle for comprehending these systems other than a coincidental assembly of separate descriptive records in a shared location. Given this situation, it seems premature to speculate on the types of authority control, whether conventional vocabulary and syndetic control or an enhanced provenance-based approach, that this amorphous system requires. Like any other system capability, authority control features should flow naturally from an overall system purpose and approach. Rather than abandoning the discussion at this point, however, waiting for the development of a consensus on the role and purpose of the catalog, I will make some assumptions, first about the description and retrieval function, and then about the role of the catalog in supporting that function. Only then can we proceed, within that defined context, to look at how authority control might operate.

 

Assumptions about the description and retrieval function

Since an archival descriptive system exists to support the description and retrieval function, the scope and purpose of that function must be well-articulated if the system is to be effective. It is not enough to say that the scope is any and all information about the records and all users who might be interested in them, and that the purpose is to record and retrieve all of that information in order to answer any query that any user might pose. One must understand the types of information that may be used to describe the characteristics of archival materials and the types of approaches that users bring to their use of them, and how the two interact. Only at that point can one begin to describe a system to support those functions, defining its parameters and evaluating its effectiveness.

While such an understanding and consensus requires considerably more research and discussion than has taken place thus far, the following are the assumptions upon which the present discussion of authority control is based:

    Archival description and retrieval are provenance-based, as well as bibliographic-based, techniques. Archivists describe and retrieve record descriptions indirectly through knowledge of the characteristics and activities of record creators and other circumstances of their creation and use, as well as directly through knowledge of the characteristics of the material.

    Effective description and retrieval require:

      - a structured approach to the recording of bibliographic information, the subsequent searching of that information, and retrieval of bibliographic descriptions of which it is a part;

      - a structured approach to the recording of provenance information, the subsequent searching and retrieval of that information, and retrieval of provenance descriptions of which it is a part; and

      - the linking of bibliographic and provenance records to record relationships between archival materials and the circumstances of their creation and use, and the subsequent navigation between records based on those links.

    The recording of bibliographic and provenance information in a data structure is independent of any particular presentation format.

    The arrangement of both bibliographic and provenance record descriptions is arbitrary except within a defined context. Arrangements within any given context may or may not satisfy research informed by a different viewpoint.

 

Purposes of archival description and retrieval systems

 The development of archival description and retrieval systems has only recently become a shared enterprise, and there exist no statements of purpose or definition of content that have been defined and accepted to the same extent as in the library community. While a number of initiatives have begun codifying existing practice, they rarely identify and address underlying purposes. As a consequence, there are no accepted criteria for guiding the development of and evaluating archival description and retrieval systems.

 The general assumptions about the description and retrieval function that I have just outlined lead to the following assumed purposes for a system to support it:

 A description and retrieval system for archival materials should be an efficient instrument for:

        presenting information on the characteristics, contexts, contents, and conditions of archival materials so as to support an evaluation of the relevance of the described materials to a defined research topic;

        presenting information on the circumstances surrounding the creation and use of archival materials to enable researchers to interpret the materials more fully and accurately;

        providing the means to collocate and retrieve descriptions of archival materials and of the circumstances of their creation and use by the:

            - personal or corporate creator of the materials;

            - characteristics of the materials' creators and of other circumstances of their creation and use;

            - purpose the materials served and the activity that generated them;

            - objects documented in the materials;

            - other circumstances of their creation and use; and

            - descriptive characteristics of the materials; and

        selecting and presenting descriptions of archival materials and the circumstances of their creation and use in defined products that exhibit a particular arrangement and formatting of records, appropriate to the product's purpose.

Definition of authority control

 Assuming a system that fulfills these purposes, we can now begin to investigate the role of authority control within it. In an earlier paper, I suggested the following definition of the generalized, access-related function of authority control:

    The function of authority control is to enhance the accessibility of access points used in the retrieval system to the system's users.

 To that earlier definition, which focused specifically on access, I would add:

    The authority record is the repository for all information used to identify a particular entity and distinguish it from other entities of the same type.

Using this expanded definition, it is clear that all records in an archival descriptive system, including bibliographic records, are authority records. Bibliographic records are simply authority records for archival materials. This concept is much clearer in systems for published materials, where, as David Bearman has pointed out, bibliographic records function in a more traditional authority role, being shared and maintained through central agencies, with copy-specific information being relegated more and more to holdings records.

 A more comprehensive way of looking at archival bibliographic records, then, would define them as authority records containing the "objective" or descriptive cataloging information about the materials. "Subjective" or interpretive information is recorded by linking bibliographic authority records to those for non-bibliographic entities, such as persons, organizations, events, and the like.

 Following this reasoning, the archival catalog can be defined as an integrated authority system in which each type of entity has a particular descriptive format specific to its characteristics and needs. By themselves, these formats provide only the descriptive cataloging capabilities of the system, and in fact, give us nothing more than the set of discrete records coincidentally located in the same system that I spoke of before. It is the ability to link these discrete descriptions into a network of relationships that provides organization to the system, and allows the records to be comprehended as part of a coherent descriptive system.

 Authority control, then, is not simply a discrete capability that can be appended to an otherwise functional catalog, but is the underlying basis for the system. When viewed in this expanded light, an integrated authority system begins to fulfill the purposes of the archival catalog that I outlined earlier, as well as the more traditional authority control functions.

 
 Integrated authority control and the archival catalog

 When I spoke about the purposes of an archival description and retrieval system earlier, I identified four major purposes: presentation of descriptions of archival materials, presentation of descriptions of circumstances of creation, collocation and retrieval of descriptions, and selection and presentation of information in products. All of these assumed purposes are supported by the type of integrated authority control suggested here.

 The structuring of the system into separate record types for different types of entities allows a coherent presentation of bibliographic and provenance information in ways specific to their characteristics and use and facilitates easy movement between them. A user can review a bibliographic description, including an indication of its relationship to other, provenance-based, entities, and request further clarifying information on those entities as needed. In many cases, this information is essential to an accurate interpretation of the bibliographic record, but does not properly form part of the description of the records themselves.

 The collocation and retrieval purposes assumed for the catalog would be supported by a more precise and specific indexing capability that could distinguish between characteristics of records and those of the circumstances surrounding the records. This would decrease the ambiguity that has resulted from attempting to shoehorn non-bibliographic access points into bibliographic records. The system would also support travel through a network of relationships amongst non-bibliographic entities, allowing the user to obtain a sense of the environment in which the records were created and identifying access points that might be useful for bibliographic searches.

 Finally, the system structure would make it easier to define and extract the types of information needed to construct useful bibliographic products from the database. By segregating that information into separate records, the system allows the flexibility of creating different types of products for different purposes. Where a repository guide might need shared administrative history information to introduce a set of record descriptions, a subject guide might benefit from a topical discussion of the history and players involved.
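
 The point about products can be pictured with a small sketch in Python. It is an illustration only, under stated assumptions: the dictionary layout, the "created by" relation label, the sample records, and the function names repository_guide and title_list are all hypothetical and do not describe any existing system. The sketch shows two different products drawn from the same segregated records, one introducing a creator's materials with shared administrative history and one listing materials by title alone.

```python
# Hypothetical store: every record is kept separately, keyed by identifier.
records = {
    "org-1": {"type": "organization",
              "heading": "Office of the Registrar",
              "history": "Maintains student academic records."},
    "mat-1": {"type": "materials", "title": "Transcripts"},
    "mat-2": {"type": "materials", "title": "Course catalogs"},
}

# Hypothetical links: (materials record, relation, provenance record).
links = [
    ("mat-1", "created by", "org-1"),
    ("mat-2", "created by", "org-1"),
]


def repository_guide(records, links, creator_id):
    """One product: the creator's history once, then its materials."""
    creator = records[creator_id]
    lines = [creator["heading"], creator["history"]]
    for materials_id, relation, target_id in links:
        if relation == "created by" and target_id == creator_id:
            lines.append("  " + records[materials_id]["title"])
    return lines


def title_list(records):
    """Another product from the same store: materials only, by title."""
    titles = [r["title"] for r in records.values() if r["type"] == "materials"]
    return sorted(titles)
```

 Because the administrative history lives in the organization record rather than in each description of materials, neither product has to strip it out or repeat it.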

 What has happened to traditional authority functions in this system? Traditional authority functions can be categorized in four areas: preferred form of heading, syndetic structure, description, and source information.

 The concept of a preferred form of heading is less imperative as technical limitations of bibliographic systems are overcome. The overhead of creating and filing catalog cards under each heading variation need not exist in an automated system. While one must impose consistency within a particular database product, particularly if it is a static, non-automated product, the database itself can be much more flexible. Unless the user's choice of a heading presents the system with an unresolvable ambiguity, the retrieval capabilities should be able to ignore the variation, or, as one librarian has put it, "open up the normalized and closed bibliographic universe." Thus the authority record should contain as many variations of the heading, of equal legitimacy, as needed, with preference becoming a factor only when necessary for particular products.
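
 A minimal sketch in Python may make this treatment of headings concrete. Everything named here is an assumption made for illustration: the classes AuthorityRecord and HeadingIndex, the normalize rule, and the sample headings are hypothetical. Every variant is indexed with equal standing, a search on any variant finds the record, a heading shared by two entities is reported as an ambiguity, and a preferred form is stored only per product.

```python
from __future__ import annotations

from dataclasses import dataclass, field


def normalize(heading: str) -> str:
    """Fold case, punctuation, and spacing (an assumed matching rule)."""
    kept = "".join(ch if ch.isalnum() else " " for ch in heading.lower())
    return " ".join(kept.split())


@dataclass
class AuthorityRecord:
    """Hypothetical record: one entity, any number of equally valid headings."""
    record_id: str
    headings: list[str]
    # Preference is recorded per product, not for the database as a whole.
    preferred_by_product: dict[str, str] = field(default_factory=dict)

    def heading_for(self, product: str) -> str:
        # Falls back to the first recorded heading when no preference is set.
        return self.preferred_by_product.get(product, self.headings[0])


class HeadingIndex:
    def __init__(self) -> None:
        self._by_heading: dict[str, list[AuthorityRecord]] = {}

    def add(self, record: AuthorityRecord) -> None:
        for heading in record.headings:
            bucket = self._by_heading.setdefault(normalize(heading), [])
            if record not in bucket:
                bucket.append(record)

    def lookup(self, heading: str) -> list[AuthorityRecord]:
        """Every record the heading could mean; more than one is an ambiguity."""
        return list(self._by_heading.get(normalize(heading), []))


# Illustrative data only.
company = AuthorityRecord(
    "org-1",
    ["Acme Manufacturing Company", "Acme Mfg. Co.", "ACME"],
    {"repository guide": "Acme Manufacturing Company"},
)
fund = AuthorityRecord("org-2", ["Acme Relief Fund", "ACME"])

index = HeadingIndex()
index.add(company)
index.add(fund)
```

 In this sketch a search on "acme mfg co" finds the company directly, while a search on "Acme" returns both records, which is the point at which the system would have to ask the user to choose.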

 The syndetic structure supported by conventional authority control mechanisms is a subset of the network of relationships that the system could support. Where syndetic relationships are usually confined to simple relationships among entities of the same type (e.g., organizations), the suggested system offers a much broader range of possibilities for relationships amongst entities.

 Descriptive text has fallen out of favor in conventional authority work, primarily because of the economic factors involved in creating and maintaining accurate descriptions. Unlike in library systems, however, these descriptions are a much more integral part of archival description and retrieval, and much of the work represented in them cannot be reduced to a set of headings. While the acceptance of such descriptions is more of a cataloging standards question than a systems one, the underlying rationale of the system does recognize the importance of this type of information distinct from its use to guide the use of headings.

 Finally, authority records have provisions for recording information as to the origin and circumstances under which the information was assembled. This has provided an audit trail through which questions that arise about the use of a particular heading can be resolved. Source information, however, has implications that extend beyond its role as a research record.

 An unspoken assumption behind many of the system capabilities described is that there is no one true way of interpreting historical documentation. By explicitly separating the "objective" descriptive cataloging information from the more "subjective" interpretive relations between records, the system offers the opportunity to record diverse interpretations of the same set of records. The network of relationships that exists between the records in a database organizes them into a particular whole that reflects the viewpoint of the creator of the network. The possibility exists of supporting multiple viewpoints of the same database, and of users imposing additional ones as research dictates, through the use of an integrated authority system, particularly one that has explicit source identification for its information.
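
 The idea of multiple viewpoints over one database can be sketched briefly in Python. This is a sketch under assumptions, not a description of any implemented system: the Link class, the viewpoint and related functions, and the sample identifiers and source names are all hypothetical. Each interpretive relation carries the source that asserted it, and a viewpoint is simply the subset of relations attributed to chosen sources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """Hypothetical interpretive relation between two records."""
    from_id: str
    to_id: str
    relation: str
    source: str  # who asserted the relation


def viewpoint(links, sources):
    """Keep only the relations asserted by the given sources."""
    return [link for link in links if link.source in sources]


def related(links, record_id):
    """Identifiers one step away from record_id, in either direction."""
    found = set()
    for link in links:
        if link.from_id == record_id:
            found.add(link.to_id)
        elif link.to_id == record_id:
            found.add(link.from_id)
    return found


# Illustrative data only.
links = [
    Link("series-12", "org-1", "created by", "repository staff"),
    Link("series-12", "event-7", "documents", "repository staff"),
    Link("series-12", "person-3", "documents", "researcher A"),
]

staff_view = viewpoint(links, {"repository staff"})
researcher_view = viewpoint(links, {"researcher A"})
```

 The descriptive records themselves are untouched by either view; only the network laid over them changes, and a researcher's added relation sits beside the repository's without replacing it.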

 
Design requirements

 The preceding discussion has focused primarily on the concepts underlying authority control in the archival setting. Implementation of such a system, however, requires a more detailed and specific definition of technical design requirements. Without suggesting particular technical implementations, we can still define some basic requirements:

 First, the system must support multiple record types, each structured to record the information characteristic of a particular type of entity, and restricted to that information. Administrative histories or biographical notes would appear in records describing organizations or persons, not in bibliographic records. The segregation of information by the type of entity it is used to describe is essential.

 Second, no type of record may have a privileged status in the system. In conventional bibliographic systems, authority files are auxiliary to the bibliographic records that are the central focus of the system. With an integrated authority system, no assumptions are made about which information is likely to be most useful to the user, and so no information is treated as "peripheral" material that must be extracted separately.

 Third, the system must support links between any records regardless of type. Links between records embody the network of relationships that organize the records into a coherent whole. The universe of relationship types and occurrences is open-ended and cannot be restricted to a pre-defined set.

 Fourth, the user must be able to navigate the system links to retrieve all related records of whatever type. It should be obvious to the user reviewing a particular description that additional descriptions for related records or entities exist and might be relevant to his or her interpretation of the description. The user should then be able to retrieve that information easily, leaving a research trail that can be retraced.

 Finally, the system must support an integrated access mechanism that offers the user a large degree of flexibility in constructing and following search strategies. The system should interpret a search request as broadly as possible, unless constrained by the user, and present and identify the various options that result from the user's search request.
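
 Taken together, the five requirements can be pictured in a short Python sketch. It is offered only as an illustration under stated assumptions and is not a proposed implementation: the classes Record, Catalog, and Session, their methods, and the sample data are all hypothetical. Records of every type share one store with no type privileged, links may join any two records under a free-text relation, a session keeps a retraceable trail, and a search is broad unless the user narrows it.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Record:
    record_id: str
    entity_type: str  # e.g. "materials", "organization", "person"
    headings: list[str]
    description: dict[str, str] = field(default_factory=dict)


class Catalog:
    def __init__(self) -> None:
        self.records: dict[str, Record] = {}  # one store, no privileged type
        self.links: list[tuple[str, str, str]] = []  # (from, relation, to)

    def add(self, record: Record) -> None:
        self.records[record.record_id] = record

    def link(self, from_id: str, relation: str, to_id: str) -> None:
        # The relation is free text, so the set of relation types stays open.
        if from_id not in self.records or to_id not in self.records:
            raise KeyError("both ends of a link must be existing records")
        self.links.append((from_id, relation, to_id))

    def related(self, record_id: str) -> list[tuple[str, Record]]:
        """(relation, record) pairs in either direction, labels as recorded."""
        out = []
        for a, relation, b in self.links:
            if a == record_id:
                out.append((relation, self.records[b]))
            elif b == record_id:
                out.append((relation, self.records[a]))
        return out

    def search(self, term: str, entity_type: str | None = None) -> list[Record]:
        """Match headings and descriptions of all types unless constrained."""
        needle = term.lower()
        hits = []
        for record in self.records.values():
            if entity_type is not None and record.entity_type != entity_type:
                continue
            texts = record.headings + list(record.description.values())
            if any(needle in text.lower() for text in texts):
                hits.append(record)
        return hits


class Session:
    """Navigation that leaves a research trail the user can retrace."""

    def __init__(self, catalog: Catalog) -> None:
        self.catalog = catalog
        self.trail: list[str] = []

    def visit(self, record_id: str) -> Record:
        record = self.catalog.records[record_id]
        self.trail.append(record_id)
        return record

    def back(self) -> Record:
        if len(self.trail) < 2:
            raise LookupError("no earlier record on the trail")
        self.trail.pop()
        return self.catalog.records[self.trail[-1]]


# Illustrative data only.
catalog = Catalog()
catalog.add(Record("mat-1", "materials", ["Board of Trustees minutes"],
                   {"extent": "3 linear feet"}))
catalog.add(Record("org-1", "organization", ["Board of Trustees"],
                   {"administrative history": "Governing body of the college."}))
catalog.add(Record("per-1", "person", ["Doe, Jane"],
                   {"biographical note": "Secretary to the board of trustees."}))
catalog.link("mat-1", "created by", "org-1")
catalog.link("per-1", "officer of", "org-1")
```

 In this sketch an unconstrained search on "trustees" returns the materials, the organization, and the person alike, leaving the user to decide which direction to follow.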

 
The major conclusion to be drawn from this discussion is that authority control is not an isolated and discrete capability that is separate from the overall rationale and functioning of the catalog. The short-term pursuit of conventional authority control features that promote consistency in vocabulary usage should continue, but we should be realistic about the limited gains in access to be accomplished by these means. Unless the design and construction of our description and retrieval systems become driven by a firm understanding of how they support the interaction between archival information, in all its manifestations, and user approaches and needs, we will be unable to exploit more fully and effectively the wealth of cultural and institutional memory that it is our responsibility to preserve and disseminate.