The Dublin Core (DC) is a metadata set designed to promote discovery of electronic resources. Metadata is, simply put, data about data. The DC element set provides a simple, flexible means of describing documents, images, sound files, and other networked information objects.
Originally, the DC was created to enhance searching of document-like objects on the web. The first Dublin Core workshop, held in Dublin, Ohio, identified 12 descriptive elements common to most web documents: Title, Author or Creator, Subject and Keywords, Description, Publisher, Other Contributor, Date, Resource Type, Format, Resource Identifier, Source, and Language. The DC elements were designed to describe works generated by a wide variety of intellectual disciplines and in a number of formats; in addition to text, the element set applied to graphics, sound, and video files. Three more elements, Relation, Coverage, and Rights Management, were added at later Dublin Core workshops to enhance description of images. (The elements are described in greater detail in Appendix B.)
Using Dublin Core for Primary Description
When looked at as a primary means of description, the Dublin Core possesses several weaknesses. First, the level of description provided by the Dublin Core is not exhaustive. The 15 DC elements are general in nature. This generality gives the Dublin Core its flexibility to describe different resources. However, this same attribute limits the DC's ability to describe a work at more than a basic level of detail. The "Author/Creator" element, for example, does not distinguish between corporate authors and personal authors. The corporate nature of an author can only be indicated using explanatory text within the element value itself. When other details about the author are added, such as e-mail address or affiliation, the simple element field can grow long and unwieldy.
Second, the Dublin Core does not prescribe a syntax for element values. In contrast to highly regulated metadata formats like USMARC, the Dublin Core element values use natural language. The advantage of using natural language is that authors/creators can describe their works using language and formatting appropriate to their respective disciplines. Natural language values, however, can pose serious problems for search engines [Lynch]. For example, a search engine would have to execute a number of complex comparisons to discover a specific date when DC dates are stored in any number of different formats (e.g., "10/25/98" vs. "10-25-98" vs. "October 25, 1998"). Searching elements like the "Author/Creator" field described above poses similar problems: search engines have to parse long text strings looking for pertinent information such as the author's name.
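The date problem described above can be sketched in a few lines of Python. This is only an illustration, not part of any Dublin Core specification: the function name `normalize_dc_date` and the list of candidate layouts are assumptions made for the example, the month-first reading of "10/25/98" is a guess a real system could not always make safely, and the spelled-out month relies on an English locale.

```python
from datetime import datetime

# Candidate layouts, tried in order. Illustrative only: a real collection
# would contain many more variants, some of them ambiguous.
CANDIDATE_FORMATS = ["%m/%d/%y", "%m-%d-%y", "%B %d, %Y", "%Y-%m-%d"]


def normalize_dc_date(value):
    """Return the date as YYYY-MM-DD, or None if no candidate layout matches."""
    text = value.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # this layout did not fit; try the next one
    return None


for raw in ["10/25/98", "10-25-98", "October 25, 1998"]:
    print(raw, "->", normalize_dc_date(raw))
```

Every additional free-text layout means another entry in the list and another comparison per value, which is the cost the paragraph above attributes to natural language values.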
Third, description of collections and surrogates is awkward under the Dublin Core. Data describing a collection is linked to members of the collection through the "Relation" element. A search engine attempting to group like things together will have difficulty identifying both the collection and all of its members. Describing the relationship between surrogates is also difficult. For example, suppose two versions of a famous photo exist: the photo itself, and a digital image made from the photo. Both the image and the photo have metadata associated with them. Under the Dublin Core, the relationship between image and photo is described using free text in the "Relation" element. A patron searching specifically for the original photo may have difficulty distinguishing between references to the photo itself and references to the image.
Qualifiers
To address some of the weaknesses in the Dublin Core, a series of qualifiers has been proposed to refine the core element set. The proposed qualifiers fall into two groups: "schemes" and "types". Schemes describe the syntax used by element values. The scheme "LCSH," for example, indicates that the values contained in a Dublin Core "Subject" element are Library of Congress Subject Headings. Types refine the core element itself. The type "CorporateName," for instance, defines a Dublin Core "Creator" as a corporate author. The Dublin Core Workshops have set two limitations on qualifiers. First, a qualifier can only refine an element, not re-define its semantics. Second, the content must still be understood if the element is used without qualifiers. For a full description of the proposed Dublin Core Qualifiers, see Appendix C.
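The two limitations can be illustrated with a small Python sketch. The class `QualifiedElement`, the function `strip_qualifiers`, and the sample values are hypothetical names invented for this example; they do not come from the qualifier proposals themselves. The point is only that a qualified value should still make sense once its type and scheme are removed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QualifiedElement:
    element: str                  # core element name, e.g. "Creator"
    value: str                    # the element value as entered
    type: Optional[str] = None    # refinement of the element, e.g. "CorporateName"
    scheme: Optional[str] = None  # syntax of the value, e.g. "LCSH"


def strip_qualifiers(item):
    """Keep only the core element name and its value."""
    return (item.element, item.value)


# Sample values are made up for illustration.
record = [
    QualifiedElement("Creator", "Yale University Library", type="CorporateName"),
    QualifiedElement("Subject", "Metadata", scheme="LCSH"),
]

unqualified = [strip_qualifiers(item) for item in record]
print(unqualified)
```

A client that knows nothing about qualifiers can work with the `unqualified` list alone, while a more capable client can use `type` and `scheme` to search more precisely.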
A Networking Approach: The Warwick Framework
While qualifiers enhance the quality of description in the Dublin Core, they add complexity to the schema without necessarily addressing the needs of specialized communities. Many of these communities have developed their own metadata sets, such as MESL and the VRA Core. Specialized metadata sets resolve primary description issues, but they make data exchange difficult. The unqualified Dublin Core element set, in this context, offers a means of communicating and exchanging data from more specialized metadata schemes. The Warwick Framework provides a model for using the Dublin Core in this role.
Proposed at the second Dublin Core workshop in Warwick, England, the Warwick Framework was designed as an architecture for the exchange of metadata. The framework provides a means both for communicating among different metadata schemas, and for defining hierarchical relationships of information objects. The Warwick Framework is composed of a series of "Packages" and "Containers." A container references a specific information object. The packages are the different entities that describe the object. Within one container, for example, there might be a MARC package, a Dublin Core package, and a simple package containing a URL for a related object. A search engine approaching the container can read through the packages and pick the metadata description most compatible with its operation. The Dublin Core fits well within this framework as a lowest common denominator among metadata sets. The Warwick Framework allows databases to use a local metadata element set for primary description, while utilizing the Dublin Core as a means of communicating with databases using other metadata schemes. A site might include two metadata packages in the "container" pertaining to an image: a VRA-core based description, and a Dublin Core description. Search engines capable of reading VRA-core elements would select the VRA description of the image. Other search engines would scan the Dublin Core description for information.
Within the Warwick Framework, packages can themselves be containers, allowing for an arbitrarily deep hierarchy of objects and metadata sets. The ability to define hierarchies under the Warwick Framework resolves many of the problems of describing related objects with the Dublin Core. For example, a hierarchy of packages and containers could be used to describe a painting within a collection, within a museum. The painting, the collection, and the museum are represented with a cascading series of containers linked to each other through URLs or URNs. A similar hierarchy of containers and packages can clearly define the relationship among surrogates. In the example given previously, the photo and its digital reproduction would each be represented with a container, multiple metadata packages, and a package containing a link to the related resource.
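The container and package model can be sketched as a small data structure. The Python below is a simplified illustration under several assumptions: the class and function names are invented for the example, metadata is reduced to a plain dictionary with placeholder contents, and nested containers are held in memory rather than linked by URL or URN as the framework describes.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    schema: str     # name of the metadata set, e.g. "VRA" or "DC"
    metadata: dict  # the description itself, reduced here to a plain dict


@dataclass
class Container:
    name: str                                    # label for the described object
    members: list = field(default_factory=list)  # packages and nested containers


def select_package(container, preferred_schemas):
    """Return the container's package for the first preferred schema it holds."""
    by_schema = {m.schema: m for m in container.members if isinstance(m, Package)}
    for schema in preferred_schemas:
        if schema in by_schema:
            return by_schema[schema]
    return None


def walk(container, depth=0):
    """Yield (depth, label) pairs for a container and everything nested in it."""
    yield depth, container.name
    for member in container.members:
        if isinstance(member, Container):
            yield from walk(member, depth + 1)
        else:
            yield depth + 1, f"[{member.schema} package]"


# Placeholder descriptions; no real VRA or DC records are reproduced here.
painting = Container("painting", [
    Package("VRA", {"note": "placeholder for VRA-core fields"}),
    Package("DC", {"Title": "Example painting"}),
])
collection = Container("collection", [Package("DC", {"Title": "Example collection"}), painting])
museum = Container("museum", [collection])

# A VRA-aware client prefers the VRA package; any other client falls back to DC.
print(select_package(painting, ["VRA", "DC"]).schema)
print(select_package(painting, ["MARC", "DC"]).schema)

for depth, label in walk(museum):
    print("  " * depth + label)
```

Listing the Dublin Core last in `preferred_schemas` reflects its role as the lowest common denominator: a client uses the richest description it understands and falls back to the Dublin Core otherwise.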
Conclusion
The low level of detail encompassed by the Core, and the lack of a defined syntax for element values, make the DC a dubious choice for primary description of material. However, the flexibility and universality of the Core make it a good medium for promoting exchange of data among databases using a variety of specialized metadata sets. The DC could provide a means for unifying Yale's diverse digital resources.
Appendix A: Implementing the Dublin Core
HTML
The simplest method of implementing Dublin Core description on the Web is to use HTML. The META tag in both HTML 2.0 and HTML 4.0 supports the Dublin Core element set. HTML 2.0 uses a simple tag syntax indicating metadata scheme and element name, followed by content of the field. For example, the Dublin Core "Creator" of this summary is tagged as:
<META name="DC.creator" content="Kalee Sprague">
Most browsers and many search engines such as Alta Vista support use of the META tag. Under HTML 4.0, the META attributes "SCHEME" and "LANG" are available to further enhance description. For example, the following tag indicates that the DC "Creator" element above is described in English:
<META name="DC.creator" lang="en" content="Kalee Sprague">
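Tags of this kind are easy to generate from a record. The following Python sketch assumes a hypothetical helper, `dc_meta_tag`, written for this example; it follows the "DC." prefix convention shown above and escapes attribute values with the standard library's `html.escape`.

```python
from html import escape


def dc_meta_tag(element, content, lang=None, scheme=None):
    """Build one META tag for a Dublin Core element, escaping attribute values."""
    attrs = [("name", "DC." + element)]
    if scheme:
        attrs.append(("scheme", scheme))
    if lang:
        attrs.append(("lang", lang))
    attrs.append(("content", content))
    rendered = " ".join(
        f'{key}="{escape(value, quote=True)}"' for key, value in attrs
    )
    return f"<META {rendered}>"


print(dc_meta_tag("creator", "Kalee Sprague"))
print(dc_meta_tag("creator", "Kalee Sprague", lang="en"))
```

Escaping matters because element values are free text: a title containing a quotation mark would otherwise end the `content` attribute early.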
RDF
The Resource Description Framework (RDF), developed by the World Wide Web Consortium (W3C), is an experimental method for supporting metadata description of networked resources. RDF works within the XML (Extensible Markup Language) Namespace element. Within its Namespace, RDF references both its own and other metadata schemas by their Uniform Resource Identifier (URI). The URI, in theory, marks the reference location of the metadata standard being used. RDF incorporates the Warwick Framework, allowing the use of multiple metadata schemes to describe an object. The different schemes can be used in parallel or within a single element hierarchy.
The following example uses both DC and MESL elements:
<RDF>
<Description about="http://www.library.yale.edu/databaseadmin/dublincore.html">
<DC:Title>Dublin Core, a Summary</DC:Title>
<DC:Subject>Metadata, RDF, Dublin Core</DC:Subject>
<MESL:Concepts/Function>Information Management - Internet</MESL:Concepts/Function>
</Description>
</RDF>
Z39.50
Z39.50 offers exciting possibilities for the exchange of Dublin Core metadata. Z39.50 clients and hosts are already widely used for searching across different databases and metadata schemas. The Z39.50 organization plans to incorporate the basic 15 elements of the Dublin Core into the Bib-1 attribute set. The Bib-1 attribute set is the basic bibliographic attribute set used by the Z39.50 Version 2.0 standard. DC qualifiers may be incorporated in a separate attribute set in Z39.50 Version 3.0; plans for this standard are still under way.
Appendix B: Dublin Core Element Set
Reference definition available at
URL:http://purl.org/metadata/dublin_core
1997-11-02
Each element is optional and repeatable; the elements can appear in any order.
| Field | Label | Description |
| Title | Title | The name given to the resource, usually by the Creator or Publisher |
| Author or Creator | Creator | The person or organization primarily responsible for creating the intellectual content of the resource. |
| Subject and Keywords | Subject | The topic of the resource. Typically, subject will be expressed as keywords or phrases that describe the subject or content of the resource. |
| Description | Description | A textual description of the content of the resource, including abstracts in the case of document-like objects or content descriptions in the case of visual resources. |
| Publisher | Publisher | The entity responsible for making the resource available in its present form, such as a publishing house, a university department, or a corporate entity. |
| Other Contributor | Contributor | A person or organization not specified in a Creator element who has made significant intellectual contributions to the resource. |
| Date | Date | A date associated with the creation or availability of the resource. Such a date is not to be confused with one belonging in the Coverage element, which would be associated with the resource only insofar as the intellectual content is somehow about that date. |
| Resource Type | Type | The category of the resource, such as home page, novel, poem, working paper, technical report, essay, dictionary. For the sake of interoperability, Type should be selected from an enumerated list that is currently under development in the workshop series. |
| Format | Format | The data format of the resource, used to identify the software and possibly hardware that might be needed to display or operate the resource. For the sake of interoperability, Format should be selected from an enumerated list that is currently under development in the workshop series. |
| Resource Identifier | Identifier | A string or number used to uniquely identify the resource. Examples for networked resources include URLs and URNs (when implemented). |
| Source | Source | Information about a second resource from which the present resource is derived. While it is generally recommended that elements contain information about the present resource only, this element may contain a date, creator, format, identifier, or other metadata for the second resource when it is considered important for discovery of the present resource; recommended best practice is to use the Relation element instead. |
| Language | Language | The language of the intellectual content of the resource. Where practical, the content of this field should coincide with RFC 1766 [Tags for the Identification of Languages, http://ds.internic.net/rfc/rfc1766.txt]; examples include en, de, es, fi, fr, ja, th, and zh. |
| Coverage | Coverage | The spatial or temporal characteristics of the intellectual content of the resource. Spatial coverage refers to a physical region (e.g., celestial sector); use coordinates (e.g., longitude and latitude) or place names that are from a controlled list or are fully spelled out. Temporal coverage refers to what the resource is about rather than when it was created or made available (the latter belonging in the Date element). |
| Rights Management | Rights | A rights management statement, an identifier that links to a rights management statement, or an identifier that links to a service providing information about rights management for the resource. |
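Because every element is optional and repeatable, checking an unqualified record comes down to confirming that each label is one of the 15 core elements. The Python sketch below is illustrative only: the labels come from the table above plus Relation, which the main text names among the 15, and the function `unknown_elements` and the sample record are invented for the example.

```python
# Labels from the table above, plus Relation as named in the main text.
DC_ELEMENTS = {
    "Title", "Creator", "Subject", "Description", "Publisher",
    "Contributor", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
}


def unknown_elements(record):
    """Return labels that are not core elements, in first-seen order."""
    unknown = []
    for label, _value in record:
        if label not in DC_ELEMENTS and label not in unknown:
            unknown.append(label)
    return unknown


# A record is a list of (label, value) pairs: order is free and labels may repeat.
record = [
    ("Title", "Dublin Core, a Summary"),
    ("Subject", "Metadata"),
    ("Subject", "RDF"),
    ("Creator", "Kalee Sprague"),
]

print(unknown_elements(record))
print(unknown_elements([("Author", "Kalee Sprague"), ("Title", "Dublin Core, a Summary")]))
```

Representing the record as a list of pairs rather than a dictionary keeps repeated elements, such as the two Subject values, intact.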
Appendix C: Dublin Core Qualifiers (Types)
Original documentation available at
http://www.loc.gov/marc/dcqualif.html
1997-10-15
| Element | Subelement | Description |
| Title | Alternative | Used for any titles other than the main title, including subtitle, translated title, series title, vernacular name, etc. |
| Title | Main | Used where two or more titles are being recorded for the same resource in order to distinguish the main title from alternative titles. |
| Creator | PersonalName | The name of an individual associated with the creation of the resource. |
| Creator | CorporateName | The name of an institution or corporation associated with the creation of the resource. |
| Publisher | PersonalName | The name of an individual associated with the publication of the resource. |
| Publisher | CorporateName | The name of an institution or corporation associated with the publication of the resource. |
| Contributor | PersonalName | The name of an individual associated with the resource. |
| Contributor | CorporateName | The name of an institution or corporation associated with the resource. |
| Date | Created | Date of creation of the resource. |
| Date | Issued | Date of formal issuance (e.g., publication) of the resource. |
| Date | Accepted | Date of acceptance (e.g., for a dissertation or treaty) of the resource. |
| Date | Available | Date (often a range) that the resource will become or did become available. |
| Date | Acquired | Date of acquisition or accession. |
| Date | DataGathered | Date of sampling of the information in the resource. |
| Date | Valid | Date (often a range) of validity of the resource. |
| Relation | Type | No definition given |
| Relation | Indicator | No definition given |
| Coverage | PeriodName | The resource being described is from or related to a named historical period, referred to by this use of the element. |
| Coverage | PlaceName | The resource being described is associated with a named place, identified by this use of the element. |
| Coverage | X | The resource being described is associated with a spatial location which may be defined by the use of x, y, (and, possibly, z) co-ordinates. |
| Coverage | Y | See above |
| Coverage | Z | See above |
| Coverage | T | The resource being described is from or associated with an instance in time that may be given numerically. |
| Coverage | Polygon | The resource being described may be located with respect to a shape, or polygon, defined in space as a series of x, y co-ordinate values. |
| Coverage | Line | The resource being described may be located with respect to a line defined in space by a series of x, y co-ordinate values. |
| Coverage | 3d | The resource being described may be located with respect to a volume, or hull, defined in three dimensional space as a series of x, y, z co-ordinate values. |
References
Guenther, Rebecca. "Dublin Core Qualifiers/Substructure." October 15, 1997. http://www.loc.gov/marc/dcqualif.html
Iannella, Renato. "An Idiot's Guide to the Resource Description Framework." 1998-09-03. http://www.dstc.edu.au/RDU/reports/RDF-Idiot/ (5 Oct. 1998).
Lagoze, C. "The Warwick Framework: A Container Architecture for Diverse Sets of Metadata." D-Lib Magazine. July/August 1996. http://www.dlib.org/dlib/july96/lagoze/07lagoze.html (5 Oct. 1998).
LeVan, Ralph. "Dublin Core and Z39.50." Draft Version 1.2. 1998-02-02. http://www.oclc.org/~levan/docs/dublincoreandz3950.html (3 Nov. 1998).
Lynch, Clifford. "The Dublin Core Descriptive Metadata Program: Strategic Implications for Libraries and Networked Information Access." ARL. February 1998, pp. 5-10.
Weibel, Stuart. "A Proposed Convention for Embedding Metadata in HTML." 1996-06-02. http://purl.oclc.org/docs/metadata/dublin_core/approach.html (29 Sept. 1998).