Non-Roman Searches in Orbis
During the 2005/2006 winter recess, bibliographic data within Orbis were converted
to the Unicode character encoding standard. For this reason, it is now possible to
search Orbis
in a variety of non-Roman alphabets and scripts. Moreover, rather than being
grouped at the bottom of records as was formerly the case, most original script
data will now be displayed in interlinear fashion beside their corresponding
transliterations. At this time Orbis includes original script data for the JACKPHY
languages only (i.e., Japanese, Arabic, Chinese, Korean, Persian, Hebrew, and
Yiddish). Records containing non-Roman script can now be printed, saved and
imported to bibliographic management tools such as Endnote 9 and RefWorks (though
not yet e-mailed) with the formatting of the script preserved. A browser pointed
to Orbis will automatically display catalog data as Unicode text, provided
that the searcher's computer has an appropriate font installed. The most widely
available appropriate font for Windows operating systems is Arial Unicode MS,
which is installed on all the Library's public workstations. On Macintosh
computers, the operating system should be Mac OS X 10.3.x or higher, and the
appropriate font is Lucida Grande.
Unicode implementation also means that users can now copy non-Western characters
from any Unicode-compliant Web page or application and paste them directly into
the Orbis search box. The reverse is also true: non-Roman words and phrases
can be copied from Orbis and pasted directly into Google, Hotmail, MS Word,
or any other Unicode-compliant application. When copying and pasting from Orbis
into MS Word, however, remember to set the receiving document's font to Arial
Unicode MS to ensure that scripts display correctly.
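As an illustration of what "non-Roman script" means at the character level, the following minimal sketch uses Python's standard unicodedata module to report which scripts occur in a string before it is pasted into a search box. The function name scripts_in and the prefix table are assumptions made for this example; they are not part of Orbis, and the table covers only the scripts mentioned on this page.

```python
import unicodedata

# Hypothetical mapping from Unicode character-name prefixes to script labels.
# Persian is written in Arabic script and Yiddish in Hebrew script, so they
# are reported under those labels.
SCRIPT_PREFIXES = {
    "CJK UNIFIED IDEOGRAPH": "Han",
    "HIRAGANA": "Japanese kana",
    "KATAKANA": "Japanese kana",
    "HANGUL": "Korean Hangul",
    "ARABIC": "Arabic",
    "HEBREW": "Hebrew",
    "LATIN": "Roman",
}

def scripts_in(text: str) -> set:
    """Return the set of script labels found in text (illustrative only)."""
    found = set()
    for ch in text:
        # Characters without a Unicode name yield "" and are skipped.
        name = unicodedata.name(ch, "")
        for prefix, label in SCRIPT_PREFIXES.items():
            if name.startswith(prefix):
                found.add(label)
                break
    return found

print(scripts_in("東京 Tokyo"))
```

A string that mixes scripts, such as a Romanized title followed by its original-script form, reports more than one label.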
Search capabilities for non-Western languages differ according to script:
Please keep in mind that cross-references (and other thesaural relationships)
are currently provided only for Roman-script fields. This means that results
from non-Roman script searches may vastly undercount the Library's actual holdings.
Also, please be aware that not every record for non-Western materials will include
the script in which the materials were actually written. In a minority of cases,
technology at the time of record creation did not permit non-Roman characters
to be entered into the database, and therefore searching by original script will
not retrieve those records.
Arabic- and Hebrew-script searching appears to be working well, with one exception.
Due to database conversion issues, titles with leading articles (e.g., those
beginning with the Hebrew "ha-", Arabic "al", Yiddish "der",
"di", "dos", "eyne", etc.) may not be retrievable
through the title search index (i.e., even with leading articles removed). In
such cases, it is advisable to use keyword searching instead. This problem is
under review, and will be fixed as quickly as possible. The other indexes appear
to be working correctly.
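To make "leading articles removed" concrete, here is a minimal sketch that strips a leading article from a Romanized title, for example when preparing a keyword search. The article list is an assumption drawn from the examples above (with "al-" written in its hyphenated transliterated form), and the helper strip_leading_article is hypothetical; it does not describe how Orbis indexes titles.

```python
# Hypothetical list of Romanized leading articles, based on the examples above.
# Hyphenated articles attach to the next word; the others are followed by a space.
LEADING_ARTICLES = ("ha-", "al-", "der ", "di ", "dos ", "eyne ")

def strip_leading_article(title: str) -> str:
    """Return title without its leading article, or unchanged if none matches."""
    lowered = title.lower()
    for article in LEADING_ARTICLES:
        if lowered.startswith(article):
            return title[len(article):].lstrip()
    return title

print(strip_leading_article("ha-Arets"))         # Arets
print(strip_leading_article("Di goldene keyt"))  # goldene keyt
```

Words that merely begin with the same letters, such as "Dinner", are left unchanged because the match requires the trailing space or hyphen.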
For Chinese, Japanese, and Korean (CJK) scripts, Romanized queries will continue
to be the most effective way to obtain reliable search results. While the ability
to enter original CJK characters is a boon to those unfamiliar with CJK Romanization
rules, in many cases the system will fail to find and retrieve desired items
due to variant forms of characters used, variations in spacing between characters,
and other orthographic and formatting inconsistencies (i.e., ones that are inherent
to all CJK language processing). The East Asia Library staff will maintain a
detailed list of known issues and workarounds on its web site: http://www.library.yale.edu/eastasian/scripts.html
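The effect of spacing and variant forms can be seen in a minimal sketch using Python's standard unicodedata module. It assumes a simple comparison after NFKC normalization and whitespace removal; the function normalize_query is hypothetical and is not a description of how Orbis matches queries.

```python
import unicodedata

def normalize_query(query: str) -> str:
    """Fold width variants and remove spacing (illustrative only)."""
    # NFKC folds compatibility forms (full-width Latin, half-width katakana,
    # the ideographic space U+3000) into their standard equivalents.
    folded = unicodedata.normalize("NFKC", query)
    # Drop all whitespace so spacing differences do not affect the comparison.
    return "".join(folded.split())

# Spacing and width variants compare equal after normalization.
print(normalize_query("東京　大学") == normalize_query("東京大学"))
print(normalize_query("ｶﾀｶﾅ") == normalize_query("カタカナ"))

# Traditional and simplified forms are distinct characters, so they still differ.
print(normalize_query("中國") == normalize_query("中国"))
```

Normalization of this kind handles spacing and width differences, but variant characters such as traditional and simplified forms remain distinct, which is one reason Romanized queries stay more reliable.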
Summary of Unicode-Compliant Orbis Features
| Browser Requirements | Navigator version 6 or I.E. 4, or later. For PCs: Internet Explorer 5.0 or higher, Firefox 1.0.7 or higher, Netscape 6.0 or higher. For Macs: Firefox 5 or higher, Safari 1.3 or higher, and Netscape 7.0 or higher. |
| Font needed | Windows: Arial Unicode MS or other Unicode-compliant font. Macintosh: Lucida Grande. |
| Printing | Yes |
| Saving | Yes |
| Saving and importing to bibliographic management tools | Yes |
| Non-Roman script preserved in emailed records | No |
Send comments to libweb@www.library.yale.edu