Project title: Database of Chinese Bronze Inscriptions

Abstract

The goal of this project is to develop a relational database of Early Chinese paleographic materials. It will serve not only as a collection of source material and information, but also as a tool for the graphical and lexical analysis of paleographic material. The content of the database is to be developed by faculty and research assistants in the University of Chicago’s Department of East Asian Languages and Civilizations, with technical assistance from the Digital Library Development Center. We hope that after an initial development period, we may make the database available to scholars via the internet.

Objects

The Database of Chinese Bronze Inscriptions will include the following objects:

  1. Inscriptional Texts: The basic source material for the database is bronze inscriptions, short texts (usually fewer than 100 graphs) that were originally cast into bronze vessels. Attributes for these objects would include basic information about the text, such as the name of the inscription, periodization, references, etc.
  2. Vessels: These are the medium onto which bronze inscriptions were cast and the source from which rubbings (see next item) are taken. Attributes for these objects would include references to scans of photographs of the vessel, provenance information, notes on the size and appearance of the vessel, references, etc. Since a single inscriptional text could be cast onto multiple vessels, one inscription could be related to multiple vessels.
  3. Rubbings: Inscriptional texts are reproduced by making a rubbing of a vessel, which produces a black-and-white “negative” of the inscription. Scanned images of such rubbings will be used to reproduce inscriptional texts in the database. Attributes for these objects would include a reference to a scan of the rubbing, location of the rubbing on the vessel, source of the rubbing, etc. Since a single inscriptional text could be cast more than once, on one vessel or on several, one vessel or one inscription could be related to multiple rubbings.
  4. Graph Exemplars: Each instance of a graph appearing in the inscriptional corpus will be included. Attributes would include a reference to a scanned graph (cut from the scan of the original rubbing) together with notation of its position in the inscription. Each graph exemplar will be linked to a particular position in the transcription of the inscriptional text (see next object).
  5. Transcriptions: Each inscriptional text comprises a collection of graphs, each of which indicates a particular lexeme. Transcription objects will serve to relate each position in the inscriptional text to both a graph and a lexeme. The attributes for these objects will include the sequential position of each graph, notes on the transcription, and links to a particular graph and a particular lexeme (see next two objects).
  6. Graphs: Each unique graphic form appearing in the corpus of bronze inscriptions will be a distinct Graph object. Attributes will include a transcription into modern script, an analysis of the graphic form, and references to citations in paleographic dictionaries, etc.
  7. Lexemes: These objects consist of modern graphic forms together with dictionary-type entries. Attributes will include a modern graph, definition and usage notes, pronunciation data, graphic analysis, as well as references to citations in major dictionaries, etc.
  8. Translations: These objects will consist of English phrases that correspond to a set of graphs in the transcription of a particular inscriptional text. Attributes will include English text as well as notes on the translation.

A note about copyrighted material: During an initial development period, the database will use only material that is not under international copyright protection. This includes all material published in the People’s Republic of China before that country became a signatory to international copyright protection treaties and conventions.

Relationships

The relationships between the above objects may be schematically represented as follows:



[Figure: entity relationship diagram]
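
As a rough illustration only, the objects listed above could be mapped onto relational tables along the following lines. This is a minimal sketch using Python's built-in sqlite3 module; every table and column name is an assumption made for illustration and is not taken from the project's actual design. In particular, it assumes each vessel carries one inscriptional text, which may be too strict.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative assumptions.
SCHEMA = """
CREATE TABLE inscription (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    period TEXT
);
CREATE TABLE vessel (
    id INTEGER PRIMARY KEY,
    inscription_id INTEGER NOT NULL REFERENCES inscription(id),
    provenance TEXT
);
CREATE TABLE rubbing (
    id INTEGER PRIMARY KEY,
    vessel_id INTEGER NOT NULL REFERENCES vessel(id),
    scan_path TEXT
);
CREATE TABLE graph (
    id INTEGER PRIMARY KEY,
    modern_transcription TEXT
);
CREATE TABLE lexeme (
    id INTEGER PRIMARY KEY,
    headword TEXT,
    definition TEXT
);
-- One row per position in an inscriptional text, linking graph and lexeme.
CREATE TABLE transcription (
    id INTEGER PRIMARY KEY,
    inscription_id INTEGER NOT NULL REFERENCES inscription(id),
    position INTEGER NOT NULL,
    graph_id INTEGER REFERENCES graph(id),
    lexeme_id INTEGER REFERENCES lexeme(id),
    UNIQUE (inscription_id, position)
);
-- One row per graph image cut from a rubbing scan.
CREATE TABLE graph_exemplar (
    id INTEGER PRIMARY KEY,
    rubbing_id INTEGER NOT NULL REFERENCES rubbing(id),
    transcription_id INTEGER NOT NULL REFERENCES transcription(id),
    image_path TEXT
);
-- An English phrase covering a run of positions in one inscription.
CREATE TABLE translation (
    id INTEGER PRIMARY KEY,
    inscription_id INTEGER NOT NULL REFERENCES inscription(id),
    start_position INTEGER NOT NULL,
    end_position INTEGER NOT NULL,
    english TEXT,
    notes TEXT
);
"""


def create_schema(conn):
    """Create the illustrative tables on an open sqlite3 connection."""
    conn.executescript(SCHEMA)


def vessels_for(conn, inscription_id):
    """Return the ids of all vessels related to one inscription."""
    rows = conn.execute(
        "SELECT id FROM vessel WHERE inscription_id = ? ORDER BY id",
        (inscription_id,),
    )
    return [row[0] for row in rows]
```

Because the graph and lexeme links sit on the transcription row rather than on the graph itself, the same graph can be read as different lexemes at different positions.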

Functions

The Database of Chinese Bronze Inscriptions is designed to have the following functions:

  1. Repository of primary sources: The database will collect a variety of source materials: rubbings, photographs, transcriptions, and translations, to be culled from a variety of sources.  By doing so, the database will significantly ease access to such materials by providing a single point of entry and a convenient search mechanism.
  2. Collection of data: Aside from collecting primary source material, the database will also collect supplementary information about these objects (e.g. physical dimensions of vessel, provenance, number of graphs), culled from a variety of disparate sources. The database will also allow full annotation of primary objects so that research notes can be appended to objects at any time.
  3. Bibliography: References to inscriptions, excavations, and graphs will be attached to each object, giving the database a bibliographic function.
  4. Graphical dictionary: All graphs that appear in the corpus of inscriptions will have an entry that provides an analysis of the graphic form (in terms of radical composition), variants, and a strict transcription into modern script, as well as references to citations in important paleographic dictionaries.
  5. Lexical dictionary: All graphs that appear in the corpus of inscriptions will have an entry that provides pronunciation information, definition and usage notes, and references to citations in important dictionaries. This dictionary will be organized under headwords that correspond to standard graphs in the modern script. The link between graphic form and lexical entry is interpretative and may vary from inscription to inscription or even within an inscription. That is, the same graph will not necessarily indicate the same lexeme in all instances. We anticipate that a good deal of lexical content will be developed in collaboration with an existing electronic dictionary project, Synonyma Serica Comparata, under the direction of Christoph Harbsmeier, University of Oslo.
  6. Concordance: The database will serve as a concordance that will locate graphs or lexemes and display them in context, both in transcription and translation.
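
The concordance function described above can be sketched roughly as follows. The `Position` record and the `concordance` function are hypothetical names introduced for this example, and the graphs and lexemes in any sample data would be placeholders rather than real inscriptional material. The sketch assumes that each transcription position records both the graph as written and the lexeme it is read as.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    """One position in a transcription (hypothetical record layout)."""
    inscription: str  # name of the inscriptional text
    index: int        # sequential position of the graph in the text
    graph: str        # the graph as transcribed
    lexeme: str       # the lexeme this graph is read as at this position


def concordance(positions, lexeme, window=2):
    """List every occurrence of a lexeme with its surrounding graphs.

    Returns (inscription, index, context) tuples, where context is the
    matching graph plus up to `window` graphs on either side.
    """
    # Group positions by inscription, keeping first-seen order.
    by_text = {}
    for p in positions:
        by_text.setdefault(p.inscription, []).append(p)

    hits = []
    for name, seq in by_text.items():
        seq.sort(key=lambda p: p.index)
        for i, p in enumerate(seq):
            if p.lexeme == lexeme:
                lo = max(0, i - window)
                context = "".join(q.graph for q in seq[lo:i + window + 1])
                hits.append((name, p.index, context))
    return hits
```

Matching on the lexeme rather than the graph means that one lookup finds the word regardless of which graphic form was used to write it; a graph-based lookup would compare `p.graph` instead.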

Main Output Screens and Navigation

We envision the following output screens for the Database of Chinese Bronze Inscriptions:

  1. Data Screen: This series of screens would display all information associated with a particular inscription: names of the vessels on which it appears, publication data, archaeological provenance, present whereabouts, physical description of the vessel, photos of the vessel, etc. This screen should link to both the Rubbing Screen and the Transcription/Translation Screen (see below).
  2. Rubbing Screen: This screen would display a scanned image of a rubbing of the inscription. Overlying the rubbing scan will be an image map embedded with links or scripts so that clicking on a certain character in the image would cause information about the transcription of that character to appear, either in a separate window or perhaps within the current window. From there it should be possible to query information on the graph or lexeme in the “Graphical Dictionary Screen,” “Lexical Dictionary Screen,” or “Concordance Screen.” Additionally, the user should be able to access a transcription and translation of the inscription by calling up a “Transcription/Translation Screen.” Ideally the user could jump from a particular graph in the rubbing directly to the line in the transcription that contains that graph.
  3. Graphical Dictionary Screen: This screen would display information about the graphic forms appearing in the database: a standard form of the character, strict transcription into modern script, common variants, references, the list of radicals used to classify and sort the graph, etc. There would be a link to the “Concordance Screen” (see below), which would allow the user to call up the concordance entry for the graph in question. The screen would also display a query of lexemes that are associated with this graph (which should link to the “Lexical Dictionary Screen” [see below]).
  4. Lexical Dictionary Screen: This screen would display the information about lexemes that appear within the corpus of inscriptions: an image of the graph, Unicode number, pronunciation information, definition and usage notes, etc. There would be a link to the “Concordance Screen” (see below), which would allow the user to call up the concordance entry for the lexeme in question. This screen would also display a query of graphs used to indicate this lexeme (which should link to the “Graphical Dictionary Screen”).
  5. Concordance Screen: For any graph or lexeme, this screen would list the inscriptions in which it appears. From this list, the user should be able to link to the “Data Screen” for the selected transcription, the “Rubbing Screen” for a selected rubbing, or the “Transcription/Translation Screen,” ideally directly to the line that contains the graph or lexeme.
  6. Transcription/Translation Screen: This screen would present a formatted transcription and translation of the inscription. Each inscriptional text will be parsed into phrases. This screen will display Chinese text, English translation, and translation notes for each phrase. Since many of the characters in the transcription will be nonstandard, the Chinese text attribute will contain a mixture of Unicode characters and images for those graphs that are outside Unicode. From this screen it should be possible to call up the “Graphical Dictionary Screen,” “Lexical Dictionary Screen,” and “Concordance Screen” for any character appearing in the transcription.
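
The image map mentioned for the Rubbing Screen could be generated from the stored positions of graph exemplars, roughly as in the sketch below. It assumes each exemplar is stored with a rectangular bounding box in pixel coordinates on the rubbing scan; the function name `image_map` and the `/transcription?pos=` link target are hypothetical and stand in for whatever addressing scheme the database eventually uses.

```python
from html import escape


def image_map(map_name, exemplars):
    """Build an HTML image map from graph exemplar bounding boxes.

    `exemplars` is an iterable of (position, (x1, y1, x2, y2)) pairs,
    where position is the graph's sequential position in the inscription
    and the box is given in pixels on the rubbing scan.
    """
    areas = []
    for position, (x1, y1, x2, y2) in exemplars:
        areas.append(
            f'<area shape="rect" coords="{x1},{y1},{x2},{y2}" '
            f'href="/transcription?pos={position}" '  # hypothetical link target
            f'alt="graph {position}">'
        )
    body = "\n".join(areas)
    return f'<map name="{escape(map_name)}">\n{body}\n</map>'
```

The rubbing image would then refer to the map through its `usemap` attribute, so that a click on any graph leads to the matching transcription position.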

Technical Issues

The Database of Chinese Bronze Inscriptions presents a number of technical challenges, including the following two points:

  1. Creation of nonstandard Chinese graphs: Inscriptional texts contain many specialized graphs that are not part of the modern script. As a tool for paleographic analysis, it is important that the database be able to represent such characters as accurately as possible. For those characters that are part of the Unicode standard, we plan to make use of standard Unicode fonts. Outside of Unicode, there are about 35,000 additional graphs that are included in the Mojikyo font set, which are available as GIF images. All other graphs will have to be created as some type of graphic, but at this point it is not clear to us which method or format (TIFF, GIF, etc.) is best. This decision may be influenced by our ability to collaborate with other paleographic projects that create images of Chinese graphs.
  2. Mixed text and image display: In addition to creating TIFF or GIF images for nonstandard Chinese graphs, we need to develop a way to both store and display attributes that are composed of both text and images. For instance, we may imagine an attribute containing the name of an inscription in Chinese. For some inscriptions, this field will be composed of purely Unicode text. However, for other inscriptions the name will include nonstandard graphs for which we will create images. Therefore, when this attribute is stored, it will have to include both Unicode Chinese text and references to images. When the field is displayed, it will be necessary to recognize the image references and display them as images, while at the same time preserving the Unicode Chinese text surrounding them. For example, if the Chinese name attribute were stored as a text field, the data might contain the reference “<000236>”, which would be displayed as the image of the corresponding non-standard graph.
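
The mixed text and image display described above might work along the following lines. This is a sketch under stated assumptions, not a settled design: it assumes image references are always six digits in angle brackets, following the “<000236>” example, and that each image is a GIF file named after its number in a single directory. The function name `render_mixed`, the directory name, and the sample string in the usage note are all invented for illustration.

```python
import re
from html import escape

# Assumed reference format: six digits in angle brackets, e.g. <000236>.
IMAGE_REF = re.compile(r"<(\d{6})>")


def render_mixed(field, image_dir="graphs"):
    """Render a stored text field as HTML.

    Unicode text is passed through (HTML-escaped), and each image
    reference is replaced by an <img> tag pointing at a GIF file.
    """
    parts = []
    last = 0
    for match in IMAGE_REF.finditer(field):
        # Text between the previous reference and this one.
        parts.append(escape(field[last:match.start()]))
        number = match.group(1)
        parts.append(
            f'<img src="{image_dir}/{number}.gif" '
            f'alt="non-standard graph {number}">'
        )
        last = match.end()
    # Text after the final reference.
    parts.append(escape(field[last:]))
    return "".join(parts)
```

For a made-up stored value such as `王<000236>鼎`, the function keeps the two Unicode characters as text and places an image tag for graph 000236 between them. Storing references inline in this way keeps the attribute an ordinary text field, at the cost of reserving the angle-bracket pattern for image references.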

Development of the Database

Content for the Database of Chinese Bronze Inscriptions will be developed by faculty and graduate students in the Department of East Asian Languages and Civilizations at the University of Chicago. We also hope to collaborate with other projects that are computerizing paleographic material. For technical guidance and implementation of the database on the internet, we hope to rely on the Digital Library Development Center.