Electronic Textual Editing: Epigraphy [Anne Mahoney, Perseus Project & Stoa Consortium, Tufts University]


Epigraphy in print

Epigraphy is the study of texts that are inscribed onto durable materials, typically (though not always) stone. These texts include honorary and memorial inscriptions (on statue bases or gravestones), laws and decrees, and even graffiti. Although stone does not rot or burn, it can be broken or worn, and inscribed stones were often re-used as building materials. As a result, many inscriptions are fragmentary texts.

Here we focus on Greek and Latin epigraphy. In the city-states of classical Greece and the Hellenistic world, from the sixth through at least the second century BC, laws and decrees were regularly inscribed on stone stelæ and displayed publicly. In the Roman world, starting in about the third century BC, although decree stelæ were less common, funerary and memorial inscriptions were quite common. These inscriptions of all types are an essential primary source for knowledge about the ancient world. Not only do they tell us about laws, alliances, and famous people, but they also give insight into daily life. The grave inscriptions, for example, can be mined for information about life expectancies, family sizes, and occupations. In addition, inscriptions preserve early forms of the Greek and Latin languages. Even the spelling errors give valuable information about pronunciation.

Epigraphy has been studied systematically since the Renaissance. Inscriptions from various parts of the classical world have been collected into corpora and published in large print volumes, starting in the 19th century. The most important corpora are the Corpus Inscriptionum Latinarum and the Inscriptiones Græcæ, both begun in the 19th century and still being updated today. Large university libraries will have copies of these huge, cumbersome print volumes; smaller libraries often do not. Several selections and anthologies of important inscriptions have been published, some with notes aimed at relative beginners. 1 In all of these print volumes, the texts are presented in modern dress: typefaces use modern letterforms and mixed case; punctuation is added, usually according to the conventions of the editor's native language; and errors may be corrected or broken words completed, with clear indications that these are editorial changes from what appears on the stone. Photographs are included where possible, but the cost of reproducing pictures, especially when these print corpora were getting started, means that most printed inscriptions are not accompanied by photographs of the stone. As a result, it is important that the printed version of the text accurately represent what is physically present and what is an editorial addition.

Epigraphers have a standard convention for marking up texts in print, called the Leiden convention. 2 While it is common to include a diplomatic text as well (that is, an exact transcription), nearly all epigraphic publications include an edited text, nowadays generally using the Leiden system. In this system, brackets of various shapes, underdots, and occasional other marks indicate letters added or corrected by the editor, letters on the stone that the editor judges to be wrong (for example, misspelled words), expansions of abbreviations, blank spaces on the stone, erasures, ancient corrections, and so on. Epigraphers are thus trained to read and write semantically encoded texts, and therefore often find XML markup a natural extension of what they are already doing.

A brief description of the Leiden system may be useful. In this system, square brackets enclose letters or words that have been lost from the stone, through breakage or wear, and restored by the editor. Angle brackets enclose letters or words that are not present in the inscription but which, in the editor's judgement, should be. Braces (curly brackets) enclose letters or words that are present in the inscription but are superfluous. In these last two cases, the error could be a spelling or transcription mistake on the part of the stone-cutter, or the editor could be normalizing the language or dialect. Commentary printed along with the text will generally explain which is the case.

Letters are printed with underdots when they are not complete or not clear. This can happen when the stone has been broken or worn. For example, suppose a particular Greek capital letter might be alpha, lambda, or delta. In this case the editor will print the one which is most likely, with a dot under it. Sometimes context makes clear that only one of these letters is really possible. In our example, if the previous letter is nu and the following one is xi, the letter in between must be a vowel, so the disputed triangular form must be an alpha. In such cases, some editors will print the alpha with an underdot, since it would not be unambiguously legible on its own, but others will print an unmarked alpha, since in the particular context there is no ambiguity.

Some ancient inscriptions include deliberate erasures, for example to remove the name of a ‘bad’ Roman emperor after his death (as part of the procedure called damnatio memoriæ). The erasure is marked with double square brackets when the letters can still be read, or when it is at least clear how many letters were once there, with single square brackets otherwise.
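For comparison with the XML approaches discussed below, the Leiden distinctions just described correspond closely to TEI transcription elements. The following sketch shows one common mapping; the element and attribute choices reflect widespread TEI practice, and the sample words are hypothetical:

```xml
<!-- letters lost to damage and restored by the editor: [ ] in Leiden -->
<supplied reason="lost">rex</supplied>

<!-- letters omitted by the stone-cutter and added by the editor: < > in Leiden -->
imp<supplied reason="omitted">e</supplied>rator

<!-- a damaged letter, printed with an underdot in Leiden -->
<unclear reason="damage">a</unclear>

<!-- a deliberate erasure, [[ ]] in Leiden, whose letters can still be read -->
<del rend="erasure">Geta</del>
```

Because each Leiden sign becomes a named element, a stylesheet can later regenerate the conventional brackets and underdots for print or screen display.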

Spacing, punctuation, and capitalization are all added or adjusted silently, as are accents and breathings for Greek. Readers are generally simply expected to know that ancient writing conventions are different from modern. The editor's commentary, however, will usually indicate the appearance of the original text. In particular, many classical Greek stones are written in stoichedon style (with letters arranged in a grid, as if on graph paper), and this fact is important because, if any line is complete, we can know how many letters are broken away from any incomplete lines.

Unfortunately, the Leiden convention is neither universal nor flawless. Since it was only devised in 1931, many published texts pre-date it and use similar, but different conventions. Square brackets may denote additions rather than subtractions, round parentheses may denote additions rather than expansion of abbreviations, and so on. In addition, any given book may or may not include a key to its particular markup system, so readers must become aware of different publishers' preferred styles. Moreover, because a fully-marked text can be cumbersome to read, some publications omit the more complicated markings. This is especially likely in texts intended for beginners and students.

A more important problem, however, is that epigraphical markup, like most markup, encodes an editor's judgement about the text, which may be more or less certain. An editor's supplement could be nearly certain, as in the case of the alpha-lambda-delta example above, or it could be pure conjecture, or anything in between. The Leiden system does not include a mechanism for indicating the editor's confidence in the proposed text. Normally this is discussed in the commentary, of course, but certainty is not encoded in the text itself. Moreover, editors differ in their application of markup. Some will mark any supplement or emendation, even if it is beyond any reasonable doubt; others will leave the obvious emendations unmarked—and the determination of what is ‘obvious’ varies widely.

The first part of this problem, that the markup scheme in general use does not provide a way to encode certainty, can be addressed with the use of the TEI, as we will see below. The second part, that editors will disagree about what is obvious and what is arguable, cannot be solved by any particular markup scheme. We hope, though, that a scheme that provides a structured way to encode the discussion of certainty will ultimately help readers understand what editors know about a given text.

Although we focus on epigraphy here, it should be noted that the editorial and markup conventions for papyrology are similar. Textual criticism also uses similar signs in the text, and the commentary in an epigraphic or papyrological publication may include an apparatus criticus. As a result, all three fields face similar issues and have similar needs. A convention for using the TEI in epigraphy will be largely applicable to papyrology. The use of the TEI is well understood in more conventional textual criticism, which in classics involves establishing a text from several manuscripts, usually dating from the 10th - 15th centuries AD for Greek texts, sometimes also rather earlier for Latin texts. The TEI tags that represent choices among variant readings are less useful in editing inscriptions and papyri, where there is generally only one copy of the text and we do not have the luxury of variants to choose from, but they can be used in collating prior editors' treatments of an inscription.

Digitization projects

Over the last few years, several of the major epigraphic corpora have begun digitization projects. The epigraphic community also hopes to create a unified database of information about all known Greek and Latin inscriptions. A digitized corpus of inscriptions can include several different representations of the inscriptions:
  • photographs of inscriptions;
  • photographs of ‘squeezes’ of inscriptions, which are casts of the stone made in a flexible material like paper or latex;
  • diplomatic transcriptions;
  • edited texts;
  • translations;
  • commentaries.
Many projects also find it convenient to store meta-data about the inscriptions in a database, to facilitate searching. The most useful meta-data fields include the date of the inscription, its language, the types of letter forms in use in it, where it was found, what material it is on, and its size.
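One way to picture such a meta-data record is the following sketch, in the same XML idiom used later in this chapter; every field name and value here is hypothetical, and projects may equally store the same information in relational tables:

```xml
<!-- illustrative meta-data record for one inscription -->
<inscription>
  <date notBefore="-0450" notAfter="-0400"/> <!-- second half of the fifth century BC -->
  <language>grc</language>                   <!-- Greek -->
  <letterForms>stoichedon</letterForms>
  <findspot>Athens, Agora</findspot>
  <material>marble</material>
  <dimensions unit="cm" height="92" width="41" depth="12"/>
</inscription>
```

Whatever the storage format, the useful point is that each field is explicitly named, so that searches on date range, material, or findspot become straightforward.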

Some digitization projects complement existing print projects. Here the goal is to provide more complicated searching than is possible with print indices, and to help produce the printed texts. The digital corpora may be made available on CD with appropriate programs to search and display the texts and images. Some projects plan to distribute their texts on CD instead of in print. Still other projects hope to make their inscriptions available over the web. Whatever the proposed dissemination mechanism, however, all these digitization projects face similar problems.

Digital epigraphy projects recently came together at the Second International Workshop on Digital Epigraphy, held at King's College, London, in July 2002. This workshop, hosted by the EpiDoc Aphrodisias Pilot Project (EPAPP), was part of a continuing discussion among epigraphers about standards and practices for semantic markup, in support of both electronic and print publication. EPAPP is a collaboration between the Aphrodisias team and the EpiDoc Group, based at the Ancient World Mapping Center, University of North Carolina. EpiDoc itself is a set of guidelines and a TEI DTD intended for use by epigraphers. It is likely to become a standard for those epigraphic projects that choose to use XML. EpiDoc was presented to the wider epigraphical community in an informal session at the XII Congressus Internationalis Epigraphiæ Græcæ et Latinæ, the congress of the Association Internationale d'Épigraphie Grecque et Latine (AIEGL), held in Barcelona in September 2002.

Over a dozen different epigraphic digitization projects were represented at the July 2002 workshop, covering all of the classical world and including researchers from half a dozen different countries. Space does not permit detailed discussion of all of them here; the workshop program, including links to the various projects' home pages, is available on line at http://www.kcl.ac.uk/humanities/cch/epapp/epigraphy.htm.

Of the projects represented at this workshop, very few are using XML yet. The Aphrodisias project, which hosted the workshop, is digitizing inscriptions from the Greek city of Aphrodisias in Asia Minor. Although Aphrodisias was first settled as early as the third millennium BC and the city continued to exist well into the thirteenth century AD, the project focuses on the inscriptions of late antiquity (see Roueché). This project is the first major test case for the EpiDoc guidelines and DTD. The inscriptions are transcribed using this DTD, then transformed to Leiden markup in HTML for web presentation. A database links the texts, information about the stones that contain them, photographs, and a site plan. The project began as an on-line reprint of Aphrodisias in Late Antiquity (Roueché), the book in which many of these inscriptions were first published, but has expanded to include more texts and more photographs than could have been published in print. In addition, the archive includes pages from the notebooks of John Gandy Deering, who transcribed some of the Aphrodisias inscriptions in the winter of 1812-1813. While such notebooks are rarely published in printed epigraphic editions, they can easily be added to an electronic edition, including both page images and transcriptions.

At Oxford, the Centre for the Study of Ancient Documents is publishing the inscriptions from Roman Britain, which are generally on small tablets of wood or lead. Because the history of the Roman presence in Britain is part of the school curriculum there, the on-line publication of these texts must be accessible to children as young as eight or nine years old as well as to scholars. Publication therefore includes print volumes, an on-line edition, and an on-line exhibition for non-specialists. The texts are marked up in EpiDoc, and archaeological data and other meta-data are stored in a relational database. The texts are converted to Leiden-like markup for HTML display, with some adjustment for the limited capacities of HTML: since standard HTML does not include underdots, for example, these letters will instead be shown in a lighter color. This corpus includes the well-known Vindolanda tablets, from the military base at Vindolanda near Hadrian's Wall in northern England. The tablets, small pieces of wood with text in ink, date from around the end of the first century AD; they include letters, military records, financial accounts, and other types of documents. Archaeological work continues at Vindolanda, so the corpus of texts continues to grow. As a result, the CSAD team expects to update the on-line edition of these inscriptions with new texts and the corresponding photographs, transcriptions, and translations, as well as with corrections or changes to earlier readings based on new materials. The print edition, naturally, will not change, and the editors must decide whether the on-line edition is the printed text with a separate list of corrections, or the most current text with a separate page of change history.

The oldest and largest epigraphic corpora, on the other hand, have so much material that conversion to XML would be prohibitively difficult. Both Inscriptiones Græcæ (IG) and the Corpus Inscriptionum Latinarum (CIL) began collecting inscriptions in the nineteenth century—before the codification of the Leiden markup system, let alone XML. IG includes some 50,000 inscriptions, published in 49 large print volumes. The 70 volumes of CIL include about 180,000 inscriptions. Each collection also includes photographs, squeezes, and notebooks. CIL uses an XML-like markup internally, based on the needs of the publisher who handles the print edition, while IG gives its printer word-processor files. CIL has a large relational database storing bibliography and meta-data, not the actual texts. The texts themselves are transcribed and marked up in various systems, some influenced by customs from papyrology, some quite old. Converting these texts to XML would probably require a semi-automatic process, starting with a more or less accurate interpretation of the existing markup and ending with careful, manual proofreading. Given the number of texts involved and the variety of markup conventions applied over the better part of two centuries, this is a daunting prospect. Although both projects would be happy to make at least some of their texts and photographs available on line, the volume of material and the responsibility for continuing the print series make this impossible at present.

Some general considerations emerge from comparison of the various projects. Whether they are using databases, XML, or hand-written notes, all epigraphic projects have structured data, and whether they are marked up in Leiden, TEI, or an older system, all epigraphic texts are structured as well. When a new project begins, or an existing one considers digitization, it faces several basic questions, some editorial and some technological.

The first question an epigraphic project must answer is whether the text or the physical object on which it is written should be considered the primary focus. That is, does the project study texts which happen to be written on stones, or stones which happen to contain texts? Either approach is possible. Often projects begin from the stones because one stone may contain several different inscriptions. If the text is primary, then information about the stone itself will be repeated (or referenced repeatedly) with each of the texts it contains. It is rare, on the other hand, for a given text to appear on more than one inscription, so it is rarely necessary to repeat information about a text in the records for different stones. A project whose main concern is with language, however, might prefer to treat the texts as primary and the stones as secondary.

The next major question is the scope of the collection. A project might work on inscriptions from a particular place or time, like the city of Aphrodisias or Roman Britain, or might try to catalog all known inscriptions in a given language, as the major corpora CIL and IG do. The number of inscriptions the project expects to be working with will affect choices about how to manage them. If there will only be a few dozen texts, then markup, meta-data creation, and other operations can be done by hand. If the project must catalog, index, and display hundreds or thousands of texts, however, automated tools will be very useful. The project probably needs a corpus editor, 3 who will determine what can be automated and what must be done by hand.

After these fundamental editorial questions come the technological questions, of which the most important may be how to store the meta-data about the inscriptions, the objects that contain them, and the project's photographs, transcriptions, and editions. Although basic information about a TEI text goes into its TEI header—who transcribed it, whether it has been published before, and so on—it may also be convenient to store this information in the same place and format as the meta-data about the rest of the project's collections. These issues are not specific to epigraphical projects, of course, but common to any project that deals with texts, photographs, and physical objects. Many epigraphic projects, like many more general digital libraries, use relational databases for their meta-data. Some store the actual texts in the same database records, marked up in Leiden style or in XML. A sufficiently robust database system can even store photographs.

The technical question most relevant to the present chapter is how to encode and store the text. 4 There is a growing consensus that XML is the best way to encode the text, and the EpiDoc guidelines for using the TEI are emerging as a standard. XML is not the only choice, however. Projects may also use the typographical marks of the Leiden system, which has the advantage of being entirely familiar to the epigraphers who create and maintain the corpus. Unfortunately, the special brackets, underdots, and other typographical devices may not be supported by the character set of the computer system to be used. The Packard Humanities Institute inscription database, compiled and maintained by the Cornell Greek Epigraphy Project, uses TLG Beta-code to get around this problem. 5 In this system, every non-alphabetic character is represented by a series of characters: a marker to indicate what type of character is being represented (punctuation, bracket, metrical symbol, or the like) and a string of digits to specify the character. Thus ‘[4’ indicates a left double square bracket, ‘]4’ its matching right bracket, and ‘#322’ the chi-rho symbol.

If a project decides to use XML, it must then determine what DTD (or schema) to use. As in every other humanities discipline, the basic question is whether to use a general DTD, like the TEI, or to write a project-specific one. Exactly the same issues arise in the design of the database tables or other organizational schema for meta-data. Some projects want databases or DTDs that are extremely specific to the types of inscriptions they are dealing with. For example, the projects that work largely or exclusively with funerary inscriptions want a standard way to record the age and sex of the person being memorialized, while projects that work with legal texts do not need this. Other projects prefer not to write and maintain their own DTD. The EpiDoc TEI guidelines are a good compromise here: the EpiDoc DTD is the TEI, with a few epigraphically oriented modifications made using the standard TEI mechanisms. There are also projects that use their own versions of the TEI, for example the project working on the Protestant Cemetery in Rome (Rahtz).
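In the TEI of this period (P4), such project-specific modifications are made by redefining parameter entities in small local extension files rather than by editing the TEI DTD itself. A minimal sketch of the mechanism follows; the extension file names are hypothetical:

```xml
<!-- driver file: select the tagsets the project needs and hook in
     local extension files before including the main TEI DTD -->
<!ENTITY % TEI.XML "INCLUDE">
<!ENTITY % TEI.transcr "INCLUDE"> <!-- transcription tagset: supplied, unclear, del, ... -->
<!ENTITY % TEI.extensions.ent SYSTEM "epidoc-ext.ent">
<!ENTITY % TEI.extensions.dtd SYSTEM "epidoc-ext.dtd">
```

The extension files can then suppress unwanted elements, restrict attribute values, or declare the handful of new elements a project needs, which is exactly how the EpiDoc DTD is built on top of the TEI.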

A key incentive for using XML is the ability to exchange data with other projects. Epigraphic corpora may overlap, as the time periods or geographical areas they focus on may intersect. It is therefore convenient to be able to divide the labor of photographing, cataloging, and editing the inscriptions, and that means the resulting data must be in compatible forms. Using the same DTD in the same way makes this relatively easy. While projects that store their texts as word-processor files with Leiden markup can also share data, they must agree explicitly on the details of text layout, file formats, and character encodings.

Text management must also take into account the writing systems used in the corpus. If a project is only dealing with inscriptions in Latin written in the Roman alphabet, 6 then the writing system of the inscriptions is essentially the same as that of the Western European modern languages used for meta-data, translations, and commentaries. Most classical epigraphy projects, however, have to deal with Greek, and projects dealing more generally with the ancient Mediterranean may have texts in Etruscan or Umbrian, which use similar alphabets, Aramaic or Hebrew, which use very different alphabets and are of course written right to left, or even languages written in cuneiform or hieroglyphic scripts.

The first approach to the writing system problem is often to use different fonts, as one might with a word processor. This approach is appealing since if the project ever wants to print its texts, it will sooner or later need fonts for the different scripts anyway. It is also analogous to the way texts are presented in print: we recognize that an inscription is in Greek because we see it printed in the Greek alphabet. There are long-standing conventions for the use of boldface, spaced type, and other typographic devices to represent the quasi-Roman alphabets used by the other ancient languages of Italy, like Oscan or Umbrian. Yet the font-based approach assumes that all the software that will manipulate a given text can recognize font-change markers. Some database packages do not allow change of font within a single text field, for example, and some export or interchange formats strip font information.

Unicode is a better approach when the scripts of interest are all supported, which will be the case for any script still in use by a living language (for example, Greek or Hebrew). Hieroglyphic and cuneiform characters are not currently part of the Unicode standard, however, and even in supported scripts some particular old characters may not be available. In Greek inscriptions, for example, numerals are often symbols composed from the first letter of the word for the number; ‘fifty’ would be represented as π for pente (= 5) combined with δ for deka (= 10). These acrophonic numerals are generally used in print publications of inscriptions (see for example Meiggs and Lewis, no. 72), but they are not yet part of the Unicode standard.

With XML, it is possible to define either elements or entities for unsupported characters. If the DTD contains an element called, say, <char>, and if the project has a controlled vocabulary for its attributes, the acrophonic numeral for 50 might be expressed as <char type="acrophonic 50" font="numfont" pos="123"/>, where ‘numfont’ names a (hypothetical) font in which this character is available and ‘pos’ is the character position of that character in that font. Alternatively, the project might define an entity like &acro50; to represent this character. Either way, the XML text notes that here is the acrophonic numeral for 50, and the later rendering of the text for display or printing can substitute the appropriate character in a known font, a picture of the character, or even a numeral from a different system (the Greek alphabetic system, Arabic digits), depending on the facilities available in the target medium and on the audience for this version of the text. Approaches like these, however, assume that tools are available for these conversions; some application, transformation, or stylesheet must be told how to interpret the given element or entity.
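The two approaches can be sketched side by side. The element, entity, and font names are the chapter's own hypothetical examples, and the Private Use Area code point is an arbitrary illustration (Unicode has since added the acrophonic numerals, in its Ancient Greek Numbers block):

```xml
<!-- element approach: the markup records which character is meant;
     a stylesheet or application decides how to render it -->
<p>weight: <char type="acrophonic 50" font="numfont" pos="123"/> talents</p>

<!-- entity approach: declared once, e.g. in the DTD's internal subset -->
<!ENTITY acro50 "&#xE050;"> <!-- arbitrary Private Use Area code point -->

<!-- ... and referenced wherever the character occurs -->
<p>weight: &acro50; talents</p>
```

The element form carries its meaning in the document itself; the entity form is terser but pushes the meaning into the declaration, which must travel with the text.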

Because so many epigraphy projects deal with large numbers of small texts, whereas literary projects in the classics more often have relatively few larger texts (for example, a few dozen dramas or a couple of epics), epigraphers have been quick to recognize the benefits of digitization for searching and for global manipulation of a corpus. Although many digital epigraphy projects pre-date XML, they are beginning to adopt it, and EpiDoc is emerging as a method.

EpiDoc: a TEI DTD for epigraphy

The EpiDoc initiative, under the leadership of Tom Elliott of the Ancient World Mapping Center, University of North Carolina, is working out ways to encode epigraphic data with the TEI. EpiDoc's basic assumption is that ‘Ancient epigraphic texts ought to be widely available in digital form for sharing and use in a variety of environments for a variety of scholarly and educational purposes. Individuals, organizations and projects require digital epigraphic texts for personal or internal use as well; if standard tools and formats were available, such needs would be more easily met’ (EpiDoc Collaborative). The obvious standard for sharing and presenting texts is XML. Rather than writing a DTD for epigraphy from scratch, moreover, the EpiDoc group uses the TEI ‘because TEI has already addressed many of the taxonomic and semantic challenges faced by epigraphers, because the TEI-using community can provide a wide range of best-practice examples and guiding expertise, and because existing tooling built around TEI could easily lead to early, effective presentation and use of TEI-encoded epigraphic texts.’

The EpiDoc approach has already been adopted by several epigraphic projects, and others are considering it. As noted above, Aphrodisias and the Roman Britain corpus use EpiDoc for their texts. The Dêmos project, directed by Christopher Blackwell of Furman University, is a library of materials about Athenian democracy which will include Greek inscriptions, marked up with EpiDoc, among the primary sources. Epigrapher Michael Arnush of Skidmore College is writing translations and commentaries for these inscriptions. The corpus of Macedonian and Thracian inscriptions being compiled at KERA, the Research Center for Greek and Roman Antiquity at Athens, is beginning to use the TEI and may choose to use EpiDoc.

The main product of the EpiDoc Collaborative is a set of guidelines detailing how to use the TEI for epigraphy in a standard way. There is also an EpiDoc DTD, which is an extension of the TEI in the standard way, restricting the allowable values for certain attributes, suppressing unused elements, and adding a very small number of additional elements. The guidelines suggest what features to mark, which of a set of complementary tags to use for them (for example <abbr> and <expan>), and what to call the structural parts of an epigraphic publication. Projects that follow these guidelines exactly will be able to share not only their texts but also their tools. Applications written to process EpiDoc texts—transformations, stylesheets, specialized search engines, index generators, and so on—will only need to handle the cases provided for in the guidelines: they must process <abbr>, for example, but need not deal with <expan>.

The current version of the guidelines document is not complete; several sections remain to be written, and some are being revised based on experience. The basic philosophy of the guidelines, however, is clear. The simplest rule is that whatever is actually on the stone is in the content of the elements, while editorial changes and additions are in attributes. Thus EpiDoc prefers <abbr>, with the expansion in an attribute, to <expan>, with the expansion in the content and the actual abbreviation in an attribute. The next rule is that EpiDoc follows the intended semantics of the TEI Guidelines: ‘we are not re-writing TEI,’ as the EpiDoc guidelines state (sec. 6). Finally, everything that can be expressed in the Leiden system, or other similar schemes, must be expressible in EpiDoc. Moreover, there must be a one-to-one match between markup elements in the Leiden system (symbols and character formatting) and those in EpiDoc, so that the two markup schemes will be mechanically interconvertible.
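The ‘stone in content, editor in attributes’ rule can be illustrated with an abbreviation; the sample word is hypothetical:

```xml
<!-- EpiDoc preference: the letters actually cut on the stone are the
     element content, and the editor's expansion is an attribute -->
<abbr expan="Imperatori">Imp</abbr>

<!-- the alternative, disfavored in EpiDoc: the expansion as content,
     with the stone's actual letters relegated to an attribute -->
<expan abbr="Imp">Imperatori</expan>
```

Both encodings record the same information; the preference simply keeps the transcription of the object itself in the document's character data, where tools and readers expect to find it.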

An EpiDoc text is structured as a series of un-numbered <div>s, distinguished by their type attributes. Typical divisions might include the text itself, a translation, a description of the stone or other object where the text is, a commentary, and a bibliography. The EpiDoc DTD, unlike unmodified TEI, introduces a finite set of possible values for the type of a <div>, so that all users of a text can distinguish, say, the commentary from the description or the archaeological history. To ensure that this structure is used, the EpiDoc DTD does not include the numbered <divN> elements at all.
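In outline, the body of an EpiDoc document might look like the following; the type values shown are typical of EpiDoc practice, though the exact controlled vocabulary is defined by the guidelines:

```xml
<body>
  <div type="description">...</div>   <!-- the stone or other object -->
  <div type="edition">...</div>       <!-- the edited text itself -->
  <div type="translation">...</div>
  <div type="commentary">...</div>
  <div type="bibliography">...</div>
</body>
```

Because the set of type values is closed, a stylesheet or search engine can rely on finding, say, the edition of every inscription under the same label.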

The EpiDoc group is also working on tools, for example XSL stylesheets, to facilitate working with EpiDoc texts; these tools can be found at the EpiDoc home page at the Ancient World Mapping Center. One tool that will be particularly important to wide acceptance of EpiDoc is a transformer that can convert between Leiden format and EpiDoc XML in either direction; this is currently under development. This tool will help projects convert their existing texts to EpiDoc format, and it will also promote the use of EpiDoc as an exchange mechanism: two projects that do not want to convert their own holdings to XML can nonetheless use XML to give texts to each other. An additional desideratum is an editor with specific support for EpiDoc, as opposed to a general XML editor that can read the DTD, by analogy with the HTML editors that have the HTML DTD built in and do not claim to provide general support for other DTDs. Such an editor could be tailored to the needs of epigraphers rather than general users, and should help overcome the perception among some epigraphers that XML is ‘too technical’ or ‘too difficult.’

Although the guidelines and DTD are primarily the work of Tom Elliott and his colleagues at UNC, the wider community has been involved from the beginning. Even before the first version of the DTD was prepared, EpiDoc existed in the form of a mailing list, bringing together epigraphers, historians, and humanities computing specialists to discuss how EpiDoc might work. Discussions on this list have ranged from basic philosophical questions to highly technical implementation details.

Disputes about what to mark (like the underdot in the example above, with nu, a broken triangular letter, and xi) do not go away when the texts are encoded in XML instead of in typographic form. One advantage of structured markup, however, is that editors can, if they choose, encode more information about how certain a particular feature is. The date of an inscription, for example, can be encoded as a range of possible dates. EpiDoc includes the TEI <certainty> element and the cert attribute to encourage editors to say whether or not they are completely confident of a given reading. After some discussion, the EpiDoc community decided that certainty should be expressed as a yes-or-no value: either the editor is certain of the reading, or not. The idea of saying ‘I am 95% certain of this letter, 83% certain of this letter, and only 37% certain of this letter’ seemed too complicated, and it was decided that editors should be encouraged to put these details into the commentary, as they have always done. The advance of EpiDoc over the Leiden system here is simply that the editor can note certainty in a standard way in the markup, not merely in the commentary.
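A sketch of how such a yes-or-no judgement might be recorded follows; the attribute placement and sample readings are illustrative, since the guidelines define the exact usage:

```xml
<!-- a restoration the editor regards as beyond reasonable doubt -->
<supplied reason="lost" cert="yes">a</supplied>

<!-- a conjectural restoration; the argument for it belongs in the commentary -->
<supplied reason="lost" cert="no">ai</supplied>
```

A display stylesheet can then, for example, mark only the uncertain supplements with a query, while a search engine can be told to include or exclude conjectural text.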

Other philosophical debates include how much can be assumed from applications that will work with EpiDoc texts, how best to handle characters that are not part of Unicode and will not be added, and how to handle the necessarily imprecise dates given for ancient texts. The archives of the mailing list trace the progress of the guidelines, and the guidelines themselves embody the collective wisdom of a group of practicing epigraphers and XML specialists.


The epigraphic community has a long-established practice of using semantic markup. The markup systems in use have evolved over the past four hundred years, but until relatively recently have always involved special typographical symbols in the text—brackets, underdots, and so on. Some epigraphers see XML as a natural transformation of what they have always done, with all the additional benefits that come from standardization within the community.

The EpiDoc guidelines are emerging as one standard for digital epigraphy with the TEI. EpiDoc is not the only possible way to use the TEI for epigraphic texts, of course, but the tools, documentation, and examples that are growing up around it will make it a good place for new digitization projects to start.

Student anthologies of Greek inscriptions include for example Meiggs and Lewis; Schwenk; and Tod. ILS (Dessau) is still used for Latin.
This convention was first adopted at the 18th International Congress of Orientalists, Leiden, September 1931, by papyrologists, and taken up by epigraphers shortly thereafter. Dow gives historical background and bibliography. Woodhead explains the system from the point of view of a student learning to use inscriptions. Panciera summarizes more recent debates.
In the sense of (Crane and Rydberg-Cox; Rydberg-Cox, Mahoney, and Crane).
Of course projects must also decide what format to use for digital photographs, but this is outside the scope of this discussion.
Beta-code is well known as an encoding for the Greek alphabet, but it also includes representations for all the other characters that appear in the texts of the Thesaurus Linguæ Græcæ.
As opposed to the Greek or Etruscan alphabets, which were occasionally used for Latin by non-native speakers.

Last recorded change to this page: 2007-10-31