<?xml version="1.0" encoding="utf-8"?>
<!--
Copyright TEI Consortium. 
Dual-licensed under CC-by and BSD2 licences 
See the file COPYING.txt for details.
$Date$
$Id$
-->


<?xml-model href="http://tei.oucs.ox.ac.uk/jenkins/job/TEIP5/lastSuccessfulBuild/artifact/P5/release/xml/tei/odd/p5.nvdl" type="application/xml" schematypens="http://purl.oclc.org/dsdl/nvdl/ns/structure/1.0"?>

<div xmlns="http://www.tei-c.org/ns/1.0" type="div1" xml:id="CO" n="6">
<head>Elements Available in All TEI Documents</head>
<p>This chapter describes elements which may appear in any kind of text
and the tags used to mark them in all TEI documents. Most of these
elements are freely floating phrases, which can appear at any point
within the textual structure, although they must generally be contained
by a higher-level element of some kind (such as a paragraph). A few of
the elements described in this chapter (for example, bibliographic
citations and lists) have a comparatively well-defined internal
structure, but most of them have no consistent inner structure of their
own. In the general case, they contain only a few words, and are often
identifiable in a conventionally printed text by the use of typographic
conventions such as shifts of font, use of quotation or other
punctuation marks, or other changes in layout.</p>
<p>This chapter begins by describing the <gi>p</gi> tag used to mark
paragraphs, the prototypical formal unit for running text
in many TEI modules. This is followed, in
section <ptr target="#COPU"/>, by a discussion of some specific problems
associated with the interpretation of conventional punctuation, and the
methods proposed by the Guidelines for resolving ambiguities
therein.</p>

<p>The next section (section <ptr target="#COHQ"/>) describes a number
of phrase-level elements commonly marked by typographic features (and
thus well-represented in conventional markup languages). These include
features commonly marked by font shifts (section <ptr target="#COHQH"/>) and features commonly marked by quotation marks
(section <ptr target="#COHQQ"/>) as well as such features as terms,
cited words, and glosses (section <ptr target="#COHQU"/>).</p>

<p>Section <ptr target="#COED"/> introduces some phrase-level elements
which may be used to record simple editorial interventions, such as
emendation or correction of the encoded text. The elements described
here constitute a simple subset of the mechanisms for encoding
such information (described in full in chapter <ptr target="#PH"/>),
which should be adequate for most commonly encountered situations.</p>

<p>The next section (section <ptr target="#CONA"/>) describes several
phrase-level and inter-level elements which, although often of
interest for analysis or processing, are rarely explicitly identified
in conventional printing. These include names (section <ptr target="#CONARS"/>), numbers and measures (section <ptr target="#CONANU"/>), dates and times (section <ptr target="#CONADA"/>), abbreviations (section <ptr target="#CONAAB"/>), and addresses (section <ptr target="#CONAAD"/>).</p>

<p>In the same way, the following section (section <ptr target="#COXR"/>) presents only a subset of the facilities available
for the encoding of cross-references or text-linkage. The full story
may be found in chapter <ptr target="#SA"/>; the tags presented here
are intended to be usable for a wide variety of simple
applications.</p>

<p>Sections <ptr target="#COLI"/> and <ptr target="#CONO"/> describe
two kinds of quasi-structural elements: lists and notes. These may
appear either within chunk-level elements such as paragraphs, or
between them. Several kinds of list, of arbitrary complexity, are
catered for. The section on notes discusses both notes found in the
source and simple mechanisms for adding annotations of an interpretive
nature during the encoding; again, only a subset of the facilities
described in full elsewhere (specifically, in chapter <ptr target="#AI"/>) is discussed.</p>
<p>Section <ptr target="#COGR"/> introduces some simple ways of
representing graphic or other non-textual content found in a text. A
fuller discussion of the multimedia facilities supported by these
Guidelines may be found in chapters <ptr target="#FT"/> and <ptr target="#SA"/>. </p>
<p>Next, section <ptr target="#CORS"/> describes methods of
encoding within a text the conventional system or systems used when
making references to the text. Some reference systems have attained
canonical authority and must be recorded to make the text usable in
normal work; in other cases, a convenient reference system must be
created by the creator or analyst of an electronic text.</p>
<p>Like lists and notes, the bibliographic citations discussed in
section <ptr target="#COBI"/> may be regarded as structural elements in
their own right. A range of possibilities is presented for the encoding
of bibliographic citations or references, which may be treated as
simple phrases within a running text, or as highly-structured
components suitable for inclusion in a bibliographic database.</p>
<p>Additional elements for the encoding of passages of verse or drama
(whether prose or verse) are discussed in section <ptr target="#CODV"/>.</p>
<p>The chapter concludes with a technical overview of the structure and
organization of the module described here. This should be read in
conjunction with chapter <ptr target="#ST"/>, describing the structure of
the TEI document type definition.</p> 
<div type="div2" xml:id="COPA"><head>Paragraphs</head>
<p>The paragraph is the fundamental organizational unit for all prose
texts, being the smallest regular unit into which prose can be
divided. Prose can appear in all TEI texts, even those that are
primarily of another genre (e.g., verse); thus the paragraph is
described here, as an element which can appear in any kind of
text.</p>
<p>Paragraphs can contain any of the other elements described within
this chapter, as well as some other elements which are specific to
individual text types. We distinguish <term>phrase-level</term>
elements, which must be entirely contained within a paragraph and
cannot appear except within one, from <term>chunks</term>, which can
appear between, but not within, paragraphs, and from
<term>inter-level</term> elements, which can appear either within a
single paragraph or between paragraphs. The class of phrases includes
emphasized or quoted phrases, names, dates, etc. The class of
inter-level elements includes bibliographic citations, notes, lists,
etc. The class of chunks includes the paragraph itself, and other
elements which have similar structural properties, notably the
<gi>ab</gi> (anonymous block) element (described in <ptr target="#SASE"/>), which may be used as an alternative to the paragraph
in some kinds of texts.</p>
<p>Because paragraphs may appear in different base or additional tag
sets, their possible contents may differ in different kinds of
documents. In particular, additional elements not listed in this
chapter may appear in paragraphs in certain kinds of text. However, the
elements described in this chapter are always by default available in
all kinds of text.</p>
<p>The paragraph is marked using the <gi>p</gi> element:
 <specList><specDesc key="p"/></specList></p>
<p>If a consistent internal subdivision of paragraphs is desired, the
<gi>s</gi> or <gi>seg</gi> (<soCalled>segment</soCalled>) elements may
be used, as discussed in chapters <ptr target="#AI"/> and <ptr target="#SA"/>
respectively. More usually, however, paragraphs have no firm internal
structure, but contain prose encoded as a mix of characters, entity
references, phrases marked as described in the rest of this chapter, and
embedded elements like lists, figures, or tables.</p>
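<p>As a minimal sketch of such a subdivision, the following invented
paragraph is divided exhaustively into orthographic sentences using
<gi>s</gi>. The text itself, and the decision to number the sentences
by means of the <att>n</att> attribute, are assumptions made purely for
illustration:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE"><p>
  <s n="1">The rain stopped at noon.</s>
  <s n="2">We set out at once.</s>
</p></egXML>
 </p>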
<p>Since paragraphs are usually explicitly marked in Western texts,
typically by indentation, the application of the <gi>p</gi> tag
usually presents few problems.</p>
<p>In some cases, the body of a text may comprise but a single
paragraph:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COPA-eg-1"><body>
   <p>I fully appreciate Gen. Pope's splendid achievements with their
invaluable results; but you must know that Major Generalships in the
Regular Army, are not as plenty as blackberries.</p>
</body></egXML>
 </p>
<p>This news story shows typically short journalistic paragraphs:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><head>SARAJEVO, Bosnia and Herzegovina, April 19</head><p>Serbs seized more territory in this struggling new country today as
  the United States Air Force ended a two-day airlift of humanitarian
  aid into the capital, Sarajevo.</p>
<p>International relief workers called on European Community nations
  to step up their humanitarian aid to the former Yugoslav republic,
  in conjunction with new American aid flights if necessary.</p>
<p>A special envoy from the European Community, Colin Doyle, harshly
  condemned the decision by Serbs to shell Sarajevo on Saturday night
  during a visit to the Bosnian capital by a senior American official,
  Deputy Assistant Secretary of State Ralph R. Johnson.</p>
<p>...</p></egXML>
 </p>
<p>The following extract from a Russian fairy tale demonstrates
how other phrase-level elements (in this case <gi>q</gi> elements
representing direct speech; see section <ptr target="#COHQQ"/>)
may be nested within, but not across, paragraphs:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COPA-eg-02"><p>A fly built a castle, a tall and mighty castle.
There came to the castle the Crawling Louse. <q>Who,
who's in the castle?  Who, who's in your house?</q>
said the Crawling Louse. <q>I, I, the Languishing Fly.
And who art thou?</q><q>I'm the Crawling Louse.</q>
</p>
<p>Then came to the castle the Leaping Flea. <q>Who,
who's in the castle?</q> said the Leaping Flea. <q>I,
I, the Languishing Fly, and I, the Crawling Louse. And
who art thou?</q><q>I'm the Leaping Flea.</q>
</p>
<p>Then came to the castle the Mischievous Mosquito.
<q>Who, who's in the castle?</q> said the Mischievous
Mosquito. <q>I, I, the Languishing Fly, and I, the
Crawling Louse, and I, the Leaping Flea. And who art
thou?</q><q>I'm the Mischievous Mosquito.</q>
</p></egXML>
 </p>

<specGrp xml:id="DCOPA" n="Paragraph">
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/p.xml"/>
</specGrp>

</div>
<div type="div2" xml:id="COPU"><head>Treatment of Punctuation</head>

<p>Punctuation marks cause two distinct classes of problem for text
markup: the marks may not
be available in the character set used, and they
may be significantly ambiguous. To some extent, the availability of
the Unicode character set addresses the first of these problems, since
it provides specific code points for most punctuation marks, and also
the second to the extent that it distinguishes glyphs (such as stop,
comma, and hyphen) which are used with different functions. 
<!-- Thus, for
example, different Unicode code points are available for the hyphen
used as a minus sign (<val>x2212</val>), as a word breaking hyphen
(<val>x2010</val>), and as a soft or
<soCalled>non-breaking</soCalled> hyphen (<val>x00AD</val>); such
distinctions are not however made in all possible cases, particularly
where older writing systems are concerned. --> Where punctuation
itself is the subject of study, the element <gi>pc</gi> (punctuation
character) may be used to mark it explicitly, as further discussed in
<ptr target="#AIPC"/>. Where the character used for a punctuation mark
is not available in Unicode, the <gi>g</gi> element and other facilities described in
chapter <ptr target="#WD"/> may also be used to mark its presence.</p>

<div xml:id="COPU-1"><head>Functions of Punctuation</head>

<p>Punctuation is itself a form of markup, historically introduced to
provide the reader with an indication about how the text should be
read. As such, it is unsurprising that encoders will often wish to
encode directly the purpose for which punctuation was provided, as
well as, or even instead of, the punctuation itself. We discuss some
typical cases below.</p>
<p>The <term rend="noindex">Full stop (period)</term>
may mark (orthographic) sentence boundaries, abbreviations, decimal
points, or serve as a visual aid in printing numbers. These usages can
be distinguished by tagging S-units, abbreviations, and numbers, as
described in sections <ptr target="#SASE"/>, <ptr target="#CONAAB"/>,
and <ptr target="#CONANU"/> respectively. However, there are independent reasons
for tagging these, whether or not they are marked by full
stops, and the polysemy of the full stop itself is perhaps no different from
that of any other character in the writing system. </p>
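<p>By way of illustration only, the following invented sentence
contains three full stops with three different functions: one closes
an abbreviation, one is a decimal point, and one ends the sentence.
Each function is made explicit by the enclosing element rather than by
the character itself. The <att>value</att> supplied on <gi>num</gi> is
an assumption about how an encoder might choose to normalize the
figure:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE"><s><abbr>Dr.</abbr> Brown paid
<num value="3.5">3.50</num> francs for the book.</s></egXML>
 </p>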
<p>The <term rend="noindex">Question mark</term>
and <term rend="noindex">exclamation mark</term>
usually mark the end of orthographic sentences, but may also be used
as a mid-sentence comment by the author (<mentioned>!</mentioned> to express
surprise or some other strong feeling, <mentioned>?</mentioned> to query a word
or expression or mark a sentence as dubious in linguistic discussion).
Such usages may be distinguished by marking S-units, in which case the
mid-sentence uses of these punctuation marks may be left unmarked, or
tagged using the <gi>pc</gi> element discussed in <ptr target="#AILC"/>.
 </p>
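<p>One possible encoding, offered as a sketch rather than as a
recommendation, marks both the S-unit and the mid-sentence exclamation
mark in an invented sentence. The parentheses and the final full stop
are left unmarked here, which is simply one choice among several:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE"><s>He claimed to have read the whole of it (<pc>!</pc>)
in a single evening.</s></egXML>
 </p>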
 <p><term rend="noindex">Dashes</term> are used for a variety of
 purposes: as a mark of omission, insertion, or interruption;
 to show where a new speaker takes over (in dialogue); or to introduce
 a list item. In the last two cases particularly, it is clearly
desirable to mark the function as well as its rendition using the
 elements <gi>q</gi> or <gi>item</gi>, on which see section <ptr
 target="#COHQQ"/>, and section <ptr target="#COLI"/>,
 respectively.</p>
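<p>As a sketch only, suppose that a source introduces each turn of an
invented exchange with a dash. The turns might then be encoded with
<gi>q</gi>, the dashes themselves being omitted from the transcription
and their presence recorded on the <att>rend</att> attribute. The value
<val>dash</val> is a project-specific assumption, not one defined by
these Guidelines:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE"><p><q rend="dash">Is it far to the village?</q></p>
<p><q rend="dash">An hour on foot, no more.</q></p></egXML>
 </p>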

<p><term rend="noindex">Quotation marks</term> may be removed from
text contained by <gi>q</gi> or <gi>quote</gi> elements on editorial
grounds, or they may be marked in a variety of ways; see
the discussion of quotation and related features in section <ptr
target="#COHQQ"/>.</p>

<p><term rend="noindex">Apostrophes</term> must be distinguished from
single quote marks. As with hyphens, this disambiguation is best
performed by selecting the appropriate Unicode character, though it may
also be represented by using appropriate XML markup for quotations as
suggested above. However, apostrophes have a variety of uses. In
English they mark contractions, genitive forms, and (occasionally)
plural forms. Full disambiguation of these uses belongs to the level of
linguistic analysis and interpretation.</p>
<p><term rend="noindex">Parentheses</term>
and other marks of suspension such as dashes or ellipses are often
used to signal information about the syntactic structure of a text
fragment. Full disambiguation of their uses also belongs to the level
of linguistic analysis and interpretation, and will therefore need to
use the mechanisms discussed in chapter <ptr target="#AI"/>.
</p>
<p>Where punctuation marks are disambiguated by tagging their assumed
function in the text (for example, quotation), it may be debated
whether they should be excluded or left as part of the text. In the
case of quotation marks, it may be more convenient to distinguish
opening from closing marks simply by using the appropriate Unicode
character than to use the <gi>q</gi> element, with or without an
indication of rendition. </p>

<p>Where segmentation of a text is performed automatically, the
accuracy of the result may be considerably enhanced by a first pass in
which the function of different punctuation characters is explicitly
marked.  This need not be done for all cases, but only where the
structural function of the punctuation mark (for example as a word
or phrase delimiter) is ambiguous. Thus, dots indicating abbreviation
might be distinguished from dots indicating sentence end, and
exclamation or question marks internal to a sentence distinguished
from those which terminate one. Furthermore, when encoding historical
materials, it may be considered essential to retain the original
punctuation, whether by using an appropriate character code, if this
is available (or using the <gi>g</gi> element where it is not) or by
an explicit encoding using <gi>pc</gi>. The particular method adopted
will vary depending upon the feature concerned and upon the purpose of
the project.
</p> 
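<p>A first pass of this kind might produce something like the
following sketch, in which only the two ambiguous dots of an invented
sentence are tagged and the unambiguous commas are left unmarked. The
values given for <att>force</att> and <att>unit</att> are assumptions
about how a project might record its decisions:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE"><p>Mr<pc force="weak">.</pc> Brown rose, bowed, and
withdrew<pc force="strong" unit="sentence">.</pc></p></egXML>
 </p>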

</div>

<div xml:id="COPU-2"><head>Hyphenation</head>

<p>Hyphenation as a phenomenon is generally of most concern when
producing formatted text for display in print or on screen: different
languages and systems have developed quite sophisticated sets of rules
about where hyphens may be introduced and for what reason. These
generally do not concern the text encoder, since they belong to the
domain of formatting and will generally be handled by the rendition
software in use. In this section, we discuss issues arising from the
appearance of hyphens in pre-existing formatted texts which are being
re-encoded for analysis or other processing. Unicode distinguishes
four characters visually similar to the hyphen, including the
undifferentiated hyphen-minus (U+002D) which is retained for compatibility
reasons. The hard hyphen (U+2010) is distinguished from the minus sign
(U+2212) which is for use in mathematical expressions, and
also from the soft hyphen (U+00AD) which may appear in <soCalled>born
digital</soCalled> documents to indicate places where it is acceptable
to insert a hyphen when the document is formatted. </p>

<p> Historically, the hard hyphen has been used in printed or
manuscript documents for two distinct purposes. In many languages, it
is used between words to show that they function as a single syntactic
or lexical unit. For example, in French, <mentioned>est-ce
que</mentioned>; in English <mentioned>body-snatcher</mentioned>,
<mentioned>tea-party</mentioned> etc. It may also have an important
role in disambiguation (for example, by distinguishing, say, a
<mentioned>man-eating fish</mentioned> from a <mentioned>man eating
fish</mentioned>). Such usages, although possibly problematic when a
linguistic analysis is undertaken, are not generally of concern to
text encoders: the hyphen character is usually retained in the text,
because it may be regarded as part of the way a compound or other
lexical item is spelled. Deciding whether a compound is to be
decomposed into its constituent parts, and if so how, is a different
question, involving consideration of many other phenomena in addition
to the simple presence of a hyphen. </p>

<p> When it appears at the end of a printed or written line, however,
the hard hyphen generally indicates that—contrary to what might be
expected—a word is not yet complete, but continues on the next line
(or over the next page or column or other boundary). The hyphen
character is not, in this case, part of the word, but just a signal
that the word continues over the break. Unfortunately, few languages
distinguish these two cases visually, which necessarily poses a
problem for text encoders. Suppose, for example, that we wish to
investigate a diachronic English corpus for occurrences of "tea-pot"
and "teapot", to find evidence for the point at which this compound
becomes lexicalized. Any case where the word is hyphenated across a
linebreak, like this: <eg xml:space="preserve"><![CDATA[tea-
pot]]></eg> is entirely ambiguous: there is simply no way of deciding
which of the two spellings was intended.
</p>

<p>As elsewhere, therefore, the encoder has a range of choices:
<list>
<item>They
may decide simply to remove any end-of-line hyphenation from the
encoded text, on the grounds that its presence is purely a secondary
matter of formatting. This will obviously apply also if line endings
are themselves regarded as unimportant.</item>
<item>Alternatively, they may decide to record the presence of the
hyphen, perhaps on the grounds that it provides useful morphological
information; perhaps in order to retain information about the visual
appearance of the original source. In either case, they need to decide
whether to record it explicitly, by including an appropriate
punctuation character in the text data, or implicitly by supplying an
appropriate symbolic value for one or more of the attributes on the
<gi>lb</gi> or other milestone element used to record the fact of the line
 division. If the hyphen is included in the character data of the TEI document, it might be marked up using the <gi>pc</gi>
  (punctuation character) tag, which allows the encoder to express
  information about its function as a separator, through the <att>force</att> attribute
 (see <ptr target="#AIPC"/>).</item>
</list>
A similar range of possibilities applies equally to the representation of
other common punctuation marks, notably quotation marks, as discussed
in <ptr target="#COHQQ"/>.</p>
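<p>The explicit and implicit options just described might be sketched
as follows for an invented line ending. In the first paragraph the
hyphen is kept in the character data and tagged with <gi>pc</gi>; in
the second it is omitted and its presence recorded on the
<gi>lb</gi> element. The <att>rend</att> value <val>hyphen</val> and
the <att>force</att> value <val>weak</val> are assumptions made for the
purposes of illustration:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE"><!-- explicit: hyphen retained as character data -->
<p>She filled the tea<pc force="weak">-</pc><lb break="no"/>pot again.</p>
<!-- implicit: hyphen recorded only on the milestone -->
<p>She filled the tea<lb break="no" rend="hyphen"/>pot again.</p></egXML>
 </p>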

<p> The <soCalled>text data</soCalled> of which XML documents are
composed is decomposable into smaller units, here called
<term>orthographic tokens</term>, even if those units are not
explicitly indicated by the XML markup. The ambiguity of the
end-of-line hyphen also causes problems in the way a processor
identifies such tokens in the absence of explicit markup. If token
boundaries are not explicitly marked (for example using the
<gi>seg</gi> or <gi>w</gi> elements), for most languages a processor
will rely on character class information to determine where they are
to be found: some punctuation characters are considered to be
word-breaking, while others are not. In XML, the newline character in
text data is a kind of whitespace, and is therefore word
breaking.  However, it is generally unsafe to assume that whitespace
adjacent to markup tags will always be preserved, and it is decidedly
unsafe to assume that markup tags themselves are equivalent to
whitespace. </p>

<p> The <gi>lb</gi>, <gi>pb</gi>, and <gi>cb</gi> elements are notable
exceptions to this general rule, since their function is precisely to
represent (or replace) line, page, or column breaks, which, as noted
above, are generally considered to be equivalent to whitespace. These
elements provide a more reliable way of preserving the lineation,
pagination, etc. of a source document, since the encoder should not
assume that (untagged) line breaks etc. in an XML source file will
necessarily be preserved. </p>

<p>To control the intended tokenization, the encoder may use the
<att>break</att> attribute on such elements to indicate whether or not
the element is to be regarded as equivalent to whitespace. This
attribute can take the values <val>yes</val> or <val>no</val> to
indicate whether or not the element corresponds with a token
boundary. The value <val>maybe</val> is also available, for cases
where the encoder does not wish (or is unable) to determine whether
the orthographic token concerned is broken by the line ending.
</p>
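<p>The three values might be used as in the following sketch of an
invented passage. The first line break falls within a word, the second
coincides with a word boundary, and the third is left undecided because
the source could equally be read as one word or as two. No whitespace
is typed around the <gi>lb</gi> elements, so that tokenization depends
only on the attribute value:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE"><p>It stood be-<lb break="no"/>side the<lb break="yes"/>window,
where any<lb break="maybe"/>one could see it.</p></egXML>
 </p>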

<p>As a final complication, it should be noted that in some languages,
particularly German and Dutch, the spelling of a word may be altered
in the presence of end-of-line hyphenation. For example, in Dutch, the
word <mentioned>opaatje</mentioned> (<gloss>granddad</gloss>),
occurring at the end of a line may be hyphenated as
<mentioned>opa-tje</mentioned>, with a single letter a. An encoder
wishing to preserve the original form of this orthographic token in a
printed text while at the same time facilitating its recognition as
the word <mentioned>opaatje</mentioned> will therefore need to rely on
a more sophisticated process than simply removing the hyphen. This is
however essentially the same as any other form of normalization
accompanying the recognition of variations in spelling or morphology:
as such it may be encoded using the <gi>choice</gi> element discussed
in <ptr target="#COED"/>, or the more sophisticated mechanisms for
linguistic analysis discussed in chapter <ptr target="#AI"/>.
</p>
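<p>A minimal sketch of the first of these approaches, assuming that the
encoder treats the hyphenated form as the original reading and the
unbroken spelling as a regularization of it, might look like this:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE"><choice>
  <orig>opa-<lb break="no"/>tje</orig>
  <reg>opaatje</reg>
</choice></egXML>
The pairing of <gi>orig</gi> with <gi>reg</gi> is only one possible
choice; a project might equally decide to record the regularized form
by some other means.</p>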
</div>
</div>


<div type="div2" xml:id="COHQ"><head>Highlighting and Quotation</head>
<p>This section deals with a variety of textual features, all of
which have in common that they are frequently realized in conventional
printing practice by the use of such features as underlining, italic
fonts, or quotation marks, collectively referred to here as
<term>highlighting</term>. After an initial discussion of this
phenomenon and alternate approaches to encoding it, this section
describes ways of encoding the following textual features, all
of which are conventionally rendered using some kind of highlighting:
<list rend="bulleted">
<item>emphasis, foreign words and other linguistically distinct uses
of highlighting</item>
<item>representation of speech and thought, quotation, etc.</item>
<item>technical terms, glosses, etc.</item></list>
 </p>
<div type="div3" xml:id="COHQW"><head>What Is Highlighting?</head>
<p>By <mentioned>highlighting</mentioned> we mean the use of any
combination of<index><term>highlighting</term></index> typographic
features (font, size, hue, etc.) in a printed or written text in order
to distinguish some passage of a text from its surroundings.<note place="bottom">Although the way in which a spoken text is performed
(for example, the voice quality, loudness, etc.) might be regarded as
analogous to <soCalled>highlighting</soCalled> in this sense, these
Guidelines recommend distinct elements for the encoding of such
<soCalled>highlighting</soCalled> in spoken texts. See further section
<ptr target="#TSSASH"/>.</note> The purpose of highlighting is
generally to draw the reader's attention to some feature or
characteristic of the passage highlighted; this section describes the
elements recommended by these Guidelines for the encoding of such
textual features.
 </p>
<p>In conventionally printed modern texts, highlighting is often
employed to identify words or phrases which are regarded as being one or
more of the following:
<list rend="bulleted">
<item>distinct in some way—as foreign, dialectal,
archaic, technical, etc.</item>
<item>emphatic, and which would for example be stressed when spoken</item>
<item>not part of the body of the text, for example cross-references,
titles, headings, labels, etc.</item>
<item>identified with a distinct narrative stream, for example an
internal monologue or commentary.</item>
<item>attributed by the narrator to some other agency, either within the
text or outside it:  for example, direct speech or quotation.</item>
<item>set apart from the text in some other way:  for example,
proverbial phrases, words mentioned but not used, names of persons and
places in older texts, editorial corrections or additions, etc.</item></list>
 </p>
<p>The textual functions indicated by highlighting may not be rendered
consistently in different parts of a text or in different texts. (For
example, a foreign word may appear in italics if the surrounding text is
in roman, but in roman if the surrounding text is in italics.)  For this
reason, these Guidelines distinguish between the encoding of rendering
itself and the encoding of the underlying feature expressed by it.
 </p>
<p>Highlighting as such may be encoded by using one of the global
attributes <att>rend</att>, <att>rendition</att>, or <att>style</att>
(see further <ptr target="#STGA"/>).  This allows the encoder both to
specify the function of a highlighted phrase or word, by selecting the
appropriate element described here or elsewhere in the Guidelines, and
to further describe the way in which it is highlighted, by means of an
attribute. If the encoder wishes to offer no interpretation of the
feature underlying the use of highlighting in the source text, then
the <gi>hi</gi> element may be used, which indicates only that the
text so tagged was highlighted in some way.
<specList>
<specDesc key="hi"/>
</specList>
The <gi>hi</gi> element is provided by the <ident type="class">model.hiLike</ident> class. </p>

<p>The possible values carried by the <att>rend</att> attribute are
not formally defined in this version of the Guidelines. It may be used
to document any peculiarity of the way a given segment of text was
rendered in the original source text, and may thus express a very
large range of typographic or other features, by no means restricted
to typeface, type size, etc. The <att>style</att> attribute, by
contrast, defines the way the source text was rendered using a
formally defined style language, such as the W3C standard Cascading
Stylesheet Language (<ptr target="#CSS1"/>). The complementary
<att>rendition</att> attribute is used to point to one or more
fragments expressed using such a language which have been predefined
in the TEI header using the <gi>rendition</gi> element discussed in
section <ptr target="#HD57"/>.
 </p>
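<p>The difference between the three attributes may be sketched as
follows, using an invented italicized word. The <att>rend</att> value
<val>italic</val> and the identifier <val>CO-sketch-italic</val> are
assumptions chosen for this illustration only, and the
<gi>rendition</gi> element would in practice appear in the TEI header
rather than alongside the text:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE"><!-- in the TEI header -->
<rendition xml:id="CO-sketch-italic" scheme="css">font-style: italic</rendition>
<!-- in the text: three alternative encodings of the same word -->
<p>a word <hi rend="italic">highlighted</hi> in the source</p>
<p>a word <hi style="font-style: italic">highlighted</hi> in the source</p>
<p>a word <hi rendition="#CO-sketch-italic">highlighted</hi> in the source</p></egXML>
 </p>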

<p>Where it is both appropriate and feasible, these Guidelines recommend
that the textual feature marked by the highlighting should be encoded,
rather than just the simple fact of the highlighting. This is for the
following reasons:
<list rend="bulleted">
<item>the same kind of highlighting may be used for different purposes
in different contexts</item>
<item>the same textual function may be highlighted in different ways in
different contexts</item>
<item>for analytic purposes, it is in general more useful to know the
intended function of a highlighted phrase than simply that it is
distinct.</item></list>
 </p>
<p>In many, if not most, cases the underlying function of a
highlighted phrase will be obvious and non-controversial, since the
distinctions indicated by a change of highlighting correspond with
distinctions discussed elsewhere in these Guidelines. The elements
available to record such distinctions are, for the most part, members
of the <ident type="class">model.emphLike</ident> class. This and the
<ident type="class">model.hiLike</ident> class mentioned above
constitute the <ident type="class">model.highlighted</ident> class,
which is a phrase-level class. Members of this class may appear
anywhere within paragraph-level elements.</p>
<p>The distinction between the two classes is simple, and typified by
the two elements <gi>hi</gi> and <gi>emph</gi>: the former marks
simply that a passage is typographically distinct in some way, while
the latter asserts that a passage is linguistically emphasized for
some purpose. These two properties, though often combined, are not
identical. It should be recognized, however, that cases do
exist in which it is not economically feasible to mark the underlying
function (e.g. in the preparation of large text corpora), as well as
cases in which it is not intellectually appropriate (as in the
transcription of some older materials, or in the preparation of
material for the study of typographic practice). In such cases, the
<gi>hi</gi> element or some other element from the <ident type="class">model.hiLike</ident> class should be used.
 </p>
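<p>The contrast may be sketched with a single invented sentence encoded
in two ways. In the first version the encoder asserts that the
italicized word is linguistically emphasized; in the second the encoder
records only that it is typographically distinct. The <att>rend</att>
value <val>italic</val> is an assumption, not a value defined by these
Guidelines:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE"><!-- interpretation offered -->
<p>I <emph rend="italic">never</emph> said that.</p>
<!-- no interpretation offered -->
<p>I <hi rend="italic">never</hi> said that.</p></egXML>
 </p>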
<p>Elements which are sometimes realized by typographic distinction but
which are not discussed in this section include <gi>title</gi>
(discussed in section <ptr target="#COBI"/>) and <gi>name</gi> (discussed
in section <ptr target="#CONARS"/>).
 </p></div>
<div type="div3" xml:id="COHQH"><head>Emphasis, Foreign Words, and Unusual Language</head>
<p>This subsection discusses the following elements:
<specList><specDesc key="foreign"/><specDesc key="emph"/><specDesc key="distinct"/></specList>
These elements are all members of the <ident type="class">model.emphLike</ident> class. </p>
<div type="div4" xml:id="COHQHF"><head>Foreign Words or Expressions</head>
<p>Words or phrases which are not in the main language of the text
should be tagged as such, at least where the fact is indicated in the
text. Where the word or phrase concerned is already distinguished from
the rest of the text by virtue of its function (for example, because
it is a name, a technical term, a quotation, a mentioned word, etc.)
then the global <att>xml:lang</att> attribute should be used to
specify additionally that its language distinguishes it from the
surrounding text. Any element in the TEI scheme may take an
<att>xml:lang</att> attribute, which specifies both the writing system
and the language used by its content (see sections <ptr
target="#STGAla"/> and <ptr target="#CHSH"/> for discussion of this
attribute and its values respectively). Where there is no other
applicable element, the element <gi>foreign</gi> may be used to
provide a peg onto which the <att>xml:lang</att> may be attached.
<egXML xmlns="http://www.tei-c.org/ns/Examples"
 source="#COHQHF-eg-5"><q>Aren't you confusing <foreign
xml:lang="la">post hoc</foreign> with <foreign xml:lang="la">propter
hoc</foreign>?</q> said the Bee Master. <q>Wax-moth only succeed when
weak bees let them in.</q></egXML>
</p>
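<p>Where a more specific element already applies, the language can be
recorded on that element directly. The following sketch uses an
invented sentence (it is not drawn from any source text) to suggest
how <att>xml:lang</att> might be attached to a <gi>quote</gi> and to a
<gi>title</gi> without any need for <gi>foreign</gi>:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE">Her motto was <quote xml:lang="la">festina lente</quote>, and she
kept a copy of <title xml:lang="fr">Les Misérables</title> on her desk.</egXML>
</p>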

<p>The <gi>foreign</gi> element should not be used to represent foreign words
which are mentioned or glossed within the text: for these use the
appropriate element from section <ptr target="#COHQU"/> below. Compare the
following example sentences:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE">John eats a <foreign xml:lang="fr">croissant</foreign> every morning.</egXML>
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE"><mentioned xml:lang="fr">Croissant</mentioned> is difficult to
pronounce with your mouth full.</egXML>
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COHQHF-eg-8">A <term xml:lang="fr">croissant</term> is a crescent-shaped
piece of light, buttery, pastry that is usually eaten for
breakfast, especially in France.</egXML>
 </p>
<p>Elements which do not explicitly state the language of their
content by means of an <att>xml:lang</att> attribute are understood to
inherit a value for it from their parent element. In the general case,
therefore, it is recommended practice to supply a default value for
<att>xml:lang</att> on the root <gi>TEI</gi> or <gi>text</gi> element,
as further discussed in section <ptr target="#STGAla"/>.</p>
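<p>As an illustration of this inheritance (the fragment is invented,
and the choice of <gi>text</gi> as the carrier of the default is only
one possibility), a default language might be declared once and
overridden only where needed:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE"><text xml:lang="en">
 <body>
  <p>She regarded the decision as a <foreign xml:lang="fr">fait accompli</foreign>
  and said nothing more about it.</p>
 </body>
</text></egXML>
Here the <gi>p</gi> element inherits the value <val>en</val> from
<gi>text</gi>, while the <gi>foreign</gi> element overrides it for its
own content.</p>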

<specGrp xml:id="DCOHQ" n="Highlighted phrases">
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/foreign.xml"/>
<specGrpRef target="#DCOHQ1"/><specGrpRef target="#DCOHQ3"/><specGrpRef target="#DCOHQQ"/><specGrpRef target="#DCOHQU"/></specGrp>
 </div>
<div type="div4" xml:id="COHQHE"><head>Emphatic Words and Phrases</head>
<p>The <gi>emph</gi> element is provided to mark words or phrases
which are <emph>linguistically</emph> emphatic or stressed. Text which
is only typographically <soCalled>emphasized</soCalled> falls into the
class of highlighted text, and may be tagged with the <gi>hi</gi>
element. In printed works, emphasis is generally indicated by devices
such as the use of an italic font, a large typeface, or extra wide
letter spacing; in manuscripts and typescripts, it is usually
indicated by the use of underlining.  As the following examples
demonstrate, an encoder may choose whether or not to make explicit the
particular type of rendition associated with the emphasis. If a source
text consistently renders a particular feature (e.g. emphasis or words
in foreign languages) in a particular way, the rendering associated
with that feature may be described in the TEI header using the
<gi>rendition</gi> element.  The <att>rend</att>,
<att>rendition</att>, or <att>style</att> attributes may then be used
to describe examples which deviate from the norm. For example,
assuming that the TEI header has defined a default rendering for the
<gi>emph</gi> element, the following encoding would use it: <egXML
xmlns="http://www.tei-c.org/ns/Examples"
 source="#COHQHE-eg-10"><q>Sex, sir, is <emph>purely</emph> a question
of appetite!</q> Tarr exclaimed.</egXML> If on the other hand no such
default has been defined for the element, the encoder may specify it
informally using the <att>rend</att> attribute: <egXML
xmlns="http://www.tei-c.org/ns/Examples"
 source="#COHQHE-eg-11"><q>What it all comes to is this,</q> he said.
<q><emph rend="italic">What does Christopher Robin do in the morning
nowadays?</emph></q></egXML> If the encoder wishes to express
information about the rendition used in the source using a formal
language such as CSS, then the <att>style</att> attribute can be used
in a similar way: <egXML xmlns="http://www.tei-c.org/ns/Examples"
 source="#COHQHE-eg-11"><q>What it all comes to is this,</q> he said.
<q><emph style="font-style: italic">What does Christopher Robin do in
the morning nowadays?</emph></q></egXML>
</p>
<p>In cases where the rendition of a source needs to be indicated 
several times in a document, it may be more convenient to provide a
default value using the <gi>rendition</gi> element  in the header. If
a small number of distinct values are required, it may also be
convenient to define them all by means of a series of <gi>rendition</gi> elements
which can then be referenced from the elements in question by means of
the global <att>rendition</att> attribute:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COHQHE-eg-12"><l>Here Thou, great <name rendition="#italic">Anna</name>!
   whom three Realms obey,</l>
<l>Doth sometimes Counsel take —
   and sometimes <emph rendition="#italic">Tea</emph>.</l>
<!-- in the header ... -->
<rendition xml:id="italic" scheme="css">font-style: italic</rendition>
</egXML>
Further information on the use of the <gi>rendition</gi> element is
provided at <ptr target="#HD57"/>. </p>


<p>The <gi>hi</gi> element is used to mark words or phrases which are
highlighted in some way, but for which identification of the intended
distinction is difficult, controversial, or impossible. It enables an
encoder simply to record the fact of highlighting, possibly describing
it by the use of a <att>rend</att>, <att>style</att>, or
<att>rendition</att> attribute, as discussed above, without however
taking a position as to the function of the highlighting. This may
also be useful if the text is to be processed in two stages:
representing simply typographic distinctions during a first pass, and
then replacing the <gi>hi</gi> elements with more specific elements in
a second pass.
 </p>
<p>Some simple examples:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COHQHE-eg-13"><hi rend="gothic">And this Indenture further witnesseth</hi>
that the said <hi rend="italic">Walter Shandy</hi>, merchant,
in consideration of the said intended marriage ...</egXML>

In this example, the first highlighted phrase uses black letter or
gothic print to mimic the appearance of a legal document, while the
second uses italic to mark <mentioned>Walter Shandy</mentioned> as a
name. In a second
pass, the elements <gi>head</gi> or <gi>label</gi> might be
appropriate for the first use, and the element <gi>name</gi> for the
second.  <egXML xmlns="http://www.tei-c.org/ns/Examples"
 source="#COHQHE-eg-14">The heaviest rain, and snow, and hail, and
sleet, could boast of the advantage over him in only one respect. They
often <hi rend="quoted">came down</hi> handsomely, and Scrooge never
did.</egXML> In this example, the phrase <mentioned>came
down</mentioned> uses inverted commas to indicate a play on
words.<note place="bottom">The Oxford English Dictionary documents the
phrase <mentioned>to come down</mentioned> in the sense <q>to bring or
put down; <hi rend="italic">esp.</hi> to lay down money; to make a
disbursement</q> as being in use, mostly in colloquial or humorous
contexts, from at least 1700 to the latter half of the 19th century.
	</note>
In a second pass, the element <gi>soCalled</gi> might be preferred as
a means of indicating that the narrator is distancing himself from
this usage.
 </p>
<specGrp xml:id="DCOHQ1">
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/emph.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/hi.xml"/>
</specGrp>
</div>
<div type="div4" xml:id="COHQHD"><head>Other Linguistically Distinct Material</head>
<p>For some kinds of analysis, it may be desirable to encode the
linguistic distinctiveness of words and phrases with more delicacy than
is allowed by the <gi>foreign</gi> element. The <gi>distinct</gi>
element is provided for this purpose. Its attributes allow for
additional information characterizing the nature of the linguistic
distinction to be made in two distinct ways. The <att>type</att>
attribute simply associates a user-defined code with the word or
phrase, assigning it to some register, sub-language, etc. No
recommendations as to the set of values for this attribute are provided
at this time, as little consensus exists in the field.
 </p>
<p>Alternatively, the remaining three attributes may be used in
combination to place a word or phrase on a three-dimensional scale
sometimes used in descriptive linguistics, as for example in
<ref target="#CO-BIBL-1">Mattheier et al., 1988</ref>. 
The <att>time</att> attribute places a word or phrase
<term rend="noindex">diachronically</term>,<index><term>diachronic information</term></index>
for example as archaic, old-fashioned, contemporary, futuristic, etc.;
the <att>space</att> attribute places a word or phrase
<term rend="noindex">diatopically</term>,<index><term>diatopic information</term></index>
that is, with respect to a geographical classification, for example as
national, regional, international, etc.; the <att>social</att> attribute
places a word or phrase <term rend="noindex">diastratically</term>,<index><term>diastratic information</term></index>
that is, with respect to a social classification, for example as
technical, polite, impolite, restricted, etc. Again, no recommendations
are made for the values of these attributes at this time; the encoder
should provide a description of the scheme used in the appropriate
section of the header (see section <ptr target="#HD5"/>).
 </p>
<p>Examples:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COHQHD-eg-18">Next morning a boy in that dormitory confided to his
bosom friend, a <distinct type="psSlang">fag</distinct> of
Macrea's, that there was trouble in their midst which
King <distinct type="archaic">would fain</distinct> keep
secret.</egXML>

<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COHQHD-eg-18">Next morning a boy in that dormitory confided to his
bosom friend, a
<distinct time="1900" space="GB" social="publicschool">fag</distinct>
of Macrea's, that there was trouble in their midst which
King <distinct time="archaic">would fain</distinct> keep
secret.</egXML>
Where more complex (or more rigorous) interpretive analyses of the
associations of a word are required, the more detailed and general
mechanisms described in chapter <ptr target="#FS"/> should be preferred to
these simple characterizations. It may also be preferable to record the
kinds of analysis suggested here by means of the simple annotation
element <gi>note</gi> described in section <ptr target="#CONO"/>, or the
<gi>span</gi> element described in section <ptr target="#AISP"/>.
 </p>
<specGrp xml:id="DCOHQ3">
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/distinct.xml"/>
</specGrp>
</div></div>
 <div type="div3" xml:id="COHQQ"><head>Quotation</head>

 <p>One form of presentational variation found particularly frequently in
 written and printed texts is the use of quotation marks. As with the
 typographic variations discussed in the preceding section, it is
 generally helpful to separate the encoding of the underlying textual
 feature (for example, a quotation or a piece of direct speech) from the
 encoding of its rendering (for example, the use of a particular style of
 quotation marks).</p>

 <p>This section discusses the following elements, all of which are often
 rendered by the use of quotation marks:
 <specList>
   <specDesc key="q"/>
   <specDesc key="said" atts="direct aloud"/>
   <specDesc key="quote"/>
   <specDesc key="att.source" atts="source"/>
   <specDesc key="cit"/>
   <specDesc key="mentioned"/>
   <specDesc key="soCalled"/>
 </specList>
The elements <gi>mentioned</gi> and <gi>soCalled</gi> are members of
the class <ident type="class">model.emphLike</ident>; the <gi>q</gi>
and <gi>said</gi> elements are members of the class <ident
type="class">model.qLike</ident> in their own right, while
<gi>cit</gi> and <gi>quote</gi> are members of <ident
type="class">model.quoteLike</ident>, a subclass of <ident
type="class">model.qLike</ident>. The <ident
type="class">model.qLike</ident> class is in turn a subclass of <ident
type="class">model.inter</ident>; hence <gi>q</gi>, <gi>said</gi>,
<gi>cit</gi>, and <gi>quote</gi> are permitted both within and between
paragraph-level elements.</p>

 <p>The most common and important use of quotation marks is, of
 course, to mark <term>quotation</term>, by which we mean simply any
 part of the text which the author or narrator wishes to attribute to
 some agency other than the narrative voice. The <gi>q</gi> element
 may be used if no further distinction beyond this is judged
 necessary. If it is felt necessary to distinguish such passages
 further, for example to indicate whether they are regarded as speech,
 writing, or thought, either the <att>type</att> attribute or one of
 the more specialized elements discussed in this section may be
 used. For example, the element <gi>quote</gi> may be used for written
 passages cited from other works, or the element <gi>said</gi> for
 words or phrases represented as being spoken or thought by people or
 characters within the current work.  The <gi>soCalled</gi> element is
 used for cases where the author or narrator distances himself or herself
 from the words in question without however attributing them to any
 other voice in particular. The <gi>mentioned</gi> element is
 appropriate for a case where a word or phrase is being discussed in
 the body of a text rather than forming part of the text directly.
 </p>
 <p>As noted above, if the distinction among these various reasons why
 a passage is offset from surrounding text cannot be made reliably, or
 is not of interest, then any representation of speech, thought, or
 writing may simply be marked using the <gi>q</gi> element. </p>
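 <p>A minimal sketch of the use of <att>type</att> on <gi>q</gi>,
 based on an invented sentence: the values <val>spoken</val> and
 <val>thought</val> are shown for illustration only, and a project
 would need to document whatever set of values it adopts.
 <egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE"><q type="spoken">I shall be there by noon,</q> she promised.
 <q type="thought">If the train runs at all,</q> she added to herself.</egXML>
 </p>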
 <p>Quotation may be indicated in a printed source by changes in type
 face, by special punctuation marks (single or double or angled
 quotes, dashes, etc.) and by layout (indented paragraphs, etc.), or
 it may not be explicitly represented at all. If these characteristics
 are of interest, one or other of the global <att>rend</att> or
 <att>rendition</att> attributes discussed in section <ptr
 target="#STGA"/> may be used to record them. </p>
 
<p>Quotation marks themselves may, like other punctuation marks, be
 felt for some purposes to be worth retaining within a text, quite
 independently of their description by the <att>rend</att> attribute.
 This should generally be done using the appropriate Unicode
 character, or, if this is not possible, a numeric character reference
 (see <ptr target="#SG-er"/>). If the encoder decides both to retain
 the quotation marks and to represent their function by means of an
 explicit tag such as <gi>quote</gi>, the quotation marks should be
 included within the element, rather than outside it, as in the first
 example below:

 <egXML xmlns="http://www.tei-c.org/ns/Examples"
 source="#COHQQ-eg-23" xml:lang="fr">Adolphe se tourna vers lui :
 <said>— Alors, Albert, quoi de neuf?</said>
 <said>— Pas grand-chose.</said>
 <said>— Il fait beau,</said> dit Robert.</egXML>

Alternatively, since this use of the leading em dash is very common
typographic practice, it may be considered unnecessary to retain it in
the encoding. Its presence in the source might instead be signalled
using one of the attributes <att>rend</att>, <att>style</att>, or
<att>rendition</att>. This kind of rendering might be
predefined using a <gi>rendition</gi> element, which can then be
referenced using the <att>rendition</att> attribute as follows:

 <egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COHQQ-eg-23" xml:lang="fr">Adolphe se tourna vers lui :
 <said rendition="#dashBefore">Alors, 
 Albert, quoi de neuf ?</said>
 <said rendition="#dashBefore">Pas grand-chose.</said>
 <said rendition="#dashBefore">Il fait beau,</said>
 dit Robert.
<!-- ...  within the header  -->
<rendition xml:id="dashBefore" scheme="css" scope="before">content: '— '</rendition>
<!-- ... -->
<quotation marks="none"/></egXML>
 </p>

<p>Whatever policy is  adopted, the encoder should document the
decision in some way, for example by using the <gi>quotation</gi>
element provided in the TEI header (see <ptr target="#HD53"/>) to
indicate that quotation marks have not been retained in the encoding;
their presence in the source is implied by the <att>rendition</att>
attribute values supplied.
</p>
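<p>A declaration of this kind might be sketched as follows. The
wording of the prose description is invented for illustration, and
would need to reflect the policy actually adopted:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE"><editorialDecl>
 <quotation marks="none">
  <p>Leading dashes introducing direct speech have not been retained
  in the transcription; their presence in the source is recorded by
  means of the rendition attribute.</p>
 </quotation>
</editorialDecl></egXML>
</p>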

 <p>Whether or not the quotation marks are suppressed, their presence
 and nature may be described using some appropriate set of conventions
 in the <att>rend</att> attribute. These conventions may be entirely
 idiosyncratic, and hence not necessarily useful for interchange, as
 in the following example:

 <egXML xmlns="http://www.tei-c.org/ns/Examples"
 source="#DSHD-eg-30"><said rend="pre(‘) post(’)">Who-e debel
 you?</said> — he at last said — 
 <said rend="pre(‘) post(’)">you no speak-e,
 damme, I kill-e.</said>  And so saying,
 the lighted tomahawk began flourishing
 about me in the dark.</egXML>
</p>

<p>Such passages might more effectively be encoded without loss of
rendering information by using the <att>rendition</att> attribute and
its associated <gi>rendition</gi> element as described in section <ptr
target="#HD57-1"/>. If the rendition of passages tagged as
<gi>said</gi> is uniform throughout a text, then the <att>render</att>
attribute of the <gi>tagUsage</gi> element in the header may be used
to specify a default rendering, in which case the same section might
simply be tagged: 

 <egXML xmlns="http://www.tei-c.org/ns/Examples"
 source="#DSHD-eg-30">
<said>Who-e debel
 you?</said> — he at last said — 
 <said>you no speak-e,
 damme, I kill-e.</said>  And so saying,
 the lighted tomahawk began flourishing
 about me in the dark.
<!-- in the header -->
<tagsDecl>
  <rendition xml:id="prequote" scheme="css" scope="before">content:"‘";</rendition>
  <rendition xml:id="postquote" scheme="css" scope="after">content:"’";</rendition>
  <namespace name="http://www.tei-c.org/ns/1.0">    
    <tagUsage gi="said" render="#prequote #postquote"/>
  </namespace>
</tagsDecl>
</egXML>
</p>



<p>As  members of the <ident type="class">att.ascribed</ident> class,
 elements <gi>said</gi> and <gi>q</gi>  share the following attribute:
<specList><specDesc key="att.ascribed" atts="who"/></specList>
This may be used to make explicit who is speaking:
 <egXML xmlns="http://www.tei-c.org/ns/Examples"
 source="#COHQQ-eg-23" xml:lang="fr">Adolphe se tourna vers lui :
 <said who="#Adolphe">— Alors, Albert,
 quoi de neuf?</said>
 <said who="#Albert">— Pas grand-chose.</said>
 <said who="#Robert">— Il fait beau,</said>
 dit Robert.
<!-- ... elsewhere in the document -->
 <list type="speakers">
   <item xml:id="Adolphe"/>
   <item xml:id="Albert"/>
   <item xml:id="Robert"/>
 </list></egXML>

 The <att>who</att> attribute may be supplied whether
 or not an indication of the speaker is given explicitly in the
 text. Its value may resemble (as above) a
 normalized form of the speaker's name, but its role is to act as a
 pointer to a location elsewhere in the text, or in another document,
 where data about each speaker may be supplied. While this attribute
 could point to any source of information about the speaker that is
 addressable by a URI, the most appropriate place for such
 information is the <term>participant description</term>
 component of the TEI header, as further discussed in <ptr target="#CCAHPA"/>;
 for simple cases like the above, however, a simple list
 of speakers located in the front or back matter of the text may
 suffice.</p>
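 <p>For comparison, a speaker might instead be documented in the
 participant description. The following is only a sketch: the
 <att>xml:id</att> value is hypothetical, and the <gi>person</gi>
 entry is reduced to a bare name.
 <egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE" xml:lang="fr"><said who="#P-Albert">Pas grand-chose.</said>
 <!-- ... in the TEI header, within the profile description -->
 <particDesc>
  <listPerson>
   <person xml:id="P-Albert">
    <persName>Albert</persName>
   </person>
  </listPerson>
 </particDesc></egXML>
 </p>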
 <p>It may also be useful to distinguish
 representations of speech from representations of thought, in modern
 printed texts often indicated by a change of typeface. The
 <att>aloud</att> attribute is provided for this purpose, as in this
 example:
 <egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COHQQ-eg-25"><said aloud="true">Oh yes,</said> said Henry, <said aloud="true">I mean
 Gordon Macrae, for example…</said> <said aloud="false">Jungian
 Analyst with Winebox! That's what you called him, you callous bastard,
 didn't you? Eh? Eh?</said></egXML>
 </p>
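 <p>The <att>direct</att> attribute listed above may be used in a
 comparable way to separate direct from indirect speech. The sentence
 below is invented, and treating a reported clause as <gi>said</gi> at
 all is an encoding decision which not every project will wish to
 make:
 <egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE"><said direct="true">I will come tomorrow,</said> she said; but later
 she told the others <said direct="false">that she would not come
 after all</said>.</egXML>
 </p>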
 <p>Quoted matter may be embedded within quoted matter, as when one
 speaker reports the speech of another:
 <egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COHQQ-eg-26"><said who="#Wilson">Spaulding, he came down into the office just this day
 eight weeks with this very paper in his hand, and he says:—
 <said who="#WilsonSpaulding">I wish to the Lord, Mr. Wilson, that I was a
 red-headed man.</said></said>
 <!-- ... -->
 <list type="speakers">
   <item xml:id="Wilson">Wilson</item>
   <item xml:id="WilsonSpaulding">Spaulding reported by Wilson</item>
   <!-- ...-->
 </list></egXML>
 </p>
 <p>Direct speech nested in this way is treated in the same way as
 elsewhere: a change of rendition may occur, but the same
 element should be used. An encoder may however choose to distinguish
 the direct speech itself from any quotations of extra-textual matter
 which it contains, as in the following example:
 <egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COHQQ-eg-27"><p><said>The Lord! The Lord! It is Sakya Muni himself,</said> the lama half
 sobbed; and under his breath began the wonderful Buddhist
 invocation:-<said>
 <quote>
   <l>To Him the Way — the Law — Apart —</l>
   <l>Whom Maya held beneath her heart</l>
   <l>Ananda's Lord — the Bodhisat</l>
 </quote>
 And He is here! The Most Excellent Law is here also. My
 pilgrimage is well begun. And what work! What work!</said>
 </p></egXML> 
 </p>
 <p>Quotations from other works are often accompanied by a reference to
 their source. The <gi>cit</gi> element may be used to group together
 the quotation and its associated bibliographic reference, which should
 be encoded using the elements for bibliographic references discussed in
 section <ptr target="#COBI"/>, as in the following example.
 <egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COHQQ-eg-28"><div xml:id="mm01" type="chapter">
   <head>Chapter 1</head>
   <epigraph><cit>
     <quote>
       <l>Since I can do no good because a woman</l>
       <l>Reach constantly at something that is near it.</l>
     </quote>
     <bibl>
       <title>The Maid's Tragedy</title>
       <author>Beaumont and Fletcher</author>
     </bibl>
   </cit></epigraph>
   <p>Miss Brooke had that kind of beauty which seems to be thrown into
   relief by poor dress...</p>
 </div></egXML>
 Like other bibliographic references, the citation associated with a
 quotation may be represented simply by a cross-reference, as in this example:
 <egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COHQQ-eg-29">Lexicography has shown little sign of being affected by the
 work of followers of J.R. Firth, probably best summarized
 in his slogan, <cit>
 <quote>You shall know a word by the company it keeps.</quote>
 <ref>(Firth, 1957)</ref>
 </cit></egXML>
 
 It is also common for quotations to be separated from their bibliographic reference
 by intervening text, which makes the use of <gi>cit</gi> impractical. In such circumstances, 
 the quotation can be linked to its bibliographic reference using the <att>source</att> attribute:
 
   <egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COHQQ-eg-31"><bibl xml:id="tlk_36">Tolkien (1936)</bibl> tells us that 
<quote source="#tlk_36"><title>Beowulf</title> is in fact so interesting as 
poetry, in places poetry so powerful, that this quite 
overshadows the historical content</quote>.</egXML>
   
   The <att>source</att> attribute could also be used to point to a complete bibliographic reference
   in a <gi>listBibl</gi> elsewhere in the document, or in an external document.
</p>
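 <p>A sketch of this arrangement, reusing the quotation from the
 earlier example: the identifier <val>firth_57</val> is hypothetical,
 and the bibliographic entry is deliberately left as brief as the
 reference given in that example.
 <egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE">... probably best summarized in his slogan,
 <quote source="#firth_57">You shall know a word by the company it keeps.</quote>
 <!-- ... elsewhere in the document, for example in the back matter -->
 <listBibl>
  <bibl xml:id="firth_57">Firth, 1957</bibl>
 </listBibl></egXML>
 </p>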
   
   
 <p>Unlike most of the other elements discussed in this chapter, direct
 speech and quotations may frequently contain other high-level elements
 such as paragraphs or verse lines, as well as being themselves contained
 by such elements. Three possible solutions exist for this well-known
 structural problem:
 <list rend="bulleted">
   <item>the quotation is broken into segments, each of which is
   entirely contained within a paragraph</item>
   <item>the quotation is marked up using stand-off markup</item>
   <item>the quotation boundaries are represented by empty
   segment boundary delimiter elements</item>
 </list>
 For further discussion and several examples, see chapter <ptr target="#NH"/>.</p>
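 <p>The first of these approaches might be sketched as follows. The
 text is invented, and the use of the <att>next</att> and
 <att>prev</att> linking attributes to join the fragments is an
 assumption made for this illustration; the chapter cited above should
 be consulted for the recommended mechanisms.
 <egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE"><p><said xml:id="sp-a1" next="#sp-a2">It began to rain before we
 reached the gate,</said> she said.</p>
 <p><said xml:id="sp-a2" prev="#sp-a1">So we turned back, and that was
 the end of the matter.</said></p></egXML>
 </p>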
 <p>Finally, in this section, the element <gi>soCalled</gi> is provided
 for all cases in which quotation marks are used to distance the quoted
 text from the narrator or speaker. Common examples include the
 <soCalled>scare</soCalled> quotes often found in newspaper headlines and
 advertising copy, where the effect is to cast doubt on the veracity of
 an assertion:
 <egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COHQQ-eg-30"><head>PM dodges <soCalled>election threat</soCalled> in interview</head></egXML>
</p>
 <p>The same element should be used to mark a variety of special ironic
 usages. Some further examples follow:
 <egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE">He hated <soCalled>good</soCalled> books.</egXML>
 <egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE"><soCalled>Croissants</soCalled> indeed! toast not good enough for you?</egXML>
 <!-- some "real" examples would be better -->
 <egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COHQQ-eg-33">Although Chomsky's decision that all NL
 sentences are finite objects was never justified by arguments from
 the attested properties of NLs, it did have a certain
 <soCalled>social</soCalled> justification. It was commonly assumed in
 works on logic until fairly recently that the notion
 <mentioned>language</mentioned> is necessarily restricted to finite
 strings.</egXML>
 </p>
 <specGrp xml:id="DCOHQQ" n="Quotation">

<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/said.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/quote.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/q.xml"/>
 <include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/att.source.xml"/>   
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/cit.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/mentioned.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/soCalled.xml"/> </specGrp>
</div>
<div type="div3" xml:id="COHQU"><head>Terms, Glosses, Equivalents, and Descriptions</head>
<p>This section describes a set of textual elements which are
used to provide a gloss, alternate identification, or description of
something.</p>
<p>Technical terms are often italicized or emboldened upon first mention
in printed texts; an explanation or gloss is sometimes given in
quotation marks.  Linguistic analyses conventionally cite words in
languages under discussion in italics, providing a gloss immediately
following marked with single quotation marks.  Other texts in which
individual words or phrases are <term rend="noindex">mentioned</term> (for<index><term>mention</term><index><term>vs. use</term></index></index><index><term>use</term><index><term>vs. mention</term></index></index>
example, as examples) rather than <term rend="noindex">used</term> may
mark them either with italics or with quotation marks, and will gloss
them less regularly.<specList>
        <specDesc key="term"/>
	<specDesc key="gloss"/>
</specList>
These elements are also members of
the class <ident type="class">model.emphLike</ident>.
</p>
<p>A <gi>term</gi> may appear with or without a gloss, as may a
<gi>mentioned</gi> element.  Where the <gi>gloss</gi> is present, it may
be linked to the term it is glossing by means of its <att>target</att>
attribute. To establish such a link, the encoder should give an
<att>xml:id</att> value to the <gi>term</gi> or <gi>mentioned</gi> element
and provide that id as the value of the <att>target</att> attribute on
the <gi>gloss</gi> element.</p>
<p>Examples:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COHQU-eg-42">We may define <term xml:id="TDPv" rend="sc">discoursal point of view</term>
as <gloss target="#TDPv">the relationship, expressed through discourse
structure, between the implied author or some other addresser,
and the fiction.</gloss></egXML>

<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COHQU-eg-43"><gloss rend="unmarked" target="#PRSR">A computational device that infers
structure from grammatical strings of words</gloss> is known as a
<term xml:id="PRSR">parser</term>, and much of the history of NLP over the
last 20 years has been occupied with the design of parsers.</egXML>
</p>

<p>Note that the element <gi>term</gi> is intended for use with words
or phrases identified as terminological in nature; where words or
phrases are simply being cited, discussed, or glossed in a text, it
will often be more appropriate to use the <gi>mentioned</gi> element,
as in the following example:

<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COHQU-eg-44">There is thus a striking accentual difference between a verbal
form like <mentioned xml:id="cw234" xml:lang="grc">eluthemen</mentioned>
<gloss target="#cw234">we were released,</gloss> accented on the
second syllable of the word, and its participial derivative
<mentioned xml:id="cw235" xml:lang="grc">lutheis</mentioned> <gloss target="#cw235">released,</gloss> accented on the last.</egXML>
 </p>

<p>For technical terminology in particular, and generally in
terminological studies, it may be useful to associate an instance of a
term within a text with a canonical definition for it, which is stored
either elsewhere in the same text (for example in a glossary of terms)
or externally, for example in a database, authority file, or published
standard. The attributes <att>key</att> and <att>ref</att> discussed
in section <ptr target="#CONARS"/> below are available on the
<gi>term</gi> element for this purpose.
</p>
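<p>The following sketch assumes that the canonical definition is held
in a glossary elsewhere in the same document; the identifier
<val>gl-parser</val> is hypothetical, and the definition is borrowed
from the example above.
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE">... the design of a <term ref="#gl-parser">parser</term> ...
<!-- ... elsewhere in the document, for example in the back matter -->
<list type="gloss">
 <label xml:id="gl-parser">parser</label>
 <item>A computational device that infers structure from grammatical
 strings of words</item>
</list></egXML>
</p>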
<p>Another group of elements is used to supply different kinds of names
for objects described by the TEI. Examples include the documentation
of elements, attributes, and classes (and also attribute values where
appropriate), and the description of glyphs.
<specList>
<specDesc key="altIdent"/>
<specDesc key="desc"/>
<specDesc key="equiv" atts="uri filter name"/>
</specList>
Along with the <gi>gloss</gi> element mentioned above, these elements
constitute the <ident type="class">model.glossLike</ident> class. They
are described in more detail in <ptr target="#TDcrystalsCEdc"/>.</p>

<specGrp xml:id="DCOHQU">
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/desc.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/gloss.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/term.xml"/></specGrp>
</div>
<div type="div3" xml:id="COHQHEG"><head>Some Further Examples</head>
<p>As a simple example of the elements discussed here, consider the
following sentence:
<q rend="display">On the one hand the <title>Nibelungenlied</title> 
is associated with the new rise of romance of twelfth-century
France, the <foreign>romans d'antiquité</foreign>,
the romances of Chrétien de Troyes, and the German
adaptations of these works by Heinrich van Veldeke,
Hartmann von Aue, and Wolfram von Eschenbach.</q>
A first approximation to the encoding of this sentence might be simply
to record the fact that the phrases printed above in italics are
highlighted, as follows:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#CONADA-eg-144">On the one hand the <hi rend="italic">Nibelungenlied</hi> is
associated with the new rise of romance of twelfth-century France,
the <hi xml:lang="fr" rend="italic">romans d'antiquité</hi>,
the romances of Chrétien de Troyes, ...</egXML>
This encoding would, however, lose the important distinction between
an italicized title and an italicized foreign phrase.  Many other
phrases might also be italicized in the text, and a retrieval
program seeking to identify foreign terms (for example) would not
be able to produce reliable results by simply looking for italicized
words.  Where economic and intellectual constraints permit, therefore,
it would be preferable to encode both the function of the
highlighted phrases and their appearance, as follows:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#CONADA-eg-144">On the one hand the <title rend="italic">Nibelungenlied</title>
is associated with the new rise of romance of twelfth-century France,
the <foreign rend="italic">romans d'antiquité</foreign>, the
romances of Chrétien de Troyes, ...</egXML></p>
<p>In this example, the decision as to which textual features
are distinguished by the highlighting is relatively
uncontroversial.  As a less straightforward example, consider the
use of italic font in the following passage:
<q rend="display">A pretty common case, I believe; in all
<hi rend="it">vehement</hi> debatings.  She says I am
<hi rend="it">too witty</hi>; Anglicé,
<hi rend="it">too pert</hi>; I, that she is
<hi rend="it">too wise</hi>; that is to say, being
likewise put into English, <hi rend="it">not so young as
she has been</hi>:  in short, she is grown so much into
a <hi rend="it">mother</hi>, that she had forgotten
she ever was a <hi rend="it">daughter</hi>. ...</q>
 </p>
<p>Clearly, the word <mentioned>vehement</mentioned> is not italicized for the
same reason as the phrase <mentioned>not so young as she has been</mentioned>;
the former is emphasized, while the latter is proverbial.  The latter also
provides an ironic gloss for the words <mentioned>too wise</mentioned>, in the
same way as <mentioned>too pert</mentioned> glosses <mentioned>too witty</mentioned>.
The glossed phrases are not, however, technical terms or cited words, but
quoted phrases, as if the writer were putting words into her own and her
mother's mouths.  Finally, the words <mentioned>mother</mentioned> and
<mentioned>daughter</mentioned> are apparently italicized simply to oppose them
in the sentence; certainly they do not fit into any of the categories so
far proposed as reasons for italicizing.  Note also that the word
<mentioned>Anglicé</mentioned> is not italicized although it is not
generally considered an English word.
 </p>
<p>The following sample encoding for the above passage attempts to take
into account all the above points:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COHQHEG-eg-02">A pretty common case, I believe; in all <emph>vehement</emph>
debatings. She says I am <q rend="italic">too witty</q>;
<foreign xml:lang="la" rend="roman">Anglicé</foreign>,
<gloss rend="italic">too pert</gloss>; I, that she is
<q rend="italic">too wise</q>; that is to say, being likewise
put into English, <gloss rend="italic">not so young as she has
been</gloss>: in short, she is grown so much into a
<hi rend="italic">mother</hi>, that she had forgotten she ever
was a <hi rend="italic">daughter</hi>.</egXML>
<!-- WWP proofer points out inconsistency between tagging     -->
	<!-- and commentary.  "too witty" and "too wise" retagged     -->
	<!-- from TERM (which the prose says they are *not*) to Q     -->
	<!-- (which the prose says they *are*).  WWP proofer also     -->
	<!-- suggests EMPH for 'mother' and 'daughter', but our       -->
	<!-- imaginary interpretation says we don't understand the    -->
	<!-- italics there, so we leave it undisambiguated.  (msm)    -->
</p></div></div>
<div type="div2" xml:id="COED"><head>Simple Editorial Changes</head>
<p>As in editing a printed text, so in encoding a text in electronic
form, it may be necessary to accommodate editorial comment on the text
and to render account of any changes made to the text in preparing it.
The tags described in this section may be used to record such editorial
interventions, whether made by the encoder, by the editor of a printed
edition used as a copy text, by earlier editors, or by the copyists of
manuscripts.</p>

<p>The tags described here handle most common types of editorial
intervention and stereotyped comment; where less structured commentary
of other types is to be included, it should be marked using the
<gi>note</gi> element described in section <ptr target="#CONO"/>.
Systematic interpretive annotation is also possible using the various
methods described in chapter <ptr target="#SA"/>. The examples given
here illustrate only simple cases of editorial intervention; in
particular, they permit economical encoding of a simple set of
alternative readings of a short span of text. To encode multiple views
of large or heterogeneous spans of text, the mechanisms described in
chapter <ptr target="#SA"/> should be used. To encode multiple
witnesses of a particular text, a similar mechanism designed
specifically for critical editions is described in chapter <ptr target="#TC"/>.</p>

  <p>For most of the elements discussed here, some encoders 
    may wish to indicate both a <term>responsibility</term>, that is, a 
    code indicating the person or agency responsible for 
    making the editorial intervention in question, and also 
    an indication of the degree of <term>certainty</term> which the encoder 
    wishes to associate with the intervention. These 
    requirements are served by the 
    <ident type="class">att.global.responsibility</ident>
    class, along with <ident type="class">att.source</ident> and 
    <ident type="class">att.dimensions</ident>. Any of 
    the elements discussed here thus may potentially carry 
    any of the following optional attributes:
<specList>
<specDesc key="att.global.responsibility" atts="cert resp"/>
<specDesc key="att.source" atts="source"/>
<specDesc key="att.editLike" atts="evidence"/>
<specDesc key="att.dimensions" atts="unit quantity extent precision scope"/></specList>
</p>

<p>Many of the elements discussed here can be used in two ways. Their
primary purpose is to indicate that the text encoded as the element's
content represents an editorial intervention (or non-intervention) of
a specific kind, indicated by the element itself. However, pairs or
other meaningful groupings of such elements can also be supplied,
wrapped within a special purpose <gi>choice</gi> element:
  <specList>
  <specDesc key="choice"/>
</specList>
This element enables the encoder to represent, for example, a text in
its <soCalled>original</soCalled> uncorrected and unaltered form,
alongside the same text in one or more <soCalled>edited</soCalled>
forms. This usage permits software to switch automatically between one
<soCalled>view</soCalled> of a text and another, so that (for example)
a stylesheet may be set to display either the text in its original
form or after the application of editorial interventions of particular
kinds.</p>
<p>Elements which can be combined in this way constitute the
<ident type="class">model.choicePart</ident> class. The default
members of this class are <gi>sic</gi>,
<gi>corr</gi>, <gi>reg</gi>, <gi>orig</gi>, <gi>unclear</gi>,
<gi>abbr</gi>, <gi>expan</gi>, <gi>ex</gi>, <gi>am</gi> and <gi>seg</gi>; 
some of their functions and usage are described further below.</p>
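<p>As a minimal illustrative sketch (the wording is invented for this
illustration and is not taken from any source text), an abbreviation
and its expansion might be paired within a <gi>choice</gi> in the same
way:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE">… a letter addressed to
  <choice>
    <abbr>Dr.</abbr>
    <expan>Doctor</expan>
  </choice> Johnson …</egXML>
A processor could then present either the abbreviated or the expanded
form, as required.</p>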
<p>Three categories of editorial intervention are discussed in this
section:
<list rend="bulleted">
<item>indication or correction of apparent errors </item>
<item>indication or regularization of variant, irregular,
non-standard, or eccentric forms</item>
<item>editorial additions, suppressions, and
omissions</item></list></p>
<p>A more extended treatment of the use of these tags in
transcriptional and editorial work is given in chapter <ptr target="#PH"/>.</p>
<div type="div3" xml:id="COEDCOR"><head>Apparent Errors</head>
<p>When the copy text is manifestly faulty, an encoder or transcriber
may elect simply to correct it without comment, although for scholarly
purposes it will often be more generally useful to record both the
correction and the original state of the text. The elements described
here support three approaches (correcting the text, recording the error
without correcting it, or recording both), and allow the last to be done in
such a way as to make it easy for software to present
either the original or the correction. 
<specList>
  <specDesc key="sic"/>
  <specDesc key="corr"/>
</specList>
 </p><p>The following examples show alternative treatment of the same
material. The copy text reads:
<q rend="display">Another property of computer-assisted historical
research is that data modelling must permit any one textual feature or
part of a textual feature to be a part of more than one information
model and to allow the researcher to draw on several such models
simultaneously, for example, to select from a machine-readable text
those marginal comments which indicate that the date's mentioned in the
main body of the text are incorrect.</q></p>
<p>An encoder may choose to correct the typographic error, either
silently or with an indication that a correction has been made, as
follows:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE">… marginal comments which indicate that the <corr>dates</corr>
mentioned in the main body of the text are incorrect.</egXML></p><p>Alternatively, the encoder may simply record the typographic
error without correcting it, either without comment or with a
<gi>sic</gi> element to indicate the error is not a transcription error
in the encoding:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE">… marginal comments which indicate that the <sic>date's</sic>
mentioned in the main body of the text are incorrect.</egXML></p><p>If the encoder elects both to record the original source text
and to provide a correction for the sake of word-search
and other programs, both <gi>sic</gi> and <gi>corr</gi> are used,
wrapped in a <gi>choice</gi>:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE">… marginal comments which indicate that the
  <choice>
    <corr>dates</corr>
    <sic>date's</sic>
  </choice> mentioned in the main body of the text are
  incorrect.</egXML>The <gi>sic</gi> and <gi>corr</gi> elements can
  appear in either order.</p>
<p>If it is desired to indicate the person or edition responsible for
the emendation, this might be done as follows:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE">… marginal comments which indicate that the
  <choice>
    <corr resp="#msm">dates</corr>
    <sic>date's</sic>
  </choice> mentioned in the main body of the text are
  incorrect.
<!-- within the header for this document ... -->
<respStmt xml:id="msm">
<resp>editor</resp>
<name>C.M. Sperberg-McQueen</name>
</respStmt>
</egXML>Here the <att>resp</att> attribute
  has been used to indicate responsibility for the 
  correction. Its value (<val>#msm</val>) is an 
  example of the <term>pointer</term> values discussed
  in section <ptr target="#COXR"/>; in this case, 
  it points to a <gi>respStmt</gi> element within the TEI 
  header, but any element might be indicated in this way, 
  including for example a <gi>name</gi> element, or (if the 
  module described in <ptr target="#ND"/> has been included) a 
  <gi>person</gi> element. 
  The <att>resp</att> attribute is
  available for all elements which are members of the 
  <ident type="class">att.global.responsibility</ident> class. The same
  class makes available a <att>cert</att> attribute, which may be used
  to indicate the degree of editorial
confidence in a particular correction, as in the following example:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COEDCOR-eg-64">An <choice><corr cert="high">Autumn</corr><sic>Antony</sic></choice> it was,
That grew the more by reaping</egXML>
See further the discussion in section <ptr target="#PHCC"/>.</p>
<p>Where, as here,  the correction takes the form of adding text not otherwise
present in the text being encoded, the encoder
should use the <gi>corr</gi> element. Where the correction is present
in the text being encoded, and consists of some combination of visible
additions and deletions, the elements <gi>add</gi> or <gi>del</gi>
should be used: see further section <ptr target="#COEDADD"/>
below. Where the correction takes the form of addition of material not
present in the original because of physical damage or illegibility,
the <gi>supplied</gi> element may be used. Where the
<soCalled>correction</soCalled> is simply a matter of 
expanding an abbreviation, the <gi>ex</gi> element may be used. These
and other elements to support the detailed encoding of authorial or scribal
interventions of this kind are all provided by the module described in
chapter <ptr target="#PH"/>. 
</p>
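<p>The distinction might be sketched as follows, assuming that the
module described in chapter <ptr target="#PH"/> is available; the
scenario and the value of the <att>reason</att> attribute are invented
for illustration:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE">… the dates mentioned in the main
<supplied reason="damage">body</supplied> of the text are incorrect.</egXML>
Here the word <mentioned>body</mentioned> is imagined to have been lost
through damage to the source and restored by the encoder, rather than
corrected.</p>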
<specGrp xml:id="DCOEDC" n="Editorial tags for correction">
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/sic.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/corr.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/choice.xml"/>
</specGrp>
</div><div type="div3" xml:id="COEDREG"><head>Regularization and
Normalization</head> <p>When the source text makes extensive use of
variant forms or non-standard spellings, it may be desirable for a
number of reasons to <term>regularize</term> it: that is, to provide
<soCalled>standard</soCalled> or <soCalled>regularized</soCalled>
forms equivalent to the non-standard forms.<note place="bottom">In some
contexts, the term <mentioned>regularization</mentioned> has a
narrower and more specific significance than that proposed here: the
<gi>reg</gi> element may be used for any kind of regularization,
including normalization, standardization, and
modernization.</note></p><p>As with other such changes to the copy
text, the changes may be made silently (in which case the TEI header
should specify the types of silent changes made) or may be explicitly
marked using the following elements:
<specList>
  <specDesc key="reg"/>
  <specDesc key="orig"/>
  <specDesc key="choice"/>
</specList></p><p>Typical applications for these elements include the production of
editions intended for student or lay readers, linguistic research in
which spelling or usage variation is not the main question at issue,
production of spelling dictionaries, etc.</p><p>Consider this 16th-century text:
<q rend="display">how godly a dede it is to overthrowe so wicked a race
the world may judge: for my part I thinke there canot
be a greater sacryfice to God.</q></p><p>An encoder may choose to preserve the original spelling of this
text, but simply flag it as nonstandard by using the <gi>orig</gi>
element with no attributes specified, as follows:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COEDREG-eg-74"><p>...how godly a <orig>dede</orig> it is to
<orig>overthrowe</orig> so wicked a race the
world may judge: for my part I <orig>thinke</orig>
there <orig>canot</orig> be a greater
<orig>sacryfice</orig> to God.</p></egXML></p><p>Alternatively, the encoder may simply indicate that certain words
have been modernized by using the <gi>reg</gi> element with no
attributes specified, as follows:<egXML
xmlns="http://www.tei-c.org/ns/Examples" source="#COEDREG-eg-74"><p>...how godly a
<reg>deed</reg> it is to <reg>overthrow</reg> so wicked a race the
world may judge: for my part I <reg>think</reg>
there <reg>cannot</reg> be a greater
<reg>sacrifice</reg> to God.</p></egXML></p><p>Alternatively, the encoder may elect to record both old and new
spellings, so that (for example) the same electronic text may serve as
the basis of an old- or new-spelling edition:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COEDREG-eg-74"><p>...how godly a <choice><orig>dede</orig><reg>deed</reg></choice> it is to
<choice><orig>overthrowe</orig><reg>overthrow</reg></choice> so wicked a race the
world may judge: for my part I <choice><orig>thinke</orig><reg>think</reg></choice>
there <choice><orig>canot</orig><reg>cannot</reg></choice> be a greater
<choice><orig>sacryfice</orig><reg>sacrifice</reg></choice> to God.</p></egXML>
 </p><p>As elsewhere, the <att>resp</att> attribute may be used to specify the agency
responsible for the regularization. 
</p>
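<p>For example, assuming a hypothetical editor identified elsewhere in
the document by the identifier <val>ed1</val>, responsibility for a
regularization might be recorded as follows:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COEDREG-eg-74"><p>...how godly a
<choice><orig>dede</orig><reg resp="#ed1">deed</reg></choice> it is to
overthrowe so wicked a race ...</p></egXML>
The value <val>#ed1</val> is an assumption made for this sketch; in
practice it would point to whatever element identifies the responsible
person or agency.</p>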
<specGrp xml:id="DCOEDR" n="Editorial tags for regularization">
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/reg.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/orig.xml"/>
</specGrp>
</div>
<div type="div3" xml:id="COEDADD"><head>Additions, Deletions, and Omissions</head>
<p>The following elements are used to indicate when words or phrases
have been omitted from, added to, or marked for deletion from, a text.
Like the other editorial elements, they allow for a wide range of
editorial practices: 
<specList>
  <specDesc key="gap" atts="reason"/>
  <specDesc key="unclear" atts="reason"/>
  <specDesc key="add"/>
  <specDesc key="del"/>
</specList></p>
<p>Encoders may choose to omit parts of the copy text for reasons
ranging from illegibility of the source or impossibility of transcribing
it, to editorial policy, e.g. a systematic exclusion of poetry or prose
from an encoding. The full details of the policy decisions concerned
should be documented in the TEI header (see section <ptr target="#HD5"/>).
Each place in the text at which omission has taken place should be
marked with a <gi>gap</gi> element, with optionally further information
about the reason for the omission, its extent, and the person or agency
responsible for it, as in the following examples:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><gap
reason="illegible" unit="word" quantity="2"/></egXML><egXML
xmlns="http://www.tei-c.org/ns/Examples"><gap reason="overwriting
illegible" extent="several characters"/></egXML>
Note that the extent of the gap may be marked precisely using
the attributes <att>unit</att> and <att>quantity</att>, or more
descriptively using the <att>extent</att> attribute. Other, more
detailed, options are also available for representing dimensions of
any kind; see further <ptr target="#msdim"/>. </p>
<p>The <gi>desc</gi> element may be used to supply a description of
the material omitted, where that is considered useful:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><gap reason="sampling" quantity="120" unit="lines"><desc>irrelevant commentary</desc></gap></egXML>
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COEDADD-eg-83">… Their arrangement with respect to Jupiter and to each other was as follows:
<gap reason="sampling" quantity="2" unit="cm"><desc>astrological figure</desc></gap>
That is, there were two stars on the easterly side and one to the
west; …</egXML>
	<!-- Figure looks like this:                                  -->
	<!--                                                          -->
	<!-- East         *        *  O      *                   West -->
	<!--                                                          -->
</p> 
<p>The <gi>add</gi> and <gi>del</gi> elements may be used to record
where words or phrases have been added or deleted in the copy text.
They are not appropriate where longer passages have been added or
deleted, which span several elements; for these, the elements
<gi>addSpan</gi> and <gi>delSpan</gi> described in
chapter <ptr target="#PHAD"/> must be used.</p>
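<p>A minimal sketch of this mechanism, in which the text and the
identifier <val>delEnd01</val> are invented for illustration, might
look like this:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE"><delSpan spanTo="#delEnd01"/>
<p>The whole of this paragraph has been struck through.</p>
<p>So has the whole of this one.</p>
<anchor xml:id="delEnd01"/></egXML>
The <gi>delSpan</gi> element marks the start of the deleted passage,
and its <att>spanTo</att> attribute points to an <gi>anchor</gi>
element marking its end.</p>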
<p>Additions to a text may be recorded for a number of reasons.
Sometimes they are marked in a distinctive way in the source text, for
example by brackets or insertion above the line (<term rend="noindex">supralinear</term> insertion),<index><term>additions</term><index><term>supralinear</term></index></index><index><term>insertions</term><index><term>supralinear</term></index></index><index><term>supralinear insertions</term></index> as in
the following example, taken from a 19th-century manuscript:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COEDADD-eg-84">The story I am going to relate is true as to its main facts,
and as to the consequences <add place="above">of
these facts</add> from which this tale takes its title.</egXML>
</p><p>The <gi>add</gi> element should not be used to mark editorial
changes, such as supplying a word omitted by mistake from the source
text or a passage present in another version. In these cases, either
the <gi>corr</gi> or <gi>supplied</gi> tags should be used, as
discussed above in section <ptr target="#COEDCOR"/>, and in section
<ptr target="#PHCC"/>, respectively.</p><p>The <gi>unclear</gi> element is used to mark passages in the
original which cannot be read with confidence, or about which the
transcriber is uncertain for other reasons, as for example when
transcribing a partially inaudible or illegible source. Its
<att>reason</att> and <att>resp</att> attributes are used, as with the
<gi>gap</gi> element, to indicate the cause of uncertainty and the
person responsible for the conjectured reading.</p><p>For example:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COEDADD-eg-85"><l>And where the sandy mountain Fenwick scald</l>
<l><unclear reason="ink blot">The</unclear> sea between
yet hence his pray'r prevail'd</l></egXML>
or from a spoken text:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><p>... and then <unclear reason="passingTruck">marbled queen</unclear>...</p></egXML>
</p>
<p>Where the material affected is entirely illegible or inaudible, the
<gi>gap</gi> element discussed above should be used in preference.</p>
<p>The <gi>del</gi> element is used to mark material which is deleted in
the source but which can still be read with some degree of confidence,
as opposed to material which has been omitted by the encoder or
transcriber either because it is entirely illegible or for some other
reason.  This is of particular importance in transcribing manuscript
material, though deletion is also found in printed texts, sometimes for
humorous purposes:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#CO-eg-01"><l>One day I will sojourn to your shores</l>
<l>I live in the middle of England</l>
<l>But!</l>
<l>Norway! My soul resides in your watery
<del rend="overstrike">fiords fyords fiiords</del></l>
<l>Inlets.</l></egXML> 
</p><p>The <att>rend</att> attribute may be used to distinguish different
methods of deletion in manuscript or typescript material, as in this
line from the typescript of Eliot's <title>Waste Land</title>:
<egXML xmlns="http://www.tei-c.org/ns/Examples" xml:lang="de" source="#COEDADD-eg-89"><l><del rend="overtyped">Mein</del> Frisch
<del rend="overstrike">schwebt</del> weht der Wind</l></egXML></p>
<p>Deletion in manuscript or typescript is often associated with
addition:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COEDADD-eg-89"><l><del rend="overstrike">Inviolable</del>
   <add place="below">Inexplicable</add>
splendour of Corinthian white and gold</l></egXML> 
The <gi>subst</gi> element discussed in <ptr target="#PHSU"/> provides
a way of grouping additions and deletions of this kind. </p>
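<p>As a sketch only, and assuming that the module which provides
<gi>subst</gi> is in use, the line above might then be encoded as
follows:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COEDADD-eg-89"><l><subst><del rend="overstrike">Inviolable</del>
   <add place="below">Inexplicable</add></subst>
splendour of Corinthian white and gold</l></egXML>
The grouping makes explicit that the deletion and the addition are
regarded as a single act of substitution.</p>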
<p>The <gi>del</gi> element should not be used where the deletion is
such that material cannot be read with confidence, or read at all, or
where the material has been omitted by the transcriber or editor for
some other reason. Where the material deleted cannot be read with
confidence, the <gi>unclear</gi> tag should be used with the
<att>reason</att> attribute indicating that the difficulty of
transcription is due to deletion. Where material has been omitted by
the transcriber or editor, this may be indicated by use of the
<gi>gap</gi> element. A deletion in which some parts may be read but
not others may thus be represented by one or more <gi>gap</gi>
elements intermingled with text, all contained by a <gi>del</gi>
element. Text supplied or marked as unnecessary by an editor should
be marked with the <gi>supplied</gi> and <gi>surplus</gi> elements 
(discussed in <ptr target="#PHOM"/>) rather than <gi>add</gi> and 
<gi>del</gi>. These two sets of elements allow the encoder to
distinguish editorial changes from those visible in the source text.
</p>
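<p>A partially legible deletion might thus be sketched as follows; the
wording and the attribute values are invented for illustration:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE"><p>… <del rend="overstrike">The
<gap reason="illegible" quantity="2" unit="word"/> of the
morning</del> Daylight came slowly …</p></egXML>
Here the <gi>del</gi> element records the deletion visible in the
source, while the <gi>gap</gi> element within it records the two words
which the transcriber could not read.</p>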
<specGrp xml:id="DCOEDA" n="Other editorial tags">

<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/gap.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/add.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/del.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/unclear.xml"/></specGrp>
</div></div>
<div type="div2" xml:id="CONA"><head>Names, Numbers, Dates, Abbreviations, and Addresses</head>

<p>This section describes a number of textual features which it is
often convenient to distinguish from their surrounding text.  Names,
dates, and numbers are likely to be of particular importance to the
scholar treating a text as source for a database; distinguishing such
items from the surrounding text is however equally important to the
scholar primarily interested in lexis.</p>

<p>The treatment of these textual features proposed here is not
intended to be exhaustive: fuller treatments for names, numbers,
measures, and dates are provided in the
names and dates module (see chapter <ptr target="#ND"/>); more detailed
treatment of abbreviations is provided by the transcription module
(see section <ptr target="#PHAB"/>). </p>

<div type="div3" xml:id="CONARS"><head>Referring Strings</head>
<p>A <term>referring string</term> is a phrase which refers to some
person, place, object, etc. Two elements are provided to mark such
strings:
<specList>
<specDesc key="rs"/>
<specDesc key="name"/>
</specList>
The <gi>name</gi> element is a member of the <ident type="class">att.typed</ident>
class, from which it inherits the  following attributes:
<specList><specDesc key="att.typed" atts="type subtype"/></specList>
which may be used to further categorize the 
kind of object referred to. The <gi>rs</gi> element defines the
<att>type</att> attribute locally. </p>

<p>Examples include:

<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#CONARS-eg-101"><p><q>My dear 
<rs type="person">Mr. Bennet</rs></q>, said his lady to 
him one day, <q>have you heard that <rs type="place">
Netherfield Park</rs> is let at last?</q></p></egXML>
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#CONARS-eg-102"><p>Collectors of water-rents were appointed by the
<rs type="org">Watering Committee</rs>.
They were paid a commission not exceeding four per
cent, and gave bond.</p></egXML>
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#CONARS-eg-103"><p>It being one of the principles of the
<rs type="org">Circumlocution Office</rs> never, on any
account whatsoever, to give a straightforward answer,
<rs type="person">Mr Barnacle</rs> said, <q>Possibly.</q></p></egXML>
</p><p>As the following example shows, the <gi>rs</gi> element may be used
for any reference to a person, place, etc., not only to references in
the form of a proper noun or noun phrase.
<egXML xmlns="http://www.tei-c.org/ns/Examples"  source="#CONARS-eg-101"><p><q>My dear <rs type="person">Mr. Bennet</rs></q>, said
<rs type="person">his lady</rs> to him one day ... </p></egXML>
</p><p>The <gi>name</gi> element by contrast is provided for the special
case of referring strings which consist only of proper nouns; it may
be used synonymously with the <gi>rs</gi> element, or nested within it
if a referring string contains a mixture of common and proper nouns.
The following example shows an alternative way of encoding the short
sentence from <title>Pride and Prejudice</title> quoted above:
<egXML xmlns="http://www.tei-c.org/ns/Examples"  source="#CONARS-eg-101"><p><q>My dear <name type="person">Mr. Bennet</name>,</q> said <rs type="person">his lady</rs> to him one day,  
<q>have you heard that <name type="place">Netherfield Park</name> is let at last?</q></p></egXML>
As the following example shows,  a proper name may be nested within a
referring string:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#CONARS-eg-106"><rs>His Excellency the Life President, <name>Ngwazi Dr H. Kamuzu Banda</name></rs></egXML>
 </p>
<p>Simply tagging something as a name is generally not enough to
enable automatic processing of personal names into the canonical forms
usually required for reference purposes. The name as it appears in the
text may be inconsistently spelled, partial, or vague.  Moreover, name
prefixes such as <mentioned>van</mentioned> or <mentioned>de
la</mentioned> may or may not be included as part of the reference
form of a name, depending on the language and country of origin of the
bearer. </p>

<p>Two issues arise in this context: firstly, there may be a need to
encode a regularized form of a name, distinct from the actual form in
the source to hand; secondly, there may be a need to identify the
particular person, place, etc. referred to by the name, irrespective
of whether the name itself is normalized or not. The element
<gi>reg</gi>, introduced in <ptr target="#COEDREG"/>, is provided for
the former purpose; the attributes <att>key</att> or <att>ref</att>
for the latter.</p>

<p>The <att>key</att> and <att>ref</att> attributes are common to all
members of the <ident type="class">att.canonical</ident> class and are
defined as follows: <specList><specDesc key="att.canonical" atts="key
ref"/></specList>
 </p>

<p>A very useful application for them is as a means of gathering
together all references to the same individual or location scattered
throughout a document: <egXML
xmlns="http://www.tei-c.org/ns/Examples" source="#CONARS-eg-101"><p><q>My dear <rs key="BENM1"
type="person">Mr. Bennet</rs>,</q> said <rs key="BENM2"
type="person">his lady</rs> to him one day, <q>have you heard that <rs
key="NETP1" type="place">Netherfield Park</rs> is let at
last?</q></p></egXML>

<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#CONARS-eg-108" xml:lang="fr">
<p><name key="VOM1" type="person">Mme. de Volanges</name> marie <rs key="VOM2">sa fille</rs>: c'est encore un secret;
mais elle m'en a fait part hier.</p></egXML>
</p>

<p>The value of the <att>key</att> attribute may be an unexpanded
code, as in the examples above, with no particular significance. More
usually however, it will be an externally defined code of some kind,
as provided by a standard reference source.

<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE" >
<p><name key="LHR" type="airport">Heathrow</name> </p></egXML>
</p>
 
<p>The standard reference source should be documented using a <gi>taxonomy</gi> element in the TEI header.</p>

<p>The <att>ref</att> attribute can be used to point directly
to  some other resource providing more information about the
entity named by the element, such as an authority record in a
database, an encyclopaedia entry, another element in the same
or a different document, etc. 

<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE">
<p><name ref="http://en.wikipedia.org/wiki/Heathrow_airport" type="airport">Heathrow</name> </p></egXML>
</p>
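<p>Where the target is another element in the same document, the value
of <att>ref</att> is simply a pointer to that element's identifier. In
the following sketch, the identifier <val>LHR01</val> is hypothetical,
and the <gi>place</gi> element assumes that the module described in
chapter <ptr target="#ND"/> is in use:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE">
<p><name ref="#LHR01" type="airport">Heathrow</name> </p>
<!-- elsewhere in the same document ... -->
<place xml:id="LHR01">
<placeName>London Heathrow Airport</placeName>
</place></egXML>
</p>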
<p>This use should be distinguished from the use of a nested
<gi>reg</gi> (regularization) element to provide the standard form
of a referring string, as in this example: <egXML xmlns="http://www.tei-c.org/ns/Examples" source="#CONARS-eg-109"><p>My personal life during
the administration of <rs key="POJA1" type="person">Col. Polk
(<reg>Polk, James K.</reg>)</rs> has but poorly compensated me for the
suspended enjoyments and pursuits of private and professional
spheres</p></egXML>
</p>
 
<p>No particular syntax is proposed for the values of the <att>key</att> 
attribute, since its form will depend entirely on practice within a 
given project. For the same reason, this attribute is not recommended in 
data interchange, since there is no way of ensuring that the values used 
by one project are distinct from those used by another. In such a 
situation, a preferable approach for magic tokens, which follows standard practice on the Web, is to use a <att>ref</att> attribute whose value is a tag URI as defined in <ref target="#RFC4151">RFC 4151</ref>. For example:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#CONARS-eg-108" xml:lang="fr">
<p><name ref="tag:theworksoflaclos.org,2012:VOM1" type="person">Mme. de Volanges</name> marie <rs ref="tag:theworksoflaclos.org,2012:VOM2">sa fille</rs>: c'est encore un secret;
mais elle m'en a fait part hier.</p>
</egXML>
The inclusion of the domain name of the party responsible for tagging (here <ident>projectname.org</ident> and <ident>theworksoflaclos.org</ident>), as specified in RFC 4151, helps ensure uniqueness of magic token values across TEI encoding projects, allowing for improved interchange of TEI documents.</p>

<p>The <gi>choice</gi> element discussed in <ptr target="#COED"/> may be
used if it is desired to record both a normalized form of a name and
the name used in the source being encoded: 
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#CONARS-eg-110">
<p><name ref="tag:projectname.org,2012:WADLM1" type="person"><choice>
<orig>Walter de la Mare</orig>
<reg>de la Mare, Walter</reg>
</choice></name>
was born at <name ref="tag:projectname.org,2012:Ch1" type="place">Charlton</name>, in
<name ref="tag:projectname.org,2012:KT1" type="county">Kent</name>, in 1873.</p></egXML>
</p><p>The <gi>index</gi> element discussed in <ptr target="#CONOIX"/> may be
more appropriate if the function of the regularization is to provide a
consistent index:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#CONARS-eg-111"><p><name type="place">Montaillou</name> is not a large parish.
At the time of the events which led to
<name type="person">Fournier<index><term>Benedict XII, Pope of Avignon (Jacques Fournier)</term></index></name>'s 
investigations, the local population consisted of between 200 and 250 inhabitants.</p></egXML>
Although adequate for many simple applications, these methods have two
inconveniences: if the name occurs many times, then its regularized
form must be repeated many times; and the burden of additional XML
markup in the body of the text may be inconvenient to maintain and
complex to process. For applications such as onomastics, relating to
persons or places named rather than the name itself, or wherever a
detailed analysis of the component parts of a name is needed, the
specialized elements described in chapter <ptr target="#ND"/> or the
analytical tools described in chapter <ptr target="#FS"/> should be
used.
 </p>

<specGrp xml:id="DCONA" n="Proper Names">
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/name.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/rs.xml"/>
</specGrp>
</div>
<div type="div3" xml:id="CONAAD"><head>Addresses</head>

<p>These Guidelines propose the following elements to distinguish
postal and electronic addresses: 
<specList>
<specDesc key="address"/>
<specDesc key="email"/>
</specList>
These two elements constitute the class of
<ident type="class">model.addressLike</ident> elements; for other kinds of address
this class may be extended by adding new elements if necessary. </p>
<p>These Guidelines provide no particular means for encoding the
substructure of an email address (for example, distinguishing the
local part from the domain part), nor of distinguishing personal email
addresses from generic or fictitious ones. 

<egXML xmlns="http://www.tei-c.org/ns/Examples" xml:lang="und">
<email>info@tei-c.org</email>
</egXML>
</p>

<p>The simplest way of encoding a postal address is to regard it as a series
of distinct lines, just as they might be written on an envelope. The
following element supports this view:
<specList><specDesc key="addrLine"/></specList>
Here is an example of a postal address encoded using this approach:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><address>
   <addrLine>110 Southmoor Road,</addrLine>
   <addrLine>Oxford OX2 6RB,</addrLine>
   <addrLine>UK</addrLine>
</address></egXML>
 </p>
<p>Alternatively, an address may be encoded as a structure of
more semantically rich elements. The <ident type="class">model.addrPart</ident> element class identifies a number
of such possible components: <specList><specDesc key="street"/><specDesc key="name" /><specDesc key="postCode"/><specDesc key="postBox"/>
<specDesc key="model.nameLike"/>
<specDesc key="model.persNamePart"/>
<specDesc key="model.placeNamePart"/>
</specList> Any number of
elements from the <ident type="class">model.addrPart</ident> class may
appear within an address and in any order. None of them is
required. </p>

<p>Where code letters are commonly used in addresses (for
example, to identify regions or countries) a useful practice is to
supply the full name of the region or country as the content of the
element, but to supply the abbreviatory code as the value of the
global <att>n</att> attribute, so that (for example) an application
preparing formatted labels can readily find the required
information. Other components of addresses may be represented using
the general-purpose <gi>name</gi> element or (when the additional
module for names and dates is included) the more specialized  elements
provided for that purpose.
</p>

<p>Using just the elements defined by the core module, the above
address could thus be represented as follows:

<egXML xmlns="http://www.tei-c.org/ns/Examples"><address>
   <street>110 Southmoor Road</street>
   <name type="city">Oxford</name>
   <postCode>OX2 6RB</postCode>
   <name type="country">United Kingdom</name>
</address></egXML>
 </p>
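<p>As a minimal sketch of the practice described above for code letters, the
country might additionally carry its abbreviatory code on the global
<att>n</att> attribute. The value <val>GB</val> is an assumption made for
illustration (the ISO 3166 two-letter code); a project may prefer some other
code list:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE"><address>
   <street>110 Southmoor Road</street>
   <name type="city">Oxford</name>
   <postCode>OX2 6RB</postCode>
   <name type="country" n="GB">United Kingdom</name>
</address></egXML>
An application preparing formatted labels could then read the code from
<att>n</att> without needing to interpret the element content.
</p>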

<p>The order of elements within an address is highly culture-specific,
and is therefore unconstrained:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><address>
   <name type="org">Università di Bologna</name>
   <name type="country">Italy</name>
   <postCode>40126</postCode>
   <name type="city">Bologna</name>
   <street>via Marsala 24</street>
</address></egXML>
 </p>
  
  <p>A telephone number (normally outside of the <gi>address</gi> element) might be tagged with an
    <gi>addrLine</gi> and <gi>ref</gi>, with the number itself expressed as a URI using the <code>tel:</code> scheme:
  <egXML xmlns="http://www.tei-c.org/ns/Examples">
    <addrLine><ref target="tel:+1-201-555-0123">(201) 555 0123</ref></addrLine>
  </egXML></p>
  
<p>For further discussion of ways of regularizing the names of places,
see section <ptr target="#CONA"/>. A full postal address may also include
the name of the addressee, tagged as above using the general purpose
<gi>name</gi> element.    </p>

<p>When a schema includes the names and dates
module discussed in chapter <ptr target="#ND"/>, a large number of more specific elements such as <gi>country</gi> or <gi>settlement</gi> will be
available from the class <ident type="class">model.addrPart</ident>. The above
example might then be encoded as follows:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><address>
   <street>110 Southmoor Road</street>
   <settlement>Oxford</settlement>
   <postCode>OX2 6RB</postCode>
   <country>United Kingdom</country>
</address></egXML>
</p><specGrp xml:id="DCOAD" n="Addresses and their components"><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/email.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/address.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/addrLine.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/street.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/postCode.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/postBox.xml"/>
</specGrp>

</div>
<div type="div3" xml:id="CONANU"><head>Numbers and
Measures</head>
<p>This section describes elements provided for the simple encoding
of numbers and measurements and gives some indication of circumstances in
which this may usefully be done.  The following phrase level elements
are provided for this purpose:
<specList><specDesc key="num" atts="type value"/><specDesc key="measure" atts="type"/><specDesc key="measureGrp"/></specList>
 </p>

<p>Like names or abbreviations, numbers can occur virtually anywhere
in a text.  Numbers are special in that they can be written with
either letters or digits (<mentioned>twenty-one</mentioned>,
<mentioned>xxi</mentioned>, and <mentioned>21</mentioned>) and their
presentation is language-dependent (e.g.  English
<mentioned>5th</mentioned> becomes Greek <mentioned>5.</mentioned>;
English <mentioned>123,456.78</mentioned> equals French
<mentioned>123.456,78</mentioned>).
 </p>
<p>For many kinds of application, e.g. natural-language processing or
machine translation, numbers are not regarded as
<soCalled>lexical</soCalled> in the same way as other parts of a text.
For these and other applications, the <gi>num</gi> element provides a
convenient method of distinguishing numbers from the surrounding text.
For other kinds of application, numbers are only useful if normalized:
here the <gi>num</gi> element is useful precisely because it provides a
standardized way of representing a numerical value.
 </p>
<p>For example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><num value="33">xxxiii</num>
<num type="cardinal" value="21">twenty-one</num>
<num type="percentage" value="10">ten percent</num>
<num type="percentage" value="10">10%</num>
<num type="ordinal" value="5">5th</num></egXML>
<egXML xmlns="http://www.tei-c.org/ns/Examples"><num type="fraction" value="0.5">one half</num>
<num type="fraction" value="0.5">1/2</num></egXML>
 </p>

<p>Sometimes it may be desired to mark something as numerical
which cannot be accurately normalized, for example an expression
such as <mentioned>dozens</mentioned>; less frequently the number may
be recognizable linguistically as such but may use a notation with which
the encoder is unfamiliar. To help in these situations, the
<gi>num</gi> element may also bear either or both of the following
attributes from the <ident type="class">att.ranging</ident> class:
<specList>
<specDesc key="att.ranging" atts="atLeast atMost"/>
</specList>
</p>
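<p>For instance, an encoder might record only the bounds within which such a
number is judged to fall. The values supplied below are one possible
interpretation of the expressions concerned, offered as a sketch rather than
as a recommendation:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE">
<!-- the lower bound of 24 (two dozen) is the encoder's assumption -->
<num atLeast="24">dozens</num>
<num atLeast="1000" atMost="2000">between one and two thousand</num>
</egXML>
</p>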

<p>In its fullest form, a <term>measure</term> consists of a number, a phrase
expressing units of measure and a phrase expressing the commodity
being measured, though not all of these components need be present in
every case. It may be helpful to distinguish measures from surrounding
text for two reasons. Firstly, a measure may be expressed using a
particular notation or system of abbreviations which the encoder does
not wish to regard as lexical. Secondly, a quantitative application
may wish to distinguish and normalize the internal components of a
measure, in order to perform calculations on them.</p>

<p>Consider, as an example of the first case, the following list of
Celia's charms, in which the encoder has chosen to make  explicit the measurements:

<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#CO-eg-02"><div n="2"><list type="gloss">
<label>Age</label><item>Unimportant</item>
<label>Head</label><item>Small and round</item>
<label>Eyes</label><item>Green</item>
<label>Complexion</label><item>White</item>
<label>Hair</label><item>yellow</item>
<label>Features</label><item>Mobile</item>
<label>Neck</label><item><measure>13¾"</measure></item>
<label>Upper arm</label><item><measure>11"</measure></item>
<!--...-->
</list>
<!-- ... -->
</div></egXML>

In the same way, it may be convenient to mark 
representations of currency which might otherwise be misinterpreted as
lexical:

<egXML xmlns="http://www.tei-c.org/ns/Examples"><p>...the sum of
<measure type="currency">12s 6d</measure>...</p></egXML>
</p>
<p>In general, normalization of a measure will require specification
of one or more of its three parts: the quantity, the units, and
possibly also the commodity being measured. This is accomplished by
supplying values for the three attributes <att>quantity</att>,
<att>unit</att>, and <att>commodity</att>, which are supplied by the
<ident type="class">att.measurement</ident> class:
<specList>
<specDesc key="att.measurement" atts="quantity unit commodity"/>
</specList>
With these attributes, the measurement of Celia's neck may be
specified in a normalized form:
<egXML xmlns="http://www.tei-c.org/ns/Examples" xml:lang="und">
<measure quantity="13.75" unit="in">13¾"</measure>
</egXML>
Such techniques are particularly useful when representing historical
data such as inventories:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#CO-eg-03">
<list>
  <item><measure type="volume" quantity="2" unit="bag" commodity="hops">ii bags hops</measure>
  </item>
  <item><measure type="volume" quantity="6" unit="truss" commodity="cloth">six trusses Woolen and linen goods</measure>
  </item>
  <item><measure type="weight" quantity="5" unit="ton" commodity="coal">5 tonnes coale</measure>
  </item>
</list></egXML>
 </p> 
<p>The <gi>measureGrp</gi> element is provided as a means of grouping
several related measurements together, either because the measurement
involves several dimensions (for example height and width) or to
avoid the need to repeat all the normalizing attributes:
<egXML xml:lang="und" xmlns="http://www.tei-c.org/ns/Examples">
<measureGrp type="volume" unit="in">
<measure type="height" quantity="14">xiv</measure>
<measure type="width" quantity="5">v</measure>
<measure type="depth" quantity="10">x</measure>
</measureGrp>
</egXML>
<!-- better example needed -->
</p>
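<p>A further sketch, using invented figures, shows a
<gi>measureGrp</gi> in which the unit is stated once on the group and each
dimension is recorded separately. The values for <att>type</att> used here
(<val>dimensions</val>, <val>height</val>, <val>width</val>) are
illustrative only and are not prescribed by these Guidelines:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE">
<!-- hypothetical measurements; the unit is stated once on the group -->
<measureGrp type="dimensions" unit="cm">
<measure type="height" quantity="30">30</measure> by
<measure type="width" quantity="20">20</measure> cm
</measureGrp>
</egXML>
</p>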
<specGrp xml:id="DCONU" n="Numbers and measures">
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/num.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/measure.xml"/>

<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/measureGrp.xml"/>
</specGrp>
</div>
<div type="div3" xml:id="CONADA"><head>Dates and Times</head>
<p>Dates and times, like numbers, can appear in widely varying
culture- and language-dependent forms, and can pose similar problems
in automatic language processing. Such elements constitute the <ident type="class">model.dateLike</ident> class, of which the default
members are:
 <specList>
   <specDesc key="date"/>
   <specDesc key="time"/>
 </specList>
These elements have some additional attributes by virtue of being
members of the <ident type="class">att.datable</ident> and <ident type="class">att.duration</ident> classes which, in turn, are members
of the <ident type="class">att.datable.w3c</ident> and <ident type="class">att.duration.w3c</ident> classes. In particular, the
<att>when</att> and <att>calendar</att> attributes will be discussed here:
 <specList>
  <specDesc key="att.datable.w3c" atts="when"/>
  <specDesc key="att.datable" atts="calendar"/>
</specList>
</p>
<p>Dates can occur virtually anywhere in a text, but in some contexts
(e.g. bibliographic citations) their encoding is recommended or
required rather than optional.  Times can also appear anywhere but
are generally optional.
 </p>
<p>Partial dates or times (e.g. <mentioned>1990</mentioned>,
<mentioned>September 1990</mentioned>,
<mentioned>twelvish</mentioned>) can be expressed in the
<att>when</att> attribute by simply omitting a part of the value
supplied.  Imprecise dates or times (for example <mentioned>early
August</mentioned>, <mentioned>some time after ten and before
twelve</mentioned>) may be expressed as date or time ranges.
 </p>
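<p>The following sketch illustrates both cases. The first two dates are
partial values for <att>when</att>; the remaining two expressions are
treated as ranges, using the <att>notBefore</att> and <att>notAfter</att>
attributes also provided by the <ident type="class">att.datable.w3c</ident>
class. The bounds chosen (the tenth of the month as the end of
<mentioned>early August</mentioned>, and morning rather than evening hours)
are assumptions made by the encoder, not something the source states:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE">
<date when="1990">1990</date>
<date when="1990-09">September 1990</date>
<!-- the bounds below are the encoder's interpretation -->
<date notBefore="--08-01" notAfter="--08-10">early August</date>
<time notBefore="10:00:00" notAfter="12:00:00">some time after ten and before twelve</time>
</egXML>
</p>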
<p>These mechanisms are useful primarily for fully specified dates or
times known with certainty.  If component parts of dates or times are to
be marked up, or if a more complex analysis of the meaning of a temporal
expression is required, the techniques described in chapter <ptr target="#ND"/> should be used in preference to the simple method
 outlined here.
 </p>
<p>Where the certainty (i.e. reliability) of the date or time  is
in question,  the encoder should record this
fact using the mechanisms discussed in chapter <ptr
target="#CE"/>. The same chapter also discusses various methods of
recording the precision of numerical or temporal assertions.
 </p>
<p>The <att>when</att> attribute is a useful way of normalizing or
 disambiguating  dates and times which can appear in many formats, as
 the following examples show: 
<egXML xml:lang="und" xmlns="http://www.tei-c.org/ns/Examples"><date when="1980-02-12">12/2/1980</date></egXML>
<egXML xmlns="http://www.tei-c.org/ns/Examples">Given on the <date when="1977-06-12">Twelfth Day of June
in the Year of Our Lord One Thousand Nine Hundred and
Seventy-seven of the Republic the Two Hundredth and first
and of the University the Eighty-Sixth.</date></egXML>
</p>
<p>The <att>when</att> attribute always supplies a normalized
representation of the date given as content of the <gi>date</gi>
element. The format used should be a valid W3C schema datatype.<note place="bottom">The datatypes are taken from the W3C Recommendation <title ref="#XSD2">XML Schema Part 2: Datatypes Second Edition</title>. 
The permitted datatypes are:
<list rend="bulleted">
<item><ref target="http://www.w3.org/TR/xmlschema-2/#date">date</ref></item>
<item><ref target="http://www.w3.org/TR/xmlschema-2/#gYear">gYear</ref></item>
<item><ref target="http://www.w3.org/TR/xmlschema-2/#gMonth">gMonth</ref></item>
<item><ref target="http://www.w3.org/TR/xmlschema-2/#gDay">gDay</ref></item>
<item><ref target="http://www.w3.org/TR/xmlschema-2/#gYearMonth">gYearMonth</ref></item>
<item><ref target="http://www.w3.org/TR/xmlschema-2/#gMonthDay">gMonthDay</ref></item>
<item><ref target="http://www.w3.org/TR/xmlschema-2/#time">time</ref></item>
<item><ref target="http://www.w3.org/TR/xmlschema-2/#dateTime">dateTime</ref></item>
</list>
There 
is one exception: these Guidelines permit a time to be expressed as only a number of hours, or as a number of hours and minutes,
as per ISO 8601:2004 sections 4.2.2.3 and 4.3.3. 
The W3C <ident type="datatype">time</ident> and <ident type="datatype">dateTime</ident> 
datatypes require that the minutes and seconds be included in the
normalized value if they are to be correctly processed, for example
when sorting.</note>
Some typical examples follow:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<date when="2001">The year 2001</date>
<date when="2001-09">September 2001</date>
<date when="2001-09-11">11 Sep 01</date>
<date when="--09-11">9/11</date>
<date when="--09">September</date>
<date when="---11">Eleventh of the month</date>
<time when="08:48:00">8:48</time>
<date when="2001-09-11T08:48:00">Sept 11th, 12 minutes before 9 am</date>
</egXML>Note in the last example the use of a normalized representation for
the date string which includes a time: this example could thus equally
well be tagged using the <gi>time</gi> element. 
</p>
<p>The following examples demonstrate the use of the
<gi>date</gi> element to mark a period of time:<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#CONADA-eg-143"><p>Those five years — 
<date from="1918" to="1923">1918 to 1923</date>
— had been, he suspected,
somehow very important.</p></egXML>
<egXML xmlns="http://www.tei-c.org/ns/Examples"
 source="#CONADA-eg-144">
<p>The Eddic poems are preserved in a unique
manuscript (Codex Regius 2365) from 
<date notBefore="1250" notAfter="1300">the second half of the 
thirteenth century</date>, and <title>Hervarar
saga</title> dates from <date when="1300">around 1300</date>.</p></egXML>
</p>
<p>The <att>calendar</att> attribute may be used to indicate the
calendar system in which the date given as content of the element is
expressed; if the <att>when</att> attribute is also supplied,
it should specify the equivalent date in the Gregorian calendar. </p>
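<p>As a hedged illustration, a date written in the Julian calendar might be
encoded as follows. The form of the value of <att>calendar</att> depends on
the version of the schema in use; this sketch assumes that it is a pointer,
and that an element with the identifier <ident>julian</ident> describing the
calendar has been supplied elsewhere, for example in the TEI header:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE">
<!-- #julian is a hypothetical pointer to a description of the calendar -->
<date calendar="#julian" when="1732-02-22">Feb. 11, 1731/32, O.S.</date>
</egXML>
Here the content of the element preserves the Old Style date as written,
while <att>when</att> gives its Gregorian equivalent.
</p>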
<!-- example needed -->

<specGrp xml:id="DCODA" n="Dates and times">
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/date.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/time.xml"/>
</specGrp>
</div>
<div type="div3" xml:id="CONAAB"><head>Abbreviations and Their Expansions</head>
<p>It is sometimes desirable to mark abbreviations in the copy text,
whether to trigger special processing for them, to provide the full form
of the word or phrase abbreviated, or to allow for different possible
expansions of the abbreviation. Abbreviations may be transcribed as
they stand, or expanded; they may be left unmarked, or marked using
these tags: 
<specList>
<specDesc key="abbr"/>
<specDesc key="expan"/>
</specList>
</p><p>The <gi>abbr</gi> element is useful as a means of distinguishing
semi-lexical items such as acronyms or jargon:
<egXML xmlns="http://www.tei-c.org/ns/Examples"
 source="#CONAAB-eg-150">We can sum up the above discussion as follows: 
the identity of a <abbr>CC</abbr> is defined by that calibration of values which
motivates the elements of its <abbr>GSP</abbr>; ...</egXML>
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#CONAAB-eg-151">Every manufacturer of <abbr>3GL</abbr> or <abbr>4GL</abbr>
languages is currently nailing on <abbr>OOP</abbr> extensions.</egXML> 
</p><p>The <att>type</att> attribute may be used to distinguish types
of abbreviation by their function:<egXML xmlns="http://www.tei-c.org/ns/Examples"><abbr type="title">Dr.</abbr> <abbr type="initial">M.</abbr> Deegan is
the Director of the <abbr type="acronym">CTI</abbr> Centre for Textual Studies.</egXML>
 </p>
<p>Abbreviations such as <mentioned>Dr. M.</mentioned> above may be
treated as two abbreviations, as above, or as one: <egXML xmlns="http://www.tei-c.org/ns/Examples"><abbr>Dr. M.</abbr> Deegan is
the Director of the <abbr>CTI</abbr> Centre for Textual Studies.</egXML>
 </p>
<p>The <gi>expan</gi> element may be used simply to record that an
abbreviation has been silently expanded by the encoder, perhaps for
reasons of house style or editorial policy. It should
always include the whole of an abbreviated phrase or word. More
usually, however, the <gi>expan</gi> element will be combined with the
<gi>abbr</gi> element inside a <gi>choice</gi> element to record both
the abbreviation and its expansion: 
<egXML xmlns="http://www.tei-c.org/ns/Examples"> the 
<choice><expan>World Wide Web Consortium</expan>
<abbr>W3C</abbr></choice></egXML>
Nested abbreviations may also be handled in this way:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><choice><abbr>RELAXNG</abbr><expan>regular
language for <choice><abbr>XML</abbr><expan>extensible markup
language</expan></choice>, next
generation</expan></choice></egXML></p>

<p>Abbreviation is a particularly important feature of manuscript
and other source materials, the transcription of which needs more detailed treatment than
is possible using these simple elements. A more detailed set of
recommendations is discussed in <ptr target="#PHCH"/>, which includes
additional elements made available for the purpose by the <ident type="module">transcr</ident> module. </p>

<specGrp xml:id="DCOAB" n="Abbreviations">

<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/abbr.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/expan.xml"/></specGrp>
</div></div>
<div type="div2" xml:id="COXR"><head>Simple Links and Cross-References</head>

<p>Cross-references or links between one location in a document and one or more
other locations, either in the same or different XML documents, may be encoded
using the elements <gi>ptr</gi> and <gi>ref</gi>, as discussed in this
section. These elements both <soCalled>point</soCalled> from one
location in a document, the place that the element itself appears, to
another (or to several), specified by means of a <att>target</att>
attribute, supplied by the <ident
type="class">att.pointing</ident> class:
<specList>
<specDesc key="att.pointing" atts="target"/>
</specList>
Linkages of several other kinds are also provided for in these
Guidelines; see further chapter <ptr target="#SA"/>.
 </p>
<p>The value of the <att>target</att> attribute, wherever it appears,
provides a way of pointing to some other element using a method
standardized by the W3C and known as the <term>XPointer</term>
mechanism. This permits a range of complexity, from the very simple
(a reference to the value of the target element's <att>xml:id</att>
attribute) to the more complex usage of a full URI with
embedded XPointers. For example, the source of the following paragraph
looks something like this: 
<egXML xmlns="http://www.tei-c.org/ns/Examples"><p>...
The complete XPointer specification is managed by the W3C<note place="foot"><ptr target="http://www.w3.org/TR/xptr-framework/"/>, 
<ptr target="http://www.w3.org/TR/xptr-element/"/>, 
<ptr target="http://www.w3.org/TR/xptr-xmlns/"/>, and 
<ptr target="http://www.w3.org/TR/xptr-xpointer/"/></note>;
for a discussion of TEI schemes for XPointer, see 
<ptr target="#eSATS"/>.</p>
<!--... -->
<div xml:id="eSATS">
  <!--... -->
</div>
</egXML>
Alternatively, if no explicit link is to
be encoded, but it is simply required to mark the phrase as a
cross-reference, the <gi>ref</gi> element may be used without a
<att>target</att> attribute.</p>
<p>For an introduction to the use of links in general, see <ptr
target="#SA"/>. The complete XPointer specification is managed by the W3C<note place="foot"><ptr target="http://www.w3.org/TR/xptr-framework/"/>, <ptr target="http://www.w3.org/TR/xptr-element/"/>, <ptr target="http://www.w3.org/TR/xptr-xmlns/"/>, and <ptr target="http://www.w3.org/TR/xptr-xpointer/"/></note>; for a discussion of
TEI schemes for XPointer, see <ptr target="#SATS"/>.</p>
<p>
<specList><specDesc key="ptr" /><specDesc key="ref"/></specList>
 </p>
<p>The elements <gi>ptr</gi> and <gi>ref</gi> are the default members
of the phrase-level model class <ident type="class">model.ptrLike</ident>. As
members of the classes <ident type="class">att.pointing</ident>,
<ident type="class">att.typed</ident>, <ident
type="class">att.cReferencing</ident>,  and <ident type="class">att.internetMedia</ident> they
also carry the following
attributes:
<specList><specDesc key="att.pointing" atts="target evaluate"/>
<specDesc key="att.cReferencing" atts="cRef"/>
<specDesc key="att.typed" atts="type subtype"/>
  <specDesc key="att.internetMedia" atts="mimeType"/></specList>
 </p>
<p>The two elements may be used in the same
way; the difference between them is simply that while the <gi>ptr</gi>
element is empty, the <gi>ref</gi> element may contain phrases
specifying, or describing more exactly, the target of a cross-reference,
which form the content of the element. Since its content thus serves as
a human-readable pointer, in the simplest case a <gi>ref</gi> element
need not identify its target in any other way. For example:
<egXML xmlns="http://www.tei-c.org/ns/Examples">See <ref>section 12 on page 34</ref>.</egXML>
 </p>
<p>More usually, it will be desirable to identify the target of the
cross-reference using either the <att>target</att> or the
<att>cRef</att>  attribute, so that
processing software can access it directly, for example to implement a
linkage, to generate an appropriate reference, or to give an error
message if it cannot be found. Assuming that section
12 in the previous example has been tagged 
<egXML xml:lang="und" xmlns="http://www.tei-c.org/ns/Examples">
<div1 xml:id="SEC12"><!-- ... --> </div1></egXML> 
then the same cross-reference might more exactly be encoded as 
<egXML xmlns="http://www.tei-c.org/ns/Examples">See especially <ref target="#SEC12">section 12 on page 34</ref>.</egXML>
</p>
<p>If the cross-reference itself is to be generated according to a
fixed pattern, or if no text is to appear in the body of the 
cross-reference, the <gi>ptr</gi> element would be used as follows:
<egXML xmlns="http://www.tei-c.org/ns/Examples">See in particular <ptr target="#SEC12"/>.</egXML>
 </p>
<p>The <att>cRef</att> attribute may be used to express the target of
a cross reference using some canonical referencing scheme, such as
those typically used for ancient texts.  In this case, the referencing
scheme must be defined using the <gi>cRefPattern</gi> element
discussed below (<ptr target="#CORS6"/>); the definition it provides is used
to translate the value of the <att>cRef</att> attribute into a
conventional pointer value, such as one that might be supplied by the
<att>target</att> attribute. It is an error to supply both
<att>cRef</att> and <att>target</att> values. </p>
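<p>A minimal sketch of this mechanism follows. The pattern, the reference
<val>Matt 5:7</val>, and the assumption that the text is divided into nested
<gi>div</gi> elements carrying <att>n</att> attributes for book, chapter,
and verse are all hypothetical, chosen only to illustrate the translation
step:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#NONE">
<!-- in the TEI header: a hypothetical referencing scheme -->
<refsDecl>
  <cRefPattern matchPattern="([A-Za-z]+) (\d+):(\d+)"
    replacementPattern="#xpath(//div[@n='$1']/div[@n='$2']/div[@n='$3'])"/>
</refsDecl>
<!-- in the text: only cRef is supplied, never cRef and target together -->
See <ref cRef="Matt 5:7">Matthew 5:7</ref>.
</egXML>
Under these assumptions, the value of <att>cRef</att> would be translated
into the pointer <code>#xpath(//div[@n='Matt']/div[@n='5']/div[@n='7'])</code>.
</p>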
<p>When the <att>target</att> attribute is used, a cross reference may point to any number of locations simultaneously,
simply by giving more than one identifier as the value of its
<att>target</att> attribute.  This may be particularly useful where
an analytic index is to be encoded, as in the following example:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COXR-eg-164"><list>
   <item>Saints aid rejected in mel. <ptr target="#p299"/></item>
   <item>Sallets censured <ptr target="#p143 #p144"/></item>
   <item>Sanguine mel. signs <ptr target="#p263"/></item>
   <item>Scilla or sea onyon, a purger of mel. <ptr target="#p442"/></item>
</list></egXML>
Here the targets of the cross-references are simply page numbers; it
is assumed that corresponding elements with identifiers
<ident>p299</ident>, <ident>p143</ident>, etc. have been provided in
the body of the text, for example as page breaks
<egXML xmlns="http://www.tei-c.org/ns/Examples" xml:lang="und">
<pb xml:id="p143"/>
...
<pb xml:id="p144"/>
...
<pb xml:id="p263"/>
...
<pb xml:id="p299"/>
...
<pb xml:id="p442"/>
...
</egXML>
</p>
<p>A similar method may be used to link annotations on a text with the
sigla used to encode their points of attachment in a text. For
example:
<egXML xmlns="http://www.tei-c.org/ns/Examples" xml:lang="und">
annotated text <ref target="#a51" type="noteAnchor">&#x2075;&#x00B9;</ref>
<!-- ... -->
<note xml:id="a51" type="footnote">text of annotation</note>
</egXML>
</p>
<p>The <att>type</att> attribute may be used,
as elsewhere, to categorize the cross-reference according to any
system of importance to the encoder. If bibliographic references
require special processing (e.g. in order to provide a consistent
short-form reference), they might be tagged thus: <egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COXR-eg-166">Similar forms, often called
<term rend="ldquo rdquo">rewriting systems</term>, have a long history
among mathematicians, but the specific form of <ptr target="#fig22"/>
was first studied extensively by Chomsky <ptr type="bibliog" target="#chom59"/>.
<!-- ... -->
<figure xml:id="fig22"><graphic url="fig22.jpg"/></figure>
<!-- elsewhere, in the bibliography -->
<bibl xml:id="chom59"><!-- citation for the book referenced above --></bibl>
</egXML> 
The value <val>bibliog</val> for the <att>type</att> attribute on the
second <gi>ptr</gi> element here might be used to indicate that the
object being referenced here is a bibliographic entry rather than a
simple cross-reference to an illustration, as is the first
<gi>ptr</gi>. In either case, the value of the <att>target</att>
attribute is a pointer to some other element.
 </p><p>The <gi>ptr</gi> and <gi>ref</gi> elements have many applications in
addition to the simple cross-referencing facilities illustrated in this
section. In conjunction with the analytic tools discussed
in chapters <ptr target="#SA"/>, <ptr target="#AI"/>, and <ptr target="#FS"/>, they may be
used to link analyses of a text to their object, to combine
corresponding segments of a text, or to align segments of a text with a
temporal or other axis or with each other.</p> 
  <p>Where the <att>target</att> attribute of <gi>ptr</gi> or <gi>ref</gi>
points to an external resource available on the network, the <att>mimeType</att> attribute 
 may be used to specify the MIME type of that resource; this may be important 
 to enable appropriate processing. For example:
    <egXML xmlns="http://www.tei-c.org/ns/Examples" xml:lang="en">
      <p>The current version of the TEI Guidelines source code is available in the TEI Sourceforge Repository; <ref target="http://sourceforge.net/p/tei/code/HEAD/tree/trunk/P5/Source/guidelines-en.xml" mimeType="application/tei+xml">guidelines-en.xml</ref> is the root document used to create the English version of the Guidelines.</p>
    </egXML>
  </p><specGrp xml:id="DCOXR" >
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/ptr.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/ref.xml"/>
</specGrp>

</div>
<div type="div2" xml:id="COLI"><head>Lists</head>
<p>The following elements are provided for the encoding of lists,
their constituent items, and the labels or headings associated
with them:
<specList><specDesc key="list"/><specDesc key="item"/><specDesc key="label"/><specDesc key="head"/><specDesc key="headLabel"/><specDesc key="headItem"/></specList>
 </p>
<p>The <gi>list</gi> element should be used to mark any kind of
<term rend="noindex">list</term>:
numbered, lettered, bulleted, or unmarked. Lists formatted as such in
the copy text should in general be encoded using this element, with an
appropriate value for the <att>rend</att> attribute. Suggested values 
  for <att>rend</att> include:
  <list rend="bulleted">
    <item><term>bulleted</term> (items preceded by bullets or similar markings)</item>
    <item><term>inline</term> (items rendered within continuous prose, with no linebreaks)</item>
    <item><term>numbered</term> (items preceded by numbers or letters)</item>
    <item><term>simple</term> (items rendered as blocks, but with no bullet or number)</item>
  </list>
  
  Some of these values may of course be combined; a list may be inline, but also be rendered with 
  numbers. An example appears below. For a more sophisticated and detailed description of list rendering, consider using the <att>style</att>
  attribute with Cascading Style Sheets (CSS) properties and values, as described in the W3C's
  <ref target="http://www.w3.org/TR/css-lists-3/">CSS Lists and Counters Module Level 3</ref>.</p>
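<p>As a minimal sketch of the <att>style</att> approach (the list content and the choice of CSS
property value here are hypothetical, chosen only to show the mechanism), an encoder wishing to
record that a list was numbered with lower-case roman numerals might write:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<!-- hypothetical list; list-style-type is a standard CSS property -->
<list rend="numbered" style="list-style-type: lower-roman">
  <item>first item</item>
  <item>second item</item>
</list>
</egXML>
How such a <att>style</att> value is interpreted remains a matter for the processing software.</p>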

<p>Each distinct item in the list should be encoded as a distinct
<gi>item</gi> element.  If the numbering or other identification for the
items in a list is unremarkable and may be reconstructed by any
processing program, no enumerator need be specified.  If however an
enumerator is retained in the encoded text, it may be supplied either by
using the <att>n</att> attribute on the <gi>item</gi> element, or by
using a <gi>label</gi> element.  The following examples are thus
equivalent:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#GibAut">I will add two facts, which have seldom occurred in
the composition of six, or even five quartos.
<list rend="inline numbered">
      <label>(1)</label>
      <item>My first rough manuscript, without any
intermediate copy, has been sent to the press.</item>
      <label>(2)</label>
      <item>Not a sheet has been seen by any human
eyes, excepting those of the author and the printer:
the faults and the merits are exclusively my own.</item>
   </list></egXML>
<egXML xmlns="http://www.tei-c.org/ns/Examples"  source="#GibAut">I will add two facts, which have seldom occurred in
the composition of six, or even five quartos.
<list rend="inline numbered">
      <item n="1">My first rough manuscript, without any
intermediate copy, has been sent to the press.</item>
      <item n="2">Not a sheet has been seen by any human
eyes, excepting those of the author and the printer:
the faults and the merits are exclusively my own.</item>
   </list></egXML>
The two styles may not be mixed in the same list:  if one item is
preceded by a label, all must be.
 </p>
<p>A list need not necessarily be displayed in list format.  For
example, the following is a reasonable encoding of a list which (in
the original) is simply printed as a single paragraph:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COLI-eg-171">On those remote pages it is written that animals are
divided into <list rend="inline">
   <item n="a">those that belong to the Emperor, </item>
   <item n="b">embalmed ones, </item>
   <item n="c">those that are trained, </item>
   <item n="d">suckling pigs, </item>
   <item n="e">mermaids,  </item>
   <item n="f">fabulous ones, </item>
   <item n="g">stray dogs, </item>
   <item n="h">those that are included in this classification, </item>
   <item n="i">those that tremble as if they were mad, </item>
   <item n="j">innumerable ones, </item>
   <item n="k">those drawn with a very fine camel's-hair brush, </item>
   <item n="l">others, </item>
   <item n="m">those that have just broken a flower vase, </item>
   <item n="n">those that resemble flies from a distance. </item>
   </list></egXML>
 </p>
<p>A list may be given a heading or title, for which the <gi>head</gi>
element should be used, as in the next example, which also demonstrates
simple use of the <gi>label</gi> element to mark a tabular or glossary
list in which each item is associated with a word or phrase rather than
a numeric or alphabetic enumerator:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COLI-eg-172"><list type="gloss">
 <head>Report of the conduct and progress of Ernest Pontifex.
   Upper Vth form — half term ending Midsummer 1851</head>
 <label>Classics</label>    <item>Idle listless and unimproving</item>
 <label>Mathematics</label> <item>ditto</item>
 <label>Divinity</label>    <item>ditto</item>
 <label>Conduct in house</label> <item>Orderly</item>
 <label>General conduct</label>
 <item>Not satisfactory, on account of his great 
    unpunctuality and inattention to duties</item>
</list></egXML>
 </p>
<p>In such a list, the individual items have internal structure.  In
complex cases, where list items contain many components, the list is
better treated as a <term rend="noindex">table</term>,
<index><term>tables</term><index><term>and
lists</term></index></index> on which see chapter <ptr target="#FT"/>.  A particularly important instance of the simple two-column
table is the <soCalled>glossary list</soCalled>, which should be marked
by the tag <tag>list type="gloss"</tag>.  In such lists, each
<gi>label</gi> element contains a term and each <gi>item</gi> its gloss;
it is a semantic error for a list tagged with <code>type="gloss"</code> not to have labels.  For example:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#CO-eg-04"><list type="gloss">
  <head>Unit Three — Vocabulary</head>
  <label xml:lang="la">acerbus, -a, -um     </label> <item>bitter, harsh</item>
  <label xml:lang="la">ager, agrī, M. </label> <item>field</item>
  <label xml:lang="la">audiō, -īre,
    -īvī, -ītus         </label> <item>hear, listen (to)</item>
  <label xml:lang="la">bellum, -ī, N. </label> <item>war</item>
  <label xml:lang="la">bonus, -a, -um       </label> <item>good</item>
</list></egXML>
Additionally, the  <gi>term</gi> and <gi>gloss</gi> elements discussed
in section <ptr target="#COHQU"/> might be used to make explicit the role
that each column in the glossary list has, as follows:
<egXML xmlns="http://www.tei-c.org/ns/Examples"  source="#CO-eg-04"><list type="gloss">
  <head>Unit Three — Vocabulary</head>
  <label><term xml:lang="la">acerbus, -a, -um</term> </label>
  <item><gloss>bitter, harsh</gloss> </item>
  <label><term xml:lang="la">ager, agrī, M. </term> </label>
  <item><gloss>field</gloss> </item>
  <label>
	<term xml:lang="la">audiō, -īre, -īvī, -ītus</term>
  </label>
  <item><gloss>hear, listen (to)</gloss> </item>
  <label><term xml:lang="la">bellum, -ī, N. </term> </label>
  <item><gloss>war</gloss> </item>
  <label><term xml:lang="la">bonus, -a, -um</term> </label>
  <item><gloss>good</gloss> </item>
</list></egXML>
Note in the above examples the use of the global <att>xml:lang</att>
attribute to specify on the <gi>label</gi> (or <gi>term</gi>) element
what language the term is from.  For further discussion of the
<att>xml:lang</att> attribute see section <ptr target="#STGA"/>, and
section <ptr target="#CHSH"/>.  A more elaborate markup for this
glossary would distinguish the headword forms from the grammatical
information (principal parts and gender), perhaps using elements taken
from <ptr target="#DI"/>.
 </p>
<p>In addition to the <gi>head</gi> element used to supply
a title or heading for the whole list, headings for the two
columns of a glossary-style list may be specified using
the two special elements <gi>headLabel</gi> and <gi>headItem</gi>:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COLI-eg-175">The simple, straightforward statement of an idea is
preferable to the use of a worn-out expression.
<list type="gloss">
  <headLabel>TRITE</headLabel>
  <headItem>SIMPLE, STRAIGHTFORWARD</headItem>
  <label>bury the hatchet  </label> <item>stop fighting, make peace</item>
  <label>at loose ends     </label> <item>disorganized</item>
  <label>on speaking terms </label> <item>friendly</item>
  <label>fair and square   </label> <item>completely honest</item>
  <label>at death's door   </label> <item>near death</item>
</list></egXML>
 </p>
<p>The elements <gi>label</gi>, <gi>head</gi>, <gi>headLabel</gi>, and
<gi>headItem</gi> may contain only phrase-level elements.  The
<gi>item</gi> element however may contain paragraphs or other
<soCalled>chunks</soCalled>, including other lists.  In this example, a
glossary list contains two items, each of which is itself a simple list:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COLI-eg-176"><list type="gloss">
   <label>EVIL</label>
   <item>
      <list rend="bulleted">
         <item>I am cast upon a horrible desolate island, void
            of all hope of recovery.</item>
         <item>I am singled out and separated as it were from
            all the world to be miserable.</item>
         <item>I am divided from mankind — a solitaire; one
            banished from human society.</item>
      </list> 
   </item>
   <label>GOOD</label>
   <item>
      <list rend="bulleted">
         <item>But I am alive; and not drowned, as all my
            ship's company were.</item>
         <item>But I am singled out, too, from all the ship's
            crew, to be spared from death...</item>
         <item>But I am not starved, and perishing on a barren place,
            affording no sustenances....</item>
      </list>
   </item>
</list></egXML>
 </p>
<p>Lists of different types may be nested to arbitrary depths in this
way.
 </p>
<specGrp xml:id="DCOLI" n="Lists and List Items">
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/list.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/item.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/label.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/head.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/headLabel.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/headItem.xml"/>
</specGrp>
</div>
<div type="div2" xml:id="CONO"><head>Notes, Annotation, and Indexing</head>
<div type="div3" xml:id="CONONO"><head>Notes and Simple Annotation</head>
<p>The following element is provided for the encoding of
discursive notes, whether already present in the copy text or
supplied by the encoder:
<specList><specDesc key="note"/></specList>	
 </p>
<p>A note is any additional comment found in a text, marked in some way as being
out of the main textual stream.  All notes should be marked using the
same tag, <gi>note</gi>, whether they appear as block notes in the main
text area, at the foot of the page, at the end of the chapter or volume,
in the margin, or in some other place.
 </p>
<p>Notes may be in a different hand or typeface, may be authorial or
editorial, and may have been added later.  Attributes may be used to
specify these and other characteristics of notes, as detailed below.
 </p>
<p>A note is usually attached to a specific point or span within a text, which we
term here its <term>point of attachment</term>. In conventional
printed text, the point of attachment is represented by some siglum
such as a star or cross, or a superscript digit. </p>
<p>When encoding such a text, it is conventional to replace this
siglum by the content of the annotation, duly marked up with a
<gi>note</gi> element. This may not always be
possible, for example with marginal notes, which may not be anchored to
an exact location. For ease of processing, it may be adequate to
position marginal notes before the relevant paragraph or other
element. In printed texts, it is sometimes conventional to group notes
together at the foot of the page on which their points of attachment
appear. This practice is not generally recommended for TEI-encoded
texts, since the pagination of a particular printed text is unlikely
to be of structural significance.  In some cases, however, it may be
desirable to transcribe notes not at their point of attachment to the
text but at their point of appearance, typically at the end of the
volume, or the end of the chapter.  In such cases, the
<att>target</att> attribute of the <gi>note</gi> may be used to
indicate the point of attachment.  It is also possible to encode the
point of attachment itself, using the <gi>ptr</gi> or <gi>ref</gi>
element, pointing from that to the body of the <gi>note</gi> placed
elsewhere. </p>
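<p>The two approaches just described might be sketched as follows. The identifiers and the
elided text are purely illustrative and are not taken from any source. In the first fragment, a note
transcribed at the end of the chapter points back to an <gi>anchor</gi> marking its point of
attachment; in the second, a <gi>ref</gi> at the point of attachment points forward to the note:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<!-- (a) the note points to its point of attachment -->
<p>... the first edition<anchor xml:id="att1"/> appeared ...</p>
<!-- ... at the end of the chapter ... -->
<note n="1" place="end" target="#att1">...</note>

<!-- (b) the point of attachment points to the note -->
<p>... the first edition<ref target="#en1">1</ref> appeared ...</p>
<!-- ... at the end of the chapter ... -->
<note xml:id="en1" n="1" place="end">...</note>
</egXML>
Which of the two is preferable is likely to depend on whether the note reference marks in the
source are themselves to be transcribed.</p>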
<p>In cases where the note is
applied not to a point but to a span of text,  not itself represented
as a TEI element, the
<att>target</att> attribute may use an appropriate pointer
expression, for example using the <ident>range()</ident> function
to specify the span of attachment.</p>
<p>For further discussion of pointing
to points and spans in the text, see section <ptr target="#COXR"/>.</p>
<p>In the following example, the <att>type</att> attribute is used to
categorise the note as a gloss:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#CONONO-eg-189"><l>The self-same moment I could pray</l>
<l>And from my neck so free</l>
<l>The albatross fell off, and sank</l>
<l>Like lead into the sea.
<note type="gloss" place="margin">The spell begins to break</note>
</l></egXML>
As the <gi>note</gi> appears within an <gi>l</gi> element, we may
infer that its point of attachment is in the margin adjacent to the
line in question. In the following version of the same text, however, it may be
inferred that the note applies to the whole of the stanza:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#CONONO-eg-189"><lg><l>The self-same moment I could pray</l>
<l>And from my neck so free</l>
<l>The albatross fell off, and sank</l>
<l>Like lead into the sea.</l>
<note type="gloss" place="margin">The spell begins to break</note>
</lg></egXML>
</p>
<p>This type of annotation, very common in the early printed texts
which Coleridge may be presumed to be imitating in this case, may also
be regarded as providing a heading or descriptive label for the
passage concerned. The encoder may therefore prefer to use the
<gi>label</gi> element to represent it, as in the following case:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#CONONO-eg-189"><lg><l>The self-same moment I could pray</l>
<l>And from my neck so free</l>
<l>The albatross fell off, and sank</l>
<l>Like lead into the sea.</l>
<label place="margin">The spell begins to break</label> </lg></egXML>
</p>
<p>In the following example, a note which appears at the foot of the
page in the printed source is given at its point of attachment within
the text. The global <att>n</att> attribute is used to indicate the
note number: <egXML
xmlns="http://www.tei-c.org/ns/Examples" source="#LANGENPOST">Collections are ensembles of
distinct entities or objects of any sort.<note n="1" place="bottom">We
explain below why we use the uncommon term
<mentioned>collection</mentioned> instead of the expected
<mentioned>set</mentioned>.  Our usage corresponds to the
<mentioned>aggregate</mentioned> of many mathematical writings and to
the sense of <mentioned>class</mentioned> found in older logical
writings.</note> The elements ...</egXML>
 </p>
<p>In addition to transcribing notes already present in the copy text,
researchers may wish to add their own notes or comments to it. The
<gi>note</gi> element may be used for either purpose, but it will
usually be advisable to distinguish the two categories. One way might
be to use the <att>type</att> attribute shown above, categorizing notes as <mentioned>authorial</mentioned>,
<mentioned>editorial</mentioned>, etc. Where notes
derive from many sources, or where a more precise attribution is
required, the <att>resp</att> attribute may be used to point to a
definition of the person or other agency responsible for the content
of the note.</p>
<p>As a simple example, an edition of the <title>Ancient
Mariner</title> might include both Coleridge's original glosses and
those of a modern commentator:
<egXML xmlns="http://www.tei-c.org/ns/Examples"
 source="#CONONO-eg-189"><lg>
<!-- ... -->
<l>The self-same moment I could pray;
<note place="margin" resp="#STC" type="gloss">
The spell begins to break</note>
<note place="bottom" resp="#JLL">The turning point of the poem...</note>
</l>
</lg></egXML>
For this to be valid, the codes <code>#JLL</code> and
<code>#STC</code> must point to some more information identifying the
agency concerned. The syntax used is identical to that used for other
cross-references, as discussed in <ptr target="#COXR"/>; thus in this
case,  the TEI header for this text might contain a 
title statement like the following:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<titleStmt>
<title>The Rime of the Ancient Mariner: an annotated edition</title>
<author xml:id="STC">Samuel Taylor Coleridge</author>
<editor xml:id="JLL">John Livingston Lowes</editor>
</titleStmt>
</egXML></p>

<p>When annotating the electronic text by means of analytic notes in
some structured vocabulary, e.g. to specify the topics or themes of a
text, the <gi>span</gi> and <gi>interp</gi> elements may be more
effective than the free form <gi>note</gi> element; these elements are
available when the module for simple analysis is selected (see section
<ptr target="#AISP"/>).
 </p>
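<p>As a rough sketch only, a thematic annotation of the stanza used above might take a form
like the following. The identifiers, the <att>type</att> value <val>theme</val>, and the
interpretations themselves are hypothetical; the attributes assumed here (<att>ana</att> on the
annotated element, <att>from</att> on <gi>span</gi>) should be checked against section
<ptr target="#AISP"/>:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<!-- the annotated passage points to an interpretation by means of ana -->
<lg xml:id="AM-lg1" ana="#AM-release">
 <l>The self-same moment I could pray</l>
 <!-- ... -->
</lg>
<!-- elsewhere: a reusable interpretation, and a span pointing at the passage -->
<interp xml:id="AM-release" type="theme">release</interp>
<span type="theme" from="#AM-lg1">the spell begins to break</span>
</egXML>
</p>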
</div>
<div type="div3" xml:id="CONOIX"><head>Index Entries</head>
<p>The indexing of scholarly texts is a skilled activity, involving
substantial amounts of human judgment and analysis. It should not therefore be
assumed that simple searching and information retrieval software will
be able to meet all the needs addressed by a well-crafted manual
index, although it may complement such an index, for example by providing free
text search. The role of an index is to provide access via
keywords and phrases which are not necessarily present in the text
itself, but must be added by the skill of the indexer. 
</p>
<div type="div4" xml:id="CONOIXpre"><head>Pre-existing Indexes</head>
<p>When encoding a pre-existing text, therefore, if such an index
is present it may be advisable to retain it along with the text,
rather than attempt to regenerate it automatically. Elements discussed
elsewhere in these Guidelines may be used for this purpose. For
example, the <gi>div1</gi> element or <gi>div</gi> element may be used
to mark the section of the text containing the index and the
<gi>list</gi> element might be used to mark the index itself, each
entry being represented by an <gi>item</gi> element, possibly
containing within it a series of <gi>ptr</gi> or <gi>ref</gi>
elements, as follows:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COXR-eg-164"><div type="index">
<!--...-->
<list type="index">
<item>Women, how cause of mel. <ref>193</ref>; their vanity in
apparell taxed, <ref>527</ref>; their counterfeit tears
<ref>547</ref>; their vices <ref>601</ref>, commended,
<ref>624</ref>.</item>
<item>Wormwood, good against mel. <ref>443</ref></item>
<item>World taxed, <ref>181</ref></item>
<item>Writers of the cure of mel. 295</item>
<!--...-->
</list>
</div></egXML>
</p>
<p>Note that this simple representation does not capture the nested
structure of the first of these index entries. A more accurate representation might
entail the use of nested lists like the following:
<egXML xmlns="http://www.tei-c.org/ns/Examples"  source="#COXR-eg-164"><item>Women, 
   <list><item>how cause of mel. <ref>193</ref>;</item> 
      <item>their vanity in apparell taxed, <ref>527</ref>;</item>
      <item>their counterfeit tears <ref>547</ref>;</item>
      <item>their vices 
          <list><item><ref>601</ref>,</item>
              <item> commended, <ref>624</ref>.</item>
           </list></item>
   </list>
</item></egXML>
</p>
<p>The page references, encoded simply as <gi>ref</gi> elements above,
might also include direct links to the appropriate location in the
encoded text, using (for example) a <att>target</att> attribute to supply the
identifier of an associated page break element:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><!-- in the text -->
<pb xml:id="P624"/> <!-- start of  page 624 -->
<!-- in the index -->
<ref target="#P624">624</ref>
</egXML>

For further discussion of this and alternative ways of encoding such
links see the discussion in section <ptr target="#SA"/>. Note that
similar methods may also be used to encode a table of contents, as
further exemplified in section <ptr target="#DSFRONT"/>.
</p>
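<p>A table of contents encoded along the same lines might be sketched as follows; the headings,
page numbers, and identifiers are invented for illustration:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<div type="contents">
 <head>Contents</head>
 <list>
  <item>Chapter 1 <ref target="#CH1">1</ref></item>
  <item>Chapter 2 <ref target="#CH2">27</ref></item>
 </list>
</div>
<!-- in the body of the text -->
<div type="chapter" xml:id="CH1">
 <head>Chapter 1</head>
 <p>...</p>
</div>
<div type="chapter" xml:id="CH2">
 <head>Chapter 2</head>
 <p>...</p>
</div>
</egXML>
Here each <gi>ref</gi> points to the division itself rather than to a page break, so that the
link remains usable even where the original pagination is not recorded.</p>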
</div>
<div type="div4" xml:id="CONOIXgen"><head>Auto-generated Indexes</head>
<p>It can also be useful, however, to generate a new index from a
machine-readable text, whether the text is being written for the first time
with the tags here defined, or as an addition to a text transcribed from
some other source. Depending on the complexity of the text and its subject
matter, such an automatically-generated index may not in itself satisfy all
the needs of scholarly users. However it can assist a professional indexer
to construct a fully adequate index, which might then be post-edited into
the digital text, marked-up along the lines already suggested for preserving
pre-existing index material.</p>
<p>Indexes generally contain both  references to specific pages or
sections and references to page ranges or sequences. The same element
is used in either case:
<specList>
<specDesc key="index"/>
</specList>
</p>
<p>Like the <gi>interp</gi> element described in <ptr target="#AISP"/>
this element may be used simply to provide a descriptive or interpretive
label of some kind for any location within a text, to be processed in
any way by analytic software, but its main purpose is to facilitate
the generation of an index for a printed version of the text. An
<gi>index</gi> element may be placed anywhere within a text, between
or within other elements. The
headwords to be used when making up this index are given by the
<gi>term</gi> elements within the <gi>index</gi>
element. The location of the generated index
might be specified by means of a processing instruction within the
text, such as the following (the exact form of the PI is of course
dependent on the application software in use):
<eg><![CDATA[<?tei indexplacement ?>]]></eg>
Alternatively, the special purpose  <gi>divGen</gi> element might be used.</p>
<p>In the simplest case, a single headword is supplied by
a <gi>term</gi> element contained by an
<gi>index</gi> element:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><p>The students understand procedures for Arabic lemmatisation
<index>
  <term>Lemmatization, Arabic</term>
</index>and are beginning to build parsers.</p></egXML>
 </p>
<p>The effect of this is to document an index entry for the term
<q>Lemmatization, Arabic</q>,
which when processed could reference the location of the original <gi>index</gi> element. </p>
<p>If the subject of Arabic lemmatization is treated at length
in a text, then the index entry generated may need to reference a
sequence of locations (e.g. page numbers). In such a case it will be necessary to identify the end of the relevant
span of text as well as its starting point. This is most conveniently
done by supplying an  empty <gi>anchor</gi> element (as discussed in chapter
<ptr target="#SA"/>) at the appropriate point and pointing to it from
the <gi>index</gi> element by means of its <att>spanTo</att>
attribute, as
in this example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><p>We now turn to the
topic of Arabic lemmatisation
<index spanTo="#ALAMEND">
  <term>Lemmatization, Arabic</term>
</index> concerning which it is important to note .....
<!-- much learned material omitted here -->
and now we can  build our parser.<anchor xml:id="ALAMEND"/></p></egXML>
 </p>
<p>This would generate the same index entries as the previous example,
but the reference would be to the whole span of text between the
location of the <gi>index</gi> element and the location of the element
identified by the code <ident>ALAMEND</ident>, rather than
a single point, and thus might (for
example) include a sequence of page numbers.</p>
<p>Although the position of the <gi>index</gi> element in the text
provides the target location that will be specified in the generated index
entry, no part of the text itself is used to construct that entry. Index
terms appearing in the entry come solely from the content of <gi>term</gi>
elements, which consequently may have to repeat words or phrases from the
text proper. This need not be done verbatim, thus giving scope for
normalization of spelling (as in the example above) or other modifications which may assist
generation of an index in a desired form or sequence.</p>
<p>Sometimes, for example when
index terms are taken from a different language or consist of
mathematical formulae or other expressions, even a
normalized form of an index term may be insufficient for an application to
order it exactly as desired. The <att>sortKey</att> attribute may be
used to address this problem, as in the following example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><p>The @ operator 
<index><term sortKey="0000">@</term></index> precedes an
attribute name</p></egXML> Here, an entry for the symbol @ will appear
in the index, but will be sorted alphabetically as if it were the
string <val>0000</val>. This technique is also useful when an index
entry is to contain some non-Unicode character or glyph represented by
the <gi>g</gi> element discussed in chapter <ptr target="#WD"/>. In
the following example, we assume that somewhere a definition for this
glyph has been provided using the elements described in chapter <ptr target="#WD"/>, and given the code <val>PrinceGlyph</val>:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<char xml:id="PrinceGlyph">
<!-- definition of the glyph here -->
</char>

<p>The Artist formerly known as Prince  <index><term sortKey="Prince"><g ref="#PrinceGlyph"/></term></index>...</p></egXML>
Note that if no value is supplied for the <att>sortKey</att> attribute, a sorting
application should always use the content of the <gi>term</gi> element
as a sort key.</p>

<p>It is common practice to compile more than one index for a given text.
A biography of a poet, for example, may offer an index of references to
poems by the subject of the study, another index of works by other writers,
an index of places or historical personages etc. The <att>indexName</att>
attribute is used to assign index terms and locations to one or
more specific indexes:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><p>Sir John Ashford
<index indexName="INDEX-PERSONS"><term>Ashford, John</term></index> was,
coincidentally, born in 
<index indexName="INDEX-PLACES"><term>Ashford
(Kent)</term></index>Ashford...</p></egXML></p>

<p>Multi-level indexing is particularly common in scholarly
documents. For example, as well as entries
such as <term>TEI</term>, or <term>markup</term>, an index  may contain structured entries like <term>TEI,
markup practices, index terms</term>, where a top level entry <term>TEI</term>
is followed by a number of second-level subcategories, any or all of
which may have a third-level list attached to them and so on. In order to
reflect such a hierarchical index listing,  <gi>index</gi>  elements may be
nested to the required depth. For example,
suppose that we wish to make a structured index entry for
<q>lemmatisation</q> with subentries for <q>Arabic</q>,
<q>Sanskrit</q>, etc. The example at the start of this section  might
then be encoded with  nested
<gi>index</gi> elements:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><p>The students understand procedures for Arabic lemmatisation
    <index>
   <term>lemmatization</term>
   <index>
     <term>arabic</term>
   </index>
</index>
...</p></egXML></p>
<p>The  index entry from Burton's <title>Anatomy of
Melancholy</title> quoted above might be generated in a similar way. 
To generate such an entry, the body of the text might include, at page
193, an <gi>index</gi> element such as
    <egXML xmlns="http://www.tei-c.org/ns/Examples"   source="#COXR-eg-164">
<index>
    <term>Women</term>
    <index>
    <term>how cause of mel.</term>
    </index>
</index>
</egXML>. Similarly,  page 601 of the body text would include
an <gi>index</gi> element like the following:
    <egXML xmlns="http://www.tei-c.org/ns/Examples"><index>
    <term>Women</term>
    <index>
      <term>their vices</term>
    </index>
</index></egXML>
while the <gi>index</gi> element at page 624 would have a structure
like the following:
    <egXML xmlns="http://www.tei-c.org/ns/Examples"><index>
    <term>Women</term>
    <index>
      <term>their vices</term>
      <index>
      <term>commended</term>
      </index>
    </index>
 </index></egXML>       
</p>
<p>When processing such <gi>index</gi> elements, the duplication
required to make the structure explicit will normally be removed, so
as to produce entries like those quoted above. However, this is not
required by the encoding recommended here. </p>

<p>As noted above, either a processing instruction or a <gi>divGen</gi>
element may be used to mark the place at which an index
generated from  <gi>index</gi> elements should be inserted into the
output of a processing program; typically but not necessarily this will be at some point
within the back matter of the document. If the <gi>divGen</gi> element
is used, then the  <att>type</att> attribute
should be used to specify which kind of index is to be generated, and
its value should correspond with that of the
<att>indexName</att> attribute on the relevant <gi>index</gi>
elements. 
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<back>
   <div type="appendix">
      <head>Bibliography</head>
      <listBibl>
         <bibl> ... </bibl>
      </listBibl>
   </div>
   <divGen n="Index Nominum" type="INDEX-NAMES"/>
   <divGen n="Index Loci" type="INDEX-PLACES"/>
</back></egXML>
 </p>
<p>As this example shows, the global
<att>n</att> attribute may also be used to specify a name or
identifier for the
generated index itself in the usual way. Any additional headings
etc. required for the generated index must be specified as content of
the <gi>divGen</gi> element.
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<back>
   <divGen n="A1" type="INDEX-NAMES">
   <head>An Index of Names</head>
   </divGen>
</back>
</egXML>
</p>
<p>If a processing instruction is used, then these parameters for the
generated index may be supplied in some other way.</p>
<p>One final feature frequently found in manually-created indexes to
printed works cannot readily be encoded by the means provided here,
namely cross-references internal to the index term listing. For
example, if all references to the TEI in a text have been indexed
using the index term <term>Text Encoding Initiative</term>, it may
also be helpful to include an entry under the term <term>TEI</term>
containing some text such as <q>see Text Encoding Initiative</q>. Such
internal cross-references must be added as part of the post-editing
phase for an auto-generated index.</p>
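<p>For instance, once a generated index has been converted to a list of the kind suggested in
<ptr target="#CONOIXpre"/>, such a cross-reference could be added by hand. In this sketch the page
numbers and the identifier <ident>IX-TEI</ident> are hypothetical:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<list type="index">
 <!-- entry added during post-editing -->
 <item>TEI, <hi rend="italic">see</hi>
   <ref target="#IX-TEI">Text Encoding Initiative</ref></item>
 <!-- entry produced from the index elements in the text -->
 <item xml:id="IX-TEI">Text Encoding Initiative, <ref>12</ref>, <ref>47</ref></item>
</list>
</egXML>
</p>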
<specGrp xml:id="DCONO" n="Annotation">
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/note.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/index.xml"/>
</specGrp>	
</div>
</div>
</div>
<div type="div2" xml:id="COGR"><head>Graphics and Other Non-textual Components</head>
<p>Graphics, such as illustrations or diagrams, appear in many
different kinds of text, and often with different purposes. Audio or video
clips may also appear. In some
cases, such media form an integral part of a text (indeed, some texts—comic 
books for example—may be almost entirely graphic); in others
the graphic or video may be a kind of optional extra. In some cases, the text
may be incomprehensible unless the media is included; in others, the
presence of the media adds  little to the sense of the
work. It will therefore be a matter of encoding policy as to whether
or how media found in a source text are transferred to a new encoded 
version of the same.  In documents which are <soCalled>born
digital</soCalled>, media such as graphics
and other non-textual components may be particularly salient,
but their inclusion in an archival form of the document concerned
remains an editorial decision.</p>
<p>Considered as structural components, media may be anchored to a particular point in
the text, or they may <term>float</term> either completely freely, or
within some defined scope, such as a chapter or section. Time-based
media such as audio or video may need to be synchronized  with particular
parts of a written text.  Media of all kinds often contain associated
text such as a heading or label.  These Guidelines provide the following
different elements to indicate their appearance within a text:
<specList>
<specDesc key="figure"/>
<specDesc key="media"/>
<specDesc key="graphic"/>
<specDesc key="binaryObject"/>
</specList>
 </p>
<p>Media files may be encoded in a number of different ways:
<list rend="bulleted">
<item>in some non-XML or binary format such as PNG, JPEG, MP3, MP4 etc.</item>
<item>in an XML format such as SVG</item>
<item>in a TEI XML format such as the notation for graphs and trees
described in <ptr target="#GD"/></item>
</list> In the last two cases, the  presence of the graphic
will be indicated by an appropriate XML element, drawn from the SVG
namespace in the second case, and its content will fully define the
graphic to be produced. In the first case, however, one of the elements
<gi>graphic</gi> or <gi>media</gi> is used to mark the presence of the graphic only and the
visual content itself is stored outside the XML document at a location 
referenced by means of an <att>url</att> attribute. This attribute is
provided by membership of these elements in the <ident
type="class">att.resourced</ident> class. Alternatively, if
it is small, the media data may be embedded directly within the document
using some suitable encoding for binary data such as Base64; in this case the
<gi>binaryObject</gi> element may be used to contain it.
</p>
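<p>As a minimal sketch of the embedded alternative, the following shows a
<gi>binaryObject</gi> carrying Base64-encoded data. The payload is a
placeholder intended to stand for a very small GIF image, not data taken
from any real source, and the explicit <att>encoding</att> value is
included only for illustration:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<binaryObject mimeType="image/gif" encoding="base64">
R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7
</binaryObject>
</egXML>
</p>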

<p>The elements <gi>graphic</gi>, <gi>media</gi>, and <gi>binaryObject</gi> are made
available as members of the class <ident
type="class">model.graphicLike</ident> when this module is included in
a schema. These elements are also members of the class <ident
type="class">att.media</ident>, itself a subclass of <ident
type="class">att.internetMedia</ident>, and so inherit the
following attributes: 
<specList>
  <specDesc key="att.internetMedia" atts="mimeType"/>
  <specDesc key="att.media" atts="width height scale"/>
</specList></p>
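<p>By way of illustration, a <gi>graphic</gi> might carry these
attributes in either of the following ways. The file name and the
dimensions are invented for this sketch; whether explicit
<att>width</att> and <att>height</att> values or a <att>scale</att>
factor is more appropriate will depend on how the image is to be
displayed:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<graphic url="plate3.png" mimeType="image/png" width="6cm" height="2cm"/>
<graphic url="plate3.png" mimeType="image/png" scale="0.5"/>
</egXML>
</p>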

<p>For example, the following passage indicates that a copy of the  image
found in the source text may be recovered from the URL
<ident>zigzag2.png</ident> and that this image is in PNG format:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COHQHE-eg-13"><p>These were the four lines I moved in
through my first, second, third, and
fourth volumes.   -- In the fifth volume
I have been very good,   -- the precise
line I have described in it being this:     
<graphic url="zigzag2.png" mimeType="image/png"/>      
By which it appears, that except at the
curve, marked A. where I took a trip
to Navarre, -- and the indented curve B.
which is the short airing when I was
there with the Lady Baussiere and her
page, -- I have not taken the least frisk 	
...</p>	
</egXML>	

</p>	
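<p>A reference to time-based media may be sketched in the same way using
the <gi>media</gi> element. The sentence, the file path, and the
dimensions below are hypothetical:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<p>The interview was also recorded on video:
<media mimeType="video/mp4" url="clips/interview.mp4" width="640px" height="360px"/>
</p>
</egXML>
</p>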
<p>The media elements are phrase-level
elements which may be used
anywhere  that textual content is permitted, within but not between
paragraphs or headings. In the following example, the encoder has
decided to treat a specific printer's ornament as a heading:
 <egXML xml:lang="und" xmlns="http://www.tei-c.org/ns/Examples"><head><graphic url="http://www.iath.virginia.edu/gants/Ornaments/Heads/hp-ral02.gif"/></head>
</egXML></p>
<p>The <gi>figure</gi> element discussed in <ptr target="#FTGRA"/>
provides additional capabilities, for example the ability to combine a
number of images into a hierarchically organized structure or a block
of images. The <gi>figure</gi> element carries a <att>type</att>
attribute, which can be used to distinguish different kinds of graphic
component within a single work, for example, maps as opposed to
illustrations. It also provides the ability to associate an image with
additional information such as a heading or a description.</p>
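<p>A minimal sketch of such an association, assuming the
<gi>figure</gi> element and its children <gi>head</gi> and
<gi>figDesc</gi> as described in <ptr target="#FTGRA"/>, might look like
this. The <att>type</att> value, the heading, the description, and the
URL are all invented for illustration:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<figure type="map">
  <head>The route taken in the fifth volume</head>
  <figDesc>A hand-drawn line with two marked curves, labelled A and B.</figDesc>
  <graphic url="maps/route.png" mimeType="image/png"/>
</figure>
</egXML>
</p>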
<specGrp xml:id="DCOGR" n="Graphic containers">
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/media.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/graphic.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/binaryObject.xml"/></specGrp>



</div>	
<div type="div2" xml:id="CORS"><head>Reference Systems</head>
<p>By <mentioned>reference system</mentioned> we mean the system by which names
or references are associated with particular passages of a text (e.g.
<mentioned>Ps. 23:3</mentioned> for the third verse of Psalm 23 or <mentioned>Amores
2.10.7</mentioned> for Ovid's <title>Amores</title>, book 2, poem 10, line
7).  Such names make it possible to mark a place within a text and
enable other readers to find it again.  A reference system may be based
on structural units (chapters, paragraphs, sentences; stanza and verse),
typographic units (page and line numbers), or divisions created
specifically for reference purposes (chapter and verse in Biblical
texts).  Where one exists, the traditional reference system for a text
should be preserved in an electronic transcript of it, if only to make
it easier to compare electronic and non-electronic versions of the text.
 </p>
<p>Reference systems may be recorded in TEI-encoded texts in any of the
following ways:
<list rend="bulleted">
<item>where a reference system exists, and is based on the same
logical structure as that of the text's markup, the reference for
a passage may be recorded as the value of the global <att>xml:id</att> or
<att>n</att> attribute on an appropriate tag, or may be constructed by
combining attribute values from several levels of tags, as described
below in section <ptr target="#CORS1"/>.
 </item>
<item>where there is no pre-existing reference system (e.g. for
collections and corpora created in electronic form), the global
<att>xml:id</att> or <att>n</att> attributes may be used to construct
one, as described below in section <ptr target="#CORS2"/>.
 </item>
<item>where a reference system exists which is not based on the same
logical structure as that of the text's markup (for example, one
based on the page and line numbers of particular editions of the text
rather than on the structural divisions of it), any of a
variety of methods for encoding the logical structure representing
the reference system may be employed, as described in chapter
<ptr target="#NH"/>.
 </item>
<item>where a reference system exists which does not correspond to any
particular logical structure, or where the logical structure concerned
is of no interest to the encoder except as a means of supporting the
referencing system, then references may be encoded by means of
<gi>milestone</gi> elements, which simply mark points in the text at
which values in the reference system change, as described below in
section <ptr target="#CORS5"/>.
 </item></list>
The specific method used to record traditional or new reference systems
for a text should be declared in the TEI header, as further described in
section <ptr target="#CORS6"/> and in section <ptr target="#SACR"/>.
 </p>
<p>When a text has no pre-existing associated reference system of any
kind, these Guidelines recommend that at least the page
boundaries of the source text be marked using one of the methods
outlined in this section.  Retaining page breaks in the markup is also
recommended for texts which have a detailed reference system of their
own. Line breaks in prose texts may be, but need not be, tagged.<note place="bottom">Many encoders find it convenient to retain the line
breaks of the original during data entry, to simplify proofreading,
but this may be done without inserting a tag for each line break of
the original.</note></p>
<div type="div3" xml:id="CORS1"><head>Using the <att>xml:id</att> and <att>n</att> Attributes</head>
<p>When traditional reference schemes represent a hierarchical
structuring of the text which mirrors that of the marked-up document, the
<att>n</att> attribute defined for all elements may be used to indicate
the traditional identifier of the relevant structural units. The
<att>n</att> attribute may also be used to record the numbering of
sections or list items in the copy text if the copy-text numbering is
important for some reason, for example because the numbers are out of
sequence.</p>
<p>For example, a traditional reference to Ovid's
<title>Amores</title> might be <mentioned>Amores
2.10.7</mentioned>—book 2, poem 10, line 7. Book, poem, and
line are structural units of the work and will therefore be tagged in
any case. (See chapter <ptr target="#VE"/> for a
discussion of structural units in verse collections.) In such cases,
it is convenient to record traditional reference numbers of the
structural units using the <att>n</att> attribute. The relevant tags
for our example would be:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><div1 n="Amores" type="volume">
  <div2 n="1" type="book"><!-- ... --></div2>
  <div2 n="2" type="book">
      <div3 n="1" type="poem"><!-- ... --></div3>
      <div3 n="2" type="poem"><!-- ... --></div3>
      <!-- ... -->
      <div3 n="10" type="poem">
          <l n="1"> ... </l>
          <l n="2"> ... </l>
          <!-- ... -->
          <l n="7"> ... </l>
      </div3>
      <!-- ... -->
  </div2>
  <!-- ... -->
</div1></egXML>
 </p>
<p>One may also place the entire standard reference for each portion of
the text into the appropriate value for the <att>n</att> attribute,
though for obvious reasons this takes more space in the file:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><div1 n="Amores" type="volume">
  <div2 n="Amores 1" type="book"><!-- ... --></div2>
  <div2 n="Amores 2" type="book">
    <div3 n="Amores 2.1" type="poem"><!-- ... --></div3>
    <!-- ... -->
    <div3 n="Amores 2.10" type="poem">
      <!-- ... -->
      <l n="Amores 2.10.7"> ... </l>
      <!-- ... -->
    </div3>
    <!-- ... -->
  </div2>
  <!-- ... -->
</div1></egXML>
 </p>
<p>If the names used by the traditional reference system can be
formulated as identifiers, then the references can be given as values
for the <att>xml:id</att> attribute; this requires that the reference
be given without internal spaces, begin with a letter or underscore,
and contain no characters other than letters, digits, hyphens,
underscores, full stops, and the various combining and extender
characters, as defined by the XML specification.  Unlike values for
the <att>n</att> attribute, values for the <att>xml:id</att> attribute
must be unique throughout the document. Our example then looks like
this: <egXML xmlns="http://www.tei-c.org/ns/Examples"><div1 n="Amores" type="volume">
  <div2 xml:id="am.1" type="book"><!-- ... --></div2>
  <div2 xml:id="am.2" type="book">
    <div3 xml:id="am.2.1" type="poem"><!-- ... --></div3>
    <!-- ... -->
    <div3 xml:id="am.2.10" type="poem">
      <!-- ... -->
      <l xml:id="am.2.10.7"> ... </l>
      <!-- ... -->
    </div3>
    <!-- ... -->
  </div2>
  <!-- ... -->
</div1></egXML>
</p>
<p>To document the usage and to allow automatic processing of these
standard references, it is recommended that the TEI header be used to
declare whether standard references are recorded in the <att>n</att> or
<att>xml:id</att> attributes and which elements may carry standard
references or portions of them. For examples of declarations for the
reference systems just shown, see section <ptr target="#CORS6"/>.
 </p>
<p>Using the <att>n</att> attribute one can specify only a single
standard referencing system, a limitation not without problems, since
some editions may define structural units differently and thus create
alternative reference systems.  For example, another edition of the
<title>Amores</title> considers poem 10 a continuation of poem 9, and
therefore would specify the same line as <mentioned>Amores 2.9.31</mentioned>.
In order to record both of these reference systems  one
could employ any of a variety of methods discussed in chapter <ptr target="#NH"/>.
 </p></div>
<div type="div3" xml:id="CORS2"><head>Creating New Reference Systems</head>
<p>If a text has no canonical reference system of its own, a new custom reference
system may be used.</p>
  
<p>The global attributes <att>n</att> and <att>xml:id</att> may be used to
  assign reference identifiers to segments of the text.  Identifiers
  specified by either attribute apply to the entire element for which they
  are given. Values of <att>xml:id</att> must be unique within a single
  document, and must begin with a letter or underscore.  No such restrictions
  are made on the values of <att>n</att> attributes.
</p>

<p>Determining a referencing system for a TEI encoding depends on many factors 
  that may either be derived from textual structure, or influenced by extra-textual 
  contingencies such as project and file management concerns. It is important,
  therefore, that the attribute used, the elements which can bear standard
  reference identifiers, and the method for constructing standard reference 
  identifiers, should all be declared in the header as described in section
  <ptr target="#HD54"/>.
</p>

<p>The Guidelines do not recommend one specific method for creating new referencing
   systems; however, the rest of this section lists some possibly useful strategies.</p>

  <div type="div4" xml:id="CORS2-1">
    <head>Referencing system derived from markup</head>
    <p>
      A new referencing system may be derived from the structure of the electronic
      text, specifically from the markup of the text. As with any
      reference system intended for long-term use, it is important to see the
      reference as an established, unchanging point in the text. Should the
      text be revised or rearranged, the reference-system identifiers
      associated with any section of text must stay with that section of text, even if
      it means the reference numbers fall out of sequence.  (A new reference
      system may always be created beside the old one if out-of-sequence
      numbers must be avoided.)
    </p>
    <p>A convenient method of mechanically generating unique values for
      <att>xml:id</att> or <att>n</att> attributes based on the structure of
      the document is to construct, for each element, a <term>domain-style
        address</term> comprising a series of components separated by full
      stops, with one component for each level of the document hierarchy.
      Two methods may be used.  In the <term>typed path</term> form of
      identifier, each component in the identifier takes the form of an
      element name, a hyphen, and a number, for example
      <code>p-2</code>. The element name specifies what type of
      element is to be sought, and the number specifies which occurrence of that
      element type is to be selected.  (The hyphen and number may be omitted
      if there is only one element of the given type.)  In the <term>untyped
        path</term> form of identifier, each component consists of a number,
      indicating which element in the sequence of nodes at each level is to be
      selected.  To make the resulting identifier a valid XML identifier, it
      may need to be prefixed with an unchanging alphabetic letter.</p>
    <p>Identifiers generated with these methods should use the <gi>text</gi>
      element as their starting point, rather than the <gi>TEI</gi> or
      <gi>body</gi> elements. The <gi>TEI</gi> element may be taken
      as a starting point only if identifiers need to be generated for the
      <gi>teiHeader</gi>, which is not usually the case;  using the
      <gi>body</gi> element as a root would prevent assignment of identifiers
      for the front and back matter.  The component corresponding to the root
      element can be omitted from identifiers, if no confusion will result.
      In collections and corpora, the component corresponding to the root may
      be replaced by the unique identifier assigned to the text or sample.
    </p>
    <p>In the following example, each element within the <gi>text</gi>
      element has been given a typed-path identifier as its <att>xml:id</att>
      value, and an untyped-path identifier as its <att>n</att> value; the
      latter are prefixed with the string <mentioned>AB</mentioned>, which may be
      imagined to be the general identifier for this text.
      <egXML xmlns="http://www.tei-c.org/ns/Examples"><text xml:id="Text-1" n="AB">
        <front xml:id="Front" n="AB.1">
          <div xml:id="Front.div-1" n="AB.1.1">
            <p> ... </p> 
          </div>
          <titlePage xml:id="Front.titlePage" n="AB.1.2">
            <titlePart> ... </titlePart>       
          </titlePage>
          <div xml:id="Front.div-2" n="AB.1.3">
            <p> ... </p> 
          </div>
        </front>
        <body xml:id="Body" n="AB.2">
          <p xml:id="Body.p-1" n="AB.2.1"> ... </p>
          <p xml:id="Body.p-2" n="AB.2.2"> ... </p>
          <div xml:id="Body.div-1" n="AB.2.3">
            <head xml:id="Body.div-1.head" n="AB.2.3.1"> ... </head>
            <p xml:id="Body.div-1.p-1" n="AB.2.3.2"> ... </p>
            <p xml:id="Body.div-1.p-2" n="AB.2.3.3"> ... </p>
          </div>
          <div xml:id="Body.div-2" n="AB.2.4">
            <head xml:id="Body.div-2.head" n="AB.2.4.1"> ... </head>
            <p xml:id="Body.div-2.p-1" n="AB.2.4.2"> ... </p>
            <p xml:id="Body.div-2.p-2" n="AB.2.4.3"> ... </p>
          </div>
        </body>
      </text></egXML>
      The typed and untyped path methods are convenient, but are in no way
      required for anyone creating a reference system.
    </p>
    <p>If the <att>xml:id</att> attribute is used to record the reference
      identifiers generated, each value should record the entire path.  If the
      <att>n</att> attribute is used, each value may record either the entire
      path or only the subpath from the parent element.  The attribute
      used, the elements which can bear standard reference identifiers, and
      the method for constructing standard reference identifiers, should all
      be declared in the header as described in section <ptr target="#HD54"/>.
    </p>
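    <p>For instance, if only the subpath from the parent is recorded in
      <att>n</att>, part of the preceding example might be sketched as
      follows; a full reference such as <mentioned>AB.2.3.2</mentioned> is
      then assembled by joining the <att>n</att> values of an element and
      its ancestors with full stops. This is one possible arrangement, not
      a recommended one:
      <egXML xmlns="http://www.tei-c.org/ns/Examples"><text n="AB">
        <body n="2">
          <p n="1"> ... </p>
          <p n="2"> ... </p>
          <div n="3">
            <head n="1"> ... </head>
            <p n="2"> ... </p>
            <p n="3"> ... </p>
          </div>
        </body>
      </text></egXML>
    </p>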
  </div>

  <div type="div4" xml:id="CORS2-2">
    <head>Referencing systems based on project conventions</head>
    <p>A reference system may be based on an agreed project-specific convention for <att>xml:id</att> attributes.
      Every convention will have strengths and weaknesses and it is left to  
      encoders to make a decision that enables them to locate information in their TEI document.</p>
    
    <p>Here are some examples of referencing systems that have been used in TEI projects:
      <list>
        <item><label>Title-based identifiers:</label> identifiers constructed with a 
          number of characters from the main document title, followed by an incremental 
          number. E.g. HOL001, HOL002, etc. using a fixed number of digits; or without 
          fixed digits: HOL1, HOL2, etc.</item>
        <item><label>Based on markup, with prefix:</label> identifiers constructed on 
          the markup itself, as described in the previous section. To facilitate uniqueness 
          in a corpus, each identifier may be prefixed with the identifier of the root <gi>TEI</gi> element.
          E.g. RootID-Body-p-1.</item>
        <item><label>Opaque identifiers:</label> computed identifiers using either a 
          randomized algorithm or a universally unique identifier (UUID) algorithm.
          Note that XSLT's <code>generate-id()</code> function only guarantees that 
          identifiers are unique within the document being processed.</item>
      </list>
    </p>
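    <p>As a sketch of the first convention, title-based identifiers with a
      fixed number of digits might be applied as follows. The prefix
      <mentioned>HOL</mentioned> is taken from the list above; the choice of
      <gi>div</gi> as the element bearing the identifiers is an assumption
      made for this illustration:
      <egXML xmlns="http://www.tei-c.org/ns/Examples"><body>
        <div xml:id="HOL001"><p> ... </p></div>
        <div xml:id="HOL002"><p> ... </p></div>
        <div xml:id="HOL003"><p> ... </p></div>
      </body></egXML>
    </p>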
    
    <p>The rules governing <att>xml:id</att> require only that its values be unique within a single document. 
      However, it is also worth keeping in mind that for operating with
      referencing systems across a corpus of TEI files it is helpful (or even necessary in some 
      circumstances) to have unique identifiers across the whole corpus.</p>
    <p>Values of <att>xml:id</att> may be either populated computationally or manually. In the latter
    case, it is advisable to put measures in place to avoid human error. Custom data types and Schematron rules may be 
    defined in a customization ODD, and a check digit may be added to detect unwanted changes.
    <note place="bottom">A check digit is computed from the value of an identifier and appended to the value itself. 
    If the identifier is subsequently altered, the check digit will no longer match, and the change can be detected.</note></p>
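    <p>As a sketch of such a rule, a customization ODD might constrain
      manually entered identifiers with a Schematron assertion like the
      following. The constraint name, the choice of <gi>div</gi>, and the
      pattern (the prefix <mentioned>HOL</mentioned> followed by three
      digits) are assumptions made for this illustration. The sketch also
      assumes that the <code>tei</code> prefix is bound to the TEI namespace
      by the ODD processor, and that the Schematron processor supports the
      XPath 2.0 <code>matches()</code> function:
      <egXML xmlns="http://www.tei-c.org/ns/Examples" xmlns:sch="http://purl.oclc.org/dsdl/schematron"><elementSpec ident="div" mode="change">
        <constraintSpec ident="div-id-pattern" scheme="schematron">
          <constraint>
            <sch:rule context="tei:div[@xml:id]">
              <sch:assert test="matches(@xml:id, '^HOL[0-9]{3}$')">The identifier of a
              division should consist of HOL followed by exactly three digits.</sch:assert>
            </sch:rule>
          </constraint>
        </constraintSpec>
      </elementSpec></egXML>
    </p>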
  </div>

</div>
<div type="div3" xml:id="CORS5"><head>Milestone
Elements</head><p>Where the desired reference system does not
correspond to any particular structural hierarchy, or the document
combines multiple structural hierarchies (as further discussed in <ptr target="#NH"/>), simpler though less expressive methods may be
necessary. In such cases the simplest solution may be just to mark up
changes in the reference system where they occur, by using one or more
of the following <term>milestone</term> elements: <specList><specDesc
key="milestone"/><specDesc key="gb"/><specDesc key="pb"/><specDesc key="lb"/><specDesc key="cb"/></specList>
 </p>
<p>These elements simply mark the points in a text at which some
category in a reference system changes.  They have no content but
subdivide the text into regions, rather in the same way as milestones
mark points along a road, thus implicitly dividing it into segments.
The elements <gi>gb</gi>, <gi>pb</gi>, <gi>cb</gi>, and <gi>lb</gi> are specialized
types of milestone, marking gathering, page, column, and line
boundaries respectively.  The
global <att>n</att> attribute is used in each case to provide a value
for the particular unit associated with this milestone (for example,
the page or line number).  Because a reference system based on
<gi>milestone</gi>s is not structural, it cannot readily be validated by an
XML parser; it is therefore the responsibility of the encoder or the
application software to ensure that the milestones are given in the correct
order.</p>
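<p>The specialized milestones may be combined. In the following sketch,
which uses invented page, column, and line numbers, a new page and its
first column begin within a paragraph:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<p> ... end of the preceding page
<pb n="42"/><cb n="a"/>
<lb n="1"/>first line of the new column
<lb n="2"/>second line of the new column ... </p>
</egXML>
</p>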
<p>Milestone elements are often used as a simple means of capturing
the original appearance of an early printed text, which will rarely
coincide exactly with structural units, but they are generally useful
wherever a text has two or more competing
structures. For example, many English novels were first published as
serial works, individual parts of which do not always contain a whole
number of chapters. An encoder might decide to represent the
chapter-based structure using <gi>div1</gi> elements, with
<gi>milestone</gi> elements to mark the points at which individual
parts end; or the reverse. Thus, an encoding in which chapters are
regarded as more important than parts might encode some work in which
chapter three begins in part one and is concluded in part two as
follows: <egXML xmlns="http://www.tei-c.org/ns/Examples"><text><body>
    <milestone unit="part" n="1"/>
    <div1 n="1" type="chapter">
      <p> <!-- ... --> </p>
    </div1>
    <div1 n="2" type="chapter">
      <p> <!-- ... --> </p>
    </div1>
    <div1 n="3" type="chapter">
      <p> <!-- ... --> </p>
       <milestone unit="part" n="2"/>
      <p> <!-- ... --> </p>
    </div1>

  </body></text></egXML>
An encoding of the same work in which parts are regarded as more
important than chapters might begin as follows:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><text>
  <body>
    <div1 n="1" type="part">
      <milestone unit="chapter" n="1"/>
      <p> <!-- ... --> </p>
        <milestone unit="chapter" n="2"/>
      <p> <!-- ... --> </p>
        <milestone unit="chapter" n="3"/>
      <p> <!-- ... --> </p>
    </div1>
    <div1 n="2" type="part">
      <p> <!-- ... --> </p>
        <milestone unit="chapter" n="4"/>
      <p> <!-- ... --> </p>
    </div1>
  </body>
</text></egXML>
 </p>
 <p>Similarly, when tagging dramatic verse one may wish to privilege stanzas
and lines over speeches and speakers, particularly where speeches cross line
and line group boundaries. One might also wish to mark changes in
narrative voice in a prose text. In either case, a milestone tag may be used to
indicate change of speaker:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#CORS5-eg-01"><lg>
  <milestone unit="speaker" n="Man"/><l>Oh what is this I cannot see</l>
  <l>With icy hands gets a hold on me</l>
  <milestone unit="speaker" n="Death"/><l>Oh I am Death, none can excel</l>
  <l>I open the doors of heaven and hell</l>
</lg></egXML>
 </p><p>Milestone tags also make it possible to record the reference systems
used in a number of different editions of the same work. The reference
system of any one edition can be recreated from a text in which all are
marked by simply ignoring all elements that do not specify that edition
on their <att>ed</att> attribute.
 </p>
<p>As a simple example, assuming that edition E1 of some collection of
poems regards the first two poems as constituting the first book, while
edition E2 regards the first poem as prefatory, a markup scheme like
the following might be adopted:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><milestone ed="E1" unit="work"/>
<milestone ed="E2" unit="work"/>
<milestone ed="E1" unit="book"/>
<milestone ed="E1" unit="poem"/>
<milestone ed="E2" unit="poem"/>
<milestone ed="E2" unit="book"/>
<milestone ed="E1" unit="poem"/>
<milestone ed="E2" unit="poem"/>
  </egXML>
 </p>
<p>In this case no <att>n</att> value is specified, since the numbers
rise predictably and the application can keep a count from the start of
the document, if desired.
 </p>
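<p>The same approach may be sketched for the specialized milestone
elements, on the assumption that they carry the <att>ed</att> attribute in
the same way as <gi>milestone</gi>. Here the page breaks of the two
hypothetical editions E1 and E2 are recorded side by side, with invented
page numbers:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<p> ... <pb ed="E1" n="17"/> ... <pb ed="E2" n="21"/> ... </p>
</egXML>
</p>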
<p>The value of the <att>n</att> attribute may but need not include the
identifiers used for any larger sections.  That is, either of the
following styles is legitimate:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><milestone ed="E1" unit="work" n="Amores"/>
<milestone ed="E1" unit="book" n="1"/>
<milestone ed="E1" unit="poem" n="1"/>
<milestone ed="E1" unit="poem" n="2"/>
<milestone ed="E1" unit="book" n="2"/></egXML>
or
<egXML xmlns="http://www.tei-c.org/ns/Examples"><milestone ed="E1" unit="work" n="Amores"/>
<milestone ed="E1" unit="book" n="1"/>
<milestone ed="E1" unit="poem" n="1.1"/>
<milestone ed="E1" unit="poem" n="1.2"/>
<milestone ed="E1" unit="book" n="2"/></egXML>
 </p>
<p>When using <gi>milestone</gi> tags, line numbers may be supplied for
every line or only periodically (every fifth, every tenth line).  The
latter may be simpler; the former is more reliable.
 </p>
<p>The style of numbering used in the values of <att>n</att> is
unrestricted: for the example above, <val>I.i</val>, <val>I.ii</val>,
and <val>I.iii</val> could have been used equally well if preferred.
The special value <val>unnumbered</val> should be reserved for marking
sections of text which fall outside the normal numbering system
(e.g. chapter heads, poem numbers, titles, or speaker attributions in
a verse drama).
 </p>
<p>By default,  there are no constraints on the values supplied for
the <att>ed</att> attribute. If it is felt
appropriate to enforce such a restriction, the techniques described in
<ptr target="#MD"/> may be used, for example to require that the
attribute take one of a predefined set of values. 
 </p>
<p>See below, section <ptr target="#CORS6"/>, for examples of
declarations for the reference systems just shown.
 </p>
<p>Milestone elements may be used to mark any kind of shift in the
properties associated with a piece of text, whether or not this would
normally be considered a reference system. For example, they may be
used to mark changes in narrative voice in a prose text, or 
changes of speaker in a dramatic text, where these are not marked
using structural elements such as <gi>sp</gi>, perhaps in order to
avoid a clash of hierarchies.</p>
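<p>A change of narrative voice in prose might be sketched as follows. The
value <val>narrator</val> for <att>unit</att>, the names given in
<att>n</att>, and the text itself are inventions of this sketch rather
than values defined by these Guidelines:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<p><milestone unit="narrator" n="outer"/>The traveller set down his glass
and began his tale. <milestone unit="narrator" n="traveller"/>I left the
harbour at dawn, with no clear idea of where the road would take me ... </p>
</egXML>
</p>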
<!-- example to be supplied by DS -->
<p>As noted in <ptr target="#COPU-2"/> above, milestone elements such
as <gi>lb</gi> or <gi>pb</gi> represent whitespace and are therefore
by default assumed to occur between orthographic tokens in the text, where
these are not otherwise indicated.  By default it is reasonable to assume that
words are not broken across page or line boundaries, and that
therefore a sequence such as 
<egXML xmlns="http://www.tei-c.org/ns/Examples">
...sed imp<lb/>erator dixit...
</egXML>
should be tokenized as four words (<mentioned>sed</mentioned>,
<mentioned>imp</mentioned>, <mentioned>erator</mentioned>, and
<mentioned>dixit</mentioned>). The <att>break</att> attribute is
provided to change the default assumption. To make explicit that
<mentioned>imperator</mentioned> in the above example should be
treated as a single word, a tagging such as the following is recommended:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
...sed imp<lb break="no"/>erator dixit...
</egXML>
Where hyphenation appears before a line or page break, the encoder may
or may not choose to record the fact, either explicitly using an
appropriate Unicode character, or descriptively for example by means
of the <att>rend</att> attribute; see further <ptr target="#COPU-2"/>.</p>
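<p>For example, a line-end hyphen present in the source might be recorded
in either of the following ways. Both are sketches: the first keeps the
hyphen character in the transcription, while the second describes it by
means of the <att>rend</att> value <val>hyphen</val>, which is a
hypothetical project convention rather than a value defined here:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
...sed imp-<lb break="no"/>erator dixit...
</egXML>
<egXML xmlns="http://www.tei-c.org/ns/Examples">
...sed imp<lb break="no" rend="hyphen"/>erator dixit...
</egXML>
</p>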
<specGrp xml:id="DCORSM" n="Milestone tags">
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/milestone.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/gb.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/pb.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/lb.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/cb.xml"/>
</specGrp>
</div>
<div type="div3" xml:id="CORS6"><head>Declaring Reference Systems</head>
<p>Whatever kind of reference system is used in an electronic text, it
is recommended that the TEI header contain a description of its
construction in the <gi>refsDecl</gi> element described in section
<ptr target="#HD54"/>. As described there, the declaration
may consist either of a formal declaration using the
<gi>cRefPattern</gi> element or an informal description in prose.  The
former is recommended because unlike prose it can be processed by
software.</p>
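<p>An informal declaration of the kind just mentioned might be sketched
as follows; the wording of the prose is invented for this illustration:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<refsDecl>
  <p>References consist of the title of the work, a space, and the
  numbers of the book, poem, and line, separated by full stops. Each
  component is recorded as the value of the <att>n</att> attribute on
  the corresponding <gi>div1</gi>, <gi>div2</gi>, <gi>div3</gi>, or
  <gi>l</gi> element.</p>
</refsDecl>
</egXML>
</p>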

<p>The three examples given in section <ptr target="#CORS1"/> would be declared as follows. The first example encodes
the standard references for Ovid's <title>Amores</title> one level at
a time, using the <att>n</att> attribute on the <gi>div1</gi>,
<gi>div2</gi>, <gi>div3</gi>, and <gi>l</gi> tags. The header section for such
an encoding should look something like this: <egXML xmlns="http://www.tei-c.org/ns/Examples">
    <encodingDesc>
         <refsDecl>
           <cRefPattern matchPattern="([^ ]+) ([0-9]+)\.([0-9]+)\.([0-9]+)" replacementPattern="#xpath(//div1[@n='$1']/div2[@n='$2']/div3[@n='$3']/l[@n='$4'])">
	     <p>A canonical reference is assembled with
	     <list>
	       <item>the name of the <label>work</label>: the
	       <att>n</att> of a <gi>div1</gi>,</item>
	       <item>a space,</item>
	       <item>the number of the <label>book</label>: the
	       <att>n</att> of a child <gi>div2</gi>,</item>
	       <item>a full stop,</item>
	       <item>the number of the <label>poem</label>: the
	       <att>n</att> of a child <gi>div3</gi>,</item>
	       <item>a full stop,</item>
	       <item>the line number: the <att>n</att> value of a
	       child <gi>l</gi></item>
	     </list>
	     </p>
	   </cRefPattern>
           <cRefPattern matchPattern="([^ ]+) ([0-9]+)\.([0-9]+)" replacementPattern="#xpath(//div1[@n='$1']/div2[@n='$2']/div3[@n='$3'])">
	     <p>Same as above, but without the last component (full
	     stop followed by the <gi>l</gi>'s <att>n</att>).</p>
	   </cRefPattern>
           <cRefPattern matchPattern="([^ ]+) ([0-9]+)" replacementPattern="#xpath(//div1[@n='$1']/div2[@n='$2'])">
	     <p>Same as above, but without the poem component (full
	     stop followed by the <gi>div3</gi>'s <att>n</att>).</p>
	   </cRefPattern>
         </refsDecl>         
    </encodingDesc>
</egXML>
 </p>
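<p>As a hedged illustration of how such a declaration might be used,
the following sketch supplies a canonical reference on the
<att>cRef</att> attribute of a <gi>ref</gi> element. The value
<val>Amores</val> for the <att>n</att> attribute of the
<gi>div1</gi>, and the particular book, poem, and line numbers, are
assumptions made only for the sake of the example:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
  <!-- hypothetical reference: work "Amores", book 1, poem 2, line 3 -->
  <ref cRef="Amores 1.2.3">Am. 1.2.3</ref>
  <!-- hypothetical reference to a whole poem: no line component -->
  <ref cRef="Amores 1.2">Am. 1.2</ref>
</egXML>
Under this assumption, the first string matches the first pattern and
yields the four subgroups <val>Amores</val>, <val>1</val>,
<val>2</val>, and <val>3</val>, so that a processor would be expected
to look for the <gi>l</gi> numbered 3 within the <gi>div3</gi>
numbered 2, within the <gi>div2</gi> numbered 1, within the
<gi>div1</gi> named <val>Amores</val>. The second string lacks a line
component, so it should fail to match the first pattern and fall
through to the second, which addresses the <gi>div3</gi> as a
whole.</p>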
<p>The second example encodes the same reference system, again using
the <att>n</att> attribute on the <gi>div1</gi>, <gi>div2</gi>,
<gi>div3</gi>, and <gi>l</gi> tags, but giving the reference string in
full on each tag. If canonical references are made only to lines, the
reference system could be declared as follows: <egXML xml:lang="und" xmlns="http://www.tei-c.org/ns/Examples"><refsDecl>
  <cRefPattern matchPattern="([^ ]+ [0-9]+\.[0-9]+\.[0-9]+)" replacementPattern="#xpath(//l[@n='$1'])"/>
</refsDecl></egXML>
Since the entire regular expression is enclosed as a parenthetical
subgroup, the entire canonical reference string is sought as the value
of the <att>n</att> attribute on an <gi>l</gi> element.</p>
<p>In order to handle references to poems as well as to individual
lines, the declaration for the reference system must be more
complicated:
 <egXML xml:lang="und" xmlns="http://www.tei-c.org/ns/Examples"><refsDecl>
  <cRefPattern matchPattern="([^ ]+ [0-9]+\.[0-9]+\.[0-9]+)" replacementPattern="#xpath(//l[@n='$1'])"/>
  <cRefPattern matchPattern="([^ ]+ [0-9]+\.[0-9]+)" replacementPattern="#xpath(//div3[@n='$1'])"/>
</refsDecl></egXML>
This declaration indicates that the entire reference string must be
sought as the value of the <att>n</att> attribute on an <gi>l</gi>
element when it includes a line number, and on a <gi>div3</gi> element
when it identifies a poem. References to whole books or works would
need further patterns addressing <gi>div2</gi> and <gi>div1</gi> in
the same way.
 </p>
<p>The third example encodes the same reference system, this time
giving the entire reference string as the value of the
<att>xml:id</att> attribute on the relevant tags. The reference system
declaration for such an encoding could be:
 <egXML xml:lang="und" xmlns="http://www.tei-c.org/ns/Examples"><refsDecl>
  <cRefPattern matchPattern="(.*)" replacementPattern="#$1"/>
</refsDecl></egXML>
although in general there seems to be little advantage in this case:
it is no more difficult to use a standard relative URI reference as
the value of <att>target</att>.</p>
<p>Reference systems recorded by means of milestone tags can also be
declared; the following prose description could be used to declare
the example given in section <ptr target="#CORS5"/>.
<egXML xmlns="http://www.tei-c.org/ns/Examples"><refsDecl>
   <p>Standard references to work, book, poem, and line may be
     constructed from the milestone tags in the text.</p>
</refsDecl></egXML>
Alternatively, the reference scheme derived from edition
<val>E1</val> could be declared formally, as follows:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><refsDecl>
   <refState ed="E1" unit="work" delim=" "/>
   <refState ed="E1" unit="book" delim="."/>
   <refState ed="E1" unit="poem" delim="."/>
   <refState ed="E1" unit="line"/>
</refsDecl></egXML>
 </p>
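<p>A minimal sketch of the kind of milestone tagging to which such a
declaration might apply is given below. The <att>n</att> values are
assumptions made for illustration, and the fragment is not a
reproduction of the example in section <ptr target="#CORS5"/>:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
  <!-- one milestone per unit named in the refState elements -->
  <milestone ed="E1" unit="work" n="Amores"/>
  <milestone ed="E1" unit="book" n="1"/>
  <milestone ed="E1" unit="poem" n="1"/>
  <milestone ed="E1" unit="line" n="1"/>
  <l>Arma gravi numero violentaque bella parabam</l>
</egXML>
Each <gi>refState</gi> names one unit of the reference. A processor
could assemble the reference for any point in the text by taking, for
each unit in the order declared, the <att>n</att> value of the most
recent matching <gi>milestone</gi>, and appending the delimiter given
by the corresponding <att>delim</att> attribute.</p>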
</div></div>
<div type="div2" xml:id="COBI"><head>Bibliographic Citations and References</head>
<p>Bibliographic references (that is, full descriptions of bibliographic
items such as books, articles, films, broadcasts, songs, etc.) or
pointers to them may appear at various places in a TEI text.  They are
required at several points within the TEI header's source description,
as discussed in section <ptr target="#HD3"/>; they may also appear within
the body of a text, either singly (for example within a footnote), or
collected together in a list as a distinct part of a text; detailed
bibliographic descriptions of manuscript or other source materials may
also be required. These Guidelines propose a number of specialized
elements to encode such descriptions, which together constitute the <ident type="class">model.biblLike</ident> class.  
<specList>
<specDesc key="model.biblLike"/>
</specList>
Lists of such elements may also be encoded using the following element:
<specList>
<specDesc key="listBibl"/></specList>
</p>

<p>In printed texts, the individual constituents of a bibliographic
reference are conventionally marked off from each other and from the
flow of text by such features as bracketing, italics, special
punctuation conventions, underlining, etc.  In electronic texts, such
distinctions are also important, whether in order to produce
acceptably formatted output or to facilitate intelligent retrieval
processing,<note place="bottom">For example, to distinguish
<mentioned>London</mentioned> as an author's name from
<mentioned>London</mentioned> as a place of publication or as a
component of a title.</note> quite apart from the need to distinguish
the reference itself as a textual object with particular linguistic
properties.
 </p>

<p>It should be emphasized that for references as for other textual
features, the primary or sole consideration is not how the text should
be formatted when it is printed.  The distinctions permitted by the
scheme outlined here may not necessarily be all that particular
formatters or bibliographic styles require, although they should prove
adequate to the needs of many such commonly used software
systems.<note place="bottom">Among the bibliographic software systems
and subsystems consulted in the design of the <gi>biblStruct</gi>
structure were BibTeX, Scribe, and ProCite.  The distinctions made by
all three may be preserved in <gi>biblStruct</gi> structures, though
the nature of their design prevents a simple one-to-one mapping from
their data elements to TEI elements.  For further information, see
section <ptr target="#COBIOT"/>.</note> The features distinguished and
described below (in section <ptr target="#COBICO"/>) constitute a set
which has been useful for a wide range of bibliographic purposes and
in many applications, and which moreover corresponds to a great extent
with existing bibliographic and library cataloguing practice.  For a
fuller account of that practice as applied to electronic texts see
section <ptr target="#HD3"/>; for a brief mention of related library
standards see section <ptr target="#HD8"/>.
 </p>

<p>The most commonly used elements in the <ident
type="class">model.biblLike</ident> class are <gi>biblStruct</gi> and
<gi>bibl</gi>. <gi>biblStruct</gi> will usually be easier to process
mechanically than <gi>bibl</gi> because its structure is more
constrained and predictable. It is suited to situations in which the
objective is to represent bibliographic information for machine
processing directly by other systems or after conversion to some other
bibliographic markup format such as BibTeXML or MODS. Punctuation
delimiting the components of a print citation is not permitted
directly within a <gi>biblStruct</gi> element; instead, the presence
and order of child elements must be used to reconstruct the
punctuation required by a particular style.
</p>

<p>By contrast, <gi>bibl</gi> allows for considerable flexibility in
that it can include both delimiting punctuation and unmarked-up text;
and its constituents can also be ordered in any
way. This makes it suitable for marking up bibliographies in existing
documents, where it is considered important to preserve the form of references
in the original document, while also  distinguishing 
important pieces of information such as authors, dates, publishers, and so
on. <gi>bibl</gi> may also be useful when encoding <soCalled>born digital</soCalled>
documents which require use of a specific style
guide when rendering the content;
its flexibility makes it easier to provide all the information for a reference in the
exact sequence required by the target rendering, including any
necessary punctuation and linking words, rather than using an XSLT
stylesheet or similar to reorder and punctuate the data.
</p>
<p>
The third element in the <ident type="class">model.biblLike</ident>
class, <gi>biblFull</gi>, has a content model based on the
<gi>fileDesc</gi> element of the
TEI header. Both are based on the International Standard for Bibliographic
Description (ISBD), which forms the basis of several national standards for bibliographic
citations. The order of child elements in both
<gi>biblFull</gi> and <gi>fileDesc</gi>  corresponds to the order
of bibliographic description <soCalled>areas</soCalled> in ISBD with two
minor exceptions. First, the <gi>extent</gi> element, corresponding to the <term>physical
description area</term> in ISBD, appears just after the <term>publication,
production, distribution, etc. area</term> in ISBD, not before it as in
TEI. Second, <gi>biblFull</gi> and <gi>fileDesc</gi> use the child
element <gi>publicationStmt</gi> to cover not only the <term>publication,
production, distribution, etc. area</term> but also the <term>resource identifier
and terms of availability area</term> associated with that publication.
Despite these inconsistencies, users
encoding citations and attempting to format them according to a
standard that closely adheres to ISBD may find that <gi>biblFull</gi>,
used with its child elements and without delimiting punctuation,
provides an appropriate granularity of encoding with elements that can
easily be rendered for the reader. However, it is important to note that some
ISBD-derived citation formats (such as ANSI/NISO Z39.29 and 
ГОСТ 7.1) are not entirely conformant to ISBD either, since they may begin with a statement of authorship that does not map to
the ISBD statement of responsibility.  
</p>



<div type="div3" xml:id="COBITY"><head>Methods of Encoding Bibliographic References and Lists of References</head>
<p>
The members of the <ident type="class">model.biblLike</ident> class
all share a number of possible component sub-elements.  For the
<gi>bibl</gi> and <gi>biblStruct</gi> elements, exactly the same
sub-elements are concerned, and they are described together in section
<ptr target="#COBICO"/>; for the <gi>biblFull</gi> element, the
sub-elements concerned are fully described in section <ptr target="#HD2"/>.
 </p>
<p>Different levels of specific tagging may be appropriate in different
situations.  In some cases, it may be felt necessary to mark just the
extent of the reference itself, with perhaps a few distinctions being
made within it (for example, between the part of the reference which
identifies a title or author and the rest).  Such references, containing
a mixture of text with specialized bibliographic elements, are regarded
as <gi>bibl</gi> elements, and tagged accordingly.  For example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><p>A book which had a great influence on him
was <bibl>Tufte's <title>Envisioning
Information</title></bibl>, although he may
never have actually read it.</p></egXML>
Indeed, some encoders may find it unnecessary to mark the bibliographic
reference at all:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><p>A book which had a great influence on him
was Tufte's <title>Envisioning Information</title>,
although he may never have actually read it.</p></egXML>
 </p>
<p>Some bibliographic references are extremely elliptical, often only a
string of the form <mentioned>Baxter, 1983</mentioned>.  If no further details of
Baxter's book are given in the source text and none is supplied by the
encoder, then the reference thus given should be tagged as a
<gi>bibl</gi>:
<egXML xmlns="http://www.tei-c.org/ns/Examples">All of this is of course much more fully treated
in <bibl>Baxter, 1983</bibl>.</egXML>
In general, however, normal modern bibliographic practice, and these
Guidelines, distinguish between a bibliographic <term rend="noindex">reference</term>,<index><term>references</term><index><term>bibliographic</term></index></index><index><term>bibliographic references</term></index>
which is a self-sufficient description of a bibliographic item, and a
bibliographic <term rend="noindex">pointer</term>,<index><term>bibliographic pointers</term></index><index><term>pointers</term><index><term>bibliographic</term></index></index>
which is a short-form citation (e.g. <mentioned>Baxter,
1983</mentioned>) which serves usually as a place-holder or pointer to
a full long-form reference found elsewhere in the text.  The usual
encoding of short-form references such as <mentioned>Baxter,
1983</mentioned> is not as <gi>bibl</gi> elements but as
cross-references to such elements; see section <ptr target="#COBIXR"/>
below.  </p> 
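<p>As a brief sketch of this distinction, the short-form citation may
be encoded as a cross-reference whose target is the full reference
given elsewhere. The identifier <val>baxter1983</val> is hypothetical,
and the full reference is deliberately left skeletal, since no further
details of the work are given here:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
  <p>All of this is of course much more fully treated
  in <ref target="#baxter1983">Baxter, 1983</ref>.</p>
  <!-- elsewhere in the document, for example in a bibliography -->
  <listBibl>
    <bibl xml:id="baxter1983"><author>Baxter</author>,
    <date when="1983">1983</date>.</bibl>
  </listBibl>
</egXML>
The pointer carries no bibliographic detail of its own: everything a
processor needs is reached by following the <att>target</att>
attribute to the <gi>bibl</gi> element.</p>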
<p>In cases where the encoder wishes to impose more structure on the
bibliographic information, for example to make sure it conforms to a
particular stylesheet or retrieval processor, the <gi>biblStruct</gi>
element should be used.  Note that several of the features in this and
later examples are explained later in the current section.
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COBITY-eg-240"><biblStruct>
   <monogr>
      <author>
        <persName>
          <forename>Edward</forename>
          <forename full="init">R.</forename>
          <surname>Tufte</surname>
        </persName>
        <idno type="scopus">6506403994</idno>
        <idno type="lcaf">http://id.loc.gov/authorities/names/n50012763.html</idno>
      </author>
      <title level="m">Envisioning Information</title>
      <imprint>
         <pubPlace>Cheshire, Conn.</pubPlace>
         <publisher>Graphics Press</publisher>
         <date when="1990"/>
      </imprint>
   </monogr>
</biblStruct></egXML>
 </p>
<p>A more complex and detailed bibliographic structure is provided by the
<gi>biblFull</gi> element defined in the TEI header module. This
element is provided as a means of embedding the file description of
one existing digital text within that of another (see further section
<ptr target="#HD2"/>); however, its use is not confined to digital
texts, and it may be used in the same way as any other bibliographic
element, as in this example:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COBITY-eg-240"><biblFull>
   <titleStmt>
      <title>Envisioning Information</title>
      <author>Tufte, Edward R[olf]</author>
   </titleStmt>
   <extent>126 pp.</extent>
   <publicationStmt>
      <publisher>Graphics Press</publisher>
      <pubPlace>Cheshire, Conn. USA</pubPlace>
      <date>1990</date>
   </publicationStmt>
</biblFull></egXML>
 </p>
<p>A list of bibliographic items, of whatever kind, may be treated in
the same way as any other list (see section <ptr target="#COLI"/>).
Alternatively, the specialized <gi>listBibl</gi> element may be used.
The difference between the two is that a <gi>list</gi> contains
<gi>item</gi> elements, within which bibliographic elements (<gi>bibl</gi>,
<gi>biblStruct</gi>, or <gi>biblFull</gi>) may appear, as well as other
phrase- and paragraph-level elements, whereas the <gi>listBibl</gi> may
contain only bibliographic elements, optionally preceded by a heading and a
series of introductory paragraphs.  Good practice would usually
require that a <gi>listBibl</gi> contain only one kind of bibliographic
element, though the following example combines both fully structured
<gi>biblStruct</gi> and informal <gi>bibl</gi> elements:

<egXML xmlns="http://www.tei-c.org/ns/Examples" xml:space='preserve'><listBibl>
   <head>Bibliography</head>
   <biblStruct xml:id="NELSON80">
      <analytic>
        <author>
          <persName>
            <surname>Nelson</surname>
            <forename>Theodore</forename>
            <forename>Holm</forename>
          </persName>
        </author>
        <title level="a">Replacing the printed word:
             a complete literary system</title>
      </analytic>
      <monogr>
         <title level="m">Information Processing '80:  Proceedings of the IFIPS
             Congress, October 1980</title>
         <editor>
           <persName>
             <forename>Simon</forename>
             <forename>H.</forename>
             <surname>Lavington</surname>
           </persName>
         </editor>
         <imprint>
            <publisher>North-Holland</publisher>
            <pubPlace>Amsterdam</pubPlace>
            <date when="1980"/>
         </imprint>
         <biblScope unit="pp" from="1013"
		    to="1023">1013–23</biblScope>
      </monogr>
      <note>Apparently a draft of section 4 of 
          <title level="m">Literary Machines</title>.</note>
   </biblStruct>   

  <bibl xml:id="NELSON88"><author><persName><forename>Ted</forename>
<surname>Nelson</surname></persName></author>: 
<title level="u">Literary Machines</title> (privately published, 
<date when="1987">1987</date>).</bibl>
  
  <bibl xml:id="BAXTER88"><author><persName><surname>Baxter</surname>, 
<forename>Glen</forename></persName></author>:
<title level="m">Glen Baxter His Life: the years of struggle</title> 
<pubPlace>London</pubPlace>: <publisher>Thames and Hudson</publisher>, 
<date when="1988">1988</date>.</bibl>
</listBibl></egXML>
  
  This example also demonstrates the way that bibliographical markup of 
  authors, titles, dates etc. can be handled differently in 
  <gi>biblStruct</gi>s and <gi>bibl</gi>s. In the two <gi>bibl</gi> 
  items, the key information is marked up, but it is presented in an 
  order which makes it suitable for direct rendering, with the punctuation
  included.</p>

<p>The <gi>listBibl</gi> element is most appropriate 
for  a more formal bibliography. The same <gi>bibl</gi> or
<gi>biblStruct</gi> elements may however be embedded within an
ordinary list, thus allowing them to be mixed with running prose or
presented informally, as in the following version of the same example:


<egXML xmlns="http://www.tei-c.org/ns/Examples"><list>
   <head>Bibliography</head>
   <item>
      <bibl xml:id="NEL80">        
         <author>Nelson, T. H.</author>
         <title level="a">Replacing the printed word:
          a complete literary system</title>.
         <title level="m">Information Processing '80:
          Proceedings of the IFIPS Congress, October 1980</title>.
         <editor>Simon H. Lavington</editor>
         <publisher>North-Holland</publisher>:
         <pubPlace>Amsterdam</pubPlace>,
         <date>1980</date>.
         <biblScope>pp 1013–23
     </biblScope>
         <note>Apparently a draft of section 4 of 
      <title>Literary Machines</title>.</note>
      </bibl>
   </item>
   <item>
      <bibl xml:id="NEL88">Ted Nelson: <title>Literary Machines</title>
       (privately published, 1987)</bibl>
   </item>
   <item>
      <bibl xml:id="BAX88">    
         <author>Baxter, Glen</author>
         <title>Glen Baxter His Life: the years of struggle</title>
          London: Thames and Hudson, 1988.
      </bibl>
   </item>
</list></egXML>
 </p>
</div>
<div type="div3" xml:id="COBICO"><head>Components of Bibliographic References</head>

<p>This section discusses commonly occurring components of
bibliographic references and elements used for encoding them.  They fall
into four groups:
<list rend="bulleted">
<item>elements for grouping components of the <term>analytic</term>,
<term>monographic</term>, and <term>series</term> levels in a
structured bibliographic reference</item>
<item>titles of various kinds, and statements of intellectual
responsibility (authorship, etc.)</item>
<item>information relating to the publication, pagination, etc. of an
item (most of these
constitute the default members of the <ident type="class">model.biblPart</ident> class) </item>
<item>annotation, commentary, and further detail</item></list> The
following sections describe the elements which may be used to
represent such information within a <gi>bibl</gi> or
<gi>biblStruct</gi> element.  Within the former, elements from the
<ident type="class">model.biblPart</ident> class, other phrase-level
elements, and plain text may be combined without other constraint;
within the latter, such of these elements as exist for a given
reference must be distinguished, and must also be presented in a
specific order, discussed further below (section <ptr
target="#COBICOO"/>).
 </p>
<div type="div4" xml:id="COBICOL"><head>Analytic, Monographic, and Series Levels</head>
<p>In common library practice a clear distinction is made between an
individual item within a larger collection and a free-standing book,
journal, or collection.  Similarly a book in a series is distinguished
sharply from the series within which it appears.  An article forming
part of a collection which itself appears in a series thus has a
bibliographic description with three quite distinct levels of
information:
<list rend="numbered">
<item>the <term rend="noindex">analytic</term> level, giving the title, author, etc., of the article;
 </item>
<item>the <term rend="noindex">monographic</term>
level, giving the title, editor, etc., of the collection;
 </item>
<item>the <term rend="noindex">series</term>
level, giving the title of the series, possibly the names of its
editors, etc., and the number of the volume within that series.
 </item></list>
In the same way, an article in a journal requires at least two levels of
information:  the analytic level describing the article itself, and the
monographic level describing the journal.
</p>
<p>A different identifying number may be supplied for any of these
three items, that is, for the analytic item, the monographic item, or
the series. </p>
  <p>Within <gi>bibl</gi>, these three levels may be distinguished simply by the use
    of the <att>level</att> attribute on <gi>title</gi>. They may also be distinguished through
    the practice of employing nested <gi>bibl</gi> elements. In this example, for
    instance, the monograph-level component of the reference is encapsulated in
    its own <gi>bibl</gi> within the main <gi>bibl</gi> for the article:
    <egXML xmlns="http://www.tei-c.org/ns/Examples">
      <bibl type="article" subtype="magazine_article" xml:id="beaupaire_1911">
        <author><name><surname>Beaupaire</surname>
          (<forename>Edmond</forename>)</name></author>,
        <title level="a">A propos de la rue de la Femme-sans-Tête</title>,
        <bibl type="monogr">
          <title level="j">La Cité</title>,
          <date when="1911-01">janvier 1911</date>, pp. <biblScope
            unit="pp" from="5" to="17">5-17</biblScope>.
        </bibl>
      </bibl>
      </egXML>
  </p>

  <p>Within <gi>biblStruct</gi>, the levels are distinguished by the use of the
    following distinct elements:

<specList><specDesc key="analytic"/><specDesc key="monogr"/><specDesc key="series"/></specList>
 </p>

<p>For purposes of TEI encoding, journals and anthologies are both
treated as monographs; a journal title should thus be tagged as a
<tag>title level="j"</tag> element within
a <gi>monogr</gi> element.  Individual articles in the journal or
collected texts should be treated at the <soCalled>analytic</soCalled>
level.  When an article has been printed in more than one journal or
collection, the bibliographic reference may have more than one
<gi>monogr</gi> element, each possibly followed by one or more
<gi>series</gi> elements.  A <gi>series</gi> element always relates to
the most recently preceding <gi>monogr</gi> element.  (Whether
reprints of an article are treated in the same bibliographic reference
or a separate one varies among different styles.  Library lists
typically use a different entry for each publication, while academic
footnoting practice typically treats all publications of the same
article in a single entry.)
 </p>
<p>The <gi>biblScope</gi> element is used to supply further
information about the location of some part of a bibliographic
reference. It specifies where to find the component in which it appears
within the immediately preceding component of a different level.  </p>
<p>In the following example, Schachter's article
<title>Iolaos</title> appeared on pages 64 to 70 of a volume entitled
<title>Herakles to Poseidon</title>, which was itself the second of
four volumes published together under the title <title>Cults of
Boiotia</title>;  this last title constituted the 38th volume in the
series of <title>Bulletin of the Institute of Classical Studies
Supplements</title>:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><biblStruct>
<analytic>
<author>Albert Schachter</author>
<title level="a">Iolaos</title>
</analytic>
<monogr>
<title level="m">Herakles to Poseidon</title>
<imprint><date>1986</date></imprint>
<biblScope unit="pp">64-70</biblScope>
</monogr>
<monogr>
<title level="m">Cults of Boiotia</title>
<imprint><pubPlace>London</pubPlace></imprint>
<extent>4 vols.</extent>
<biblScope unit="part">2</biblScope>
</monogr>
<series>
<title level="s">Bulletin of the Institute of Classical Studies
Supplements</title>
<biblScope unit="vol">38</biblScope>
</series>
</biblStruct>
</egXML>
</p>

<p>In the following example, the article cited  has been published
twice, once in a journal (where it appeared in volume 40, on pages
3-46 of the issue of October 1986) and once as a free-standing item,
which appeared as number 11 of  a
German language series. 
<egXML xmlns="http://www.tei-c.org/ns/Examples"><biblStruct>
   <analytic>
      <author>
      <persName><surname>Thaller</surname>
      <forename>Manfred</forename></persName></author>
      <title level="a">A Draft Proposal for a Standard for the
                       Coding of Machine Readable Sources</title>

   </analytic>
   <monogr>
      <title level="j">Historical Social Research</title>
      <imprint>
         <date when="1986-10">October 1986</date>
      </imprint>
         <biblScope unit="vol">40</biblScope>
         <biblScope unit="pp" from="3" to="46">3-46</biblScope>   </monogr>
   <monogr>
      <title level="m">Modelling Historical Data:
                       Towards a Standard for Encoding and 
                       Exchanging Machine-Readable Texts</title>
      <editor><persName><forename>Daniel</forename><forename full="init">I.</forename>
      <surname>Greenstein</surname></persName></editor>
      <imprint xml:lang="de">
         <pubPlace>St. Katharinen</pubPlace>
         <publisher>Max-Planck-Institut für Geschichte
                    In Kommission bei
                    Scripta Mercaturae Verlag</publisher>
         <date when="1991"/>
      </imprint>
   </monogr>
   <series xml:lang="de">              
      <title level="s">Halbgraue Reihe
                       zur Historischen Fachinformatik</title>
      <respStmt>
         <resp>Herausgegeben von</resp>
         <name type="person">Manfred Thaller</name>
         <name type="org">Max-Planck-Institut für Geschichte</name>
      </respStmt>
      <title level="s">Serie A: Historische Quellenkunden</title>
      <biblScope unit="vol">11</biblScope>
   </series>
</biblStruct></egXML>
 </p>


<p>The practice of analytic vs. monographic citation, as described here,
should be distinguished from the practice of including within one
citation a reference to another work, which the encoder considers
to be related to it in some way: see further <ptr target="#COBIRI"/> below.</p>
<p>If an identifier is available for the analytic item, it should be
represented by means of an <gi>idno</gi> element placed within the
<gi>analytic</gi> element, as in the following example where a DOI
(Digital Object Identifier) is supplied for the article in question.</p>

<egXML xmlns="http://www.tei-c.org/ns/Examples">
<biblStruct>
<analytic>
<author>
<forename>James</forename>
<forename>H.</forename>
<surname>Coombs</surname>
</author>
<author>
<forename>Allen</forename>
<surname>Renear</surname>
</author>
<author>
<forename>Steven</forename>
<forename>J.</forename>
<surname>DeRose</surname>
</author>
<title level="a">Markup Systems and The Future of Scholarly Text
Processing</title>
<idno type="DOI">10.1145/32206.32209</idno>
</analytic>
<monogr>
<title level="j">Communications of the ACM</title>
<imprint><date>1987</date></imprint>
<biblScope unit="vol">30</biblScope>
<biblScope unit="issue">11</biblScope>
<biblScope unit="pp">933–947</biblScope>
</monogr>
<ref type="url">http://xml.coverpages.org/coombs.html</ref>
</biblStruct>
</egXML>
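<p>Identifiers for the other levels may be supplied in the same way.
The following sketch abbreviates the reference above and adds an
<gi>idno</gi> within the <gi>monogr</gi> element to identify the
journal. The ISSN shown is an assumption supplied for illustration and
should be checked against an authoritative source before reuse:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<biblStruct>
<analytic>
<title level="a">Markup Systems and The Future of Scholarly Text
Processing</title>
<!-- identifier for the article (analytic level) -->
<idno type="DOI">10.1145/32206.32209</idno>
</analytic>
<monogr>
<title level="j">Communications of the ACM</title>
<!-- identifier for the journal (monographic level); assumed value -->
<idno type="ISSN">0001-0782</idno>
<imprint><date>1987</date></imprint>
</monogr>
</biblStruct>
</egXML>
Because each <gi>idno</gi> is a child of the element describing the
item it identifies, a processor can tell which level an identifier
belongs to from its position alone.</p>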
<p>Punctuation must not appear between the elements within a
structured bibliographic entry encoded with <gi>biblStruct</gi> or <gi>biblFull</gi>,
unless it is contained within the elements it delimits.  When (as in
most of the examples in this chapter) entries are encoded without any
inter-element punctuation, they can usually be processed more
easily by rendering systems able to output bibliographic
references in any of several styles.
 </p>
<p>Within a <gi>bibl</gi> however, it is possible and often convenient
to include punctuation.
<egXML xmlns="http://www.tei-c.org/ns/Examples">
  <bibl xml:id="NELSON_80">
    <author>
      <persName>
	<surname>Nelson</surname>, 
	<forename>T.</forename>
  <forename>H.</forename>
      </persName>
    </author>
    <date when="1980">1980</date>.
    <title level="a">Replacing the printed word: a complete literary
    system</title>. In <title level="m">Information Processing '80: Proceedings of the
    IFIPS Congress, October 1980</title>,
    ed.
    <editor>
      <persName>
	<forename>Simon</forename>
  <forename>H.</forename>
	<surname>Lavington</surname>
      </persName>
    </editor>,
    <biblScope unit="pp">1013-23</biblScope>.
    <pubPlace>Amsterdam</pubPlace>: <publisher>North-Holland</publisher>.
    (<note>Apparently a draft of section 4 of
    <ref target="#NELSON_88"><title level="m">Literary
    Machines</title></ref>.</note>)
  </bibl></egXML>
This example shows the components sequenced and punctuated
according to the Chicago style<?tei winita (Reference needed)?>, with all the relevant data items marked up appropriately. This
markup approach can make rendering easy when only one style guide is
targeted, or when an original source document follows a specific style guide,
while still allowing for automated recovery of key data items such as
names of authors, titles etc. </p>

<specGrp xml:id="DCOBILV" n="Levels of bibliographic information"><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/analytic.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/monogr.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/series.xml"/>

</specGrp>
</div>
<div type="div4" xml:id="COBICOR"><head>Titles, Authors, and Editors</head>
<p>Bibliographic references typically include the
title of the work being cited and the names of those intellectually
responsible for it.  For articles in journals or collections, such
statements should appear both for the analytic and for the monographic
level.  The following elements are provided for tagging such information:
<specList>
<specDesc key="title"/><specDesc key="author"/>
<specDesc key="editor"/>
<specDesc key="respStmt"/>
<specDesc key="resp"/>
<specDesc key="name"/>
<specDesc key="meeting"/>
<specDesc key="sponsor"/>
<specDesc key="funder"/>
<specDesc key="distributor"/>
<specDesc key="principal"/>
</specList>
The elements <gi>author</gi>, <gi>editor</gi>, <gi>respStmt</gi>, <gi>meeting</gi>, <gi>sponsor</gi>, <gi>funder</gi>, and <gi>principal</gi>
are the default members of the <ident type="class">model.respLike</ident> class, a subclass of the <ident type="class">model.biblPart</ident> class to which the constituents of
the <gi>bibl</gi> element belong.</p>

<p>In bibliographic references, all titles should be tagged as such,
whether analytic, monographic, or series titles.  The single element
<gi>title</gi> is used for all these cases.  When it appears directly
within an <gi>analytic</gi>, <gi>monogr</gi>, or <gi>series</gi>
element, <gi>title</gi> is interpreted as belonging to the appropriate
level.  However, it is recommended that the <att>level</att> attribute be used to signal this explicitly.</p>
<p>It is a semantic error to
give a value for the <att>level</att> attribute which is inconsistent
with the context. The <att>level</att>
value <val>a</val> implies the analytic level; the values
<val>m</val>, <val>j</val>, and <val>u</val> imply the monographic level; the value <val>s</val> implies the series level.  Note, however, that the
semantic error occurs only if the nested title is directly enclosed by
the <gi>analytic</gi>, <gi>monogr</gi>, or <gi>series</gi> element; if
it is enclosed only indirectly (i.e., nested more deeply), no semantic error need be present.  For
example, the analytic title may contain a monographic title, as in the
following example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><biblStruct>
   <analytic>
      <author ref="http://id.loc.gov/authorities/names/no2001067434">
        <persName>
          <forename>Lucy</forename>
          <forename>Allen</forename>
          <surname>Paton</surname>
        </persName>
      </author>
      <title>Notes on Manuscripts of the 
         <title level="m" xml:lang="fr">Prophécies de Merlin</title>
      </title>
   </analytic>
   <monogr>
      <title level="j">PMLA</title>
      <imprint><date>1913</date></imprint>
      <biblScope unit="vol">8</biblScope>
      <biblScope unit="pp">122</biblScope>
   </monogr>
</biblStruct></egXML>
In this case, the analytic title <q>Notes on Manuscripts of the
<title level="m" xml:lang="fr">Prophécies de Merlin</title></q> needs no <att>level</att>
attribute because it is directly contained by an <gi>analytic</gi>
  element; the monographic title contained within it, <q xml:lang="fr">Prophécies de Merlin</q>, is not semantically erroneous because it is not directly contained by the <gi>analytic</gi> element.</p>
<p>In some bibliographic applications, it may prove useful to
distinguish main titles from subordinate titles, parallel titles, etc.
The <att>type</att> attribute is provided to allow this distinction to
be recorded.
 </p>
<p>The following reference, from a national standard for bibliographic
references,
illustrates this type of analysis with its distinction between main
and subordinate titles. Note that this uses the more flexible
<gi>bibl</gi>, rather than the structured <gi>biblStruct</gi>
element: consequently, there is no requirement to tag all the
components of the reference (notably the authors).
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COBICOR-eg-246"><bibl>Saarikoski, Pirkko-Liisa, and Paavo Suomalainen,
   <title level="a" type="main">Studies on the physiology of
     the hibernating hedgehog, 15</title>
   <title level="a" type="sub">Effects of seasonal
     and temperature changes on the in vitro glycerol release from 
     brown adipose tissue</title>
   <title level="j">Ann. Acad. Sci. Fenn., Ser. A4</title>
   <date>1972</date> 
   <biblScope unit="vol">187</biblScope>
   <biblScope unit="pp" from="1" to="4">1-4</biblScope>
</bibl></egXML>
<!-- example from ANSI Z39.29, sec. A.2.2.1, p.34 -->
 </p>
<p>Slightly more complex is the distinction made below among main,
subordinate, and parallel titles, in an example from the same source (p.
63).  The punctuation and the bibliographic analysis are those given in
ANSI Z39.29-1977; the punctuation is in the style prescribed by the
International Standard Bibliographic Description (ISBD).<note place="bottom">The analysis is not wholly unproblematic:  as the text of the
standard points out, the first subordinate title is subordinate only to
the parallel title in French, while the second is subordinate to both
the English main title and the French parallel title, without this
relationship being made clear, either in the markup given in the example
or in the reference structure offered by the standard.</note> Again,
specific punctuation may be included between the component elements of
the reference only because this example uses <gi>bibl</gi> rather than
<gi>biblStruct</gi>.
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COBICOR-eg-246"><bibl>Tchaikovsky, Peter Ilich.
<title level="m" type="main">The swan lake ballet</title>
= <title level="m" type="parallel" xml:lang="fr">Le lac des cygnes</title>
: <title level="m" type="sub" xml:lang="fr">grand ballet en 4 actes</title>
: <title level="m" type="sub">op. 20</title>
[Score].
New York:  Broude Brothers; [1951] (B.B. 59). vi, 685 p.</bibl></egXML>
<!-- example from ANSI Z39.29, sec. A.12.2, p.63 -->
 </p>

<p>The elements <gi>author</gi> and <gi>editor</gi> have fairly
obvious significance for printed
books and articles; for other kinds of
bibliographic items their proper usage may be less obvious.  The
<gi>author</gi> element should be used for the person or agency with
primary responsibility for a work's intellectual content, and the
element <gi>editor</gi> for other people or agencies with some responsibility for
that content, whether or not they are called
<soCalled>editor</soCalled>.  An organization such as a radio or
television station is usually accounted <soCalled>author</soCalled> of
a broadcast, for example, while the author of a government report will
usually be the agency which produced it. A translator, illustrator, or
compiler may, however, be marked by means of the <gi>editor</gi> element,
optionally using the <att>role</att> attribute to specify the nature
of their responsibility more exactly.
 </p>
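<p>The following sketch illustrates these conventions using
<gi>bibl</gi>. The bracketed strings are placeholders rather than real
bibliographic data, and the <att>role</att> value <val>translator</val> is
one possible choice, not a required one:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><bibl>
   <!-- the agency primarily responsible for the content is tagged as author -->
   <author>[name of issuing agency]</author>
   <title level="m">[title of report]</title>
   <!-- a translator is tagged as editor; role states the responsibility -->
   <editor role="translator">[name of translator]</editor>
</bibl></egXML></p>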
<p>Many bibliographic and Linked Data applications require disambiguation 
of author names using unique identifiers. Both the <gi>author</gi> and 
<gi>editor</gi> elements may contain one or more <gi>idno</gi> elements, 
to supply such identifiers. Alternatively, if only a single identifier 
is to be recorded, the <att>key</att> or 
<att>ref</att> attribute may be used, as further discussed in <ptr target="#CONARS"/>.</p>
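<p>By way of illustration, the author cited in an earlier example might be
identified in either of the following ways. The identifier is the one
already given in that example; the <att>type</att> value <val>URI</val> on
<gi>idno</gi> is an assumption made for this sketch:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><author>
   <persName>Lucy Allen Paton</persName>
   <!-- one idno per identifier; further idno elements could follow -->
   <idno type="URI">http://id.loc.gov/authorities/names/no2001067434</idno>
</author></egXML>
<egXML xmlns="http://www.tei-c.org/ns/Examples"><!-- a single identifier supplied on the element itself -->
<author ref="http://id.loc.gov/authorities/names/no2001067434">Lucy Allen Paton</author></egXML></p>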
<p>For anyone else with responsibility for the work, the
<gi>respStmt</gi> element should be used. The nature of the
responsibility is indicated by means of a <gi>resp</gi> element, and
the person, organization, etc. responsible by a <gi>name</gi>,
<gi>persName</gi>, or <gi>orgName</gi> element. Strings such as
<q>unknown</q> may be encoded using the <gi>rs</gi> element. A
<gi>respStmt</gi> should comprise either at least one of the four
naming elements (<gi>name</gi>, <gi>persName</gi>, <gi>orgName</gi>,
or <gi>rs</gi>) followed by one or more <gi>resp</gi> elements, or at
least one <gi>resp</gi> element followed by one or more of the four
naming elements.</p>
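<p>The two permitted orderings, and the use of <gi>rs</gi> for a string
that is not a name, might be sketched as follows. The first two statements
reuse names from examples elsewhere in this section; the third is
hypothetical:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><!-- resp first, then the name -->
<respStmt>
   <resp>Translated by</resp>
   <name>AN. Dellis</name>
</respStmt>
<!-- name first, then the resp -->
<respStmt>
   <persName>Erica Dillon</persName>
   <resp>encoded named entities</resp>
</respStmt>
<!-- hypothetical: the responsible party is not known by name -->
<respStmt>
   <resp>illustrated by</resp>
   <rs>unknown</rs>
</respStmt></egXML></p>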
<p>Examples of
secondary responsibility of this kind include the roles of
illustrator, translator, encoder, and annotator. The <gi>respStmt</gi>
element may also be used for editors, if it is desired to record the
specific terms in which their role is described.</p>
<p>Examples of <gi>author</gi> and <gi>editor</gi> may be found in
sections <ptr target="#COBITY"/> and <ptr target="#COBICOL"/>; wherever
<gi>author</gi> and <gi>editor</gi> may occur, the <gi>respStmt</gi>
element may also occur. When one of these elements precedes or
immediately follows a title, it applies to that title; when it follows
an <gi>edition</gi> element or occurs within an edition statement, it
applies to the edition in question.
 </p>
<p>In this example, the <gi>respStmt</gi> elements apply to the work as
a whole, not merely to the first edition:
<!--here's the text as it appears in ISO 690:
Lominandze, DG. Cyclotron waves in plasma. Translated by AN. Dellis;
edited by SM. Hamberger. 1st ed. Oxford : Pergamon Press, 1981. 206
p. International series in natural philosophy. Translation of:
Ciklotronnye volny v plazme. ISBN 0-08-021680-3.
-->
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COBICOR-eg-248"><bibl>    
   <author>Lominandze, DG</author>.
   <title level="m">Cyclotron waves in plasma</title>.
   <respStmt>
      <resp>Translated by</resp>
      <name>AN. Dellis</name>
   </respStmt>;
   <respStmt>
      <resp>edited by</resp>
      <name>SM. Hamberger</name>
   </respStmt>.
   <edition>1st ed.</edition>
   <pubPlace>Oxford</pubPlace>:
   <publisher>Pergamon Press</publisher>,
   <date>1981</date>.
   <extent>206 p.</extent>
   <title level="s">International series in natural philosophy</title>.
   <note place="inline">Translation of:
   <title xml:lang="ru-Latn" level="m">Ciklotronnye volny v
   plazme</title>.
   <idno type="isbn">ISBN 0-08-021680-3</idno>. 
   </note>
</bibl></egXML>
<!-- from ISO 690: 1987, clause 4.1, p. 2. -->
<!--   <title xml:lang="ru" level="m">Циклотронные волны в плазме</title>.-->
<!-- transliterated back into russian per ISO 9 (1995)  --> </p>
<p>This example retains the original punctuation and editorial conventions of
the source (ISO 690: 1987) and is therefore encoded using
the <gi>bibl</gi> element.</p>
<p>In the following example, by contrast, the <gi>respStmt</gi> element applies
to the edition, and not to the collection <foreign>per se</foreign> (Moser and Tervooren
were not responsible for the first thirty-five printings). As is
permissible within a <gi>biblStruct</gi> element, the component elements
have been reordered from their appearance on the title
page of the volume in order to ensure the correct relationship of the
collection title, the edition statement, and the statement of
responsibility. 
<egXML xml:lang="de" xmlns="http://www.tei-c.org/ns/Examples"><biblStruct>
   <monogr xml:lang="de">
      <title level="m">Des Minnesangs Frühling</title>
      <note place="inline">Mit 1 Faksimile</note>
      <edition>36., neugestaltete und erweiterte Auflage</edition>
      <respStmt>
         <resp>Unter Benutzung der Ausgaben von <name>Karl 
            Lachmann</name> und <name>Moriz Haupt</name>, <name>Friedrich
            Vogt</name> und <name>Carl von Kraus</name> bearbeitet von</resp>
         <name>Hugo Moser</name>
         <name>Helmut Tervooren</name>
      </respStmt>
      <imprint>
         <pubPlace>Stuttgart</pubPlace>
         <publisher>S. Hirzel Verlag</publisher>
         <date>1977</date>
      </imprint>
      <biblScope unit="vol">I Texte</biblScope>
   </monogr>
</biblStruct></egXML><!-- reference taken from book in hand by MSM. -->
	<!-- note that from the book it is impossible to -->
	<!-- tell whether this is the 36th edition of -->
	<!-- MF or of Moser/Tervooren's edition.  -->
</p>
<p>The party with a particular responsibility for the intellectual
content may vary over time. Likewise, a given individual's
responsibility or role may change over time. These situations may be
recorded with the <gi>respStmt</gi> element. For example, the
following could be used when one proofreader took over from another.
<egXML xmlns="http://www.tei-c.org/ns/Examples"><respStmt>
  <resp>proofreading</resp>
  <persName from="1994-02" to="1994-05">Ashley Cross</persName>
  <persName from="1994-06" to="1994-10">Loren Noveck</persName>
</respStmt></egXML>
The following example records the fact that one individual had two 
distinctly different intellectual responsibilities at different times.
<egXML xmlns="http://www.tei-c.org/ns/Examples"><respStmt>
  <persName>Erica Dillon</persName>
  <resp when="2000-08">annotated uncredited citations</resp>
  <resp when="2001-03">encoded named entities</resp>
</respStmt></egXML>
</p>
<p>Another form of <soCalled>responsibility</soCalled> arises when a
work is published as the outcome of a conference, workshop
or similar meeting. The <gi>meeting</gi> element may be used to supply
this information, as in the following example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><biblStruct>
   <monogr>
     <title level="m">Proceedings of a workshop on corpus resources</title>
     <respStmt>
       <resp>Programme Organizer</resp>
       <name>Geoffrey Leech</name>
     </respStmt>
     <meeting>DTI Speech and Language Technology Club meeting, 3-4
     January 1990, Wadham College, Oxford</meeting>
     <imprint><pubPlace>Oxford</pubPlace></imprint>
   </monogr>
</biblStruct>
</egXML>

</p>

<specGrp xml:id="DCOBICOR" n="Author, title, etc.">
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/author.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/editor.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/respStmt.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/resp.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/title.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/meeting.xml"/>
</specGrp>
</div>
<div type="div4" xml:id="COBICOD"><head>Document Identifiers</head>
<p>Many bibliographic references include identifiers for a work to help with precise identification of an appropriate document. For example, a book in the <title>Short Title Catalogue</title> could be referenced with its STC number:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<biblStruct>
  <monogr>
    <author>
      <forename>John</forename>
      <surname>Downame</surname>
    </author>
    <title type="short">Foure treatises tending to disswade all Christians from foure no lesse hainous then common sinnes</title>
    <idno type="stc2ndEd">7141</idno>
    <imprint>
      <pubPlace>At London</pubPlace>
      <publisher>Imprinted by Felix Kyngston, for William Welby, and are to be sold at his shop in Pauls Church-yard at the signe of the Greyhound</publisher>
      <date when="1609">1609</date>
    </imprint>
  </monogr>
</biblStruct>
</egXML>
</p>
<p>However, some bibliographic references actually <emph>require</emph> identifiers of various types because they do not include a statement of the title and the names of those intellectually responsible for it. The following elements may be used for such purposes:
<specList>
  <specDesc key="orgName"/>
  <specDesc key="idno"/>
  <specDesc key="classCode"/>
  <specDesc key="date"/>
</specList>
</p>
<p>For example, a citation to a patent typically includes a country or organization code (a two-character code identifying a patent authority) and a serial number for the patent (whose structure varies by patent authority). The citation might also contain a <term>kind code</term> (which characterizes a particular publication for the patent and which corresponds to a specific stage in the patent procedure) and the date when the patent was filed with or published by the issuing authority. For bibliographic references to patents, the above elements may be used as follows:
<list rend="bulleted">
<item><gi>orgName</gi>, within <gi>authority</gi>, may be used to contain the code of the patent authority. The <att>type</att> attribute may be used to specify the type of patent authority (such as a national patent office or a supra-national patent organization).</item>
<item><gi>idno</gi> may be used to contain the serial number assigned by the corresponding patent authority.</item>
<item><gi>classCode</gi> may be used to contain the kind code of the patent document.</item>
<item><gi>date</gi> may be used to contain the date of the patent document. The <att>type</att> attribute may be used to specify whether this corresponds to the filing date of a patent application or the publication date of a patent publication.</item>
</list>
</p>
<p>The following reference illustrates an encoding for a patent publication which might be cited in print as "United States patent US 6,885,550 B1, issued April 26, 2005":
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<biblStruct type="patent" status="publication">
  <monogr>
    <authority>
      <orgName type="national">US</orgName>
    </authority>
    <idno type="docNumber">6885550</idno>
    <imprint>
      <classCode scheme="http://www.uspto.gov/">B1</classCode>
      <date type="publicationDate" when="2005-04-26">April 26, 2005</date>
    </imprint>
  </monogr>
</biblStruct>
</egXML>
</p>
</div>
<div type="div4" xml:id="COBICOI"><head>Imprint, Size of a Document, and Reprint Information</head>
<p>By <mentioned>imprint</mentioned> is meant all the information
relating to the publication of a work: the person or organization by
whose authority and in whose name a bibliographic entity such as a
book is made public or distributed (whether a commercial publisher or
some other organization), the place and the date of publication.  It
may also include a full address for the publisher or organization.
A full bibliographic reference will usually also specify the number of
pages in a print publication (or equivalent information for non-print
materials), and possibly also the specific location of the material being cited
within its containing publication.  The following elements are
provided to hold this information: 
  <specList>
    <specDesc key="imprint"/>
    <specDesc key="address"/>
    <specDesc key="pubPlace"/>
    <specDesc key="publisher"/>
    <specDesc key="date"/>
    <specDesc key="extent"/>
    <specDesc key="idno"/>
  </specList> 
Members of the model classes
<ident type="class">model.imprintPart</ident>
and <ident type="class">model.dateLike</ident>
may appear inside an <gi>imprint</gi> element in a specific
location within a <gi>biblStruct</gi>, or alternatively, they may
appear alongside any other bibliographic component inside a
<gi>bibl</gi>.
<specList>
    <specDesc key="model.imprintPart"/>
    <specDesc key="model.dateLike"/>
</specList>
 </p>
<p>For bibliographic purposes, usually only the place (or places) of
publication are required, possibly including the name of the country,
rather than a full address; the element <gi>pubPlace</gi> is provided
for this purpose. Where, however, the full postal address is likely to
be of importance in identifying or locating the bibliographic item
concerned, it may be supplied and tagged using the <gi>address</gi>
element described in section <ptr target="#CONAAD"/>. Alternatively,
if desired, the <gi>rs</gi> or <gi>name</gi> elements described in
section <ptr target="#CONARS"/> may be used; this involves no claim
that the information given is either a full address or the name of a
city.
 </p>
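<p>The difference between the two approaches might be sketched as follows,
assuming the use of <gi>bibl</gi>. All bracketed strings are placeholders,
not real bibliographic data:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><!-- usual case: the place of publication only -->
<bibl>
   <title level="m">[title]</title>
   <pubPlace>[city, country]</pubPlace>
   <publisher>[name of publisher]</publisher>
   <date>[year]</date>
</bibl>
<!-- a full postal address, where it helps to locate the item -->
<bibl>
   <title level="m">[title]</title>
   <publisher>[name of publisher]</publisher>
   <address>
      <addrLine>[street and number]</addrLine>
      <addrLine>[postal code and town]</addrLine>
   </address>
   <date>[year]</date>
</bibl></egXML></p>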
<p>The name of the publisher of an item should be marked using the
<gi>publisher</gi> element even if the item is made public
(<soCalled>published</soCalled>) by an organization other than a
conventional publisher, as is frequently the case with technical
reports:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><biblStruct>
   <monogr>
      <author>Nicholas, Charles K.</author>
      <author>Welsch, Lawrence A.</author>
      <title level="m">On the interchangeability of SGML and ODA</title>
      <idno type="NIST">NISTIR 4681</idno>
      <imprint>
         <pubPlace>Gaithersburg, MD</pubPlace>
         <publisher>National Institute of Standards and Technology</publisher>
         <date when="1992-01">January 1992</date>
      </imprint>
      <extent>19 pp.</extent>
   </monogr>
</biblStruct></egXML>
and with dissertations:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COBICOI-eg-264"><biblStruct>
   <monogr>
      <author>Hansen, W.</author>
      <title level="u">Creation of hierarchic text
      with a computer display</title>
      <idno type="ANL">ANL-7818</idno>
      <note place="inline">Ph.D. dissertation</note>
      <imprint>
         <publisher>Dept. of Computer Science, Stanford Univ.</publisher>
         <pubPlace>Stanford, CA</pubPlace>
         <date when="1971-06">June 1971</date>
      </imprint>
   </monogr>
</biblStruct></egXML>
</p>
<p>In this second example, the <gi>idno</gi> element is used to
provide the identifier allocated to the thesis by the Argonne
National Laboratory. Since it applies to the monographic element,
the <gi>idno</gi> should be provided as a direct child of the <gi>monogr</gi>
element, rather than elsewhere in the <gi>biblStruct</gi> element. </p>
 <p>The specialist elements <gi>publisher</gi> and <gi>distributor</gi> 
  are provided to cover the most common roles related to the production
  and distribution of a bibliographical item, but other roles such as 
  printer and bookseller may also need to be encoded, and <gi>respStmt</gi>
 is available inside <gi>imprint</gi> for this purpose.</p>
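<p>As a sketch of this usage, the imprint of the 1609 book cited in
section <ptr target="#COBICOD"/> might be analysed more finely, with the
printer recorded in a <gi>respStmt</gi>. Reading <q>for William Welby</q> as
naming the publisher is an interpretive assumption made for this
illustration:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><imprint>
   <pubPlace>At London</pubPlace>
   <!-- the printer: a role with no specialist element of its own -->
   <respStmt>
      <resp>Imprinted by</resp>
      <name>Felix Kyngston</name>
   </respStmt>
   <!-- assumed here to be the publisher -->
   <publisher>William Welby</publisher>
   <date when="1609">1609</date>
</imprint></egXML></p>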
 
<p>When an item has been reprinted, especially reprinted without change
from a specific earlier edition, the reprint may appear in a
<gi>monogr</gi> element with only the <gi>imprint</gi> and other details
of the reprint.  In the following example, a microform reprint has been
issued without any change in the title or authorship.  The series
statement here applies only to the second <gi>monogr</gi> element.
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COBICOR-eg-246"><biblStruct>
   <monogr>
      <author>Shirley, James</author>
      <title type="main">The gentlemen of Venice</title>
      <title type="sub">a tragi-comedie presented at the private
          house in Salisbury Court by Her Majesties servants</title>
      <note place="inline">[Microform]</note>
      <imprint>
         <pubPlace>London</pubPlace>
         <publisher>H. Moseley</publisher>
         <date>1655</date>
      </imprint>
      <extent>78 p.</extent>
   </monogr>
   <monogr>
      <imprint>
         <pubPlace>New York</pubPlace>
         <publisher>Readex Microprint</publisher>
         <date>1953</date>
      </imprint>
      <extent>1 microprint card, 23 x 15 cm.</extent>
   </monogr>
   <series>    
      <title level="s">Three centuries of drama: English, 1642–1700</title>
   </series>
</biblStruct></egXML>
<!-- example from ANSI Z39.29, sec. A.3.12.1, p.41 -->
 </p>
<p>This encoding can be extended to the case of patent documents, where the same patent application is published, with or without changes, at different stages of the patenting procedure. In this case, the kind code and, optionally, the publication date characterize different publications of the same patent application during the procedure. For example:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<biblStruct type="patent" status="publication">
  <monogr>
    <authority>
      <orgName type="national">EP</orgName>
    </authority>
    <idno type="docNumber">1558513</idno>
    <imprint>
      <classCode scheme="http://www.epo.org/">A1</classCode>
      <date type="publicationDate" when="2005-08-03"/>
    </imprint>
  </monogr>
  <monogr>
    <imprint>
      <classCode scheme="http://www.epo.org/">B1</classCode>
      <date type="publicationDate" when="2009-09-09"/>
    </imprint>
  </monogr>
</biblStruct>
</egXML>
</p>
<p>The above bibliographic reference records two publications of the patent EP1558513 made during the patenting procedure. The first publication, of 3 August 2005, has the kind code "A1", indicating that it is a published patent application including the European search report issued after the search carried out at the European Patent Office; the second, of 9 September 2009, has the kind code "B1", indicating that it was published after the patent had been granted.</p>
<p>An alternative way of handling the above situations would be to use the
<gi>relatedItem</gi> element described in section <ptr target="#COBIRI"/> below.</p>
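<p>For the patent example, such an encoding might look like the following
sketch, in which the granted publication is treated as the item described
and the earlier published application as a related item. That choice, and
the <att>type</att> value <val>otherEdition</val>, are assumptions made for
illustration:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><biblStruct type="patent" status="publication">
  <monogr>
    <authority>
      <orgName type="national">EP</orgName>
    </authority>
    <idno type="docNumber">1558513</idno>
    <imprint>
      <!-- the publication of the granted patent -->
      <classCode scheme="http://www.epo.org/">B1</classCode>
      <date type="publicationDate" when="2009-09-09"/>
    </imprint>
  </monogr>
  <relatedItem type="otherEdition">
    <biblStruct type="patent" status="publication">
      <monogr>
        <authority>
          <orgName type="national">EP</orgName>
        </authority>
        <idno type="docNumber">1558513</idno>
        <imprint>
          <!-- the earlier published application -->
          <classCode scheme="http://www.epo.org/">A1</classCode>
          <date type="publicationDate" when="2005-08-03"/>
        </imprint>
      </monogr>
    </biblStruct>
  </relatedItem>
</biblStruct></egXML></p>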
</div>
<div type="div4" xml:id="COBICOB">
<head>Scopes and Ranges in Bibliographic Citations</head>
<p>Many bibliographic citations contain data limiting the citation to one 
or more volumes, issues, or pages, or to a name or number of a subdivision 
of the host work. These come in two varieties:
<list>
<item>the scope of a bibliographic reference (encoded using <gi>biblScope</gi>)</item>
<item>the range of a work cited (encoded using <gi>citedRange</gi>)</item>
</list>
Where it is desired to distinguish different classes of such information 
(volume number, page number, chapter number, etc.), the <att>unit</att> 
attribute may be used with any convenient typology (see the element 
definitions for <gi>biblScope</gi> and <gi>citedRange</gi> for some 
suggested values). 
</p>
<p>The scope of a bibliographic reference indicates that the <emph>entire work 
cited</emph> is to be found in particular volumes, issues, pages, etc. For example: 
<egXML xmlns="http://www.tei-c.org/ns/Examples"><biblStruct>
   <analytic>
      <author>
        <persName>
          <surname>Wrigley</surname>
          <forename full="init">E.</forename>
          <forename full="init">A.</forename>
        </persName>
      </author>
      <title level="a">Parish registers and the historian</title>
   </analytic>
   <monogr>
     <author>
       <persName>
         <surname>Steel</surname>
         <forename full="init">D.</forename>
         <forename full="init">J.</forename>
       </persName>
     </author>
     <author>
       <persName>
         <surname>Steel</surname>
         <forename full="init">A.</forename>
         <forename full="init">E.</forename>
         <forename full="init">F.</forename>
       </persName>
     </author>
     <title level="m">General sources of births, marriages and deaths before 1837</title>
     <imprint>
       <pubPlace>London</pubPlace>
       <publisher>Society of Genealogists</publisher>
       <date when="1968"/>
     </imprint>
     <biblScope unit="pp" from="155" to="167">155–167</biblScope>
   </monogr>
   <series>
     <title level="s">National index of parish registers</title>
     <biblScope unit="vol">1</biblScope>
   </series>
</biblStruct></egXML>
 </p>
<p>The <att>unit</att> attribute on <gi>biblScope</gi> is optional:
both of the following examples are valid:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><biblStruct>
   <analytic>
      <author>Boguraev, Branimir</author>
      <author>Neff, Mary</author>
      <title level="a">Text Representation, Dictionary Structure,
             and Lexical Knowledge</title>
   </analytic>
   <monogr>
      <title level="j">Literary &amp; Linguistic Computing</title>
      <imprint>
         <date>1992</date>
      </imprint>
         <biblScope unit="vol">7</biblScope>
         <biblScope unit="issue">2</biblScope>
         <biblScope unit="pp">110-112</biblScope>
   </monogr>
</biblStruct></egXML>
<egXML xmlns="http://www.tei-c.org/ns/Examples"><biblStruct>
   <analytic>
      <author>Chesnutt, David</author>
      <title level="a">Historical Editions in the States</title>
   </analytic>
   <monogr>
      <title level="j">Computers and the Humanities</title>
      <imprint>
         <date when="1991-12">(December, 1991):</date>
      </imprint>
         <biblScope>25.6</biblScope>
         <biblScope from="377" to="380">377–380</biblScope>
   </monogr>
</biblStruct></egXML>
<!-- TYPE attribute dropped from biblScope in Chesnutt        -->
	<!-- example, as per WWP suggestion (otherwise lead-in makes  -->
	<!-- no sense) (msm)                                          -->
 </p>
<p>A cited range, on the other hand, indicates that the author <emph>cited 
only the portion</emph> defined by this range. For example, a footnote 
following a quotation from page 378 of <title level="a">Historical 
Editions in the States</title> that includes a full bibliographic 
reference would be encoded using <gi>biblStruct</gi> as follows:
  <egXML xmlns="http://www.tei-c.org/ns/Examples"><biblStruct>
    <analytic>
      <author>Chesnutt, David</author>
      <title level="a">Historical Editions in the States</title>
    </analytic>
    <monogr>
      <title level="j">Computers and the Humanities</title>
      <imprint>
        <date when="1991-12">(December, 1991):</date>
      </imprint>
      <biblScope>25.6</biblScope>
      <biblScope unit="pp" from="377" to="380">377–380</biblScope>
    </monogr>
    <citedRange>378</citedRange>
  </biblStruct></egXML>
</p>
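<p>The same distinction can be sketched within the looser <gi>bibl</gi>
element, with <att>unit</att>, <att>from</att>, and <att>to</att> supplied on
<gi>citedRange</gi>. The two-page range cited here is a hypothetical
scenario, and the <att>unit</att> value follows the usage of the examples
above:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><bibl>
   <author>Chesnutt, David</author>,
   <title level="a">Historical Editions in the States</title>,
   <title level="j">Computers and the Humanities</title>
   <biblScope>25.6</biblScope>
   <date when="1991-12">(December, 1991)</date>:
   <!-- scope: where the whole article is to be found -->
   <biblScope unit="pp" from="377" to="380">377–380</biblScope>,
   <!-- range: the portion actually cited (hypothetical) -->
   at <citedRange unit="pp" from="378" to="379">378–379</citedRange>
</bibl></egXML></p>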
<specGrp xml:id="DCOPUB" >
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/imprint.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/publisher.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/biblScope.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/citedRange.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/pubPlace.xml"/>
</specGrp>
</div>
<div type="div4" xml:id="COBICOS"><head>Series Information</head>
<p>Series information may (in <gi>bibl</gi> elements) or must (in
<gi>biblStruct</gi> elements) be enclosed in a <gi>series</gi> element
or (in a <gi>biblFull</gi> element) a <gi>seriesStmt</gi> element.  The
title of the series may be tagged <tag>title level="s"</tag>, the
volume number <tag>biblScope unit="vol"</tag>, and responsibility
statements for the series (e.g. the name and affiliation of the editor,
as in the example in section <ptr target="#COBICOL"/>) may be tagged
<gi>editor</gi> or <gi>respStmt</gi>. Any identifier associated with
the series itself should be marked using the <gi>idno</gi> element.
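A minimal sketch of a series statement within a <gi>bibl</gi>, reusing a
series cited elsewhere in this chapter, might look as follows; the content
of the <gi>idno</gi> is a placeholder, and its <att>type</att> value is an
assumption made for illustration:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><bibl>
   <title level="m">General sources of births, marriages and deaths before 1837</title>
   <series>
      <title level="s">National index of parish registers</title>
      <biblScope unit="vol">1</biblScope>
      <!-- placeholder: an identifier belonging to the series itself -->
      <idno type="ISSN">[series identifier]</idno>
   </series>
</bibl></egXML>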
 </p></div>
<div type="div4" xml:id="COBIRI"><head>Related Items</head>
<p>In bibliographic parlance, a <term>related item</term> is any
bibliographic item which, though related to that being defined, is
distinct from it. The distinction between analytic and monographic
items made above may be thought of as a special case of this kind of
<q>related</q> item. More usually, however, the term is applied to such
items as translations, continuations, different versions, parts,
etc. </p>
<p>The element <gi>relatedItem</gi> is provided as a means of documenting such
associated items:
<specList>
<specDesc key="relatedItem"/>
</specList></p>

<p>In the following example, the first <gi>biblStruct</gi> describes a
facsimile edition, and the second describes the work of which it is a
facsimile. The relation between the facsimile and its source is
represented by means of a <gi>relatedItem</gi> within the first
description, which points to the description of the source.

<egXML xmlns="http://www.tei-c.org/ns/Examples"><biblStruct xml:id="bibl03">
<monogr>
    <author>Swinburne, Algernon Charles</author>
    <title level="m">Swinburne's <title level="m">Atalanta in Calydon</title>: A Facsimile of the
        First Edition</title>
    <editor>Georges Lafourcade</editor>
    <imprint>
        <pubPlace>London</pubPlace>
        <publisher>Oxford UP</publisher>
        <date>1930</date>
    </imprint>
</monogr>
<relatedItem type="otherEdition">
    <ref target="#bibl04"/>
</relatedItem>
</biblStruct>
                
<biblStruct xml:id="bibl04">
<monogr>
    <author>Swinburne, Algernon Charles</author>
    <title level="m">Atalanta in Calydon</title>
    <imprint>
        <pubPlace>London</pubPlace>
        <publisher>Edward Moxon</publisher>
        <date>1865</date>
    </imprint>
</monogr>
</biblStruct>
</egXML>

</p>

<p>The <gi>ref</gi> element in the above example could be
replaced by the referenced <gi>biblStruct</gi> itself since a
<gi>relatedItem</gi> may contain any form of bibliographic
reference. For example, one of the examples quoted above might also be
encoded as follows:

<egXML xmlns="http://www.tei-c.org/ns/Examples"><biblStruct>
   <monogr>
      <author>Shirley, James</author>
      <title type="main">The gentlemen of Venice</title>
      <imprint>
         <pubPlace>New York</pubPlace>
         <publisher>Readex Microprint</publisher>
         <date>1953</date>
      </imprint>
      <extent>1 microprint card, 23 x 15 cm.</extent>
   </monogr>
   <series>    
      <title level="s">Three centuries of drama: English, 1642–1700</title>
   </series>
  <relatedItem type="otherEdition">   
    <biblStruct><monogr>
      <author>Shirley, James</author>
      <title type="main" level="m">The gentlemen of Venice</title>
      <title type="sub" level="m">a tragi-comedie presented at the private
          house in Salisbury Court by Her Majesties servants</title>
      <imprint>
         <pubPlace>London</pubPlace>
         <publisher>H. Moseley</publisher>
         <date when="1655">1655</date>
      </imprint>
      <extent>78 p.</extent>
   </monogr></biblStruct></relatedItem>
</biblStruct></egXML>
</p>
<p>The <att>type</att> attribute should be used to indicate the
relationship between the bibliographic item and any
<gi>relatedItem</gi> it contains or points to. The relationships may
be transitive (for example <val>translatedAs</val> or
<val>reprintedFrom</val>) or non-transitive (for example
<val>otherEdition</val>).  The <att>subtype</att> attribute may be
used to provide a more detailed classification, where this is
appropriate. Some further examples follow:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><biblStruct>
   <monogr>
      <author>Tolkien, J.R.R.</author>
      <title level="m">Den hobbit</title>
      <title type="sub">aus dem Engleschen iwwersat</title>
      <editor role="translator">Henry Wickens</editor>
      <imprint>
         <pubPlace>Esch-sur-Sûre</pubPlace>
         <publisher>Op der Lay S. àr. L</publisher>
         <date>2002</date>
      </imprint>
   </monogr>
<relatedItem type="translatedFrom">
<bibl>
      <author>Tolkien, J.R.R.</author>
<title level="m">The Hobbit</title>. 
      <publisher>Collins</publisher>
      <date>1997</date>
</bibl>
</relatedItem>
</biblStruct></egXML>
In this example, a full bibliographic description
of the edition used as source for the translation is provided within
the content of the <gi>relatedItem</gi>. Alternatively this might be
provided by means of a link, in which case the <gi>relatedItem</gi>
would be empty:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<relatedItem type="translatedFrom" target="http://www.example.com/bibliography.xml#TOLK97"/>
</egXML>
</p>
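<p>Where a finer classification is wanted, the <att>subtype</att> attribute is supplied alongside <att>type</att>. The following minimal sketch reuses details from the Shirley example above. The value <val>print</val> is not a predefined one; it is assumed here purely for illustration, as one way a project might distinguish a printed edition from the microprint being described:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<relatedItem type="otherEdition" subtype="print">
  <!-- the subtype value is hypothetical: any project-defined classification may be used -->
  <bibl>
    <author>Shirley, James</author>
    <title level="m">The gentlemen of Venice</title>
    <pubPlace>London</pubPlace>
    <publisher>H. Moseley</publisher>
    <date when="1655">1655</date>
  </bibl>
</relatedItem>
</egXML>
</p>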
<specGrp xml:id="DCOBI" n="Tags for Bibliographic References">
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/bibl.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/biblStruct.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/listBibl.xml"/>
<specGrpRef target="#DCOBILV"/>
<specGrpRef target="#DCOBICOR"/>
<specGrpRef target="#DCOPUB"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/relatedItem.xml"/>
</specGrp>
</div>
<div type="div4" xml:id="COBICON"><head>Notes and Statement of Language</head>
<p>Explanatory notes about the publication of unusual items, the form of
an item (e.g.  <mentioned>[Score]</mentioned> or <mentioned>[Microform]</mentioned>), or
its provenance (e.g.  <mentioned>translation of ...</mentioned>) may be tagged
using the <gi>note</gi> element.  The same element may be used for any
descriptive annotation of a bibliographic entry in a database.
<specList><specDesc key="note"/></specList>
 </p>
<p>For example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><bibl>
   <author>Coombs, James H., Allen H. Renear,
           and Steven J. DeRose.</author>
   <title level="a">Markup Systems and the Future of Scholarly
Text Processing.</title>
   <title level="j">Communications of the ACM</title>
   <biblScope>30.11 (November 1987): 933–947.</biblScope>
   <note>Classic polemic supporting descriptive over procedural
         markup in scholarly work.</note>
</bibl></egXML>
 </p>
<p>
The <gi>textLang</gi> element may be used to record information about the languages used within a bibliographic item.
<specList><specDesc key="textLang" atts="mainLang otherLangs"/></specList>
This element can take the form of a simple note such as:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<textLang>Latin, with some glosses in Anglo-Saxon and French</textLang>
</egXML>
However, it is generally recommended, where feasible, to use the <att>mainLang</att> attribute to record the chief
language of the bibliographic item, and optionally the <att>otherLangs</att> attribute to identify other languages used in the work. For example:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<textLang mainLang="la" otherLangs="ang fr">Latin, with some glosses in Anglo-Saxon and French</textLang>
</egXML>
</p>
<p>The <att>mainLang</att> and <att>otherLangs</att> attributes should both provide language identifiers 
in the same form as used for <att>xml:lang</att> as described at <ptr target="#CHSH"/>. Where additional 
detail is needed to describe a language correctly, or to discuss its
deployment in a given text, this should be done using the
<gi>langUsage</gi> element in the TEI header, within which
individual <gi>language</gi> elements document the languages
used: see <ptr target="#HD41"/>.  </p>
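<p>A minimal sketch of such a header declaration, for the languages named in the example above, might look as follows. The descriptive wording inside each <gi>language</gi> element is illustrative only:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<langUsage>
  <!-- one language element per language; ident uses the same form as xml:lang -->
  <language ident="la">Latin (main language of the text)</language>
  <language ident="ang">Anglo-Saxon (glosses)</language>
  <language ident="fr">French (glosses)</language>
</langUsage>
</egXML>
The <att>ident</att> values here correspond to those supplied on <att>mainLang</att> and <att>otherLangs</att>.</p>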
<p>A description in French of a work predominantly in German, but also with some Latin, might
have a <gi>textLang</gi> like the following:
<egXML xmlns="http://www.tei-c.org/ns/Examples" xml:lang="fr">
<textLang xml:lang="fr" mainLang="de" otherLangs="la">allemand et latin</textLang>
</egXML> For more information about the use of <gi>textLang</gi> in manuscript descriptions,
see <ptr target="#mslangs"/>.</p>

</div>
<div type="div4" xml:id="COBICOO"><head>Order of Components within References</head>
<p>The order of elements in <gi>bibl</gi> elements is not constrained.
 </p>
<p>In <gi>biblStruct</gi> elements, the <gi>analytic</gi> element, if
it occurs, must come first, followed by one or more <gi>monogr</gi> and
<gi>series</gi> elements, which may appear intermingled (as long as a
<gi>monogr</gi> element comes first), and then zero or more of the 
following in any order: <gi>note</gi>, <gi>witDetail</gi>, <gi>idno</gi>, 
<gi>ptr</gi>, <gi>ref</gi>, <gi>relatedItem</gi>, and <gi>citedRange</gi>.  
Within <gi>analytic</gi>, the
title(s), author(s), editor(s), and other statements of responsibility
may appear in any order; it is recommended that all forms of the title
be given together.  Within <gi>monogr</gi>, the author, editor, and
statements of responsibility may either come first or else follow the
monographic title(s).  Following these, the elements listed below, if 
present, must appear in the following order:
<list rend="bulleted">
<item><gi>note</gi>s on the publication (and <gi>meeting</gi> elements
describing the conference, in the case of a proceedings volume)</item>
<item><gi>edition</gi> elements, each followed by any related
<gi>editor</gi> or <gi>respStmt</gi> elements</item>
<item><gi>imprint</gi></item>
<item><gi>biblScope</gi></item></list>
Within <gi>imprint</gi>, the elements allowed may appear in any
order.</p>
<p>Finally, within the <gi>series</gi> information in a
<gi>biblStruct</gi>, the sequence of elements is not constrained.
 </p>
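<p>The ordering constraints described above can be illustrated with a minimal sketch, which rearranges the details of a journal article cited elsewhere in this chapter. The comments are explanatory only, and the content of the <gi>ref</gi> element is assumed for illustration:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<biblStruct>
  <!-- analytic, if present, comes first -->
  <analytic>
    <author>
      <forename>Suzana</forename>
      <surname>Sukovic</surname>
    </author>
    <title level="a">Beyond the Scriptorium: The Role of the Library in Text
      Encoding</title>
  </analytic>
  <!-- within monogr: title, then imprint, then biblScope -->
  <monogr>
    <title level="j">D-Lib</title>
    <imprint>
      <date>2002</date>
    </imprint>
    <biblScope unit="vol">8</biblScope>
    <biblScope unit="issue">1</biblScope>
  </monogr>
  <!-- note, idno, ptr, ref, relatedItem, etc. follow in any order -->
  <ref target="http://www.dlib.org/dlib/january02/sukovic/01sukovic.html">online copy</ref>
</biblStruct>
</egXML>
</p>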
<p>If more detailed structuring of a bibliographic description is
required, the <gi>biblFull</gi> element should be used.  This is not
further described here, as its contents are essentially equivalent to
those of the <gi>fileDesc</gi> element in the <gi>teiHeader</gi>, which
is fully described in section <ptr target="#HD2"/>.
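A minimal sketch of a <gi>biblFull</gi>, reusing details of the 1655 edition from an earlier example, might look as follows. It is intended only to show the general shape of the element, not a complete description:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<biblFull>
  <!-- same statements, in the same order, as in a fileDesc -->
  <titleStmt>
    <title>The gentlemen of Venice</title>
    <author>Shirley, James</author>
  </titleStmt>
  <extent>78 p.</extent>
  <publicationStmt>
    <publisher>H. Moseley</publisher>
    <pubPlace>London</pubPlace>
    <date when="1655">1655</date>
  </publicationStmt>
</biblFull>
</egXML>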
 </p></div></div>
<div type="div3" xml:id="COBIXR"><head>Bibliographic Pointers</head>
<p>References which are pointers to bibliographic items, of whatever
kind, should be treated in the same way as other cross-references (see
section <ptr target="#COXR"/>).  As discussed in that section, 
cross-referencing within TEI texts is in general represented by means of
<gi>ptr</gi> or <gi>ref</gi> elements. A <att>target</att> attribute on
these elements is used to supply an identifying value for the target of
the cross-reference, which should be, in the case of bibliographic
elements, a bibliographic reference of some kind.  Where the form of the
reference itself is unimportant, or may be reconstructed mechanically,
or is not to be encoded, the <gi>ptr</gi> element is used, as in the
following example:
<egXML xmlns="http://www.tei-c.org/ns/Examples">As shown above (<ptr target="#NEL80"/>) ...</egXML>
 </p>
<p>Where the form of the reference is important, or contains additional
qualifying information which is to be kept but distinguished from the
surrounding text, the <gi>ref</gi> element should be used, as in the
following example:
<egXML xmlns="http://www.tei-c.org/ns/Examples">Nelson claims <ref target="#NEL80">(ibid, passim)</ref> ...</egXML>
It may be important to distinguish between the short form of a
bibliographic reference and some qualifying or additional information.
When this is the case, the latter should be placed outside the <gi>ref</gi>
element, as for example in an application
concerned with normalizing bibliographic references:
<egXML xmlns="http://www.tei-c.org/ns/Examples">Nelson claims (<ref target="#NEL80">Nelson [1980]</ref> pages 13–37) ...</egXML>
 </p>
<p>The <gi>ref</gi> element may also be used to provide a reference to a copy of the bibliographic item itself, particularly if this is available online, as in the following example: 
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<biblStruct>
<analytic>
<author>
<forename>Suzana</forename>
<surname>Sukovic</surname>
</author>
<title level="a">Beyond the Scriptorium: The Role of the Library in Text
Encoding</title>
</analytic>
<monogr>
<title level="j">D-Lib</title>
<ref
    type="url">http://www.dlib.org/dlib/january02/sukovic/01sukovic.html</ref>
<imprint>
<biblScope unit="vol">8</biblScope>
<biblScope unit="issue">1</biblScope>
<date>2002</date>
</imprint>

</monogr>
</biblStruct>
</egXML>
  </p>
</div>

<div type="div3" xml:id="COBIOT"><head>Relationship to Other Bibliographic Schemes</head>

<p>The bibliographic tagging defined here can capture the distinctions
required by most bibliographic encoding systems; for the benefit of
users of some commonly used systems, the following list of equivalences
is offered, showing the relationship of the markup defined here to the
fields defined for bibliographic records in the Scribe and BibTeX
systems.
 </p>
<p>Listed below are the equivalences between the various bibliographic fields
defined for use in the Scribe and BibTeX systems of bibliographic
databases and the elements defined in this module.<note place="bottom">The BibTeX scheme is
intentionally compatible with that of Scribe, although it omits some
fields used by Scribe. Hence only one list of fields is given
here.</note> Elements and structures available in the module defined here which
have no analogues in Scribe and BibTeX are not noted.
<list type="gloss">
  <label>address</label>
  <item>tag as <gi>placeName</gi> or <gi>address</gi></item>
  <label>annote</label>
  <item>tag as <gi>note</gi></item>
  <label>author</label>
  <item>tag as <gi>author</gi></item>
  <label>booktitle</label>
  <item>tag as <tag>title level="m"</tag> or <gi>title</gi> within
    <gi>monogr</gi></item>
  <label>chapter</label>
  <item>tag as <tag>biblScope unit="chap"</tag></item>
  <label>date</label>
  <item>used only to record date entry was made in the bibliographic database;
    not supported</item>
  <label>edition</label>
  <item>tag as <gi>edition</gi></item>
  <label>editor</label>
  <item>tag as <gi>editor</gi> or <gi>respStmt</gi></item>
  <label>editors</label>
  <item>tag as multiple <gi>editor</gi> or <gi>respStmt</gi> elements</item>
  <label>fullauthor</label>
  <item>use the <gi>reg</gi> element, possibly inside a <gi>choice</gi> element, inside either an <gi>author</gi> or <gi>name</gi></item>
  <label>fullorganization</label>
  <item>use the <gi>reg</gi> element, possibly inside a <gi>choice</gi> element, inside a <tag>name type="org"</tag></item>
  <label>howpublished</label>
  <item>tag as <gi>note</gi>, possibly using the form <tag>note
    place="inline"</tag></item>
  <label>institution</label>
  <item>used only for issuer of technical reports; tag as <gi>publisher</gi></item>
  <label>journal</label>
  <item>tag as <tag>title level="j"</tag> or <gi>title</gi> within
    <gi>monogr</gi></item>
  <label>key</label>
  <item>used to specify an alternate sort key for the bibliographic item, for
    use instead of author's or editor's name; not supported</item>
  <label>meeting</label>
  <item>tag as <gi>meeting</gi> or as <gi>note</gi></item>
  <label>month</label>
  <item>use <gi>date</gi>; if the date is not in a trivially parseable form, use
    the <att>when</att> attribute to provide a normalized equivalent in one of
    the formats from <title ref="#XSD2">XML Schema Part 2: Datatypes Second Edition</title></item>
  <label>note</label>
  <item>tag as <gi>note</gi></item>
  <label>number</label>
  <item>tag as <tag>biblScope unit="issue"</tag> or <tag>biblScope
    unit="number"</tag>; for technical report numbers, use <tag>idno
      type="docno"</tag></item>
  <label>organization</label>
  <item>used only for sponsor of conference; use <tag>name type="org"</tag>
    within <gi>respStmt</gi> within <gi>meeting</gi> element</item>
  <label>pages</label>
  <item>tag as <tag>biblScope unit="pp"</tag></item>
  <label>publisher</label>
  <item>tag as <gi>publisher</gi></item>
  <label>school</label>
  <item>used only for institutions at which thesis work is done; tag as
      <gi>publisher</gi></item>
  <label>series</label>
  <item>tag as <tag>title level="s"</tag> or <gi>title</gi> within
    <gi>series</gi></item>
  <label>title</label>
  <item>tag as <gi>title</gi> in appropriate context or with appropriate
      <att>level</att> value</item>
  <label>volume</label>
  <item>tag as <tag>biblScope unit="vol"</tag></item>
  <label>year</label>
  <item>tag as <gi>date</gi>; if the date is not in a trivially parseable form,
    use the <att>when</att> attribute to provide an ISO-format
    equivalent</item>
</list>
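As a minimal sketch of these equivalences, suppose that the journal article cited in an earlier example were held in a BibTeX database with the fields <mentioned>author</mentioned>, <mentioned>title</mentioned>, <mentioned>journal</mentioned>, <mentioned>volume</mentioned>, <mentioned>number</mentioned>, <mentioned>month</mentioned>, <mentioned>year</mentioned>, and <mentioned>pages</mentioned>. The assignment of values to fields shown in the comments is assumed for illustration. Following the list above, the record might be encoded as:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<biblStruct>
  <analytic>
    <!-- author -->
    <author>Coombs, James H.</author>
    <author>Renear, Allen H.</author>
    <author>DeRose, Steven J.</author>
    <!-- title -->
    <title level="a">Markup Systems and the Future of Scholarly
      Text Processing</title>
  </analytic>
  <monogr>
    <!-- journal -->
    <title level="j">Communications of the ACM</title>
    <imprint>
      <!-- volume and number -->
      <biblScope unit="vol">30</biblScope>
      <biblScope unit="issue">11</biblScope>
      <!-- month and year, with a normalized equivalent -->
      <date when="1987-11">November 1987</date>
      <!-- pages -->
      <biblScope unit="pp">933–947</biblScope>
    </imprint>
  </monogr>
</biblStruct>
</egXML>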
</p></div></div>

<div type="div2" xml:id="CODV"><head>Passages of Verse or Drama</head>
<p>The following elements are included in the core module for the
convenience of those encoding texts which include mixtures of prose,
verse and drama.
<specList><specDesc key="l"/><specDesc key="lg"/><specDesc key="sp"/><specDesc key="speaker"/><specDesc key="stage"/></specList>
 </p>
<p>Full details of other, more specialized, elements for the encoding of
texts which are predominantly verse or drama are described in the
appropriate chapter of part three (for verse, see the verse base
described in chapter <ptr target="#VE"/>; for performance texts, see the
drama base described in chapter <ptr target="#DR"/>).  In this section, we
describe only the elements listed above, all of which can appear in any
text, whichever of the three modes (prose, verse, or drama) may predominate
in it.
 </p>

<div type="div3" xml:id="COVE"><head>Core Tags for Verse</head>
<p>Like other written texts, verse texts or poems may be
hierarchically subdivided, for example into books or cantos. These
structural subdivisions should be encoded using the general purpose
<gi>div</gi> or <gi>div1</gi> (etc.) elements described below in
chapters <ptr target="#DS"/> and <ptr target="#VE"/>. The fundamental
unit of a verse text is the verse line rather than the paragraph,
however.</p>
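<p>A minimal sketch of this arrangement follows, assuming a poem divided into books. The values of <att>type</att> and <att>n</att> are illustrative, and the text of the book is abridged:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<div type="book" n="1">
  <!-- the division contains verse lines directly, not paragraphs -->
  <l>Of Mans First Disobedience, and the Fruit</l>
  <l>Of that Forbidden Tree, whose mortal tast</l>
  <l>Brought Death into the World, and all our woe,</l>
  <!-- ... -->
</div>
</egXML>
</p>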

<p>The <gi>l</gi> element is used to mark up verse lines, that is
metrical rather than typographic lines.  In some modern or free verse,
it may be hard to decide whether the typographic line is to be
regarded as a verse line or not, but the distinction is quite clear
for verse following regular metrical patterns. Where a metrical line is
interrupted by a typographic line break, the encoder may choose to
ignore the fact entirely or to use the empty <gi>lb</gi> (line break)
element discussed in <ptr target="#CORS"/>.  By convention, the start
of a metrical line implies the start of a typographic line; hence
there is no need to introduce an <gi>lb</gi> tag at the start of every
<gi>l</gi> element, but only at places where a new typographic line
starts within a metrical line, as in the following example:

<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#CO-eg-06">
<l>Of Mans First Disobedience, and<lb/> the Fruit</l>
<l>Of that Forbidden Tree, whose<lb/> mortal tast</l>
<l>Brought Death into the World,<lb/> and all our woe,</l>
<l>With loss of Eden, till one greater Man</l>
<l>Restore us, and regain the blissful Seat...</l>
</egXML>

In the original copy text, the presence of an ornamental capital at
the start of the poem means that the measure is not wide enough to
print the first three lines on three lines; instead each of these metrical lines occupies
two typographic lines, with a break at the point indicated. Note that
this encoding makes no attempt to preserve information about the
whitespace or indentation associated with either kind of line; if regarded
as essential, this information would be recorded using the
<att>rend</att> or <att>rendition</att> attributes discussed in <ptr target="#STGA"/>. </p><p>The <gi>l</gi> element should not be used to represent typographic
lines in non-verse materials: if the line-breaking points in a prose
text are considered important for analysis, they should be marked with
the <gi>lb</gi> element. Alternatively, a neutral segmentation element
such as <gi>seg</gi> or <gi>ab</gi> may be used; see further
discussion of these elements in chapter <ptr target="#SA"/>. The
<gi>l</gi> element is a member of the <ident type="class">model.lLike</ident> class, which is a subclass of the
<ident type="class">model.divPart</ident> class, along with elements
from the <ident type="class">model.pLike</ident> (paragraph-like)
class.</p>
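<p>For instance, if the line breaks of a printed prose speech were to be recorded, <gi>lb</gi> would be used inside the <gi>p</gi> element, rather than replacing the paragraph with a series of <gi>l</gi> elements. The following is a minimal sketch; the positions of the line breaks are assumed for illustration and are not taken from a particular copy:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<p>Good: Speake to th' Mariners: fall<lb/>
too't, yarely, or we run our selues a ground,<lb/>
bestirre, bestirre.</p>
</egXML>
</p>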

<p>In some verse forms, regular groupings of lines are regarded as units
of some kind, often identified by a regular verse scheme.  In stichic
verse and couplets, groups of lines analogous to paragraphs are often
indicated by indentation.  In other verse forms, lines are grouped into
irregular sequences indicated simply by whitespace.  The 
<gi>lg</gi> or line group element may be used to mark any such grouping
of elements from the <ident type="class">model.lLike</ident> class. As a member of the <ident type="class">att.typed</ident>
class, the <gi>lg</gi> element bears the following attributes:
<specList><specDesc key="att.typed" atts="type subtype"/></specList>
which may be used to further categorize the
line group where this is felt desirable, as in the following example.
This example also demonstrates the use of the <att>rend</att> attribute to indicate
whether or not a line is indented.

<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COVE-eg-285"><lg>
   <l>Come fill up the Glass,</l>
   <l rend="indent">Round, round let it pass,</l>
   <l>'Till our Reason be lost in our Wine:</l>
   <l rend="indent">Leave Conscience's Rules</l>
   <l rend="indent">To Women and Fools,</l>
   <l>This only can make us divine.</l>
</lg>
<lg n="Chorus" type="refrain">
   <l>Then a Mohock, a Mohock I'll be,</l>
   <l>No Laws shall restrain</l>
   <l>Our Libertine Reign,</l>
   <l>We'll riot, drink on, and be free.</l>
</lg></egXML>
 </p>
<p>For some kinds of analysis, it may be useful to identify different
kinds of line group within the same piece of verse. Such line groups
may self-nest, in much the same way as the un-numbered <gi>div</gi>
element described in chapter <ptr target="#DS"/>. For example:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COVE-eg-286"><lg type="sonnet">
  <lg type="octet">
    <l>Thus speaks the Muse, and bends her brow severe:—</l>
    <l>“Did I, <name>Lætitia</name>, lend my choicest lays,</l>
    <l>And crown thy youthful head with freshest bays,</l>
    <l>That all the' expectance of thy full-grown year</l>
    <l>Should lie inert and fruitless? O revere</l>
    <l>Those sacred gifts whose meed is deathless praise,</l>
    <l>Whose potent charms the' enraptured soul can raise</l>
    <l>Far from the vapours of this earthly sphere!</l>
  </lg>
  <lg type="sestet">
    <l>Seize, seize the lyre! resume the lofty strain!</l>
    <l>'T is time, 't is time! hark how the nations round</l>
    <l>With jocund notes of liberty resound,—</l>
    <l>And thy own <name>Corsica</name> has burst her chain!</l>
    <l>O let the song to <name>Britain's</name> shores rebound,</l>
    <l rend="indent(-1)">Where Freedom's once-loved voice is heard,
      alas! in vain.”</l>
  </lg>
</lg></egXML>
 </p>
<p>It is often the case that verse line boundaries conflict with the
boundaries of other structural elements. In the following example, the
single verse line <q>A Workeman in't... welcome</q> is interrupted by
a stage direction:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#CO-eg-07"><l>Thou fumblest <name>Eros</name>, and my Queenes a Squire</l>
<l>More tight at this, then thou:  Dispatch. O Loue,</l>
<l>That thou couldst see my Warres to day, and knew'st</l>
<l>The Royall Occupation, thou should'st see</l>
<l part="I">A Workeman in't. <stage>Enter an Armed Soldier.</stage>
</l>
<l part="F">Good morrow to thee, welcome. </l></egXML> 
In this encoding, the <att>part</att> attribute is used, as with
<gi>div</gi>,  to indicate that the last two <gi>l</gi> elements
should be regarded as the initial and final parts of a single line,
rather than as two lines.</p>

<p>The same technique may be used where verse lines are collected
together into units such as verse paragraphs:

<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#CO-eg-08"><lg n="6" type="para">
<!-- ... -->
<l>Unprofitably travelling toward the grave,</l>
<l>Like a false steward who hath much received</l>
<l part="I">And renders nothing back.</l></lg>
<lg type="para" n="7">
<l part="F">Was it for this</l>
<l>That one, the fairest of all rivers, loved</l>
<l>To blend his murmurs with my nurse's song,</l>
<!-- ... -->
</lg></egXML>

</p>
<p>The <att>part</att> attribute may also be attached to an <gi>lg</gi>
element to indicate that it is incomplete, for example because it forms
part of a group that is divided between two speakers, as in the
following example:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#CONONO-eg-189"><sp>
   <speaker>First Voice</speaker>
   <lg type="stanza" part="I">
      <l>But why drives on that ship so fast</l>
      <l>Withouten wave or wind?</l>
   </lg>
</sp>
<sp>
   <speaker>Second Voice</speaker>
   <lg type="stanza" part="F">
      <l>The air is cut away before,</l>
      <l>And closes from behind.</l>
   </lg>
</sp></egXML>

 </p>
<p>For alternative methods of aligning groups of lines which do not form
simple hierarchic groups, or which are discontinuous, see the more
detailed discussion in chapter <ptr target="#SA"/>.  For discussion of
other elements and attributes specific to the encoding of verse, see
chapter <ptr target="#VE"/>.
 </p>

<specGrp xml:id="DCOVE" n="Verse">

<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/l.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/lg.xml"/></specGrp>
</div>
<div type="div3" xml:id="CODR"><head>Core Tags for Drama</head><p>Like other written texts, dramatic and other <term>performance
texts</term> such as cinema or TV scripts are often hierarchically
organized, for example into acts and scenes.  These structural
subdivisions should be encoded using the general purpose <gi>div</gi>
or <gi>div1</gi> (etc.) elements described below in chapters <ptr target="#DS"/> and <ptr target="#DR"/>.  Within these divisions, the
body of a performance text typically consists of <term rend="noindex">speeches</term>, often prefixed by a phrase indicating
who is speaking, and occasionally interspersed with stage directions
of various kinds.  </p><p>In the following simple example, each speech consists of a single
paragraph:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#COVE-eg-285"><div2 n="I.2" type="scene">
   <head>Scene 2.</head>
   <stage type="setting">Peachum, Filch.</stage>
   <sp>
      <speaker>FILCH.</speaker>
      <p>Sir, Black Moll hath sent word her Trial comes on in
       the Afternoon, and she hopes you will order Matters
       so as to bring her off.</p>
   </sp>
   <sp>
      <speaker>PEACHUM.</speaker>
      <p>Why, she may plead her Belly at worst; to my 
        Knowledge she hath taken care of that Security.
        But, as the Wench is very active and industrious, 
        you may satisfy her that I'll soften the Evidence.</p>
   </sp>
   <sp>
      <speaker>FILCH.</speaker>
      <p>Tom Gagg, sir, is found guilty.</p>
   </sp>
</div2></egXML>

 </p>
<p>In the following example, each speech consists of a sequence of verse
lines, some of them being marked as metrically incomplete:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#CODR-eg-293"><div1 n="I" type="Act">
   <head>ACT I</head>
   <div2 n="1" type="Scene">
      <head>SCENE I</head>
      <stage rend="italic">Enter Barnardo and Francisco,
         two Sentinels, at several doors</stage>
      <sp><speaker>Barn</speaker>
         <l part="Y">Who's there?</l>
      </sp>
      <sp><speaker>Fran</speaker>
         <l>Nay, answer me. Stand and unfold yourself.</l>
      </sp>
      <sp><speaker>Barn</speaker>
         <l part="I">Long live the King!</l>
      </sp>
      <sp><speaker>Fran</speaker>
         <l part="M">Barnardo?</l>
      </sp>
      <sp><speaker>Barn</speaker>
         <l part="F">He.</l>
      </sp>
      <sp><speaker>Fran</speaker>
         <l>You come most carefully upon your hour.</l>
      </sp>
      <sp><speaker>Barn</speaker>
         <l>'Tis now struck twelve. Get thee to bed, Francisco.</l>
      </sp>
      <sp><speaker>Fran</speaker>
         <l>For this relief much thanks. 'Tis bitter cold,</l>
         <l part="I">And I am sick at heart.</l>
      </sp>
   </div2>
</div1></egXML>

 </p>
<p>In some cases, as here in the First Quarto of <title>Hamlet</title>,
the printed speaker attributions need to be supplemented by use of the
<att>who</att> attribute; again, the lines are marked as complete or
incomplete:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#CODR-eg-294"><stage>Enter two Centinels.
<add place="margin">Now call'd <name xml:id="barnardo">Bernardo</name> &amp; 
<name xml:id="francisco">Francesco</name>.</add></stage>
<sp who="#francisco"> <speaker>1.</speaker>
   <l part="Y">Stand: who is that?</l>
</sp>
<sp who="#barnardo"> <speaker>2.</speaker>
   <l part="Y">Tis I.</l>
</sp>
<sp who="#francisco"> <speaker>1.</speaker>
   <l>O you come most carefully vpon your watch,</l>
</sp>
<sp who="#barnardo"> <speaker>2.</speaker>
   <l>And if you meete Marcellus and Horatio,</l>
   <l>The partners of my watch, bid them make haste.</l>
</sp>
<sp who="#francisco"> <speaker>1.</speaker>
   <l part="Y">I will: See who goes there.</l>
</sp>
<stage>Enter Horatio and Marcellus.</stage>
</egXML>

 </p>
<p>By contrast with the preceding examples, the following encodes an
early printed edition without making any assumption about which parts
are prose or verse:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#CODR-eg-295"><div1 n="I" type="act">
   <div2 n="1" type="scene">
      <head rend="italic">Actus primus, Scena prima.</head>
      <stage rend="italic" type="setting">A tempestuous 
        noise of Thunder and Lightning heard: Enter
        a Ship-master, and a Boteswaine.</stage>
      <sp>
         <speaker>Master.</speaker> <p>Bote-swaine.</p>
      </sp>
      <sp>
         <speaker>Botes.</speaker> <p>Heere Master: What cheere?</p>
      </sp>
      <sp>
         <speaker>Mast.</speaker>
         <p>Good: Speake to th' Mariners: fall
           too't, yarely, or we run our selues a ground,
           bestirre, bestirre.  <stage type="move">Exit.</stage>
         </p>
      </sp>
      <stage type="move">Enter Mariners.</stage>
      <sp>
         <speaker>Botes.</speaker>
         <p>Heigh my hearts, cheerely, cheerely my harts: yare,
           yare: Take in the toppe-sale: Tend to th' Masters whistle:
           Blow till thou burst thy winde, if roome e-nough.</p>
      </sp>
   </div2>
</div1></egXML>

 </p>
<p>The <gi>sp</gi> and <gi>stage</gi> elements should also be used to
mark parts of a text otherwise in prose which are presented as if they
were dialogue in a play.  The following example is taken from a 19th-century
novel in which passages of narrative and passages of dialogue are
mixed within the same chapter:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#CODR-eg-296"><sp><speaker>The Reverend Doctor Opimian</speaker>
   <p>I do not think I have named a single unpresentable fish.</p>
</sp>
<sp><speaker>Mr Gryll</speaker>
   <p>Bream, Doctor: there is not much to be said for bream.</p>
</sp>
<sp><speaker>The Reverend Doctor Opimian</speaker>
   <p>On the contrary, sir, I think there is much to be said for him.  
     In the first place ...</p>
   <p>Fish, Miss Gryll — I could discourse to you on fish by the
     hour: but for the present I will forbear ...</p>
</sp></egXML>

<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#CODR-eg-296"><sp>
   <speaker>Lord Curryfin</speaker>
   <stage>(after a pause).</stage>
   <p><q>Mass</q> as the second grave-digger says
     in <title>Hamlet</title>, <q>I cannot tell.</q></p>
</sp>
<p>A chorus of laughter dissolved the sitting.</p></egXML>

 </p>

<specGrp xml:id="DCODR" n="Drama">
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/sp.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/speaker.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/stage.xml"/>
</specGrp>
</div></div>
 <div type="div2" xml:id="COOV"><head>Overview of the Core Module</head>
<p>All the elements described in this chapter
are provided by the
<ident type="module">core</ident> module.
<moduleSpec xml:id="DCO" ident="core">
<altIdent type="FPI">Common Core</altIdent>
<desc>Elements common to all TEI documents</desc>
<desc xml:lang="fr">Éléments disponibles pour tous les documents
TEI</desc>
<desc xml:lang="zh-TW">所有TEI文件所通用的元素</desc>
<desc xml:lang="it">Elementi comuni a tutti i documenti TEI</desc>
<desc xml:lang="pt">Elementos comuns a todos os documentos TEI</desc>
<desc xml:lang="ja">コアモジュール</desc></moduleSpec>
The selection and combination of modules to form a TEI schema is described in
<ptr target="#STIN"/>.
</p>
<specGrpRef target="#DCOPA"/>
<specGrpRef target="#DCOHQ"/>
<specGrpRef target="#DCONA"/>
<specGrpRef target="#DCONU"/>
<specGrpRef target="#DCODA"/>
<specGrpRef target="#DCOAB"/>
<specGrpRef target="#DCOEDC"/>
<specGrpRef target="#DCOEDR"/>
<specGrpRef target="#DCOEDA"/>
<specGrpRef target="#DCOAD"/>
<specGrpRef target="#DCOXR"/>
<specGrpRef target="#DCOLI"/>
<specGrpRef target="#DCONO"/>
<specGrpRef target="#DCOGR"/>
<specGrpRef target="#DCORSM"/>
<specGrpRef target="#DCOBI"/>
<specGrpRef target="#DCOVE"/>
<specGrpRef target="#DCODR"/>
 </div></div>
