<?xml version="1.0" encoding="utf-8"?>
<!--
Copyright TEI Consortium.
Dual-licensed under CC-by and BSD2 licences
See the file COPYING.txt for details.
$Date$
$Id$
-->
<?xml-model href="http://tei.oucs.ox.ac.uk/jenkins/job/TEIP5/lastSuccessfulBuild/artifact/P5/release/xml/tei/odd/p5.nvdl" type="application/xml" schematypens="http://purl.oclc.org/dsdl/nvdl/ns/structure/1.0"?>
<div xmlns="http://www.tei-c.org/ns/1.0" xmlns:xi="http://www.w3.org/2001/XInclude" type="div1" xml:id="HD" n="5">
<head>The TEI Header</head>
<p>This chapter addresses the problems of describing an encoded work
so that the text itself, its source, its encoding, and its revisions
are all thoroughly documented. Such documentation is equally necessary
for scholars using the texts, for software processing them, and for
cataloguers in libraries and archives. Together these descriptions and
declarations provide an electronic analogue to the title page attached
to a printed work. They also constitute an equivalent for the content
of the code books or introductory manuals customarily accompanying
electronic data sets.</p>
<p>Every TEI-conformant text must carry such a set of descriptions,
prefixed to it and encoded as described in this chapter.  The set is
known as the <term>TEI header</term>, tagged <gi>teiHeader</gi>,
and  has five major parts:
<list rend="numbered">
<item>a <term>file description</term>, tagged <gi>fileDesc</gi>,
containing a full bibliographical description of the computer file
itself, from which a user of the text could derive a proper
bibliographic citation, or which a librarian or archivist could use in
creating a catalogue entry recording its presence within a library or
archive.  The term <term>computer file</term> here is to be understood
as referring to the whole entity or document described by the header,
even when this is stored in several distinct operating system files.
The file description also includes information about the
source or sources from which the electronic document was derived.  The TEI
elements used to encode the file description are described in section <ptr target="#HD2"/>
 below.</item>
<item>an <term>encoding description</term>, tagged <gi>encodingDesc</gi>,
which describes the relationship between an electronic text and its
source or sources.  It allows for detailed description of whether (or
how) the text was normalized during transcription, how the encoder
resolved ambiguities in the source, what levels of encoding or analysis
were applied, and similar matters. The TEI elements used to encode the
encoding description are described in section <ptr target="#HD5"/> below.</item>
<item>a <term>text profile</term>, tagged <gi>profileDesc</gi>,
containing classificatory and contextual information about the text,
such as its subject matter, the situation in which it was produced, the
individuals described by or participating in producing it, and so forth.
Such a text profile is of particular use in highly structured composite
texts such as corpora or language collections, where it is often highly
desirable to enforce a controlled descriptive vocabulary or to perform
retrievals from a body of text in terms of text type or origin.  The
text profile may however be of use in any form of automatic text
processing.  The TEI elements used to encode the profile description
are described in section <ptr target="#HD4"/> below.</item>
<item>a container element, tagged <gi>xenoData</gi>, which allows easy
inclusion of metadata from non-TEI schemes (i.e., other than
elements in the TEI namespace). For example, the MARC record for the
encoded document might be included using MARCXML or MODS. A simple set
of metadata for harvesting might be included encoded in Dublin
Core.</item>
<item>a <term>revision history</term>, tagged <gi>revisionDesc</gi>,
which allows the encoder to provide a history of changes made during the
development of the electronic text.  The revision history is important
for <term>version control</term> and for resolving questions about the
history of a file. The TEI elements used to encode the revision
description are described in section <ptr target="#HD6"/> below.</item></list>
</p>
<p>A TEI header can be a very large and complex object, or it may be a
very simple one.  Some application areas (for example, the construction
of language corpora and the transcription of spoken texts) may require
more specialized and detailed information than others.  The present
proposals therefore define both a <term>core</term> set of elements
(all of which may be used without formality in any TEI header) and
some additional elements which become available within the header as
the result of including additional specialized modules within the
schema. When the module for language corpora (described in  chapter
<ptr target="#CC"/>) is in use, for example, several additional
elements are available, as further detailed in that chapter.
</p>
  <p>The next section of the present chapter briefly introduces the
  overall structure of the header and the kinds of data it may
  contain. This is followed by a detailed description of all the
  constituent elements which may be used in the core header. Section
  <ptr target="#HD7"/>, at the end of the present chapter, discusses
  the recommended content of a minimal TEI header and its relation to
  standard library cataloguing practices.</p>

<div type="div2" xml:id="HD1"><head>Organization of the TEI Header</head>

<div type="div3" xml:id="HD11"><head>The TEI Header and Its Components</head>
<p>The <gi>teiHeader</gi> element should be clearly distinguished 
from the <term>front matter</term> of the text itself (for which see
section <ptr target="#DSFRONT"/>). A composite text, such as a corpus or
collection, may contain several headers, as further discussed below. In
the general case, however, a TEI-conformant text will contain a single
<gi>teiHeader</gi> element, followed by a single <gi>text</gi> or
<gi>facsimile</gi> element, or both.
</p>
<p>The header element has the following description:
<specList><specDesc key="teiHeader" atts="type"/></specList>
</p>
<p>As discussed above, the <gi>teiHeader</gi> element has five principal
components:
<specList>
  <specDesc key="fileDesc"/>
  <specDesc key="encodingDesc"/>
  <specDesc key="profileDesc"/>
  <specDesc key="xenoData"/>
  <specDesc key="revisionDesc"/>
</specList>
</p>
<p>Of these, only the <gi>fileDesc</gi> element is required in all TEI
headers; the other four are optional. The <gi>fileDesc</gi> itself has
three mandatory components, as further discussed in <ptr
target="#HD2"/> below. The smallest possible valid TEI header thus
looks like this:
  <egXML xml:lang="und" xmlns="http://www.tei-c.org/ns/Examples"><teiHeader>
    <fileDesc>
      <titleStmt>
	<title><!-- title of the resource --></title>
      </titleStmt>
      <publicationStmt>
	<p><!-- Information about distribution of the resource --></p>
      </publicationStmt>
      <sourceDesc>
	<p><!-- Information about source from which the resource derives --></p>
      </sourceDesc>
    </fileDesc>
  </teiHeader></egXML></p>
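<p>By way of contrast, a header that uses all five components might be
organized as in the following schematic sketch. The comments are
placeholders only, and the ordering of <gi>encodingDesc</gi>,
<gi>profileDesc</gi>, and <gi>xenoData</gi> shown here simply follows
the order in which they are listed above:
  <egXML xml:lang="und" xmlns="http://www.tei-c.org/ns/Examples" valid="feasible"><teiHeader>
    <fileDesc>
      <!-- title statement, publication statement, and source description -->
    </fileDesc>
    <encodingDesc>
      <p><!-- how the text was transcribed and encoded --></p>
    </encodingDesc>
    <profileDesc>
      <!-- classificatory and contextual information -->
    </profileDesc>
    <xenoData>
      <!-- metadata from a non-TEI scheme, for example Dublin Core -->
    </xenoData>
    <revisionDesc>
      <change><!-- description of a change made to the file --></change>
    </revisionDesc>
  </teiHeader></egXML></p>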
<p>The content of the elements making up a TEI header may be given in
any language, not necessarily that of the text to which the header
applies, and not necessarily English. As elsewhere, the
<att>xml:lang</att> attribute should be used at an appropriate level
to specify the language.
In the following schematic example, an
English text has been given a French header:
  <egXML xml:lang="und" xmlns="http://www.tei-c.org/ns/Examples" valid="feasible">
<TEI>
  <teiHeader xml:lang="fr">
    <!-- ... -->
  </teiHeader>
  <text xml:lang="en">
    <!-- ... -->
  </text>
</TEI></egXML>
</p>
<p>In the case of language corpora or collections, it may be
desirable to record header information either at the level of the individual
components in the corpus or collection, or at the level of
the corpus or collection itself (more details
concerning the tagging of composite texts are given in chapter <ptr target="#CC"/>, which should be read in conjunction with the current
chapter). A corpus may thus take the form:
<egXML xmlns="http://www.tei-c.org/ns/Examples" valid="feasible">
<teiCorpus>
   <teiHeader>
     <!-- corpus-level metadata here -->
   </teiHeader>
   <TEI>
      <teiHeader>
	<!-- metadata specific to this text here -->
      </teiHeader>
      <text><!-- ... --></text>
   </TEI>
   <TEI>
      <teiHeader>
	<!-- metadata specific to this text here -->
      </teiHeader>
      <text><!-- ... --></text>
   </TEI>
</teiCorpus>
</egXML>
</p>
<specGrp xml:id="D221B">
  <!--&model.teiHeaderPart;-->
  <include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/teiHeader.xml"/>
  <specGrpRef target="#D222"/>
  <specGrpRef target="#D223"/>
  <specGrpRef target="#D225"/>
  <specGrpRef target="#D224"/>
  <specGrpRef target="#D227"/>
  <specGrpRef target="#D226"/>
</specGrp>
</div>

<div type="div3" xml:id="HD12"><head>Types of Content in the TEI Header</head>
<p>The elements occurring within the TEI header may contain several
types of content; the list below indicates how these types of
content are described in the sections that follow:
<list type="gloss">
  <label>free prose</label>
  <item>Most elements contain simple running prose at some level. Many
  elements may contain either prose (possibly organized into
  paragraphs) or more specific elements, which themselves contain
  prose. In this chapter's descriptions of element content, the phrase
  <mentioned>prose description</mentioned> should be understood to
  imply a series of paragraphs, each marked as a <gi>p</gi> element.
  The word <mentioned>phrase</mentioned>, by contrast, should be
  understood to imply character data, interspersed as need be with
  phrase-level elements, but not organized into paragraphs. For more
  information on paragraphs, highlighted phrases, lists, etc., see
  section <ptr target="#COPA"/>.</item>
  <label>grouping elements</label>
  <item>Elements whose names end with the suffix
  <mentioned>Stmt</mentioned> (e.g. <gi>editionStmt</gi>,
  <gi>titleStmt</gi>) and the <gi>xenoData</gi> element enclose a
  group of specialized elements recording some structured information.
  In the case of the bibliographic elements, the suffix
  <mentioned>Stmt</mentioned> is used in names of elements
  corresponding to the <soCalled>areas</soCalled> of the International
  Standard Bibliographic Description.<note place="bottom"> For more
  information on this highly influential family of standards, first
  proposed in 1969 by the International Federation of Library
  Associations, see <ptr
  target="http://www.ifla.org/VII/s13/pubs/isbd.htm"/>. On the
  relation between the TEI proposals and other standards for
  bibliographic description, see further section <ptr
  target="#HD8"/>.</note> In the case of the <gi>xenoData</gi>
  element, the specialized elements are not TEI elements, but rather
  come from some other metadata scheme. In most cases grouping
  elements may contain prose descriptions as an alternative to the set
  of specialized elements, thus allowing the encoder to choose whether
  the information concerned should be presented in a structured
  form or in prose.</item>
  <label>declarations</label>
  <item>Elements whose names end with the suffix
  <mentioned>Decl</mentioned> (e.g. <gi>tagsDecl</gi>,
  <gi>refsDecl</gi>) enclose information about specific encoding
  practices applied in the electronic text; often these practices are
  described in coded form. Typically, such information takes the form
  of a series of declarations, identifying a code with some more
  complex structure or description. A declaration which applies to
  more than one text or division of a text need not be repeated in the
  header of each such text or subdivision. Instead, the
  <att>decls</att> attribute of each text (or subdivision of the text)
  to which the declaration applies may be used to supply a
  cross-reference to it, as further described in section <ptr
  target="#CCAS"/>.</item>
  <label>descriptions</label>
  <item>Elements whose names end with the suffix
  <mentioned>Desc</mentioned> (e.g. <gi>settingDesc</gi>,
  <gi>projectDesc</gi>) contain a prose description, possibly, but not
  necessarily, organized under some specific headings by suggested
  sub-elements.</item>
</list></p></div>

<div><head>Model Classes in the TEI Header</head>
<p>The TEI header provides a very rich collection of metadata
categories, but makes no claim to be exhaustive. It is certainly the
case that individual projects may wish to record specialized metadata
which either does not fit within one of the predefined categories
identified by the TEI header or requires a more specialized element
structure than is proposed here. To overcome this problem, the encoder
may elect to define additional elements using the customization
methods discussed in <ptr target="#MD"/>. The TEI class system makes
such customizations simpler to effect and easier to use in
interchange.</p>
<p>These  classes are specific to parts of the header:
<specList>
  <specDesc key="model.applicationLike"/>
  <specDesc key="model.availabilityPart"/>
  <specDesc key="model.catDescPart"/>
  <specDesc key="model.editorialDeclPart"/>
  <specDesc key="model.encodingDescPart"/>
  <specDesc key="model.profileDescPart"/>
  <specDesc key="model.teiHeaderPart"/>
  <specDesc key="model.sourceDescPart"/>
  <specDesc key="model.textDescPart"/>
</specList>
</p>
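<p>As a minimal sketch of such a customization, a project might add an
element of its own to one of these classes. The element name
<ident>fundingNote</ident>, its namespace, and its description are
invented here purely for illustration; the mechanisms themselves are
described in <ptr target="#MD"/>:
<egXML xmlns="http://www.tei-c.org/ns/Examples" valid="feasible">
<elementSpec ident="fundingNote" ns="http://www.example.org/ns/project" mode="add">
  <desc>(hypothetical) records a note on project funding</desc>
  <classes>
    <memberOf key="model.encodingDescPart"/>
  </classes>
  <content>
    <textNode/>
  </content>
</elementSpec></egXML>
Because the new element is declared as a member of
<ident type="class">model.encodingDescPart</ident>, it becomes
available wherever members of that class are permitted, without any
need to modify the declaration of <gi>encodingDesc</gi> itself.</p>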
</div></div>

<div type="div2" xml:id="HD2"><head>The File Description</head>
<p>This section describes the <gi>fileDesc</gi> element, which is the
first component of the <gi>teiHeader</gi> element.</p>
<p>The bibliographic description of a machine-readable or digital text
resembles in structure that of a book, an article, or any other kind
of textual object. The file description element of the TEI header has
therefore been closely modelled on existing standards in library
cataloguing; it should thus provide enough information to allow users
to give standard bibliographic references to the electronic text, and
to allow cataloguers to catalogue it. Bibliographic citations
occurring elsewhere in the header, and also in the text itself, are
derived from the same model (on bibliographic citations in general,
see further section <ptr target="#COBI"/>). On the relation between
the TEI proposals and other standards for bibliographic description,
see section <ptr target="#HD8"/>.</p>
<p>The bibliographic description of an electronic text should be
supplied by the  mandatory <gi>fileDesc</gi> element:
<specList>
  <specDesc key="fileDesc"/>
</specList>
</p>
<p>The <gi>fileDesc</gi> element contains three mandatory elements and
four optional elements, each of which is described in more detail in
sections <ptr target="#HD21"/> to <ptr target="#HD27"/> below. These elements
are listed below in the order in which they must be given within the
<gi>fileDesc</gi> element.
<specList><specDesc key="titleStmt"/><specDesc key="editionStmt"/><specDesc key="extent"/><specDesc key="publicationStmt"/><specDesc key="seriesStmt"/><specDesc key="notesStmt"/><specDesc key="sourceDesc"/></specList>
</p>
<p>A complete file description containing all possible sub-elements
might look like this:
  <egXML xml:lang="und" xmlns="http://www.tei-c.org/ns/Examples"><teiHeader>
    <fileDesc>
         <titleStmt> <title><!-- title of the resource --></title> </titleStmt>
         <editionStmt> <p> <!-- information about the edition of the
				resource  --></p> </editionStmt>
         <extent> <!-- description of the size of the resource --></extent>
         <publicationStmt> <p><!-- information about the distribution
				   of the resource --> </p></publicationStmt>
         <seriesStmt> <p><!-- information about any series to which
			      the resource belongs  --></p></seriesStmt>
         <notesStmt> <note><!-- notes on other aspects of the resource --> </note></notesStmt>
         <sourceDesc> <p> <!-- information about the source from which
			       the resource was derived  --> </p></sourceDesc>
    </fileDesc>
</teiHeader></egXML>
Of these  elements, only the <gi>titleStmt</gi>,
<gi>publicationStmt</gi>, and <gi>sourceDesc</gi> are required; the
others  may be omitted unless considered useful.
</p>
<specGrp xml:id="D222" n="The file description"><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/fileDesc.xml"/>
<specGrpRef target="#D2221"/>
<specGrpRef target="#D2222"/>
<specGrpRef target="#D2223"/>
<specGrpRef target="#D2224"/>
<specGrpRef target="#D2225"/>
<specGrpRef target="#D2226"/></specGrp>

<div type="div3" xml:id="HD21"><head>The Title Statement</head>
<p>The <gi>titleStmt</gi> element is the first component of the
<gi>fileDesc</gi> element, and is mandatory:
<specList><specDesc key="titleStmt"/></specList>
It contains the title given to the electronic work, together with
one or more optional <term>statements of responsibility</term> which
identify the encoder, editor, author, compiler, or other parties responsible
for it:
<specList><specDesc key="title"/><specDesc key="author"/><specDesc key="editor"/><specDesc key="sponsor"/><specDesc key="funder"/><specDesc key="principal"/><specDesc key="respStmt"/><specDesc key="resp"/><specDesc key="name"/></specList>
</p>
<p>The <gi>title</gi> element contains the chief name of the
electronic work, including any alternative title or subtitles it may have.
It may be repeated, if the work has more than one title
(perhaps in different languages) and takes whatever form is
considered appropriate by its creator. Where the electronic work
is derived from an existing source text, it is strongly
recommended that the title for the former should be derived
from the latter, but clearly distinguishable from it, for example
by the addition of a phrase such as <q>: an electronic
transcription</q> or <q>a digital edition</q>. <!--
For example, do not call the computer file <q>A
Sanskrit-English Dictionary, based upon the St. Petersburg
Lexicons</q>.  Call it, rather, <q>Sanskrit-English Dictionary,
based upon the St. Petersburg Lexicons: a machine readable
transcription</q>.  If you wish to retain some or all of the
title of the source text in the title of the computer file, then
introduce one of the following phrases:
<list type="bullets">
<item>[title of source]:  a machine readable transcription.</item>
<item>[title of source]:  electronic edition.</item>
<item>A machine readable version of:  [title of source].</item></list>-->
This will distinguish the electronic work from the source text in
citations and in catalogues which contain descriptions of both types of
material.
</p>
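<p>For instance, a transcription of a French source that is also given
an English title might carry a title statement such as the following,
in which each title is distinguished from that of the source by an
added phrase. The choice of work and the wording of both titles are
illustrative assumptions only:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><titleStmt>
  <title xml:lang="fr">Les Fleurs du mal : une transcription électronique</title>
  <title xml:lang="en">The Flowers of Evil: an electronic transcription</title>
</titleStmt></egXML></p>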
<p>The electronic work will also have an external name (its
<soCalled>filename</soCalled> or <soCalled>data set name</soCalled>) or reference number on the
computer system where it resides at any time.  This name is likely to
change frequently, as new copies of the file are made on the computer
system.  Its form is entirely dependent on the particular computer
system in use and thus cannot always easily be transferred from one
system to another.  Moreover, a given work may be composed of many
files. For these reasons, these Guidelines strongly
recommend that such names should <emph>not</emph> be used as the
<gi>title</gi> for any electronic work.
</p>
<p>Helpful guidance on the formulation of useful descriptive titles in
difficult cases may be found in chapter 25 of <ptr target="#HD-BIBL-1"/>
or another national cataloguing code.
</p>
<p>The elements <gi>author</gi>, <gi>editor</gi>, <gi>sponsor</gi>, <gi>funder</gi>,
and <gi>principal</gi> are specializations of the more general
<gi>respStmt</gi> element. These elements are used to provide the
<term>statements of responsibility</term> which identify the person(s)
responsible for the intellectual or artistic content of an item and
any corporate bodies from which it emanates.
</p>
<p>Any number of such statements may occur within the title
statement.  At a minimum, identify the author of the text and (where
appropriate) the
creator of the file.  If the bibliographic description
is for a corpus, identify the creator of the corpus.  <!-- These
identifications are mandatory when applicable, though not enforceable by
the parser.-->  Optionally include also names of others involved in
the transcription or elaboration of the text, sponsors, and funding
agencies.  The name of the person responsible for physical data input
need not normally be recorded, unless that person is also intellectually
responsible for some aspect of the creation of the file.
</p>
<p>Where the person whose responsibility is to be documented is not an
author, sponsor, funding body, or principal researcher, the <gi>respStmt</gi>
element should be used.  This has two subcomponents: a <gi>name</gi>
element identifying a responsible individual or organization, and a
<gi>resp</gi> element indicating the nature of the responsibility.  No
specific recommendations are made at this time as to appropriate content
for the <gi>resp</gi>:  it should make clear the nature of the
responsibility concerned, as in the examples below.
</p>
<p>Names given may be personal names or corporate names.  Give all names
in the form in which the persons or bodies wish to be publicly cited.
This would usually be the fullest form of the name, including first
names.<note place="bottom">Agencies compiling catalogues of
machine-readable files are recommended to use available authority lists,
such as the Library of Congress Name Authority List, for all common
personal names.</note>
</p>
<p>Examples:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><titleStmt>
  <title>Capgrave's Life of St. John Norbert:  a
         machine-readable transcription</title>
  <respStmt> <resp>compiled by</resp> <name>P.J. Lucas</name> </respStmt>
</titleStmt></egXML>
<egXML xmlns="http://www.tei-c.org/ns/Examples"><titleStmt>
  <title>Two stories by Edgar Allan Poe: electronic version</title>
  <author>Poe, Edgar Allan (1809-1849)</author>
  <respStmt>
    <resp>compiled by</resp> <name>James D. Benson</name>
  </respStmt>
</titleStmt></egXML>
<egXML xmlns="http://www.tei-c.org/ns/Examples"><titleStmt>
  <title>Yogadarśanam (arthāt
         yogasūtrapūṭhaḥ):
         a digital edition.</title>
  <title>The Yogasūtras of Patañjali:
         a digital edition.</title>
  <funder>Wellcome Institute for the History of Medicine</funder>
  <principal>Dominik Wujastyk</principal>
  <respStmt><name>Wieslaw Mical</name>
        <resp>data entry and proof correction</resp>
  </respStmt>
  <respStmt><name>Jan Hajic</name>
            <resp>conversion to TEI-conformant markup</resp></respStmt>
</titleStmt></egXML>
</p><specGrp xml:id="D2221" n="The title statement"><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/titleStmt.xml"/><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/sponsor.xml"/><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/funder.xml"/><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/principal.xml"/>
</specGrp>
</div>

<div type="div3" xml:id="HD22"><head>The Edition Statement</head>
<p>The <gi>editionStmt</gi> element is the second component of the
<gi>fileDesc</gi> element.  It is optional but recommended.
<specList><specDesc key="editionStmt"/></specList>
It contains either phrases or more specialized elements identifying the
edition and those responsible for it:
<specList><specDesc key="edition"/><specDesc key="respStmt"/><specDesc key="name"/><specDesc key="resp"/></specList>
</p>
<p>For printed texts, the word <mentioned>edition</mentioned> applies
to the set of all the identical copies of an item produced from one
master copy and issued by a particular publishing agency or a group of
such agencies.  A change in the identity of the distributing body or
bodies does not normally constitute a change of edition, while a
change in the master copy does.
</p>
<p>For electronic texts, the notion of a <soCalled>master
copy</soCalled> is not entirely appropriate, since they are far more
easily copied and modified than printed ones; nonetheless the term
<mentioned>edition</mentioned> may be used for a particular state of a
machine-readable text at which substantive changes are made and fixed.
Synonymous terms used in these Guidelines are
<mentioned>version</mentioned>, <mentioned>level</mentioned>, and
<mentioned>release</mentioned>.  The words
<mentioned>revision</mentioned> and <mentioned>update</mentioned>, by
contrast, are used for minor changes to a file which do not amount to
a new edition.
</p><p>No simple rule can specify how <soCalled>substantive</soCalled>
changes have to be before they are regarded as producing a new
edition, rather than a simple update.  The general principle proposed
here is that the production of a new edition entails a significant
change in the intellectual content of the file, rather than its
encoding or appearance.  The addition of analytic coding to a text
would thus constitute a new edition, while automatic conversion from
one coded representation to another would not.  Changes relating to
the character code or physical storage details, corrections of
misspellings, simple changes in the arrangement of the contents and
changes in the output format do not normally constitute a new edition,
whereas the addition of new information (e.g. a linguistic analysis
expressed in part-of-speech tagging, sound or graphics, referential
links to external data sets) almost always does.
</p>
<p>Clearly, there will always be borderline cases and the matter is
somewhat arbitrary.  The simplest rule is: if you think that your file
is a new edition, then call it such.  An edition statement is optional
for the first release of a computer file; it is mandatory for
each later release, though this requirement cannot be enforced by the
parser.
</p>
<p>Note that <emph>all</emph> changes in a file considered significant, whether or not they are
regarded as constituting a new edition or simply a new revision, should
be independently noted in the revision description section of the file
header (see section <ptr target="#HD6"/>).
</p>
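<p>The following sketch illustrates the distinction, assuming a
hypothetical file to which part-of-speech tagging was added some time
after its first release. The new edition is named in the edition
statement, and both that change and an earlier minor correction are
noted in the revision description. All dates and descriptions are
invented for the purpose of illustration:
<egXML xmlns="http://www.tei-c.org/ns/Examples" valid="feasible"><teiHeader>
  <fileDesc>
    <!-- title statement -->
    <editionStmt>
      <edition n="2">Second edition, <date>June 2001</date></edition>
    </editionStmt>
    <!-- publication statement and source description -->
  </fileDesc>
  <revisionDesc>
    <change when="2001-06-15">Added part-of-speech tagging; released as
      second edition.</change>
    <change when="2000-11-02">Corrected misspellings; no new edition.</change>
  </revisionDesc>
</teiHeader></egXML></p>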
<p>The <gi>edition</gi> element should contain phrases describing the
edition or version, including the word <mentioned>edition</mentioned>,
<mentioned>version</mentioned>, or equivalent, together with a number or date,
or terms indicating difference from other editions such as <mentioned>new
edition</mentioned>, <mentioned>revised edition</mentioned> etc.  Any dates that
occur within the edition statement should be marked with the
<gi>date</gi> element.  The <att>n</att> attribute of the
<gi>edition</gi> element may be used as elsewhere to supply any formal
identification (such as a version number) for the edition.
</p>
<p>One or more <gi>respStmt</gi> elements may also be used to supply
statements of responsibility for the edition in question.  These may
refer to individuals or corporate bodies and can indicate functions such
as that of a reviser, or can name the person or body responsible for the
provision of supplementary matter, of appendices, etc., in a new
edition.  For further detail on the <gi>respStmt</gi> element,
see section <ptr target="#COBI"/>.</p>
<p>Some examples follow:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><editionStmt>
  <edition n="P2">Second draft, substantially
       extended, revised, and corrected.</edition>
</editionStmt></egXML>
<egXML xmlns="http://www.tei-c.org/ns/Examples"><editionStmt>
<edition>Student's edition, <date>June 1987</date></edition>
<respStmt>
  <resp>New annotations by</resp>
  <name>George Brown</name>
</respStmt>
</editionStmt></egXML>
</p>
<specGrp xml:id="D2222" n="The edition statement"><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/editionStmt.xml"/><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/edition.xml"/>
</specGrp>
</div>

<div type="div3" xml:id="HD23"><head>Type and Extent of File</head>
<p>The <gi>extent</gi> element is the third component of the
<gi>fileDesc</gi> element.  It is optional.
<specList><specDesc key="extent"/></specList>
</p>
<p>For printed books, information about the carrier, such as the kind of
medium used and its size, is of great importance in cataloguing
procedures.  The print-oriented rules for bibliographic description of
an item's medium and extent need some re-interpretation when applied to
electronic media.  An electronic file exists as a distinct entity quite
independently of its carrier and remains the same intellectual object
whether it is stored on a magnetic tape, a CD-ROM, a set of floppy disks,
or as a file on a mainframe computer.  Since, moreover, these Guidelines
are specifically aimed at facilitating transparent document storage and
interchange, any purely machine-dependent information should be
irrelevant as far as the file header is concerned.
</p>
<p>This is particularly true of information about <term>file-type</term>,
although library-oriented rules for cataloguing often distinguish two
types of computer file:  <q>data</q> and <q>programs</q>.  This
distinction is quite difficult to draw in some cases, such as
hypermedia or texts with built-in search and retrieval software.
</p>
<p>Although it is equally system-dependent, some measure of the size of
the computer file may be of use for cataloguing and other practical
purposes.  Because the measurement and expression of file size are
fraught with difficulties, only very general recommendations are
possible; the element <gi>extent</gi> is provided for this purpose.  It
contains a phrase indicating the size or approximate size of the
computer file in one of the following ways:
<list rend="bulleted">
<item>in bytes of a specified length (e.g. <q>4000 16-bit bytes</q>)</item>
<item>as falling within a range of categories, for example:
<list rend="bulleted">
<item>less than 1 Mb</item>
<item>between 1 Mb and 5 Mb</item>
<item>between 6 Mb and 10 Mb</item>
<item>over 10 Mb</item></list></item>
<item>in terms of any convenient logical units (for example,
words or sentences, citations, paragraphs)</item>
<item>in terms of any convenient physical units (for example,
blocks, disks, tapes)</item></list>
</p>
<p>The use of standard abbreviations for units of quantity is
recommended where applicable, here as elsewhere (see <ptr
target="http://physics.nist.gov/cuu/Units/binary.html"/>). </p>
<p>Examples:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><extent>between 1 and
2 Mb</extent>
<extent>4.2 MiB</extent>
<extent>4532 bytes</extent>
<extent>3200 sentences</extent>
<extent>Five 90 mm High Density Diskettes</extent></egXML>
</p>
<p>The <gi>measure</gi> element and its attributes may be used to supply
machine-tractable or normalised versions of the size or sizes given,
as in the following example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><extent>
<measure unit="MiB" quantity="4.2">About four megabytes</measure>
<measure unit="pages" quantity="245">245 pages of source
material</measure>
</extent></egXML>
Note that when more than one <gi>measure</gi> is supplied in a single
<gi>extent</gi>, the implication is that all the measurements apply to
the whole resource.</p>
<specGrp xml:id="D2223" n="The extent statement"><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/extent.xml"/>
</specGrp>
</div>

<div type="div3" xml:id="HD24"><head>Publication, Distribution, Licensing, etc.</head>
<p>The <gi>publicationStmt</gi> element is the fourth component of the
<gi>fileDesc</gi> element and is mandatory. Its function is to name the agency by which a resource is made available (for example, a publisher or distributor) and to supply any additional information about the way in which it is made available, such as licensing conditions, identifying numbers, etc.
<specList><specDesc key="publicationStmt"/></specList>
It may contain either a simple prose description organized as one or
more paragraphs, or the more specialised elements described below.
</p>
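<p>A minimal sketch of the prose form follows; the wording is invented
for illustration and does not describe any real resource:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><publicationStmt>
   <p>Unpublished working transcription, circulated privately among
     project members.</p>
</publicationStmt></egXML>
</p>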
<p>A structured publication statement must begin with one of the following elements:
<specList><specDesc key="publisher"/><specDesc key="distributor"/><specDesc key="authority"/></specList>
These elements form the <ident
type="class">model.publicationStmtPart.agency</ident> class; if the agency making the resource available is unknown, but other structured information about it is available, an explicit statement such as <soCalled>publisher unknown</soCalled> should be used.
</p>
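<p>For instance, where a place and date of release are known but the agency is not, the statement might be sketched as follows (the place and date are assumed purely for illustration):
<egXML xmlns="http://www.tei-c.org/ns/Examples"><publicationStmt>
     <!-- explicit statement standing in for the unknown agency -->
     <publisher>publisher unknown</publisher>
     <pubPlace>London</pubPlace> <date>1994</date></publicationStmt></egXML>
</p>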

<p>The <term>publisher</term> is the person or institution by whose
authority a given edition of the file is made public.  The
<term>distributor</term> is the person or institution from whom copies
of the text may be obtained.  Where a text is not considered formally
published, but is nevertheless made available for circulation by some
individual or organization, this person or institution is termed the
<term>release authority</term>.
</p>
<p>Whichever of these elements is chosen, it may be followed by one or more of the following elements, which together form the <ident
type="class">model.publicationStmtPart.detail</ident> class:
 <specList><specDesc key="pubPlace"/><specDesc key="address"/><specDesc key="idno" atts="type"/><specDesc key="availability" atts="status"/><specDesc key="date"/><specDesc key="licence"/></specList>
</p>
<p>Here is a simple example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><publicationStmt>
     <publisher>Oxford University Press</publisher>
     <pubPlace>Oxford</pubPlace> <date>1989</date>
     <idno type="ISBN">0-19-254705-4</idno>
     <availability><p>Copyright 1989, Oxford University Press</p></availability></publicationStmt></egXML>
</p>

<p>The <ident
type="class">model.publicationStmtPart.detail</ident> elements all supply additional information relating to the
publisher, distributor, or release authority immediately preceding them.
In the following example, Benson is identified as responsible for distribution of some resource at the date and place cited:
 <egXML xmlns="http://www.tei-c.org/ns/Examples"><publicationStmt>
    <authority>James D. Benson</authority>
    <pubPlace>London</pubPlace> <date>1994</date></publicationStmt></egXML>
</p>
<p>
A resource may have (for example) both a publisher and a distributor, or more than one publisher each using different identifiers for the same resource, and so on. For this reason, the sequence of at least one <ident type="class">model.publicationStmtPart.agency</ident> element followed by zero or more <ident type="class">model.publicationStmtPart.detail</ident> elements may be repeated as often as necessary. </p>
<p>The following example shows a resource published by one agency (Sigma Press) at one address and date, which is also distributed by another (Oxford Text Archive), with a specified identifier and a different date: <egXML xmlns="http://www.tei-c.org/ns/Examples"><publicationStmt>

  <publisher>Sigma Press</publisher>
  <address>
    <addrLine>21 High Street,</addrLine>
    <addrLine>Wilmslow,</addrLine>
    <addrLine>Cheshire M24 3DF</addrLine>
  </address>
  <date>1991</date>

  <distributor>Oxford Text Archive</distributor>
  <idno type="OTA">1256</idno>
  <availability>
    <p>Available with prior consent of depositor for
      purposes of academic research and teaching only.</p>
  </availability>
<date>1994</date>

</publicationStmt></egXML>
</p>
<p>The <gi>date</gi> element used within <gi>publicationStmt</gi> always refers to the date of publication, first distribution, or initial release.
If the text was created at some other date, this may be recorded using the <gi>creation</gi> element within the
<gi>profileDesc</gi> element. Other useful dates (such as dates of collection of data) may be given using a note in the <gi>notesStmt</gi> element.
</p>
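<p>The way these three locations divide the dates among them might be sketched as follows. All dates and the wording of the note are invented for illustration, and elided content is marked with an ellipsis:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fileDesc>
  <titleStmt>
    <title>…</title>
  </titleStmt>
  <publicationStmt>
    <distributor>Oxford Text Archive</distributor>
    <!-- date of first distribution -->
    <date when="1994">1994</date>
  </publicationStmt>
  <notesStmt>
    <!-- another useful date, recorded as a general note -->
    <note>Data collected between 1989 and 1990.</note>
  </notesStmt>
  <sourceDesc>
    <p>…</p>
  </sourceDesc>
</fileDesc>
<profileDesc>
  <!-- date at which the text itself was created -->
  <creation>
    <date when="1991">1991</date>
  </creation>
</profileDesc></egXML>
</p>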

<p>The <gi>availability</gi> element may be used, as above, to provide a simple prose statement of any restrictions concerning the distribution of the resource. Alternatively, a more formal statement of the licensing conditions applicable may be provided using the <gi>licence</gi> element:
  <egXML xmlns="http://www.tei-c.org/ns/Examples"><publicationStmt>
    <publisher>University of Victoria Humanities Computing and Media Centre</publisher>
    <pubPlace>Victoria, BC</pubPlace> <date>2011</date><availability status="restricted"><licence target="http://creativecommons.org/licenses/by-sa/3.0/">
      Distributed under a Creative Commons Attribution-ShareAlike 3.0 Unported License
    </licence></availability></publicationStmt></egXML>
Note here the use of the <att>target</att> attribute to point to a location from which the licence document itself may be obtained. Alternatively, the licence document may simply be contained within the <gi>licence</gi> element.</p>
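<p>A sketch of the second approach, in which the terms are given inside the <gi>licence</gi> element itself and no <att>target</att> is supplied, might look like this (the wording of the terms is invented for illustration):
  <egXML xmlns="http://www.tei-c.org/ns/Examples"><availability status="restricted">
    <licence>
      <p>This text may be copied and redistributed for non-commercial
        purposes, provided that this notice is retained.</p>
    </licence>
  </availability></egXML>
</p>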


<specGrp xml:id="D2224" n="The publication statement"><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/publicationStmt.xml"/><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/distributor.xml"/><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/authority.xml"/><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/idno.xml"/><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/availability.xml"/><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/licence.xml"/>
</specGrp>
</div>

<div type="div3" xml:id="HD26"><head>The Series Statement</head>
<p>The <gi>seriesStmt</gi> element is the fifth component of the
<gi>fileDesc</gi> element and is optional.
<specList><specDesc key="seriesStmt"/></specList>
</p>
<p>In bibliographic parlance, a <term>series</term> may be defined in
one of the following ways:
<list rend="bulleted">
<item>A group of separate items related to one another by the fact
that each item bears, in addition to its own title proper, a
collective title applying to the group as a whole.  The
individual items may or may not be numbered.</item>
<item>Each of two or more volumes of essays, lectures, articles,
or other items, similar in character and issued in sequence.</item>
<item>A separately numbered sequence of volumes within a series or
serial.</item></list>
The <gi>seriesStmt</gi> element may contain a prose description or one
or more of the following more specific elements:
<specList><specDesc key="title"/><specDesc key="idno"/><specDesc key="respStmt"/><specDesc key="resp"/><specDesc key="name"/></specList>
</p>
<p>The <gi>idno</gi> element may be used to supply any identifying number
associated with the item, including both standard numbers such as an
ISSN and particular issue numbers. (Arabic numerals separated by
punctuation are recommended for this purpose: <val>6.19.33</val>,
for example, rather than <val>VI/xix:33</val>). Its <att>type</att> attribute is used to categorize
the number further, taking the value <val>ISSN</val> for an ISSN for
example.
</p>
<p>Examples:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><seriesStmt>
    <title level="s">Machine-Readable Texts for the Study of
      Indian Literature</title>
    <respStmt> <resp>ed. by</resp> <name>Jan Gonda</name> </respStmt>
    <biblScope unit="vol">1.2</biblScope>
    <idno type="ISSN">0 345 6789</idno>
</seriesStmt></egXML>
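A series statement may instead be given as simple prose. In the following sketch the series title is reused from the example above, and the item described is hypothetical:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><seriesStmt>
    <p>Issued as an unnumbered item in the series Machine-Readable
      Texts for the Study of Indian Literature.</p>
</seriesStmt></egXML>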
<specGrp xml:id="D2225" n="The series statement"><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/seriesStmt.xml"/>
</specGrp></p></div>

<div type="div3" xml:id="HD27"><head>The Notes Statement</head>
<p>The <gi>notesStmt</gi> element is the sixth component of the
<gi>fileDesc</gi> element and is optional.  If used, it contains one or
more <gi>note</gi> elements, each containing a single piece of
descriptive information of the kind treated as <soCalled>general
notes</soCalled> in traditional bibliographic descriptions.
<specList><specDesc key="notesStmt"/><specDesc key="note"/></specList>
</p>
<p>Some information found in the notes area in conventional bibliography
has been assigned specific elements in these Guidelines; in particular
the following items should be tagged as indicated, rather than as
general notes:
<list rend="bulleted">
<item>the nature, scope, artistic form, or purpose of the file; also
the genre or other intellectual category to which it may belong:
e.g. <q>Text types:  newspaper editorials and reportage, science
fiction, westerns, and detective stories</q>. These should be
formally described within the <gi>profileDesc</gi> element
(section <ptr target="#HD4"/>).</item>
<item>an abstract or summary of the content of a document which has been supplied by the encoder because no such abstract forms part of the content of the source. This should be supplied in the <gi>abstract</gi> element within the <gi>profileDesc</gi> element (section <ptr target="#HD4"/>).</item>
<item>summary description providing a factual, non-evaluative
account of the subject content of the file: e.g. <q>Transcribes
interviews on general topics with native speakers of English in
17 cities during the spring and summer of 1963.</q> These should
also be formally described within the <gi>profileDesc</gi> element
(section <ptr target="#HD4"/>).</item>
<item>bibliographic details relating to the source or sources of
an electronic text: e.g. <q>Transcribed from the Norton
facsimile of the 1623 Folio</q>. These should be formally
described in the <gi>sourceDesc</gi> element
(section <ptr target="#HD3"/>).</item>
<item>further information relating to publication, distribution,
or release of the text, including sources from which the text
may be obtained, any restrictions on its use or formal terms on
its availability.  These should be placed in the appropriate
division of the <gi>publicationStmt</gi> element
(section <ptr target="#HD24"/>).</item>
<item>publicly documented numbers associated with the file:  e.g.
<q>ICPSR study number 1803</q> or <q>Oxford Text Archive text
number 1243</q>.  These should be placed in an <gi>idno</gi>
element within the appropriate division of the
<gi>publicationStmt</gi> element.  International Standard
Serial Numbers (ISSN), International Standard Book Numbers
(ISBN), and other internationally agreed upon standard numbers
that uniquely identify an item, should be treated in the same
way, rather than as specialized bibliographic notes.</item></list>
</p>
<p>Nevertheless, the <gi>notesStmt</gi> element may be used to record
potentially significant details about the file and its features, e.g.:
<list rend="bulleted">
<item>dates, when they are relevant to the content or condition of the
computer file:  e.g.  <q>manual dated 1983</q>,  <q>Interview wave I:
Apr. 1989; wave II: Jan. 1990</q></item>
<item>names of persons or bodies connected with the technical production,
administration, or consulting functions of the effort which produced
the file, if these are not named in statements of responsibility in
the title or edition statements of the file description:
e.g. <q>Historical commentary provided by Mark Cohen</q></item>
<item>availability of the file in an additional medium or information not
already recorded about the availability of documentation:
e.g. <q>User manual is loose-leaf in eleven paginated sections</q></item>
<item>language of work and abstract, if not encoded in the <gi>langUsage</gi>
element, e.g. <q>Text in English with
summaries in French and German</q></item>
<item>the unique name assigned to a serial by the International Serials
Data System (ISDS), if not encoded in an <gi>idno</gi></item>
<item>lists of related publications, either describing the source itself,
or concerned with the creation or use of the electronic work, e.g.
<q>Texts used in <ptr type="cit" target="#HD-BIBL-2"/></q></item>
</list></p>
<p>Each such item of information may be tagged using the
general-purpose <gi>note</gi> element, which is described in section
<ptr target="#CONO"/>. Groups of notes are contained within the
<gi>notesStmt</gi> element, as in the following example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><notesStmt>
  <note>Historical commentary provided by Mark Cohen.</note>
  <note>OCR scanning done at University of Toronto.</note>
</notesStmt></egXML>
There are advantages, however, to encoding such information with more
precise elements elsewhere in the TEI header, when such elements are
available. For example, the notes above might be encoded as follows:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
      <titleStmt>
        <title>…</title>
        <respStmt>
          <persName>Mark Cohen</persName>
          <resp>historical commentary</resp>
	</respStmt>
        <respStmt>
          <orgName>University of Toronto</orgName>
          <resp>OCR scanning</resp>
	</respStmt>
      </titleStmt></egXML>
<specGrp xml:id="D2226" n="The notes statement"><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/notesStmt.xml"/>
</specGrp>
</p>
</div>

<div type="div3" xml:id="HD3"><head>The Source Description</head>
<p>The <gi>sourceDesc</gi> element is the seventh and final component of
the <gi>fileDesc</gi> element.  It is a mandatory element and is used to
record details of the source or sources from which a computer file is
derived.  This might be a printed text or manuscript, another computer
file, an audio or video recording of some kind, or a combination of
these.  An electronic file may also have no source, if what is being
catalogued is an original text created in electronic form.
<specList><specDesc key="sourceDesc"/></specList>
 </p>
<p>The <gi>sourceDesc</gi> element may contain little more than a
simple prose description, or a brief note stating that the document
has no source:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><sourceDesc>
   <p>Born digital.</p>
</sourceDesc></egXML></p>
<p>Alternatively, it may contain elements drawn from the following
three classes:
<specList>
<specDesc key="model.biblLike"/>
<specDesc key="model.sourceDescPart"/>
<specDesc key="model.listLike"/>
</specList>
</p>
<p>These classes make available by default a range of ways of
providing bibliographic citations which specify the provenance of the
text.  For written or printed sources, the source may be described in
the same way as any other bibliographic citation, using one of the
following elements:
<specList>
<specDesc key="bibl"/>
<!--specDesc key="biblItem"/-->
<specDesc key="biblStruct"/>
<specDesc key="listBibl"/>
</specList>
These elements are described in more detail in section <ptr target="#COBI"/>.
Using them, a source might be described in very simple terms:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><sourceDesc>
   <bibl>The first folio of Shakespeare, prepared by
   Charlton Hinman (The Norton Facsimile, 1968)</bibl>
</sourceDesc></egXML>
or with more elaboration:
<egXML xmlns="http://www.tei-c.org/ns/Examples" xml:lang="fr"><sourceDesc>
  <biblStruct xml:lang="fr">
    <monogr>
      <author>Eugène Sue</author>
      <title>Martin, l'enfant trouvé</title>
      <title type="sub">Mémoires d'un valet de chambre</title>
      <imprint>
        <pubPlace>Bruxelles et Leipzig</pubPlace>
        <publisher>C. Muquardt</publisher>
        <date when="1846">1846</date>
      </imprint>
    </monogr></biblStruct>
</sourceDesc></egXML>
 </p>
<p>When the header describes a text derived from some pre-existing
TEI-conformant or other digital document, it may be simpler to use the
following element, which is designed specifically for documents
derived from texts which were <soCalled>born digital</soCalled>:
<specList>
<specDesc key="biblFull"/>
</specList>
For further discussion see
section <ptr target="#HD31"/>.</p>
<p>When the module for manuscript description is included in a schema,
the <ident type="class">model.biblLike</ident> class also makes available the following element:
<specList>
<specDesc key="msDesc"/>
</specList>
This element enables the encoder to record very detailed information about
one or more manuscript or analogous sources, as further discussed in
<ptr target="#MS"/>.</p>
<p>The <ident type="class">model.sourceDescPart</ident> class also makes
available additional elements when additional modules are
included. For example, when the <ident type="module">spoken</ident>
module is included, the
<gi>sourceDesc</gi> element may also include the following
special-purpose elements, intended for cases where an electronic text
is derived from a spoken text rather than a written one:
<specList><specDesc key="scriptStmt"/><specDesc key="recordingStmt"/></specList>
Full descriptions of these elements and their contents are given in
section <ptr target="#HD32"/>.</p>

<p>A single electronic text may be derived from multiple source
documents, in whole or in part. The <gi>sourceDesc</gi> may therefore
contain a <gi>listBibl</gi> element grouping together <gi>bibl</gi>,
<gi>biblStruct</gi>, or <gi>msDesc</gi> elements for each of the
sources concerned. It is also possible to repeat the
<gi>sourceDesc</gi> element in such a case.  The <att>decls</att>
attribute described in section <ptr target="#CCAS"/> may be used to
associate parts of the encoded text with the bibliographic element
from which it derives in either case. </p>
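<p>As a sketch of the first approach, the following fragment groups two
sources in a <gi>listBibl</gi> and links each division of the text to
its source by means of the <att>decls</att> attribute. The identifiers
<val>hdSrcFolio</val> and <val>hdSrcOther</val>, and the second
citation, are invented for illustration:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><sourceDesc>
  <listBibl>
    <bibl xml:id="hdSrcFolio">The first folio of Shakespeare, prepared by
    Charlton Hinman (The Norton Facsimile, 1968)</bibl>
    <bibl xml:id="hdSrcOther">A hypothetical second source, cited here
    only as a placeholder</bibl>
  </listBibl>
</sourceDesc>
<!-- … elsewhere, in the body of the text … -->
<div decls="#hdSrcFolio">
  <p>…</p>
</div>
<div decls="#hdSrcOther">
  <p>…</p>
</div></egXML>
</p>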

<p>The source description may also include lists of names, persons,
places, etc. when these are considered to form part of the source for
an encoded document. When such information is recorded using the
specialized elements discussed in the <ident type="module">namesdates</ident> module (<ptr target="#ND"/>), the class
<ident type="class">model.listLike</ident> makes available the
following elements to hold such information:
<specList>
<specDesc key="listNym"/>
<specDesc key="listOrg"/>
<specDesc key="listPerson"/>
<specDesc key="listPlace"/>
</specList>
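For example, assuming that the <ident type="module">namesdates</ident> module is included in the schema, a source description might pair a citation with a list of persons regarded as part of the source. The personal name in this sketch is invented:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><sourceDesc>
  <bibl>The first folio of Shakespeare, prepared by
   Charlton Hinman (The Norton Facsimile, 1968)</bibl>
  <listPerson>
    <person>
      <!-- hypothetical person, for illustration only -->
      <persName>Jane Example</persName>
    </person>
  </listPerson>
</sourceDesc></egXML>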
</p><specGrp xml:id="D223" n="The source description">
<!--&model.sourceDescPart;-->
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/sourceDesc.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/biblFull.xml"/><specGrpRef target="#D2231"/>
</specGrp></div>

<div type="div3" xml:id="HD31"><head>Computer Files Derived from Other Computer Files</head>
<p>If a computer file (call it B) is derived not from a
printed source but from another computer file (call it
A) which includes a TEI header, then the source text of
computer file B is another computer file, A.  The five sections
of A's file header will need to be incorporated into the new
header for B in slightly differing ways, as listed below:
<list type="gloss">
  <label>fileDesc</label>
  <item>A's file description should be copied into the
  <gi>sourceDesc</gi> section of B's file description, enclosed within
  a <gi>biblFull</gi> element.</item>
  <label>profileDesc</label>
  <item>A's <gi>profileDesc</gi> should be copied into B's, in
  principle unchanged; it may however be expanded by project-specific
  information relating to B.</item>
  <label>encodingDesc</label>
  <item>A's encoding practice may or (more likely) may not be the same
  as B's. Since the object of the encoding description is to define
  the relationship between the current file and its source, in
  principle only changes in encoding practice between A and B need be
  documented in B. The relationship between A and its source(s) is
  then only recoverable from the original header of A. In practice it
  may be more convenient to create a new complete
  <gi>encodingDesc</gi> for B based on A's.</item>
  <label>xenoData</label>
  <item>B is a new computer file, with a different source than A's
  source (namely, A). Thus it is unlikely that metadata from other
  schemes about A or its source can be copied wholesale to B, although
  there may be similarities.</item>
  <label>revisionDesc</label>
  <item>B is a new computer file, and should therefore have a new
  revision description. If, however, it is felt useful to include some
  information from A's <gi>revisionDesc</gi>, for example dates of
  major updates or versions, such information must be clearly marked
  as relating to A rather than to B.</item>
</list>
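Taken together, these recommendations suggest a header for B along the following lines. This is only a sketch: the dates and the prose are invented, elided content is marked with an ellipsis, and <gi>xenoData</gi> is omitted on the assumption that no external metadata applies to B:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><teiHeader>
  <fileDesc>
    <titleStmt>
      <title>…</title>
    </titleStmt>
    <publicationStmt>
      <p>…</p>
    </publicationStmt>
    <sourceDesc>
      <!-- A's file description, copied from A's header -->
      <biblFull>
        <titleStmt>
          <title>…</title>
        </titleStmt>
        <publicationStmt>
          <p>…</p>
        </publicationStmt>
      </biblFull>
    </sourceDesc>
  </fileDesc>
  <encodingDesc>
    <p>Changes in encoding practice between A and B: …</p>
  </encodingDesc>
  <profileDesc>
    <!-- copied from A's profileDesc, optionally expanded for B -->
  </profileDesc>
  <revisionDesc>
    <change when="2001-05-10">File B derived from file A.</change>
    <change when="1994-02-01">Relates to file A, not to B: major
      update of A.</change>
  </revisionDesc>
</teiHeader></egXML>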
This concludes the discussion of the <gi>fileDesc</gi> element and
its contents.
 </p></div>
</div>

<div type="div2" xml:id="HD5"><head>The Encoding Description</head>
<p>The <gi>encodingDesc</gi> element is the second major subdivision of
the TEI header.  It specifies the methods and editorial principles which
governed the transcription or encoding of the text in hand and may also
include sets of coded definitions used by other components of the
header.  Though not formally required, its use is highly recommended.
<specList>
  <specDesc key="encodingDesc"/>
</specList>
The encoding description may contain any combination of
paragraphs of text, marked up
using the <gi>p</gi> element, along with more specialized
elements taken from the <ident type="class">model.encodingDescPart</ident>
class. By default, this class makes available the following elements:
<specList>
  <specDesc key="projectDesc"/>
  <specDesc key="samplingDecl"/>
  <specDesc key="editorialDecl"/>
  <specDesc key="tagsDecl"/>
  <specDesc key="styleDefDecl"/>
  <specDesc key="refsDecl"/>
  <specDesc key="classDecl"/>
  <specDesc key="geoDecl"/>
  <specDesc key="schemaSpec"/>
</specList>Each of these elements is further described
in the appropriate section below. Other modules have the ability to
extend this class; examples are noted in section <ptr target="#HDENCOTH"/>.
</p>
<specGrp xml:id="D225">
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/encodingDesc.xml"/>
<specGrpRef target="#D2251"/>
<specGrpRef target="#D2252"/>
<specGrpRef target="#D2253"/>
<specGrpRef target="#DHD57"/>
<specGrpRef target="#D2254"/>
<specGrpRef target="#D2255"/>
<specGrpRef target="#DHDAPP"/>
</specGrp>

<div type="div3" xml:id="HD51"><head>The Project Description</head>
<p>The <gi>projectDesc</gi> element may be used to describe, in prose,
the purpose for which a digital resource was created, together with
any other relevant information concerning the process by which it was
assembled or collected.  This is of particular importance for corpora
or miscellaneous collections, but may be of use for any text, for
example to explain why one kind of encoding practice has been followed
rather than another.
<specList>
  <specDesc key="projectDesc"/>
</specList></p>
<p>For example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><encodingDesc>
   <projectDesc>
      <p>Texts collected for use in the
        Claremont Shakespeare Clinic, June 1990.</p>
   </projectDesc>
</encodingDesc></egXML></p><specGrp xml:id="D2251" n="The project description">
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/projectDesc.xml"/>
</specGrp>
</div>

<div type="div3" xml:id="HD52"><head>The Sampling Declaration</head>
<p>The <gi>samplingDecl</gi> element may be used to describe, in
prose, the rationale and methods used in selecting texts, or parts of
text, for inclusion in the resource.
<specList>
<specDesc key="samplingDecl"/>
</specList>
It should include information about such matters as
<list rend="bulleted">
<item>the size of individual samples</item>
<item>the method or methods by which they were selected</item>
<item>the underlying population being sampled</item>
<item>the object of the sampling procedure used</item></list>
but is not restricted to these.
        <!-- should attributes be used for sample size, etc? -LB -->
	<!-- if it would help automatic processing yes.  but i   -->
	<!-- doubt it.  so no.  -msm                             -->
<egXML xmlns="http://www.tei-c.org/ns/Examples"><samplingDecl>
   <p>Samples of 2000 words taken from the beginning of the text.</p>
</samplingDecl></egXML></p>
<p>It may also include a simple
description of any parts of the source text included or excluded.
<egXML xmlns="http://www.tei-c.org/ns/Examples"><samplingDecl>
  <p>Text of stories only has been transcribed. Pull quotes, captions,
    and advertisements have been silently omitted. Any mathematical
    expressions requiring symbols not present in the ISOnum or ISOpub
    entity sets have been omitted, and their place marked with a GAP
    element.</p>
</samplingDecl></egXML></p>
<p>A sampling declaration which applies to more than one text or
division of a text need not be repeated in the header of each such text.
Instead, the <att>decls</att> attribute of each text (or subdivision of
the text) to which the sampling declaration applies may be used to
supply a cross-reference to it, as further described in section <ptr target="#CCAS"/>.
</p>
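<p>A minimal sketch of this mechanism follows. The identifier
<val>hdSamp2000</val> is invented for illustration, and the content of
the text is elided:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><encodingDesc>
  <samplingDecl xml:id="hdSamp2000">
    <p>Samples of 2000 words taken from the beginning of the text.</p>
  </samplingDecl>
</encodingDesc>
<!-- … elsewhere, in a corpus or composite text … -->
<text decls="#hdSamp2000">
  <body>
    <p>…</p>
  </body>
</text></egXML>
</p>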
<specGrp xml:id="D2252" n="The sampling       declaration">
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/samplingDecl.xml"/></specGrp>
</div>

<div type="div3" xml:id="HD53">
<head>The Editorial Practices Declaration</head>
<p>The <gi>editorialDecl</gi> element is used to
provide details of the editorial practices applied during the encoding
of a text.
<specList>
<specDesc key="editorialDecl"/>
</specList>It may contain a prose description only, or one or more of a set of
specialized elements, members of the TEI <ident type="class">model.editorialDeclPart</ident> class. Where an encoder
wishes to record an editorial policy not covered by any of these elements, this may be
done by adding a new element to this class, using the mechanisms
discussed in chapter <ptr target="#MD"/>.</p><p>Some of these policy elements carry attributes to support automated
processing of certain well-defined editorial decisions; all of them
contain a prose description of the editorial principles adopted with
respect to the particular feature concerned.  Examples of the kinds of
questions which these descriptions are intended to answer are given in
the list below.<list type="gloss">
<label><gi>correction</gi></label>
<item>
<specList>
<specDesc key="correction" atts="status method"/>
</specList>
<p>
Was the text corrected during or after data capture?  If
so, were corrections made silently or are they marked using the
tags described in section <ptr target="#COED"/>?  What principles
have been adopted with respect to omissions, truncations, dubious
corrections, alternate readings, false starts, repetitions,
etc.?</p></item>
<label><gi>normalization</gi></label>
<item>
<specList>
<specDesc key="normalization" atts="source method"/>
</specList>
<p>
Was the text normalized, for example by regularizing any non-standard
spellings, dialect forms, etc.? If so, were normalizations performed
silently or are they marked using the tags described in section <ptr target="#COED"/>? What authority was used for the regularization?
Also, what principles were used when normalizing numbers to provide
the standard values for the <att>value</att> attribute described in
section <ptr target="#CONANU"/> and what format used for
them?</p></item>
<label><gi>quotation</gi></label>
<item>
<specList>
<specDesc key="quotation" atts="marks"/>
</specList>
<p>
How were quotation marks processed?  Are apostrophes and quotation
marks distinguished?  How?  Are quotation marks retained as content in
the text or replaced by markup?  Are there any special conventions
regarding for example the use of single or double quotation marks when
nested?  Is the file consistent in its practice or has this not been
checked? See section <ptr target="#COHQQ"/> for discussion of ways in which
quotation marks may be encoded.</p></item>
<label><gi>hyphenation</gi></label>
<item>
<specList>
<specDesc key="hyphenation" atts="eol"/>
</specList>
<p>
Does the encoding distinguish <soCalled>soft</soCalled> and
<soCalled>hard</soCalled> hyphens?
What principle has been adopted with respect to end-of-line hyphenation
where source lineation has not been retained?  Have soft hyphens been
silently removed, and if so what is the effect on lineation and
pagination? See section <ptr target="#COPU-2"/> for discussion of ways in which
hyphenation  may be encoded.
</p></item>
<label><gi>segmentation</gi></label>
<item>
<specList>
<specDesc key="segmentation"/>
</specList>
<p>
How is the text segmented?  If
<gi>s</gi> or <gi>seg</gi> segmentation units have been used to divide
up the text for analysis, how are they marked and how was the
segmentation arrived at?</p></item>
<label><gi>stdVals</gi></label>
<item>
<specList>
<specDesc key="stdVals"/>
</specList>
<p>In most cases, attributes bearing standardized values (such as the
<att>when</att> or <att>when-iso</att> attribute on dates) should
conform to a defined W3C or ISO
datatype. In cases where this is not appropriate, this element may be
used to describe the standardization methods underlying the values
supplied.  <!-- any standardized values
supplied for numeric values? If the <att>value</att> attribute
described in section <ptr target="#CONANU"/> has been used, in what
format are its values presented? If the <att>when</att>,
<att>when-iso</att>, or similar attributes described in sections <ptr
target="#CONADA"/> and <ptr target="#NDDATE"/> has been used, are the
values given supposed to match a subset of the formats
available?--></p></item>
<label><gi>interpretation</gi></label>
<item>
<specList>
<specDesc key="interpretation"/>
</specList>
<p>
Has any analytic or <soCalled>interpretive</soCalled> information
been provided—that is, information which is felt to be non-obvious,
or potentially contentious? If so, how was it generated?
How was it encoded? If feature-structure analysis has been used, are
<gi>fsdDecl</gi> elements (section <ptr target="#FD"/>)
present?</p></item>
<label><gi>punctuation</gi></label>
<item>
<specList>
<specDesc key="punctuation"/>
</specList>
<p>
How has the encoding of punctuation marks present in the original source been treated? For example, has it been normalised, or suppressed in favour of descriptive markup? If it has been retained, is it located within or around elements such as <gi>quote</gi> which are normally associated with quotations? </p></item>

</list></p><p>Any information about the editorial principles applied not falling
under one of the above headings should be recorded in a distinct list
of items.  Experience shows that a full record should be kept of
decisions relating to editorial principles and encoding practice,
both for future users of the text and for the project which produced
the text in the first instance. Some simple examples follow:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><editorialDecl>
  <segmentation><p><gi>s</gi> elements mark orthographic sentences and
  are numbered sequentially
  within their parent <gi>div</gi> element
</p></segmentation>
  <interpretation>
    <p>The part of speech analysis applied throughout section 4 was
      added by hand and has not been checked.</p>
  </interpretation>
  <correction>
    <p>Errors in transcription controlled by using the
      WordPerfect spelling checker.</p>
  </correction>
  <normalization source="http://szotar.sztaki.hu/webster/">
    <p>All words converted to Modern American spelling following
      Webster's 9th Collegiate dictionary.</p>
  </normalization>
  <quotation marks="all">
    <p>All opening quotation marks represented by entity reference
    <ident type="ge">odq</ident>; all closing quotation marks
    represented by entity reference <ident type="ge">cdq</ident>.</p>
  </quotation>
</editorialDecl></egXML>
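The attributes carried by some of these elements can record the same kind of decision in a form suited to automated processing. The following sketch assumes the attribute values documented in the specifications of the elements concerned; the prose descriptions are invented:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><editorialDecl>
  <!-- text checked at least once; corrections not marked -->
  <correction status="medium" method="silent">
    <p>The text has been proofread once; errors found were corrected
      without comment.</p>
  </correction>
  <!-- normalizations are represented using markup -->
  <normalization method="markup">
    <p>Non-standard spellings have been regularized, and each
      regularization is marked in the text.</p>
  </normalization>
  <!-- end-of-line hyphenation retained in some cases only -->
  <hyphenation eol="some">
    <p>End-of-line hyphens have been retained only where the word
      would also be hyphenated in mid-line.</p>
  </hyphenation>
  <!-- no quotation marks retained as content -->
  <quotation marks="none">
    <p>Quotation marks have been replaced by markup throughout.</p>
  </quotation>
</editorialDecl></egXML>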
</p><specGrp xml:id="D2253" n="The editorial practices declaration">
<!--&model.editorialDeclPart;-->
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/editorialDecl.xml"/><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/correction.xml"/><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/normalization.xml"/><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/quotation.xml"/><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/hyphenation.xml"/><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/segmentation.xml"/><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/stdVals.xml"/><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/interpretation.xml"/><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/punctuation.xml"/>
</specGrp>
<p>An editorial practices declaration which applies to more than one text
or division of a text need not be repeated in the header of each such
text.  Instead, the <att>decls</att> attribute of each text (or
subdivision of the text) to which it applies may
be used to supply a cross-reference to it, as further described in
section <ptr target="#CCAS"/>.
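A minimal sketch of this mechanism follows; the identifier
<val>edModern</val> and the content shown are hypothetical, and only
one division is assumed to follow the declared practice:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><editorialDecl xml:id="edModern">
  <normalization>
    <p>Spelling has been modernized throughout.</p>
  </normalization>
</editorialDecl>
<!-- elsewhere in the document -->
<div decls="#edModern">
  <p>...</p>
</div></egXML>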
	</p></div>

<div type="div3" xml:id="HD57"><head>The Tagging Declaration</head>
<p>The <gi>tagsDecl</gi> element is used to record
the following information about the tagging used within a particular
text:
<list rend="bulleted">
<item>the namespace to which elements appearing within the transcribed
text belong.</item>
<item>how often particular elements appear within the text, so that
a recipient can validate the integrity of a text during interchange.</item>
<item>any comment relating to the usage of particular elements not
specified elsewhere in the header.</item>
<item>a default rendition applicable to all instances
of an element.</item></list></p>
<p>This information is conveyed by the following elements:
<specList>
<specDesc key="rendition" atts="selector"/>
  <specDesc key="att.styleDef" atts="scheme schemeVersion"/>
<specDesc key="namespace"/>
<specDesc key="tagUsage"/>
</specList></p>
<p>The <gi>tagsDecl</gi> element consists of an optional sequence of
<gi>rendition</gi> elements, each of which must bear a unique
identifier, followed by an optional sequence of one or more
<gi>namespace</gi> elements, each of which contains a series of
<gi>tagUsage</gi> elements, up to one for each element type from that
namespace occurring within the associated <gi>text</gi> element. Note
that these <gi>tagUsage</gi> elements must be nested within a
<gi>namespace</gi> element, and cannot appear directly within the
<gi>tagsDecl</gi> element.</p>

<div xml:id="HD57-1"><head>Rendition</head>
<p>The <gi>rendition</gi> element allows the encoder to specify how
one or more elements are rendered in the original source in any of the
following ways:
<list rend="bulleted">
<item>using an informal prose description</item>
<item>using a standard stylesheet language such as CSS or
XSL-FO</item>
<item>using a project-defined formal language</item>
</list></p>
<p>One or more such specifications may be associated with elements of
a document in three ways:
<list rend="bulleted">
<item>the
<att>render</att> attribute of the appropriate <gi>tagUsage</gi>
element may be used to indicate a default rendition for all occurrences of the named
element</item>
  <item>the <att>selector</att> attribute on any <gi>rendition</gi> element may be used to select a collection of elements to which it applies</item>
<item>the global <att>rendition</att> attribute may be used on any
element to
indicate its rendition, overriding or complementing any supplied default value</item>
</list>
The global <att>rend</att> and <att>style</att> attributes may also be
used to describe the rendering of an element. See further <ptr
target="#STGAre"/>.
</p>

<p>For example, the following schematic shows how an encoder might
specify that all <gi>p</gi> elements are by default to be rendered using one
set of specifications identified as <val>style1</val>,
while <gi>hi</gi>
elements are to use a different set, identified as <val>style2</val>:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><tagsDecl>
   <rendition xml:id="style1">
      ... description of one default rendition here ...
   </rendition>
   <rendition xml:id="style2">
      ... description of another default rendition here ...
   </rendition>
   <namespace name="http://www.tei-c.org/ns/1.0">
   <tagUsage gi="p" render="#style1"> ... </tagUsage>
   <tagUsage gi="hi" render="#style2"> ... </tagUsage>
   </namespace>
</tagsDecl><!-- elsewhere in the document -->
<p>This paragraph, mostly rendered in style1, contains a few words
<hi>rendered in style2</hi></p>
<p rendition="#style2">This paragraph is all rendered in style2</p>
<p>This is back to style1</p>
</egXML></p>

<p>As noted above, the content of a <gi>rendition</gi> element may
describe the appearance of the source material using prose, a
project-defined formal language, or any standard languages such as the
Cascading Stylesheet Language (<ptr target="#CSS21"/>) or the XML
vocabulary for specifying formatting semantics which forms a part of
the W3C's Extensible Stylesheet Language (<ptr target="#XSL11"/>). A
<gi>styleDefDecl</gi> element (<ptr target="#HD57-1a"/>) may be
supplied within the <gi>encodingDesc</gi> to specify which of these
applies by default; it may be overridden for one or more specific
<gi>rendition</gi> elements using the <att>scheme</att> attribute.
</p>
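<p>As a sketch of such an override, the following fragment declares
CSS as the default language and then uses the <att>scheme</att>
attribute to supply one <gi>rendition</gi> as informal prose instead.
The identifiers <val>r-center</val> and <val>r-blackletter</val>, and
the descriptions themselves, are hypothetical:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><encodingDesc>
  <styleDefDecl scheme="css" schemeVersion="2.1"/>
  <tagsDecl>
    <!-- uses the default language (CSS) -->
    <rendition xml:id="r-center">text-align: center;</rendition>
    <!-- overrides the default for this rendition only -->
    <rendition xml:id="r-blackletter" scheme="free">Set in a black-letter typeface.</rendition>
  </tagsDecl>
</encodingDesc></egXML></p>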
  
  <p>In the above example, all <gi>p</gi> elements are styled in the same way, as are all <gi>hi</gi> elements, except where <att>rendition</att> is used to override the default at the element level. However, suppose that all paragraphs in the <gi>front</gi> of a text appear in a large font, with significant top and bottom margins, while paragraphs in the main <gi>body</gi> are in regular font size and have no top and bottom margins. This would be difficult to express using <gi>tagUsage</gi>; and it would be inefficient to have to provide repeated <att>rendition</att> attributes on every paragraph in one or the other context. Instead, we can use the <att>selector</att> attribute on <gi>rendition</gi> for a more elegant solution:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
  <rendition 
    scheme="css" 
    selector="front p">  
    font-size: 110%;
    margin-top: 0.5em;
    margin-bottom: 0.5em;
  </rendition>
</egXML>
<egXML xmlns="http://www.tei-c.org/ns/Examples">
  <rendition 
    scheme="css" 
    selector="body p">  
    font-size: 100%;
    margin-top: 0;
    margin-bottom: 0;
  </rendition>
</egXML>
    
It is recommended that in a single text, either <gi>rendition</gi>/<att>selector</att> or <gi>tagUsage</gi>/<att>render</att> be used; the combination of both is potentially confusing and harder to process.
    
</p>

<p>In the following extended example we consider how to capture the
appearance of a typical early 20th-century title page, such as that in the
following figure: <figure corresp="#HD-eg-swinb"><graphic url="Images/HDswinburne.jpg" height="400px"/></figure> Elements for the encoding of
the information on a title page are presented in <ptr target="#DSTITL"/>; here we consider how we might go about encoding
some of the visual information as well, using the <gi>rendition</gi>
element and its corresponding attributes.</p>
<p>First we define a rendition element for each aspect of the source
page rendition that we wish to retain. Details of CSS are given in <ptr target="#CSS21"/>; we use it here simply to  provide a vocabulary with
which to describe such aspects
as font size and style, letter and line spacing, colour, etc. Note
that the purpose of this encoding is to describe the original, rather
than specify how it should be reproduced, although the two are
obviously closely linked.
  <egXML xml:lang="und"
	 xmlns="http://www.tei-c.org/ns/Examples"><styleDefDecl scheme="css" schemeVersion="2.1"/>
<!-- ... -->
<tagsDecl>
  <rendition xml:id="center" >text-align: center;</rendition>
  <rendition xml:id="small" >font-size: small;</rendition>
  <rendition xml:id="large" >font-size: large;</rendition>
  <rendition xml:id="x-large" >font-size: x-large;</rendition>
  <rendition xml:id="xx-large" >font-size: xx-large;</rendition>
  <rendition xml:id="expanded" >letter-spacing: +3pt;</rendition>
  <rendition xml:id="x-space" >line-height: 150%;</rendition>
  <rendition xml:id="xx-space" >line-height: 200%;</rendition>
  <rendition xml:id="red" >color: red;</rendition>
</tagsDecl>             </egXML>
</p>
<p>The global <att>rendition</att> attribute can now be used to
specify on any element which of the above rendition features apply to
it. For example, a title page might be encoded as
follows:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#HD-eg-swinb"><titlePage>
    <docTitle rendition="#center #x-space">
        <titlePart><lb/>
            <hi rendition="#x-large">THE POEMS</hi>
            <lb/><hi rendition="#small">OF</hi>
            <lb/><hi rendition="#red #xx-large">ALGERNON CHARLES SWINBURNE</hi>
            <lb/><hi rendition="#large #xx-space">IN SIX VOLUMES</hi>
        </titlePart>
        <titlePart rendition="#xx-space"><lb/> VOLUME I.
        <lb/><hi rendition="#red #x-large">POEMS AND BALLADS</hi>
        <lb/><hi rendition="#x-space">FIRST SERIES</hi>
	</titlePart>
    </docTitle>
    <docImprint rendition="#center">
        <lb/><pubPlace rendition="#xx-space">LONDON</pubPlace>
        <lb/><publisher rendition="#red #expanded">CHATTO &amp; WINDUS</publisher>
        <lb/><docDate when="1904" rendition="#small">1904</docDate>
    </docImprint>
    </titlePage>
    </egXML></p>
<p>When CSS is used as the style definition language, the <att>scope</att>
attribute may be used to specify CSS pseudo-elements. These
pseudo-elements specify styling applicable to only a portion of the
given text. For example, the <code>first-letter</code>
pseudo-element defines styling to be applied to the first letter in the targeted
element, while the <code>before</code> and
<code>after</code> pseudo-elements can be used, often in conjunction with the
<code>content</code> property, to supply characters which must appear
before or after the element content to
make it more closely resemble the appearance of the source.</p>
<p>For example, assuming that a text has been encoded using the <gi>q</gi>
element to enclose passages in quotation marks, but the quotation
marks themselves have been routinely omitted from the encoding, a set of
renditions such as the following:
  <egXML xml:lang="und" xmlns="http://www.tei-c.org/ns/Examples">
<rendition xml:id="quoteBefore" scheme="css" scope="before">content:
'“';</rendition>
<rendition xml:id="quoteAfter" scheme="css" scope="after">content:
'”';</rendition>
</egXML>
might be used to predefine renditions <code>quoteBefore</code>
and <code>quoteAfter</code>, which apply to the <code>before</code> and <code>after</code> pseudo-elements respectively. Where a <gi>q</gi> element is actually
rendered in the source with initial and final quotation marks, it may
then be encoded as follows:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<q rendition="#quoteBefore #quoteAfter">Four score and seven years
ago...</q>
</egXML>
</p>
</div>

<div xml:id="HD57-2"><head>Tag Usage</head>
<p>As noted above, each <gi>namespace</gi> element, if present,
should contain up to one occurrence of a <gi>tagUsage</gi> element
for each element type from the given namespace that occurs within
the outermost <gi>text</gi> element associated with the
<gi>teiHeader</gi> in which it appears.<note place="bottom">In the case
of a TEI corpus (<ptr target="#CC"/>), a <gi>tagsDecl</gi> in a corpus
header will describe tag usage across the whole corpus, while one in
an individual text header will describe tag usage for the individual
text concerned.</note> The <gi>tagUsage</gi> element may be used to supply
a count of the number of occurrences of this element within the text,
which is given as the value of its <att>occurs</att> attribute. It may
also be used to hold any additional usage information, which is
supplied as running prose within the element itself.</p>
<p>For example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><tagUsage gi="hi" occurs="28">
    Used only to mark English words italicised in the copy text.
</tagUsage></egXML>
This indicates that the <gi>hi</gi> element appears a total of 28 times
in the <gi>text</gi> element in question, and that the encoder has used
it to mark italicised English words only.</p>
<p>The <att>withId</att> attribute may optionally be used to specify
how many of the occurrences of the element in question bear a value
for the global <att>xml:id</att> attribute, as in the following
example: <egXML xmlns="http://www.tei-c.org/ns/Examples"><tagUsage
gi="pb" occurs="321" withId="321"> Marks page breaks in the York
(1734) edition only </tagUsage></egXML> This indicates that the
<gi>pb</gi> element occurs 321 times, and that every occurrence bears
an identifier.</p>
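<p>Because <gi>tagUsage</gi> elements must be nested within a
<gi>namespace</gi> element, a complete declaration combining the two
examples above might look like the following sketch, which assumes,
purely for illustration, that both declarations describe the same text
and that all the elements counted belong to the TEI namespace:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><tagsDecl>
  <namespace name="http://www.tei-c.org/ns/1.0">
    <tagUsage gi="hi" occurs="28">
      Used only to mark English words italicised in the copy text.
    </tagUsage>
    <tagUsage gi="pb" occurs="321" withId="321">
      Marks page breaks in the York (1734) edition only
    </tagUsage>
  </namespace>
</tagsDecl></egXML></p>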

<p>The content of the <gi>tagUsage</gi> element is not amenable to
automatic processing. It should therefore not be used to hold
information for which provision is already made by other components of
the encoding description. A TEI-conformant document is not required to
provide any <gi>tagUsage</gi> elements or <att>occurs</att>
attributes, but if it does, then the counts provided must correspond
with the number of such elements present in the associated
<gi>text</gi>.

<specGrp xml:id="DHD57" n="Tag usage and rendition declarations">
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/tagsDecl.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/tagUsage.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/namespace.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude"
	 href="../../Specs/rendition.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude"
	 href="../../Specs/styleDefDecl.xml"/>

</specGrp></p></div>
</div>

<div xml:id="HD57-1a"><head>The Default Style Definition Language Declaration</head>
  <p>The content of  the <gi>rendition</gi> element, the value of its 
    <att>selector</att> attribute, and the value of the
<att>style</att> attribute are expressed using one of a small number
of formally defined style definition languages. For ease of processing, it is strongly
recommended to use a single such language throughout an encoding
project, although the TEI system permits a mixture. </p>
<p>The element <gi>styleDefDecl</gi>, a sibling of the
<gi>tagsDecl</gi> element, is used to supply the name of the
default style definition language. The name is supplied as the value
of the <att>scheme</att> attribute and may take any of the
following values:
<!-- why can't i do a specList for a valList? -->
<list type="gloss">
<label>free</label><item>Informal free text description</item>
<label>css</label><item>Cascading Stylesheet Language</item>
<label>xslfo</label><item>Extensible Stylesheet Language Formatting Objects</item>
<label>other</label><item>A user-defined formal description
language</item> </list> The <att>schemeVersion</att> attribute may be
  used to supply the precise version of the style definition language
  used, and the content of this element, if any, may
supply additional information.</p>
<p>When the <att>style</att> attribute is used, its value must always
be expressed using whichever default style definition language is in
force. If more than one occurrence of the <gi>styleDefDecl</gi> is
provided, there will be more than one default available, and the
<att>decls</att> attribute must be used to select which is applicable
in a given context, as discussed in section <ptr
target="#CCAS"/>. </p>
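<p>The following sketch illustrates the case in which two defaults are
declared and one of them is selected for a particular text by means of
the <att>decls</att> attribute. The identifiers <val>sdCss</val> and
<val>sdFree</val> and the text content are hypothetical:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><encodingDesc>
  <styleDefDecl xml:id="sdCss" scheme="css" schemeVersion="2.1"/>
  <styleDefDecl xml:id="sdFree" scheme="free"/>
</encodingDesc>
<!-- elsewhere in the document -->
<text decls="#sdFree">
  <body>
    <!-- style values here are informal prose, not CSS -->
    <p style="printed in a larger typeface than the rest of the page">...</p>
  </body>
</text></egXML></p>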
</div>

<div type="div3" xml:id="HD54"><head>The Reference System Declaration</head>
<p>The <gi>refsDecl</gi> element is  used to
document the way in which any standard referencing scheme built into
the encoding works.
<specList>
<specDesc key="refsDecl"/>
</specList>
 It may contain either  a series of prose paragraphs or the following specialized elements:
<specList>
<specDesc key="cRefPattern"/>
<specDesc key="refState"/>
<specDesc key="att.patternReplacement" atts="matchPattern replacementPattern"/>
</specList>
Note that not all possible referencing schemes are equally easily
supported by current software systems.  A choice must be made between
the convenience of the encoder and the likely efficiency of the
particular software applications envisaged, in this context as in many
others. For a more detailed discussion of referencing
systems supported by these Guidelines, see section <ptr target="#CORS"/> below.
	</p>
<p>A referencing scheme may be described in one of three ways using this
element:
<list rend="bulleted">
<item>as a prose description</item>
<item>as a series of pairs of regular expressions and XPaths</item>
<item>as a concatenation of sequentially organized
<term>milestone</term>s</item></list>
Each method is described in more detail below. Only one method can be
used within a single <gi>refsDecl</gi> element.</p>
<p>More than one <gi>refsDecl</gi> element can be included in the
header if more than one canonical reference scheme is to be used in
the same document, but the current proposals do not check for mutual
inconsistency. <!-- A reference declaration can only describe the
referencing system applicable to a single document; if therefore
several referencing systems are in use (as discussed in section <ptr target="#CORS"/>), a <gi>refsDecl</gi> element must be supplied for
each; the <att>doctype</att> attribute should be used to specify the
document type to which the declaration relates.--></p>

<div type="div4" xml:id="HD54P"><head>Prose Method</head>
<p>The referencing scheme may be specified within the <gi>refsDecl</gi>
by a simple prose description.  Such a description should indicate which
elements carry identifying information, and whether this information is
represented as attribute values or as content.  Any special rules about
how the information is to be interpreted when reading or generating a
reference string should also be specified here.  Because such a prose
description cannot be processed automatically, this method of
specifying the structure of a canonical reference system is
not recommended where references must be resolved by software.</p>
<p>For example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><refsDecl>
  <p>The <att>n</att> attribute of each text in this corpus carries a
  unique identifying code for the whole text. The title of the text is
  held as the content of the first <gi>head</gi> element within each
  text. The <att>n</att> attribute on each <gi>div1</gi> and
  <gi>div2</gi> contains the canonical reference for each such
  division, in the form 'XX.yyy', where XX is the book number in Roman
  numerals, and yyy the section number in arabic. Line breaks are
  marked by empty <gi>lb</gi> elements, each of which includes the
  through line number in Casaubon's edition as the value of its
  <att>n</att> attribute.</p>
  <p>The through line number and the text identifier uniquely identify
  any line. A canonical reference may be made up by concatenating the
  <att>n</att> values from the <gi>text</gi>, <gi>div1</gi>, or
  <gi>div2</gi> and calculating the line number within each part.</p>
</refsDecl></egXML></p></div>

<div type="div4" xml:id="HD54S">
<head>Search-and-Replace Method</head>
<p>This method often requires a significant investment of effort
initially, but permits extremely flexible addressing. For details, see
section <ptr target="#SACR"/>.
<?tei winita This section must be completely rewritten based on what is now #SACR. ?>
<specList>
  <specDesc key="cRefPattern"/>
</specList>
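As a brief illustration only, the following sketch pairs a regular
expression (<att>matchPattern</att>) with an XPath-based pointer
(<att>replacementPattern</att>) so that a reference such as
<q>Matthew 12:34</q> could be resolved. The division structure, the
<att>type</att> values, and the use of <att>n</att> to carry book,
chapter, and verse identifiers are all assumptions about a
hypothetical encoding:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><refsDecl>
  <!-- $1 = book, $2 = chapter, $3 = verse, as captured by the pattern -->
  <cRefPattern matchPattern="(.+) (.+):(.+)"
    replacementPattern="#xpath(//div[@type='book'][@n='$1']/div[@type='chapter'][@n='$2']/ab[@n='$3'])"/>
</refsDecl></egXML>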
</p></div>

<div type="div4" xml:id="HD54M"><head>Milestone Method</head>
<p>This method is appropriate when only <soCalled>milestone</soCalled>
tags (see section <ptr target="#CORS5"/>) are available to provide the
required referencing information. It does not provide any abilities
which cannot be mimicked by the search-and-replace referencing method
discussed in the previous section, but in the cases where it applies,
it provides a somewhat simpler notation.</p>
<p>A reference based on milestone tags concatenates the values specified
by one or more such tags.  Since each tag marks the point at which a
value changes, it may be regarded as specifying the <term>refState</term>
of a variable.  A reference declaration using this method therefore
specifies the individual components of the canonical reference as a
sequence of <gi>refState</gi> elements:
<specList>
<specDesc key="refState" atts="delim length"/>
<specDesc key="att.milestoneUnit" atts="unit"/>
</specList></p>
<p>For example, the reference <q>Matthew 12:34</q> might be thought of
as representing the state of three variables: the <ident>book</ident> variable
is in state <q>Matthew</q>; the <ident>chapter</ident> variable
is in state <q>12</q>, and the <ident>verse</ident> variable is
in state <q>34</q>. If milestone tagging has been used, there should be
a tag marking the point in the text at which each of the above
<soCalled>variables</soCalled> changes its state.<note place="bottom">On the
<gi>milestone</gi> tag itself, what are here referred to as
<soCalled>variables</soCalled> are identified by the combination of the
<att>ed</att> and <att>unit</att> attributes.</note>
To find <q>Matthew 12:34</q> therefore an application must scan left to
right through the text, monitoring changes in the state of each of these
three variables as it does so.  When all three are simultaneously in the
required state, the desired point will have been reached.  There may of
course be several such points.</p>
<p>The <att>delim</att> and <att>length</att> attributes specify how each
component of a canonical reference is delimited: <att>delim</att>
supplies the string which follows the component, and <att>length</att>
its fixed length.  The other attributes are used to determine which instances of
<gi>milestone</gi> tags in the text are to be checked for state-changes.
A state-change is signalled whenever a new <gi>milestone</gi> tag is
found with <att>unit</att> and, optionally, <att>ed</att> attributes
identical to those of the <gi>refState</gi> element in question.  The value
for the new state may be given explicitly by the <att>n</att> attribute
on the <gi>milestone</gi> element, or it may be implied, if the
<att>n</att> attribute is not specified.</p>
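<p>The <q>Matthew 12:34</q> case discussed above might be declared as
in the following sketch. The <att>unit</att> values
<val>chapter</val> and <val>verse</val> are assumed here for
illustration, as is the placement of the <gi>milestone</gi> tags:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><refsDecl>
  <refState unit="book" delim=" "/>
  <refState unit="chapter" delim=":"/>
  <refState unit="verse"/>
</refsDecl>
<!-- elsewhere in the document, each tag marking a change of state -->
<milestone unit="book" n="Matthew"/>
<!-- ... -->
<milestone unit="chapter" n="12"/>
<!-- ... -->
<milestone unit="verse" n="34"/></egXML></p>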
<p>For example, for canonical references in the form <mentioned>xx.yyy</mentioned> where
the <mentioned>xx</mentioned> represents the page number in the first edition, and
<mentioned>yyy</mentioned> the line number within this page, a reference system
declaration such as the following would be appropriate:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><refsDecl>
   <refState ed="first" unit="page" length="2" delim="."/>
   <refState ed="first" unit="line" length="3"/>
</refsDecl></egXML>
This implies that milestone tags of the form
<egXML xmlns="http://www.tei-c.org/ns/Examples"><milestone n="II" ed="first" unit="page"/>
<milestone ed="first" unit="line"/></egXML>
will be found throughout the text, marking the positions at which page
and line numbers change. Note that no value has been specified for the
<att>n</att> attribute on the second milestone tag above; this implies
that its value at each state change is monotonically increased.
For more detail on the use of milestone tags,
see section <ptr target="#CORS5"/>.<!-- was:  target='P2.232ms'> -->
	<!-- was:  target='CORSYS5' > -->
	<!-- following algorithm should also be moved to CREF --></p>
<p>The milestone referencing scheme, though conceptually simple, is not
supported by a generic XML parser.  Its use places a
correspondingly greater burden of verification and accuracy on the
encoder.</p>
 <specGrp xml:id="D2254" n="The reference scheme declaration">

<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/refsDecl.xml"/>

<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/cRefPattern.xml"/>
   <include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/att.patternReplacement.xml"/>

   <include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/prefixDef.xml"/>
   <include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/listPrefixDef.xml"/>

<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/refState.xml"/>
</specGrp>
<p>A reference system declaration which applies to more than one text or
division of a text need not be repeated in the header of each such text.
Instead, the <att>decls</att> attribute of each text (or subdivision of
the text) to which the declaration applies may be used to supply a
cross-reference to it, as further described in section <ptr target="#CCAS"/>.</p>
</div></div>

<div type="div3" xml:id="HD55"><head>The Classification Declaration</head>
<p>The <gi>classDecl</gi> element is used to group
together definitions or sources for any descriptive classification
schemes used by other parts of the header.  Each such scheme is
represented by a <gi>taxonomy</gi> element, which may contain either a
simple bibliographic citation, or a definition of the
descriptive typology concerned; the following elements are used
in defining a descriptive classification scheme:
<specList>
  <specDesc key="classDecl"/>
  <specDesc key="taxonomy"/>
  <specDesc key="category"/>
  <specDesc key="catDesc"/>
</specList>
The <gi>taxonomy</gi> element has two slightly different, but related,
functions. For well-recognized and documented public classification
schemes, such as Dewey or other published descriptive thesauri, it
contains simply a bibliographic citation indicating where a full
description of a particular taxonomy may be found.
<egXML xmlns="http://www.tei-c.org/ns/Examples">
  <taxonomy xml:id="DDC12">
    <bibl>
      <title>Dewey Decimal Classification</title>
      <edition>Abridged Edition 12</edition>
    </bibl>
    </taxonomy></egXML>
For less easily accessible schemes, the <gi>taxonomy</gi> element
contains a description of the taxonomy itself as well as an optional
bibliographic citation. The description consists of a number of
<gi>category</gi> elements, each defining a single category within the
given typology. The category is defined by the contents of a nested
<gi>catDesc</gi> element, which may contain either a phrase describing
the category, or any number of elements from the <ident
type="class">model.catDescPart</ident> class. When the corpus module
is included in a schema, this class provides the <gi>textDesc</gi>
element whose components allow the definition of a text type in terms
of a set of <soCalled>situational parameters</soCalled> (see further
section <ptr target="#CCAHTD"/>); if the corpus module is not included
in a schema, this class is empty and the <gi>catDesc</gi> element may
contain only plain text. </p>
<p>If the category is subdivided, each subdivision is represented by a
nested <gi>category</gi> element, having the same structure.
Categories may be nested to an arbitrary depth in order to reflect the
hierarchical structure of the taxonomy. Each <gi>category</gi> element
bears a unique <att>xml:id</att> attribute, which is used as the
target for <gi>catRef</gi> elements referring to it.
<egXML xmlns="http://www.tei-c.org/ns/Examples"><taxonomy xml:id="b">
  <bibl>Brown Corpus</bibl>
  <category xml:id="b.a">
    <catDesc>Press Reportage</catDesc>
    <category xml:id="b.a1"><catDesc>Daily</catDesc></category>
    <category xml:id="b.a2"><catDesc>Sunday</catDesc></category>
    <category xml:id="b.a3"><catDesc>National</catDesc></category>
    <category xml:id="b.a4"><catDesc>Provincial</catDesc></category>
    <category xml:id="b.a5"><catDesc>Political</catDesc></category>
    <category xml:id="b.a6"><catDesc>Sports</catDesc></category>
  </category>
  <category xml:id="b.d"><catDesc>Religion</catDesc>
    <category xml:id="b.d1"><catDesc>Books</catDesc></category>
    <category xml:id="b.d2"><catDesc>Periodicals and tracts</catDesc></category>
  </category>
</taxonomy></egXML></p>
<p>Linkage between a particular text and a category within such a
taxonomy is made by means of the <gi>catRef</gi> element within the
<gi>textClass</gi> element, as described in section <ptr target="#HD43"/>.  Where the taxonomy permits of classification along more
than one dimension, more than one category will be referenced by a
particular <gi>catRef</gi>, as in the following example, which
identifies a text with the sub-categories <q>Daily</q>, <q>National</q>,
and <q>Political</q> within the category <q>Press Reportage</q> as
defined above.
  <egXML xml:lang="und"
	 xmlns="http://www.tei-c.org/ns/Examples"><catRef
	 target="#b.a1 #b.a3 #b.a5"/></egXML></p>
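<p>For orientation, the following sketch shows where such a
<gi>catRef</gi> would typically sit within a header. The optional
<att>scheme</att> attribute is used here to point at the taxonomy
within which the categories are defined, on the assumption that the
taxonomy identified as <val>b</val> above is supplied in the
<gi>classDecl</gi> of the same header:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><profileDesc>
  <textClass>
    <catRef scheme="#b" target="#b.a1 #b.a3 #b.a5"/>
  </textClass>
</profileDesc></egXML></p>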
<p>A single <gi>category</gi> may contain more than one
<gi>catDesc</gi> child, when for example the category is described in
more than one language, as in the following example:
    <egXML xmlns="http://www.tei-c.org/ns/Examples" xml:lang="mul">
<category xml:id="lit">
<catDesc xml:lang="pl">literatura piękna</catDesc>
<catDesc xml:lang="en">fiction</catDesc>
<category xml:id="litProza">
<catDesc xml:lang="pl">proza</catDesc>
<catDesc xml:lang="en">prose</catDesc>
</category>
<category xml:id="litPoezja">
<catDesc xml:lang="pl">poezja</catDesc>
<catDesc xml:lang="en">poetry</catDesc>
</category>
<category xml:id="litDramat">
<catDesc xml:lang="pl">dramat</catDesc>
<catDesc xml:lang="en">drama</catDesc>
</category>
</category>
    </egXML>


</p>
<specGrp xml:id="D2255" n="The classification declaration"><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/classDecl.xml"/><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/taxonomy.xml"/><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/category.xml"/><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/catDesc.xml"/>
</specGrp>
</div>

<div type="div3" xml:id="HDGDECL"><head>The Geographic Coordinates Declaration</head>
    <p>The following element is provided to indicate (within the
    header of a document, or in an external location) that a
    particular coordinate notation, or a particular datum, has been
    employed in a text. The default notation
    is  a string containing two real numbers separated by
    whitespace, of which the first indicates latitude and the second
    longitude according to the 1984 World Geodetic System (WGS84).
    <specList><specDesc key="geoDecl" atts="datum"/></specList>
    </p>
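    <p>As a sketch, a header might declare two notations and a
    location in the text might then select between them using the
    <att>decls</att> attribute. The identifiers <val>geoWGS</val> and
    <val>geoOS</val>, the descriptive text, and the coordinates are
    illustrative only:
    <egXML xmlns="http://www.tei-c.org/ns/Examples"><geoDecl xml:id="geoWGS" datum="WGS84">World Geodetic System</geoDecl>
<geoDecl xml:id="geoOS" datum="OSGB36">Ordnance Survey</geoDecl>
<!-- elsewhere in the document -->
<location>
  <geo decls="#geoWGS">53.226658 -0.541254</geo>
  <geo decls="#geoOS">SK 97481 72155</geo>
</location></egXML></p>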
    <include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/geoDecl.xml"/>
  </div>

  <div type="div3" xml:id="HDSCHSPEC"><head>The Schema Specification</head>
    <p>The <gi>schemaSpec</gi> element contains a schema specification. When this element appears inside <gi>encodingDesc</gi>, it allows embedding of a schema inside a TEI header; alternatively, this element may be used in the <gi>body</gi> of an ODD document. The use of ODD files, and their relationship to schemas, is described in detail in <ptr target="#TD"/>.</p>

    <p>A <gi>schemaSpec</gi> element contains all the information needed to generate schemas for a particular TEI customization, and the ODD documentation elements, by reference to the TEI, are more succinct than the schemas derived from them. Therefore you may find it
    convenient to make a copy of the <gi>schemaSpec</gi> from your project's ODD document inside the <gi>teiHeader</gi> itself, in addition to
      supplying an external schema and/or ODD file; if the XML file becomes separated from its schema, the schema can be regenerated at
      any time using the information in the <gi>schemaSpec</gi> element. For example:

      <egXML xml:lang="en"
        xmlns="http://www.tei-c.org/ns/Examples"><encodingDesc>
          <!-- Other encoding description elements... -->
          <schemaSpec ident="myTEICustomization" docLang="en" prefix="tei_" xml:lang="en">
            <moduleRef key="core"/>
            <moduleRef key="tei"/>
            <moduleRef key="header"/>
            <moduleRef key="textstructure"/>
          </schemaSpec>
        </encodingDesc>
      </egXML>
    </p>
  </div>

<div type="div3" xml:id="HDAPP">
<head>The Application Information Element</head>
<p>It is sometimes convenient to store information relating to the
processing of an encoded resource within its header. Typical uses for
such information might be:
<list rend="bulleted">
<item> to allow an
  application to discover that it has previously
  opened or edited a file, and what version of itself was used
  to do that;</item>
<item> to show (through a date) which
  application last edited the file, to allow for diagnosis of
  any problems that might have been caused by that
  application;</item>
<item> to allow users to discover information
  about an application used to edit the file;</item>
<item>to allow
  the application to declare an interest in elements of
  the file which it has edited, so that other applications
  or human editors may be more wary of making changes to
  those sections of the file.</item>
</list> </p><p>The class <ident type="class">model.applicationLike</ident>
provides an element, <gi>application</gi>, which may be used to record
such information within the <gi>appInfo</gi> element.<specList>
<specDesc key="appInfo"/>
<specDesc key="application" atts="ident version"/>
</specList>
</p><p>Each <gi>application</gi> element identifies the current state of
one software application with regard to the current file.  This
element is a member of the <ident type="class">att.datable</ident>
class, which provides a variety of attributes for associating this
state with a date and time, or a temporal range. The <att>ident</att>
and <att>version</att> attributes should be used to uniquely identify
the application and its major version number (for example,
<val>ImageMarkupTool 1.5</val>). It is not intended that an
application should add a new <gi>application</gi> each time it touches
the file.</p>
<p>The following example shows how these elements might be used to
document the fact that version 1.5 of an application called <soCalled>Image Markup
Tool</soCalled> has an interest in two parts of a document
which was last saved no later than June 1 2006. The parts concerned are
accessible at the URLs given as the value of the <att>target</att> attribute on the two <gi>ptr</gi>
elements.     <egXML xmlns="http://www.tei-c.org/ns/Examples">
      <appInfo>
	<application version="1.5" ident="ImageMarkupTool" notAfter="2006-06-01">
	  <label>Image Markup Tool</label>
	  <ptr target="#P1"/>
	  <ptr target="#P2"/>
	</application>
      </appInfo>
    </egXML>
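    The following further sketch is illustrative only: the second application, its
    identifier <val>HypotheticalCollator</val>, and the pointer target <val>#P3</val> are
    invented for the purpose. It shows a header recording one <gi>application</gi>
    element for each of two applications, each dated by means of an attribute
    provided by the <ident type="class">att.datable</ident> class:
    <egXML xmlns="http://www.tei-c.org/ns/Examples">
      <appInfo>
        <!-- state of the first application, as in the example above -->
        <application version="1.5" ident="ImageMarkupTool" notAfter="2006-06-01">
          <label>Image Markup Tool</label>
          <ptr target="#P1"/>
          <ptr target="#P2"/>
        </application>
        <!-- hypothetical second application; name, version, and date are assumed -->
        <application version="2.0" ident="HypotheticalCollator" when="2006-07-15">
          <label>Hypothetical Collation Tool</label>
          <ptr target="#P3"/>
        </application>
      </appInfo>
    </egXML>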
   </p><specGrp xml:id="DHDAPP"><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/appInfo.xml"/><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/application.xml"/><!--&model.applicationLike;-->
</specGrp></div>

<div xml:id="HDENCOTH"><head>Module-Specific Declarations</head>
<p>The elements discussed so far are available to any
schema. When the schema in use includes some of the more specialized
TEI modules, these make available other more module-specific
components of the encoding description. These are discussed fully in
the documentation for the module in question, but are also noted
briefly here for convenience.
</p><p>The <gi>fsdDecl</gi> element is available only when the <ident type="module">iso-fs</ident> module is included in a schema. Its
purpose is to document the <term>feature system declaration</term> (as
defined in chapter <ptr target="#FD"/>) underlying any analytic
<term>feature structures</term> (as defined in chapter <ptr target="#FS"/>) present in the text documented by this header. </p><p>The <gi>metDecl</gi> element is available only when the <ident type="module">verse</ident> module is included in a schema. Its
purpose is to document any metrical notation scheme used in the text,
as further discussed in section <ptr target="#VEME"/>. It consists
either of a prose description or a series of <gi>metSym</gi>
elements.</p><p>The <gi>variantEncoding</gi> element is available only when the
<ident type="module">textcrit</ident> module is included in a
schema. Its purpose is to document the method used to encode textual
variants in the text, as discussed in section <ptr target="#TCAPLK"/>.</p>
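<p>By way of illustration only, an encoding description for a text which
uses both the <ident type="module">verse</ident> and the
<ident type="module">textcrit</ident> modules might contain declarations
such as the following. The metrical symbols and their glosses are assumed
for the purposes of the example; the chapters cited above should be
consulted for the full range of options.
<egXML xmlns="http://www.tei-c.org/ns/Examples">
  <encodingDesc>
    <!-- Other encoding description elements... -->
    <!-- assumed metrical notation: S and U are illustrative symbols -->
    <metDecl type="met">
      <metSym value="S">stressed syllable</metSym>
      <metSym value="U">unstressed syllable</metSym>
    </metDecl>
    <!-- variants are encoded in-line using parallel segmentation -->
    <variantEncoding method="parallel-segmentation" location="internal"/>
  </encodingDesc>
</egXML>
</p>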
</div>
</div>

<div type="div2" xml:id="HD4"><head>The Profile Description</head>
<p>The <gi>profileDesc</gi> element is the third major subdivision of
the TEI header. It is an optional element, the purpose of which is to
enable information characterizing various descriptive aspects of a text
or a corpus to be recorded within a single unified framework.
<specList>
<specDesc key="profileDesc"/>
</specList>
In principle, almost any component of the header might be of
importance as a means of characterizing a text.  The author of a written
text, its title or its date of publication, may all be regarded as
characterizing it at least as strongly as any of the parameters
discussed in this section.  The rule of thumb applied has been to
exclude from discussion here most of the information which generally
forms part of a standard bibliographic style description,
if only because such information has already been included
elsewhere in the TEI header.
</p>
<p>The <gi>profileDesc</gi> element contains an optional
<gi>creation</gi> element, followed by any number of additional
elements taken from the <ident type="class">model.profileDescPart</ident> class. The
default members of this class are the following:
<specList>
  <specDesc key="abstract"/>
  <specDesc key="creation"/>
  <specDesc key="langUsage"/>
  <specDesc key="textClass"/>
  <specDesc key="calendarDesc"/>
</specList>
These elements are further described in the remainder of this section.</p>
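<p>As a minimal sketch, a <gi>profileDesc</gi> combining several of
these elements might look like the following. The particular values are
assumed for illustration, and the <gi>catRef</gi> presupposes that a
category with the identifier <val>b.a4</val> is defined elsewhere in the
header.
<egXML xmlns="http://www.tei-c.org/ns/Examples">
  <profileDesc>
    <!-- when and where the text was composed -->
    <creation>
      <date when="1992-08">August 1992</date>
    </creation>
    <!-- languages represented in the text -->
    <langUsage>
      <language ident="en-GB">British English</language>
    </langUsage>
    <!-- classification by reference to a category defined elsewhere -->
    <textClass>
      <catRef target="#b.a4"/>
    </textClass>
  </profileDesc>
</egXML>
</p>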
<p>When the <ident type="module">corpus</ident>
module described in chapter <ptr target="#CC"/> is included in a
schema, three further elements become available  within the
<gi>profileDesc</gi> element:
<specList>
<specDesc key="textDesc"/>
<specDesc key="particDesc"/>
<specDesc key="settingDesc"/>
</specList>
For descriptions of these elements, see section <ptr
target="#CCAH"/>.</p>
<p>When the <ident type="module">transcr</ident> module for the
transcription of primary sources described in chapter <ptr
target="#PH"/> is included in a schema, the following element becomes
available within the <gi>profileDesc</gi> element:
<specList>
<specDesc key="handNotes"/>
</specList>
For a description of this element, see section <ptr target="#PHDH"/>. Its purpose is to group together a number of
<gi>handNote</gi> elements, each of which describes a different hand
or equivalent identified within a manuscript. The
<gi>handNote</gi> element can also appear within a structured manuscript
description,  when the <ident type="module">msdescription</ident>
module described in chapter <ptr target="#MS"/> is included in a
schema. For this reason, the <gi>handNote</gi> element is actually
declared within the header module, but is only accessible to a schema
when one or other of the <ident type="module">transcr</ident> or <ident type="module">msdescription</ident> modules is included in a
schema. See further the discussion at <ptr target="#PHDH"/>.  </p>
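<p>A hedged sketch of how <gi>handNotes</gi> might group two
<gi>handNote</gi> elements follows. The identifiers, attribute values,
and descriptions are assumed for illustration; section <ptr target="#PHDH"/>
remains the authoritative account.
<egXML xmlns="http://www.tei-c.org/ns/Examples">
  <handNotes>
    <!-- the hand responsible for most of the writing -->
    <handNote xml:id="H1" scope="major" medium="brown-ink">The main hand, a
      cursive script in brown ink.</handNote>
    <!-- a second hand, used only occasionally -->
    <handNote xml:id="H2" scope="minor" medium="pencil">Later marginal
      annotations in pencil.</handNote>
  </handNotes>
</egXML>
</p>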
<specGrp xml:id="D224" n="The profile description">
<!--&model.profileDescPart;-->
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/profileDesc.xml"/><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/handNote.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/abstract.xml"/>
  <specGrpRef target="#D224C"/>
  <specGrpRef target="#D2241"/>
  <specGrpRef target="#D2243"/>
  <specGrpRef target="#D2242"/>
</specGrp>

<div type="div3" xml:id="HD4C"><head>Creation</head>
<p>The <gi>creation</gi> element contains phrases describing the
origin of the text, e.g. the date and place of its composition.
<specList>
<specDesc key="creation"/>
</specList>
The date and place of composition are often of particular importance for
studies of linguistic variation; since such information cannot be
inferred with confidence from the bibliographic description of the copy
text, the <gi>creation</gi> element may be used to provide a consistent
location for this information:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><creation>
   <date when="1992-08">August 1992</date>
   <rs type="city">Taos, New Mexico</rs>
</creation></egXML></p>
<specGrp xml:id="D224C" n="Creation">
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/creation.xml"/>
</specGrp>
</div>

<div type="div3" xml:id="HD41"><head>Language Usage</head>
<p>The <gi>langUsage</gi> element is used within the
<gi>profileDesc</gi> element to describe the languages, sublanguages,
registers, dialects, etc. represented within a text.  It contains one
or more <gi>language</gi> elements, each of which provides information
about a single language, notably the quantity of that language present
in the text.  Note that this
element should <emph>not</emph> be used to supply information about
any non-standard characters or glyphs used by this language; such
information should be recorded in the <gi>charDecl</gi> element in
the encoding description (see further <ptr target="#WD"/>).
<specList>
<specDesc key="langUsage"/>
<specDesc key="language" atts="usage ident"/>
</specList></p>
<p>A <gi>language</gi> element may be supplied for each different
language used in a document. If used, its <att>ident</att> attribute
should specify an appropriate language identifier, as further
discussed in section <ptr target="#CHSH"/>. This is particularly
important if extended language identifiers have been used as the
value of <att>xml:lang</att> attributes elsewhere in the
document. </p>
<p>Here is an example of the use of this element:
<egXML xml:lang="mul" xmlns="http://www.tei-c.org/ns/Examples"><langUsage>
   <language ident="fr-CA" usage="60">Québécois</language>
   <language ident="en-CA" usage="20">Canadian business English</language>
   <language ident="en-GB" usage="20">British English</language>
</langUsage></egXML>
</p>
<specGrp xml:id="D2241" n="Language usage"><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/langUsage.xml"/><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/language.xml"/>
</specGrp></div>

<div type="div3" xml:id="HD43"><head>The Text Classification</head>
<p>The
<gi>textClass</gi> element is used to classify a text in some way.
<specList>
<specDesc key="textClass" />
</specList></p>
<p>
Text classification may be carried out
according to one or more of the following methods:
<list rend="bulleted">
<item>by reference to a recognized international classification such as
the Dewey Decimal Classification, the Universal Decimal Classification,
the Colon Classification, the Library of Congress Classification, or any
other system widely used in library and documentation work</item>
<item>by supplying a set of keywords, such as those provided by British
Library or Library of Congress Cataloguing in Publication data</item>
<item>by referencing any other taxonomy of text categories recognized in
the field concerned, or peculiar to the material in hand; this may
include one based on recurring sets of values for the situational
parameters defined in section <ptr target="#CCAHTD"/>, or the demographic
elements described in section <ptr target="#CCAHPA"/></item></list>
The last of these may be particularly important for dealing with
existing corpora or collections, both as a means of avoiding the expense
or inconvenience of reclassification and as a means of documenting
the organizing principles of such materials.</p>
<p>The following elements are provided for this purpose:
<specList>
<specDesc key="keywords" atts="scheme"/>
<specDesc key="classCode" atts="scheme"/>
<specDesc key="catRef"/>
</specList></p>
<p>The <gi>keywords</gi> element simply categorizes an individual text
by supplying a list of keywords which may describe its topic or subject
matter, its form, date, etc.  In some schemes, the order of items in the
list is significant, for example, from major topic to minor; in others,
the list has an organized substructure of its own.  No recommendations
are made here as to which method is to be preferred.  Wherever possible,
such keywords should be taken from a recognized source, such as the
British Library/Library of Congress Cataloguing in Publication data in
the case of printed books, or a published thesaurus appropriate to the
field.</p>
<p>The <att>scheme</att> attribute is used to indicate the source of
the keywords used, in the case where such a source exists.  If the
keywords are taken from an externally defined authority which is
available online, this attribute should point directly to it, as in
the following examples:
    <egXML xmlns="http://www.tei-c.org/ns/Examples">
      <keywords scheme="http://classificationweb.net">
          <term>Babbage, Charles</term>
          <term>Mathematicians - Great Britain - Biography</term>
      </keywords>
    </egXML>
<egXML xmlns="http://www.tei-c.org/ns/Examples"><keywords scheme="http://id.loc.gov/authorities/about.html#lcsh">
    <term>English literature -- History and criticism -- Data processing.</term>
    <term>English literature -- History and criticism -- Theory, etc.</term>
    <term>English language -- Style -- Data processing.</term>
    <term>Style, Literary -- Data processing.</term>
</keywords></egXML>
</p>
<p>If the authority file is not available online, but is generally
recognized and commonly cited, a bibliographic description for it should
be supplied within the <gi>taxonomy</gi> element described in section
<ptr target="#HD55"/>; the <att>scheme</att> attribute may then
reference that <gi>taxonomy</gi> element by means of its identifier in
the usual way:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><keywords scheme="#welch">
<term>ceremonials</term>
<term>fairs</term>
<term>street life</term>
</keywords>
<!-- elsewhere in the document -->
<taxonomy xml:id="welch">
  <bibl>
    <title>Notes on London Municipal Literature, and a Suggested
    Scheme for Its Classification</title>
<author>Charles Welch</author>
<edition>1895</edition>
  </bibl>
</taxonomy>
</egXML>
</p>
<p>If no authority file exists, perhaps because the keywords used were
assigned directly by an author, the <att>scheme</att> attribute should
be omitted.</p>
<p>Alternatively, if the keyword vocabulary itself is locally defined,
the <att>scheme</att> attribute will point to the local definition, which will
typically be held in a <gi>taxonomy</gi> element within the
<gi>classDecl</gi> part of the encoding description (see section <ptr
target="#HD55"/>).
<!-- LC subject tracings on R Potter, ed., Literary           -->
	<!-- Computing and Literary Criticism:  Theoretical and       -->
	<!-- Practical Essays on Theme and Rhetoric (Philadelphia,    -->
	<!-- U Penn, 1989), from CIP data on title overleaf.
 source="#HD43-eg-123" but I think we dont need to cite source          --></p>
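<p>A minimal sketch of this arrangement follows. The taxonomy, its
identifiers, and the keywords themselves are invented for the purposes
of the example.
<egXML xmlns="http://www.tei-c.org/ns/Examples">
  <classDecl>
    <!-- hypothetical project-specific vocabulary -->
    <taxonomy xml:id="projTopics">
      <category xml:id="projTopics.trade">
        <catDesc>trade and commerce</catDesc>
      </category>
      <category xml:id="projTopics.ritual">
        <catDesc>civic ritual</catDesc>
      </category>
    </taxonomy>
  </classDecl>
  <!-- elsewhere in the header -->
  <textClass>
    <keywords scheme="#projTopics">
      <term>trade and commerce</term>
      <term>civic ritual</term>
    </keywords>
  </textClass>
</egXML>
</p>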
<p>The <gi>classCode</gi> element also categorizes an individual text,
by supplying a numerical or other code rather than descriptive
terms. Such codes are drawn from a recognized classification scheme, such
as the Dewey Decimal Classification. On this element, the
<att>scheme</att> attribute is required; it indicates the source of the
classification scheme in the same way as for keywords: this may be a
pointer of any kind, either to a TEI element, possibly in the current
document, as in the <gi>keywords</gi> examples above, or to some
canonical source for the scheme, as in the following example: <egXML
xml:lang="und" xmlns="http://www.tei-c.org/ns/Examples"><classCode
scheme="http://www.udcc.org/udcsummary/php/index.php">005.756</classCode>
</egXML>
</p>
<p>The <gi>catRef</gi> element categorizes an individual text by
pointing to one or more <gi>category</gi> elements using the
<att>target</att> attribute, which it inherits from the <ident
type="class">att.pointing</ident> class.  The
<gi>category</gi> element (which is fully described in section <ptr target="#HD55"/>) holds information about a particular classification or
category within a given taxonomy.  Each such category must have a unique
identifier, which may be supplied as the value of the <att>target</att>
attribute on the <gi>catRef</gi> element of any text regarded as falling
within the category indicated.</p>
<p>A text may, of course, fall into more than one category, in which
case more than one identifier may be supplied as the value for
the <att>target</att> attribute on the <gi>catRef</gi> element, as
in the following example:
  <egXML xml:lang="und" xmlns="http://www.tei-c.org/ns/Examples"><catRef target="#b.a4 #b.d2"/></egXML></p><p>The <att>scheme</att> attribute
may be supplied to specify the taxonomy to which the categories
identified by the <att>target</att> attribute belong, if this is not adequately
conveyed by the resource pointed to. For example,
    <egXML xml:lang="und" xmlns="http://www.tei-c.org/ns/Examples"><catRef target="#b.a4 #b.d2" scheme="http://www.example.com/browncorpus"/>
<catRef target="http://www.example.com/SUC/#A45"/></egXML>
Here the same text has been classified as belonging to categories <val>b.a4</val> and
<val>b.d2</val> within the Brown classification scheme (presumed to be
available from <ident type="file">http://www.example.com/browncorpus</ident>), and to category
<val>A45</val> within the SUC classification scheme documented at the URL given.</p>
<p>In general, it is a matter of style whether to use a single <gi>catRef</gi> with multiple
identifiers in the value of <att>target</att> or multiple <gi>catRef</gi> elements, each with
a single identifier in the value of <att>target</att>. However, note that maintenance of a TEI
document with a large number of values within a single <att>target</att> can be cumbersome.</p>
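<p>The following sketch shows the kind of <gi>category</gi> definitions
to which such pointers might resolve when the taxonomy is held in the
same document, together with the alternative style using one
<gi>catRef</gi> for each category. The category descriptions given here
are assumed for illustration.
<egXML xmlns="http://www.tei-c.org/ns/Examples">
  <taxonomy xml:id="b">
    <bibl>Brown Corpus</bibl>
    <category xml:id="b.a">
      <catDesc>Press Reportage</catDesc>
      <category xml:id="b.a4">
        <catDesc>Provincial</catDesc>
      </category>
    </category>
    <category xml:id="b.d">
      <catDesc>Religion</catDesc>
      <category xml:id="b.d2">
        <catDesc>Periodicals and tracts</catDesc>
      </category>
    </category>
  </taxonomy>
  <!-- elsewhere in the header -->
  <textClass>
    <catRef target="#b.a4"/>
    <catRef target="#b.d2"/>
  </textClass>
</egXML>
</p>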
<p>The distinction between the <gi>catRef</gi> and <gi>classCode</gi>
elements is that the values used as identifying codes are exhaustively
enumerated for the former, typically within the TEI header. In the
latter case, however, the values are drawn from an
externally defined scheme, and may therefore be taken from
a more open-ended descriptive
classification system.</p>
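<p>The contrast may be sketched by placing the two elements side by
side within a single <gi>textClass</gi>. The combination is illustrative
only, and assumes that a category with the identifier <val>b.a4</val> is
defined elsewhere in the header.
<egXML xml:lang="und" xmlns="http://www.tei-c.org/ns/Examples">
  <textClass>
    <!-- points to a category enumerated in the header -->
    <catRef target="#b.a4"/>
    <!-- carries a code taken from an externally defined scheme -->
    <classCode scheme="http://www.udcc.org/udcsummary/php/index.php">005.756</classCode>
  </textClass>
</egXML>
</p>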
<specGrp xml:id="D2243" n="Text Classification"><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/textClass.xml"/><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/keywords.xml"/><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/classCode.xml"/>
<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/catRef.xml"/>
</specGrp></div>

<div type="div3" xml:id="HD4ABS"><head>Abstracts</head>
<p>The main purpose of the <gi>abstract</gi> element is to supply a brief resume or abstract for an article which was originally published without such a component. An abstract or summary forming part of the document at its creation should usually appear in the front matter (<gi>front</gi>) of the document.
 <egXML  xml:lang="en" xmlns="http://www.tei-c.org/ns/Examples">
<profileDesc><abstract>
<p>This paper is a draft studying 
   various aspects of using the TEI 
   as a reference serialization framework 
   for LMF. Comments are welcome to bring 
   this to a useful document for the 
   community.
</p></abstract></profileDesc></egXML>
</p>
  <p>Abstracts may be provided in multiple languages, distinguished by the <att>xml:lang</att> attribute:
 <egXML xmlns="http://www.tei-c.org/ns/Examples">
<profileDesc><abstract xml:lang="en">
  <p>The recent archaeological emphasis 
    on the study of settlement patterns, 
    landscape and palaeoenvironments has 
    shaped and re-shaped our understanding 
    of the Viking settlement of Iceland. 
    This paper reviews the developments 
    in Icelandic archaeology, examining 
    both theoretical and practical advances. 
    Particular attention is paid to new 
    ideas in terms of settlement patterns 
    and resource exploitation. Finally, 
    some of the key studies of the ecological 
    consequences of the Norse 
    <foreign xml:lang="is">landnám</foreign> 
    are presented. </p>
</abstract>
<abstract xml:lang="fr"><p>L’accent récent des 
  recherches archéologiques sur l’étude des 
  configurations spatiales des colonies, de la 
  géographie des sites ainsi que des éléments 
  paléo-environnementaux nous mène à réexaminer 
  et réévaluer nos connaissances acquises sur 
  la colonisation de l’Islande par les Vikings. 
  Cet article passe en revue le développement 
  de l’archéologie islandaise en examinant les 
  progrès théoriques et pratiques en la matière. 
  Une attention particulière est portée sur 
  l’étude des configurations spatiales des 
  colonies ainsi qu’une considération des 
  questions d’exploitation des ressources. 
  Finalement, l’article présente un aperçu des 
  études principales qui traitent des 
  conséquences écologiques du 
  <foreign xml:lang="is">landnám</foreign> 
  islandais.</p> 
</abstract></profileDesc></egXML>

  </p>

<p>The same element may be used to provide other summary information supplied by the encoder, perhaps grouped together into a list of discrete items:
 <egXML  xml:lang="en" xmlns="http://www.tei-c.org/ns/Examples">
<profileDesc><abstract>
<list>
<item>An annual HBC supply ship is 
  set to the North West Coast for mid-September.</item>
  <item><name key="pelly_jh">Pelly</name> writes 
    to ascertain the British Government's plans 
    for the lands associated with the Oregon Treaty; 
    he wants to know what will happen to the HBC's 
    establishment on the southern <name type="place" 
      key="vancouver_island">Vancouver Island</name>. 
    He adds that a former Crown grant, an 1838 exclusive 
    trade-grant for the lands in question, has yet to 
    expire.</item>
  <item>The minutes discuss the nature of the HBC's 
    original entitlements and question whether or not, 
    and in what capacity, the Oregon Treaty affects the 
    HBC's position. The majority council further 
    investigation, and to reply cautiously and 
    judiciously to the HBC inquiry.</item><item>A 
      summary of a meeting with <name 
        key="pelly_jh">Pelly</name> is offered in 
      order to elucidate the HBC's intentions.</item>
  <item><name key="grey_hg">Lord Grey</name> calls 
    for greater consideration on the issue of 
    colonization; he asks that <name 
      key="stephen_j">Stephen</name> write the Company, 
    asking them to detail their intentions, and to 
    state their legal opinion for entitlement.
  </item>
</list>
</abstract>
</profileDesc>
</egXML>
</p>
</div>

<div type="div3" xml:id="HD44"><head>Calendar Description</head>
  <p>The <gi>calendarDesc</gi> element is used within the
  <gi>profileDesc</gi> element to document objects referenced by means of
either the <att>calendar</att> attribute on <gi>date</gi> or the
<att>datingMethod</att> attribute on any member of the <ident
type="class">att.datable</ident> class.
    <specList>
      <specDesc key="calendarDesc"/>
    </specList></p>
  <p>This element may contain one or more <gi>calendar</gi> elements:
    <specList>
      <specDesc key="calendar"/>
    </specList></p>
  <p>Each such element contains one or more paragraphs of description
  for the calendar system concerned, and also supplies an identifying
  code for it as the value of its <att>xml:id</att> attribute.
    <egXML xmlns="http://www.tei-c.org/ns/Examples">
      <calendarDesc>
        <calendar xml:id="Gregorian"><p>Gregorian calendar</p></calendar>
        <calendar xml:id="Stardate"><p>Fictional Stardate (from Star Trek series)</p></calendar>
        <calendar xml:id="BP"><p>Calendar years before present (measured from 1950)</p></calendar>
      </calendarDesc>
    </egXML>
  </p>
<p>This identifying code may then be referenced from any element
supplying  a date
expressed using that calendar system:
    <egXML xmlns="http://www.tei-c.org/ns/Examples">
<p>Captain's log <date calendar="#Stardate">stardate 23.9 rounded off
    to the nearest decimal point</date>...</p></egXML>

  See <ptr target="#NDDATECUSTOM"/> for details of the usage of dating attributes in conjunction
  with <gi>calendar</gi> elements in the <gi>teiHeader</gi>.
</p>
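<p>The <att>datingMethod</att> attribute may reference a
<gi>calendar</gi> element in the same way. In the following sketch,
which is illustrative only, the sentence being encoded and the
identifier <val>Julian</val> are assumed; the date is given in the
Julian calendar by means of the <att>when-custom</att> attribute and
normalized to its Gregorian equivalent by means of <att>when</att>.
<egXML xmlns="http://www.tei-c.org/ns/Examples">
  <calendarDesc>
    <calendar xml:id="Julian"><p>Julian calendar</p></calendar>
  </calendarDesc>
  <!-- elsewhere in the document -->
  <p>Dated <date calendar="#Julian" datingMethod="#Julian"
      when-custom="1640-06-12" when="1640-06-22">12 June 1640</date>.</p>
</egXML>
</p>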
  <specGrp xml:id="D2242" n="Calendar Description"><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/calendarDesc.xml"/><include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/calendar.xml"/>
  </specGrp>
</div>

<div type="div3" xml:id="HD44CD"><head>Correspondence Description</head>
      <p>The <gi>correspDesc</gi> element is used within the <gi>profileDesc</gi> element to
        provide detailed correspondence-specific metadata, concerning in particular the
        communicative aspects (sending, receiving, forwarding etc.) associated with acts of
        of correspondence. </p>
      <p>This information is complementary to the detailed
      descriptions of physical objects (such as letters) associated
      with correspondence activities, which are typically provided by
      the <gi>sourceDesc</gi> element. <specList>
          <specDesc key="correspDesc"/>
        </specList>
      </p>
      <p>The <gi>correspDesc</gi> element contains the elements <gi>correspAction</gi> and
        <gi>correspContext</gi>, describing the actions identified and the context in
        which the correspondence occurs respectively. <specList>
          <specDesc key="correspAction" atts="type"/>
          <specDesc key="correspContext"/>
        </specList>
      </p>
      <p>Acts of correspondence typically do not occur in isolation from each other. The
        <gi>correspContext</gi> element is used to group references relevant to the item
        of correspondence being described, typically to other items such as the item to which
        it is a reply, or the item which replies to it: <egXML
          xmlns="http://www.tei-c.org/ns/Examples">
          <correspContext>
            <ref type="replyTo" target="#CLF0102">Previous letter of <persName>Chamisso</persName> to <persName>de La
              Foye</persName>: <date when="1807-01-16">16 January 1807</date></ref>
            <ref type="replyFrom" target="#CLF0104">Next letter of <persName>Chamisso</persName> to <persName>de La Foye</persName>:
              <date when="1810-05-07">07 May 1810</date></ref>
          </correspContext>
        </egXML>
      </p>

<!-- add an example using note or p ? -->

      <p>Many types of correspondence action may be distinguished. The <att>type</att>
        attribute should be used to indicate the type of action being documented, using values
        such as those suggested above. </p>
      <p>The following simple example uses <gi>correspAction</gi> to describe the sending of a
        letter by Adelbert von Chamisso from Vertus on 29 January 1807 to Louis de La Foye at
        Caen. The date of reception is unknown: <egXML
          xmlns="http://www.tei-c.org/ns/Examples">
          <correspAction type="sentBy">
            <persName>Adelbert von Chamisso</persName>
            <placeName>Vertus</placeName>
            <date when="1807-01-29"/>
          </correspAction>
          <correspAction type="deliveredTo">
            <persName>Louis de La Foye</persName>
            <placeName>Caen</placeName>
            <date>unknown</date>
          </correspAction>
        </egXML>
Note the use of the <att>when</att> attribute described in <ptr target="#CONADA"/> to provide a normalized
      form of the date. The content of the <gi>date</gi> element may
      also be omitted, since no underlying source is being transcribed.      </p>

      <p>Several senders, recipients, etc. can be specified for a single
        <gi>correspAction</gi> if the action is considered to apply to them all acting as
        a single group. In the following example two people are considered to have received
        the communication. <egXML xmlns="http://www.tei-c.org/ns/Examples">
          <correspAction type="receiving">
            <persName>Hermann Hesse</persName>
            <persName>Ninon Hesse</persName>
            <placeName>Montagnola</placeName>
          </correspAction>
        </egXML>
      </p>
      <p>The <att>subtype</att> attribute may be used to further distinguish types of action.
        In the following example, an e-mail is sent to two people directly and to a third by
        <soCalled>carbon copy</soCalled> (<term>cc</term>). <egXML xmlns="http://www.tei-c.org/ns/Examples">
          <correspAction type="sentBy">
            <persName>PN0001</persName>
            <date when="1999-06-02"/>
          </correspAction>
          <correspAction type="deliveredTo" subtype="to">
            <persName>PN0002</persName>
          </correspAction>
          <correspAction type="deliveredTo" subtype="to">
            <persName>PN0003</persName>
          </correspAction>
          <correspAction type="deliveredTo" subtype="cc">
            <persName>PN0004</persName>
          </correspAction>
        </egXML>
      </p>
      <p>The same person may be associated with many actions. For example, it will often
        be the case that the author and sender of a message are identical, and that many
        individual letters will need to be associated with the same person. The
        <att>sameAs</att> attribute defined in section <ptr target="#SAIE"/> may be used
        to indicate that the same name applies to many actions. Its value will usually be the
        identifier of an element defining the person or name concerned, which is supplied
        elsewhere in the document. <egXML xmlns="http://www.tei-c.org/ns/Examples">
          <correspAction type="sentBy">
            <name sameAs="#author"/>
          </correspAction>
        </egXML>
        <!-- example needs expansion to include definition for #author -->
      </p>
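      <p>A fuller sketch of this technique follows, in which the element
        identified as <val>author</val> is supplied in the file description. The title
        given here is assumed for the purposes of the example. <egXML xmlns="http://www.tei-c.org/ns/Examples">
          <fileDesc>
            <titleStmt>
              <!-- hypothetical title -->
              <title>Letter to Louis de La Foye</title>
              <author xml:id="author">Adelbert von Chamisso</author>
            </titleStmt>
            <!-- other parts of the file description... -->
          </fileDesc>
          <!-- elsewhere in the header -->
          <profileDesc>
            <correspDesc>
              <correspAction type="sentBy">
                <name sameAs="#author"/>
              </correspAction>
            </correspDesc>
          </profileDesc>
        </egXML>
      </p>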
      <p>It is assumed that each correspondence action applies to a single act of
        communication. It may however be the case that the same physical object is involved
        in several such acts, if for example person A sends a letter to person B,
        who then annotates it and sends it on to person C, or if persons A and B both use
        the same document to convey quite different messages. In such situations, multiple
        <gi>correspDesc</gi> elements should be supplied, one for each communication. In
        the following example, the same document contains distinct messages, sent by two
        different people to the same destination: <gi>correspDesc</gi> is used for each
        separately: <egXML xmlns="http://www.tei-c.org/ns/Examples">
          <correspDesc xml:id="message1">
            <correspAction type="sentBy">
              <persName>John Gneisenau Neihardt</persName>
              <placeName>Branson (Montgomery)</placeName>
              <date when="1932-12-17"/>
            </correspAction>
            <correspAction type="deliveredTo">
              <persName xml:id="JTH">Julius Temple House</persName>
              <placeName>New York</placeName>
            </correspAction>
          </correspDesc>
          <correspDesc xml:id="message2">
            <correspAction type="sentBy">
              <persName>Enid Neihardt</persName>
              <placeName>Branson (Montgomery)</placeName>
              <date when="1932-12-17"/>
            </correspAction>
	    <correspAction type="deliveredTo">
              <persName sameAs="#JTH"/>
              <placeName>New York</placeName>
            </correspAction>
          </correspDesc>
        </egXML>
      </p>
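      <p>The other situation mentioned above, in which a letter is sent on by its
        recipient to a third person, might be sketched in the same way. The names and
        identifiers here are placeholders invented for the example. <egXML xmlns="http://www.tei-c.org/ns/Examples">
          <!-- first act of communication: A sends the letter to B -->
          <correspDesc xml:id="firstSending">
            <correspAction type="sentBy">
              <persName>Person A</persName>
            </correspAction>
            <correspAction type="deliveredTo">
              <persName xml:id="persB">Person B</persName>
            </correspAction>
          </correspDesc>
          <!-- second act: B annotates the same letter and sends it on to C -->
          <correspDesc xml:id="secondSending">
            <correspAction type="sentBy">
              <persName sameAs="#persB"/>
            </correspAction>
            <correspAction type="deliveredTo">
              <persName>Person C</persName>
            </correspAction>
          </correspDesc>
        </egXML>
      </p>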
<!-- add a more complex example using more types of action -->
      <specGrp n="Correspondence Description">
        <include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/correspDesc.xml"/>
        <include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/correspAction.xml"/>
        <include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/correspContext.xml"/>
        <include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/model.correspActionPart.xml"/>
        <include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/model.correspContextPart.xml"/>
        <include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/model.correspDescPart.xml"/>
      </specGrp>
    </div>
  </div>

<div type="div2" xml:id="HD9"><head>Non-TEI Metadata</head>
<p>Projects often maintain metadata about their TEI documents in more
than one form or system. For example, a project may have a database of
bibliographic information on the set of documents it intends to
encode. From this database, both a MARC record and a
<gi>teiHeader</gi> are generated. The document is then encoded, during
which process additional information is added to the
<gi>teiHeader</gi> manually. Then, when the document is published on
the web, a Dublin Core record is generated for discoverability of the
resource. It is sometimes advantageous to store some or all of the
non-TEI metadata in the TEI file.</p>
<p>Such non-TEI data may be placed anywhere within a TEI file
(other than as the root element), as it does not affect TEI
conformance. However, it is easier for humans to manage these kinds of data
if they are grouped together in a single location. In addition, such
grouping makes it easy to avoid accidentally flagging non-TEI data as errors
during validation of the file against a TEI schema. The
<gi>xenoData</gi> element, which may appear in the TEI Header after
the <gi>fileDesc</gi> but before the optional <gi>revisionDesc</gi>,
is provided for this purpose.
<specList>
  <specDesc key="xenoData"/>
</specList></p>
<p>The <gi>xenoData</gi> element may contain anything except TEI
elements. It may contain one or more elements from outside the
TEI<note place="bottom">As is always the case when mixing elements
from different namespaces in an XML document, the namespace of these
non-TEI elements must be declared either on the elements themselves or
on an ancestor element.</note> or data in some non-XML text
format.<note place="bottom">As is always the case when using text
inside an XML document, certain characters cannot occur in their
normal form, and must be <soCalled>escaped</soCalled>. The most common
of these are LESS-THAN SIGN (<q>&lt;</q>, U+003C) and AMPERSAND
(<q>&amp;</q>, U+0026), which may be escaped with
<code>&amp;lt;</code> and <code>&amp;amp;</code> respectively. See
<ptr target="#SG-er"/>.</note></p>
<specGrp n="non-TEI Metadata" xml:id="D227">
  <xi:include href="../../Specs/xenoData.xml"/>
</specGrp>
  <p>In the following example, the prefix
    <code>rdf</code> is bound to the namespace <code>http://www.w3.org/1999/02/22-rdf-syntax-ns#</code>, the prefix
    <code>dc</code> is bound to the namespace <code>http://purl.org/dc/elements/1.1/</code>, and the prefix
    <code>cc</code> is bound to the namespace <code>http://web.resource.org/cc/</code>.
    <egXML valid="feasible" xmlns="http://www.tei-c.org/ns/Examples"><xenoData 
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:cc="http://web.resource.org/cc/"
      xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:RDF>
    <cc:Work rdf:about="">
      <dc:title>Applied Software Project Management  - review</dc:title>
      <dc:type rdf:resource="http://purl.org/dc/dcmitype/Text"></dc:type>
      <dc:license rdf:resource="http://creativecommons.org/licenses/by-sa/2.0/uk/"/>
    </cc:Work>
    <cc:License rdf:about="http://creativecommons.org/licenses/by-sa/2.0/uk/">
      <cc:permits rdf:resource="http://web.resource.org/cc/Reproduction"/>
      <cc:permits rdf:resource="http://web.resource.org/cc/Distribution"/>
      <cc:requires rdf:resource="http://web.resource.org/cc/Notice"/>
      <cc:requires rdf:resource="http://web.resource.org/cc/Attribution"/>
      <cc:permits rdf:resource="http://web.resource.org/cc/DerivativeWorks"/>
      <cc:requires rdf:resource="http://web.resource.org/cc/ShareAlike"/>
    </cc:License>
  </rdf:RDF>
</xenoData></egXML></p>
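  <p>Data in a non-XML text format can be stored in the same way. The
    following is a minimal sketch rather than a recommended practice: it
    restates part of the RDF description above as JSON. The key names
    (<code>title</code>, <code>type</code>, <code>license</code>, and
    <code>note</code>) and the wording of the note are assumptions made
    for this illustration and are not drawn from any standard vocabulary.
    The ampersand in the last value is escaped, as it must be in any
    text inside an XML document.
    <egXML valid="feasible" xmlns="http://www.tei-c.org/ns/Examples"><xenoData>
{
  "title": "Applied Software Project Management - review",
  "type": "http://purl.org/dc/dcmitype/Text",
  "license": "http://creativecommons.org/licenses/by-sa/2.0/uk/",
  "note": "Reproduction &amp; distribution permitted"
}
</xenoData></egXML>
    An XML parser returns the unescaped character to any application
    that extracts this content, so the JSON can be passed on unchanged
    once it has been read from the file.</p>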
</div>

<div type="div2" xml:id="HD6"><head>The Revision Description</head>
<p>The final sub-element of the TEI header, the <gi>revisionDesc</gi>
element, provides a detailed change log in which each change made to a
text may be recorded.  Its use is optional but highly recommended.
It provides essential information for the administration of large
numbers of files which are being updated, corrected, or otherwise
modified as well as extremely useful documentation for files being
passed from researcher to researcher or system to system. Without
change logs, it is easy to confuse different versions of a file, or to
remain unaware of small but important changes made in the file by some
earlier link in the chain of distribution.  No significant change should be made in
any TEI-conformant file without corresponding entries being made in the
change log.<specList>
<specDesc key="revisionDesc"/>
<specDesc key="listChange"/>
<specDesc key="change"/>
</specList>
</p>
<p>The main purpose of the revision description is to record changes
in the text to which a header is prefixed. However, it is recommended
TEI practice to include entries also for significant changes in the
header itself (other than the revision description itself, of
course). At the very least, an entry should be supplied indicating the
date of creation of the header.</p>
<p>The log consists of a list of entries, one for each change. Changes
may be grouped and organised using either the <gi>listChange</gi> element described in
section <ptr target="#PH-changes"/> or the simple <gi>list</gi> element
described in section <ptr target="#COLI"/>. Alternatively, a simple sequence of
<gi>change</gi> elements may be given. The attributes
<att>when</att> and <att>who</att> may be supplied for each
<gi>change</gi> element to indicate its date
and the person responsible for it respectively. The
description of the change itself can range from a simple phrase to a
series of paragraphs. If a number is to be associated with one or more
changes (for example, a revision number), the global <att>n</att>
attribute may be used to indicate it.</p>
<p>It is recommended to give changes in reverse chronological order,
most recent first.</p>
<p>For example:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
  <!-- ... -->
  <revisionDesc>
    <change n="RCS:1.39" when="2007-08-08" who="#jwernimo.lrv">Changed <val>drama.verse</val>
    <gi>lg</gi>s to <gi>p</gi>s. <note>we have opened a discussion about the need for a new
          value for <att>type</att> of <gi>lg</gi>, <val>drama.free.verse</val>, in order to address
	  the verse of Behn which is not in regular iambic pentameter. For the time being these
	  instances are marked with a comment note until we are able to fully consider the best way
	  to encode these instances.</note>
    </change>
    <change n="RCS:1.33" when="2007-06-28" who="#pcaton.xzc">Added <att>key</att> and <att>reg</att>
    to <gi>name</gi>s.</change>
    <change n="RCS:1.31" when="2006-12-04" who="#wgui.ner">Completed renovation. Validated.</change>
  </revisionDesc>
</egXML>
<!-- adapted from WWP TR00241 -->
In the above example, the <att>who</att> attributes point to
<gi>respStmt</gi> elements which have been included earlier in the
<gi>titleStmt</gi> of the same header:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
      <titleStmt>
	<title>The Amorous Prince, or, the Curious Husband, 1671</title>
	<author><persName ref="#abehn.aeh">Behn, Aphra</persName></author>
	<respStmt xml:id="pcaton.xzc">
	  <persName>Caton, Paul</persName>
	  <resp>electronic publication editor</resp>
	</respStmt>
	<respStmt xml:id="wgui.ner">
	  <persName>Gui, Weihsin</persName>
	  <resp>encoder</resp>
	</respStmt>
	<respStmt xml:id="jwernimo.lrv">
          <persName>Wernimont, Jacqueline</persName>
	  <resp>encoder</resp>
	</respStmt>
      </titleStmt>
</egXML>
There is, however, no requirement that the <gi>respStmt</gi> be used for
this purpose, or that the elements indicated be contained within the
same document. A project might for example maintain a separate
document listing all of its personnel in which they were represented
using the
<gi>person</gi> element described in <ptr target="#CCAHPA"/>.</p>
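<p>These alternatives might be sketched as follows. The example
regroups two of the entries shown above inside a <gi>listChange</gi>
element, and assumes a hypothetical project-wide file named
<code>personnel.xml</code>; the file name and the identifiers
<code>pcaton</code> and <code>wgui</code> are invented for this
illustration.
<egXML xmlns="http://www.tei-c.org/ns/Examples">
  <!-- ... -->
  <revisionDesc>
    <listChange>
      <change n="RCS:1.33" when="2007-06-28" who="personnel.xml#pcaton">Added <att>key</att> and <att>reg</att>
      to <gi>name</gi>s.</change>
      <change n="RCS:1.31" when="2006-12-04" who="personnel.xml#wgui">Completed renovation. Validated.</change>
    </listChange>
  </revisionDesc>
</egXML>
The corresponding entries in the separate personnel document might
then look like this:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
  <listPerson>
    <person xml:id="pcaton">
      <persName>Caton, Paul</persName>
    </person>
    <person xml:id="wgui">
      <persName>Gui, Weihsin</persName>
    </person>
  </listPerson>
</egXML>
Because each <att>who</att> value is a pointer, the same personnel
document can serve every header in the project without the names being
repeated in each file.</p>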
<specGrp xml:id="D226" n="The Revision Description">
  <include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/revisionDesc.xml"/>
  <include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/change.xml"/>
</specGrp>
</div>

<div type="div2" xml:id="HD7"><head>Minimal and Recommended Headers</head>
<p>The TEI header allows for the provision of a very large amount of
information concerning the text itself, its source, its encodings, and
revisions of it, as well as a wealth of descriptive information such
as the languages it uses and the situation(s) in which it was
produced, together with the setting and identity of participants
within it.  This diversity and richness reflects the diversity of uses
to which it is envisaged that electronic texts conforming to these
Guidelines will be put. It is emphatically <emph>not</emph> intended
that all of the elements described above should be present in every
TEI Header.
 </p>
<p>The amount of encoding in a header will depend both on the nature
and the intended use of the text. At one extreme, an encoder may
expect that the header will be needed only to provide a bibliographic
identification of the text adequate to local needs. At the other,
wishing to ensure that their texts can be used for the widest range
of applications, encoders will want to document as explicitly as
possible both bibliographic and descriptive information, in such a
way that no prior or ancillary knowledge about the text is needed in
order to process it. The header in such a case will be very full,
approximating to the kind of documentation often supplied in the form
of a manual. Most texts will lie somewhere between these extremes;
textual corpora in particular will tend more to the latter extreme.
In the remainder of this section we demonstrate first the minimal,
and next a commonly recommended, level of encoding for the
bibliographic information held by the TEI header.
</p>
<p>Supplying only the minimal level of encoding required, the
TEI header of a single text might look like the following
example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><teiHeader>
  <fileDesc>
    <titleStmt>
      <title>Thomas Paine: Common sense, a
        machine-readable transcript</title>
      <respStmt>
        <resp>compiled by</resp>
        <name>Jon K Adams</name>
      </respStmt>
    </titleStmt>
    <publicationStmt>
      <distributor>Oxford Text Archive</distributor>
    </publicationStmt>
    <sourceDesc>
      <bibl>The complete writings of Thomas Paine, collected and edited
        by Philip S. Foner (New York, Citadel Press, 1945)</bibl>
    </sourceDesc>
  </fileDesc>
</teiHeader></egXML></p><p>The only mandatory component of the TEI header is the
<gi>fileDesc</gi> element.  Within this, <gi>titleStmt</gi>,
<gi>publicationStmt</gi>, and <gi>sourceDesc</gi> are all required
constituents.  Within the title statement, a title is required, and an
author should be specified, even if it is <mentioned>unknown</mentioned>, as
should some additional statement of responsibility, here given by the
<gi>respStmt</gi> element. Within the <gi>publicationStmt</gi>, a
publisher, distributor, or other agency responsible for the file must be
specified.  Finally, the source description should contain at the least
a loosely structured bibliographic citation identifying the source of
the electronic text if (as is usually the case) there is one.
 </p>
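<p>The same minimal structure applies when there is no source and no
known author. The following sketch assumes a hypothetical born-digital
document; the title and the distributor name are invented for
illustration only.
<egXML xmlns="http://www.tei-c.org/ns/Examples"><teiHeader>
  <fileDesc>
    <titleStmt>
      <title>Project working notes: an electronic document</title>
      <author>unknown</author>
    </titleStmt>
    <publicationStmt>
      <distributor>Example Text Archive</distributor>
    </publicationStmt>
    <sourceDesc>
      <p>Born digital: no previous source exists.</p>
    </sourceDesc>
  </fileDesc>
</teiHeader></egXML>
Here the required <gi>sourceDesc</gi> is still present, but contains
only a prose statement that the document has no source.</p>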
<p>We now present the same example header, expanded to include
additionally recommended information, adequate to most bibliographic
purposes, in particular to allow for the creation of an <ref target="#HD-BIBL-1">AACR2</ref>-conformant
bibliographic record.  We have also added information about the
encoding principles used in this (imaginary) encoding, about the
text itself (in the form of Library of Congress subject headings),
and about the revision of the file.
<!-- Bibliographic reference to a real OTA text.              -->
	<!-- EncodingDesc mostly from description of                  -->
	<!-- that text, but improvised in part.                       -->
	<!-- Revision desc is imaginary                               -->
<egXML xmlns="http://www.tei-c.org/ns/Examples"><teiHeader>
   <fileDesc>
      <titleStmt>
         <title>Common sense, a machine-readable transcript</title>
         <author>Paine, Thomas (1737-1809)</author>
         <respStmt>
            <resp>compiled by</resp>
            <name>Jon K Adams</name>
         </respStmt>
      </titleStmt>
      <editionStmt>
         <edition>
            <date>1986</date>
         </edition>
      </editionStmt>
      <publicationStmt>
         <distributor>Oxford Text Archive.</distributor>
         <address>
            <addrLine>Oxford University Computing Services,</addrLine>
            <addrLine>13 Banbury Road,</addrLine>
            <addrLine>Oxford OX2 6RB,</addrLine>
            <addrLine>UK</addrLine>
         </address>
      </publicationStmt>
      <notesStmt>
         <note>Brief notes on the text are in a
               supplementary file.</note>
      </notesStmt>
      <sourceDesc>
         <biblStruct>
            <monogr>
               <editor>Foner, Philip S.</editor>
               <title>The complete writings of Thomas Paine</title>
               <imprint>
                  <pubPlace>New York</pubPlace>
                  <publisher>Citadel Press</publisher>
                  <date>1945</date>
               </imprint>
            </monogr>
         </biblStruct>
      </sourceDesc>
   </fileDesc>
   <encodingDesc>
      <samplingDecl>
         <p>Editorial notes in the Foner edition have not
           been reproduced. </p>
         <p>Blank lines and multiple blank spaces, including paragraph
           indents, have not been preserved. </p>
      </samplingDecl>
      <editorialDecl>
         <correction status="high" method="silent">
            <p>The following errors
              in the Foner edition have been corrected:
              <list>
                <item>p.  13 l.  7 cotemporaries   contemporaries</item>
                <item>p.  28 l. 26 [comma]         [period]</item>
                <item>p.  84 l.  4 kin             kind</item>
                <item>p.  95 l.  1 stuggle         struggle</item>
                <item>p. 101 l.  4 certainy        certainty</item>
                <item>p. 167 l.  6 than            that</item>
                <item>p. 209 l. 24 publshed        published</item>
              </list>
            </p>
         </correction>
         <normalization>
            <p>No normalization beyond that performed
           by Foner, if any. </p>
         </normalization>
         <quotation marks="all">
            <p>All double quotation marks
           rendered with ", all single quotation marks with
           apostrophe. </p>
         </quotation>
         <hyphenation eol="none">
            <p>Hyphenated words that appear at the
           end of the line in the Foner edition have been reformed.</p>
         </hyphenation>
         <stdVals>
            <p>The values of <att>when-iso</att> on the <gi>time</gi>
            element always end in the format <val>HH:MM</val> or
            <val>HH</val>; i.e., seconds, fractions thereof, and time
            zone designators are not present.</p>
         </stdVals>
         <interpretation>
            <p>Compound proper names are marked. </p>
            <p>Dates are marked. </p>
            <p>Italics are recorded without interpretation. </p>
         </interpretation>
      </editorialDecl>
      <classDecl>
         <taxonomy xml:id="lcsh">
            <bibl>Library of Congress Subject Headings</bibl>
         </taxonomy>
         <taxonomy xml:id="lc">
            <bibl>Library of Congress Classification</bibl>
         </taxonomy>
      </classDecl>
   </encodingDesc>
   <profileDesc>
      <creation>
         <date>1776</date>
      </creation>
      <langUsage>
         <language ident="en" usage="100">English.</language>
      </langUsage>
      <textClass>
         <keywords scheme="#lcsh">
               <term>Political science</term>
               <term>United States -- Politics and government --
                      Revolution, 1775-1783</term>
         </keywords>
         <classCode scheme="#lc">JC 177</classCode>
      </textClass>
   </profileDesc>
   <revisionDesc>
      <change when="1996-01-22" who="#MSM"> finished proofreading </change>
      <change when="1995-10-30" who="#LB"> finished proofreading </change>
      <change notBefore="1995-07-04" who="#RG"> finished data entry at end of term </change>
      <change notAfter="1995-01-01" who="#RG"> began data entry before New Year 1995 </change>
   </revisionDesc>
</teiHeader></egXML>
<!-- this is what Rich suggested: me, I think the  -->
	<!-- contents of the supplementary file should be  -->
	<!-- decanted into the encodingDesc  (LB)          -->
	<!-- Do we agree that this really is an acceptable -->
	<!-- minimal recommendation?                       -->
	<!-- Not at all:  it ought to have encodingDesc,   -->
	<!-- profileDesc, and revisionDesc too.            -->
	<!-- Which I have now added.  -msm                 -->
 </p>
<p>Many other examples of recommended usage for the elements discussed
in this chapter are provided here, in the reference index and in the
associated tutorials.
</p>
</div>

<div type="div2" xml:id="HD8"><head>Note for Library Cataloguers</head><p>A strong motivation in preparing the material in this chapter was to
provide in the TEI header a viable chief source of information
for cataloguing computer files.  The TEI header is not
a library catalogue record, and so will not make all of the distinctions
essential in standard library work.  It also includes much information
generally excluded from standard bibliographic descriptions.  It is the
intention of the developers, however, to ensure that the information
required for a catalogue record be retrievable from the TEI file
header, and moreover that the mapping from the one to the other be as
simple and straightforward as possible.  Where the correspondence is not
obvious, it may prove useful to consult one of the works which were
influential in developing the content of the TEI header.  These
include:<list type="gloss"><label><ref target="#ISBD">ISBD</ref></label>
  <item><bibl><title level="m">ISBD: International Standard Bibliographic Description</title></bibl>
is an international standard setting out what information
should be recorded in a description of a bibliographical item.  Until a consolidated edition was published in 2011, there
was a general standard called ISBD(G) and separate ISBDs covering different types of material, e.g.
ISBD(M) for monographs, ISBD(ER) for electronic resources.  These separate
ISBDs follow the same general scheme as the main ISBD(G), but provide
appropriate interpretations for the specific materials under
consideration.</item>
<label><ref target="#HD-BIBL-1">AACR2</ref></label>
<item>The <bibl><title level="m">Anglo-American Cataloguing Rules</title> (second edition)</bibl> were published
in 1978, with revisions appearing periodically through 2005. AACR2 provides
guidelines for the construction of catalogues in general libraries in the
English-speaking world. AACR2 is explicitly based on the general framework
of the ISBD(G) and the subsidiary ISBDs: it gives a description of how to
describe bibliographic items and how to create access points such as subject
or name headings and uniform titles. Other national cataloguing codes exist
as well, including the Z44 series of standards issued by the Association
française de normalisation (AFNOR), <title level="m">Regeln für die alphabetische Katalogisierung
in wissenschaftlichen Bibliotheken</title> (RAK-WB), <title level="m">Regole italiane di
catalogazione per autore</title> (RICA), and <title level="m">Система стандартов по
информации, библиотечному и издательскому делу. Библиографическая запись. Библиографическое
описание. Общие требования и правила составления</title> (ГОСТ 7.1).</item>
<label><ref target="#COBICOR-eg-246">ANSI Z39.29</ref></label>
<item>The <title>American National Standard for Bibliographic References</title>
was an American national standard governing bibliographic references for use in
bibliographies, end-of-work lists, references in abstracting and indexing publications,
and outputs from computerized bibliographic data bases.  A revised version is maintained by the
National Information Standards Organization (NISO). The related ISO standard is ISO 690.
Other relevant national standards include BS 5605:1990, BS
6371:1983, DIN 1505-2, and ГОСТ 7.0.5.</item></list></p>

<p>Since the TEI file description elements are based on the ISBD
areas, it should be possible to use the content of file description as
the basis for a catalogue record for a TEI document. However,
cataloguers should be aware that the permissive nature of the TEI
Guidelines may lead to divergences between practice in using the TEI
file description and the comparatively strict recommendations of
AACR2 and other national cataloguing codes. Such divergences as the following may preclude automatic
generation of catalogue records from TEI headers: <list>
<item>The TEI Guidelines do not require that text be transcribed from the <soCalled>chief source of information</soCalled> using normalized capitalization and punctuation <!--as in a national cataloguing code-->.</item>
<item>The TEI title statement may not categorize constituent titles in
the same way as prescribed by a national cataloguing code.</item>
<item>The TEI title statement contains authors, editors, and other
responsible parties in separate elements, with names which may not
have been normalized; it does not necessarily contain a single
statement of responsibility <!--from the chief source of
information-->.</item>
<item>There is no specific place in a TEI header to
  specify the <term>main entry</term> or <term>added entries</term>
  (<gloss>name or title headings under which a catalogue record is
    filed</gloss>) for the catalogue record.</item>
<item>The TEI header does not require use of a particular vocabulary
for subject headings nor require the use of subject headings.</item>
</list></p>
</div>

<div><head>The TEI Header Module</head>
<p>The module described in this chapter makes available the following components:
<moduleSpec xml:id="DHD" ident="header">
  <altIdent type="FPI">Common Metadata</altIdent>
  <desc>The TEI Header</desc>
  <desc xml:lang="fr">En-tête TEI</desc>
  <desc xml:lang="zh-TW">TEI標頭</desc>
  <desc xml:lang="it">L'intestazione TEI (TEI Header)</desc>
  <desc xml:lang="pt">O cabeçalho TEI</desc>
  <desc xml:lang="ja">ヘダーモジュール</desc>
</moduleSpec>
<!--publicID:  -//TEI P5//ELEMENTS TEI Header//EN-->
The selection and combination of modules to form a TEI schema is described in
<ptr target="#STIN"/>.</p>
</div>
</div>
