Construction of an XML Version of the TEI DTD


C. M. Sperberg-McQueen

7 July 1999

This unpublished document is distributed privately for comment by friends and colleagues; it is not now a formal publication and should not be quoted in published material.

This document has not yet been reviewed by both editors of the TEI; what it says about the beliefs of the editors should be taken as a proposal by the author for the approval of his co-editor.

Abstract

This document describes issues involved in creating an XML version of the SGML document type definition (DTD) created by the Text Encoding Initiative, and proposes solutions. It defines a TEI extensions file which incorporates those solutions, in order to allow experimentation.

The discussion of inclusion exceptions defines a method of rewriting SGML content models so as to achieve effects similar to those provided by inclusion exceptions. To make an SGML document type definition compatible with XML, inclusion exceptions must be eliminated. The simplest method of ensuring that this change does not invalidate existing documents is to modify the content model of every element which can occur as a descendant of any element with inclusion exceptions in its content model, in the manner described here. That will ensure that elements named in inclusion exceptions remain legal in all the locations where they are currently legal.

The methods of changing content models described in this paper are believed to preserve determinism (what ISO 8879 calls lack of ambiguity) and to simulate the effects of inclusion exceptions properly. At this point, however, no proof of either conjecture is offered.


1 Introduction

1.1 XML and DTDs

The Extensible Markup Language (XML) defines a syntax for document type definitions similar to that provided by the Standard Generalized Markup Language (SGML), but more restrictive. In particular, XML allows neither inclusion nor exclusion exceptions, and prohibits the ampersand connector.

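Modifying an existing SGML document type definition (DTD), such as the TEI DTD, to conform to XML thus involves the tasks taken up in sections 2 to 6 below: removing tag omissibility information, normalizing parameter-entity references, removing ampersand connectors, normalizing mixed-content models, and removing inclusion and exclusion exceptions.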

1.2 Modifying the TEI DTD for XML

This document describes in detail the changes necessary to perform these modifications on the TEI DTD. The changes take the form of TEI modifications files suitable for use as the entities TEI.extensions.ent and TEI.extensions.dtd.

The modifications have different degrees of difficulty. Some affect the technical content of the TEI DTD in serious ways, and therefore require review by the TEI's Technical Review Committee before being formally integrated into TEI P3, while others do not affect the technical content of the TEI at all, or affect it only in minor ways. Changes of this latter type may be regarded as corrections of obvious simple errors, and may be performed by the editors under their authority to correct corrigible errors in the text of the Guidelines. (The concept of corrigible error is defined in document TEI ED W46 (?); in brief, a corrigible error is one which both editors agree is an error, which has an obvious fix, and the fix for which will not affect any existing data.) Each change proposed in this paper is identified as either a correction to a corrigible error, which the editors expect to fix in the course of preparing a revised and corrected reprint of TEI P3, or else a substantive change requiring review by the Technical Review Committee.

1.3 Overview of changes to the TEI DTD

Not all of the changes to the DTD are handled by this document. [1] Those that are, are summarized in the following overviews of the extensions files.

< 1 teixml.ent >(teixml.ent) =

<!--* teixml.ent:  XML version of TEI (1999-07-07)           *-->
<!--* This is the TEI.extensions.ent file of an experimental
    * version of the TEI P3 DTD, adapted to be XML conformant.
    * N.B. using this extensions file with the standard TEI DTD
    * will not make the DTD completely XML compliant.  Some
    * post-processing is needed.  Use the pizza chef at
    * http://www.uic.edu/orgs/tei/pizza.html or
    * http://firth.natcorp.ox.ac.uk/TEI/nupizza.html
    *
    * This version:  1999-07-07b
    *
    * Send comments to tei-l@listserv.uic.edu or to 
    * teitech@listserv.uic.edu
    * Thank you for beta testing! 
    *-->
< Provide default tagset declarations 152 >
< Define TEI keywords 153 >

< Fix placePart class 154 >
< Reproduce class declarations for phrases 22 >
< Reproduce inclusion classes 42 >
< Reproduce classes used by specPara 51 >
< Embed tag-set-specific ent files 151 >
< Element class m.Incl 41 >
< New specialPara 50 >
< New declaration for phrase and phrase.seq 45 >
< New declaration for paraContent 49 >
< New declaration for component and component.seq 47 >

< Suppress definitions of elements with ampersand 3 >
< Suppress element declarations with exclusions 40 >
< Suppress some mixed content elements 11 >
< Suppress users of phrase.seq 24 >
< Suppress standard definitions of PCDATA elements 43 >

< Suppress definitions in core tag set 54 >
< Suppress definitions in text-structure tag set 67 >
< Suppress definitions in front-matter tag set 82 >
< Suppress definitions in header tag set 86 >
< Suppress definitions in verse tag set 98 >
< Suppress definitions in drama tag set 104 >
< Suppress definitions in spoken-text tag set 110 >
< Suppress definitions in terminology tag set 112 >
< Suppress definitions in segmentation and alignment tag set 117 >
< Suppress definitions in analysis tag set 122 >
< Suppress definitions in feature-structures tag set 128 >
< Suppress definitions in text-criticism tag set 136 >
< Suppress definitions in graphs tag set 140 >
< Suppress definitions in tables tag set 146 >

< 2 teixml.dtd >(teixml.dtd) =

<!--* teixml.dtd:  XML version of TEI (1999-07-07)           *-->
<!--* This is the TEI.extensions.dtd file of an experimental
    * version of the TEI P3 DTD, adapted to be XML conformant.
    * N.B. using this extensions file with the standard TEI DTD
    * will not make the DTD completely XML compliant.  Some
    * post-processing is needed.  Use the pizza chef at
    * http://www.uic.edu/orgs/tei/pizza.html or
    * http://firth.natcorp.ox.ac.uk/TEI/nupizza.html
    *
    * This version:  1999-07-07b
    *
    * Send comments to tei-l@listserv.uic.edu or to 
    * teitech@listserv.uic.edu
    * Thank you for beta testing! 
    *-->
< New definitions of elements with ampersand 4 >
< Redeclare elements with mixed content elements 12 >
< New declarations for users of phrase.seq 25 >
< New declarations for exclusion exceptions 37 >
< New definitions for PCDATA elements 44 >
<!--* handle specialPara *-->
< New definition of set element 53 >

< New definitions for core tag set 55 >
< New definitions for text-structure tag set 68 >
< New definitions for front-matter tag set 83 >
< New definitions for header tag set 87 >
< New definitions for verse tag set 99 >
< New definitions for drama tag set 105 >
< New definitions for spoken-text tag set 111 >
< New definitions for terminology tag set 113 >
< New definitions for flat terminology tag set 116 >
< New definitions for segmentation and alignment tag set 118 >
< New definitions for analysis tag set 123 >
< New definitions for feature-structures tag set 129 >
< New definitions for text-criticism tag set 137 >
< New definitions for graphs tag set 141 >
< New definitions for tables tag set 147 >

1.4 Intended use of this document

The immediate goal of this document is to allow experimentation with the TEI DTD and XML processors, by providing the extensions files needed to make the full TEI P3 DTD work with XML processors. To use the extensions files created by this document with other extensions files (e.g. those of TEI Lite), manual merger of the extensions files is required. The editors plan to automate this merger as soon as possible; the following stages of development are anticipated:

A list of open questions is included at the end of the document.

2 Tag omissibility information

Removing tag omissibility information is a trivial task which can be accomplished by a DTD pretty printer, or even a simple editor script. The strings - -, - O, O -, and O O are legal in a DTD only as tag omissibility information, within comments, or within literals. In the TEI DTDs, they do not occur within literals or comments, so a global change in an editor would handle the problem.

To enable the necessary changes to be made with a minimum of manual intervention, however, it is probably better to add a run-time option to a DTD pretty printer, to make it suppress this information, or replace it with a reference to one of the parameter entities om.RR, om.RO, om.OR, or om.OO. If the run-time flag is set, the following entities will be added to the beginning of the DTD:

<!ENTITY % om.RR '- -'>
<!ENTITY % om.RO '- O'>
<!ENTITY % om.OR 'O -'>
<!ENTITY % om.OO 'O O'>

The program carthago has accordingly been outfitted with two run-time options: one suppresses the omissibility markers, the other replaces them with entity references.
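
To show how small the editor-script version of this change is, here is a minimal sketch in Python. It is not the carthago implementation; the function name and its mode argument are invented for this example, and the sketch relies on the assumption stated above that in the TEI DTDs the markers occur only as tag omissibility information.

```python
import re

# "<!ELEMENT", the element name (a name, a parameter-entity reference,
# or a parenthesized name group), then the two omissibility markers.
_OMISSIBILITY = re.compile(
    r"(<!ELEMENT\s+(?:\([^)]*\)|\S+)\s+)([-Oo])\s+([-Oo])(?=\s)"
)

# Entity names as declared above (om.RR, om.RO, om.OR, om.OO).
_ENTITY_FOR = {"--": "%om.RR;", "-O": "%om.RO;",
               "O-": "%om.OR;", "OO": "%om.OO;"}


def rewrite_omissibility(dtd: str, mode: str = "suppress") -> str:
    """Drop the omissibility markers, or replace them with entity references.

    mode is "suppress" or "entity"; both option names are hypothetical.
    """
    def fix(match: re.Match) -> str:
        head = match.group(1)          # "<!ELEMENT name " is kept as is
        if mode == "suppress":
            return head
        key = (match.group(2) + match.group(3)).upper()
        return head + _ENTITY_FOR[key]

    return _OMISSIBILITY.sub(fix, dtd)
```

Calling rewrite_omissibility(text, "entity") on a declaration such as that for %n.cit; leaves the content model untouched and puts %om.RR; where the two hyphens stood.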

3 Normalizing parameter-entity references

In the short term, we will normalize parameter-entity references using the pretty printer mentioned above (or else eliminate them entirely, by running the test DTD through a pre-processor like Carthage, which expands all parameter-entity references).

In the long run, we will systematically normalize all content models in the tagdocs of TEI P3 by adding semicolons to parameter-entity references which currently do not have them. N.B. the editors regard this as a correction of a corrigible error, and this normalization will be performed in the text of TEI P3 as soon as possible.
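
The semicolon normalization can be sketched in a few lines of Python. This is only an illustration under a stated assumption: that a percent sign immediately followed by a letter always begins a parameter-entity reference in the text being processed. The function name is hypothetical.

```python
import re

# A "%" immediately followed by a name, where the name is not followed by
# another name character or by ";".  The "% " of an entity declaration is
# skipped, because there no name follows the percent sign directly.
_UNTERMINATED = re.compile(r"%([A-Za-z][A-Za-z0-9.\-]*)(?![A-Za-z0-9.\-;])")


def add_semicolons(dtd: str) -> str:
    """Close every parameter-entity reference that lacks its semicolon."""
    return _UNTERMINATED.sub(r"%\1;", dtd)
```

References that already end in a semicolon are left alone, so the function can safely be run more than once over the same text.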

4 Ampersand connectors

Removing ampersand connectors involves either rewriting the content model as a set of alternative sequence groups (thus retaining strict equivalence with the existing model) or revising the content model entirely. In the case of the TEI, the editors both agree that most uses of & have proven to be design errors, so we propose simply to revise the content models.
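
The first option, rewriting an and-group as a set of alternative sequence groups, amounts to listing every ordering of the operands. A minimal Python sketch follows; the function name is hypothetical. For three or more operands the alternatives share prefixes, so the result would still need left-factoring to remain deterministic; the sketch does not attempt that.

```python
from itertools import permutations


def expand_and_group(operands):
    """Rewrite (a & b & ...) as an or-group of sequence groups,
    one sequence group per ordering of the operands."""
    sequences = ["(" + ", ".join(order) + ")" for order in permutations(operands)]
    return "(" + " | ".join(sequences) + ")"


# The two operands of the standard <cit> content model.
cit = expand_and_group(["(%n.q; | %n.quote;)", "(%m.bibl; | %m.loc;)"])
```

The number of alternatives grows as the factorial of the number of operands, which is one practical reason to prefer revising a content model over expanding it.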

The following elements have content models which use ampersand connectors in TEI P3: <cit>, <respStmt>, <publicationStmt>, and <graph>.

In this section, we provide alternate declarations for each of them. In the entity extensions file we must first suppress all of them:

< 3 Suppress definitions of elements with ampersand > =

<!ENTITY % cit             'IGNORE' >
<!ENTITY % respStmt        'IGNORE' >
<!ENTITY % publicationStmt 'IGNORE' >
<!ENTITY % graph           'IGNORE' >


And in the DTD extensions file we must redefine them all:

< 4 New definitions of elements with ampersand > =

< New cit declaration 5 >
< Define new respStmt 8 >
< New publicationStmt 9 >
< New graph element 10 >

N.B. All the ampersand-eliminating content-model changes in this section are regarded by the editors as corrections of corrigible errors, and will be integrated into the text of TEI P3 as soon as possible.
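
The two-step pattern used here and in the rest of this document (an IGNORE declaration in the entity extensions file, a redeclaration inside its own marked section in the DTD extensions file) is regular enough to generate. The Python sketch below only illustrates the shape of the output; the helper names are hypothetical, and the attribute-list declaration which every real redeclaration also needs is left out.

```python
def suppress(names):
    """Lines for the entity extensions file: switch off the standard
    declaration of each named element."""
    return "\n".join(f"<!ENTITY % {name} 'IGNORE' >" for name in names)


def redeclare(name, content_model, omissibility="- -"):
    """Skeleton for the DTD extensions file: a marked section controlled
    by XML.<name>, wrapping the new element declaration."""
    return (
        f'<!ENTITY % XML.{name} "INCLUDE" >\n'
        f"<![%XML.{name};[\n"
        f"<!ELEMENT %n.{name}; {omissibility} {content_model} >\n"
        f"]]>"
    )
```

Because the redeclaration sits in a marked section of its own, a later extensions file can switch it off again by declaring the XML.<name> entity as IGNORE first.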

4.1 The <cit> element

The standard declaration for <cit> is as follows:

<!ELEMENT %n.cit;       - -  ((%n.q; | %n.quote;) & (%m.bibl; |
                             %m.loc;))                          >

We will redefine it with a slightly more general content model (well, almost -- see below):

< 5 New cit declaration > =

<!ENTITY % XML.cit "INCLUDE" >
<![%XML.cit;[
<!ELEMENT %n.cit;       - -  ((%n.q; | %n.quote; | %m.bibl; |
                             %m.loc; | %m.Incl;)+)              >
<!ATTLIST %n.cit;            %a.global;
          TEIform            CDATA               'cit'          >
]]>


(The Incl class included here has to do with inclusion exceptions; see below.) If we wished to replicate precisely the original content model, without the ampersand, we could define <cit> thus:

<!ELEMENT %n.cit;       - -  (((%n.q; | %n.quote;),
                               (%m.bibl; | %m.loc;))
                             | ((%m.bibl; | %m.loc;),
                               (%n.q; | %n.quote;)))            >

As it turns out, however, the declaration proposed above is ambiguous, since <link> is a member of both the loc and Incl classes. We'll have to unroll one or the other of these two classes; a coin toss decides that we should unroll loc.

< 6 New cit declaration (alternate) > =

<!ENTITY % XML.cit "INCLUDE" >
<![%XML.cit;[
<!ELEMENT %n.cit;       - -  ((%n.q; | %n.quote; | %m.bibl; 
                             | %n.ptr; | %n.ref; 
                             | %n.xptr; | %n.xref;
                             | %m.Incl;)+)                      >
<!ATTLIST %n.cit;            %a.global;
          TEIform            CDATA               'cit'          >
]]>


After further investigation (i.e. further attempts to use the DTD produced by a draft of this paper), however, it becomes clear that loc is a subclass of phrase, so that every content model which uses both the phrase class and the Incl class is going to have troubles. So instead of unrolling each case individually, we take a harsher approach, and remove <link> from the loc class.

< 7 New loc class > =

<!--* remove link from loc class to avoid ambiguity          *-->
<!ENTITY % x.loc ''                                             >
<!ENTITY % m.loc '%x.loc; %n.ptr; | %n.ref; |
           %n.xptr; | %n.xref;'                                 >


This should not cause problems for any existing data, since <link> is still a member of the class Incl, which is (after all) allowed virtually everywhere.
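
The kind of ambiguity discussed in this section, one element reachable through two classes in the same or-group, is easy to look for mechanically. The Python sketch below is an illustration only: the function name is hypothetical, and the Incl set contains just the one member named in the text, since the full class is defined elsewhere.

```python
from collections import Counter


def shared_members(*classes):
    """Return, in sorted order, the names found in more than one class."""
    counts = Counter(name for members in classes for name in set(members))
    return sorted(name for name, n in counts.items() if n > 1)


old_loc = {"link", "ptr", "ref", "xptr", "xref"}  # loc before the change
new_loc = old_loc - {"link"}                      # loc as redefined above
incl = {"link"}  # stand-in: only the member of Incl mentioned in the text
```

With these sets, shared_members(old_loc, incl) reports link, and shared_members(new_loc, incl) reports nothing.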

4.2 The <respStmt> element

Similarly, we could replicate the original definition of <respStmt> if we wished, but it's probably better regarded as a design error to be fixed:

<!ELEMENT %n.respStmt;  - O  ((%n.resp; & %n.name;), (%n.resp;
                             | %n.name;)*)                      >

We give it a simpler and looser declaration instead:

< 8 Define new respStmt > =

<!ENTITY % XML.respStmt "INCLUDE" >
<![%XML.respStmt;[
<!ELEMENT %n.respStmt;  - O  (%n.resp; | %n.name;
                             | %m.Incl;)+                       >
<!ATTLIST %n.respStmt;       %a.global;
          TEIform            CDATA               'respStmt'     >
]]>


The prose should make clear that in principle, a <respStmt> should have at least one <resp> and at least one <name>. Enforcing that with the content model may be more pedantic than we want to be, though; a declaration which did enforce it would look like this:

<!ELEMENT %n.respStmt;  - O  (((%n.resp;)+,
                             (%n.name;, (%n.resp; | %n.name;)*))
                             | ((%n.name;)+,
                             (%n.resp;, (%n.resp; | %n.name;)*)))
                                                                >

4.3 The <publicationStmt> element

The content model for <publicationStmt> includes an editorial error I am glad to have the occasion to fix. (In normal bibliographic practice, when place and publisher are both given, the place is given first. I don't know what got into me that morning.)

<!ELEMENT %n.publicationStmt;
                        - O  ((%n.p;)+ | ( (%n.publisher; |
                             %n.distributor; | %n.authority;) &
                             ((%n.pubPlace)?, (%n.address)?,
                             (%n.idno)*, (%n.availability)?,
                             (%n.date)?)+ )+ )                  >

Rather than simply replace the current content model with an equivalent ampersand-less expression, we'll change it. For compatibility with existing data, we'll make the new expression loose rather than tight.

< 9 New publicationStmt > =

<!ENTITY % XML.publicationStmt "INCLUDE" >
<![%XML.publicationStmt;[
<!ELEMENT %n.publicationStmt;
                        - O  ( (%n.p;, (%m.Incl;)*)+
                             | ((%n.publisher; | %n.distributor;
                             | %n.authority; | %n.pubPlace;
                             | %n.address; | %n.idno;
                             | %n.availability; | %n.date;),
                               (%m.Incl;)*)+ )                  >
<!ATTLIST %n.publicationStmt; %a.global;
          TEIform            CDATA               'publicationStmt'
                                                                >
]]>


4.4 The <graph> element

The <graph> element uses the content model to require that graphs be encoded nodes-first or arcs-first, but not mixed hugger-mugger. We'll retain that characteristic. The old declaration is this:

<!ELEMENT %n.graph;     - -  ((%n.node;)+ & (%n.arc;)*)         >

We could require arbitrarily that all nodes come first; it's not clear whether any legacy data using <graph> actually exists. But in the interests of backward compatibility, the new content model might as well allow precisely what the old one did, even if that now seems like a design error:

< 10 New graph element > =

<![%TEI.nets;[
<!ENTITY % XML.graph "INCLUDE" >
<![%XML.graph;[
<!ELEMENT %n.graph;     - -  (((%n.node;, (%m.Incl;)*)+,
                               (%n.arc;, (%m.Incl;)*)*)
                             | ((%n.arc;, (%m.Incl;)*)+,
                               (%n.node;, (%m.Incl;)*)+))       >
<!ATTLIST %n.graph;          %a.global;
          type               CDATA               #IMPLIED
          label              CDATA               #IMPLIED
          order              NUMBER              #IMPLIED
          size               NUMBER              #IMPLIED
          TEIform            CDATA               'graph'        >
]]>
]]>
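
The claim that the new <graph> model allows precisely what the old one did can be spot-checked by brute force. The Python sketch below writes n for a node and a for an arc, leaves the %m.Incl; slots out of the new model, and reads the and-group as allowing its two operands in either order. It compares the two models on every sequence up to a small length; this is a check, not a proof, and the names in it are hypothetical.

```python
import re
from itertools import product

OLD = re.compile(r"n+a*|a*n+")  # ((node)+ & (arc)*): either operand first
NEW = re.compile(r"n+a*|a+n+")  # the new model, without the Incl slots


def same_language(max_len=8):
    """True if OLD and NEW accept exactly the same sequences of nodes
    and arcs, for all sequences of at most max_len elements."""
    for length in range(max_len + 1):
        for chars in product("na", repeat=length):
            sequence = "".join(chars)
            if bool(OLD.fullmatch(sequence)) != bool(NEW.fullmatch(sequence)):
                return False
    return True
```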


5 Normalizing mixed-content models

5.1 Individual elements

The following elements use the keyword #PCDATA in ways that must be changed to be legal in XML: <sense>, <re>, <persName>, <placeName>, <geogName>, <dateStruct>, <timeStruct>, and <dateline>.

In most of these cases, the #PCDATA keyword is given last, not first, in the content model; in one or two, it's neither first nor last. For example:

<!ELEMENT %n.sense;     - -  (%n.sense; | %m.dictionaryTopLevel
                             | %m.phrase | #PCDATA)*            >

In one or two cases, the group also has a plus operator instead of a star operator.

<!ELEMENT %n.timeStruct;
                        - -  ((%m.temporalExpr; | #PCDATA)+)    >

We must redeclare each of them, which means first of all that we must suppress their standard declarations:

< 11 Suppress some mixed content elements > =

<!ENTITY % sense 'IGNORE' >
<!ENTITY % re 'IGNORE' >
<!ENTITY % persName 'IGNORE' >
<!ENTITY % placeName 'IGNORE' >
<!ENTITY % geogName 'IGNORE' >
<!ENTITY % dateStruct 'IGNORE' >
<!ENTITY % timeStruct 'IGNORE' >
<!ENTITY % dateline 'IGNORE' >


and separately we must redefine them:

< 12 Redeclare elements with mixed content elements > =

<![%TEI.dictionaries;[
< New mixed content elements for dictionaries 13 >
]]>
<![%TEI.names.dates;[
< New mixed content elements for names and dates 15 >
]]>
< New mixed content elements for structure 20 >

Since the normalization is purely mechanical, there seems to be no need to reproduce the original declarations here. The new declarations are given below.
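
To show just how mechanical the normalization is, here is a minimal Python sketch: it moves #PCDATA to the front of a flat or-group and forces the star operator. It assumes the group contains no nested groups and that parameter-entity references already have their semicolons. It does not add %m.Incl;, which the new declarations below also contain for the sake of the inclusion exceptions. The function name is hypothetical.

```python
import re


def normalize_mixed(model: str) -> str:
    """Return the XML form (#PCDATA | a | b)* of a flat mixed-content group."""
    inner = model.strip()
    while True:
        # Peel off one occurrence indicator and one pair of outer parentheses.
        peeled = re.sub(r"[*+?]$", "", inner).strip()
        if peeled.startswith("(") and peeled.endswith(")"):
            peeled = peeled[1:-1].strip()
        if peeled == inner:
            break
        inner = peeled
    if "(" in inner or ")" in inner:
        raise ValueError("nested groups are not handled by this sketch")
    tokens = [token.strip() for token in inner.split("|")]
    if "#PCDATA" not in tokens:
        raise ValueError("not a mixed-content model")
    others = [token for token in tokens if token != "#PCDATA"]
    return "(" + " | ".join(["#PCDATA"] + others) + ")*"
```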

N.B. All the mixed-content normalization changes in this section are regarded by the editors as corrections of corrigible errors, and will be integrated into the text of TEI P3 as soon as possible.

Two elements in this group are from the dictionary tag set:

< 13 New mixed content elements for dictionaries > =

<!ENTITY % XML.sense "INCLUDE" >
<![%XML.sense;[
<!ELEMENT %n.sense;     - -  (#PCDATA | %n.sense;
                             | %m.dictionaryTopLevel;
                             | %m.phrase; | %m.Incl;)*          >
<!ATTLIST %n.sense;          %a.global;
                             %a.dictionaries;
          level              NUMBER              #IMPLIED
          TEIform            CDATA               'sense'        >
]]>


< 14 New mixed content elements for dictionaries 13 (cont'd) > =

<!ENTITY % XML.re "INCLUDE" >
<![%XML.re;[
<!ELEMENT %n.re;        - O  (#PCDATA | %n.sense;
                             | %m.dictionaryTopLevel;
                             | %m.phrase; | %m.Incl;)*          >
<!ATTLIST %n.re;             %a.global;
                             %a.dictionaries;
          type               CDATA               #IMPLIED
          TEIform            CDATA               're'           >
]]>


Note that the standard declaration for <re> also has an exclusion exception which has been dropped silently here. N.B. Elimination of exclusion exceptions is not a corrigible error; the version of this declaration which will go into TEI P3 without review is this:

<!ELEMENT %n.re;        - O  (#PCDATA | %n.sense;
                             | %m.dictionaryTopLevel;
                             | %m.phrase;)*      -(%n.re;)      >

The other elements in this group are from the tag set for names and dates.

< 15 New mixed content elements for names and dates > =


<!ENTITY % XML.persName "INCLUDE" >
<![%XML.persName;[
<!ELEMENT %n.persName;  - -  (#PCDATA | %m.personPart;
                             | %m.phrase; | %m.Incl;)*          >
<!ATTLIST %n.persName;       %a.global;
                             %a.names;
          type               CDATA               #IMPLIED
          TEIform            CDATA               'persName'     >
]]>


< 16 New mixed content elements for names and dates 15 (cont'd) > =

<!ENTITY % XML.placeName "INCLUDE" >
<![%XML.placeName;[
<!ELEMENT %n.placeName; - -  (#PCDATA | %m.placePart;
                             | %m.phrase; | %m.Incl;)*          >
<!ATTLIST %n.placeName;      %a.global;
          type               CDATA               #IMPLIED
          full               (yes | abb | init)  yes
                             %a.names;
          TEIform            CDATA               'placeName'    >
]]>


< 17 New mixed content elements for names and dates 15 (cont'd) > =

<!ENTITY % XML.geogName "INCLUDE" >
<![%XML.geogName;[
<!ELEMENT %n.geogName;  - -  (#PCDATA | %n.geog; | %n.name;
                             | %m.Incl;)*                       >
<!ATTLIST %n.geogName;       %a.global;
                             %a.placePart;
          TEIform            CDATA               'geogName'     >
]]>


< 18 New mixed content elements for names and dates 15 (cont'd) > =

<!ENTITY % XML.dateStruct "INCLUDE" >
<![%XML.dateStruct;[
<!ELEMENT %n.dateStruct;
                        - -  (#PCDATA | %m.temporalExpr;
                             | %m.Incl;)*                       >
<!ATTLIST %n.dateStruct;     %a.global;
                             %a.temporalExpr;
          calendar           CDATA               #IMPLIED
          exact              CDATA               #IMPLIED
          TEIform            CDATA               'dateStruct'   >
]]>


< 19 New mixed content elements for names and dates 15 (cont'd) > =

<!ENTITY % XML.timeStruct "INCLUDE" >
<![%XML.timeStruct;[
<!ELEMENT %n.timeStruct;
                        - -  (#PCDATA | %m.temporalExpr;
                             | %m.Incl;)*                       >
<!ATTLIST %n.timeStruct;     %a.global;
                             %a.temporalExpr;
          zone               CDATA               #IMPLIED
          TEIform            CDATA               'timeStruct'   >
]]>


The <dateline> element (from the default text-structure tag set) is the last one needing a mixed-content fix:

< 20 New mixed content elements for structure > =

<!ENTITY % XML.dateline "INCLUDE" >
<![%XML.dateline;[
<!ELEMENT %n.dateline;  - O  (#PCDATA | %n.date; | %n.time;
                             | %n.name; | %n.address;
                             | %m.Incl;)*                       >
<!ATTLIST %n.dateline;       %a.global;
          TEIform            CDATA               'dateline'     >
]]>


5.2 The entities phrase and phrase.seq

The XML rules for mixed-content models also require that the declarations for phrase and phrase.seq be changed slightly. The current definitions are:

<!ENTITY % phrase '(#PCDATA | %m.phrase)'                       >
<!ENTITY % phrase.seq '(%phrase;)*'                             >

These give us one level too many of parentheses; we need to remove the parentheses from the entity phrase:

< 21 New declaration for phrase and phrase.seq > =

<!ENTITY % phrase '#PCDATA | %m.phrase;'                        >
<!ENTITY % phrase.seq '(%phrase;)*'                             >


N.B. This change to the declaration of phrase is regarded by the editors as the correction of a corrigible error, and will be integrated into the text of TEI P3 as soon as possible.
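
The extra level of parentheses is easiest to see by expanding the entities by hand, or with a few lines of Python as sketched here. The expansion is purely textual, the function name is hypothetical, and the value given for m.phrase is a shortened stand-in rather than the real class.

```python
import re

_REFERENCE = re.compile(r"%([A-Za-z][\w.\-]*);")


def expand(text, entities):
    """Textually expand parameter-entity references until none remain.
    Raises KeyError for an undeclared entity; recursive entities would
    loop forever and are assumed not to occur."""
    while _REFERENCE.search(text):
        text = _REFERENCE.sub(lambda m: entities[m.group(1)], text)
    return text


old = {
    "m.phrase": "hi | emph",               # stand-in for the real class
    "phrase": "(#PCDATA | %m.phrase;)",    # current definition
    "phrase.seq": "(%phrase;)*",
}
new = dict(old, phrase="#PCDATA | %m.phrase;")  # parentheses removed
```

With the old definitions, expand("%phrase.seq;", old) yields a group nested inside another group; with the new ones the result is a single flat group followed by the star.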

Unfortunately, integrating this particular fix into the XML modifications file for testing will require that we either hard-code the effective value of m.phrase, or that we recreate the entire sequence of class declarations for phrase in the modifications file. (Sigh.) While we are here, we will introduce some fixes to the declarations of some classes:

< 22 Reproduce class declarations for phrases > =

< Declare new GIs 23 >
<!ENTITY % x.hqphrase ''                                        >
<!ENTITY % m.hqphrase '%x.hqphrase; %n.distinct; | %n.emph; |
           %n.foreign; | %n.gloss; | %n.hi; | %n.mentioned; |
           %n.soCalled; | %n.term; | %n.title;'                 >
<!ENTITY % x.data ''                                            >
<!ENTITY % m.data '%x.data; %n.abbr; | %n.address; | %n.date; 
           | %n.dateRange; | %n.dateStruct; | %n.expan; 
           | %n.geogName; 
           | %n.lang; | %n.measure; | %n.name; | %n.num;
           | %n.orgName; | %n.persName; | %n.placeName; 
           | %n.rs; | %n.time; | %n.timeRange; 
           | %n.timeStruct;'                                    >
<!ENTITY % x.edit ''                                            >
<!ENTITY % m.edit '%x.edit; %n.add; | %n.app; |
           %n.corr; | %n.damage; | %n.del; | 
           %n.orig; | %n.reg; | %n.restore; | %n.sic;
           | %n.space; | %n.supplied; | %n.unclear;'            >
<!ENTITY % x.editIncl ''                                        >
<!ENTITY % m.editIncl '%x.editIncl; %n.addSpan; | %n.delSpan; | 
           %n.gap;'                                             >

< New loc class 7 >
<!ENTITY % x.seg ''                                             >
<!ENTITY % m.seg '%x.seg; %n.c; | %n.cl; | %n.m; |
           %n.phr; | %n.s; | %n.seg; | %n.w;'                   >
<!ENTITY % x.sgmlKeywords ''                                    >
<!ENTITY % m.sgmlKeywords '%x.sgmlKeywords; %n.att; | %n.gi; |
           %n.tag; | %n.val;'                                   >
<!ENTITY % x.phrase.verse ''                                    >
<!ENTITY % m.phrase.verse '%x.phrase.verse; %n.caesura;'        >
<!ENTITY % x.formPointers ''                                    >
<!ENTITY % m.formPointers '%x.formPointers; %n.oRef; | %n.oVar; 
           | %n.pRef; | %n.pVar;'                               >
<!ENTITY % x.phrase ''                                          >
<!ENTITY % m.phrase '%x.phrase; %m.data; | %m.edit; |
           %m.formPointers; | %m.hqphrase; | %m.loc; |
           %m.phrase.verse; | %m.seg; | %m.sgmlKeywords; |
           %n.dictAnomaly; |
           %n.formula; | %n.fw; | %n.handShift;'                >

<!ENTITY % x.fmchunk ''                                         > 
<!ENTITY % m.fmchunk '%x.fmchunk; %n.argument; | %n.byline; | 
           %n.docAuthor; | %n.docDate; | %n.docEdition; | 
           %n.docImprint; | %n.docTitle; | %n.epigraph; | 
           %n.head; | %n.titlePart;'                            >


The element <dictAnomaly> is new; for a description, see below, section The problem of the dictionary chapter.

We need to declare the name of <dictAnomaly>.

< 23 Declare new GIs > =

<!ENTITY % n.dictAnomaly 'dictAnomaly'                          >

5.3 Elements using phrase.seq and paraContent

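Note that in XML neither phrase.seq nor paraContent may be combined with other elements in a content model, because of the XML requirement that mixed content models not have nested groups. This affects the declarations for <castItem>, <docImprint>, <catDesc>, <byline>, <opener>, <closer>, <form>, <gramGrp>, <trans>, <etym>, and <xr>.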

These must be suppressed, in order to be redeclared:

< 24 Suppress users of phrase.seq > =

<!ENTITY % castItem 'IGNORE' >
<!ENTITY % docImprint 'IGNORE' >
<!ENTITY % catDesc 'IGNORE' >
<!ENTITY % byline 'IGNORE' >
<!ENTITY % opener 'IGNORE' >
<!ENTITY % closer 'IGNORE' >
<!ENTITY % form 'IGNORE' >
<!ENTITY % gramGrp 'IGNORE' >
<!ENTITY % trans 'IGNORE' >
<!ENTITY % etym 'IGNORE' >
<!ENTITY % xr 'IGNORE' >


And they need to be redefined, tag set by tag set. (We put elements from each tag set into separate scraps to simplify production of specialized modification files.)

< 25 New declarations for users of phrase.seq > =

< New castItem 26 >
< New docImprint 27 >
< New catDesc 28 >
< New opener and closer 29 >
< New phrase.seq elements for dictionaries 32 >

First, the base tag set for drama:

< 26 New castItem > =

<![%TEI.drama;[
<!ENTITY % XML.castItem "INCLUDE" >
<![%XML.castItem;[
<!ELEMENT %n.castItem;  - O  (#PCDATA | %n.role; | %n.roleDesc;
                             | %n.actor; | %m.phrase;
                             | %m.Incl;)*                       >
<!ATTLIST %n.castItem;       %a.global;
          type               (role | list)       role
          TEIform            CDATA               'castItem'     >
]]>
]]>


Next the tag set for front matter:

< 27 New docImprint > =

<!ENTITY % XML.docImprint "INCLUDE" >
<![%XML.docImprint;[
<!ELEMENT %n.docImprint;
                        - O  (#PCDATA | %m.phrase; | %n.pubPlace;
                             | %n.docDate; | %n.publisher;
                             | %m.Incl;)*                       >
<!ATTLIST %n.docImprint;     %a.global;
          TEIform            CDATA               'docImprint'   >
]]>


Then, the header:

< 28 New catDesc > =

<!ENTITY % XML.catDesc "INCLUDE" >
<![%XML.catDesc;[
<!ELEMENT %n.catDesc;   - O  (#PCDATA | %m.phrase;
                             | %n.textDesc;)*                   >
<!ATTLIST %n.catDesc;        %a.global;
          TEIform            CDATA               'catDesc'      >
]]>


And the default text-structure tag set:

< 29 New opener and closer > =

<!ENTITY % XML.byline "INCLUDE" >
<![%XML.byline;[
<!ELEMENT %n.byline;    - O  (#PCDATA | %m.phrase;
                             | %n.docAuthor; | %m.Incl;)*       >
<!ATTLIST %n.byline;         %a.global;
          TEIform            CDATA               'byline'       >
]]>


< 30 New opener and closer 29 (cont'd) > =

<!ENTITY % XML.opener "INCLUDE" >
<![%XML.opener;[
<!ELEMENT %n.opener;    - O  (#PCDATA | %m.phrase;
                             | %n.argument; | %n.byline;
                             | %n.epigraph;
                             | %n.signed; | %n.dateline;
                             | %n.salute; | %m.Incl;)*          >
<!ATTLIST %n.opener;         %a.global;
          TEIform            CDATA               'opener'       >
]]>


< 31 New opener and closer 29 (cont'd) > =

<!ENTITY % XML.closer "INCLUDE" >
<![%XML.closer;[
<!ELEMENT %n.closer;    - O  (#PCDATA | %m.phrase;
                             | %n.signed; | %n.dateline;
                             | %n.salute; | %m.Incl;)*          >
<!ATTLIST %n.closer;         %a.global;
          TEIform            CDATA               'closer'       >
]]>


And finally the base tag set for dictionaries; unlike the preceding elements, these all use paraContent, not phrase.seq. N.B. these content models will require further changes before publication. See below, The problem of the dictionary chapter.

< 32 New phrase.seq elements for dictionaries > =

<![%TEI.dictionaries;[
<!ENTITY % XML.form "INCLUDE" >
<![%XML.form;[
<!ELEMENT %n.form;      - -  (#PCDATA | %m.phrase; | %m.inter;
                             | %m.formInfo; | %m.Incl;)*        >
<!ATTLIST %n.form;           %a.global;
                             %a.dictionaries;
          type               CDATA               #IMPLIED
          TEIform            CDATA               'form'         >
]]>


< 33 New phrase.seq elements for dictionaries 32 (cont'd) > =

<!ENTITY % XML.gramGrp "INCLUDE" >
<![%XML.gramGrp;[
<!ELEMENT %n.gramGrp;   - -  (#PCDATA | %m.phrase; | %m.inter;
                             | %m.gramInfo; | %m.Incl;)*        >
<!ATTLIST %n.gramGrp;        %a.global;
                             %a.dictionaries;
          TEIform            CDATA               'gramGrp'      >
]]>


< 34 New phrase.seq elements for dictionaries 32 (cont'd) > =

<!ENTITY % XML.trans "INCLUDE" >
<![%XML.trans;[
<!ELEMENT %n.trans;     - O  (#PCDATA | %m.phrase; | %m.inter;
                             | %m.dictionaryParts; | %m.Incl;)* >
<!ATTLIST %n.trans;          %a.global;
                             %a.dictionaries;
          TEIform            CDATA               'trans'        >
]]>


< 35 New phrase.seq elements for dictionaries 32 (cont'd) > =

<!ENTITY % XML.etym "INCLUDE" >
<![%XML.etym;[
<!ELEMENT %n.etym;      - O  (#PCDATA | %m.phrase; | %m.inter;
                             | %n.usg; | %n.lbl; | %n.def;
                             | %n.trans; | %n.tr;
                             | %m.morphInfo; | %n.eg;
                             | %n.xr; | %m.Incl;)*              >
<!ATTLIST %n.etym;           %a.global;
                             %a.dictionaries;
          TEIform            CDATA               'etym'         >
]]>


< 36 New phrase.seq elements for dictionaries 32 (cont'd) > =

<!ENTITY % XML.xr "INCLUDE" >
<![%XML.xr;[
<!ELEMENT %n.xr;        - O  (#PCDATA | %m.phrase; | %m.inter;
                             | %n.usg; | %n.lbl; | %m.Incl;)*   >
<!ATTLIST %n.xr;             %a.global;
                             %a.dictionaries;
          type               CDATA               #IMPLIED
          TEIform            CDATA               'xr'           >
]]>
]]>


Since paraContent also occurs in the definition of specialPara, in a form not legal in XML, the specialPara entity must also be redefined; see below, The problem of specialPara elements.

6 Exceptions

Removing inclusion and exclusion exceptions typically involves changing the set of documents accepted by the DTD.[2] In the discussion which follows, I assume that our goal is to ensure that every document legal in the original DTD remains legal in the modified DTD. The changes will cause the modified DTD to accept some other documents which are not valid instances of the original DTD. That is, if the original DTD is taken as an absolutely correct definition of a language, the revised DTD will overgenerate.[3] We will wish to keep the overgeneration to a minimum, but in general we cannot eliminate it entirely, since inclusion and exclusion exceptions do extend the expressive power of the DTD notation.[4]

7 Exclusions

Rewriting declarations without exclusion exceptions involves simply removing the exception, and adding an application-specific constraint to be checked outside the SGML parser, that says the excluded element types must not occur within the element type which excluded them. Thus, for example, the TEI <s> element (for end-to-end segmentation on the level of the orthographic sentence) is currently declared thus:

<!ELEMENT s  - -  (%phrase.seq)  -(s) >

An XML-compatible TEI DTD would replace this with:

<!ELEMENT s %phrase.seq;  >

<!--* CONSTRAINT:  <s> must not occur within
    * an <s>, i.e. Ancestor(1,s) = NIL
    *-->

The important change here, for present purposes, is the removal of the exclusion exception. In addition, we have removed the tag omissibility indicators and the parentheses around phrase.seq, for reasons that should be clear from other portions of this document.

It would be possible to simulate the effect of exclusion exceptions by modifying the content models of possible descendants of <s>, so as to remove <s> from their content model; for elements which can occur both as parents and as descendants of <s>, however, this change would render some existing documents illegal; it is thus not pursued further here.
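The application-specific constraint described above (no <s> inside an <s>) has to be checked outside the parser. What follows is a minimal sketch of such a check, assuming the document has already been parsed as well-formed XML; the function name find_exclusion_violations and the sample document are hypothetical and are not part of any TEI software.

```python
import xml.etree.ElementTree as ET


def find_exclusion_violations(root, excluder="s", excluded=("s",)):
    """Return every element named in `excluded` that occurs, at any depth,
    inside an element named `excluder` (the old SGML exclusion exception)."""
    violations = []

    def walk(element, inside_excluder):
        # An excluded element is a violation only when an ancestor excludes it.
        if inside_excluder and element.tag in excluded:
            violations.append(element)
        now_inside = inside_excluder or element.tag == excluder
        for child in element:
            walk(child, now_inside)

    walk(root, False)
    return violations


if __name__ == "__main__":
    sample = ET.fromstring("<p><s>a <hi><s>b</s></hi></s><s>c</s></p>")
    for bad in find_exclusion_violations(sample):
        print("illegal nested element:", bad.tag, repr(bad.text))
```

The same function could be called with other values of excluder and excluded for the other element types that lose their exclusion exceptions; that generalization is an assumption of this sketch, not something specified here.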

The following elements have exclusion exceptions in TEI P3:

The new declarations are precisely the same as the old declarations, only without the exclusions:

< 37 New declarations for exclusion exceptions > =

<![ %TEI.analysis; [
<!ENTITY % XML.s "INCLUDE" >
<![%XML.s;[
<!ELEMENT %n.s;         - -  %phrase.seq;                       >
<!ATTLIST %n.s;              %a.global;
                             %a.seg;
          TEIform            CDATA               's'            >
]]>
]]>


< 38 New declarations for exclusion exceptions 37 (cont'd) > =

<!ENTITY % XML.speaker "INCLUDE" >
<![%XML.speaker;[
<!ELEMENT %n.speaker;   - O  %phrase.seq;                       >
<!ATTLIST %n.speaker;        %a.global;
          TEIform            CDATA               'speaker'      >
]]>


< 39 New declarations for exclusion exceptions 37 (cont'd) > =

<!ENTITY % XML.stage "INCLUDE" >
<![%XML.stage;[
<!ELEMENT %n.stage;     - -  %specialPara;                      >
<!ATTLIST %n.stage;          %a.global;
          type               CDATA               mix
          TEIform            CDATA               'stage'        >
]]>


And they have to be excluded from the base DTD:

< 40 Suppress element declarations with exclusions > =

<!ENTITY % s       'IGNORE' >
<!ENTITY % speaker 'IGNORE' >
<!ENTITY % stage   'IGNORE' >


A new definition of <re> has already been given above, in the context of normalizing mixed-content models. The new definition of <hom> would be as follows:

<!ELEMENT %n.hom;       - O  (%n.sense; |
                             %m.dictionaryTopLevel)*            >

The actual form to be used for <hom> in an XML DTD, however, varies from this, as described below in The problem of the dictionary chapter.

8 Inclusions

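Removing inclusion exceptions requires simulating their effect in the content model of each element type which can occur as a descendant of the element type bearing the inclusions. This section discusses the effect of inclusions, a function imf() for rewriting content models, some examples of its use, and the removal of inclusions from TEI P3.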

A brief note on the notation used is given in an appendix.

8.1 The Effect of Inclusions

Inclusions make included elements legal at any location in a content model, without however changing the requirements of the basic content model, which must still be fulfilled. (For now, I make the simplifying assumption that the set of included elements and the set of elements named in the content model are disjoint. When they are not, special considerations will apply, because of SGML's requirement that content models be deterministic.)

We can summarize the effect of inclusions very simply if we think of an FSA recognizing a content model: included elements do not change the state of the FSA. So to change an FSA without inclusions to an FSA that accepts the same language, except that it also allows the inclusion of any element i in the set of inclusions I,

    for each state s in the FSA {
       for each element i in I {
          add a transition from s to s, on i
       }
    }
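The loop above can be sketched in executable form. This is only an illustration under the same simplifying assumption as the text (the included elements are disjoint from those named in the content model); the representation of the FSA as a dictionary keyed by (state, symbol), and the names add_inclusions and accepts, are choices made for this sketch.

```python
def add_inclusions(transitions, states, inclusions):
    """Return a copy of a deterministic FSA's transition table in which every
    state has a self-loop on every included element."""
    extended = dict(transitions)
    for state in states:
        for inc in inclusions:
            # Included elements never change the state of the FSA.
            extended[(state, inc)] = state
    return extended


def accepts(transitions, start, finals, symbols):
    """Run the FSA over a sequence of element names."""
    state = start
    for symbol in symbols:
        if (state, symbol) not in transitions:
            return False
        state = transitions[(state, symbol)]
    return state in finals


if __name__ == "__main__":
    # FSA for the content model (a, b): 0 --a--> 1 --b--> 2, with 2 final.
    base = {(0, "a"): 1, (1, "b"): 2}
    extended = add_inclusions(base, [0, 1, 2], {"x"})
    print(accepts(extended, 0, {2}, ["x", "a", "x", "b", "x"]))
    print(accepts(base, 0, {2}, ["a", "x", "b"]))
```

The first call exercises the extended FSA with the hypothetical inclusion x in initial, medial, and final position; the second runs the unmodified FSA on a sequence containing x.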

8.2 The Function imf()

We can characterize the language recognized using inclusion exceptions this way. Let us construct a function imf(E,I) which maps from a regular expression E and a set of inclusions I to a new regular expression E'. Ideally we want the following to be true:

In general, for sequences of terminals x, y in Sigma*:

My best cut so far at defining such a function relies in some places on a couple of auxiliary functions. So let us define functions imf(E), mf(E), and m(E) (where i is for `initial', m for `medial', f for `final').[5] imf(E) makes the claim about xiy true for all x, y in Sigma*. mf(E) makes it true for x in Sigma+ and y in Sigma*. m(E) makes it true for x, y in Sigma+. Equivalently, we can say that any element i in I can appear initially, medially, or finally in imf(E), medially or finally (but not initially) in mf(E), and medially (but not initially or finally) in m(E).

The care we have to take with initial and final positions results from the SGML rules about determinism, but also helps keep the resulting expressions simpler than they'd be if we just slapped (I*) in everywhere in the content model.

Here is a first cut at defining the functions. In a number of circumstances, they are undefined; it might perhaps be useful, therefore, to define a simple normalization on (ampersand-free) content models, which would ensure that the functions are always defined.

If E is the empty set, then the content model in question cannot be satisfied; this would be the case if a DTD which lacked any element called <nonesuch> nevertheless included an element which required it as a subelement:

<!ELEMENT impossible - - (nonesuch) >

Given that we want L(E) to be a subset of L(E'), we must define imf etc. thus for this case:

An element may accept the empty string as its content in either of two ways. First, the element may be declared EMPTY: in this case, inclusions are not legal inside the element.

Second, the element's content model may accept the empty string, either because all subelements are optional or because the content model may be satisfied by #PCDATA: in this case, inclusions are legal within the element.

If E is an atomic symbol, e.g. a, then

If E has the form F?, and F is not nullable (does not accept the empty string), then

Note that we require F to be non-nullable in order to preserve determinism.

If E has the form F?, and F is nullable, then

In other words, if F is nullable, the ? is redundant and may be stripped without loss of information.

If E has the form F+, and F is not nullable, then

If E has the form F+, and F is nullable, then

If E has the form F*, and F is not nullable, then

If E has the form F*, and F is nullable, then

If E has the form (F,G), then

If E has the form (F|G), then

If E has the form (F&G), then

8.3 Examples

Let's do some simple examples, abstracted from the TEI.

8.3.1 Simple Examples

8.3.2 A Complex Example: back

The element <back> is defined thus:

<!ELEMENT %n.back;      - O
  ( (%m.front)*,
    ( ( (%m.divtop),
        (%m.divtop | %n.titlePage;)*
      )
    | ( (%n.div;),
        (%n.div; | (%m.front))*
      )
    | ( (%n.div1;),
        (%n.div1; | (%m.front))*
      )
    )?
  )     >

Removing the parameter entities and using single-letter identifiers, we can rewrite the content model this way to show its structure a little more clearly:

( (a | b | c)*,
  ( ( (d | e | f),
      (d | e | f | g)*
    )
  | ( (h),
      (h | (a | b | c))*
    )
  | ( (i),
      (i | (a | b | c))*
    )
  )?
)

Or more compactly:

( (a | b | c)*,
  ( ( (d | e | f), (d | e | f | g)* )
  | ( h, (h | a | b | c)* )
  | ( i, (i | a | b | c)* )
  )?
)

i.e. E has the form F,G where F=(a|b|c)* and G=(((d|e|f) ... (i|a|b|c)*))?. So imf(E) = imf(F), mf(G).

Now, F is simple: imf((a | b | c)*) = (a | b | c | I)*

But mf(G) requires more work.

G = H? where H =

     ( ( (d | e | f), (d | e | f | g)* )
     | ( h, (h | a | b | c)* )
     | ( i, (i | a | b | c)* )
     )
So mf(G) = (m(H), I*)?

H in turn is an alternation of three sequences, each of the form (x, (y|z)*). This leads to a problem, because the final term in each sequence is nullable; we will have a determinism conflict with the trailing I*.

So we add a new definition of mf(E) where E = F?. mf(F?) = mf(F)?

Applied to G, we have: mf(G) = (mf(H))?, with H = (J | K | L).

So mf(H) = ((m(J) | m(K) | m(L)), I*)

But J, K, and L don't have m() forms, since their final term is nullable. So we use the alternate definition:

mf(H) = (mf(J) | mf(K) | mf(L))

We have the following:

So mf(H) =

        ( ( (d | e | f), I*, (d | e | f | g | I)*)
        | ( h, I*, (h | a | b | c | I)* )
        | ( i, I*, (i | a | b | c | I)* )
        )

Recall that mf(G) = (mf(H))?.

So mf(G) =

        ( ( (d | e | f), I*, (d | e | f | g | I)*)
        | ( h, I*, (h | a | b | c | I)* )
        | ( i, I*, (i | a | b | c | I)* )
        )?
and imf(E) = imf(F), mf(G) =
         ( (a | b | c | I)*,
           ( ( (d | e | f), I*, (d | e | f | g | I)*)
           | ( h, I*, (h | a | b | c | I)* )
           | ( i, I*, (i | a | b | c | I)* )
           )?
         )

Or, in content model terms (using the usual TEI conventions for names of element classes):

<!ELEMENT %n.back;      - O
  ( (%m.front; | %m.I;)*,
    ( ( (%m.divtop;),
        (%Istar;),
        (%m.divtop; | %n.titlePage; | %m.I;)*
      )
    | ( (%n.div;),
        (%Istar;),
        (%n.div; | %m.front; | %m.I;)*
      )
    | ( (%n.div1;),
        (%Istar;),
        (%n.div1; | %m.front; | %m.I;)*
      )
    )?
  )     >

I think we've got a system we can use manually, though I don't know for sure how to make it a program, given the problems we have defining some of the functions.
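Even without a program for the rewriting itself, the result for <back> can be checked mechanically against the original. The following is a sketch of such a check, using the single-letter form of the content model given above and two hypothetical included elements, written X and Y, standing in for the set I. It tests every sequence up to a small length, so it is evidence of correctness and not a proof, and it says nothing about determinism, since Python's regular expression engine does not require it.

```python
import re
from itertools import product

INCLUDED = "XY"  # hypothetical members of the inclusion set I

# The original model for <back>, one letter per element type.
ORIGINAL = re.compile(r"[abc]*(?:[def][defg]*|h[habc]*|i[iabc]*)?")

# imf(E) as derived above, with I replaced by the letters X and Y.
REWRITTEN = re.compile(
    r"[abcXY]*(?:[def][XY]*[defgXY]*|h[XY]*[habcXY]*|i[XY]*[iabcXY]*)?"
)


def strip_inclusions(sequence):
    """Remove included elements, leaving what the base model must accept."""
    return "".join(ch for ch in sequence if ch not in INCLUDED)


def first_disagreement(max_len=4):
    """Return a sequence on which the two models disagree, or None."""
    alphabet = "abcdefghi" + INCLUDED
    for length in range(max_len + 1):
        for letters in product(alphabet, repeat=length):
            seq = "".join(letters)
            expected = ORIGINAL.fullmatch(strip_inclusions(seq)) is not None
            actual = REWRITTEN.fullmatch(seq) is not None
            if expected != actual:
                return seq
    return None


if __name__ == "__main__":
    print(first_disagreement())
```

A return value of None means that, for all sequences up to the tested length, the rewritten model accepts exactly those sequences which the original model accepts once the included elements are removed.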

8.4 Removing inclusions in TEI P3

The following elements have inclusion exceptions in TEI P3 (as of September 1994):

The inclusions on <entry>, <entryFree>, and <eg> will be taken care of separately, in the section on the dictionary chapter.

The inclusions on <orgName> were dropped in October 1994 (though this change has not been propagated to any public version of the DTD), and so we will ignore them.

The inclusions on <text> must be propagated to all potential descendants of <text>.

The inclusions on <lem> and <rdg> must be propagated to all potential descendants; it might be possible to do without these, but it's probably not worth the effort.

Note that in the case of terminologyInclusions, the set of inclusions is not disjoint from the set of children named directly in content models.

Study of the full TEI DTD shows that the sets of possible descendants of <text>, <lem>, <rdg>, and <termEntry> are all identical. This is not surprising given that <text> is recursive.

The 263 elements in this set fall into the following groups:

Note that this list excludes most element types from the dictionary tag set, since they need special treatment anyway. (It does not exclude all of them, though, which puzzles me.)

Empty elements need no changes.

The other groups of elements do require changes to the DTD, which are described in the following sections.

8.4.1 The m.Incl element class

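In order to simplify the process of adding inclusions to the content models of the DTD, we define a new class for use in content models, namely m.Incl. As declared below, this consists of the classes m.globincl and m.editIncl, the element <anchor>, and, when the text criticism tag set is selected, the class m.fragmentary.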

For now, we ignore the problems posed by the <termEntry> element. In the long run, they mean the terminology tag set is going to need to be rewritten. (Of course, it needs rewriting anyway, to align it with more recent ISO work.)

< 41 Element class m.Incl > =

<!ENTITY % x.Incl ''>
<![%TEI.textcrit;[
<!--* If text criticism tag set is selected, include m.fragmentary
    * in the class m.Incl.
    *-->
<!ENTITY % m.Incl '%x.Incl; %m.globincl; | %m.editIncl; 
    | %m.fragmentary; | %n.anchor;'                             >
]]>
<!--* Otherwise, don't.                                      *-->
<!ENTITY % m.Incl '%x.Incl; %m.globincl; | %m.editIncl;
    | %n.anchor;'                                               >


We have to reproduce the standard declarations for the inclusion classes:

< 42 Reproduce inclusion classes > =

<!ENTITY % x.metadata ''                                        >
<!ENTITY % m.metadata '%x.metadata; %n.alt; | %n.altGrp; | 
           %n.certainty; | %n.fLib; | %n.fs; | %n.fsLib; | 
           %n.fvLib; | %n.index; | %n.interp; | %n.interpGrp; | 
           %n.join; | %n.joinGrp; | %n.link; | %n.linkGrp; | 
           %n.respons; | %n.span; | %n.spanGrp; | %n.timeline;' >
<!ENTITY % x.refsys ''                                          >
<!ENTITY % m.refsys '%x.refsys; %n.cb; | %n.lb; | %n.milestone; 
           | %n.pb;'                                            >
<!ENTITY % x.globincl ''                                        >
<!ENTITY % m.globincl '%x.globincl; %m.metadata; | %m.refsys;'  >


8.4.2 Changing #PCDATA elements

Each element which now has a content model of #PCDATA should, for compatibility, be revised to have a content model of (#PCDATA | %m.Incl;)*.

In some cases, it might be preferable to leave the content model alone: it's not clear that it's really useful to allow index entries, feature structure libraries, and joins to occur within attribute names, generic identifiers, and the components of structured times and dates. Even within generic identifiers and so on, however, there might be line breaks, page breaks, or other milestones, so perhaps we should define at least some of these elements as (#PCDATA | %m.refsys;)*.

For now, for purposes of the experimental XML DTD, I propose to use the first form given.

First, we suppress all of these elements:

< 43 Suppress standard definitions of PCDATA elements > =

<!ENTITY % day             'IGNORE' >
<!ENTITY % hour            'IGNORE' >
<!ENTITY % minute          'IGNORE' >
<!ENTITY % month           'IGNORE' >
<!ENTITY % offset          'IGNORE' >
<!ENTITY % second          'IGNORE' >
<!ENTITY % week            'IGNORE' >
<!ENTITY % year            'IGNORE' >
<!ENTITY % idno            'IGNORE' >
<!ENTITY % postBox         'IGNORE' >
<!ENTITY % postCode        'IGNORE' >
<!ENTITY % str             'IGNORE' >


Then we supply the new declarations:

< 44 New definitions for PCDATA elements > =

<![%TEI.names.dates;[
<!ENTITY % XML.day "INCLUDE" >
<![%XML.day;[
<!ELEMENT %n.day;         - -  (#PCDATA | %m.Incl;)*  >
<!ATTLIST %n.day;            %a.global;
                             %a.temporalExpr;
          TEIform            CDATA               'day'          >
]]>
<!ENTITY % XML.hour "INCLUDE" >
<![%XML.hour;[
<!ELEMENT %n.hour;        - -  (#PCDATA | %m.Incl;)*  >
<!ATTLIST %n.hour;           %a.global;
                             %a.temporalExpr;
          TEIform            CDATA               'hour'         >
]]>
<!ENTITY % XML.minute "INCLUDE" >
<![%XML.minute;[
<!ELEMENT %n.minute;      - -  (#PCDATA | %m.Incl;)*  >
<!ATTLIST %n.minute;         %a.global;
                             %a.temporalExpr;
          TEIform            CDATA               'minute'       >
]]>
<!ENTITY % XML.month "INCLUDE" >
<![%XML.month;[
<!ELEMENT %n.month;       - -  (#PCDATA | %m.Incl;)*  >
<!ATTLIST %n.month;          %a.global;
                             %a.temporalExpr;
          TEIform            CDATA               'month'        >
]]>
<!ENTITY % XML.offset "INCLUDE" >
<![%XML.offset;[
<!ELEMENT %n.offset;      - -  (#PCDATA | %m.Incl;)*  >
<!ATTLIST %n.offset;         %a.global;
          value              CDATA               #IMPLIED
                             %a.placePart;
          TEIform            CDATA               'offset'       >
]]>
<!ENTITY % XML.second "INCLUDE" >
<![%XML.second;[
<!ELEMENT %n.second;      - -  (#PCDATA | %m.Incl;)*  >
<!ATTLIST %n.second;         %a.global;
                             %a.temporalExpr;
          TEIform            CDATA               'second'       >
]]>
<!ENTITY % XML.week "INCLUDE" >
<![%XML.week;[
<!ELEMENT %n.week;        - -  (#PCDATA | %m.Incl;)*  >
<!ATTLIST %n.week;           %a.global;
                             %a.temporalExpr;
          TEIform            CDATA               'week'         >
]]>
<!ENTITY % XML.year "INCLUDE" >
<![%XML.year;[
<!ELEMENT %n.year;        - -  (#PCDATA | %m.Incl;)*  >
<!ATTLIST %n.year;           %a.global;
                             %a.temporalExpr;
          TEIform            CDATA               'year'         >
]]>
]]>
<!ENTITY % XML.idno "INCLUDE" >
<![%XML.idno;[
<!ELEMENT %n.idno;        - -  (#PCDATA | %m.Incl;)*  >
<!ATTLIST %n.idno;           %a.global;
          type               CDATA               #IMPLIED
          TEIform            CDATA               'idno'         >
]]>
<!ENTITY % XML.postBox "INCLUDE" >
<![%XML.postBox;[
<!ELEMENT %n.postBox;     - -  (#PCDATA | %m.Incl;)*  >
<!ATTLIST %n.postBox;        %a.global;
          TEIform            CDATA               'postBox'      >
]]>
<!ENTITY % XML.postCode "INCLUDE" >
<![%XML.postCode;[
<!ELEMENT %n.postCode;    - -  (#PCDATA | %m.Incl;)*  >
<!ATTLIST %n.postCode;       %a.global;
          TEIform            CDATA               'postCode'     >
]]>
<![%TEI.fs;[
<!ENTITY % XML.str "INCLUDE" >
<![%XML.str;[
<!ELEMENT %n.str;         - -  (#PCDATA | %m.Incl;)*  >
<!ATTLIST %n.str;            %a.global;
          rel                (eq | ne | sb | ns | lt | le | gt 
                             | ge)               eq
          TEIform            CDATA               'str'          >
]]>
]]>


8.4.3 Changing phrase.seq

The parameter entity phrase.seq should be redefined as follows:

< 45 New declaration for phrase and phrase.seq > =

<!ENTITY % phrase '#PCDATA | %m.phrase; | %m.Incl;'             >
<!ENTITY % phrase.seq '(%phrase;)*'                             >


(This supersedes the redefinition given earlier. Adding the inclusions to the class phrase (i.e. to the entity m.phrase) might enable some of the redefinitions already given above to stand unchanged, but for now, at least, I propose to keep the inclusions logically separate from the original element classes.) Note that the entity phrase is used only once, in the definition of <u>.

No changes to the actual content models are needed. (Ah, the joys of indirection.)

(Note, 14 May 1999.) No, wait, actually, that's not true. Many of these declarations read

<!ELEMENT %n.foo;       - O  (%phrase.seq;)                     >

which, expanded, would be

<!ELEMENT %n.foo;       - O  ((#PCDATA | %m.phrase; | %m.Incl;)*)>

which is illegal. The content models do need to be changed, to

<!ELEMENT %n.foo;       %phrase.seq;                            >

This is only required if we wish to allow the extensions file to work with the current (1994-09) production DTDs. Since those are what I currently have on this laptop, I do wish. But since we will shortly be releasing corrected versions, we want to make this part of the extensions file optional. We'll do so using a conditional inclusion on the parameter entity base9409, which by default will be defined IGNORE.

The same logic applies to paraContent and (for now) specialPara.

(Note, 30 May 1999.) No, no, wait. Doesn't carthage already normalize these correctly by omitting extra parentheses? I've already spent several hours making the scraps below, and now realize we may not need them after all. (17 June 1999.) I've removed them, since carthage actually does produce legal XML.
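The normalization at issue, dropping a pair of parentheses that wraps nothing but another group, can be illustrated with a small sketch. This is not carthage's algorithm, which is not described here; the function names are hypothetical, and the sketch works on the text of a content model after parameter entities have been expanded.

```python
def matching_paren(text, start):
    """Return the index of the ')' that closes the '(' at position `start`."""
    depth = 0
    for index in range(start, len(text)):
        if text[index] == "(":
            depth += 1
        elif text[index] == ")":
            depth -= 1
            if depth == 0:
                return index
    raise ValueError("unbalanced parentheses")


def strip_redundant_parens(model):
    """Drop outer parentheses that enclose exactly one group, so that
    ((#PCDATA | a)*) becomes (#PCDATA | a)*."""
    model = model.strip()
    while (model.startswith("(") and model.endswith(")")
           and matching_paren(model, 0) == len(model) - 1):
        inner = model[1:-1].strip()
        if not inner.startswith("("):
            break  # the parentheses enclose tokens directly: keep them
        rest = inner[matching_paren(inner, 0) + 1:].strip()
        if rest not in ("", "*", "+", "?"):
            break  # more than one item inside: the parentheses are needed
        model = inner
    return model


if __name__ == "__main__":
    print(strip_redundant_parens("((#PCDATA | %m.phrase; | %m.Incl;)*)"))
```

Groups which contain more than one item, such as ((a), (b)), are left untouched, since their outer parentheses are not redundant.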

8.4.4 Changing component.seq

The entity component.seq must be redefined to allow inclusions between any two components. In the long run, the changes should be made directly within the various declarations which go into component.seq, but those declarations are among the most complicated of the entire TEI DTD, since there are variant versions for each of the two hundred or so possible combinations of base tag sets.

The quick and dirty approach most suitable for use in the experimental XML DTD is to include the Incl class as a subclass of common, thus:

< 46 New declaration for x.common > =

<!ENTITY % x.common '%m.Incl; |'>

If this proves to introduce ambiguity in the content model, we'll have to find a slower, cleaner way to do it.

Experiment shows that it does indeed introduce ambiguity in content models, notably those for <body> and text divisions. Rather than hack at those content models, I am going to take the longer and slower approach.

< 47 New declaration for component and component.seq > =

<!ENTITY % x.common ''                                          >
<!ENTITY % m.common '%x.common; %m.bibl; | %m.chunk; | 
           %m.hqinter; | %m.lists; | %m.notes; | %n.stage;'     >
< Reproduce standard component declarations 48 >
<!-- The entity component.seq is always a starred sequence    -->
<!-- of component elements. Its definition does not vary      -->
<!-- with the base (unless we are using the general base, in  -->
<!-- which case it has already been defined above), but the   -->
<!-- meaning of the definition does.                          -->
<!ENTITY % component.seq '((%component;), (%m.Incl;)*)*'        >


< 48 Reproduce standard component declarations > =

<!ENTITY % mix.verse ''                                         >
<!ENTITY % mix.drama ''                                         >
<!ENTITY % mix.spoken ''                                        >
<!ENTITY % mix.dictionaries ''                                  >
<!ENTITY % mix.terminology ''                                   >

<![ %TEI.mixed; [
<!ENTITY % TEI.singleBase 'IGNORE'                              >
<!ENTITY % component '(%m.common; %mix.verse; %mix.drama; 
  %mix.spoken; %mix.dictionaries; %mix.terminology;)'              >
]]>

<![ %TEI.general; [
<!ENTITY % TEI.singleBase 'IGNORE'                              >
<!ENTITY % component '(%m.common; %mix.verse; %mix.drama; 
  %mix.spoken; %mix.dictionaries; %mix.terminology;)'              >

<![ %TEI.verse; [
<!ENTITY % gen.verse '((%m.comp.verse;), (%m.common; | 
%m.comp.verse; | %m.Incl;)*) |'                                 >
]]>
<![ %TEI.drama; [
<!ENTITY % gen.drama '((%m.comp.drama;), (%m.common; | 
%m.comp.drama; | %m.Incl;)*) |'                                 >
]]>
<![ %TEI.spoken; [
<!ENTITY % gen.spoken '((%m.comp.spoken;), (%m.common; | 
%m.comp.spoken; | %m.Incl;)*) |'                                >
]]>
<![ %TEI.dictionaries; [
<!ENTITY % gen.dictionaries '((%m.comp.dictionaries;), 
(%m.common; | %m.comp.dictionaries; | %m.Incl;)*) |'            >
]]>
<![ %TEI.terminology; [
<!ENTITY % gen.terminology '((%m.comp.terminology;), (%m.common; 
| %m.comp.terminology; | %m.Incl;)*) |'                         >
]]>
<!-- Default declarations for all the entities gen.verse,     -->
<!-- etc.                                                     -->
<!ENTITY % gen.verse ''                                         >
<!ENTITY % gen.drama ''                                         >
<!ENTITY % gen.spoken ''                                        >
<!ENTITY % gen.dictionaries ''                                  >
<!ENTITY % gen.terminology ''                                   >
<!ENTITY % component.seq '((%m.common;), (%m.Incl;)*)*, 
  (%gen.verse; %gen.drama; %gen.spoken; %gen.dictionaries; 
  %gen.terminology; TEI...end)?'   >
<!ENTITY % component.plus '(%gen.verse; %gen.drama; %gen.spoken; 
  %gen.dictionaries; %gen.terminology; TEI...end)
  |
  ( ((%m.common;), (%m.Incl;)*)+, 
    (%gen.verse; %gen.drama; %gen.spoken; 
    %gen.dictionaries; %gen.terminology; TEI...end)?)'          >
<!-- (End of marked section for general base.)                -->
]]>

<![ %TEI.prose; [
<!ENTITY % component '(%m.common;)'                             >
<!ENTITY % TEI.singleBase 'INCLUDE'                             >
]]>

<![ %TEI.verse; [
<!ENTITY % component '(%m.common; | %m.comp.verse;)'            >
<!ENTITY % TEI.singleBase 'INCLUDE'                             >
]]>

<![ %TEI.drama; [
<!ENTITY % component '(%m.common; | %m.comp.drama;)'            >
<!ENTITY % TEI.singleBase 'INCLUDE'                             >
]]>

<![ %TEI.spoken; [
<!ENTITY % component '(%m.common; | %m.comp.spoken;)'           >
<!ENTITY % TEI.singleBase 'INCLUDE'                             >
]]>

<![ %TEI.dictionaries; [
<!ENTITY % component '(%m.common; | %m.comp.dictionaries;)'     >
<!ENTITY % TEI.singleBase 'INCLUDE'                             >
]]>

<![ %TEI.terminology; [
<!ENTITY % component '(%m.common; | %m.comp.terminology;)'      >
<!ENTITY % TEI.singleBase 'INCLUDE'                             >
]]>

<!-- Default declaration.                                     -->
<!ENTITY % component '(%m.common;)'                             >
<!ENTITY % TEI.singleBase 'INCLUDE'                             >


8.4.5 Changing paraContent

The parameter entity paraContent must be changed as follows:

< 49 New declaration for paraContent > =

<!ENTITY % paraContent '(#PCDATA | %m.phrase; | %m.inter;
| %m.Incl;)*'                                                   >


No change to actual content models is needed.

(Note, 14 May 1999.) No, wait, actually, that's not true. Many of these declarations read

<!ELEMENT %n.p;         - O  (%paraContent;)                    >

which, expanded, would be

<!ELEMENT %n.p;         - O  ((#PCDATA | %m.phrase; | %m.inter;
                             | %m.Incl;)*)                      >

which is illegal. The content models do need to be changed, to

<!ELEMENT %n.p;         - O  %paraContent;                      >

For now, though, we can rely on carthage to do the job, so I've deleted the long boring scraps that used to be here.

8.4.6 The problem of specialPara elements

In TEI P3, the entity specialPara is defined thus:

<!ENTITY % specialPara '(((%m.chunk), (%component.seq)) |
(%paraContent))'                                                >
It allows an element to contain either a series of chunks or the same content as a paragraph. It is intended for elements like notes and list items: the normal case, in which the item consists of a single paragraph, can be tagged simply (<item> ... </item>) and the multi-paragraph case can be accommodated using nested paragraphs or other chunk-level elements (<item><p> ... </p><p> ... </p></item>). In practice, the multi-paragraph form has proven very disconcerting to users, since it is not intuitively obvious that no white space may appear between the paragraphs.[8] The current definition and use of specialPara are thus acknowledged by the editors to be an error. Since there is no obvious solution, however, it is not a corrigible error.

In changing specialPara to meet the requirements of XML, there are three obvious possible solutions. We can overgenerate, so as to allow all existing data to remain valid:

<!ENTITY % specialPara '(#PCDATA | %m.phrase; | %m.inter;
| %m.chunk;)*' >
This has the drawback of allowing paragraphs and other chunk-level elements to float within character data, thus violating one of the few consistently followed rules of the TEI DTD.

Alternatively, we can bite the bullet and require that list items and notes which consist of a single paragraph be marked as such:

<!ENTITY % specialPara '%component.seq;' >
This has the advantage of being relatively clean, but it has the major disadvantage of requiring retagging for almost all current list items and notes. What is now tagged <item> ... </item> would have to be retagged <item><p> ... </p></item>. The best that can be said is that such retagging could in principle be automated.
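As a rough indication of what such automated retagging might look like, here is a minimal sketch which wraps the content of every simple list item or note in a <p>. It assumes the documents are already well-formed XML; the function name, and the short list of chunk-level elements used to recognize the multi-paragraph form, are assumptions of this sketch (the full class m.chunk is larger).

```python
import xml.etree.ElementTree as ET

# A subset of the chunk-level elements, for illustration only.
CHUNK_TAGS = {"p", "ab", "l", "lg", "sp"}


def wrap_simple_content(root, tags=("item", "note")):
    """Wrap the content of each simple <item> or <note> in a single <p>."""
    for element in list(root.iter()):
        if element.tag not in tags:
            continue
        if any(child.tag in CHUNK_TAGS for child in element):
            continue  # already in the multi-paragraph form
        if element.text is None and len(element) == 0:
            continue  # nothing to wrap
        paragraph = ET.Element("p")
        paragraph.text = element.text
        for child in list(element):
            element.remove(child)
            paragraph.append(child)  # the child's tail text moves with it
        element.text = None
        element.append(paragraph)
    return root


if __name__ == "__main__":
    doc = ET.fromstring(
        "<list><item>one <hi>two</hi> three</item>"
        "<item><p>a</p><p>b</p></item></list>"
    )
    print(ET.tostring(wrap_simple_content(doc), encoding="unicode"))
```

Items which already contain chunk-level children are left alone, so the sketch can be run more than once over the same document.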

A third approach would be to have distinct element types for simple list items and notes, and compound ones. The simple form could be defined as containing paraContent, and the compound ones as containing component.seq. This would also require retagging (of all compound list items and notes), but not as much as the previous approach.

For purposes of the experimental XML DTD, we take the first approach.

The following element types are defined as containing specialPara:

All but one of these can be fixed simply by redefining specialPara thus:

< 50 New specialPara > =

<!ENTITY % specialPara '(#PCDATA | %m.phrase; | %m.inter;
| %m.chunk; | %m.Incl;)*' >


In order to redefine specialPara, we must first reproduce a number of class declarations from teiclas2.ent:

< 51 Reproduce classes used by specPara > =

<!ENTITY % x.hqinter ''                                         >
<!ENTITY % m.hqinter '%x.hqinter; %n.cit; | %n.q; | %n.quote;'  >
<!ENTITY % x.bibl ''                                            >
<!ENTITY % m.bibl '%x.bibl; %n.bibl; | %n.biblFull; |
           %n.biblStruct;'                                      >
<!ENTITY % x.lists ''                                           >
<!ENTITY % m.lists '%x.lists; %n.label; | %n.list; |
           %n.listBibl;'                                        >
<!ENTITY % x.notes ''                                           >
<!ENTITY % m.notes '%x.notes; %n.note; | %n.witDetail;'         >
<!ENTITY % x.stageDirection ''                                  >
<!ENTITY % m.stageDirection '%x.stageDirection; %n.camera; |
           %n.caption; | %n.move; | %n.sound; | %n.tech; |
           %n.view;'                                            >

<!ENTITY % x.inter ''                                           >
<!ENTITY % m.inter '%x.inter; %m.bibl; | %m.hqinter; | %m.lists;
           | %m.notes; | %m.stageDirection; | %n.castList; |
           %n.figure; | %n.stage; | %n.table; | %n.text;'       >
<!ENTITY % x.chunk ''                                           >
<!ENTITY % m.chunk '%x.chunk; %n.ab; | %n.eTree; | %n.graph; | 
           %n.l; | 
           %n.lg; | %n.p; | %n.sp; | %n.tree; | %n.witList;'    >


The <ab> element is new, so we must first declare the parameter entity for its generic identifier:

< 52 Declare new GIs 23 (cont'd) > =

<!ENTITY % n.ab 'ab' >

Only one content model must be redefined by hand, to flatten the group: that of <set> in the drama tag set. The current definition is this:

<!ELEMENT set           - -  ((head)?, %specialPara;)           >
<!ATTLIST set                %a.global;
          TEIform            CDATA               'set'          >

If we flatten this in the expected way, we get this:

<!ELEMENT %n.set;       - -  (#PCDATA | %m.phrase; | %m.inter;
                             | %m.chunk; | %m.Incl;
                             | %n.head;)*                       >
<!ATTLIST %n.set;            %a.global;
          TEIform            CDATA               'set'          >

This has the unfortunate result of allowing any number of <head> elements at arbitrary locations; in this case it might be better to tighten the content model instead.[9] Version 2 of the new model is this:

< 53 New definition of set element > =

<![%TEI.drama;[
<!ENTITY % XML.set "INCLUDE" >
<![%XML.set;[
<!ELEMENT %n.set;       - -  ((%n.head;)?, %component.seq;)     >
<!ATTLIST %n.set;            %a.global;
          TEIform            CDATA               'set'          >
]]>
]]>


Version 2 is not strictly compatible with the old version: to be fully compatible we have to allow inclusions up front (Version 3):

<!ELEMENT %n.set;       - -  ((%m.Incl;)*, (%n.head;)?,
                             %component.seq;)                   >

For now, the experimental XML version of the DTD will use Version 2 of this declaration.
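
The difference between the three versions can be made concrete by treating each content model as a regular expression over the sequence of child elements. The Python sketch below is an illustration under stated assumptions, not part of the DTD: pb stands in for the members of m.Incl, p stands in for component and for the classes in the flattened group, character data is ignored, and component.seq is assumed to expand to a sequence of components each followed by optional inclusions.

```python
import re

# Each child element is written as its name followed by one space.
INCL = r"(?:pb )"                              # stand-in for %m.Incl;
HEAD = r"(?:head )"                            # %n.head;
COMPONENT = r"(?:p )"                          # stand-in for %component;
COMPONENT_SEQ = rf"(?:{COMPONENT}{INCL}*)*"    # assumed expansion

MODELS = {
    # flattened group: head may occur anywhere, any number of times
    "flattened": rf"(?:{COMPONENT}|{INCL}|{HEAD})*",
    # Version 2: optional head, then the component sequence
    "version2": rf"{HEAD}?{COMPONENT_SEQ}",
    # Version 3: as Version 2, but inclusions are allowed up front
    "version3": rf"{INCL}*{HEAD}?{COMPONENT_SEQ}",
}


def accepts(model, children):
    """Return True if the named model accepts this sequence of children."""
    text = "".join(name + " " for name in children)
    return re.fullmatch(MODELS[model], text) is not None


stray_heads = accepts("flattened", ["p", "head", "p", "head"])
v2_leading_incl = accepts("version2", ["pb", "head", "p"])
v3_leading_incl = accepts("version3", ["pb", "head", "p"])
```

Under these assumptions the flattened model accepts stray <head> elements, Version 2 rejects an inclusion before the <head>, and Version 3 accepts it.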

8.5 Elements requiring manual intervention

(Scraps suppressing and redeclaring the remaining elements to be supplied here.)

The elements to be treated here are: address, altgrp, analytic, app, argument, availability, back, bibl, biblfull, biblstruct, body, broadcast, byline, c, castgroup, castitem, castlist, cit, closer, dateline, datestruct, div, div0, div1, div2, div3, div4, div5, div6, div7, docimprint, doctitle, editionStmt, epilogue, equipment, etree, f, falt, figure, flib, formula, front, fs, fslib, fvlib, graph, group, imprint, interpgrp, joingrp, lg, lg1, lg2, lg3, lg4, lg5, linkgrp, list, listbibl, m, monogr, notesStmt, ofig, opener, ovar, performance, prologue, publicationStmt, pvar, rdggrp, recording, recordingStmt, respStmt, row, scriptStmt, series, seriesStmt, set, sourcedesc, sp, spangrp, table, termentry, text, tig, timeline, timestruct, titlepage, titleStmt, tree, triangle, u, valt, w, and witlist.

The following sections provide the DTD fragments necessary for suppressing the existing declarations for these elements and declaring them with new content models.

8.5.1 Core tag set

< 54 Suppress definitions in core tag set > =

<!ENTITY % address    'IGNORE' >
<!ENTITY % analytic   'IGNORE' >
<!ENTITY % bibl       'IGNORE' >
<!ENTITY % biblFull   'IGNORE' >
<!ENTITY % biblStruct 'IGNORE' >
<!ENTITY % cit        'IGNORE' >
<!ENTITY % imprint    'IGNORE' >
<!ENTITY % lg         'IGNORE' >
<!ENTITY % list       'IGNORE' >
<!ENTITY % listBibl   'IGNORE' >
<!ENTITY % monogr     'IGNORE' >
<!ENTITY % respStmt   'IGNORE' >
<!ENTITY % series     'IGNORE' >
<!ENTITY % sp         'IGNORE' >
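
The suppression mechanism relies on the rule that the first declaration of an entity is the binding one. The Python sketch below models that rule in miniature. It assumes, as an illustration, that the main DTD wraps each element declaration in a marked section keyed on a parameter entity named after the element, with a default value of INCLUDE declared after the extensions file has been read; the function names and the section labels are hypothetical.

```python
def bind_entities(declarations):
    """Bind each parameter entity to its first declaration; later ones lose."""
    bound = {}
    for name, value in declarations:
        bound.setdefault(name, value)
    return bound


def active_sections(entities, sections):
    """Return the bodies of marked sections whose keyword entity is INCLUDE."""
    return [body for keyword, body in sections
            if entities[keyword] == "INCLUDE"]


# Declarations in reading order: the extensions file comes first.
declarations = [
    ("address", "IGNORE"),         # extensions file: suppress the old one
    ("XML.address", "INCLUDE"),    # extensions file: enable the new one
    ("address", "INCLUDE"),        # assumed default in the main DTD
]
sections = [
    ("XML.address", "new declaration of address"),
    ("address", "old declaration of address"),
]

entities = bind_entities(declarations)
result = active_sections(entities, sections)
```

Because the IGNORE declaration is read first, the later default has no effect and only the new declaration survives.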


The existing declarations are these:

<!ELEMENT %n.address;   - O  ((%n.addrLine)+ | (%m.addrPart)*)  >
<!ELEMENT %n.analytic;  - O  (%n.author; | %n.editor; |
                             %n.respStmt; | %n.title;)*         >
<!ELEMENT %n.bibl;      - O  (#PCDATA | %m.phrase; |
                             %m.biblPart;)*                     >
<!ELEMENT %n.biblFull;  - O  (%n.titleStmt;, 
                             (%n.editionStmt)?,
                             (%n.extent)?, 
                             %n.publicationStmt;,
                             (%n.seriesStmt)?, 
                             (%n.notesStmt)?,
                             (%n.sourceDesc)*)                  >
<!ELEMENT %n.biblStruct;
                        - O  ((%n.analytic)?, 
                              (%n.monogr;,
                              (%n.series)*)+, 
                              (%n.note; | %n.idno;)*)           >
<!ELEMENT %n.cit;       - -  ((%n.q; | %n.quote;) 
                             & (%m.bibl; | %m.loc;))            >
<!ELEMENT %n.imprint;   - O  (%n.pubPlace; | %n.publisher;
                             | %n.date; | %n.biblScope;)*       >
<!ELEMENT %n.lg;        - O  ((%m.divtop)*, (%n.l; | %n.lg;)+,
                             (%m.divbot)*)                      >
<!ELEMENT %n.list;      - -  ( (%n.head)?,
                               ( ( (%n.item)+ )
                                 | ( (%n.headLabel)?,
                                     (%n.headItem)?,
                                     (%n.label;, %n.item;)+)))  >
<!ELEMENT %n.listBibl;  - -  ((%n.head)?, (%n.bibl; |
                             %n.biblStruct; | %n.biblFull;)+,
                             (%n.trailer)?)                     >
<!ELEMENT %n.monogr;    - O  ( ( ( (%n.author; | %n.editor; |
                                    %n.respStmt;)+, 
                                   (%n.title)+,
                                   (%n.editor; | %n.respStmt;)*) 
                                 |
                                 ( (%n.title)+, 
                                   (%n.author; | %n.editor; 
                                   | %n.respStmt;)*))?,
                               (%n.note; | %n.meeting;)*,
                               (%n.edition;, 
                                 (%n.editor; | %n.respStmt;)*)*, 
                               %n.imprint;,
                               (%n.imprint; | %n.extent; 
                                 | %n.biblScope;)* )            >
<!ELEMENT %n.respStmt;  - O  ((%n.resp; & %n.name;), 
                              (%n.resp; | %n.name;)*)           >
<!ELEMENT %n.series;    - O  (%n.title; | %n.editor; |
                             %n.respStmt; | %n.biblScope;)*     >
<!ELEMENT %n.sp;        - O  ((%n.speaker)?, (%n.p; | %n.l; |
                             %n.lg; | %n.seg; | %n.stage;)+)    >

The new definitions are these; note that <cit> and <respStmt> have already been declared above.

< 55 New definitions for core tag set > =

<!ENTITY % XML.address "INCLUDE" >
<![%XML.address;[
<!ELEMENT %n.address;   - O  ((%m.Incl;)*,
                             ( (%n.addrLine;, (%m.Incl;)*)+
                             | ((%m.addrPart;), (%m.Incl;)*)*)) >
<!ATTLIST %n.address;        %a.global;
          TEIform            CDATA               'address'      >
]]>


< 56 New definitions for core tag set 55 (cont'd) > =

<!ENTITY % XML.analytic "INCLUDE" >
<![%XML.analytic;[
<!ELEMENT %n.analytic;  - O  (%n.author; | %n.editor;
                             | %n.respStmt; | %n.title;
                             | %m.Incl;)*                       >
<!ATTLIST %n.analytic;       %a.global;
          TEIform            CDATA               'analytic'     >
]]>


< 57 New definitions for core tag set 55 (cont'd) > =

<!ENTITY % XML.bibl "INCLUDE" >
<![%XML.bibl;[
<!ELEMENT %n.bibl;      - O  (#PCDATA | %m.phrase; |
                             %m.biblPart; | %m.Incl;)*          >
<!ATTLIST %n.bibl;           %a.global;
                             %a.declarable;
          TEIform            CDATA               'bibl'         >
]]>


< 58 New definitions for core tag set 55 (cont'd) > =

<!ENTITY % XML.biblFull "INCLUDE" >
<![%XML.biblFull;[
<!ELEMENT %n.biblFull;  - O  ((%m.Incl;)*,
                             (%n.titleStmt;, (%m.Incl;)*),
                             (%n.editionStmt;, (%m.Incl;)*)?,
                             (%n.extent;, (%m.Incl;)*)?,
                             (%n.publicationStmt;, (%m.Incl;)*),
                             (%n.seriesStmt;, (%m.Incl;)*)?,
                             (%n.notesStmt;, (%m.Incl;)*)?,
                             (%n.sourceDesc;, (%m.Incl;)*)*
                             )                  >
<!ATTLIST %n.biblFull;       %a.global;
                             %a.declarable;
          TEIform            CDATA               'biblFull'     >
]]>


< 59 New definitions for core tag set 55 (cont'd) > =

<!ENTITY % XML.biblStruct "INCLUDE" >
<![%XML.biblStruct;[
<!ELEMENT %n.biblStruct;
                        - O  ((%m.Incl;)*,
                             (%n.analytic;, (%m.Incl;)*)?,
                             ( (%n.monogr;, (%m.Incl;)*),
                               (%n.series;, (%m.Incl;)*)* )+,
                             ( (%n.note; | %n.idno;),
                               (%m.Incl;)*)*) >

<!ATTLIST %n.biblStruct;     %a.global;
                             %a.declarable;
          TEIform            CDATA               'biblStruct'   >
]]>

<!-- cit has already been declared.                           -->


< 60 New definitions for core tag set 55 (cont'd) > =

<!ENTITY % XML.imprint "INCLUDE" >
<![%XML.imprint;[
<!ELEMENT %n.imprint;   - O  (%n.pubPlace; | %n.publisher;
                             | %n.date; | %n.biblScope;
                             | %m.Incl;)*                       >
<!ATTLIST %n.imprint;        %a.global;
          TEIform            CDATA               'imprint'      >
]]>


< 61 New definitions for core tag set 55 (cont'd) > =

<!ENTITY % XML.lg "INCLUDE" >
<![%XML.lg;[
<!ELEMENT %n.lg;        - O  ((%m.divtop; | %m.Incl;)*,
                             (%n.l; | %n.lg;),
                             (%n.l; | %n.lg; | %m.Incl;)*,
                             ((%m.divbot;), (%m.Incl;)*)*)         >
<!ATTLIST %n.lg;             %a.global;
                             %a.divn;
                             %a.metrical;
          TEIform            CDATA               'lg'           >
]]>


< 62 New definitions for core tag set 55 (cont'd) > =

<!ENTITY % XML.list "INCLUDE" >
<![%XML.list;[
<!ELEMENT %n.list;      - -  ((%m.Incl;)*,
                             (%n.head;, (%m.Incl;)*)?,
                             ( ((%n.item;, (%m.Incl;)*)*) |
                             ( (%n.headLabel;, (%m.Incl;)*)?,
                               (%n.headItem;, (%m.Incl;)*)?,
                               (%n.label;, (%m.Incl;)*,
                                %n.item;, (%m.Incl;)*)+)))         >
<!ATTLIST %n.list;           %a.global;
          type               CDATA               simple
          TEIform            CDATA               'list'         >
]]>


< 63 New definitions for core tag set 55 (cont'd) > =

<!ENTITY % XML.listBibl "INCLUDE" >
<![%XML.listBibl;[
<!ELEMENT %n.listBibl;  - -  ((%m.Incl;)*,
                             (%n.head;, (%m.Incl;)*)?,
                             (%n.bibl; | %n.biblStruct; 
                             | %n.biblFull;),
                             (%n.bibl; | %n.biblStruct;
                             | %n.biblFull; | %m.Incl;)*,
                             (%n.trailer;, (%m.Incl;)*)?)          >
<!ATTLIST %n.listBibl;       %a.global;
                             %a.declarable;
          TEIform            CDATA               'listBibl'     >
]]>
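
No proof is offered that rewritten models of this kind simulate inclusion exceptions correctly, but the claim can at least be tested exhaustively on short inputs. The Python sketch below does this for the <listBibl> model. It is a bounded check under simplifying assumptions, not a proof: pb stands in for the members of m.Incl, bibl stands in for all three bibliographic element types, and the SGML behaviour is modelled by ignoring included elements before matching the old model.

```python
import re
from itertools import product

ALPHABET = ["head", "bibl", "trailer", "pb"]    # pb stands in for %m.Incl;

# Old model: (head)?, (bibl)+, (trailer)?  with inclusions allowed anywhere
OLD = r"(?:head )?(?:bibl )+(?:trailer )?"
# New model, transcribed from the declaration above
NEW = (r"(?:pb )*(?:head (?:pb )*)?(?:bibl )(?:bibl |pb )*"
       r"(?:trailer (?:pb )*)?")


def encode(children):
    """Write each child name followed by one space."""
    return "".join(name + " " for name in children)


def old_accepts(children):
    """SGML reading: drop included elements, then match the old model."""
    core = [c for c in children if c != "pb"]
    return re.fullmatch(OLD, encode(core)) is not None


def new_accepts(children):
    """XML reading: match the rewritten model against the full sequence."""
    return re.fullmatch(NEW, encode(children)) is not None


def disagreements(max_len):
    """List every sequence up to max_len on which the two readings differ."""
    return [seq for n in range(max_len + 1)
            for seq in product(ALPHABET, repeat=n)
            if old_accepts(seq) != new_accepts(seq)]


mismatches = disagreements(5)
```

An empty list of mismatches shows only that the two readings agree on all sequences of up to five children drawn from this reduced alphabet.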


< 64 New definitions for core tag set 55 (cont'd) > =

<!ENTITY % XML.monogr "INCLUDE" >
<![%XML.monogr;[
<!ELEMENT %n.monogr;    - O  (
                             ((%m.Incl;)*,
                               (( 
                                 (%n.author; | %n.editor; | %n.respStmt;),
                                 (%n.author; | %n.editor;
                                   | %n.respStmt; | %m.Incl;)*,
                                 (%n.title;, (%m.Incl;)*)+,
                                 ((%n.editor; | %n.respStmt;), (%m.Incl;)*)*
                               )
                               |
                               ( 
                                 (%n.title;, (%m.Incl;)*)+,
                                 (
                                   (%n.author; | %n.editor; | %n.respStmt;),
                                   (%m.Incl;)*
                                 )*
                               ))
                             )?,
                             ((%n.note; | %n.meeting;), (%m.Incl;)*)*,
                             (%n.edition;, 
                              (%n.editor; | %n.respStmt; | %m.Incl;)*)*,
                             %n.imprint;, 
                             (%n.imprint; | %n.extent; |
                              %n.biblScope; | %m.Incl;)*
                             )                  >
<!ATTLIST %n.monogr;         %a.global;
          TEIform            CDATA               'monogr'       >
]]>

<!-- respStmt has already been declared                       -->


< 65 New definitions for core tag set 55 (cont'd) > =

<!ENTITY % XML.series "INCLUDE" >
<![%XML.series;[
<!ELEMENT %n.series;    - O  (%n.title; | %n.editor; |
                             %n.respStmt; | %n.biblScope;
                             | %m.Incl;)*                       >
<!ATTLIST %n.series;         %a.global;
          TEIform            CDATA               'series'       >
]]>


< 66 New definitions for core tag set 55 (cont'd) > =

<!ENTITY % XML.sp "INCLUDE" >
<![%XML.sp;[
<!ELEMENT %n.sp;        - O  ((%m.Incl;)*,
                             (%n.speaker;, (%m.Incl;)*)?,
                             ((%n.p; | %n.l; | %n.lg; | %n.seg; | %n.ab;
                             | %n.stage;), (%m.Incl;)*)+)       >
<!ATTLIST %n.sp;             %a.global;
          who                IDREFS              #IMPLIED
          TEIform            CDATA               'sp'           >
]]>


8.5.2 Basic text-structure tag set

< 67 Suppress definitions in text-structure tag set > =

<!ENTITY % argument 'IGNORE' >
<!ENTITY % back     'IGNORE' >
<!ENTITY % body     'IGNORE' >
<!ENTITY % byline   'IGNORE' >
<!ENTITY % closer   'IGNORE' >
<!ENTITY % dateline 'IGNORE' >
<!ENTITY % div      'IGNORE' >
<!ENTITY % div0     'IGNORE' >
<!ENTITY % div1     'IGNORE' >
<!ENTITY % div2     'IGNORE' >
<!ENTITY % div3     'IGNORE' >
<!ENTITY % div4     'IGNORE' >
<!ENTITY % div5     'IGNORE' >
<!ENTITY % div6     'IGNORE' >
<!ENTITY % div7     'IGNORE' >
<!ENTITY % group    'IGNORE' >
<!ENTITY % opener   'IGNORE' >
<!ENTITY % text     'IGNORE' >


The current definitions are these:

<!ELEMENT %n.argument;  - -  ((%n.head)?, %component.seq;)      >
<!ELEMENT %n.back;      - O  ( (%m.front)*, ( ( (%m.divtop),
                             (%m.divtop | %n.titlePage;)*) | (
                             (%n.div;), (%n.div; |
                             (%m.front))*) | ( (%n.div1;),
                             (%n.div1; | (%m.front))*) )? )     >
<!ELEMENT %n.body;      - O  ((%m.divtop;)*, ( ( (%n.divGen)*,
                             ( (%n.div;, (%n.div; |
                             %n.divGen;)*) | (%n.div0;,
                             (%n.div0; | %n.divGen;)*) |
                             (%n.div1;, (%n.div1; |
                             %n.divGen;)*) ) ) | (
                             (%component)+, ((%n.divGen)*, (
                             (%n.div;, (%n.div; | %n.divGen;)*)
                             | (%n.div0;, (%n.div0; |
                             %n.divGen;)*) | (%n.div1;,
                             (%n.div1; | %n.divGen;)*) )? ))),
                             (%m.divbot;)*)                     >
<!ELEMENT %n.byline;    - O  (%phrase.seq; | %n.docAuthor;)*    >
<!ELEMENT %n.closer;    - O  (%n.signed; | %n.dateline; |
                             %n.salute; | %phrase.seq;)*        >
<!ELEMENT %n.dateline;  - O  (%n.date; | %n.time; | %n.name; |
                             #PCDATA | %n.address;)*            >
<!ELEMENT %n.div;       - O  ((%m.divtop;)*, ((%n.div; |
                             %n.divGen;)+ | ((%component;)+,
                             (%n.div; | %n.divGen;)*)),
                             (%m.divbot;)*)                     >
<!ELEMENT %n.div0;      - O  ((%m.divtop;)*, ( (%n.div1; |
                             %n.divGen;)+ | ( (%component;)+,
                             (%n.div1; | %n.divGen;)*)),
                             (%m.divbot;)*)                     >
<!ELEMENT %n.div1;      - O  ((%m.divtop;)*, ( (%n.div2; |
                             %n.divGen;)+ | ((%component;)+,
                             (%n.div2; | %n.divGen;)*)),
                             (%m.divbot;)*)                     >
<!ELEMENT %n.div2;      - O  ((%m.divtop;)*, ( (%n.div3; |
                             %n.divGen;)+ | ((%component;)+,
                             (%n.div3; | %n.divGen;)*)),
                             (%m.divbot;)*)                     >
<!ELEMENT %n.div3;      - O  ((%m.divtop;)*, ( (%n.div4; |
                             %n.divGen;)+ | ((%component;)+,
                             (%n.div4; | %n.divGen;)*)),
                             (%m.divbot;)*)                     >
<!ELEMENT %n.div4;      - O  ((%m.divtop;)*, ( (%n.div5; |
                             %n.divGen;)+ | ((%component;)+,
                             (%n.div5; | %n.divGen;)*)),
                             (%m.divbot;)*)                     >
<!ELEMENT %n.div5;      - O  ((%m.divtop;)*, ( (%n.div6; |
                             %n.divGen;)+ | ((%component;)+,
                             (%n.div6; | %n.divGen;)*)),
                             (%m.divbot;)*)                     >
<!ELEMENT %n.div6;      - O  ((%m.divtop;)*, ((%n.div7; |
                             %n.divGen;)+ | ((%component;)+,
                             (%n.div7; | %n.divGen;)*)),
                             (%m.divbot;)*)                     >
<!ELEMENT %n.div7;      - O  ((%m.divtop;)*, (%component;)+,
                             (%m.divbot;)*)                     >
<!ELEMENT %n.group;     - O  ((%m.divtop;)*, (%n.text; |
                             %n.group;)+, (%m.divbot;)*)        >
<!ELEMENT %n.opener;    - O  (%n.signed; | %n.dateline; |
                             %n.salute; | %phrase.seq;)*        >
<!ELEMENT %n.text;      - -  ((%n.front)?, (%n.body; |
                             %n.group;), (%n.back)?)
                                                +(%m.globincl;) >

The new definitions are as follows:

< 68 New definitions for text-structure tag set > =

<!ENTITY % XML.argument "INCLUDE" >
<![%XML.argument;[
<!ELEMENT %n.argument;  - -  ((%m.Incl;)*, (%n.head;)?,
                             %component.seq;)                   >
<!ATTLIST %n.argument;       %a.global;
          TEIform            CDATA               'argument'     >
]]>


< 69 New definitions for text-structure tag set 68 (cont'd) > =

<!ENTITY % XML.back "INCLUDE" >
<![%XML.back;[
<!ELEMENT %n.back;      - O
                             ( (%m.front; | %m.Incl;)*,
                               ( ( (%m.divtop;),
                                   (%m.divtop; | %n.titlePage;
                                   | %m.Incl;)*)
                                 |
                                 ( (%n.div;),
                                   (%n.div; | %m.front; | %m.Incl;)*)
                                 |
                                 ( (%n.div1;),
                                   (%n.div1; | %m.front; | %m.Incl;)*)
                               )?
                             )                                  >
<!ATTLIST %n.back;           %a.global;
                             %a.declaring;
          TEIform            CDATA               'back'         >
]]>


< 70 New definitions for text-structure tag set 68 (cont'd) > =

<!ENTITY % XML.body "INCLUDE" >
<![%XML.body;[
<!ELEMENT %n.body;      - O  (
                               (%m.divtop; | %m.Incl;)*,
                               ( 
                                 (
                                   ((%component;), (%m.Incl;)*)+,
                                   ((%n.divGen;, (%m.Incl;)*)*,
                                     ( (%n.div;,
                                         (%n.div; | %n.divGen; | %m.Incl;)*)
                                       |
                                       (%n.div0;,
                                         (%n.div0; | %n.divGen; | %m.Incl;)*)
                                       |
                                       (%n.div1;,
                                         (%n.div1; | %n.divGen; | %m.Incl;)*)
                                     )?
                                   )
                                 )
                               |
                                 ( (%n.divGen;, (%m.Incl;)*)*,
                                   ( (%n.div;,
                                      (%n.div; | %n.divGen; | %m.Incl;)*)
                                     |
                                     (%n.div0;,
                                       (%n.div0; | %n.divGen; | %m.Incl;)*)
                                     |
                                     (%n.div1;,
                                       (%n.div1; | %n.divGen; | %m.Incl;)*)
                                   )
                                 )
                               ),
                               ((%m.divbot;), (%m.Incl;)*)*
                             )                                  >
<!ATTLIST %n.body;           %a.global;
                             %a.declaring;
          TEIform            CDATA               'body'         >
]]>


< 71 New definitions for text-structure tag set 68 (cont'd) > =

<!--* byline, closer, and dateline have already been done *-->

<!ENTITY % XML.div "INCLUDE" >
<![%XML.div;[
<!ELEMENT %n.div;       - O  (
                               (%m.divtop; | %m.Incl;)*,
                               ( ((%n.div; | %n.divGen;), (%m.Incl;)*)+
                                 |
                                 ( (%component;, (%m.Incl;)*)+,
                                   ((%n.div; | %n.divGen;), (%m.Incl;)*)*)
                               ),
                               ((%m.divbot;), (%m.Incl;)*)*)    >
<!ATTLIST %n.div;            %a.global;
                             %a.declaring;
                             %a.divn;
          TEIform            CDATA               'div'          >
]]>


< 72 New definitions for text-structure tag set 68 (cont'd) > =

<!ENTITY % XML.div0 "INCLUDE" >
<![%XML.div0;[
<!ELEMENT %n.div0;      - O  (
                               (%m.divtop; | %m.Incl;)*,
                               ( ((%n.div1; | %n.divGen;), (%m.Incl;)*)+
                                 |
                                 ( (%component;, (%m.Incl;)*)+,
                                   ((%n.div1; | %n.divGen;), (%m.Incl;)*)*)
                               ),
                               ((%m.divbot;), (%m.Incl;)*)*)    >
<!ATTLIST %n.div0;           %a.global;
                             %a.declaring;
                             %a.divn;
          TEIform            CDATA               'div0'         >
]]>


< 73 New definitions for text-structure tag set 68 (cont'd) > =

<!ENTITY % XML.div1 "INCLUDE" >
<![%XML.div1;[
<!ELEMENT %n.div1;      - O  (
                               (%m.divtop; | %m.Incl;)*,
                               ( ((%n.div2; | %n.divGen;), (%m.Incl;)*)+
                                 |
                                 ( (%component;, (%m.Incl;)*)+,
                                   ((%n.div2; | %n.divGen;), (%m.Incl;)*)*)
                               ),
                               ((%m.divbot;), (%m.Incl;)*)*)    >
<!ATTLIST %n.div1;           %a.global;
                             %a.declaring;
                             %a.divn;
          TEIform            CDATA               'div1'         >
]]>


< 74 New definitions for text-structure tag set 68 (cont'd) > =

<!ENTITY % XML.div2 "INCLUDE" >
<![%XML.div2;[
<!ELEMENT %n.div2;      - O  (
                               (%m.divtop; | %m.Incl;)*,
                               ( ((%n.div3; | %n.divGen;), (%m.Incl;)*)+
                                 |
                                 ( (%component;, (%m.Incl;)*)+,
                                   ((%n.div3; | %n.divGen;), (%m.Incl;)*)*)
                               ),
                               ((%m.divbot;), (%m.Incl;)*)*)    >
<!ATTLIST %n.div2;           %a.global;
                             %a.declaring;
                             %a.divn;
          TEIform            CDATA               'div2'         >
]]>


< 75 New definitions for text-structure tag set 68 (cont'd) > =

<!ENTITY % XML.div3 "INCLUDE" >
<![%XML.div3;[
<!ELEMENT %n.div3;      - O  (
                               (%m.divtop; | %m.Incl;)*,
                               ( ((%n.div4; | %n.divGen;), (%m.Incl;)*)+
                                 |
                                 ( (%component;, (%m.Incl;)*)+,
                                   ((%n.div4; | %n.divGen;), (%m.Incl;)*)*)
                               ),
                               ((%m.divbot;), (%m.Incl;)*)*)    >
<!ATTLIST %n.div3;           %a.global;
                             %a.declaring;
                             %a.divn;
          TEIform            CDATA               'div3'         >
]]>


< 76 New definitions for text-structure tag set 68 (cont'd) > =

<!ENTITY % XML.div4 "INCLUDE" >
<![%XML.div4;[
<!ELEMENT %n.div4;      - O  (
                               (%m.divtop; | %m.Incl;)*,
                               ( ((%n.div5; | %n.divGen;), (%m.Incl;)*)+
                                 |
                                 ( (%component;, (%m.Incl;)*)+,
                                   ((%n.div5; | %n.divGen;), (%m.Incl;)*)*)
                               ),
                               ((%m.divbot;), (%m.Incl;)*)*)    >
<!ATTLIST %n.div4;           %a.global;
                             %a.declaring;
                             %a.divn;
          TEIform            CDATA               'div4'         >
]]>


< 77 New definitions for text-structure tag set 68 (cont'd) > =

<!ENTITY % XML.div5 "INCLUDE" >
<![%XML.div5;[
<!ELEMENT %n.div5;      - O  (
                               (%m.divtop; | %m.Incl;)*,
                               ( ((%n.div6; | %n.divGen;), (%m.Incl;)*)+
                                 |
                                 ( (%component;, (%m.Incl;)*)+,
                                   ((%n.div6; | %n.divGen;), (%m.Incl;)*)*)
                               ),
                               ((%m.divbot;), (%m.Incl;)*)*)    >
<!ATTLIST %n.div5;           %a.global;
                             %a.declaring;
                             %a.divn;
          TEIform            CDATA               'div5'         >
]]>


< 78 New definitions for text-structure tag set 68 (cont'd) > =

<!ENTITY % XML.div6 "INCLUDE" >
<![%XML.div6;[
<!ELEMENT %n.div6;      - O  (
                               (%m.divtop; | %m.Incl;)*,
                               ( ((%n.div7; | %n.divGen;), (%m.Incl;)*)+
                                 |
                                 ( (%component;, (%m.Incl;)*)+,
                                   ((%n.div7; | %n.divGen;), (%m.Incl;)*)*)
                               ),
                               ((%m.divbot;), (%m.Incl;)*)*)    >
<!ATTLIST %n.div6;           %a.global;
                             %a.declaring;
                             %a.divn;
          TEIform            CDATA               'div6'         >
]]>


< 79 New definitions for text-structure tag set 68 (cont'd) > =

<!ENTITY % XML.div7 "INCLUDE" >
<![%XML.div7;[
<!ELEMENT %n.div7;      - O  (
                               (%m.divtop; | %m.Incl;)*,
                               (%component;, (%m.Incl;)*)+,
                               ((%m.divbot;), (%m.Incl;)*)*)    >
<!ATTLIST %n.div7;           %a.global;
                             %a.declaring;
                             %a.divn;
          TEIform            CDATA               'div7'         >
]]>


< 80 New definitions for text-structure tag set 68 (cont'd) > =

<!ENTITY % XML.group "INCLUDE" >
<![%XML.group;[
<!ELEMENT %n.group;     - O  ((%m.divtop; | %m.Incl;)*,
                             ((%n.text; | %n.group;),
                              (%n.text; | %n.group; | %m.Incl;)*),
                             ((%m.divbot;), (%m.Incl;)*)*)      >
<!ATTLIST %n.group;          %a.global;
                             %a.declaring;
          TEIform            CDATA               'group'        >
]]>

<!--* opener has already been done *-->


< 81 New definitions for text-structure tag set 68 (cont'd) > =

<!ENTITY % XML.text "INCLUDE" >
<![%XML.text;[
<!ELEMENT %n.text;      - -  ((%m.Incl;)*,
                             (%n.front;, (%m.Incl;)*)?,
                             (%n.body; | %n.group;),
                             (%m.Incl;)*,
                             (%n.back;, (%m.Incl;)*)?)
                                                                >
<!ATTLIST %n.text;           %a.global;
                             %a.declaring;
          TEIform            CDATA               'text'         >
]]>


8.5.3 Front-matter tag set

< 82 Suppress definitions in front-matter tag set > =

<!--* docimprint has already been suppressed and redefined *-->
<!ENTITY % docTitle        'IGNORE' >
<!ENTITY % front           'IGNORE' >
<!ENTITY % titlePage       'IGNORE' >


The existing declarations are these:

<!ELEMENT %n.front;     - O  ( (%m.front;)*, ( ( (%m.divtop;),
                             (%m.divtop; | %n.titlePage;)*) | (
                             (%n.div;), (%n.div; | (%m.front;)
                             )*) | ( (%n.div1;), (%n.div1; |
                             (%m.front;) )*) )? )               >
<!ELEMENT %n.titlePage; - O  (%m.tpParts;)+                     >
<!ELEMENT %n.docTitle;  - O  ((%n.titlePart)+)                  >

The new definitions are these. In the definition of <front>, the class m.fmchunk now takes the place of m.divtop.

< 83 New definitions for front-matter tag set > =

<!ENTITY % XML.front "INCLUDE" >
<![%XML.front;[
<!ELEMENT %n.front;     - O
                             ( (%m.front; | %m.Incl;)*,
                               ( ( (%m.fmchunk;),
                                   (%m.fmchunk; | %n.titlePage; | %m.Incl;)*)
                                 | ( (%n.div;),
                                     (%n.div; | %m.front; | %m.Incl;)*)
                                 | ( (%n.div1;),
                                     (%n.div1; | %m.front; | %m.Incl;)*)
                               )?
                             )                                  >
<!ATTLIST %n.front;          %a.global;
                             %a.declaring;
          TEIform            CDATA               'front'        >
]]>


< 84 New definitions for front-matter tag set 83 (cont'd) > =

<!ENTITY % XML.titlePage "INCLUDE" >
<![%XML.titlePage;[
<!ELEMENT %n.titlePage; - O  ((%m.Incl;)*,
                             (%m.tpParts;),
                             (%m.tpParts; | %m.Incl;)*)         >
<!ATTLIST %n.titlePage;      %a.global;
          type               CDATA               #IMPLIED
          TEIform            CDATA               'titlePage'    >
]]>


< 85 New definitions for front-matter tag set 83 (cont'd) > =

<!ENTITY % XML.docTitle "INCLUDE" >
<![%XML.docTitle;[
<!ELEMENT %n.docTitle;  - O  ((%m.Incl;)*, 
                                (%n.titlePart;, (%m.Incl;)*)+)  >
<!ATTLIST %n.docTitle;       %a.global;
          TEIform            CDATA               'docTitle'     >
]]>


8.5.4 Header tag set

< 86 Suppress definitions in header tag set > =

<!ENTITY % availability     'IGNORE' >
<!ENTITY % broadcast        'IGNORE' >
<!ENTITY % editionStmt      'IGNORE' >
<!ENTITY % equipment        'IGNORE' >
<!ENTITY % notesStmt        'IGNORE' >
<!--     % publicationStmt  is already replaced -->
<!ENTITY % recording        'IGNORE' >
<!ENTITY % recordingStmt    'IGNORE' >
<!ENTITY % scriptStmt       'IGNORE' >
<!ENTITY % seriesStmt       'IGNORE' >
<!ENTITY % sourceDesc       'IGNORE' >
<!ENTITY % titleStmt        'IGNORE' >


The current definitions are these:

<!ELEMENT %n.availability;
                        - O  ((%n.p;)+)                         >
<!ELEMENT %n.broadcast; - -  ((%n.p)+ | %n.bibl; |
                             %n.biblStruct; | %n.biblFull; |
                             %n.recording;)                     >
<!ELEMENT %n.editionStmt;
                        - O  ( (%n.edition;, (%n.respStmt)*) |
                             (%n.p;)+ )                         >
<!ELEMENT %n.equipment; - O  ((%n.p;)+)                         >

<!ELEMENT %n.notesStmt; - O  ((%n.note)+)                       >
<!ELEMENT %n.recording; - -  ((%n.p)+ | (%n.respStmt; |
                             %n.equipment; | %n.broadcast; |
                             %n.date;)*)                        >
<!ELEMENT %n.recordingStmt;
                        - -  ((%n.p)+ | (%n.recording)+ )       >
<!ELEMENT %n.scriptStmt;
                        - -  ((%n.p)+ | %n.bibl; | %n.biblFull;
                             | %n.biblStruct;)                  >
<!ELEMENT %n.seriesStmt;
                        - O  ( (%n.title;, (%n.idno; |
                             %n.respStmt;)*) | (%n.p)+ )        >
<!ELEMENT %n.sourceDesc;
                        - -  (%n.p; | %n.bibl; | %n.biblFull; |
                             %n.biblStruct; | %n.listBibl; |
                             %n.scriptStmt; |
                             %n.recordingStmt;)+                >
<!ELEMENT %n.titleStmt; - O  (((%n.title)+, (%n.author; |
                             %n.editor; | %n.sponsor; |
                             %n.funder; | %n.principal; |
                             %n.respStmt;)*))                   >

The new definitions are as follows. We've changed the language for some element types, in parallel with changes to TEI P3:

< 87 New definitions for header tag set > =

<!ENTITY % XML.availability "INCLUDE" >
<![%XML.availability;[
<!ELEMENT %n.availability;
                        - O  (%n.p; | %m.Incl;)*                >
<!ATTLIST %n.availability;   %a.global;
          status             (free | unknown | restricted)
                                                 #IMPLIED
          TEIform            CDATA               'availability' >
]]>


< 88 New definitions for header tag set 87 (cont'd) > =

<!ENTITY % XML.broadcast "INCLUDE" >
<![%XML.broadcast;[
<!ELEMENT %n.broadcast; - -  ((%m.Incl;)*, ((%n.p;, (%m.Incl;)*)+ 
                             | ((%n.bibl; |
                             %n.biblStruct; | %n.biblFull; |
                             %n.recording;), (%m.Incl;)*)))     >
<!ATTLIST %n.broadcast;      %a.global;
                             %a.declarable;
          TEIform            CDATA               'broadcast'    >
]]>


< 89 New definitions for header tag set 87 (cont'd) > =

<!ENTITY % XML.editionStmt "INCLUDE" >
<![%XML.editionStmt;[
<!ELEMENT %n.editionStmt;
                        - O  ((%m.Incl;)*, ((%n.edition;, 
                                (%n.respStmt; | %m.Incl;)*) 
                                | (%n.p;, (%m.Incl;)*)+) 
                             )                          >
<!ATTLIST %n.editionStmt;    %a.global;
          TEIform            CDATA               'editionStmt'  >
]]>


< 90 New definitions for header tag set 87 (cont'd) > =

<!ENTITY % XML.equipment "INCLUDE" >
<![%XML.equipment;[
<!ELEMENT %n.equipment; - O  ((%m.Incl;)*, 
                                 (%n.p;, (%m.Incl;)*)+)         >
<!ATTLIST %n.equipment;      %a.global;
                             %a.declarable;
          TEIform            CDATA               'equipment'    >
]]>


< 91 New definitions for header tag set 87 (cont'd) > =

<!ENTITY % XML.notesStmt "INCLUDE" >
<![%XML.notesStmt;[
<!ELEMENT %n.notesStmt; - O  ((%m.Incl;)*, 
                                (%n.note;, (%m.Incl;)*)+)       >
<!ATTLIST %n.notesStmt;      %a.global;
          TEIform            CDATA               'notesStmt'    >
]]>


< 92 New definitions for header tag set 87 (cont'd) > =

<!ENTITY % XML.recording "INCLUDE" >
<![%XML.recording;[
<!ELEMENT %n.recording; - -  ((%m.Incl;)*,
                             ((%n.p;, (%m.Incl;)*)+
                             | ((%n.respStmt; |
                             %n.equipment; | %n.broadcast; |
                             %n.date;), (%m.Incl;)*)*))         >
<!ATTLIST %n.recording;      %a.global;
                             %a.declarable;
          type               (audio | video)     audio
          dur                CDATA               #IMPLIED
          TEIform            CDATA               'recording'    >
]]>


< 93 New definitions for header tag set 87 (cont'd) > =

<!ENTITY % XML.recordingStmt "INCLUDE" >
<![%XML.recordingStmt;[
<!ELEMENT %n.recordingStmt;
                        - -  ((%m.Incl;)*, ((%n.p;, (%m.Incl;)*)+ 
                              | (%n.recording;, (%m.Incl;)*)+ ))>
<!ATTLIST %n.recordingStmt;  %a.global;
          TEIform            CDATA               'recordingStmt'>
]]>


< 94 New definitions for header tag set 87 (cont'd) > =

<!ENTITY % XML.scriptStmt "INCLUDE" >
<![%XML.scriptStmt;[
<!ELEMENT %n.scriptStmt;
                        - -  ((%m.Incl;)*, ((%n.p;, (%m.Incl;)*)+ 
                             | ((%n.bibl; |
                             %n.biblStruct; | %n.biblFull;), 
                             (%m.Incl;)*)))                     >
<!ATTLIST %n.scriptStmt;     %a.global;
                             %a.declarable;
          TEIform            CDATA               'scriptStmt'   >
]]>


< 95 New definitions for header tag set 87 (cont'd) > =

<!ENTITY % XML.seriesStmt "INCLUDE" >
<![%XML.seriesStmt;[
<!ELEMENT %n.seriesStmt;
                        - O  ((%m.Incl;)*,  
                              ((%n.title;, 
                                 (%n.idno; | %n.respStmt; | %m.Incl;)*
                               )
                               | 
                               (%n.p;, (%m.Incl;)*)+) 
                             )        >
<!ATTLIST %n.seriesStmt;     %a.global;
          TEIform            CDATA               'seriesStmt'   >
]]>


< 96 New definitions for header tag set 87 (cont'd) > =

<!ENTITY % XML.sourceDesc "INCLUDE" >
<![%XML.sourceDesc;[
<!ELEMENT %n.sourceDesc;
                        - -  ((%m.Incl;)*, ((%n.p; | %n.bibl; 
                             | %n.biblFull; |
                             %n.biblStruct; | %n.listBibl; |
                             %n.scriptStmt; |
                             %n.recordingStmt;), (%m.Incl;)*)+) >
<!ATTLIST %n.sourceDesc;     %a.global;
                             %a.declarable;
          TEIform            CDATA               'sourceDesc'   >
]]>


< 97 New definitions for header tag set 87 (cont'd) > =

<!ENTITY % XML.titleStmt "INCLUDE" >
<![%XML.titleStmt;[
<!ELEMENT %n.titleStmt; - O  ( (%m.Incl;)*,
                                  (%n.title;, (%m.Incl;)*)+, 
                                  ( (%n.author; 
                                    | %n.editor; 
                                    | %n.sponsor; 
                                    | %n.funder; 
                                    | %n.principal; 
                                    | %n.respStmt;), 
                                    (%m.Incl;)*)*
                                )                               >
<!ATTLIST %n.titleStmt;      %a.global;
          TEIform            CDATA               'titleStmt'    >
]]>


8.5.5 Verse tag set

< 98 Suppress definitions in verse tag set > =

<!ENTITY % lg1              'IGNORE' >
<!ENTITY % lg2              'IGNORE' >
<!ENTITY % lg3              'IGNORE' >
<!ENTITY % lg4              'IGNORE' >
<!ENTITY % lg5              'IGNORE' >


The current definitions are these:

<!ELEMENT %n.lg1;       - O  ((%n.head)?, (%n.l; | %n.lg2;)+)   >
<!ELEMENT %n.lg2;       - O  ((%n.head)?, (%n.l; | %n.lg3;)+)   >
<!ELEMENT %n.lg3;       - O  ((%n.head)?, (%n.l; | %n.lg4;)+)   >
<!ELEMENT %n.lg4;       - O  ((%n.head)?, (%n.l; | %n.lg5;)+)   >
<!ELEMENT %n.lg5;       - O  ((%n.head)?, (%n.l)+)              >

The new definitions are as follows:

< 99 New definitions for verse tag set > =

<![%TEI.verse;[
<!ENTITY % XML.lg1 "INCLUDE" >
<![%XML.lg1;[
<!ELEMENT %n.lg1;       - O  ((%m.Incl;)*, 
                                (%n.head;, (%m.Incl;)*)?, 
                                ((%n.l; | %n.lg2;), 
                                (%m.Incl;)*)+)                  >
<!ATTLIST %n.lg1;            %a.global;
                             %a.divn;
                             %a.metrical;
          TEIform            CDATA               'lg1'          >
]]>


< 100 New definitions for verse tag set 99 (cont'd) > =

<!ENTITY % XML.lg2 "INCLUDE" >
<![%XML.lg2;[
<!ELEMENT %n.lg2;       - O  ((%m.Incl;)*, 
                             (%n.head;, (%m.Incl;)*)?, 
                             ((%n.l; | %n.lg3;), (%m.Incl;)*)+) >
<!ATTLIST %n.lg2;            %a.global;
                             %a.divn;
                             %a.metrical;
          TEIform            CDATA               'lg2'          >
]]>


< 101 New definitions for verse tag set 99 (cont'd) > =

<!ENTITY % XML.lg3 "INCLUDE" >
<![%XML.lg3;[
<!ELEMENT %n.lg3;       - O  ((%m.Incl;)*, 
                             (%n.head;, (%m.Incl;)*)?, 
                             ((%n.l; | %n.lg4;), (%m.Incl;)*)+) >
<!ATTLIST %n.lg3;            %a.global;
                             %a.divn;
                             %a.metrical;
          TEIform            CDATA               'lg3'          >
]]>


< 102 New definitions for verse tag set 99 (cont'd) > =

<!ENTITY % XML.lg4 "INCLUDE" >
<![%XML.lg4;[
<!ELEMENT %n.lg4;       - O  ((%m.Incl;)*, 
                             (%n.head;, (%m.Incl;)*)?, 
                             ((%n.l; | %n.lg5;), (%m.Incl;)*)+) >
<!ATTLIST %n.lg4;            %a.global;
                             %a.divn;
                             %a.metrical;
          TEIform            CDATA               'lg4'          >
]]>


< 103 New definitions for verse tag set 99 (cont'd) > =

<!ENTITY % XML.lg5 "INCLUDE" >
<![%XML.lg5;[
<!ELEMENT %n.lg5;       - O  ((%m.Incl;)*, 
                             (%n.head;, (%m.Incl;)*)?,
                             (%n.l;, (%m.Incl;)*)+)             >
<!ATTLIST %n.lg5;            %a.global;
                             %a.divn;
                             %a.metrical;
          TEIform            CDATA               'lg5'          >
]]>
]]>
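
The five line-group models above are all produced by the same transformation, which may be easier to see on a reduced example. The sketch below uses hypothetical names (demo.lg, demo.head, demo.l, demo.sub, and a stand-in class demo.Incl with the invented members incl1 and incl2). It illustrates the pattern visible in the scraps and is not a declaration from the TEI.

```dtd
<!-- Stand-in for m.Incl; incl1 and incl2 are invented names. -->
<!ENTITY % demo.Incl    "incl1 | incl2" >

<!-- Before:  ((demo.head)?, (demo.l | demo.sub)+)
     with incl1 and incl2 admitted anywhere by an inclusion
     exception declared on some ancestor.                        -->

<!-- After: one (%demo.Incl;)* opens the model, and each
     original token is followed by another.                      -->
<!ELEMENT demo.lg       - O  ((%demo.Incl;)*,
                             (demo.head, (%demo.Incl;)*)?,
                             ((demo.l | demo.sub),
                             (%demo.Incl;)*)+)                  >
```

The occurrence indicators of the original tokens (? and +) move to the enclosing groups. Because no member of the class is also named explicitly in the model, a start-tag can match only one token at any point.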


8.5.6 Drama tag set

< 104 Suppress definitions in drama tag set > =

<!ENTITY % castGroup        'IGNORE' >
<!--       castItem has been done already -->
<!ENTITY % castList         'IGNORE' >
<!ENTITY % epilogue         'IGNORE' >
<!ENTITY % performance      'IGNORE' >
<!ENTITY % prologue         'IGNORE' >
<!ENTITY % set              'IGNORE' >


The current definitions are these:

<!ELEMENT %n.castGroup; - -  (
                                (%n.head;)?, 
                                (%n.castItem; | %n.castGroup;)+, 
                                (%n.trailer;)?)    >
<!ELEMENT %n.castItem;  - O  (%n.role; | %n.roleDesc; |
                             %n.actor; | (%phrase.seq))*        >
<!ELEMENT %n.castList;  - -  (  (%m.divtop;)*, 
                                (%component;)*,
                                (%n.castItem; | %n.castGroup;)+,
                                (%component;)*)                 >
<!ELEMENT %n.epilogue;  - -  ((%m.divtop)*, (%component)+,
                             (%m.divbot)*)                      >
<!ELEMENT %n.performance;
                        - -  ((%m.divtop)*, (%component)+,
                             (%m.divbot)*)                      >
<!ELEMENT %n.prologue;  - -  ((%m.divtop)*, (%component)+,
                             (%m.divbot)*)                      >
<!ELEMENT %n.set;       - -  ((%n.head)?, %specialPara)         >

The new definitions are as follows:

< 105 New definitions for drama tag set > =

<![%TEI.drama;[
<!ENTITY % XML.castGroup "INCLUDE" >
<![%XML.castGroup;[
<!ELEMENT %n.castGroup; - -  ((%m.Incl;)*, (%n.head;, (%m.Incl;)*)?, 
                                ((%n.castItem; |
                                %n.castGroup;), (%m.Incl;)*)+, 
                                (%n.trailer;, (%m.Incl;)*)?)    >
<!ATTLIST %n.castGroup;      %a.global;
          TEIform            CDATA               'castGroup'    >
]]>
<!--* castItem has been done elsewhere *-->


< 106 New definitions for drama tag set 105 (cont'd) > =

<!ENTITY % XML.castList "INCLUDE" >
<![%XML.castList;[
<!ELEMENT %n.castList;  - -  (
                                (%m.divtop; | %m.Incl;)*, 
                                ((%component;), (%m.Incl;)*)*,
                                ((%n.castItem; | %n.castGroup;), 
                                (%m.Incl;)*)+,
                                ((%component;), (%m.Incl;)*)*)  >
<!ATTLIST %n.castList;       %a.global;
          TEIform            CDATA               'castList'     >
]]>


< 107 New definitions for drama tag set 105 (cont'd) > =

<!ENTITY % XML.epilogue "INCLUDE" >
<![%XML.epilogue;[
<!ELEMENT %n.epilogue;  - -  ((%m.divtop; | %m.Incl;)*, 
                             ((%component;), (%m.Incl;)*)+,
                             ((%m.divbot;), (%m.Incl;)*)*)      >
<!ATTLIST %n.epilogue;       %a.global;
          TEIform            CDATA               'epilogue'     >
]]>


< 108 New definitions for drama tag set 105 (cont'd) > =

<!ENTITY % XML.performance "INCLUDE" >
<![%XML.performance;[
<!ELEMENT %n.performance;
                        - -  ((%m.divtop; | %m.Incl;)*, 
                             ((%component;), (%m.Incl;)*)+,
                             ((%m.divbot;), (%m.Incl;)*)*)      >
<!ATTLIST %n.performance;    %a.global;
          TEIform            CDATA               'performance'  >
]]>


< 109 New definitions for drama tag set 105 (cont'd) > =

<!ENTITY % XML.prologue "INCLUDE" >
<![%XML.prologue;[
<!ELEMENT %n.prologue;  - -  ((%m.divtop; | %m.Incl;)*, 
                             ((%component;), (%m.Incl;)*)+,
                             ((%m.divbot;), (%m.Incl;)*)*)      >
<!ATTLIST %n.prologue;       %a.global;
          TEIform            CDATA               'prologue'     >
]]>

<!-- set is already done -->
]]>


8.5.7 Spoken-text tag set

< 110 Suppress definitions in spoken-text tag set > =

<!ENTITY % u                'IGNORE' >

The current definition is this:

<!ELEMENT %n.u;         - -  ((%phrase | %m.comp.spoken)+)      >

The new definition is as follows:

< 111 New definitions for spoken-text tag set > =

<![%TEI.spoken;[
<!ENTITY % XML.u "INCLUDE" >
<![%XML.u;[
<!ELEMENT %n.u;         - -  (#PCDATA | %m.phrase; | %m.comp.spoken;
                                | %m.Incl;)*      >
<!ATTLIST %n.u;              %a.global;
                             %a.timed;
                             %a.declaring;
          trans              (smooth | latching | overlap |
                             pause)              smooth
          who                IDREF               %INHERITED;
          TEIform            CDATA               'u'            >
]]>
]]>
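
The change from a plus sign to a star in the model for <u> is presumably forced rather than chosen: XML allows character data in a content model only in the form (#PCDATA | a | b)*, with #PCDATA first and a star on the group. A reduced sketch, using the hypothetical names demo.u, demo.a, and demo.b:

```dtd
<!-- SGML accepts #PCDATA in a repeatable OR group with either
     indicator:  (#PCDATA | demo.a | demo.b)+                    -->

<!-- XML accepts only the starred form, with #PCDATA first:      -->
<!ELEMENT demo.u        - -  (#PCDATA | demo.a | demo.b)*       >
```

Since #PCDATA can be satisfied by no characters at all, the two forms should accept the same documents.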


8.5.8 Dictionary tag set

We handle the dictionary tag set below, not here. (The list above does contain <oVar> and <pVar>, but that must be a mistake.)

8.5.9 Terminology tag set

< 112 Suppress definitions in terminology tag set > =

<!ENTITY % ofig             'IGNORE' >
<!ENTITY % termEntry        'IGNORE' >
<!ENTITY % tig              'IGNORE' >


The current definitions in the nested tag set are these:

<!ELEMENT %n.ofig;      - O  ((%m.terminologyMisc)*,
                             (%n.otherForm;, (%n.gram)*),
                             (%m.terminologyMisc)*)             >
<!ELEMENT %n.termEntry; - O  ((%m.terminologyMisc)*, (%n.tig)+)
                             +(%m.terminologyInclusions)        >
<!ELEMENT %n.tig;       - O  ((%m.terminologyMisc)*, (%n.term;,
                             (%n.gram)*),
                             (%m.terminologyMisc)*, (%n.ofig)*)
                                                                >

Note that <termEntry> has inclusions of its own. These do not require special treatment in our propagation of inclusions, since the set of legal descendants of <termEntry> is the same as the set of legal descendants of <text>. The set of terminology inclusions, however, does need to be revised for future versions of the DTD, since it's not disjoint from elements named in content models. It includes elements normally included in any phrase-level content model; we don't want to include them in m.Incl, since that would cause ambiguity. So all terminological content models should be rewritten for TEI P4, or even P3.5.

The new definitions are as follows:

< 113 New definitions for terminology tag set > =

<![%TEI.terminology;[
<!ENTITY % XML.ofig "INCLUDE" >
<![%XML.ofig;[
<!ELEMENT %n.ofig;      - O  ((%m.terminologyMisc; | %m.Incl;)*,
                             (%n.otherForm;, (%n.gram; | %m.Incl;)*),
                             ((%m.terminologyMisc;), (%m.Incl;)*)*)             >
<!ATTLIST %n.ofig;           %a.global;
          type               CDATA               #IMPLIED
          TEIform            CDATA               'ofig'         >
]]>


< 114 New definitions for terminology tag set 113 (cont'd) > =

<!ENTITY % XML.termEntry "INCLUDE" >
<![%XML.termEntry;[
<!ELEMENT %n.termEntry; - O  ((%m.terminologyMisc; 
                                | %m.terminologyInclusions; | %m.Incl;)*, 
                                 (%n.tig;, 
                                (%m.Incl; | %m.terminologyInclusions;)*)+)
                                                                >
<!ATTLIST %n.termEntry;      %a.global;
          type               CDATA               #IMPLIED
          TEIform            CDATA               'termEntry'    >
]]>


< 115 New definitions for terminology tag set 113 (cont'd) > =

<!ENTITY % XML.tig "INCLUDE" >
<![%XML.tig;[
<!ELEMENT %n.tig;       - O  ((%m.terminologyMisc;
                                | %m.terminologyInclusions; | %m.Incl;)*, 
                                (%n.term;,
                                 (%n.gram; | %m.terminologyInclusions; 
                                 | %m.Incl;)*),
                                ((%m.terminologyMisc;), 
                                 (%m.terminologyInclusions; | %m.Incl;)*)*, 
                                (%n.ofig;, 
                                 (%m.terminologyInclusions; | %m.Incl;)*)*)
                                                                >
<!ATTLIST %n.tig;            %a.global;
          type               CDATA               #IMPLIED
          TEIform            CDATA               'tig'          >
]]>
]]>


In the flat version of the terminology tag set, there is no <ofig> and no <tig> element. The current definition of <termEntry> is this one:

<!ELEMENT %n.termEntry; - O  ( (%m.terminologyMisc |
                             %n.otherForm; | %n.gram; |
                             %m.terminologyInclusions)*,
                             (%n.term;, (%m.terminologyMisc |
                             %n.otherForm; | %n.gram; |
                             %m.terminologyInclusions)* )+ )    >

The new definition is as follows. Since we need both versions in the extensions file, we invent a new parameter entity (TEI.terminology.flat) to signal the difference between the nested and flat terminology element sets.

< 116 New definitions for flat terminology tag set > =

<![%TEI.terminology;[
<!ENTITY % TEI.terminology.flat 'IGNORE'>
<![%TEI.terminology.flat;[
<!ENTITY % XML.termEntry "INCLUDE" >
<![%XML.termEntry;[
<!ELEMENT %n.termEntry; - O  ( (%m.terminologyMisc; |
                             %n.otherForm; | %n.gram; |
                             %m.terminologyInclusions; | %m.Incl;)*,
                             (%n.term;, 
                                (%m.terminologyMisc; |
                                %n.otherForm; | %n.gram; |
                                %m.terminologyInclusions; | %m.Incl;)* 
                             )+ 
                            )                                   >
<!ATTLIST %n.termEntry;      %a.global;
          type               CDATA               #IMPLIED
          TEIform            CDATA               'termEntry'    >
]]>
]]>
]]>


8.5.10 Segmentation and alignment tag set

< 117 Suppress definitions in segmentation and alignment tag set > =

<!ENTITY % altGrp           'IGNORE' >
<!ENTITY % joinGrp          'IGNORE' >
<!ENTITY % linkGrp          'IGNORE' >
<!ENTITY % timeline         'IGNORE' >


The current definitions are these:

<!ELEMENT %n.altGrp;    - -  ((%n.alt; | %n.ptr; | %n.xptr;)*)  >
<!ELEMENT %n.joinGrp;   - -  ((%n.join; | %n.ptr; | %n.xptr;)*)
                                                                >
<!ELEMENT %n.linkGrp;   - -  (%n.link; | %n.ptr; | %n.xptr;)+   >
<!ELEMENT %n.timeline;  - -  ((%n.when;)+)                      >

The new definitions are as follows. We take the opportunity to level the declarations of the three group elements (<altGrp>, <joinGrp>, and <linkGrp>) by using stars, instead of plus signs, on all of them; <timeline> keeps its plus sign. This has the drawback of allowing a link group to contain no links (only members of m.Incl), but the advantage of dramatically simplifying the content model.

< 118 New definitions for segmentation and alignment tag set > =

<![%TEI.linking;[
<!ENTITY % XML.altGrp "INCLUDE" >
<![%XML.altGrp;[
<!ELEMENT %n.altGrp;    - -  ((%n.ptr; | %n.xptr; | %m.Incl;)*) >
<!ATTLIST %n.altGrp;         %a.global;
                             %a.pointerGroup;
          mode               (excl | incl)       excl
          wScale             (perc | real)       perc
          TEIform            CDATA               'altGrp'       >
]]>


< 119 New definitions for segmentation and alignment tag set 118 (cont'd) > =

<!ENTITY % XML.joinGrp "INCLUDE" >
<![%XML.joinGrp;[
<!ELEMENT %n.joinGrp;   - -  ((%n.ptr; | %n.xptr; | %m.Incl;)*)
                                                                >
<!ATTLIST %n.joinGrp;        %a.global;
                             %a.pointerGroup;
          result             CDATA               #IMPLIED
          desc               CDATA               #IMPLIED
          TEIform            CDATA               'joinGrp'      >
]]>


< 120 New definitions for segmentation and alignment tag set 118 (cont'd) > =

<!ENTITY % XML.linkGrp "INCLUDE" >
<![%XML.linkGrp;[
<!ELEMENT %n.linkGrp;   - -  (%n.ptr; | %n.xptr; | %m.Incl;)*   >
<!ATTLIST %n.linkGrp;        %a.global;
                             %a.pointerGroup;
          TEIform            CDATA               'linkGrp'      >
]]>


< 121 New definitions for segmentation and alignment tag set 118 (cont'd) > =

<!ENTITY % XML.timeline "INCLUDE" >
<![%XML.timeline;[
<!ELEMENT %n.timeline;  - -  ((%n.when;), (%m.Incl;)*)+         >
<!ATTLIST %n.timeline;       %a.global;
          origin             IDREF               #REQUIRED
          unit               NMTOKEN             #IMPLIED
          interval           NUTOKEN             #IMPLIED
          TEIform            CDATA               'timeline'     >
]]>
]]>


We have included m.Incl within these content models in the interests of consistency: this document is intended to provide an XML-compatible DTD which accepts all valid TEI P3 documents, and does not change the language unnecessarily. In the long run, however, it seems unlikely that we need to allow any m.Incl elements within any of these content models. Page breaks really and truly do not occur within link groups. Allowing timelines to nest within timelines is daft. And as we have seen, adding m.Incl to the original content models introduces ambiguity, since some members of that class were already named in the models. Removing the explicit mention avoids the ambiguity, but renders the content model misleading.

It is the editors' view that in P4, the m.Incl class should not appear in these models; they should revert to the form given in P3.
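
The ambiguity in question can be shown on a reduced example. The names below (demo.grp, demo.link, demo.ptr, demo.anchor, and the stand-in class demo.Incl) are hypothetical; in particular, the membership given for demo.Incl is invented for the illustration and is not that of the TEI's m.Incl.

```dtd
<!-- Stand-in class which happens to contain demo.link. -->
<!ENTITY % demo.Incl    "demo.link | demo.anchor" >

<!-- Ambiguous, and therefore left inside a comment: after
     expansion demo.link occurs twice in one OR group, so a
     demo.link start-tag could match either occurrence.
       (demo.link | demo.ptr | %demo.Incl;)*                     -->

<!-- Unambiguous: demo.link is reachable only through the class. -->
<!ELEMENT demo.grp      - -  (demo.ptr | %demo.Incl;)*          >
```

The second form corresponds to the one adopted for the group elements in this tag set. Its cost is that the model no longer says what the element is for.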

8.5.11 Analysis and interpretation tag set

< 122 Suppress definitions in analysis tag set > =

<!ENTITY % c               'IGNORE' >
<!ENTITY % interpGrp       'IGNORE' >
<!ENTITY % m               'IGNORE' >
<!ENTITY % spanGrp         'IGNORE' >
<!ENTITY % w               'IGNORE' >


The current definitions are these:

<!ELEMENT %n.c;         - -  (#PCDATA)                          >
<!ELEMENT %n.interpGrp; - -  ((%n.interp;)*)                    >
<!ELEMENT %n.m;         - -  ((#PCDATA | %n.seg; | %n.c;)*)     >
<!ELEMENT %n.spanGrp;   - -  ((%n.span;)*)                      >
<!ELEMENT %n.w;         - -  ((#PCDATA | %n.seg; | %n.w; |
                             %n.m; | %n.c;)*)                   >

The new definitions are as follows:

< 123 New definitions for analysis tag set > =

<![%TEI.analysis;[
<!ENTITY % XML.c "INCLUDE" >
<![%XML.c;[
<!ELEMENT %n.c;         - -  (#PCDATA)                          >
<!ATTLIST %n.c;              %a.global;
                             %a.seg;
          TEIform            CDATA               'c'            >
]]>


Since <interp> is a member of class Incl, we cannot name it directly in the content model, on pain of ambiguity. (Sigh.)

< 124 New definitions for analysis tag set 123 (cont'd) > =

<!ENTITY % XML.interpGrp "INCLUDE" >
<![%XML.interpGrp;[
<!--* We should really have: (%n.interp; | %m.Incl;)*         -->
<!ELEMENT %n.interpGrp; - -  (%m.Incl;)*                        >
<!ATTLIST %n.interpGrp;      %a.global;
                             %a.interpret;
          TEIform            CDATA               'interpGrp'    >
]]>


< 125 New definitions for analysis tag set 123 (cont'd) > =

<!ENTITY % XML.m "INCLUDE" >
<![%XML.m;[
<!ELEMENT %n.m;         - -  (#PCDATA | %n.seg; | %n.c; | %m.Incl;)*  >
<!ATTLIST %n.m;              %a.global;
                             %a.seg;
          baseform           CDATA               #IMPLIED
          TEIform            CDATA               'm'            >
]]>


The content model of <spanGrp>, like that of <interpGrp>, now conveys almost nothing unless the reader already knows that the element is meant to contain <span> elements, which are members of m.Incl.

< 126 New definitions for analysis tag set 123 (cont'd) > =

<!ENTITY % XML.spanGrp "INCLUDE" >
<![%XML.spanGrp;[
<!--* We should really have: (%n.span; | %m.Incl;)*           -->
<!ELEMENT %n.spanGrp;   - -  (%m.Incl;)*                        >
<!ATTLIST %n.spanGrp;        %a.global;
                             %a.interpret;
          TEIform            CDATA               'spanGrp'      >
]]>


< 127 New definitions for analysis tag set 123 (cont'd) > =

<!ENTITY % XML.w "INCLUDE" >
<![%XML.w;[
<!ELEMENT %n.w;         - -  (#PCDATA | %n.seg; | %n.w; |
                             %n.m; | %n.c; | %m.Incl;)*         >
<!ATTLIST %n.w;              %a.global;
                             %a.seg;
          lemma              CDATA               #IMPLIED
          TEIform            CDATA               'w'            >
]]>

]]>


8.5.12 Feature structures tag set

The arguments given above against propagating global inclusions to the segmentation and alignment element types apply with equal or greater force to the feature-structures element types. Our first impulse was to resist the siren song of common sense and press on doggedly toward our goal of an upward-compatible experimental XML DTD; as the definitions below show, we did not in the end resist it.

< 128 Suppress definitions in feature-structures tag set > =

<!ENTITY % f               'IGNORE' >
<!ENTITY % fAlt            'IGNORE' >
<!ENTITY % fLib            'IGNORE' >
<!ENTITY % fs              'IGNORE' >
<!ENTITY % fsLib           'IGNORE' >
<!ENTITY % fvLib           'IGNORE' >
<!ENTITY % vAlt            'IGNORE' >


The current definitions are these:

<!ELEMENT %n.f;         - O  (%n.null; | (%n.plus; | %n.minus;
                             | any | %n.none; | %n.dft; |
                             %n.uncertain; | %n.sym; | %n.nbr;
                             | %n.msr; | %n.rate; | %n.str; |
                             %n.vAlt; | %n.alt; | %n.fs;)*)     >
<!ELEMENT %n.fAlt;      - -  ((%n.f; | %n.fs; | %n.fAlt;),
                             (%n.f; | %n.fs; | %n.fAlt;)+)      >
<!ELEMENT %n.fLib;      - -  ((%n.f; | %n.fAlt;)*)              >
<!ELEMENT %n.fs;        - -  ((%n.f; | %n.fAlt; | %n.alt;)*)    >
<!ELEMENT %n.fsLib;     - -  ((%n.fs; | %n.vAlt;)*)             >
<!ELEMENT %n.fvLib;     - -  ((%n.plus; | %n.minus; | any |
                             %n.none; | %n.dft; | %n.uncertain;
                             | %n.null; | %n.sym; | %n.nbr; |
                             %n.msr; | %n.rate; | %n.str; |
                             %n.vAlt;)*)                        >
<!ELEMENT %n.vAlt;      - -  ((%n.plus; | %n.minus; | any |
                             %n.none; | %n.dft; | %n.uncertain;
                             | %n.null; | %n.sym; | %n.nbr; |
                             %n.msr; | %n.rate; | %n.str; |
                             %n.vAlt; | %n.fs;), (%n.plus; |
                             %n.minus; | any | %n.none; |
                             %n.dft; | %n.uncertain; | %n.null;
                             | %n.sym; | %n.nbr; | %n.msr; |
                             %n.rate; | %n.str; | %n.vAlt; |
                             %n.fs;)+)                          >

The new definitions are as follows:

< 129 New definitions for feature-structures tag set > =

<![%TEI.fs;[
<!ENTITY % XML.f "INCLUDE" >
<![%XML.f;[
<!ELEMENT %n.f;         - O  (%n.null; | (%n.plus; | %n.minus;
                             | any | %n.none; | %n.dft; |
                             %n.uncertain; | %n.sym; | %n.nbr;
                             | %n.msr; | %n.rate; | %n.str; |
                             %n.vAlt; | %n.alt; | %n.fs;)*)     >
<!ATTLIST %n.f;              %a.global;
          name               NMTOKEN             #REQUIRED
          org                (single | set | bag | list)
                                                 #IMPLIED
          rel                (eq | ne | sb | ns) eq
          fVal               IDREFS              #IMPLIED
          TEIform            CDATA               'f'            >
]]>


< 130 New definitions for feature-structures tag set 129 (cont'd) > =

<!ENTITY % XML.fAlt "INCLUDE" >
<![%XML.fAlt;[
<!ELEMENT %n.fAlt;      - -  ((%n.f; | %n.fs; | %n.fAlt;),
                             (%n.f; | %n.fs; | %n.fAlt;)+)      >
<!ATTLIST %n.fAlt;           %a.global;
          mutExcl            (Y | N)             #IMPLIED
          TEIform            CDATA               'fAlt'         >
]]>


< 131 New definitions for feature-structures tag set 129 (cont'd) > =

<!ENTITY % XML.fLib "INCLUDE" >
<![%XML.fLib;[
<!ELEMENT %n.fLib;      - -  ((%n.f; | %n.fAlt;)*)              >
<!ATTLIST %n.fLib;           %a.global;
          type               CDATA               #IMPLIED
          TEIform            CDATA               'fLib'         >
]]>


< 132 New definitions for feature-structures tag set 129 (cont'd) > =

<!ENTITY % XML.fs "INCLUDE" >
<![%XML.fs;[
<!ELEMENT %n.fs;        - -  ((%n.f; | %n.fAlt; | %n.alt;)*)    >
<!ATTLIST %n.fs;             %a.global;
          type               CDATA               #IMPLIED
          feats              IDREFS              #IMPLIED
          rel                (eq | ne | sb | ns) sb
          TEIform            CDATA               'fs'           >
]]>


< 133 New definitions for feature-structures tag set 129 (cont'd) > =

<!ENTITY % XML.fsLib "INCLUDE" >
<![%XML.fsLib;[
<!ELEMENT %n.fsLib;     - -  ((%n.fs; | %n.vAlt;)*)             >
<!ATTLIST %n.fsLib;          %a.global;
          type               CDATA               #IMPLIED
          TEIform            CDATA               'fsLib'        >
]]>


< 134 New definitions for feature-structures tag set 129 (cont'd) > =

<!ENTITY % XML.fvLib "INCLUDE" >
<![%XML.fvLib;[
<!ELEMENT %n.fvLib;     - -  ((%n.plus; | %n.minus; | any |
                             %n.none; | %n.dft; | %n.uncertain;
                             | %n.null; | %n.sym; | %n.nbr; |
                             %n.msr; | %n.rate; | %n.str; |
                             %n.vAlt;)*)                        >
<!ATTLIST %n.fvLib;          %a.global;
          type               CDATA               #IMPLIED
          TEIform            CDATA               'fvLib'        >
]]>


< 135 New definitions for feature-structures tag set 129 (cont'd) > =

<!ENTITY % XML.vAlt "INCLUDE" >
<![%XML.vAlt;[
<!ELEMENT %n.vAlt;      - -  ((%n.plus; | %n.minus; | any |
                             %n.none; | %n.dft; | %n.uncertain;
                             | %n.null; | %n.sym; | %n.nbr; |
                             %n.msr; | %n.rate; | %n.str; |
                             %n.vAlt; | %n.fs;), (%n.plus; |
                             %n.minus; | any | %n.none; |
                             %n.dft; | %n.uncertain; | %n.null;
                             | %n.sym; | %n.nbr; | %n.msr; |
                             %n.rate; | %n.str; | %n.vAlt; |
                             %n.fs;)+)                          >
<!ATTLIST %n.vAlt;           %a.global;
          mutExcl            (Y | N)             #IMPLIED
          TEIform            CDATA               'vAlt'         >
]]>
]]>


It will be noted that the new versions are identical to the old versions. Common sense has won out, and in this experimental XML version of the TEI DTD, global inclusions are not propagated into these feature-structure element types.

8.5.13 Names and dates tag set

The <dateStruct> and <timeStruct> element types have already been rewritten above.

8.5.14 Text-criticism tag set

< 136 Suppress definitions in text-criticism tag set > =

<!ENTITY % app             'IGNORE' >
<!ENTITY % rdgGrp          'IGNORE' >
<!ENTITY % witList         'IGNORE' >


The current definitions are these:

<!ELEMENT %n.app;       - O  ((%n.lem)?, ((%n.rdg;, (%n.wit)?)
                             | (%n.rdgGrp;, (%n.wit)?))+)       >
<!ELEMENT %n.rdgGrp;    - O  (%n.rdgGrp; | (%n.rdg;,
                             (%n.wit)?))+                       >
<!ELEMENT %n.witList;   - O  ((%n.witness)+)                    >

The new definitions are as follows. We take the opportunity to address one of Peter Robinson's long-standing concerns, and allow witnesses to the lemma to be listed. Note that the model for <rdgGrp> seems bizarre. Why are readings and reading groups treated similarly in <app> entries and not in <rdgGrp> elements?

< 137 New definitions for text-criticism tag set > =

<![%TEI.textcrit;[
<!ENTITY % XML.app "INCLUDE" >
<![%XML.app;[
<!ELEMENT %n.app;       - O  ( (%m.Incl;)*, 
                                  (%n.lem;, (%m.Incl;)*, 
                                  (%n.wit;, (%m.Incl;)*)?)?, 
                                  ( (%n.rdg;, (%m.Incl;)*, 
                                    (%n.wit;, (%m.Incl;)*)?)
                                    | 
                                    (%n.rdgGrp;, (%m.Incl;)*, 
                                    (%n.wit;, (%m.Incl;)*)?)
                                  )+
                                )                               >
<!ATTLIST %n.app;            %a.global;
          type               CDATA               #IMPLIED
          from               IDREF               #IMPLIED
          to                 IDREF               #IMPLIED
          loc                CDATA               #IMPLIED
          TEIform            CDATA               'app'          >
]]>
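
As a hedged illustration of the revised <app> model above, the fragment below shows an apparatus entry in which a <wit> element follows the <lem>. The old model permitted <wit> only after <rdg> or <rdgGrp>; the revised model also permits it after <lem>. The readings and sigla are invented, and the fragment assumes that <lem>, <rdg>, and <wit> accept plain character data, which is not shown in this section.

```xml
<!-- Hypothetical apparatus entry; sigla and readings are invented. -->
<app>
  <lem>Experience</lem>
  <wit>El Hg</wit>      <!-- witnesses to the lemma: new in this model -->
  <rdg>Experiment</rdg>
  <wit>La</wit>         <!-- witnesses to a reading: already legal -->
</app>
```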


< 138 New definitions for text-criticism tag set 137 (cont'd) > =

<!ENTITY % XML.rdgGrp "INCLUDE" >
<![%XML.rdgGrp;[
<!ELEMENT %n.rdgGrp;    - O  ((%m.Incl;)*, 
                                (((%n.rdgGrp;, (%m.Incl;)*) | 
                                  (%n.rdg;, (%m.Incl;)*, 
                                  (%n.wit;, (%m.Incl;)*)?)))+)  >
<!ATTLIST %n.rdgGrp;         %a.global;
                             %a.readings;
          TEIform            CDATA               'rdgGrp'       >
]]>


< 139 New definitions for text-criticism tag set 137 (cont'd) > =

<!ENTITY % XML.witList "INCLUDE" >
<![%XML.witList;[
<!ELEMENT %n.witList;   - O  ((%m.Incl;)*, 
                             (%n.witness;, (%m.Incl;)*)+)       >
<!ATTLIST %n.witList;        %a.global;
          TEIform            CDATA               'witList'      >
]]>
]]>


8.5.15 Graphs and digraphs tag set

< 140 Suppress definitions in graphs tag set > =

<!ENTITY % eTree           'IGNORE' >
<!ENTITY % forest          'IGNORE' >
<!ENTITY % forestGrp       'IGNORE' >
<!ENTITY % graph           'IGNORE' >
<!ENTITY % tree            'IGNORE' >
<!ENTITY % triangle        'IGNORE' >


The current definitions are these:

<!ELEMENT %n.graph;     - -  ((%n.node;)+ & (%n.arc;)*)         >
<!ELEMENT %n.tree;      - -  ((%n.leaf; | %n.iNode;)*,
                             %n.root;, (%n.leaf; | %n.iNode;)*)
                                                                >
<!ELEMENT %n.eTree;     - -  ((%n.eTree; | %n.triangle; |
                             %n.eLeaf; )*)                      >
<!ELEMENT %n.triangle;  - -  ((%n.eTree; | %n.triangle; |
                             %n.eLeaf;)*)                       >
<!ELEMENT %n.forest;    - -  ((%n.tree; | %n.eTree; |
                             %n.triangle;)+)                    >
<!ELEMENT %n.forestGrp; - -  ((%n.forest;)+)                    >

The new definitions are as follows:

< 141 New definitions for graphs tag set > =

<![%TEI.nets;[
<!ENTITY % XML.tree "INCLUDE" >
<![%XML.tree;[
<!ELEMENT %n.tree;      - -  ((%n.leaf; | %n.iNode; | %m.Incl;)*,
                             %n.root;, 
                             (%n.leaf; | %n.iNode; | %m.Incl;)*)
                                                                >
<!ATTLIST %n.tree;           %a.global;
          label              CDATA               #IMPLIED
          arity              NUMBER              #IMPLIED
          ord                (Y | N | partial)   Y
          order              NUMBER              #IMPLIED
          TEIform            CDATA               'tree'         >
]]>


< 142 New definitions for graphs tag set 141 (cont'd) > =

<!ENTITY % XML.eTree "INCLUDE" >
<![%XML.eTree;[
<!ELEMENT %n.eTree;     - -  ((%n.eTree; | %n.triangle; |
                             %n.eLeaf; | %m.Incl;)*)            >
<!ATTLIST %n.eTree;          %a.global;
          label              CDATA               #IMPLIED
          value              IDREF               #IMPLIED
          TEIform            CDATA               'eTree'        >
]]>


< 143 New definitions for graphs tag set 141 (cont'd) > =

<!ENTITY % XML.triangle "INCLUDE" >
<![%XML.triangle;[
<!ELEMENT %n.triangle;  - -  ((%n.eTree; | %n.triangle; |
                             %n.eLeaf; | %m.Incl;)*)            >
<!ATTLIST %n.triangle;       %a.global;
          label              CDATA               #IMPLIED
          value              IDREF               #IMPLIED
          TEIform            CDATA               'triangle'     >
]]>


< 144 New definitions for graphs tag set 141 (cont'd) > =

<!ENTITY % XML.forest "INCLUDE" >
<![%XML.forest;[
<!ELEMENT %n.forest;    - -  ((%n.tree; | %n.eTree; |
                             %n.triangle; | %m.Incl;)*)         >
<!ATTLIST %n.forest;         %a.global;
          type               CDATA               #IMPLIED
          TEIform            CDATA               'forest'       >
]]>


< 145 New definitions for graphs tag set 141 (cont'd) > =

<!ENTITY % XML.forestGrp "INCLUDE" >
<![%XML.forestGrp;[
<!ELEMENT %n.forestGrp; - -  ((%n.forest;, (%m.Incl;)*)+)    >
<!ATTLIST %n.forestGrp;      %a.global;
          type               CDATA               #IMPLIED
          TEIform            CDATA               'forestGrp'    >
]]>
]]>


8.5.16 Tables tag set

< 146 Suppress definitions in tables tag set > =

<!ENTITY % figure          'IGNORE' >
<!ENTITY % formula         'IGNORE' >
<!ENTITY % row             'IGNORE' >
<!ENTITY % table           'IGNORE' >


The current definitions are these:

<!ELEMENT %n.table;     - -  ((%n.head)*, (%n.row)+)            >
<!ELEMENT %n.row;       - O  ((%n.cell; | %n.table;)+)          >
<!ELEMENT %n.figure;    - -  ((%n.head)?, (%n.p)*,
                             (%n.figDesc)?, (%n.text)?)         >
<!ELEMENT %n.formula;   - -  %formulaContent;                   >

The new definitions are as follows:

< 147 New definitions for tables tag set > =

<![%TEI.figures;[
<!ENTITY % XML.table "INCLUDE" >
<![%XML.table;[
<!ELEMENT %n.table;     - -  ((%n.head; | %m.Incl;)*, 
                             (%n.row;, (%m.Incl;)*)+)           >
<!ATTLIST %n.table;          %a.global;
          rows               NUMBER              #IMPLIED
          cols               NUMBER              #IMPLIED
          TEIform            CDATA               'table'        >
]]>


< 148 New definitions for tables tag set 147 (cont'd) > =

<!ENTITY % XML.row "INCLUDE" >
<![%XML.row;[
<!ELEMENT %n.row;       - O  ((%n.cell; | %n.table;), 
                             (%m.Incl;)*)+                      >
<!ATTLIST %n.row;            %a.global;
          role               CDATA               data
          TEIform            CDATA               'row'          >
]]>


< 149 New definitions for tables tag set 147 (cont'd) > =

<!ENTITY % XML.figure "INCLUDE" >
<![%XML.figure;[
<!ELEMENT %n.figure;    - -  ((%m.Incl;)*, 
                              (%n.head;, (%m.Incl;)*)?, 
                              (%n.p;, (%m.Incl;)*)*,
                              (%n.figDesc;, (%m.Incl;)*)?, 
                              (%n.text;, (%m.Incl;)*)?)         >
<!ATTLIST %n.figure;         %a.global;
          entity             ENTITY              #IMPLIED
          TEIform            CDATA               'figure'       >
]]>


< 150 New definitions for tables tag set 147 (cont'd) > =

<!ENTITY % XML.formula "INCLUDE" >
<![%XML.formula;[
<!ELEMENT %n.formula;   - -  %formulaContent;                   >
<!ATTLIST %n.formula;        %a.global;
          notation           %formulaNotations;   #REQUIRED
          TEIform            CDATA               'formula'      >
]]>
]]>


9 The problem of the dictionary chapter

The TEI base tag set for dictionaries cannot be made XML conformant using the methods described here. That tag set distinguishes two top-level elements for dictionary entries: <entry>, which has a relatively well-defined structure, and <entryFree>, which has no prescribed structure at all: any element used in tagging dictionary entries may appear, within any other element, at any level of nesting. The desired freedom for <entryFree> entries is guaranteed by the inclusion exception on <entryFree>. The standard declaration for the element is this:

<!ELEMENT %n.entryFree; - O  (#PCDATA)
                                                 +(%m.dictionaryParts
                                                 | %m.phrase |
                                                 %m.inter)      >

If we use the techniques described above, all of the members of the classes dictionaryParts, phrase, and inter will be made legal at every point within any member of any of those classes. Apart from the havoc that this would wreak on the core tag set, it would wholly erase the distinction between <entry> and <entryFree> elements.
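
To make the freedom of <entryFree> concrete, here is an invented fragment (not taken from the TEI Guidelines). It assumes that <orth>, <pos>, <def>, and <usg> are members of the dictionaryParts class; whether a given nesting would also be accepted inside <entry> depends on declarations not reproduced here.

```xml
<!-- Hypothetical entry: because of the inclusion exception, dictionary
     elements may appear at any depth inside entryFree. -->
<entryFree>
  <orth>sample</orth>, <pos>n.</pos>
  <def>an invented definition <usg>colloq.</usg> with a usage label inside it</def>
</entryFree>
```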

So some other method of handling anomalous dictionary entries is needed in an XML version of the TEI DTD. Borrowing ideas from B. Tommie Usdin and Deborah A. Lapeyre, and with thanks also to David J. Birnbaum, I propose a new approach to the problem.

The basic idea is to define an element for anomalous structures in dictionary entries. In this discussion, I'll assume this element is called <dictAnomaly> (for `dictionary anomaly'). For every element in the normal structure of a dictionary, the content model is changed by adding <dictAnomaly> as an alternative to the existing model. Thus the element <superentry> currently has the following declaration:

<!ELEMENT %n.superentry;
                        - O  ((%n.form)?, (%n.entry)+)          >

After the change, it will have the declaration:

<!ELEMENT %n.superentry;
                        - O  (((%n.form;)?, (%n.entry;)+)
                             | %n.dictAnomaly;)                 >

That is, a superentry is either normal (an optional <form> element followed by one or more <entry> elements), or else it is anomalous. The <dictAnomaly> element itself is defined as allowing any sequence of character data, dictionary elements, inter-level elements, or phrase-level elements:

<!ELEMENT %n.dictAnomaly;
                        - O  (#PCDATA | %m.dictionaryParts;
                             | %m.phrase; | %m.inter;)*         >

An anomalous superentry contains a single <dictAnomaly> element, and nothing else.
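
A sketch of the two cases, with invented content: the first <superentry> follows the normal model, the second wraps everything in a single <dictAnomaly>. The name <dictAnomaly> is the working name assumed in this discussion, and the fragment assumes that <form>, <entry>, and <dictAnomaly> accept the children shown here.

```xml
<!-- Normal: optional form, then one or more entry elements. -->
<superentry>
  <form><orth>bank</orth></form>
  <entry><def>the edge of a river</def></entry>
  <entry><def>a financial institution</def></entry>
</superentry>

<!-- Anomalous: a single dictAnomaly element and nothing else. -->
<superentry>
  <dictAnomaly>bank, <pos>n.</pos> <def>the edge of a river</def>;
    also <orth>banke</orth></dictAnomaly>
</superentry>
```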

For elements which are currently defined with mixed content, <dictAnomaly> is simply added to the list of elements which can occur within them; this allows us to evade the mixed-content problem. The simplest way to do this is to define <dictAnomaly> as a phrase-level element in the dictionary tag set, which also allows anomalies to occur within the generic phrase-level and inter-level elements used in dictionary entries.

In principle, the extensions file should handle this thus:

<!ENTITY % x.phrase 'dictAnomaly |' >

But since we have to include new declarations for the entire phrase-level class system in the extensions file anyway (to fix the problems with phrase.seq), we can simply add <dictAnomaly> to phrase, as was done above.
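
The x. mechanism relies on the rule that the first declaration of an entity is the one that counts. A minimal sketch follows, modeled on the x.placePart / m.placePart pair used elsewhere in this document. It assumes the extensions file is read before the main class declarations, and the two class members named here (%n.memberOne; and %n.memberTwo;) are placeholders, not the real contents of the phrase class.

```xml
<!-- Extensions file, read first: the user's declaration wins. -->
<!ENTITY % x.phrase 'dictAnomaly |' >

<!-- Main DTD, read later: this default is ignored, because
     x.phrase has already been declared. -->
<!ENTITY % x.phrase '' >

<!-- The class entity expands whichever x.phrase is in force.
     memberOne and memberTwo are placeholders. -->
<!ENTITY % m.phrase '%x.phrase; %n.memberOne; | %n.memberTwo;' >
```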

10 Open questions and checklists

This list brings together in one place a number of open questions mentioned above.

Corrigible errors identified in this document are:

11 Miscellaneous Housekeeping

A few scraps necessary for housekeeping have no obvious home in this document; I'll put them here.

Before we define component, we need to embed all the entity files for the selected tag sets:

< 151 Embed tag-set-specific ent files > =

<!-- 3.7.6:  Embedding tag-set-specific entity definitions    -->
<![ %TEI.verse; [
<!ENTITY % TEI.verse.ent system 'teivers2.ent'                  >
%TEI.verse.ent;
]]>
<![ %TEI.drama; [
<!ENTITY % TEI.drama.ent system 'teidram2.ent'                  >
%TEI.drama.ent;
]]>
<![ %TEI.spoken; [
<!ENTITY % TEI.spoken.ent system 'teispok2.ent'                 >
%TEI.spoken.ent;
]]>
<![ %TEI.dictionaries; [
<!ENTITY % TEI.dictionaries.ent system 'teidict2.ent'           >
%TEI.dictionaries.ent;
]]>
<![ %TEI.terminology; [
<!ENTITY % x.common ''                                          >
<!ENTITY % m.common '%x.common %m.bibl; | %m.chunk; | 
           %m.hqinter; | %m.lists; | %m.notes; | %n.stage;'     >
<!ENTITY % TEI.terminology.ent system 'teiterm2.ent'            >
%TEI.terminology.ent;
]]>
<![ %TEI.linking; [
<!ENTITY % TEI.linking.ent system 'teilink2.ent'                >
%TEI.linking.ent;
]]>
<![ %TEI.analysis; [
<!ENTITY % TEI.analysis.ent system 'teiana2.ent'                >
%TEI.analysis.ent;
]]>
<![ %TEI.transcr; [
<!ENTITY % TEI.transcr.ent system 'teitran2.ent'                >
%TEI.transcr.ent;
]]>
<![ %TEI.textcrit; [
<!ENTITY % TEI.textcrit.ent system 'teitc2.ent'                 >
%TEI.textcrit.ent;
]]>
<![ %TEI.names.dates; [
<!ENTITY % TEI.names.dates.ent system 'teind2.ent'              >
%TEI.names.dates.ent;
]]>
<![ %TEI.figures; [
<!ENTITY % TEI.figures.ent system 'teifig2.ent'                 >
%TEI.figures.ent;
]]>


Note that the terminology entity file unwisely refers to common, which we thus must define in an ad hoc way.

Before we do that, we have to provide default values for all the tagset entities:

< 152 Provide default tagset declarations > =

<!ENTITY % TEI.prose        'IGNORE'                            >
<!ENTITY % TEI.verse        'IGNORE'                            >
<!ENTITY % TEI.drama        'IGNORE'                            >
<!ENTITY % TEI.spoken       'IGNORE'                            >
<!ENTITY % TEI.dictionaries 'IGNORE'                            >
<!ENTITY % TEI.terminology  'IGNORE'                            >
<!ENTITY % TEI.general      'IGNORE'                            >
<!ENTITY % TEI.mixed        'IGNORE'                            >
<!ENTITY % TEI.linking      'IGNORE'                            >
<!ENTITY % TEI.analysis     'IGNORE'                            >
<!ENTITY % TEI.fs           'IGNORE'                            >
<!ENTITY % TEI.certainty    'IGNORE'                            >
<!ENTITY % TEI.transcr      'IGNORE'                            >
<!ENTITY % TEI.textcrit     'IGNORE'                            >
<!ENTITY % TEI.names.dates  'IGNORE'                            >
<!ENTITY % TEI.nets         'IGNORE'                            >
<!ENTITY % TEI.figures      'IGNORE'                            >
<!ENTITY % TEI.corpus       'IGNORE'                            >
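
These defaults rely on the same first-declaration rule: a document that declares one of these entities as INCLUDE in its own DTD subset overrides the IGNORE given here, since the subset is read first. A hedged sketch of such a document prolog follows; the system identifier tei2.dtd and the choice of tag sets are assumptions made for illustration.

```xml
<!DOCTYPE TEI.2 SYSTEM "tei2.dtd" [
  <!-- Select the prose base and the text-criticism additional tag set;
       all other TEI.* entities keep the default value IGNORE. -->
  <!ENTITY % TEI.prose    'INCLUDE' >
  <!ENTITY % TEI.textcrit 'INCLUDE' >
]>
```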


And we need to define the TEI keywords and default generic identifiers:

< 153 Define TEI keywords > =

<!ENTITY % INHERITED '#IMPLIED'                                 >
<!ENTITY % ISO-date 'CDATA'                                     >
<!ENTITY % extPtr 'CDATA'                                       >
<!ENTITY % TEI.elementNames system 'teigis2.ent'                >
%TEI.elementNames;


< 154 Fix placePart class > =

<!ENTITY % x.placePart ''                                       >
<!ENTITY % m.placePart '%x.placePart %n.bloc; | %n.country; |
           %n.distance; | %n.geog; | %n.offset;
           | %n.region; | %n.settlement;'                       >



A Notation

The notation in this paper is fairly simple:

Notes

[1] In particular, this document does not suppress the tag-omissibility indicators in the TEI DTD; that job is left to special-purpose software. In its current form, this document also does not completely normalize all mixed content models to the form required by XML. I started to make it do so, and have just realized that carthage may already do what is necessary. I need to find out for sure whether carthage does the job, and either complete or remove the partial sets of changes described for the mass redeclaration of all phrase.seq and paraContent elements.

[2] If the set of inclusions and the set of exclusions on the exception stack are always the same for every possible occurrence of every element type in the DTD, then an exception-free DTD can be created which accepts exactly the same set of documents as the original DTD. A DTD which had exceptions only on the root element type, for example, could be replicated without changing the language it accepts. I am not aware of any production DTDs which fall into this class.

[3] One could take the converse goal of ensuring that the revised DTD be at least as selective as the original DTD, i.e. that it undergenerate with respect to the original language. This would be interesting as an exercise, but if applied to the TEI DTD it would invalidate existing TEI data, which makes it unacceptable as an approach to creating an XML-conformant version of the TEI DTD.

[4] This is clearly established by Wood and Kilpeläinen, though they inexplicably claim to have proven the opposite.

[5] Strictly speaking, these ought perhaps to be imf(E,I), mf(E,I), and m(E,I), but for purposes of this paper we will never need different sets of inclusions I. So if it matters, we can define imf(E) formally as imf(E,I), etc.

[6] What is wrong with these lists, and why are they not complete? The Names and Dates tag set may not have been selected, or the DTD I used may -- almost surely did -- have the bug that makes much of that tag set unreachable. The Corpus tags are for the header, and may in fact not be descendants of <text>.

[7] The dictionary tag set includes orth, pron, hyph, syll, stress, gram, gen, number, case, per, tns, mood, itype, pos, subc, colloc, def, tr, lang, usg, lbl.

[8] This is a classic example of what is known in DTD design circles as the Mixed-Content Gotcha; the problems associated with it led the XML design group to restrict the form of mixed-content models in order to forbid content models which are subject to the problem. This restriction, in turn, makes it essential to revise specialPara in an XML version of the TEI DTD.

[9] An inquiry on TEI-L might usefully reveal whether anyone is actually using <set> and whether they would be inconvenienced by this tighter model.


HTML generated 7 Jul 1999