<?xml version="1.0" encoding="utf-8"?>
<!--
Copyright TEI Consortium. 
Dual-licensed under CC-by and BSD2 licences 
See the file COPYING.txt for details.
$Date$
$Id$
-->


<?xml-model href="http://tei.oucs.ox.ac.uk/jenkins/job/TEIP5/lastSuccessfulBuild/artifact/P5/release/xml/tei/odd/p5.nvdl" type="application/xml" schematypens="http://purl.oclc.org/dsdl/nvdl/ns/structure/1.0"?>

<div xmlns="http://www.tei-c.org/ns/1.0" type="div1" xml:id="FS" n="16"><head>Feature Structures</head>
<!-- this para is repeated at the start of ISO std -->
<p>A <term>feature structure</term> is a general purpose data
structure which identifies and groups together individual
<term>features</term>, each of which associates a name with one or
more values.  Because of the generality of feature structures, they
can be used to represent many different kinds of information, but they
are of particular usefulness in the representation of linguistic
analyses, especially where such analyses are partial, or
<term>underspecified</term>. Feature structures represent the
interrelations among various pieces of information, and their
instantiation in markup provides a <term>metalanguage</term> for the
generic representation of analyses and interpretations.  Moreover,
this instantiation allows feature values to be of specific
<term>types</term>, and for restrictions to be placed on the values
for particular features, by means of <term>feature system
declarations</term>.<note place="bottom">The recommendations of this chapter have
been adopted as ISO Standard 24610-1 <title>Language Resource
Management — Feature Structures — Part One: Feature Structure Representation</title></note>
</p>
<div type="div2" xml:id="FSor"><head>Organization of this Chapter</head>
<p>This chapter is organized as
follows.  After this introduction, section <ptr target="#FSBI"/>
introduces the elements <gi>fs</gi> and <gi>f</gi>, used to represent
feature structures and features respectively, together with the
elementary <term>binary</term> feature value.  Section <ptr target="#FSSY"/> introduces elements for representing other kinds of
atomic feature values such as <term>symbolic</term>,
<term>numeric</term>, and <term>string</term> values.  Section <ptr target="#FSFL"/> introduces the notion of predefined
<term>libraries</term> or groups of features or feature values along
with methods for referencing their components.  Section <ptr target="#FSST"/> introduces complex values, in particular
feature structures as values, thus enabling feature structures to be
recursively defined.  Section <ptr target="#FSVAR"/> describes how
re-entrant (shared) feature values may be labelled.  Section <ptr target="#FSSS"/> discusses other
complex values, in particular values which are collections, organized
as <term>set</term>s, <term>bag</term>s, and
<term>list</term>s. Section <ptr target="#FVE"/> discusses how the
operations of alternation, negation, and collection of feature values
may be represented.  Section <ptr target="#FSBO"/> discusses ways of
representing underspecified, default, or uncertain values.  Section
<ptr target="#FSLINK"/> discusses how analyses may be linked to other
parts of an encoded text.  Section <ptr target="#FD"/> describes the
<term>feature system declaration</term>, a construct which provides
for the validation of typed feature structures.

Formal definitions for
all the elements introduced in this chapter are provided in section
<ptr target="#FSDEF"/>. <!-- KL says this shd be in Annex A Normative --></p>
<!-- &intro.odd; -->
<!-- intro moved to section 4 somewhere -->
</div>

<div type="div2" xml:id="FSBI"><head>Elementary Feature Structures and the Binary
Feature Value</head>
<p>The fundamental elements used to represent a feature structure
analysis are <gi>f</gi> (for <term>feature</term>), which represents a
feature-value pair, and <gi>fs</gi> (for <term>feature
structure</term>), which represents a structure made up of such
feature-value pairs.  The <gi>fs</gi> element has an optional
<att>type</att> attribute which may be used to represent typed feature
structures, and may contain any number of <gi>f</gi> elements.  An
<gi>f</gi> element has a required <att>name</att> attribute and an
associated <term>value</term>. The value may be simple: that is, a
single binary, numeric, symbolic (i.e. taken from a restricted set of
legal values), or string value, or a collection of such values,
organized in various ways, for example, as a list; or it may be
complex, that is, it may itself be a feature structure, thus providing
a degree of recursion. Values may be underspecified or defaulted in
various ways.  These possibilities are all described in more detail in
this and the following sections.
 </p>
<p>Feature and feature-value representations (including feature
structure representations) may be embedded directly at any point in an
XML document, or they may be collected together in special-purpose
feature or feature-value <term>libraries</term>. The components of
such libraries may then be referenced from other feature or
feature-value representations, using the <att>feats</att> or
<att>fVal</att> attribute as appropriate. </p>
<p>We begin by considering the simple case of a feature structure
which  contains binary-valued features only. The following three XML elements  are
needed to  represent such a feature structure: 
<specList>
<specDesc key="fs" atts="type feats"/>
<specDesc key="f" atts="name fVal"/>
<specDesc key="binary"/>
</specList>
The <att>feats</att> and <att>fVal</att> attributes are not
discussed in this section: they provide an alternative way of
indicating the content of an element, as further discussed in section
<ptr target="#FSFL"/>. 
 </p>
<p>An <gi>fs</gi> element containing <gi>f</gi> elements with binary
values can be straightforwardly used to encode the <term>matrices</term>
of feature-value specifications for phonetic segments, such as the
following for the English segment [s].
 <eg corresp="#FS-eg-01">+---          ---+
| consonantal +  |
| vocalic     -  |
| voiced      -  |
| anterior    +  |
| coronal     +  |
| continuant  +  |
| strident    +  |
+---          ---+</eg>
 </p>
<p>This representation may be encoded in XML as follows:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fs type="phonological_segments">
   <f name="consonantal"> <binary value="true"/>  </f>
   <f name="vocalic">     <binary value="false"/> </f>
   <f name="voiced">      <binary value="false"/> </f>
   <f name="anterior">    <binary value="true"/>  </f>
   <f name="coronal">     <binary value="true"/>  </f>
   <f name="continuant">  <binary value="true"/>  </f>
   <f name="strident">    <binary value="true"/>  </f>
</fs></egXML>
Note that <gi>fs</gi> elements may have an optional <att>type</att>
attribute to indicate the kind of feature structure in question,
whereas <gi>f</gi> elements must have a <att>name</att> attribute to
indicate the name of the feature. Feature structures need not be
typed, but features must be named. Similarly, the <gi>fs</gi> element may
be empty, but the <gi>f</gi> element must have (or reference) some content.</p>
<p>The restriction of specific features to specific types of values
(e.g.  the restriction of the feature <mentioned>strident</mentioned>
to a binary value) requires additional validation, as does any
restriction on the features available within a feature structure of a
particular type (e.g. whether a feature structure of type
<mentioned>phonological_segments</mentioned> necessarily contains a
feature <mentioned>voiced</mentioned>). Such validation may be carried
out at the document level, using special purpose processing, at the
schema level using additional validation rules, or at the declarative
level, using an additional mechanism such as the <term>feature-system
declaration</term> discussed in <ptr target="#FD"/>.</p>
<p>Although we have used the term <term>binary</term> for this kind
of value, and its representation in XML uses values such as
<code>true</code> and <code>false</code> (or, equivalently,
<code>1</code> and <code>0</code>), it should be noted that such
values are not restricted to propositional assertions. As the
phonological example above shows, this kind of value is intended for
use with any binary-valued feature. </p>
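<p>As a minimal sketch of the equivalence just mentioned, and assuming
that the <att>value</att> attribute of <gi>binary</gi> accepts the
lexical forms of a W3C boolean, the first two features of the matrix
for [s] could equally be written with the numeric forms:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fs type="phonological_segments">
   <f name="consonantal"> <binary value="1"/> </f>
   <f name="vocalic">     <binary value="0"/> </f>
   <!-- remaining features as in the encoding of [s] above -->
</fs></egXML>
Whichever notation is chosen, it is probably wise to use it
consistently within a project, so that a simple comparison of
attribute values remains reliable.</p>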
<!--<p xmlns="http://www.tei-c.org/ns/1.0">Formal declarations for the <gi>fs</gi>, <gi>f</gi> and
<gi>binary</gi> elements are provided below in <ptr target="#FSDEF"/>.</p>-->
</div>
<div type="div2" xml:id="FSSY"><head>Other Atomic Feature Values</head>
<p>Features may take other kinds of atomic value. In this section, we
define elements which may be used to represent: <term>symbolic
values</term>, <term>numeric values</term>, and <term>string
values</term>. The module defined by this chapter allows for the
specification of additional datatypes if necessary, by extending the
underlying class <ident type="class">model.featureVal.single</ident>. If this is done, it
is recommended that only the basic W3C datatypes be used; more
complex datatyping should be represented as feature structures.
<specList>
<specDesc key="symbol" atts="value"/>
<specDesc key="numeric"/>
<specDesc key="string"/></specList>
 </p>
<p>The <gi>symbol</gi> element is used for the value of a feature when
that feature can have any of a small, finite set of possible values,
representable as character strings.  For example, the following might
be used to represent the claim that the Latin noun form
<mentioned>mensas</mentioned>
(tables) has accusative case, feminine gender, and
plural number:<egXML xmlns="http://www.tei-c.org/ns/Examples"><fs>
   <f name="case">   <symbol value="accusative"/> </f>
   <f name="gender"> <symbol value="feminine"/> </f>
   <f name="number"> <symbol value="plural"/> </f>
</fs></egXML>
 </p>
<p>More formally, this representation shows a structure in which three
features (<term>case</term>, <term>gender</term>, and
<term>number</term>) are used to define morpho-syntactic properties of
a word. Each of these features can take one of a small number of
values (for example, case can be <code>nominative</code>,
<code>genitive</code>, <code>dative</code>, <code>accusative</code>,
etc.)  and it is therefore appropriate to represent the values taken
in this instance as <gi>symbol</gi> elements.  Note that, instead of
using a symbolic value for grammatical number, one could have named
the feature <term>singular</term> or <term>plural</term> and given it
an appropriate binary value, as in the following example: <egXML xmlns="http://www.tei-c.org/ns/Examples"><fs> 
  <f name="case"><symbol value="accusative"/></f>
  <f name="gender"><symbol value="feminine"/></f>
  <f name="singular"><binary value="false"/></f>
</fs></egXML>
Whether one uses a binary or symbolic value in situations like this is
largely a matter of taste. </p>
<p>The <gi>string</gi> element is used for the value of a
feature when that value is a string drawn from a very large or potentially
unbounded set of possible strings of characters, so that it would be
impractical or impossible to use the <gi>symbol</gi> element.  The string
value is expressed as the content of the <gi>string</gi> element,
rather  than as an attribute value.  For example, one might encode a
street address as follows:
 <egXML xmlns="http://www.tei-c.org/ns/Examples"><fs>
   <f name="address"><string>3418 East Third Street</string></f>
</fs></egXML> </p>
<p>The <gi>numeric</gi> element is used when the value of a feature is a
numeric value, or a range of such values.  For example, one might wish
to regard the house number and the street
name as different features, using an encoding like the following:
 <egXML xmlns="http://www.tei-c.org/ns/Examples"><fs>
   <f name="houseNumber"><numeric value="3418"/></f>
   <f name="streetName"><string>East Third Street</string></f>
</fs></egXML> </p>
<p>If the numeric value to be represented falls within a specific
range (for example an address that spans several numbers), the
<att>max</att> attribute may be used to supply an upper limit:
 <egXML xmlns="http://www.tei-c.org/ns/Examples"><fs>
   <f name="houseNumber"><numeric value="3418" max="3440"/></f>
   <f name="streetName"><string>East Third Street</string></f>
</fs></egXML> </p>
<p>It is also possible to specify that the numeric value (or values)
represented should (or should not) be truncated. For example, assuming
that the daily rainfall in mm is a feature of interest for some
address, one might represent this by an encoding like the following:
 <egXML xmlns="http://www.tei-c.org/ns/Examples"><fs>
   <f name="dailyRainFall"><numeric value="0.0" max="1.3" trunc="false"/></f>
</fs></egXML> This  represents any of the infinite
number of numeric values falling between 0 and 1.3; by contrast
 <egXML xmlns="http://www.tei-c.org/ns/Examples"><fs>
   <f name="dailyRainFall"><numeric value="0.0" max="1.3" trunc="true"/></f>
</fs></egXML> represents only two possible values: 0 and 1.
</p>
<p>Some communities of practice, notably those with a strong computer-science 
bias, prefer to dissociate the information on the value of the given 
feature from the specification of the data type that this value represents. 
In such cases, feature values can be provided directly as textual content 
of <gi>f</gi>, with the assumption that the data type is specified by the 
schema. The following is an example taken from ISO 24612, presenting the 
symbolic values for Active Voice and Simple Present Tense in the untyped 
form:<egXML xmlns="http://www.tei-c.org/ns/Examples"><fs><f name="voice">active</f>
<f name="tense">SimPre</f></fs></egXML></p>
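<p>For comparison, and on the assumption that both values are drawn
from small closed sets, the same information might be expressed in the
typed form used elsewhere in this chapter, with each value wrapped in
a <gi>symbol</gi> element:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fs>
   <f name="voice"><symbol value="active"/></f>
   <f name="tense"><symbol value="SimPre"/></f>
</fs></egXML>
The typed form states the kind of value in the instance itself, whereas
the untyped form leaves it to the schema or to the application.</p>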
<p>As noted above, additional processing is necessary to ensure that
appropriate values are supplied for particular features, for example
to ensure that the feature <code>singular</code> is not given a value
such as <tag>symbol value="feminine"/</tag>.  There are two
ways of attempting to ensure that only certain combinations of feature
names and values are used.  First, if the total number of legal
combinations is relatively small, one can predefine all of them in a
construct known as a <term>feature library</term>, and then reference
the combination required using the <att>feats</att> attribute in the
enclosing <gi>fs</gi> element, rather than give it explicitly.  This
method is suitable for the case, gender, and number example above:
assuming five cases, three genders, and two numbers, it requires
specifying a total of only ten (5 + 3 + 2) combinations of features
and values.  Similarly, to ensure that only feature structures
containing valid combinations of feature values are used, one can put
definitions for all valid feature structures inside a <term>feature
value library</term> (so called, since a feature structure may be the
value of a feature).  A total of 30 feature structures (5 × 3
× 2) is required to enumerate all the possible combinations of
individual case, gender and number values in the preceding
illustration.  We discuss the use of such libraries and their
representation in XML further in section <ptr target="#FSFL"/> below.
 </p>
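<p>The first approach might be sketched as follows, using the
<gi>fLib</gi> element described in section <ptr target="#FSFL"/>. The
identifiers are hypothetical, and the choice of <code>ablative</code>
as the fifth case, like the values <code>masculine</code>,
<code>neuter</code>, and <code>singular</code>, is an assumption made
only to fill out the ten combinations:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fLib n="morpho-syntactic features">
   <!-- five case values -->
   <f xml:id="ex.case.nom" name="case"><symbol value="nominative"/></f>
   <f xml:id="ex.case.gen" name="case"><symbol value="genitive"/></f>
   <f xml:id="ex.case.dat" name="case"><symbol value="dative"/></f>
   <f xml:id="ex.case.acc" name="case"><symbol value="accusative"/></f>
   <f xml:id="ex.case.abl" name="case"><symbol value="ablative"/></f>
   <!-- three gender values -->
   <f xml:id="ex.gender.masc" name="gender"><symbol value="masculine"/></f>
   <f xml:id="ex.gender.fem" name="gender"><symbol value="feminine"/></f>
   <f xml:id="ex.gender.neut" name="gender"><symbol value="neuter"/></f>
   <!-- two number values -->
   <f xml:id="ex.number.sg" name="number"><symbol value="singular"/></f>
   <f xml:id="ex.number.pl" name="number"><symbol value="plural"/></f>
</fLib>
<!-- the analysis of "mensas", given by reference -->
<fs feats="#ex.case.acc #ex.gender.fem #ex.number.pl"/></egXML>
Because every permitted name and value pair is listed once in the
library, a pairing such as <code>singular</code> with
<code>feminine</code> simply has no identifier to point to.</p>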
<p>However, the most general method of attempting to ensure that only legal
combinations of feature names and values are used is to provide a
<term>feature-system declaration</term>, as discussed in <ptr target="#FD"/>.</p>
<p>Whether at the level of feature-system declarations, feature- and
feature-value libraries, or individual features, it is possible to
align both feature names and their values with standardized external
data category repositories such as ISOcat. <note place="bottom">See
section <ptr target="#DIMVLV"/> for more discussion of the need and
rationale for ISOcat references.</note> In the following example, both
the feature <val>part_of_speech</val> and its value
<val>#commonNoun</val> are aligned with the respective definitions
provided by <ref target="#ISO-12620">ISO DCR (Data Category
Registry)</ref>, as implemented by ISOcat.
<egXML xmlns="http://www.tei-c.org/ns/Examples" valid="feasible">
<fs xmlns:dcr="http://www.isocat.org/ns/dcr">
<!--...-->
<f
name="part_of_speech"
dcr:datcat="http://www.isocat.org/datcat/DC-1345"
fVal="#commonNoun"
dcr:valueDatcat="http://www.isocat.org/datcat/DC-1256"
/>
<!-- ... -->
</fs>
</egXML></p>
</div>
<div type="div2" xml:id="FSFL"><head>Feature Libraries and Feature-Value Libraries</head>
<p>As the examples in the preceding section suggest, the direct
encoding of feature structures can be verbose.  Moreover, it is often
the case that particular feature-value combinations, or feature
structures composed of them, are re-used in different analyses. To reduce
the size and complexity of the task of encoding feature structures, one
may use the <att>feats</att> attribute of the <gi>fs</gi> element to point
to one or more of the feature-value specifications for that element.   This indirect method of
encoding feature structures presumes that the <gi>f</gi> elements are
assigned unique <att>xml:id</att> values, and are collected together in
<gi>fLib</gi> elements (<term>feature libraries</term>).  In the same way, feature
values of whatever type can be collected together in <gi>fvLib</gi> elements
(<term>feature-value libraries</term>). If a feature has as its
value a feature structure or other value which is predefined in this way,  the
<att>fVal</att> attribute may be used to point to it, as discussed in
the next section. The following elements  are used for representing feature libraries and feature-value libraries:
<specList>
<specDesc key="fLib"/>
<specDesc key="fvLib"/></specList>
 </p>
<p>For example, suppose a feature library for phonological feature
specifications is set up as follows.
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fLib n="phonological features">
   <f xml:id="CNS1" name="consonantal"> <binary value="true"/> </f>
   <f xml:id="CNS0" name="consonantal"> <binary value="false"/> </f>
   <f xml:id="VOC1" name="vocalic">     <binary value="true"/> </f>
   <f xml:id="VOC0" name="vocalic">     <binary value="false"/> </f>
   <f xml:id="VOI1" name="voiced">      <binary value="true"/> </f>
   <f xml:id="VOI0" name="voiced">      <binary value="false"/> </f>
   <f xml:id="ANT1" name="anterior">    <binary value="true"/> </f>
   <f xml:id="ANT0" name="anterior">    <binary value="false"/> </f>
   <f xml:id="COR1" name="coronal">     <binary value="true"/> </f>
   <f xml:id="COR0" name="coronal">     <binary value="false"/> </f>
   <f xml:id="CNT1" name="continuant">  <binary value="true"/> </f>
   <f xml:id="CNT0" name="continuant">  <binary value="false"/> </f>
   <f xml:id="STR1" name="strident">    <binary value="true"/> </f>
   <f xml:id="STR0" name="strident">    <binary value="false"/> </f>
   <!-- ... -->
</fLib></egXML>
 </p>
<p>Then the feature structures that represent the analysis of the
phonological segments (phonemes) <code>/t/</code>, <code>/d/</code>,
<code>/s/</code>, and <code>/z/</code> may be defined as follows.
   <egXML xml:lang="und" xmlns="http://www.tei-c.org/ns/Examples"><fs feats="#CNS1 #VOC0 #VOI0 #ANT1 #COR1 #CNT0 #STR0"/>
<fs feats="#CNS1 #VOC0 #VOI1 #ANT1 #COR1 #CNT0 #STR0"/>
<fs feats="#CNS1 #VOC0 #VOI0 #ANT1 #COR1 #CNT1 #STR1"/>
<fs feats="#CNS1 #VOC0 #VOI1 #ANT1 #COR1 #CNT1 #STR1"/></egXML>
 </p>
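<p>To make the effect of the <att>feats</att> attribute concrete: on
the reading that each pointer stands for the feature it identifies,
the first of these feature structures (the analysis of
<code>/t/</code>) should be equivalent to the following explicit
encoding:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fs>
   <f name="consonantal"> <binary value="true"/>  </f>
   <f name="vocalic">     <binary value="false"/> </f>
   <f name="voiced">      <binary value="false"/> </f>
   <f name="anterior">    <binary value="true"/>  </f>
   <f name="coronal">     <binary value="true"/>  </f>
   <f name="continuant">  <binary value="false"/> </f>
   <f name="strident">    <binary value="false"/> </f>
</fs></egXML>
</p>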
<p>The preceding are but four of the 128 logically possible fully
specified phonological segments using the seven binary features listed in
the feature library.  Presumably not all combinations of features
correspond to phonological segments (there are no strident vowels, for
example).  The legal combinations, however, can be collected together,
each one represented as an identifiable <gi>fs</gi> element within a
<term>feature-value library</term>, as in the following example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fvLib xml:id="fsl1" n="phonological segment definitions">
   <!-- ... -->
   <fs xml:id="T.DF" feats="#CNS1 #VOC0 #VOI0 #ANT1 #COR1 #CNT0 #STR0"/>
   <fs xml:id="D.DF" feats="#CNS1 #VOC0 #VOI1 #ANT1 #COR1 #CNT0 #STR0"/>
   <fs xml:id="S.DF" feats="#CNS1 #VOC0 #VOI0 #ANT1 #COR1 #CNT1 #STR1"/>
   <fs xml:id="Z.DF" feats="#CNS1 #VOC0 #VOI1 #ANT1 #COR1 #CNT1 #STR1"/>
   <!-- ... -->
</fvLib></egXML>
 </p>
<p>Once defined, these feature structure values can also be reused.
Other <gi>f</gi> elements may invoke them by reference, using the
<att>fVal</att> attribute; for example, one might use them in a
feature-value pair such as: <egXML xmlns="http://www.tei-c.org/ns/Examples"><f name="dental-stop" fVal="#T.DF"/> </egXML> rather than expanding the hierarchy of the
component phonological features explicitly.  </p>
<p>Feature structures stored in this way  may also be associated with
the text which they are intended to annotate, either by a link from the text
(for example, using the TEI global <att>ana</att> attribute), or
by means of stand-off annotation techniques (for example, using the TEI
<gi>link</gi> element): see further section <ptr target="#FSLINK"/>
below.
</p>
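<p>A minimal sketch of the first technique follows. It assumes a
hypothetical transcription of the word <mentioned>zest</mentioned> in
which each segment is marked up with the TEI <gi>c</gi> element; the
vowel is left unanalysed because no definition for it appears in the
library shown above:
<egXML xmlns="http://www.tei-c.org/ns/Examples" valid="feasible"><w><c ana="#Z.DF">z</c><c>e</c><c ana="#S.DF">s</c><c ana="#T.DF">t</c></w></egXML>
</p>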
<p>Note that when features or feature structures are linked to in this
way, the effect is as if the item linked to had been copied into the
place from which it is linked. This form of linking should be distinguished from
the phenomenon of <term>structure-sharing</term>, where it is desired
to indicate that some part of an annotation structure appears
simultaneously in two or more places within the structure. This kind
of annotation should be represented using the <gi>vLabel</gi> element, as
discussed in <ptr target="#FSVAR"/> below. </p>
</div>
<div type="div2" xml:id="FSST"><head>Feature Structures as Complex Feature Values</head>
<p>Features may have complex values as well as atomic ones; the
simplest such complex value is represented by supplying an <gi>fs</gi>
element as the content of an <gi>f</gi> element, or (equivalently) by
supplying the identifier of an <gi>fs</gi> element as the value for
the <att>fVal</att> attribute on the <gi>f</gi>
element. Structures may be nested as deeply as appropriate, using this
mechanism.  For example, an <gi>fs</gi> element may contain or point
to an <gi>f</gi> element, which may contain or point to an <gi>fs</gi>
element, which may contain or point to an <gi>f</gi> element, and so
on.</p>
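<p>The two equivalent forms may be sketched as follows. The feature
names and the identifier <code>ex.agr1</code> are hypothetical, chosen
only for illustration:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><!-- value supplied as content -->
<f name="agreement">
   <fs>
      <f name="number"><symbol value="singular"/></f>
   </fs>
</f>
<!-- value supplied by reference, assuming that an fs element
     with the identifier ex.agr1 is defined elsewhere -->
<f name="agreement" fVal="#ex.agr1"/></egXML>
</p>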
<p>To illustrate the use of complex values, consider the following
simple model of a word, as a structure combining surface form
information, a syntactic category, and semantic information. Each word
analysis is represented as an <tag>fs type='word'</tag> element,
containing three features named <code>surface</code>,
<code>syntax</code>, and <code>semantics</code>. The first of these
has an atomic string value, but the other two have complex values,
represented as nested feature structures of types
<code>category</code> and <code>act</code> respectively:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fs type="word">
   <f name="surface"><string>love</string></f>
   <f name="syntax">
	<fs type="category">
	  <f name="pos"><symbol value="verb"/></f>
	  <f name="val"><symbol value="transitive"/></f>
	</fs>
   </f>
   <f name="semantics">
	<fs type="act">
	  <f name="rel"><symbol value="LOVE"/></f>
	</fs>
   </f>
</fs></egXML></p>
<p>This analysis does not tell us much about the meaning of the
symbols <code>verb</code> or <code>transitive</code>. It might be
preferable to replace these atomic feature values by feature
structures.  Suppose therefore that we maintain a feature-value
library for each of the major syntactic categories (N, V, ADJ, PREP):
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fvLib n="Major category definitions">
<!-- ... -->
<fs xml:id="N" type="noun">
  <!--  noun features defined here -->
</fs>
<fs xml:id="V" type="verb">
  <!-- verb features defined here -->
</fs>
</fvLib>
</egXML>
</p>
<p>This library allows us to use shortcut codes (<code>N</code>,
<code>V</code>, etc.) to reference a complete definition for the
corresponding feature structure. Each definition may be explicitly
contained within the <gi>fs</gi> element, as a number of <gi>f</gi>
elements. Alternatively, the relevant features may be referenced by
their identifiers, supplied as the value of the <att>feats</att>
attribute, as in these examples:
<egXML xmlns="http://www.tei-c.org/ns/Examples">&lt;!-- ... --&gt;
&lt;fs xml:id="ADJ" type="adjective" feats="#F1 #F2"/&gt;
&lt;fs xml:id="PREP" type="preposition" feats="#F1 #F3"/&gt;
&lt;!-- ... --&gt;
</egXML>
</p>
<p>This ability to re-use feature definitions within multiple feature
structure definitions is an essential simplification in any realistic
example.  In this case, we assume the existence of a feature library
containing specifications for the basic feature categories, like the
following; identifiers such as <code>F1</code> and <code>F2</code> above
stand for entries of this kind:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fLib n="categorial features">
   <f xml:id="NN-1" name="nominal"><binary value="true"/></f>
   <f xml:id="NN-0" name="nominal"><binary value="false"/></f>
   <f xml:id="VV-1" name="verbal"><binary value="true"/></f>
   <f xml:id="VV-0" name="verbal"><binary value="false"/></f>
<!-- ... -->
</fLib>
</egXML>
</p>
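<p>As a hedged illustration of how such a library might be used,
suppose that nouns are analysed as nominal and not verbal, and verbs as
verbal and not nominal. This cross-classification is an assumption made
for the sake of the example and is not prescribed by this chapter. The
noun and verb definitions could then be given by reference, as follows
(their identifiers are omitted here):
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fs type="noun" feats="#NN-1 #VV-0"/>
<fs type="verb" feats="#NN-0 #VV-1"/></egXML>
</p>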
<p>With such libraries in place, and assuming the availability of
similarly predefined feature structures for transitivity and
semantics, the preceding example could be considerably simplified:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fs type="word">
<f name="surface"><string>love</string></f>
   <f name="syntax">
	<fs type="category">
	  <f name="pos" fVal="#V"/>
	  <f name="val" fVal="#TRNS"/>
	</fs>
   </f>
   <f name="semantics">
	<fs type="act">
	  <f name="rel" fVal="#LOVE"/>
	</fs>
   </f>
</fs>
</egXML></p>
<p>Although in principle the <att>fVal</att> attribute could  point to
any kind of feature value, its use is not recommended for simple
atomic values. </p>
</div>
<div type="div2" xml:id="FSVAR"><head>Re-entrant Feature Structures</head>
<p>Sometimes the same feature value is required at multiple places
within a feature structure, in particular where the value is only
partially specified at one or more places. The <gi>vLabel</gi> element is
provided as a means of labelling each such re-entrancy point:
<specList>
<specDesc key="vLabel"/>
</specList>
</p>
<p>For example, suppose one wishes to represent noun-verb agreement as
a single feature structure. Within the representation, the feature
indicating (say) number appears more than once. To represent the fact
that each occurrence is another appearance of the same feature (rather
than a copy) one could use an encoding like the following:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fs xml:id="NVA">
<f name="nominal">
  <fs>
    <f name="nm-num">
     <vLabel name="L1">
       <symbol value="singular"/>
    </vLabel>
   </f>
   <!-- other nominal features -->
  </fs>
</f>
<f name="verbal">
  <fs>
    <f name="vb-num">
      <vLabel name="L1"/>
    </f>
    <!-- other verbal features -->
  </fs>
</f>
</fs></egXML>
</p>
<p>In the above encoding, the features named <code>vb-num</code> and
<code>nm-num</code> exhibit structure sharing. Their values, given as
<gi>vLabel</gi> elements, are understood to be references to the same
point in the feature structure, which is identified by the shared value of their
<att>name</att> attribute. </p>
<p>The scope of the names used to label re-entrancy points is that of the
outermost <gi>fs</gi> element in which they appear. When a feature
structure is imported from a feature value library, or referenced from
elsewhere (for example by using the <att>fVal</att> attribute) the
names of any
sharing points it may contain are implicitly prefixed by the identifier used
for the imported feature structure, to avoid name clashes. Thus, if
some other feature structure were to reference the <gi>fs</gi> element
   given in the example above, for example in this way: <egXML xml:lang="und" xmlns="http://www.tei-c.org/ns/Examples"><f name="class" fVal="#NVA"/></egXML> then
the labelled points in the example would be interpreted as if they had the
name <code>NVAL1</code>. </p>
</div>
<div type="div2" xml:id="FSSS"><head>Collections as Complex Feature Values</head>
<p>Complex feature values need not always be represented as feature
structures. Multiple values may also be organized as sets, bags
(also known as multisets), or lists of atomic values of any type. The <gi>vColl</gi>
element is provided to represent such cases:
<specList>
<specDesc key="vColl"/>
</specList>
</p>
<p>A feature whose value is regarded as a set, bag, or list may have
any positive number of values as its content, or none at
all (thus allowing for representation of the empty set, bag, or list).
The items in a list are ordered, and need not be distinct. The items
in a set are not ordered, and must be distinct. The items in a bag are
not ordered, and need not be distinct. Sets and bags are thus distinguished
from lists in that the order in which the values are specified does
not matter for the former, but does matter for the latter, while sets
are distinguished from bags and lists in that repetitions of values do
not count for the former but do count for the latter.  
</p>
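<p>These distinctions may be sketched using the <att>org</att>
attribute of <gi>vColl</gi>. The feature names and symbolic values
below are hypothetical:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fs>
   <!-- a list: order and repetition both matter -->
   <f name="asList">
      <vColl org="list">
         <symbol value="a"/>
         <symbol value="b"/>
         <symbol value="a"/>
      </vColl>
   </f>
   <!-- a bag: repetition matters, order does not -->
   <f name="asBag">
      <vColl org="bag">
         <symbol value="a"/>
         <symbol value="a"/>
         <symbol value="b"/>
      </vColl>
   </f>
   <!-- a set: members are distinct, order does not matter -->
   <f name="asSet">
      <vColl org="set">
         <symbol value="a"/>
         <symbol value="b"/>
      </vColl>
   </f>
</fs></egXML>
On this reading, the bag shown has the same members as one listing
<code>a</code>, <code>b</code>, <code>a</code> in that order, whereas
the corresponding two lists would differ.</p>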
<p>If no value is specified for the <att>org</att> attribute, the
assumption is that the <gi>vColl</gi> defines a list of values. If the
<gi>vColl</gi> element is empty, the assumption is that it represents
the null list, set, or bag. <!--, unless the <att>feats</att> attribute is
used to specify its contents. If values are supplied  within a <gi>vColl</gi> element
which also specifies values on its <att>feats</att> attribute, the
implication is that the two sets of values are to be unified.--></p>
<p>To illustrate the use of the <att>org</att> attribute, suppose that
a feature structure analysis is used to represent a genealogical tree,
with the information about each individual treated as a single feature
structure, like this: 
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fs xml:id="p027" type="person">
   <f name="forenames">
     <vColl>
       <string>Daniel</string>
       <string>Edouard</string>
     </vColl>
   </f>
   <f name="mother" fVal="#p002"/>
   <f name="father" fVal="#p009"/>
   <f name="birthDate">
      <fs type="date" feats="#y1988 #m04 #d17"/>
   </f>
   <f name="birthPlace" fVal="#austintx"/>
   <f name="siblings">
       <vColl org="set">
        <fs copyOf="#pnb005"/>
        <fs copyOf="#prb001"/>
       </vColl>
   </f>
</fs></egXML>
 </p>
<p>In this example, the <gi>vColl</gi> element is first used to supply
a list of <soCalled>name</soCalled> feature values, which together
constitute the <soCalled>forenames</soCalled> feature. Other features
are defined by reference to values which we assume are held in some
external feature value library (not shown here). The
<gi>vColl</gi> element is used a second time to indicate that the
person's siblings should be regarded as constituting a set rather
than a list. Each sibling is represented by a feature structure: in
this example, each feature structure is a copy of one specified in the
feature value library. </p>
<p>If a specific feature contains only a single feature structure as
its value, the component features of which are organized as a set, bag,
or list, it may be more convenient to represent the value as a
<gi>vColl</gi> rather than as an <gi>fs</gi>. For example, consider the
following encoding of the English verb form
<mentioned>sinks</mentioned>, which contains an
<mentioned>agreement</mentioned> feature whose value is a feature
structure which contains <mentioned>person</mentioned> and
<mentioned>number</mentioned> features with symbolic values.
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fs type="word">
  <f name="category"> <symbol value="verb"/> </f>
  <f name="tense"> <symbol value="present"/> </f>
  <f name="agreement">
    <fs>
      <f name="person"> <symbol value="third"/> </f>
      <f name="number"> <symbol value="singular"/> </f>
    </fs>
  </f>
</fs></egXML>
 </p>
<p>If the names of the features contained within
the <mentioned>agreement</mentioned> feature structure are
of no particular significance, the following simpler representation
may be used:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fs type="word">
  <f name="category"> <symbol value="verb"/> </f>
  <f name="tense"> <symbol value="present"/> </f>
  <f name="agreement">
     <vColl org="set">
        <symbol value="third"/>
        <symbol value="singular"/>
     </vColl>
  </f>
</fs></egXML>
 </p>
<p>The <gi>vColl</gi> element is also useful in cases where an analysis
has several components. In the following example, the French
word <mentioned>auxquels</mentioned> has a two-part analysis,
represented as a list of two values. The first specifies that the word contains a
preposition; the second that it contains a masculine plural relative
pronoun: 
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fs>
      <f name="lex">
	<symbol value="auxquels"/>
      </f>
      <f name="maf">
	<vColl org="list">
	  <fs>
	    <f name="cat"><symbol value="prep"/></f>
	  </fs>
	  <fs>
	    <f name="cat"><symbol value="pronoun"/></f>
	    <f name="kind"><symbol value="rel"/></f>
	    <f name="num"><symbol value="pl"/></f>
	    <f name="gender"><symbol value="masc"/></f>
	  </fs>
	</vColl>
      </f>
    </fs></egXML>
</p>
<p>The set, bag, or list which has no members is known as the null (or
empty) set, bag, or list.  A <gi>vColl</gi> element with no content and
with no value for its <att>feats</att> attribute is interpreted as
referring to the null set, bag, or list, depending on the value of its
<att>org</att> attribute.</p>
<p>If, for example, the individual described by the
feature structure with identifier <code>p027</code> (above) had no siblings, we might specify the
<mentioned>siblings</mentioned> feature as follows.
<egXML xmlns="http://www.tei-c.org/ns/Examples"><f name="siblings"><vColl org="set"/></f></egXML>
 </p>
<p>A <gi>vColl</gi> element may also collect together one or more other
<gi>vColl</gi> elements, if, for example, one of the members of a set is
itself a set, or if two lists are concatenated together. Note that
such collections pay no attention to the contents of the nested
<gi>vColl</gi> elements: if it is desired to produce the union of two
sets, the <gi>vMerge</gi> element discussed below should be used to
make a new collection from the two sets.  </p>
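<p>As an illustrative sketch of such nesting (the feature name
<mentioned>partnerships</mentioned> and the string values are invented
for this example), the following set has two members, each of which is
itself a set:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><f name="partnerships">
   <vColl org="set">
      <vColl org="set">
         <string>Anna</string>
         <string>Ben</string>
      </vColl>
      <vColl org="set">
         <string>Carla</string>
         <string>Dev</string>
      </vColl>
   </vColl>
</f></egXML>
The outer collection here has exactly two members (the two inner
sets), not four: the nesting is preserved rather than flattened.</p>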
</div>
<div type="div2" xml:id="FVE"><head>Feature Value Expressions</head>
<p>It is sometimes desirable to express the value of a feature as the
result of an operation over some other value (for example, as
<soCalled>not green</soCalled>, or as <soCalled>male or
female</soCalled>, or as the concatenation of two collections).  Three
special purpose elements are provided to represent disjunctive
alternation, negation, and collection of values:
<specList>
  <specDesc key="vAlt"/>
  <specDesc key="vNot"/>
  <specDesc key="vMerge"/>
</specList>
 </p>
<div type="div3" xml:id="FVALT"><head>Alternation</head>
<p>The <gi>vAlt</gi> element can be used wherever a feature value can
appear. It contains two or more feature values, any one of which is to
be understood as the value required. Suppose, for example, that we are
using a feature system to describe residential property, using such
features as <mentioned>number.of.bathrooms</mentioned>. In a
particular case, we might wish to represent uncertainty as to whether
a house has two or three bathrooms. As we have already shown, one
simple way to represent this would be with a numeric maximum:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><f name="number.of.bathrooms"><numeric value="2" max="3"/></f></egXML>
 </p>
<p>A  more general way would be to represent the
alternation explicitly, in this way:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><f name="number.of.bathrooms"> <vAlt><numeric value="2"/><numeric value="3"/> </vAlt></f></egXML>
 </p>
<p>The <gi>vAlt</gi> element represents alternation over feature
values, not feature-value pairs. If therefore the uncertainty relates
to two or more feature value specifications, each  must be represented
as a feature structure, since a feature structure can always appear
where a value is required. For example, if it is uncertain
whether the house being described has two bathrooms or two
bedrooms, a structure like the following may be used:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><f name="rooms">
   <vAlt>
      <fs><f name="number.of.bathrooms"> <numeric value="2"/> </f></fs>
      <fs><f name="number.of.bedrooms"> <numeric value="2"/> </f></fs>
   </vAlt>
</f></egXML>
 </p>
<p>Note that alternation is always regarded as <term>exclusive</term>:
in the case above, the implication is that  having two bathrooms
excludes the possibility of having two bedrooms and vice versa. If
inclusive alternation is required, a <gi>vColl</gi> element may be
included in the alternation as follows:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><f name="rooms">
   <vAlt><fs><f name="number.of.bathrooms"> <numeric value="2"/> </f></fs>
          <fs><f name="number.of.bedrooms"> <numeric value="2"/> </f></fs>
      <vColl>
        <fs><f name="number.of.bathrooms"> <numeric value="2"/> </f></fs>
        <fs><f name="number.of.bedrooms"> <numeric value="2"/> </f></fs>
      </vColl>
   </vAlt>
</f></egXML>
This analysis indicates that the property may have two bathrooms, two
bedrooms, or both two bathrooms and two bedrooms.
</p>
<p>As the previous example shows, the <gi>vAlt</gi> element can also
be used to indicate alternations among values of features organized as
sets, bags or lists.  Suppose we use a feature
<code>selling.points</code> to describe items that are mentioned to
enhance a property's sales value, such as whether it has a pool or a
good view.  Now suppose for a particular listing, the selling points include
an alarm system and a good view, and either a pool or a jacuzzi (but
not both).  This situation could be represented, using the
<gi>vAlt</gi> element, as follows.
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fs type="real_estate_listing">
   <f name="selling.points"> 
      <vColl org="set">
      <string>alarm system</string>
      <string>good view</string>
      <vAlt>
         <string>pool</string>
         <string>jacuzzi</string>
      </vAlt>
      </vColl>
   </f>
</fs></egXML>
 </p>
<p>Now suppose the situation is like the preceding except that one is
also uncertain whether the property has an alarm system or a good
view.  This can be represented as follows.
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fs type="real_estate_listing">
   <f name="selling.points">
      <vColl org="set">
       <vAlt>
         <string>alarm system</string>
         <string>good view</string>
      </vAlt>
      <vAlt>
         <string>pool</string>
         <string>jacuzzi</string>
      </vAlt>
      </vColl>
   </f>
</fs></egXML>
 </p>
<p>If a large number of ambiguities or uncertainties need to be
represented, involving a relatively small number of features and
values, it is recommended that a stand-off technique, for example
using the general-purpose <gi>alt</gi> element discussed in
section <ptr target="#SAAT"/> <!-- of  TEI P5,--> be used, rather than the
special-purpose <gi>vAlt</gi> element.
</p>
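<p>A minimal sketch of the stand-off approach, assuming that each
competing analysis is encoded as a complete feature structure carrying
its own identifier (the identifiers <code>listing.a</code> and
<code>listing.b</code> are invented for this illustration), might look
like this:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fs xml:id="listing.a" type="real_estate_listing">
   <f name="number.of.bathrooms"><numeric value="2"/></f>
</fs>
<fs xml:id="listing.b" type="real_estate_listing">
   <f name="number.of.bathrooms"><numeric value="3"/></f>
</fs>
<!-- elsewhere in the document -->
<alt target="#listing.a #listing.b" mode="excl"/></egXML>
Here the <att>mode</att> value <val>excl</val> marks the two analyses
as mutually exclusive; section <ptr target="#SAAT"/> should be
consulted for the full definition of the <gi>alt</gi> element.</p>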
</div>
<div type="div3" xml:id="FVNOT"><head>Negation</head>
<p>The <gi>vNot</gi> element can be used wherever a feature value can
appear. It contains any feature value and returns the complement of
its contents. For example, the feature
<mentioned>number.of.bathrooms</mentioned> in the following example
has any whole numeric value other than 2:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><f name="number.of.bathrooms"> <vNot><numeric value="2"/> </vNot></f></egXML>
 </p>
<p>Strictly speaking, the effect of the <gi>vNot</gi> element is to
provide the complement of the feature values it contains, rather than
their negation. If a feature system declaration is available which defines the
possible values for the associated feature, then it is possible to say
more about the negated value. For example, suppose that the
available values for the feature <code>case</code> are declared to be
nominative, genitive, dative, or accusative, whether in a TEI feature
system declaration or
by some other means. Then the following two specifications are equivalent:
<egXML xmlns="http://www.tei-c.org/ns/Examples"> (i) <f name="case">
      <vNot><symbol value="genitive"/></vNot></f>
(ii) <f name="case">
      <vAlt>
       <symbol value="nominative"/>
       <symbol value="dative"/>
       <symbol value="accusative"/>
     </vAlt>
    </f></egXML>
</p>
<p>If however no such system declaration is available, all that one
can say about a feature specified via negation is that its value is
something other than the negated value. </p>
<p>Negation is always applied to a feature value, rather than to a
feature-value pair. The negation of an atomic value is the set of all
other values which are possible for the feature. </p>
<p>Any kind of value can be negated, including collections
(represented by <gi>vColl</gi> elements) or feature structures
(represented by <gi>fs</gi> elements). The negation of any complex
value is understood to be the set of values  which
cannot be unified with it. Thus, for example, the negation of the
feature structure F is understood to be the set of feature structures
which are not unifiable with F. In the absence of a constraint
mechanism such as the Feature System Declaration, the negation of a
collection is anything that is not unifiable with it, including
collections of different types and atomic values. It will generally be
more useful to require that the organization of the negated value be
the same as that of the original value, for example that a negated set
is understood to mean the set which is a complement of the set, but
such a requirement cannot be enforced in the absence of a constraint
mechanism. </p>
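<p>For instance, reusing the <mentioned>agreement</mentioned> feature
from the <mentioned>sinks</mentioned> example earlier in this chapter,
a word form whose agreement is anything other than third person
singular might be sketched as follows:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><f name="agreement">
   <vNot>
      <fs>
         <f name="person"> <symbol value="third"/> </f>
         <f name="number"> <symbol value="singular"/> </f>
      </fs>
   </vNot>
</f></egXML>
On the reading given above, this value stands for any value which
cannot be unified with the enclosed feature structure, for example a
feature structure whose <mentioned>person</mentioned> is first or whose
<mentioned>number</mentioned> is plural.</p>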
</div>
<div type="div3" xml:id="FVCOLL"><head>Collection of Values</head>
<p>The <gi>vMerge</gi> element can be used wherever a feature value can
appear. It contains two or more feature values, all of which are to be
collected together. The organization of the resulting collection is
specified by the value of the <att>org</att> attribute, which need
not necessarily be the same as that of its constituent values if these
are collections. For example, one can change a list to a set, or vice versa.</p>
<p>As an example, suppose that we wish to represent the range of
possible values for a feature <soCalled>genders</soCalled> used to
describe some language. It would be natural to represent the possible
values  as a set, using the <gi>vColl</gi> element as in the following
example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fs>
<f name="genders">
   <vColl org="set">
       <symbol value="masculine"/>
       <symbol value="feminine"/>
   </vColl>
</f>
</fs>
</egXML>
</p>
<p>Suppose, however, that we discover that for some language it is necessary
to add a new possible value, and to treat the value of the
feature as a list rather than as a set. The <gi>vMerge</gi> element can
be used to achieve this:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fs>
<f name="genders">
<vMerge org="list">
   <vColl org="set">
       <symbol value="masculine"/>
       <symbol value="feminine"/>
   </vColl>
   <symbol value="neuter"/>
</vMerge>
</f>
</fs>
</egXML>
</p> 
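<p>The same mechanism can be used to form the union of two sets, as
mentioned in the earlier discussion of nested collections. In the
following sketch (the two inner sets are invented for illustration),
both constituents are sets and the result is also organized as a set:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fs>
<f name="genders">
<vMerge org="set">
   <vColl org="set">
       <symbol value="masculine"/>
       <symbol value="feminine"/>
   </vColl>
   <vColl org="set">
       <symbol value="feminine"/>
       <symbol value="neuter"/>
   </vColl>
</vMerge>
</f>
</fs>
</egXML>
Assuming the usual understanding of a set as containing no duplicates,
the merged value has three members: masculine, feminine, and
neuter.</p>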
<!-- more examples needed, use from sect 4 -->
</div>
</div>
<div type="div2" xml:id="FSBO"><head>Default Values</head>
<p>The value of a feature may be underspecified in a number of
different ways. It may be null, unknown, or uncertain with respect to
a range of known possibilities, as well as being defined as a negation
or an alternation. As previously noted, the specification of the range
of known possibilities for a given feature is not part of the current
specification: in the TEI scheme, this information is conveyed by the
<term>feature system declaration</term>. Using this, or some other
system, we might specify (for example) that the range of values
for a feature includes symbols for masculine, feminine, and neuter,
and that the default value is neuter. With such definitions available
to us, it becomes possible to say that some feature takes the default
value, or some unspecified value from the list. The following special
element is provided for this purpose:
<specList>
<specDesc key="default"/>
</specList>
</p> 
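<p>A declaration of the kind assumed in the remainder of this section
might be sketched as follows, using the <gi>fDecl</gi>,
<gi>vRange</gi>, and <gi>vDefault</gi> elements described in chapter
<ptr target="#FD"/>; the feature name <mentioned>gender</mentioned>
and its three values are simply those of the running example:
<egXML xmlns="http://www.tei-c.org/ns/Examples" valid="feasible"><fDecl name="gender">
   <vRange>
      <vAlt>
         <symbol value="feminine"/>
         <symbol value="masculine"/>
         <symbol value="neuter"/>
      </vAlt>
   </vRange>
   <vDefault>
      <symbol value="neuter"/>
   </vDefault>
</fDecl></egXML>
</p>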
<p>The value of an empty <gi>f</gi> element which also lacks a <att>fVal</att>
attribute is understood to be the most general
case, i.e. any of the available values. Thus, assuming the feature
system defined above, the following two representations are equivalent. 
<egXML xmlns="http://www.tei-c.org/ns/Examples"><f name="gender"/>
<f name="gender">
   <vAlt>
      <symbol value="feminine"/>
      <symbol value="masculine"/>
      <symbol value="neuter"/>
   </vAlt>
</f></egXML>
 </p>
<p>If, however, the value is explicitly stated to be the default one,
using the <gi>default</gi> element, then the  following two representations
are equivalent: 
<egXML xmlns="http://www.tei-c.org/ns/Examples"><f name="gender"> <default/> </f></egXML>
<egXML xmlns="http://www.tei-c.org/ns/Examples"><f name="gender"> <symbol value="neuter"/> </f></egXML>
 </p>
<p>Similarly, if the value is stated to be the negation of the
default, then the following two representations are equivalent:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><f name="gender"> <vNot><default/></vNot> </f></egXML>
<egXML xmlns="http://www.tei-c.org/ns/Examples"><f name="gender">    <vAlt>
      <symbol value="feminine"/>
      <symbol value="masculine"/>
</vAlt></f></egXML>
</p>
</div>
<!-- discussion of subsumption and nonsubsumption was here: removed to
     separate file for now -->
<!-- dogstoday example was here: removed to separate file pro tem -->
<div type="div2" xml:id="FSLINK"><head>Linking Text and Analysis</head>
<p>Text elements can be linked with feature structures using any of
the linking methods discussed elsewhere in the Guidelines (see for
example sections <ptr target="#AIATTS"/> and <ptr target="#AILA"/>).
In the simplest case, the <att>ana</att> attribute may be used
to point from any element to an annotation of it, as in the following
example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><s n="00741">
   <w ana="#at0">The</w>
   <w ana="#ajs">closest</w>
   <w ana="#pnp">he</w>
   <w ana="#vvd">came</w>
   <w ana="#prp">to</w>
   <w ana="#nn1">exercise</w>
   <w ana="#vbd">was</w>
   <w ana="#to0">to</w>
   <w ana="#vvi">open</w>
   <w ana="#crd">one</w>
   <w ana="#nn1">eye</w>
   <phr ana="#av0">  
      <w>every</w>
      <w>so</w>
      <w>often</w>
   </phr>
   <c ana="#pun">,</c>
   <w ana="#cjs">if</w>
   <w ana="#pni">someone</w>
   <w ana="#vvd">entered</w>
   <w ana="#at0">the</w>
   <w ana="#nn1">room</w>
   <!-- ... -->
</s></egXML>
 </p>
<p>The values specified for the <att>ana</att> attribute reference
components of a feature-structure library, which represents all of the
grammatical structures used by this encoding scheme. (For illustrative
purposes, we cite here only the structures needed for the first six
words of the sample sentence):
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fvLib xml:id="C6" n="Claws 6 tags">
   <!-- ... -->
   <fs xml:id="ajs" type="grammatical_structure" feats="#fl-wj #fl-ds"/>
   <fs xml:id="at0" type="grammatical_structure" feats="#fl-wl"/>
   <fs xml:id="pnp" type="grammatical_structure" feats="#fl-wr #fl-rp"/>
   <fs xml:id="vvd" type="grammatical_structure" feats="#fl-wv #fl-bv #fl-fd"/>
   <fs xml:id="prp" type="grammatical_structure" feats="#fl-wp #fl-bp"/>
   <fs xml:id="nn1" type="grammatical_structure" feats="#fl-wn #fl-tc #fl-ns"/>
 <!-- ... -->
</fvLib></egXML>
The components of each feature structure in the library are 
referenced in much the same way, using the
<att>feats</att> attribute to  identify one or more <gi>f</gi>
elements in the following feature library (again, only a few of the
available features are quoted here):
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fLib>
   <!-- ... -->
   <f xml:id="fl-bv" name="verbbase"> <symbol value="main"/> </f>
   <f xml:id="fl-bp" name="prepbase"> <symbol value="lexical"/> </f>
   <f xml:id="fl-ds" name="degree">   <symbol value="superlative"/> </f>
   <f xml:id="fl-fd" name="verbform"> <symbol value="ed"/> </f>
   <f xml:id="fl-ns" name="number">   <symbol value="singular"/> </f>
   <f xml:id="fl-rp" name="prontype"> <symbol value="personal"/> </f>
   <f xml:id="fl-tc" name="nountype"> <symbol value="common"/> </f>
   <f xml:id="fl-wj" name="class">    <symbol value="adjective"/> </f>
   <f xml:id="fl-wl" name="class">    <symbol value="article"/> </f>
   <f xml:id="fl-wn" name="class">    <symbol value="noun"/> </f>
   <f xml:id="fl-wp" name="class">    <symbol value="preposition"/> </f>
   <f xml:id="fl-wr" name="class">    <symbol value="pronoun"/> </f>
   <f xml:id="fl-wv" name="class">    <symbol value="verb"/> </f>
   <!-- ... -->
</fLib></egXML>
 </p>
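<p>To make the indirection concrete: if the three pointers given on
the feature structure identified as <code>vvd</code> are resolved
against the feature library, the result is equivalent to the following
fully expanded feature structure (a sketch, written out by hand here
rather than generated by any particular tool):
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fs type="grammatical_structure">
   <f name="class">    <symbol value="verb"/> </f>
   <f name="verbbase"> <symbol value="main"/> </f>
   <f name="verbform"> <symbol value="ed"/> </f>
</fs></egXML>
</p>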
<p>Alternatively, a stand-off technique may be used, as in the following
example, where a <gi>linkGrp</gi> element is used to link selected
characters in the text <mentioned>Caesar seized control</mentioned> with 
their phonological representations.
<egXML xmlns="http://www.tei-c.org/ns/Examples">
 <s>
  <w xml:id="S1W1"><c xml:id="S1W1C1">C</c>ae<c xml:id="S1W1C2">s</c>ar</w>
  <w xml:id="S1W2"><c xml:id="S1W2C1">s</c>ei<c xml:id="S1W2C2">z</c>e<c xml:id="S1W2C3">d</c></w>
  <w xml:id="S1W3">con<c xml:id="S1W3C1">t</c>rol</w>.
</s>
  <fvLib xml:id="FSL1" n="phonological segment definitions">
	<!-- as in previous example -->
  </fvLib>
  <linkGrp type="phonology">
	<!-- ... -->
	<link target="#T.DF #S1W3C1"/>
	<link target="#D.DF #S1W2C3"/>
	<link target="#S.DF #S1W2C1"/>
	<link target="#Z.DF #S1W2C2"/>
	<!-- ... -->
  </linkGrp></egXML>
 </p>
<p>As this example shows, a stand-off solution requires that every
component to be linked to must be addressable in some way, by means of
an XPointer. To handle the
POS tagging example above, each annotated element might be
given an identifier of some sort, as follows:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><s xml:id="mds09" n="00741">
   <w xml:id="mds0901">The</w>
   <w xml:id="mds0902">closest</w>
   <w xml:id="mds0903">he</w>
   <w xml:id="mds0904">came</w>
   <w xml:id="mds0905">to</w>
   <w xml:id="mds0906">exercise</w>
<!-- ... --></s></egXML>
It would then be possible to link each word to its intended
annotation in the feature library quoted above, as follows:
   <egXML xml:lang="und" xmlns="http://www.tei-c.org/ns/Examples"><linkGrp type="POS-codes">	<!-- ... -->
   <link target="#mds0901 #at0"/><link target="#mds0902 #ajs"/>
   <link target="#mds0903 #pnp"/><link target="#mds0904 #vvd"/>
   <link target="#mds0905 #prp"/><link target="#mds0906 #nn1"/>
   <link target="#mds0907 #vbd"/><link target="#mds0908 #to0"/>
   <link target="#mds0909 #vvi"/><link target="#mds0910 #crd"/>
	<!-- ... -->
  </linkGrp></egXML>
</p>
</div>


<div type="div1" xml:id="FD" n="26">
<head>Feature System Declaration</head>
<p>The Feature System Declaration (FSD) is intended for use in
conjunction with a TEI-conforming text that makes use of <gi>fs</gi>
(that is, feature structure) elements.
The FSD serves three purposes:
<list rend="bulleted">
<item>It provides a mechanism by which the encoder can list all of the
feature names and feature values and give a prose description as to what
each represents.</item>
<item>It provides a mechanism by which the encoder can define
constraints that determine not only what it means to be a well-formed
feature structure, but also what it means to be a <hi>valid</hi>
feature structure, relative to a given theory stated in typed feature logic.
These may include constraints on the range of a feature
value, constraints on what features are valid within certain types of
feature structures, or constraints that prevent the co-occurrence of
certain feature-value pairs.</item>
<item>It provides a mechanism by which the encoder can define the
intended interpretation of underspecified feature structures.  This
involves defining default values (whether literal or computed) for
missing features.</item></list></p>

<p>The scheme described in this chapter may be used to document any
feature structure system, but is primarily intended for use with the
feature structure representation defined by the ISO 24610-1:2006
standard, which corresponds with the recommendations presented in
these Guidelines, <ptr target="#FS"/>. This chapter relies upon, but
does not reproduce, formal definitions and descriptions presented more
thoroughly in the ISO standard, which should be consulted in case of
ambiguity or uncertainty. </p>

<p>The FSD serves an important function in documenting precisely what
the encoder intended by the system of feature structure markup used in
an XML-encoded text.  The FSD is also an important resource which
standardizes the rules of inference used by software to validate the
feature structure markup in a text, and to infer the full
interpretation of underspecified feature structures.</p>

<p>The reader should be aware that the terminology used in this document
does not always closely follow conventional practice in formal logic,
and may also diverge from practice in some linguistic applications of
typed feature structures.  In particular, the term
<soCalled>interpretation</soCalled> when applied to a feature
structure is not an interpretation in the model-theoretic sense, but
is instead a minimally informative (or equivalently, most general)
extension <!--(see <ptr target="#subsumpSec"/>)--> of that feature
structure that is consistent with a set of constraints declared by an
FSD.  In linguistic applications, such a system of constraints is the
principal means by which the grammar of some natural language is
expressed.  There is a great deal of disagreement as to what, if any,
model-theoretic interpretation feature structures have in such
applications, but the status of this formal kind of interpretation is
not germane to the present document.  Similarly, the term
<soCalled>valid</soCalled> is used here as elsewhere in these
Guidelines to identify the syntactic state of well-formedness in the
sense defined by the logic of typed feature structures itself, as
distinct from and in addition to the
<soCalled>well-formedness</soCalled> that pertains at the level of
this encoding standard. No appeal to any notion from formal semantics
should be inferred. </p>

<p>We begin by describing how an encoded text is associated with one
or more feature system declarations.  The second, third, and fourth
sections describe the overall structure of a feature system
declaration and give details of how to encode its components.  The final
section offers a full example; fuller discussion
of the reasoning behind FSDs and another complete example are provided
in <ptr target="#FS-BIBL-01"/>.</p>
<div type="div2" xml:id="FDLK">
<head>Linking a TEI Text to Feature System Declarations</head>
<p>In order for application software to use feature system
declarations to aid in the automatic interpretation of encoded texts,
or even for human readers to find the appropriate declarations which
document the feature system used in markup, there must be a formal
link from the encoded texts to the declarations. However, the
schema  which declares the syntax of the Feature System itself
should  be kept distinct from the feature structure schema, which is an
application of that system.</p> 

<p>A document containing typed feature structures may simply include a
feature system declaration documenting those feature structures.  A
more usual scenario, however, is that the same feature system
declaration (or parts of it) will be shared by many documents.  In
either case, an <gi>fsDecl</gi> element for each distinct type of
feature structure used must be provided and associated with the type,
which is the value used within each feature structure for its
<att>type</att> attribute.</p>

<p>When the module defined in this chapter is included in an XML
schema, the following elements become available:

<specList>
<specDesc key="fsdDecl"/>
<specDesc key="fsdLink"/>
<specDesc key="fsDecl"/>
</specList>

The <gi>fsdDecl</gi> element may be supplied either within the header
of a standard TEI document, or as a standalone document in its own
right. It contains one or more <gi>fsdLink</gi> or <gi>fsDecl</gi>
elements. </p>

<!--
<p>Each <gi>fsDecl</gi> has a unique identifier, given as the value of
its <att>xml:id</att> attribute.  Each <gi>fsdLink</gi> element
associates a specific feature structure type with a URI which resolves
to an <gi>fsDecl</gi> element, by using its <att>xml:id</att> value. </p>
-->

<p>For example, suppose that a document <ident type="file">example.xml</ident>
contains feature structures of two types: <val>gpsg</val> and
<val>lex</val>. We might simply embed an <gi>fsDecl</gi> element for
each within the header attached to the document as follows:

<egXML xmlns="http://www.tei-c.org/ns/Examples" valid="feasible">
   <TEI>
   <teiHeader>
      <fileDesc> <!-- example --> </fileDesc>
      <encodingDesc>
           <!-- ... -->
           <fsdDecl>
           <fsDecl type="gpsg">
	     <!-- information about this type -->
	   </fsDecl>
           <fsDecl type="lex">
	     <!-- information about this type -->
	   </fsDecl>
	   </fsdDecl>
           <!-- ... -->
      </encodingDesc>
   </teiHeader>
<text><body>
  <!-- ... -->
   <fs type="lex">
      <!-- an instance of the typed feature structure "lex" -->
   </fs>
  <!-- ... -->
</body></text>
   </TEI></egXML>
</p>
<p>In this case there is an implicit link between the <gi>fs</gi>
element and the corresponding <gi>fsDecl</gi> element because they
share the same value for their <att>type</att> attribute and appear
within the same document. This is a short cut for the more general
case which requires a more explicit link provided by means of the
<gi>fsdLink</gi> element, as demonstrated below.</p>

<p>Now suppose that we wish to create a second document which includes
feature structures of the same type. Rather than duplicate the
corresponding declarations, we will need to provide a means of
pointing to them from this second document.  The easiest<note place="bottom">Ways of pointing to components of a TEI document without
using an XML identifier are discussed in <ptr target="#SAUR"/></note>
way of accomplishing this is to add an XML identifier to each
<gi>fsDecl</gi> element in <ident type="file">example.xml</ident>:
<egXML xmlns="http://www.tei-c.org/ns/Examples" valid="feasible">
           <!-- ... -->
           <fsdDecl>
           <fsDecl type="gpsg" xml:id="GPSG">
	     <!-- information about this type -->
	   </fsDecl>
           <fsDecl type="lex" xml:id="LEX">
	     <!-- information about this type -->
	   </fsDecl>
	   </fsdDecl>
</egXML>
(Although in this case the XML identifier is simply an uppercase
version of the type name, there is no necessary connection between the
two names. The only requirement is that the XML identifier conform to
the standards required for identifiers, and that it be unique within
the document containing it.)</p>
<p>In the <gi>fsdDecl</gi> for the second document, we can now include
pointers to the <gi>fsDecl</gi> elements in the first:
<egXML xmlns="http://www.tei-c.org/ns/Examples" valid="feasible">
   <TEI>
   <teiHeader>
      <fileDesc> <!-- doc2  --> </fileDesc>
      <encodingDesc>
           <!-- ... -->
           <fsdDecl>
           <fsdLink type="gpsg" target="example.xml#GPSG"/>
           <fsdLink type="lexx" target="example.xml#LEX"/>
	   </fsdDecl>
           <!-- ... -->
      </encodingDesc>
   </teiHeader>
<text><body>
  <!-- ... -->
   <fs type="lexx">
      <!-- an instance of the typed feature structure "lex" -->
   </fs>
  <!-- ... -->
</body></text>
   </TEI></egXML>
Note that in <ident type="file">doc2.xml</ident> there is no requirement for the local
name for a given type of feature structures to be the same as that
used by <ident type="file">example.xml</ident>. We assume in this encoding that the type
called <name>lexx</name> in <ident type="file">doc2.xml</ident> is declared as
having identical constraints and other properties to those declared
for the type called <name>lex</name> in <ident type="file">example.xml</ident>.</p>

<p>A <gi>fsdDecl</gi> may be given, as above, within the
encoding description of the <gi>teiHeader</gi> element of a TEI
document containing typed feature structures. Alternatively, it may
appear independently of any feature structures, as a document in its
own right, possibly with its own <gi>teiHeader</gi>. These options are
both possible because the element is a member of both the <ident type="class">model.encodingDescPart</ident> class and the <ident type="class">model.resourceLike</ident> class.</p>

<p>The current recommendations provide no way of enforcing uniqueness
of the <att>type</att> values among <gi>fsDecl</gi> elements, nor of
requiring that every <att>type</att> value specified on an <gi>fs</gi>
element also be declared on an <gi>fsDecl</gi> element. Encoders
requiring such constraints (which might have some obvious utility in
assisting the consistency and accuracy of tagging) are recommended to
develop tools to enforce them, using such mechanisms as Schematron
assertions. </p>
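<p>A minimal sketch of such a check is given below, on the assumption
that declarations and feature structures appear in the same document
and that an XSLT-based Schematron implementation is used; the wording
of the messages is invented for this illustration:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
   <sch:ns prefix="tei" uri="http://www.tei-c.org/ns/1.0"/>
   <sch:pattern>
      <!-- every type used on an fs must be declared or linked -->
      <sch:rule context="tei:fs[@type]">
         <sch:assert test="@type = //tei:fsDecl/@type or @type = //tei:fsdLink/@type">
            The feature structure type <sch:value-of select="@type"/> is not
            declared by any fsDecl or fsdLink in this document.
         </sch:assert>
      </sch:rule>
      <!-- no two fsDecl elements may declare the same type -->
      <sch:rule context="tei:fsDecl">
         <sch:assert test="not(preceding::tei:fsDecl[@type = current()/@type])">
            The feature structure type <sch:value-of select="@type"/> is
            declared more than once.
         </sch:assert>
      </sch:rule>
   </sch:pattern>
</sch:schema></egXML>
Declarations held in other files and reached by means of an
<gi>fsdLink</gi> element are not followed by this sketch; checking
them would require additional logic.</p>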

</div>
<div type="div2" xml:id="FDOV"><head>The Overall Structure of a Feature System Declaration</head>
<p>A feature system declaration contains one or more feature
structure declarations, each of which has up to three parts: an optional description
(which gives a prose comment on what that type of feature structure
encodes), an obligatory set of feature declarations (which specify
range constraints and default values for the features in that type of
structure), and optional feature structure constraints (which specify
co-occurrence restrictions on feature values). 
<specList>
<specDesc key="fsDescr"/>
<specDesc key="fDecl"/>
<specDesc key="fsConstraints"/></specList></p>
<p>Feature declarations and feature structure constraints are
described in the next two sections.  Note that the specification of
similar <gi>fsDecl</gi> elements can be simplified by devising an
inheritance hierarchy for the feature structure types.  Each
<gi>fsDecl</gi> element may name one or more
<soCalled>basetypes</soCalled> from which it inherits feature
declarations and constraints (these are often called
<soCalled>supertypes</soCalled>). For instance, suppose that
<tag>fsDecl type="Basic"</tag> contains <tag>fDecl name="One"</tag>
and <tag>fDecl name="Two"</tag>, and that <tag>fsDecl type="Derived"
baseTypes="Basic"</tag> contains just <tag>fDecl name="Three"</tag>.
Then any instance of <tag>fs type="Derived"</tag> must include all
three features.  This is because <tag>fsDecl type="Derived"</tag>
inherits the two feature declarations from <tag>fsDecl
type="Basic"</tag> when it specifies a base type of
<val>Basic</val>.</p>
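<p>Written out in full, the arrangement just described might look as
follows (a sketch: the content of each <gi>fDecl</gi> is elided, and
the values shown in the instance are invented):
<egXML xmlns="http://www.tei-c.org/ns/Examples" valid="feasible"><fsDecl type="Basic">
   <fDecl name="One"> <!-- declaration for One --> </fDecl>
   <fDecl name="Two"> <!-- declaration for Two --> </fDecl>
</fsDecl>
<fsDecl type="Derived" baseTypes="Basic">
   <fDecl name="Three"> <!-- declaration for Three --> </fDecl>
</fsDecl>
<!-- ... -->
<fs type="Derived">
   <f name="One">   <symbol value="a"/> </f>
   <f name="Two">   <symbol value="b"/> </f>
   <f name="Three"> <symbol value="c"/> </f>
</fs></egXML>
</p>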
<p>The following sample shows the overall structure of a complete
feature structure declaration:
 <egXML xmlns="http://www.tei-c.org/ns/Examples" valid="feasible">
     <fsDecl type="SomeName">
        <fsDescr>Describes what this type of fs represents</fsDescr>
        <fDecl name="featureOne">
           <!-- The declaration for featureOne -->
        </fDecl>
        <fDecl name="featureTwo">
           <!-- The declaration for featureTwo -->
        </fDecl>
        <fsConstraints>
           <!-- The feature structure constraints go here -->
        </fsConstraints>
     </fsDecl>
</egXML></p>

<p>The attribute <att>baseTypes</att> gives the name of one or more
types from which this type inherits feature specifications and
constraints; if this type includes a feature specification with the
same name as one inherited from any of the types specified by this
attribute, or if more than one specification of the same name is
inherited, then the possible values of that feature are determined by
unification. Similarly, the set of constraints applicable is derived
by conjoining those specified explicitly within this element with
those implied by the <att>baseTypes</att> attribute. When no base type
is specified, no feature specification or constraint is inherited.</p>
<p>Although the present standard does provide for default feature values,
feature inheritance is defined to be monotonic.
</p><p>
The process of combining constraints may result in a contradiction,
for example if two specifications for the same feature specify
disjoint ranges of values, and at least one such specification is
mandatory. In such a case, there is no valid feature structure of the
type being defined.</p>
<p>
Every type specified by <att>baseTypes</att> must be a single word which
is a legal XML name; for example, it cannot include whitespace or
begin with a digit.  Multiple base types are separated with spaces,
e.g. <tag>fsDecl type="Sub" baseTypes="Super1 Super2"</tag>.</p>
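<p>The following sketch illustrates inheritance from more than one base
type. The feature names and value ranges are hypothetical. On the reading
that unifying two alternations leaves only the values compatible with
both, the inherited feature <val>case</val> in type <val>Sub</val> would
be restricted to <val>nom</val> or <val>acc</val>:
<egXML xmlns="http://www.tei-c.org/ns/Examples" valid="feasible">
<!-- Hypothetical declarations: both supertypes declare the feature "case" -->
<fsDecl type="Super1">
   <fDecl name="case">
      <vRange><vAlt>
         <symbol value="nom"/><symbol value="acc"/><symbol value="dat"/>
      </vAlt></vRange>
   </fDecl>
</fsDecl>
<fsDecl type="Super2">
   <fDecl name="case">
      <vRange><vAlt>
         <symbol value="nom"/><symbol value="acc"/><symbol value="gen"/>
      </vAlt></vRange>
   </fDecl>
</fsDecl>
<!-- Sub inherits "case" from both supertypes and adds "number" -->
<fsDecl type="Sub" baseTypes="Super1 Super2">
   <fDecl name="number">
      <vRange><vAlt>
         <symbol value="sg"/><symbol value="pl"/>
      </vAlt></vRange>
   </fDecl>
</fsDecl>
</egXML></p>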
<specGrp xml:id="DFDFSD2" n="Feature System Declaration">









<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/fsdDecl.xml"/>





 








<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/fsDecl.xml"/>














<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/fsDescr.xml"/>





 








<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/fsdLink.xml"/>




</specGrp>

</div>
<div type="div2" xml:id="FDFD"><head>Feature Declarations</head>
<p>Each feature is declared in an <gi>fDecl</gi> element whose
<att>name</att> attribute identifies the feature being declared; this
matches the <att>name</att> attribute of the <gi>f</gi> elements it
declares.  
<!--
An <gi>fDecl</gi> also has an <att>org</att> attribute
which declares the organizing principle for the values of the
<gi>f</gi> elements it declares.  That is, the value may be a 
<val>unit</val> (a single value), a 
<val>set</val> (in which the order is not significant and
there are no duplicates), a <val>bag</val> (in which
the order is not significant but duplicates are allowed), or a <val>list</val> (in which the order is significant).  (See
definition of <att>org</att> attribute of <gi>f</gi> in section <ptr
target="#FSSS"/>.)  -->

An <gi>fDecl</gi> has three parts: an optional
prose description (which should explain what the feature and its
values represent), an obligatory range specification (which declares
what values the feature is allowed to have), and an optional default
specification (which declares what default value should be supplied
when the named feature does not appear in an <gi>fs</gi>). 
If, in a feature
structure, a feature:
<list rend="bulleted"><item> is not optional (i.e., is obligatory),</item>
<item> has no value provided, or the value <gi>default</gi> is
provided (see <ref target="#FSBO">ISO 24610-1, Subclause 5.10, Default Values</ref>), and</item>
<item> either has no default specified, or has conditional defaults,
none of the conditions on which is met,</item>
</list>
then the value of this feature in the feature structure's most
general valid extension is the most general value provided in its
<gi>vRange</gi>, in the case of a unit organization, or the
singleton set, bag, or list containing that element, in the case of a
complex organization.  If the feature:
<list rend="bulleted"><item> is optional,</item>
<item> has no value provided, or the value <gi>default</gi> is
provided, and</item>
<item> either has a default specified, or has conditional defaults,
one of the conditions on which is met,</item>
</list>
then this feature does have a value in the feature structure's most
general valid extension when it exists, namely the default value that
pertains.
</p>
<p>It is possible that a feature structure will not have a valid
extension because the default value that pertains to a feature is not
consistent with that feature's declared range.  Additional tools
are required for the enforcement of such criteria.
</p>
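<p>The optional case can be sketched as follows, using a hypothetical
feature <val>mood</val> which is not part of the example developed later
in this chapter. Under the rules stated above, the first two feature
structures, in which the feature is either omitted or given the value
<gi>default</gi>, would both receive the declared default in their most
general valid extension, shown last:
<egXML xmlns="http://www.tei-c.org/ns/Examples" valid="feasible">
<!-- Hypothetical declaration: an optional feature with an unconditional default -->
<fDecl name="mood" optional="true">
   <vRange><vAlt>
      <symbol value="indicative"/><symbol value="subjunctive"/>
   </vAlt></vRange>
   <vDefault><symbol value="indicative"/></vDefault>
</fDecl>
<!-- 1. The feature "mood" is simply omitted -->
<fs>
   <f name="tense"><symbol value="past"/></f>
</fs>
<!-- 2. The feature "mood" explicitly requests its default value -->
<fs>
   <f name="tense"><symbol value="past"/></f>
   <f name="mood"><default/></f>
</fs>
<!-- 3. The extension assumed here: the default value has been supplied -->
<fs>
   <f name="tense"><symbol value="past"/></f>
   <f name="mood"><symbol value="indicative"/></f>
</fs>
</egXML></p>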
<!-- A single
unconditional default value may be specified, or multiple conditional
values.  If no default is specified, or if none of the conditions is
met, then the default value is <gi>none</gi>; in other words, the
feature is not applicable (see section <ptr target="#FSBO"/> for a
discussion of the <gi>none</gi> element).</p>-->
<p>The following elements are used in feature system declarations:
<specList>
<specDesc key="fDecl" atts="name optional"/>
<specDesc key="fDescr"/>
<specDesc key="vRange"/>
<specDesc key="vDefault"/>
<specDesc key="if"/>
<specDesc key="then"/></specList></p>
<p>The logic for validating feature values and for matching the
conditions for supplying default values is based on the operation of
<term>subsumption</term>.  Subsumption is a standard operation in
feature-structure-based formalisms.  Informally, a feature structure
<name>FS</name> subsumes all feature structures that are at least as
informative as itself; that is, all feature structures that
specify all of the feature values that FS does with values that are
subsumed by the values that FS has, and that have all of the
re-entrancies (see <ptr target="#FSVAR"/>) that FS does (<ptr type="cit" target="#FS-BIBL-5"/>;
see also <ptr type="cit" target="#FS-BIBL-1"/> and <ptr type="cit" target="#FS-BIBL-2"/>).
A more formal definition is provided in ISO 24610-1:2006.<!-- <ptr
target="#reviewSec"/>--></p>

<!-- requires that we first define the notion of
<q>domain of a feature structure.</q> A feature structure can be viewed
as a partial function that maps features onto values; when viewed in
this way, the domain of a feature structure is the set of top-level
features it contains (that is, excluding features in embedded feature
structures).  We can now offer a more precise definition:
<q rend="display">     <emph>fs</emph> subsumes <emph>fs′</emph> if both are
     identical primitive values, or if the domain of <emph>fs</emph>
     is a subset of the domain of <emph>fs′</emph>, and for every
     feature <emph>f</emph> in the domain of <emph>fs</emph>, the
     value of <emph>f</emph> in <emph>fs</emph> subsumes the value
     of <emph>f</emph> in <emph>fs′</emph>.</q></p>-->
<p>Following the spirit of the informal definition above, we can extend
subsumption in a straightforward way to cover alternation, negation,
special primitive values, and the use of attributes in the markup.
For instance, a <gi>vAlt</gi> containing the value <val>v</val> subsumes <val>v</val>.  The negation
of a value <val>v</val> (represented by means of the
<gi>vNot</gi> element discussed in section <ptr target="#FVNOT"/>)
subsumes any value that is not <val>v</val>; for
example <code>&lt;vNot&gt;&lt;numeric value='0'/&gt;&lt;/vNot&gt;</code> subsumes any
numeric value other than zero.
<!--The value <tag>binary value="true"</tag> subsumes any value that is in
the range of a feature, and the value <tag>binary value="false"</tag>
subsumes any value that is not.  --> The value <tag>fs
type="X"/</tag> subsumes any feature structure of type <val>X</val>,
even if it is not valid.
</p>
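<p>As an informal illustration, consider the features PERS and NUM of
the <val>Agreement</val> type declared in the complete example at the end
of this chapter. In the sketch below, the first feature structure
subsumes the second, because the second specifies everything the first
does and more; the third subsumes both, on the principle stated above
that a <gi>vAlt</gi> containing a value subsumes that value:
<egXML xmlns="http://www.tei-c.org/ns/Examples" valid="feasible">
<!-- 1. Less informative: only NUM is specified -->
<fs type="Agreement">
   <f name="NUM"><symbol value="sg"/></f>
</fs>
<!-- 2. More informative: subsumed by 1, since it adds PERS and keeps NUM -->
<fs type="Agreement">
   <f name="PERS"><symbol value="3"/></f>
   <f name="NUM"><symbol value="sg"/></f>
</fs>
<!-- 3. Subsumes 1 and 2: the alternation contains the value sg -->
<fs type="Agreement">
   <f name="NUM"><vAlt><symbol value="sg"/><symbol value="pl"/></vAlt></f>
</fs>
</egXML></p>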
<!-- restore discussion of subsumption removed from old FS here? -->
<p>As an example of feature declarations, consider the following extract
from Gazdar et al.'s <title>Generalized Phrase Structure
     Grammar</title>. In the appendix to their book, they
propose a feature system for English of which this is just a sampling:
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#FS-BIBL-3">feature    value range
INV        {+, -}
CONJ       {and, both, but, either, neither, nor, or, NIL}
COMP       {for, that, whether, if, NIL}
AGR        CAT
PFORM      {to, by, for, ...}</egXML>
<egXML xmlns="http://www.tei-c.org/ns/Examples">Feature specification defaults
FSD 1:  [-INV]
FSD 2:  ~[CONJ]
FSD 9:  [INF, +SUBJ] --&gt; [COMP for]</egXML></p>
<p>The INV feature, which encodes whether or not a sentence is inverted,
allows only the values plus (+) and minus (-).  If the feature is not
specified, then the default rule (FSD 1 above) says that a value of
minus is always assumed.  The feature declaration for this feature would
be encoded as follows:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fDecl name="INV">
  <fDescr>inverted sentence</fDescr>
   <vRange><vAlt>
      <binary value="true"/>
      <binary value="false"/>
   </vAlt></vRange>
  <vDefault><binary value="false"/></vDefault>
</fDecl></egXML></p>
<p>The value range is specified as an alternation (more precisely, an
exclusive disjunction) of the two possible <gi>binary</gi> feature values.  That is,
the value must be either true or false, but cannot be both or neither.</p>
<p>The CONJ feature indicates the surface form of the conjunction used
in a construction.  The ~ in the default rule (see FSD 2 above)
represents negation.  This means that by default the feature is not
applicable; in other words, no conjunction is taking place.  Note that
the absence of CONJ is distinct from CONJ being present with the NIL
value allowed in the value range.  In their analysis, NIL means that the phenomenon of
conjunction is taking place but there is no explicit conjunction in the
surface form of the sentence.  The feature declaration for this feature
would be encoded as follows:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fDecl name="CONJ">
   <fDescr>surface form of the conjunction</fDescr>
   <vRange>
      <vAlt>
        <symbol value="and"/>
        <symbol value="both"/>
        <symbol value="but"/>
        <symbol value="either"/>
        <symbol value="neither"/>
        <symbol value="nor"/>
        <symbol value="or"/>
        <symbol value="NIL"/>
      </vAlt>
   </vRange>
   <vDefault><binary value="false"/></vDefault>
</fDecl></egXML>
 <!-- binary was "none" -->
Note that the <gi>vDefault</gi> is not strictly necessary in this case,
since the binary value of <val>false</val> only serves to convey the
information that the feature has no other legitimate value.
</p>

<p>The COMP feature indicates the surface form of the complementizer
used in a construction.  In its value range, it is analogous to CONJ.
However, its default rule (see FSD 9 above) is conditional.  It says
that if the verb form is infinitival (the VFORM feature is not
mentioned in the rule since it is the only feature that can take INF
as a value), and the construction has a subject, then a
<mentioned>for</mentioned> complementizer must be used.  For instance, to
make John the subject of the infinitive in <mentioned>It is necessary
to go,</mentioned> a <mentioned>for</mentioned> complementizer must be
used; that is, <mentioned>It is necessary for John to go.</mentioned>
The feature declaration for this feature would be encoded as follows:

<egXML xmlns="http://www.tei-c.org/ns/Examples"><fDecl name="COMP">
   <fDescr>surface form of the complementizer</fDescr>
   <vRange>
      <vAlt>
        <symbol value="for"/>
        <symbol value="that"/>
        <symbol value="whether"/>
        <symbol value="if"/>
        <symbol value="NIL"/>
      </vAlt></vRange>
   <vDefault>
      <if><fs><f name="VFORM"><symbol value="INF"/></f>
              <f name="SUBJ"><binary value="true"/></f></fs>
      <then/><symbol value="for"/></if>
   </vDefault>
</fDecl></egXML></p>
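<p>To see how such a conditional default might apply, consider the
following sketch. It assumes that the condition is met when the feature
structure given in the <gi>if</gi> element subsumes the feature structure
being processed. The first feature structure gives no value for COMP; the
second shows the value that the default would then supply:
<egXML xmlns="http://www.tei-c.org/ns/Examples" valid="feasible">
<!-- Condition met: VFORM is INF and SUBJ is true, and COMP is not given -->
<fs type="GPSG">
   <f name="VFORM"><symbol value="INF"/></f>
   <f name="SUBJ"><binary value="true"/></f>
</fs>
<!-- The same structure with the conditional default for COMP supplied -->
<fs type="GPSG">
   <f name="VFORM"><symbol value="INF"/></f>
   <f name="SUBJ"><binary value="true"/></f>
   <f name="COMP"><symbol value="for"/></f>
</fs>
</egXML></p>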
<p>The AGR feature stores the features relevant to subject-verb
agreement.  Gazdar et al. specify the range of this feature as CAT.
This means that the value is a <term rend="noindex">category</term>, which
is their term for a feature structure.  This is actually too weak a
statement.  Not just any feature structure is allowable here; it must be
a feature structure for agreement (which is defined in the complete
example at the end of the chapter to contain the features of person and
number).  The following feature declaration encodes this constraint on
the value range:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fDecl name="AGR">
   <fDescr>agreement for person and number</fDescr>
   <vRange><fs type="Agreement"/></vRange>
</fDecl></egXML>
That is, the value must be a feature structure of type <val>Agreement</val>.  The complete example at the end of this
chapter includes the <tag>fsDecl type="Agreement"</tag> which includes
<tag>fDecl name="PERS"</tag> and <tag>fDecl name="NUM"</tag>.</p>
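<p>An instance of this feature might then look like the following
sketch, in which the particular values chosen for PERS and NUM are
illustrative only:
<egXML xmlns="http://www.tei-c.org/ns/Examples" valid="feasible">
<!-- The value of AGR is itself a feature structure of type Agreement -->
<f name="AGR">
   <fs type="Agreement">
      <f name="PERS"><symbol value="3"/></f>
      <f name="NUM"><symbol value="sg"/></f>
   </fs>
</f>
</egXML></p>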
<p>The PFORM feature indicates the surface form of the preposition used
in a construction.  Since PFORM is specified above as an open set,
<gi>string</gi> is used in the range specification below rather than
<gi>symbol</gi>.
<egXML xmlns="http://www.tei-c.org/ns/Examples"><fDecl name="PFORM">
   <fDescr>word form of a preposition</fDescr>
   <vRange><vNot><string/></vNot></vRange>
</fDecl></egXML>
This example makes use of a negated value: <code>&lt;vNot&gt;&lt;string/&gt;&lt;/vNot&gt;</code>
subsumes any string that is not the empty
string.</p>
<p>Note that
the class <ident type="class">model.featureVal</ident> includes all possible
single feature values, including feature structures, alternations
(<gi>vAlt</gi>) and complex collections (<gi>vColl</gi>).</p>
<specGrp xml:id="DFDX" n="Feature definitions">





<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/fDecl.xml"/>














<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/fDescr.xml"/>














<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/vRange.xml"/>














<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/vDefault.xml"/>














<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/if.xml"/>














<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/then.xml"/>






</specGrp></div>
<div type="div2" xml:id="FDFS"><head>Feature Structure Constraints</head>
<p>Ensuring the validity of feature structures may require much more
than simply specifying the range of allowed values for each feature.
There may be constraints on the co-occurrence of one feature value with
the value of another feature in the same feature structure or in an
embedded feature structure.</p>
<p>Such constraints on valid feature structures are expressed as a
series of conditional and biconditional tests in the
<gi>fsConstraints</gi> part of an <gi>fsDecl</gi>.  A particular feature
structure is valid only if it meets all the constraints.  The
<gi>cond</gi> element encodes the conventional if-then conditional of
boolean logic which succeeds when both the antecedent and consequent are
true, or whenever the antecedent is false.  The <gi>bicond</gi> element
encodes the biconditional (if and only if) operation of boolean logic.
It succeeds only when the corresponding
if-then conditionals in both directions are true. <!--It succeeds only when both antecedent and consequent are true, or both
are false. --> In feature structure constraints the antecedent and
consequent are expressed as feature structures; they are considered true
if they <term rend="noindex">subsume</term><!--<index><term>subsumption</term><index><term>of feature structures</term></index></index>-->
(see section <ptr target="#FDFD"/>) the feature structure in question, but in the
case of consequents, this truth is asserted rather than simply
tested.  That is to say, a conditional is enforced by determining that
the antecedent does not (and will never) subsume the given feature
structure, or by determining that the antecedent does subsume the
given feature structure, and then unifying the consequent with it (the
result of which, if successful, will be subsumed by the consequent).
In practice, the enforcement of such constraints can result in
periods in which the truth of a constraint with respect to a given
feature structure is simply not known; in this case, the constraint
must be persistently monitored as the feature structure becomes more
informative until either its truth value is determined or computation
fails for some other reason.</p>
<p> The
following elements make up the <gi>fsConstraints</gi> part of an FSD:
<specList><specDesc key="fsConstraints"/><specDesc key="cond"/><specDesc key="bicond"/><specDesc key="then"/><specDesc key="iff"/></specList></p>
<p>For an example of feature structure constraints, consider the
following <soCalled>feature co-occurrence restrictions</soCalled>
extracted from the feature system for English proposed by Gazdar, et al. (1985:246–247):
<eg>FCR 1:  [+INV] &#8594; [+AUX, FIN]</eg>
<eg>FCR 7:  [BAR 0] &#x2261; [N] &amp; [V] &amp; [SUBCAT]</eg>
<eg>FCR 8:  [BAR 1] &#8594; ~[SUBCAT]</eg></p>
<p>The first constraint says that if a construction is inverted, it must
also have an auxiliary and a finite verb form.  That is,
   <egXML xml:lang="und" xmlns="http://www.tei-c.org/ns/Examples"><cond>
   <fs><f name="INV"><binary value="true"/></f></fs>
 <then/>
   <fs><f name="AUX"><binary value="true"/></f>
       <f name="VFORM"><symbol value="FIN"/></f>
   </fs>
</cond></egXML></p>
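<p>The effect of this constraint can be sketched with three illustrative
instances, which are not taken from Gazdar et al. The first satisfies the
constraint because both antecedent and consequent subsume it; the second
satisfies it because the antecedent does not subsume it; the third
violates it, because the antecedent subsumes it but the consequent cannot
be unified with it:
<egXML xml:lang="und" xmlns="http://www.tei-c.org/ns/Examples" valid="feasible">
<!-- 1. Valid: inverted, with an auxiliary and a finite verb form -->
<fs type="GPSG">
   <f name="INV"><binary value="true"/></f>
   <f name="AUX"><binary value="true"/></f>
   <f name="VFORM"><symbol value="FIN"/></f>
</fs>
<!-- 2. Valid: not inverted, so the antecedent is false -->
<fs type="GPSG">
   <f name="INV"><binary value="false"/></f>
</fs>
<!-- 3. Invalid: inverted, but AUX is false and so conflicts with the consequent -->
<fs type="GPSG">
   <f name="INV"><binary value="true"/></f>
   <f name="AUX"><binary value="false"/></f>
   <f name="VFORM"><symbol value="FIN"/></f>
</fs>
</egXML></p>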
<p>The second constraint says that if a construction has a BAR value of
zero (i.e., it is a lexical category, or word), then it must have a value for the
features N, V, and SUBCAT.  By the same token, because it is a
biconditional, if it has values for N, V, and SUBCAT, it must have
BAR='0'.  That is,
   <egXML xml:lang="und" xmlns="http://www.tei-c.org/ns/Examples"><bicond>
   <fs><f name="BAR"><symbol value="0"/></f></fs>
   <iff/>
   <fs>
      <f name="N"><binary value="true"/></f>
      <f name="V"><binary value="true"/></f>
      <f name="SUBCAT"><binary value="true"/></f>
   </fs>
</bicond></egXML></p>
<!-- <binary> was <any>-->
<p>The final constraint says that if a construction has a BAR value of 1
(i.e., it is a phrase), then the SUBCAT feature should be absent (~).
This is not biconditional, since there are other circumstances in which
the SUBCAT feature is inappropriate.  That is,
   <egXML xml:lang="und" xmlns="http://www.tei-c.org/ns/Examples"><cond>
   <fs><f name="BAR"><symbol value="1"/></f></fs>
  <then/>
    <fs><f name="SUBCAT"><binary value="false"/></f></fs>
</cond></egXML></p>
<!-- <binary> was <none>-->
<p>
Note that <gi>cond</gi> and <gi>bicond</gi> use the empty tags
<gi>then</gi> and <gi>iff</gi>, respectively, to separate the antecedent
and consequent.  These are primarily for the sake of enhancing human
readability.</p>
<specGrp xml:id="DFD2" n="Feature structure constraints">









<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/fsConstraints.xml"/>














<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/cond.xml"/>














<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/bicond.xml"/>














<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/iff.xml"/>






</specGrp></div>
<div type="div2" xml:id="FDEG"><head>A Complete Example</head>
<p>To summarize this chapter, the complete FSD for the example that has
run through the chapter is reproduced below:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><TEI>
<teiHeader>
<fileDesc>
<titleStmt>
   <title>A sample FSD based on an extract from Gazdar
      et al.'s GPSG feature system for English</title>
   <respStmt>
      <resp>encoded by</resp>
      <name>Gary F. Simons</name>
   </respStmt>
</titleStmt>
<publicationStmt>
<p>This sample was first encoded by Gary F. Simons (Summer
Institute of Linguistics, Dallas, TX) on January 28, 1991.
Revised April 8, 1993 to match the specification of FSDs
in version P2 of the TEI Guidelines. Revised again December 2004 to
be consistent with  the feature structure representation standard
jointly developed with ISO TC37/SC4.
</p></publicationStmt>
<sourceDesc>
<p>This sample FSD does not describe a complete feature
system.  It is based on extracts from the feature system
for English presented in the appendix (pages 245–247) of
Generalized Phrase Structure Grammar, by Gazdar, Klein,
Pullum, and Sag (Harvard University Press, 1985).</p>
</sourceDesc>
</fileDesc>
</teiHeader>
<fsdDecl>
<fsDecl type="GPSG">
   <fsDescr>Encodes a feature structure for the GPSG analysis
     of English (after Gazdar, Klein, Pullum, and Sag)</fsDescr>
   <fDecl name="INV">
      <fDescr>inverted sentence</fDescr>
      <vRange>
         <vAlt>
           <binary value="true"/>
           <binary value="false"/>
         </vAlt>
      </vRange>
      <vDefault><binary value="false"/></vDefault>
   </fDecl>
   <fDecl name="CONJ">
      <fDescr>surface form of the conjunction</fDescr>
      <vRange><vAlt><symbol value="and"/><symbol value="both"/>
         <symbol value="but"/><symbol value="either"/>
         <symbol value="neither"/><symbol value="nor"/>
         <symbol value="or"/><symbol value="NIL"/>
      </vAlt></vRange>
      <vDefault><binary value="false"/></vDefault>
      </fDecl>
   <fDecl name="COMP">
      <fDescr>surface form of the complementizer</fDescr>
      <vRange><vAlt><symbol value="for"/>
           <symbol value="that"/><symbol value="whether"/>
           <symbol value="if"/><symbol value="NIL"/>
       </vAlt></vRange>
      <vDefault>
         <if><fs><f name="VFORM"><symbol value="INF"/></f>
                 <f name="SUBJ"><binary value="true"/></f></fs>
        <then/><symbol value="for"/></if>
      </vDefault>
   </fDecl>
   <fDecl name="AGR">
      <fDescr>agreement for person and number</fDescr>
      <vRange><fs type="Agreement"/></vRange>
   </fDecl>
   <fDecl name="PFORM">
      <fDescr>word form of a preposition</fDescr>
         <vRange><vNot><string/></vNot></vRange>
   </fDecl>
   <fsConstraints>
      <cond><fs><f name="INV"><binary value="true"/></f></fs>
      <then/><fs>
          <f name="AUX"><binary value="true"/></f>
          <f name="VFORM"><symbol value="FIN"/></f>
          </fs>
     </cond>
     <bicond><fs><f name="BAR"><symbol value="0"/></f></fs>
      <iff/><fs>
         <f name="N"><binary value="true"/></f>
         <f name="V"><binary value="true"/></f>
         <f name="SUBCAT"><binary value="true"/></f>
          </fs>
      </bicond>
      <cond><fs><f name="BAR"><symbol value="1"/></f></fs>
        <then/>   
           <fs><f name="SUBCAT"><binary value="false"/></f></fs>
      </cond>
   </fsConstraints>
</fsDecl><fsDecl type="Agreement">
   <fsDescr>This type of feature structure encodes the features
      for subject-verb agreement in English</fsDescr>
   <fDecl name="PERS">
      <fDescr>person (first, second, or third)</fDescr>
      <vRange><vAlt>
        <symbol value="1"/><symbol value="2"/><symbol value="3"/>
       </vAlt></vRange>
   </fDecl>
   <fDecl name="NUM">
      <fDescr>number (singular or plural)</fDescr>
      <vRange><vAlt><symbol value="sg"/><symbol value="pl"/>
      </vAlt></vRange>
   </fDecl>
</fsDecl>
</fsdDecl></TEI></egXML></p>

</div></div>


<div type="div2" xml:id="FSDEF"><head>Formal Definition and Implementation</head>
<p>The elements discussed in this chapter constitute a module of the
TEI scheme which is formally defined as follows:
<moduleSpec xml:id="DFS" ident="iso-fs">
<altIdent type="FPI">Feature Structures</altIdent>
<desc>Feature structures</desc>
<desc xml:lang="fr">Structures de traits</desc>
<desc xml:lang="zh-TW">功能結構 (Feature Structures)</desc>
<desc xml:lang="it">Strutture di configurazione (feature structures)</desc><desc xml:lang="pt">Estrutura das características</desc><desc xml:lang="ja">素性構造モジュール</desc></moduleSpec>
<!--publicID:  -//TEI P5//DTD Additional Module for Feature Structure Representation//EN-->

The selection and combination of modules to form a TEI schema is described in
<ptr target="#STIN"/>.
<specGrp>









<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/fs.xml"/>















<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/f.xml"/>















<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/binary.xml"/>















<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/symbol.xml"/>















<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/numeric.xml"/>















<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/string.xml"/>















<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/vLabel.xml"/>















<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/vColl.xml"/>















<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/default.xml"/>















<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/vAlt.xml"/>















<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/vNot.xml"/>















<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/vMerge.xml"/>















<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/fLib.xml"/>















<include xmlns="http://www.w3.org/2001/XInclude" href="../../Specs/fvLib.xml"/>






</specGrp></p>
</div>
</div>
