Proposal for Anonymous Block Element


C. M. Sperberg-McQueen

3 February 1997

TEI TC Core W04

Background

The DTD of TEI P3 defines a large number of element types, with a wide variety of meanings. In addition, it defines one element (<seg>), which has no specified meaning. The <seg> element may be used:

Because <seg> has no defined meaning of its own beyond that inherent in the concept of an SGML element type, it may be regarded as a sort of `anonymous' element type (by analogy with the anonymous functions provided by some programming languages).

The <seg> element can be used only at the phrase level, because <seg> is a member of the class phrase. It can thus appear within paragraphs and similar elements (strictly: within any element with a content model of paraContent, specialPara, or phrase.seq), but not between paragraphs, that is, directly within text divisions.
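
As a minimal sketch of this restriction (the text and the type value are invented for illustration only), a <seg> inside a paragraph is acceptable, while the same element placed directly within the division would not be valid under the P3 DTD and is therefore shown commented out:

 
<div1>
<p>A paragraph containing <seg type='example'>a phrase-level
segment</seg> is acceptable.</p>
<!-- not valid in P3: <seg> directly within a text division -->
<!-- <seg type='example'>A free-standing segment.</seg> -->
</div1>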

It would be convenient to have an anonymous element type usable at the component level of documents; this would allow a cleaner markup of

Possible Solutions

Two possible solutions seem obvious:

The Core subcommittee leans toward the second solution, since the structural distinction between anonymous phrase-level elements and anonymous chunk-level elements seems worth reflecting in the element type.[1]

Proposal

An element <block> should be added to the additional tag set for linking and alignment, in section 14.3, which is where <seg> is defined.

It should have the following description:

<block>: contains any arbitrary component-level unit of text. Attributes include:

The tag list at the beginning of the section should list the tags in the order <anchor>, <seg>, and <block>, and the discussion of the <anchor> element should be moved from the end of the discussion section, where it is currently lost, to the beginning.

The discussion of <seg> and <block> should read:

The <seg> and <block> elements can be used at the encoder's discretion to mark almost any segment of the text which is of interest for processing. One use of these elements is to mark text features for which these Guidelines otherwise provide no appropriate markup, i.e. as a simple extension mechanism. Another use is to provide an identifier for some segment which is to be pointed at by some other element, i.e. to provide a target, or a part of a target, for a <ptr> or other similar element.

Several examples of uses for the <seg> element are provided elsewhere ...

(Continue with current discussion of <seg> element.)

The remainder of this chapter contains a number of examples of the use of the <seg> element simply to provide an element to which an identifier may be attached, for example so that another segment may be linked or related to it in some way.

The <block> element performs a similar function for portions of the text which occur not within paragraphs or other component-level elements, but at the component level themselves. It may be used, for example, to tag the canonical verse divisions of Biblical texts:

 
<div1 type='book' n='Gen'>
<head>The First Book of Moses, Called</head>
<head type='main'>Genesis</head>
<div2 type='chapter' n='1'>
<block n='1'>In the beginning God created the heaven and the
earth.</block>
<block n='2'>And the earth was without form, and void; and darkness
<hi>was</hi> upon the face of the deep.  And the Spirit of God
moved upon the face of the waters.</block>
<block n='3'>And God said, Let there be light:  and there was
light.</block>
<!-- ... -->

In other cases, where the text clearly indicates paragraph divisions containing one or more verses, the <p> element may be used to tag the paragraphs, and the <seg> element used to subdivide them. The <block> element may not be used here, as it may occur only between, not within, paragraphs.

 
<div1 type='book' n='Gen'><head>Das Erste Buch Mose.</head>
<div2 type='chapter' n='1'>
<p>
<seg n='1'>Am Anfang schuff Gott Himel vnd Erden.</seg>
<seg n='2'>Vnd die Erde war w&uuml;st vnd leer / vnd es war
finster auff der Tieffe / Vnd der Geist Gottes schwebet auff
dem Wasser.</seg>
</p>
<p>
<seg n='3'>Vnd Gott sprach / Es werde Liecht / Vnd es ward
Liecht.</seg>
<!-- ... -->

Additional examples of the use of the <block> element are given elsewhere in these Guidelines:

The discussions of dramatic speeches in the sections indicated should use <block>, not <seg>.

Section 14.3 should be renamed Segments, Blocks, and Anchors.

The declaration for <block> should be

 
<!ELEMENT block         - O  (%paraContent;)                    >
<!ATTLIST block              %a.global;
                             %a.seg;
          subtype            CDATA               #IMPLIED
          TEIform            CDATA               'block'        >
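
As a hedged illustration of what this declaration would permit (the text and attribute values are invented, and the type attribute is assumed to be supplied by %a.seg; as it is for <seg>): because the end-tag is declared omissible (- O) and <block> is not itself part of paraContent, a sequence of blocks could be tagged without explicit end-tags, each block being closed implicitly by the start of the next or by the end of the enclosing division.

 
<div1 type='section'>
<block type='item' n='1'>First anonymous component-level unit.
<block type='item' n='2'>Second unit, containing a
<seg type='phrase'>phrase-level segment</seg>.
</div1>

This assumes an SGML declaration with OMITTAG YES; where end-tag omission is not wanted, the explicit </block> form used in the examples above works equally well.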

Open Questions

A number of questions are answered implicitly in the proposal just given; they may need explicit discussion.

Notes

[1] The elements are semantically anonymous, but their structural position is usually clear; there seems no reason not to make it manifest.