The DTD of TEI P3 defines a large number of element types, with a wide variety of meanings. In addition, it defines one element (<seg>), which has no specified meaning. The <seg> element may be used:
The <seg> element can be used only at the phrase level, because <seg> is a member of the class phrase. It can thus appear within paragraphs and similar elements (strictly: within any element with a content model of paraContent, specialPara, or phrase.seq), but not between paragraphs, that is, directly within text divisions.
It would be convenient to have an anonymous element type usable at the component level of documents; this would allow a cleaner markup of
Two possible solutions seem obvious:
An element <block> should be added to the additional tag set for linking and alignment, in section 14.3, which is where <seg> is defined.
It should have the following description:
<block>: contains any arbitrary component-level unit of text. Attributes include:
- type characterizes the type of the text block
- ident characterizes the function of the text block
- subtype provides a subcategorization of the text block, if needed
The tag list at the beginning of the section should list the tags in the order <anchor>, <seg>, and <block>, and the discussion of the <anchor> element should be moved from the end of the discussion section, where it is currently lost, to the beginning.
The discussion of <seg> and <block> should read:
The <seg> and <block> elements can be used at the encoder's discretion to mark almost any segment of the text which is of interest for processing. One use of these elements is to mark text features for which these Guidelines otherwise provide no appropriate markup, i.e. as a simple extension mechanism. Another use is to provide an identifier for some segment which is to be pointed at by some other element, i.e. to provide a target, or a part of a target, for a <ptr> or other similar element.
Several examples of uses for the <seg> element are provided elsewhere ...
(Continue with current discussion of <seg> element.)
The remainder of this chapter contains a number of examples of the use of the <seg> element simply to provide an element to which an identifier may be attached, for example so that another segment may be linked or related to it in some way.
The <block> element performs a similar function for portions of the text which occur not within paragraphs or other component-level elements, but at the component level themselves. It may be used, for example, to tag the canonical verse divisions of Biblical texts:
<div1 type='book' n='Gen'>
<head>The First Book of Moses, Called</head>
<head type='main'>Genesis</head>
<div2 type='chapter' n='1'>
<block n='1'>In the beginning God created the heaven and the earth.</block>
<block n='2'>And the earth was without form, and void; and darkness <hi>was</hi> upon the face of the deep. And the Spirit of God moved upon the face of the waters.</block>
<block n='3'>And God said, Let there be light: and there was light.</block>
<!-- ... -->
In other cases, where the text clearly indicates paragraph divisions containing one or more verses, the <p> element may be used to tag the paragraphs, and the <seg> element used to subdivide them. The <block> element may not be used here, as it may occur only between, not within, paragraphs.
<div1 type='book' n='Gen'><head>Das Erste Buch Mose.</head>
<div2 type='chapter' n='1'>
<p>
<seg n='1'>Am Anfang schuff Gott Himel vnd Erden.</seg>
<seg n='2'>Vnd die Erde war wüst vnd leer / vnd es war finster auff der Tieffe / Vnd der Geist Gottes schwebet auff dem Wasser.</seg>
</p>
<p>
<seg n='3'>Vnd Gott sprach / Es werde Liecht / Vnd es ward Liecht.</seg>
<!-- ... -->
Additional examples of the use of the <block> element are given elsewhere in these Guidelines:
- as a means of marking dramatic speeches when it is not clear whether the speech is to be regarded as prose or verse (see section 6.11.2, "Core Tags for Drama," on p. 212, and section 10.2.4, "Speech Contents," on p. 285).
- (to be specified)
The discussions of dramatic speeches in the sections indicated should use <block>, not <seg>.
Section 14.3 should be renamed Segments, Blocks, and Anchors.
The declaration for <block> should be
<!ELEMENT block - O (%paraContent;) >
<!ATTLIST block %a.global;
                %a.seg;
                subtype CDATA #IMPLIED
                TEIform CDATA 'block' >
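The placement rule implied by this declaration can be illustrated with a small checker. What follows is only a sketch under stated assumptions, not part of the proposal: it assumes the document has been normalized to well-formed XML (no omitted end tags), it lets <p> alone stand in for all elements whose content is paraContent, specialPara, or phrase.seq, and the names `misplaced` and `PARAGRAPH_LIKE` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplifying assumption: <p> alone stands in for every element whose
# content model is paraContent, specialPara, or phrase.seq.
PARAGRAPH_LIKE = {"p"}


def misplaced(root):
    """Return (tag, reason) pairs for <seg>/<block> elements found at the wrong level."""
    problems = []

    def walk(elem, in_para_content):
        for child in elem:
            if child.tag == "seg" and not in_para_content:
                problems.append(("seg", "phrase-level element outside paragraph content"))
            elif child.tag == "block" and in_para_content:
                problems.append(("block", "component-level element inside paragraph content"))
            # <block> is declared with (%paraContent;), so its children are
            # treated like the children of a paragraph.
            walk(child, in_para_content
                 or child.tag in PARAGRAPH_LIKE
                 or child.tag == "block")

    walk(root, root.tag in PARAGRAPH_LIKE)
    return problems


# <block> between paragraphs and <seg> within a paragraph: both acceptable.
good = ET.fromstring(
    "<div2 type='chapter' n='1'>"
    "<block n='1'>In the beginning God created the heaven and the earth.</block>"
    "<p><seg n='3'>And God said, Let there be light: and there was light.</seg></p>"
    "</div2>"
)

# <seg> directly within a division and <block> within a paragraph: both flagged.
bad = ET.fromstring(
    "<div2 type='chapter' n='1'>"
    "<seg n='1'>In the beginning God created the heaven and the earth.</seg>"
    "<p><block n='3'>And God said, Let there be light: and there was light.</block></p>"
    "</div2>"
)

print(misplaced(good))
print(misplaced(bad))
```

A real SGML parser working from the DTD would enforce the same constraint through the content models; the sketch only makes the intended division of labour between <seg> and <block> explicit.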
A number of questions are answered implicitly in the proposal just given; they may need explicit discussion.
idref
links to arbitrary phrase-level spans within a paragraph,
<block> seems unlikely to be seriously useful for hypertext linking. Its main function will be as an anonymous element.
Perhaps both <block> and <seg> belong in the core?
<div><head>Further Discussion</head>
<block type='blort'>blort blort blort</block>
<p>And furthermore, <seg type='blort'>blort blort blort</seg> which really puts it beyond all doubt, I think.</p>
</div>
If the latter, should the encoder prefer <seg> or <block> when the item in question occurs within a paragraph?
<u>Our chief weapon is <seg type='weapon'>fear</seg>.</u>
<u><block type='weapon'>Fear and surprise. </block></u>
<u>Our two chief weapons are <block type='weapon'>Fear and surprise. </block></u>
(Or see the example of Bible verses, above.)
[1] The elements are semantically anonymous, but their structural position is usually clear; there seems no reason not to make it manifest.