Summary of Substantive and Rhetorical Points and Queries in AI3 W5

    drafted CMSMcQ, 15 Feb 91
    24 May 91, separated from notes for replies
     3 Jun 91, added cross-references to TEI P1 for all items and
               reviewed against original, adding new items.

1.  Guidelines need a theoretical introduction which defines 'text',
'tag', 'hierarchy', etc.

2.  Question:  are multiple hierarchic structures (physical, formal,
grammatical, semantic, actantial, narrative, psychological, etc.) (a)
all definable as hierarchies in SGML, (b) taggable in the TEI scheme?

3.  Can SGML handle richness of expression and multiple levels of
meaning?

4.  Discussion of highlighting and font shifts pp. 78, 124 seems to
imply reliance on authorial intention; such a reliance ceased being
intellectually respectable about 1940.  (Theoretical introduction would
help avoid this misconception.)

5.  Literary work with computers requires:
    a.  inexpensive data entry
    b.  preservation of page and line numbers of the copy text
    c.  a stable electronic text not subject to change

Many literary scholars object to the inclusion of interpretive matter in
electronic texts.

The guidelines should distinguish clearly between a minimal set of
required or recommended tags and the optional tags to be used at the
discretion of the encoder.  (Cf. 24.)

6.  The technical documentation of the TEI scheme should not include
any claims that it will prove useful to scholars embarking on the
creation of an electronic text.  (This will prevent neophytes from
attempting to use the technical documentation instead of more
appropriate tutorial material.)

7.  P. 4 alludes to macros and parsers but gives no examples.  If they
exist, examples should be listed here.

8.  Section 2.1.4 recommends embedding an interpretation of a text into
its DTD; no interpretations should be embedded in the text.

9.  The guidelines should make explicit the distinction between
interpretive markup (e.g. tags for emphatic phrases and foreign words)
and non-interpretive markup (e.g. tags for font shifts).

10.  Markup minimization should be explicitly encouraged.

11.  Text structure should be made clear by format of the file, not by
explicit tags for text structure, for the reason given in point 12.

12.  Explicit coding of text structure will be error-prone and hard to
verify.  Implicit coding by means of file format should be preferred.

13.  Pagination and lineation frequently vary with the printing, not
just the edition, of a text; printer and date of printing should be
required in the TEI header.  (Section 4.3.2.)

14.  Section 5.1 (p. 71) definition of 'text' is incorrect.  Not all
texts are extended and spoken discourse is not 'text' until written
down.

15.  Line numbers are an important means of locating specific passages
of text and should be recommended for general use.  (To section 5.1.)

16.  The word 'colophon' is not one everyone can be expected to know.
(Section 5.2.5.)

17.  In Pleiade editions, the colophon appears in the front matter.
[The 'colophon' tag should therefore be allowed in the front matter.]
(Section 5.2.5.)

18.  Line breaks should be mentioned as a possible constituent of
paragraphs in section 5.3.1.

19.  The Literary Needs Survey made abundantly clear that line numbering
is required by literary scholars and should be recommended in all cases.
(To section 5.11.2.)

20.  Normal practice in literary study of prose texts is to refer to
page and line numbers.  (To section 5.11.2.)

21.  Section 7.3.1.1 requires (p. 177) the specification of the METER
attribute in every line; this should not be required.

22.  The method for rendering rhyme pattern laid down in section
7.3.1.2 is too prescriptive and should be loosened.

23.  In cases where an older and more authoritative method of
identifying specific passages exists, as in the Bible, it should be used
in preference to page and line numbers from the source text.  (To
appendix A.6.)

24.  Explicit lists should be given of required and optional tags.  (Cf.
5.)

25.  Short tags should be explicitly recommended for local processing,
expanding on the recommendation to that effect in section 1.1.2.

26.  Exclamation point, pound sign, and square brackets should be
allowed in interchange.  SGML should not take precedence over the needs
of scholars.  (Section 3.2.)

27.  Names of data entry personnel should be recorded in the TEI header.
(Section 4.3.1.)

28.  The tags for names and abbreviations (sections 5.3.6 and 5.3.7)
must be optional.

29.  List handling tags are too wordy and take too much for granted.
(Section 5.3.8.)

30.  Section 5.4 (on tags for editorial interventions) applies only to
post-input markup.

31.  The example given for critical apparatus is trivial.  (Section
5.10.3.)

32.  Lack of variants should not be recorded explicitly.  (Section
5.10.3.)

33.  Experts in text criticism should be consulted for the tags for
critical apparatus.  (Section 5.10.3.)

34.  In general, the guidelines should recommend the capture of
typographic features and only typographic features; the use of
descriptive markup for underlying features should be strictly optional
and not recommended.

35.  Direct quotation, indirect quotation, indirect discourse, free
indirect discourse, authorial comment, description, and narration cannot
reliably be distinguished from each other, and users should be cautioned
to apply tags for such features with prudence.  (Section 2.1.2.)

36.  The discussion of crystals in sections 5.3.1 and 5.3.12 needs
revision for clarity.

37.  The bibliographic tagging in section 5.3.7 is too cumbersome,
especially for use in data capture.

38.  Section 5.8.1 proposes a tag for sentence boundaries, which assumes
that sentence boundaries can be known.

39.  Consistent use of presentational markup would avoid the problems
that arise when descriptive markup is not feasible for some reason.

40.  The example from Richardson's Clarissa in section 5.11.1 does not
identify the copy text or give page and line numbers.

41.  In the example from Richardson, the word 'Anglice' is marked as
Latin, but it is not found in Lewis and Short:  a perfect example of the
weakness of descriptive markup [i.e. here it is actively misleading].
(Section 5.11.1.)

42.  In the Richardson example, it is unclear whether the italics mark
quotation, emphasis, or irony.  The interpretation of the italics should
be performed by the analyst.  (Section 5.11.1.)

43.  In section 7.3, no claims should be made about whether texts
should be studied in context or in isolation.

44.  The use of tags like DIV0 and DIV1 will frighten literary scholars.
Blank lines should be used instead.  (Section 5.2.4.)

45.  The discussion of legal content in section 5.2.4 should be made
clearer and rewritten without the term 'legal'.

46.  The second example date in section 5.3.11 should end the tagged
date after 'seventy-seven', not after 'Eighty-Sixth', to be consistent
with the interpreted value given.

47.  What is the meaning of the 'unit=absent' attribute-value pair for
the MILESTONE tag?  What is there to mark if the text is not present?
(Section 4.6.2.)

48.  Soft hyphens need further attention.  (To section 5.8.2.)

49.  Section 7.1 uses the term 'narrative' in the sense 'prose'.

50.  Section 7.3.2.1 engages in overkill by specifying Cordelia as the
speaker both in an element and in an attribute.  Why?

51.  Section 7.3.2.1 does not help solve the problem of attaching to
each sentence or word of a play the identity of its speaker.

52.  Cast list should also include date and location of first
performance.  (Section 7.3.2.4.)

53.  The confusion between '1' and '2' and 'Francisco' and 'Barnardo' is
messy.  (Appendix A.3.)

54.  The DTD for drama is unusable.  (Appendix C.12.)