TEI META Task Force: Minutes of a meeting to discuss P5 development, 2003-10-09 [MEW02]



This document records the results of a meeting between Sebastian Rahtz, Lou Burnard, and Norm Walsh, held in Oxford on 9th October 2003. The aim was to review the proposed approach to developing P5, in the light of parallel experiences with Docbook. NW reported that he was working on an entirely new revision of Docbook, rewriting it from scratch in Relax NG compact syntax, and looking at all aspects of modularity and element classes on a clean slate.

SR proposed as areas for discussion:
  1. editing and tools
  2. current and future ODD markup
  3. display issues relating to current and future ODD markup
  4. naming policies for patterns and other objects
  5. possible link points to Docbook
We addressed most of these, but not in any particular order. These notes summarize key decisions as LB perceived them.

Use of RNG namespace in ODDs

How much RNG syntax should be scattered through an ODD-NG tagdoc? SR circulated a proposal in which both attribute list specs and content models used RNG. This conflicts with the desire to make ODD-NG a proper abstract language and complicates processing. NW said that he'd considered an XML vocabulary for expressing content models etc. but concluded that since such a vocabulary would have to be automatically translatable into RNG, it would be as easy to use RNC/G from the start. We agreed that everything needed for RNG attlist definition could be automatically extracted from existing ODD structures, but that the reverse (i.e. storing everything needed for ODD as RNG) was not the case. We also agreed that content models might as well be specified in RNG as anything else. After the meeting, LB remembered that current tagdocs have <class> elements: this might cause problems if classes map to RNG patterns which are also stated in the RNG content. We will have to take care to make sure all pattern names are unique.

Output of the ODD-NG system

We discussed various options but concluded that it was important to maintain output of modifiable DTD fragments (P4 style: with modules) as well as output of XML format RNG specifications. RNG could be post-processed to produce (a) RNC for use in applications and (b) RNC fragments to be included in the P5 documentation (as per earlier Council decision to use RNC as formal display expression of the TEI architecture). Modifiable DTD fragments could be post-processed to produce (a) ‘baked pizza’ and (b) W3C XML Schema.

We discussed how far the RNG generated should attempt to use ‘natural’ RNG idioms rather than follow the current rather tortuous methods resulting from DTD-world constraints: SR agreed to investigate this further. The change would be twofold:
  1. Elements would no longer be protected by INCLUDE and IGNORE guards set by the module they were members of. Instead, they would extend appropriate element classes by redefining them with combine="choice". The concept of extension classes (eg x.biblPart) would no longer be needed, as an element could simply extend its class (eg m.biblPart).
  2. Modules would no longer all be loaded at the start, with guards set to prevent them being active; instead a per-file or per-project wrapper would <include> only the desired modules: this would be more self-contained.
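
The two changes above can be sketched in RNC roughly as follows. This is only an illustration, not output of the ODD-NG system: the file names (core.rnc, extra.rnc, myproject.rnc) are hypothetical, the content models are drastically simplified, and the membership of author in m.biblPart is assumed purely for the example.

```rnc
# --- core.rnc (hypothetical core module) ---
# The class is declared empty; notAllowed adds nothing to a choice.
m.biblPart = notAllowed
bibl = element bibl { (text | m.biblPart)* }

# --- extra.rnc (hypothetical optional module) ---
# The element joins its class directly with |= (combine="choice" in
# the XML syntax). No x.biblPart extension class and no INCLUDE/IGNORE
# guard is involved.
m.biblPart |= author
author = element author { text }

# --- myproject.rnc (hypothetical per-project wrapper) ---
# Only the desired modules are included; omitting the line for
# extra.rnc would leave m.biblPart empty and author undefined.
include "core.rnc"
include "extra.rnc"
start = bibl
```

The three parts are meant as separate files. The point of the sketch is that a module's presence or absence is decided by the wrapper alone, rather than by guards set inside every module.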

Should customization be supported in both DTD land and RNG land? We didn't address this question explicitly, but the tenour of our discussion was that since P5 would be maintained in RNG, use of RNG extension mechanisms such as |= (compact syntax) would become the recommendation, particularly as they offer enhanced functionality (this has obvious implications for the current chapter on modification). At the same time, if we continued to generate modifiable DTD fragments, we could not prevent people continuing to modify them using the existing DTD modification methods. Round-tripping from a TEI DTD modification set to a corresponding set of RNG modifications was not something we wanted to envisage. This would mean that people writing TEI extensions might have to express them twice, once as RNG, once as DTD. This needs wider discussion.
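
As a rough sketch of what an RNG-side customization might look like, assuming a generated schema file called tei.rnc (a hypothetical name) that defines a pattern m.biblPart and an element pattern author; the element shelfmark and the attribute key are invented for the example:

```rnc
# Hypothetical customization layered on a generated schema.
include "tei.rnc" {
  # A definition inside the include block replaces the definition
  # of the same name in the included schema.
  author = element author { attribute key { text }?, text }
}

# Add a locally defined element to an existing class.
m.biblPart |= shelfmark
shelfmark = element shelfmark { text }
```

Nothing in the generated schema is edited here: the redefinition and the class extension both live in the customization file, which is the kind of mechanism a DTD modification set would have to express separately.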

Naming conventions

What should be used internally and which system should be exposed to users of ODD-NG? The current ODD system uses the following prefixes for parameter entities:
  n.  gi (generic identifier, i.e. element name)
  x.  extension class
  a.  attribute class
  m.  model class
Some entity names are entirely arbitrary (e.g. phrase.seq, paraContent). The ‘n.’ prefix for gis is not exposed in the documentation, and is only relevant when elements are being renamed, e.g. for internationalization.

We agreed that a set of conventions needed to be defined and enforced if we were going to make use of the class mechanism in ODD-NG, since it would be essential to distinguish reliably between names of RNG patterns and names of RNG elements.

We didn't decide whether to use suffixes or prefixes. We did decide that unadorned names should be interpreted as element names, and to review the Docbook conventions (specified at http://www.docbook.org ) to see whether we could adopt them.

The class system

In RNG, patterns are first class objects which the user can manipulate, in particular when specifying modifications or customizations of a schema. In ODD-NG we felt that since semantic and model classes were the appropriate way of representing patterns, there were potential gains in systematizing and extending the current rather haphazard class system.

SR enumerated the kinds of TEI object for which ODD-NG needed to define patterns:
  • elements
  • attributes
  • attribute classes
  • model classes
  • datatypes
We noted that some of these objects are modifiable (e.g. biblPart), but others are not (e.g. phrase.seq); possibly orthogonal to this, some of these objects are ‘public’ (e.g. paraContent) and others ‘private’ (m.phrase, which is never used explicitly in a content model). Irrespective of modifiability, the TEI class system was an essential way of managing complexity and aiding legibility. We felt therefore that any work done in this area would be beneficial.

We did a quick review of all TEI semantic classes, and noticed some anomalies. We also noted that many elements were not classified at all, notably those in the Header; this needs to be addressed.

We also noted that the identification of approximately equivalent TEI semantic classes and Docbook semantic classes would be a very good way of encouraging convergence between the two standards and agreed to see how far we could get in identifying them. It should be possible to create mixed documents with some elements from TEI, and some from Docbook, if the modules use common element classes at key points.


Last recorded change to this page: 2007-09-16  •  For corrections or updates, contact webmaster AT tei-c DOT org