AI7: Minutes of the 3rd TEI AI7 meeting of 1991-11-15/16

In attendance:
  Sue Ellen Wright
  Gregory Shreve
  Gerhard Budin

Observers:
  M. Leaman, ISO/CS
  G. Herzog, DIN
  K.-D. Schmitz, Univ. of Saarbrücken
  H. Hjulstad, RTT

Absent:
  Alan Melby
  Dick Strehlow

Ms. Wright presided in the absence of Mr. Melby. She began the meeting with a short introduction to the work of TEI AI7 for the benefit of the observers present.

1. Revision of work to date

The members of AI7 discussed changes made in TEI-TERM since the last meeting in Oak Ridge, USA, most of which are contained in the current version of papers delivered by Wright and Melby at the Conference on Terminology and Documentation in Hull, Canada, and at ASLIB in the UK (AI7 W14 and AI7 W20, respectively).

2. ISO activities

Mr. Leaman circulated a DTD developed by ISO/IEC/JTC 1/SC 18/WG 8 for handling terminological data for the publication of standards (Attachment # 3). He explicitly welcomed the activities of TEI AI7 as a valuable contribution to a broad ISO strategy for developing a whole range of SGML support facilities.

3. TEI Guidelines (P2)

Since AI7 has to present a chapter for the final version of the TEI Guidelines (TEI P2) very soon, discussion began on what to include in this chapter, centering on the formal requirements provided by the TEI editors. Mr. Budin stated that all documents created so far by the group should be condensed and merged into a new document specifically designed for the TEI Guidelines. This work had to begin immediately because the final document was due in Chicago in electronic form by 28 November 1991.

4. Future work

After completion of the P2 document, empirical tests must follow. Operators of large terminological databases (TDBs) and developers of terminology management systems (TMSs) should be encouraged to collaborate with AI7 in testing the format by writing conversion routines that export data from their systems to the TEI-TERM format.

5. Current concerns

5.1 The group discussed a newly introduced element, the tag , the other form information group. This tag has been introduced in order to avoid mixed content models following the tag.

Example:
PCDATA ... PCDATA ... 1991-11-17

5.2 Mr. Hjulstad pointed out the problems inherent in relegating the grammar data element to the level of an attribute value in the form of specific grammatical items of information (pos, gen, etc.). This problem is exacerbated if individual term elements must be documented for multiword terms. As a result of these discussions, the group decided to include a tag in the list of tags, in which case the specific types of grammatical information would become values of the attribute type:

Ex. n .

6. Preparation for the ISO/TC 37/SC 3 meeting

AI7 had submitted several documents to the International Organization for Standardization (ISO) Technical Committee 37, Sub-Committee 3, Computational Aids to Terminology (ISO/TC 37/SC 3), in conjunction with a proposal for a future work item. The group discussed the strategy to be adopted at the next ISO/TC 37/SC 3 meeting on 1991-11-18/19 in Vienna. Several options were presented with respect to SC 3's possible treatment of an SGML interchange project vis-à-vis the existing MATER standard (ISO 6156-1987, "Magnetic tape exchange format for terminological/lexicographical records (MATER)"):

* submit an SGML format as a Part II to MATER
* replace MATER with an SGML format
* create a separate standard parallel to MATER
* submit an AI7 document to ISO/IEC/JTC 1/SC 18/WG 8 as a contribution to be included in a Technical Report on SGML support facilities.

The group discussed rationales for these options. Mr. Leaman pointed out that this effort is fully in line with a recent ISO/STACO recommendation for the harmonization of structures in terminology work and is thus supported by the ISO Council.

The group discussed revisions to the New Work Item Proposal submitted by the SC 3 Secretariat prior to the SC 3 meeting.
See Attachment # 4 for the final revised version of the proposal.

7. Practical demonstration of conversion routines

Ms. Wright demonstrated two conversion utilities, MMUTS and NORM. The current version of MMUTS (MicroMATER UTilitieS) converts a canonical MicroMATER file to either a flat or a nested TEI-TERM document. The NORM utility converts a flat TEI-TERM document to a nested document prepared for interchange.

For interchange purposes, systems developers need only create a conversion utility that will transform their native files into a flat format, which will be very similar to the structure of their existing terminological entries. Theoretically, the NORM utility would then be able to convert the flat entry to the interchange format.

Printouts from the various conversion levels are included in Attachment # 5. It is important to note that the documents included in this Attachment were genuinely generated using the respective utilities, even though the final product of both utilities is identical.

8. P2

At subsequent meetings held after the official meetings and after the arrival of Mr. Melby from the Myrdal meeting, the AI7 group prepared the basis for its contribution to the P2 document. It was agreed that this document should be a concise presentation of the basic elements of TEI-TERM and that detailed implementations of various features and formats should be dealt with in the tutorial. These documents accompany these minutes as AI7 P18 (P2) and AI7 W 19.

9. Future meetings and activities

AI7 must address the following projects in the near future:

* Enlist the cooperation of TDB and TMS developers and users in order to test the TEI-TERM format for a wide range of applications.
* Write the tutorial to be included with the final P3 document.

Mr. Melby also noted that a recommendation had been made to add another European to the WG.
The group was of the opinion that the addition of another qualified European was highly desirable, but that the exclusion at this point of any of the Americans already active would be impossible. Group consensus would suggest that some sort of compromise be reached with regard to funding, bearing in mind that AI7 is a highly productive contributor to the TEI effort.

The status of future meetings will hinge on discussions with TEI management and on decisions concerning future funding of TEI activities.

Respectfully submitted
G. Budin and S. Wright

Coda: ISO/TC 37/SC 3 Meetings of 1991-11-18/19

The following resolutions passed by SC 3 are pertinent to the work of AI7:

Resolutions adopted by ISO/TC 37/SC 3, 1991-11-19:

B: The member bodies present at the 8th meeting of ISO/TC 37/SC 3 favor the establishment of an external liaison, category A, to the Text Encoding Initiative - Analysis and Interpretation Working Group - TEI/AI7 "Terminological Data".

Wright will initiate a formal request on behalf of TEI-AI7 to the ISO Central Secretariat for the establishment of this liaison.

C: ISO/TC 37/SC 3 establishes a working group (WG 1) "Data elements" within the scope of SC 3. Ms. Wright has been appointed as the convener of the working group.

This activity is consonant with the AI7 position that AI7 does not itself have a formal mandate to standardize a data element directory in the form of a list of attribute values. Wright will pick up on work already performed within SC 3. The questions of combinability and repeatability, formerly a component of SC 3 work on data element names, will be subsumed in the more formalistic work of WG 3 on the SGML interchange format.

D: ISO/TC 37/SC 3 establishes a working group (WG 3) "SGML terminology applications" within the scope of SC 3. Mr. Budin has been appointed as the convener of the working group.

This resolution gives rise to the circulation of the New Work Item Proposal discussed above as Attachment # 4.
The version provided with this document represents the final edited version from the SC 3 meeting.

F: The member bodies present at the 8th meeting of ISO/TC 37/SC 3 favor the establishment of an internal liaison between ISO/IEC JTC 1/SC 18/WG 8 "Text description and processing languages" and TC 37/SC 3/WG 3.

The SC 3 Secretariat will initiate this request for liaison.

G: The member bodies present at the 8th meeting of ISO/TC 37/SC 3 favor the confirmation of ISO 6156, "Magnetic tape exchange format for terminological/lexicographical records (MATER)" during the ballot for periodic revision scheduled for 1992.

The effect of this resolution on the future work of AI7 and TC 37/SC 3/WG 3 is multifold:

* No immediate changes will be made in 6156.
* The data elements list annexed to 6156 will remain valid while a new data elements directory is produced.
* SC 3 can initiate a revision of 6156 at a later date in coordination with SC 3/WG 3 depending on future decisions. Any future version of 6156 is likely to modify the magnetic tape stipulation of the original standard.

Respectfully submitted,
S. Wright

Attachments:

Attachment # 1: Revised agenda, TEI-AI7 A15
Attachment # 2: AI7 Documents List, Printed Version of 1991-11-25
Attachment # 3: ISO DTD
Attachment # 4: ISO/TC 37/SC 3 N 89 en Rev 2 of 1991-11-19
Attachment # 5: MMUTS and NORM output

*************************************************************************

Attachment # 1 to TEI M18

TEI A15: Revised Agenda of 1991-11-15/16

Place: NORM, 1st floor

Friday 1991-11-15

9:00 - 12:30
  1. General introduction of TEI members and guests                   All
     Determine who wants to go on the excursion on Saturday
  2. Brief introduction using some of the slides from the
     Hull presentation                                                SEW
  3. Recap of developments since the Oak Ridge meeting
     based on AI7 W14                                                 TEI
10:30  Coffee break
  4. Review the new DTDs
12:30  Lunch
2:00 - 5:30
     Continuation of morning session, if necessary
  5. Strategy planning for the TC 37/SC 3 meeting                     All
3:45   Break
5:00   Demonstration of MMUTS and NORM

Saturday 1991-11-16

8:30
  1. General discussion to deal with questions generated from
     the observers, continued SC 3 strategy discussions               All
11:00
  2. Excursion

*************************************************************************

Attachment # 2, AI7 M17, 1991-11-25

TEI AI7 Documents

AI7 P1   Objectives and Deadlines: Terminological Data
AI7 W2   MicroMATER Tagset (NISKO paper)
AI7 W3   A Proposed Format for Exchanging Term Files (ASTM Paper)
AI7 A4   Agenda of the Cleveland Meeting 11-13 June 91
AI7 W5   Cleveland Minutes
AI7 W6   Sample Term Entries
AI7 W7   Draft DTD of 1991-07-27
AI7 X8   french.wsd
AI7 W9   ISO 5426-1983 Code table
AI7 W10  Entities or Attributes for Part of Speech and for Structural Analysis of Statements of Meanings
AI7 W11  Terminology Data Categories Survey
AI7 M12  Minutes of Oak Ridge Meeting
AI7 W13  Preliminary TC 37 SC3 report
AI7 W14  Myrdal working paper (based on Hull paper, AI7W14 Ver01)
AI7 A15  Agenda for the Vienna Meeting
AI7 W16  Nested and Flat DTDs, 1991-11-03
AI7 M17  Minutes of the Vienna Meeting
AI7 D18  Draft for the AI7 contribution to P2
AI7 D19  AI7/P2 DTDs
AI7 W20  Aslib paper

*************************************************************************

Attachment #4: ISO/TC 37/SC 3 N 89 en Rev 2
1991-11-19

Re: Item for future work
Proposer: Secretariat of ISO/TC 37/SC 3
Result of discussion at the 8th meeting of SC 3, Vienna, 1991-11-18

Draft Title

Computational aids for terminology - Formats for data interchange - SGML application for terminological entries

Scope

The standard provides a universally applicable, SGML-based format for terminological data, designed primarily for interchange purposes.
Purpose and justification

Specific aims and reason for standardization activity

The project will utilize the SGML reference concrete syntax provided in ISO 8879 and will standardize one or more document type definitions (DTDs) applicable to terminological entries. The project will determine whether there is a need to standardize more than one DTD in order to accommodate all possible types of term entries. This SGML support facility will be used for terminology management and interchange. It will be designed to facilitate interaction with:

* terminological databases
* lexicographical databases
* text bases
* thesaurus and documentation databases
* bibliographical databases
* other SGML applications.

Main interests

Such an activity will ensure coordination with ISO efforts regarding information technology (IT) applications in standards development and publishing in the terminology area. This activity conforms to the Recommendations of the ISO/STACO seminar "Faster and Better: IT applications in standards development" of 1991-06-03/04, which were subsequently endorsed by the ISO Council at their Madrid meeting of October 1991:

  1.3 Terminology data, being today derived from various sources and updated in various organizations, should be standardized world-wide in order to facilitate access, readability and mergeability with other dictionary-like data.

Potential users of this standard include all individual and institutional producers and users of terminological information.

Feasibility, urgency, and benefits for international harmonization

Support software for SGML processing has only recently become available. National and international projects in the field of information and documentation have been initiated with the objective of standardizing SGML formats for special document types, e.g. standards, journal articles, etc.
Early participation on the part of the international standards community will contribute to the development of a harmonized format for the international interchange of scientific and technical documents.

Project stages (not necessarily in sequential order)

0 Preliminary studies
  * relationship to ISO 6156 MATER and other terminological interchange formats
  * relationship to other SGML applications
  (Proposal attached, see Annex 1)

1 Terminological questions
  * related terms not included in ISO 8879
  (Proposal attached, see Annex 2)
  Need for coordination with
  * ISO/IEC JTC 1/SC 1 "Information technology; Vocabulary"
  * ISO/TC 46/SC 3 "Terminology of information and documentation"
  [.......]

2 Identification of mandatory and optional data elements that should be incorporated in the standardized SGML terminology format
  (Proposal attached, see Annex 5)

3 Development of the standardized SGML format for terminological entries according to ISO 8879
  (Proposal attached, see Annexes 3, 4, 6)
  Need for coordination with
  * ISO/IEC JTC 1/SC 18 "Text and office systems"

4 Support of character sets
  Need for coordination with
  * ISO/IEC JTC 1/SC 2 "Character sets and information coding"
  * ISO/TC 46/SC 2 "Transliteration" and SC 4/WG 1 "Character sets"
  (Proposal attached, see Annex 7)
  The SGML format shall support the bi-directional transliteration of character sets.

Project Allocation

The project will be allocated to ISO/TC 37/SC 3/WG 2, "SGML applications" (Convener: Dr. Budin, N).
Time Schedule

NP   92-02  (Ballot to close 92-05)
WD   92-04  (Circulation for discussion at 9th meeting)
WD   93-04  (Circulation for decision at the 10th meeting of SC 3 whether to circulate as a CD)
CD   93-11  (Ballot to close by 94-02)
            Forward text to SCC for French translation
DIS  94-11  (Ballot closes: 95-05)
            Resolution of comments at the 11th meeting of SC 3
IS   95-11

************************************************************************

Draft Attachment # 5, TEI AI7 M17

Demonstration files: MMUTS, NORM utilities
1991-11-16

The TEI-TERM documents A, B, C were generated from a short sample MicroMATER file.

A is the flat document generated directly from the MM file.
B is the normalized (nested) document generated from the flat document.
C is the normalized document generated directly from the MM file.

It should be noted that the MMUTS and NORM utilities used to generate these files (31.10 and 01.11.91, resp.) represented an older version of the TEI-TERM DTD that did not parse properly. This version of the DTD placed the tag at the end of the instead of directly following the term. This error has been corrected in the output files.

MicroMATER file

{mm} 2
{ln} en-pt
-----
*day
{1def} nao ^B noite
{defd} 9 Oct 91
{1ltu} dia
{LTS} pre
{0def} not night
*night
{1} noite
{LTY} syn
{def} nao ^B dia
{1defs} 123ABC
{DE:2LTU} Nacht
{def} nicht Tag
{rl:src} 789XYZ
{rl:srcd} 15 Oct 91
{rl:rst} wor @!

A. Flat TEI-TERM document
day nao ^B noite 9 Oct 91 dia not night night noite nao ^B dia Nacht nicht Tag 15 Oct 91 workingEntry
B. TEI-TERM nested document generated from flat document A using the NORM utility
day not night dia nao ^B noite 9 Oct 91 15 Oct 91 workingEntry night noite nao ^B dia Nacht nicht Tag
C. TEI-TERM nested document generated directly from the MM file using the MMUTS utility
day not night dia nao ^B noite 9 Oct 91 15 Oct 91 workingEntry night noite nao ^B dia Nacht nicht Tag
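
Illustrative note: the first step of a conversion routine of the kind demonstrated in item 7 might be sketched as follows. This is a minimal Python sketch and not the MMUTS or NORM code. It assumes, from the sample MicroMATER file above only, that each field is introduced by a code in braces (for example {def}) and runs until the next such code. The name split_fields is hypothetical, and entry markers such as "*day", the "-----" separator and the closing "@!" are not interpreted; they remain inside the neighboring value or are skipped.

```python
import re

# Assumption (inferred from the sample file only): a field is a
# brace-delimited code followed by free text up to the next "{".
FIELD = re.compile(r"\{([^{}]+)\}\s*([^{]*)")


def split_fields(text):
    """Return a list of (code, value) pairs found in a MicroMATER-like string.

    Hypothetical helper: it only separates codes from values and does not
    interpret entry markers, language prefixes or numeric prefixes.
    """
    return [(code.strip(), value.strip()) for code, value in FIELD.findall(text)]


if __name__ == "__main__":
    sample = "{1def} nao ^B noite {defd} 9 Oct 91 {1ltu} dia {LTS} pre"
    for code, value in split_fields(sample):
        print(code, "=", value)
```

A flat TEI-TERM document could then be produced by mapping each code to the corresponding tag, which is the part that depends on the DTD and is deliberately left out here.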