Minutes of the Steering Committee Meeting
Oxford, 7-9 December 1990

C. M. Sperberg-McQueen

Document Number: TEI SCM16
Draft January 22, 1991 (10:59:10)

Present: Robert Amsler (RA), Lou Burnard (LB), Susan Hockey (SH), Nancy Ide (NI), C. M. Sperberg-McQueen (MSM), Donald Walker (DW), Antonio Zampolli (AZ).

Absent: none.

Work-Group Heads present (7-8 December): Robert Amsler (RA), machine-readable dictionaries; Douglas Biber (DB), language corpora; Steven DeRose (SJD), hypertext and hypermedia; Paul Ellison (PE), formulae etc.; Paul Fortier (PF), literary studies; Harry Gaylord (HG), character sets; Daniel Greenstein (DG), historical studies; Stig Johansson (SJ), spoken texts (present 8 December).

Work-Group Heads absent: Robert Ingria, computational lexica; Robert Kraft, text criticism; Terry Langendoen, general linguistics.

1 AGENDA

The agenda provided by DW was accepted without discussion.

2 DISCUSSIONS WITH WORK GROUP HEADS

The committee discussed the impending meeting with the work group heads; DW stated its goals as understanding the goals and plans of the work groups, advising them, and reacting to their plans.

Reformulations of their lists of objectives had been received from:

    TR1, character sets
    TR3, hypertext and hypermedia
    AI4, historical study
    AI5, machine-readable dictionaries

No reformulations had been received from:

    TR2, text criticism (discussion is just beginning)
    TR4, formulae etc.
    TR5, newspapers (no head)
    TR6, language corpora
    TR7, reference works (no head)
    TR8, physical description (no head)
    AI1, linguistics
    AI2, spoken texts
    AI3, literary study (received at meeting)
    AI6, computational lexica

AZ expressed concern at this silence.

It was agreed to begin with an overview by MSM of the current state of the TEI and the work to be done by the work groups, then continue with a sequential discussion of each work group by its head.

AZ reported that he had spoken to Jacqueline Hamesse about heading work group TR8 on physical description, and she had accepted.

MSM asked whether Paul Ellison's group should include Roberto Cencioni; it was agreed to call Ellison's attention to the work done in the first cycle by Cencioni and to allow him to discuss issues with RC as he saw fit; also that RC should be a reader of working papers by TR4.

Overview of Second-Cycle Work

After DW welcomed the work-group heads, MSM reviewed the work to be done in the second cycle, naming as key areas those of revision (including testing, revision of prose, and revision of DTDs), extension (including analysis, formalization, and SGML implementation), dissemination (including public presentation of results, preparation of formal TEI documents, and creation of sample data and software for demonstration of the advantages of descriptive markup), and pragmatic issues (including organization, the nature and distribution of TEI work products, the need for liaison and coordination among the committees and work groups, and the need for WG members to get up to speed on the guidelines).

SJD suggested a listserv list for WG heads; it was decided that this would not be necessary, but that all WG heads should be added to TEIHEADS for administrative broadcasts, etc. Substantive discussion among WG heads should use TEI-L.

TR1, Character Sets

HG presented the nature of the problems facing his WG and suggested that it should focus on the use of ISO standards as a solution. The TEI guidelines should be a central source of information about how coded character sets can be used. Inventories of characters needed for modern languages do exist; inventories for older languages are needed, and ISO is aware of the need. The WG has no members yet.

PE mentioned Xerox and its Unicode character set, but suggested that Unicode would never succeed owing to political problems in its handling of East Asian scripts.

DW urged that East Asian scripts should not be omitted from TR1's remit, because Japanese interest in the TEI is so strong. PF urged HG to provide a better solution for literary languages than the ISO 646 subset of version 1.

HG's suggested revision of dates, which extends the final due date for this group's work, was approved.

AI4, Historical Study

DG described possible divisions of historical study according to period, subject, and country, observing that the work group achieves reasonable breadth according to each division. Initial reports from the WG members are due 31 January 1991; a meeting will be held in Oxford in April, and results presented at the AHC UK branch meeting. If a report is ready by May, Manfred Thaller has agreed to publish it in a series by St. Katharinens Press; in this case the report will be presented for discussion at the AHC meeting in Odense (in late summer 1991).

DG expressed the hope that the editors would be able to help him ensure that his work group members completed their work on time.

TR4, Formulae etc.

PE observed that version 1 of TEI P1 says very little about figures, etc., and described the political problems arising from the widely differing evaluations of the work in this area performed by the Association of American Publishers Electronic Manuscript Project.

Members of the WG are PE, Anders Berglund, and Dale Waldt; PE expressed confidence that they would be able to produce a report, define an appropriate set of concepts, and develop useful DTDs for the area, which should describe underlying concepts, not merely typographic layout. Meetings will be held in Tennessee in January and Geneva in February or March; the SC authorized whatever extra funding is needed for the second meeting, on the understanding that most costs for the first meeting will be met by the members' employers.

PE further drew attention to the EuroMaths project sponsored by the European Mathematical Trust and a project at Exeter for the evaluation of SGML software and SGML activity in the UK.

RA asked about TeX and its adoption by the American Mathematical Society; PE replied that TeX is a very good typesetting language but does not express any underlying mathematical concepts. SGML is the proper mechanism for bringing different applications together.

DG asked whether encoded tables (e.g. in a historic scientific work) would be accessible for further manipulation by a researcher; PE replied that they should be.

LB informed PE of the work done on this area during the first cycle by Roberto Cencioni and promised to provide copies.

TR6, Language Corpora

DB described the objectives set by the steering committee as acceptable and presenting no problems. He foresees formulation of a series of parameters for text classification -- not discrete document types, but parameters for the characterization of texts. Expressing sampling procedures would also be an important task, not only for the extraction of samples from a text, but also for the description of how texts have been chosen as representatives of larger groups of texts.

No members have yet agreed to serve; Stig Johansson and Geoff Leech have declined, and Gunnel Engwall has not replied. Jeremy Clear was suggested as another possible member.

A long discussion of requirements for corpus work resulted in the following decisions:

* This WG will go as far as it can toward offering a specific proposal for the classification of texts and the documentation of corpora, but will stop short of a firm recommendation if no firm consensus can be reached.
* Whether advice on grammatical annotation for corpora will be offered remains an open question.
* Recommendations for standardization of reference structure will be offered if consensus can be reached, but not otherwise.

AI5, Machine-Readable Dictionaries

RA described the origins of the dictionary encoding efforts and the Dictionary Encoding Initiative, the goals of the original effort and the rationale for the objectives given in his response to the steering committee's draft list of objectives. Dictionaries differ from other texts of interest to the TEI because many people encode dictionaries not to study them but to use them; this creates pressures toward standardization and normalization of a different sort.

AI3, Literary Study

Members of this WG are PF, Christian Delcourt, David Robey, Rosanne Potter, and Ian Lancashire. To get information on document types and analytic procedures of special interest to literary scholars, the WG has circulated a questionnaire to several hundred people online. So far, the answers fall into three groups: those given by scholars who have encoded texts, worked with them, and published the results; those given by people who have not worked in the field; and those which fall in the middle in one way or another. All agree that bibliographic identification of the text and the marking of its major structural divisions are essential. Less consensus exists on the need for tagging metrical analysis or grammatical information; less still on the tagging of interpretive information. Much dismay has greeted the complexity of the current TEI document, and many have suggested that a step-by-step cookbook of instructions is needed.

Describing the likely results of the work group, PF foresaw a likely recommendation to form work groups divided along lines of genre (e.g. poetry, theater, novel, and other), period (e.g. modern and pre-modern), and language. The wide divergence in theoretical understanding of literature and literary study poses a particularly thorny problem for the group.

PF observed that the group might well request an extension of its deadline until 28 February.

TR3, Hypertext and Hypermedia

SJD described the scope of the hypertext and hypermedia group, and the need to coordinate its work with that of other hypertext standardization efforts, especially that of X3V1.8M, an ANSI work group responsible for the HyTime and Standard Music Description Language projects.

SJD expects the WG to address problems of linking to remote documents, external documents in general, and of associating annotation with text; to provide various link types and link information; to work further on the specification of arbitrary locations (especially in read-only text); and to work to ensure that the recommendations of the guidelines can be used with real hypertext systems. He named several specific examples of hypertexts which could be used as demonstration projects, and specific systems to be used to demonstrate the use of TEI notation in moving data from one system to another. The WG will also have some interest in version control, critical apparatus, and handling of divergent recensions of the same information.

DG and PF observed that their work groups will have an interest in the work of the hypertext group. LB asked whether the group would transfer real webs from one system to another to demonstrate the utility of TEI; SJD said a robust transfer mechanism would be very hard to achieve and was unlikely.

PE suggested that the TEI would have to defer to the HyTime group's results, and that the WG needs to function as a link between the TEI and the HyTime group. SJD and the editors felt that a more active role would be required.

AI2, Spoken Texts

SJ described the major difficulties faced in trying to work with spoken text. Very little guidance is to be found in version 1 of TEI P1; for spoken texts, unlike printed text, very little guidance is provided by the long tradition of printing and bookmaking conventions.
In general, spoken-text projects work at multiple levels (phonetic, phonemic, orthographic, and other), which must be kept in sync. The WG must consider what has been done so far, especially coherent computer systems like those of the CHILDES project, the London-Lund corpus of spoken English, the ICE project, and the new spoken-corpus project based in Santa Barbara. Work by the International Phonetic Association on the IPA and symbols for prosody must also be brought in.

The work will have implications for both the TEI header and for the text itself; the latter will pose very difficult problems. The development of technology will also change the task: new corpora are more likely to link an orthographic-style transcript with a sound recording than to produce a full detailed phonetic transcript, which is error-prone and expensive. AZ suggested that any existing standards for representation of sound itself should be incorporated by reference in the TEI so that this WG could focus its attention on levels above that of the narrow orthographic transcription.

LB asked what non-linguistic events would need to be described by tags provided by this work group. SJ listed dialect, sociolinguistic status of speakers, audibility problems, gestures, laughter, etc. as likely topics. LB asked further about the interests and requirements of non-linguists who work with spoken material: anthropologists, ethnologists, etc. SJ replied that their work was very important and useful, and mentioned oral historians as another relevant group.

It was agreed that the work group (initially defined as a planning-level group) would work at the tag-proposal level on problems of linguistic work with spoken texts; if in their work it became clear that non-linguistic work required extensive further attention, they would recommend formation of an appropriate work group to the steering committee.

Jane Edwards has agreed to serve; SJ will contact Brian MacWhinney when he can.
DW will suggest a further name from the DARPA speech-research community.

SJ proposed a modified due date of 31 March.

Further Discussion

Concluding the discussion with the WG heads, the steering committee asked for general comments on areas needing further attention in the second cycle. Areas mentioned included:

* reproduction of manuscript material, e.g. diaries
* version control and the gradual enrichment of machine-readable texts
* ephemera (tickets, matchbooks, advertising)
* fragmentary ancient media (potshards, monumental inscriptions, ...)
* emblems (both collections and isolated emblems)

PE stressed that his group wanted to receive examples of tabular material arising in texts encountered by other groups.

DW having thanked the WG heads once more for their attendance and their work so far, the meeting adjourned for the night.

Steering Committee Discussion of Work Groups

TR3, Hypertext

When the steering committee reconvened on Sunday, 9 December, MSM reiterated his conviction that the hypertext group must do more than provide liaison between the TEI and the HyTime group. He was requested to circulate the general HyTime documents.

    MSM to circulate HyTime documents to SC    MSM    Due: asap

TR6, Corpora, and AI1, Linguistics

AZ asked what could be reported in February to the Network of European Corpora meeting in Pisa. Where no consensus exists already (as in the case of corpus work), either (1) the TEI will say nothing, or (2) the TEI will attempt to create a consensus, or (3) the TEI will not create a consensus but will include proposals by others with that aim in an appendix of the guidelines. After a long discussion it was agreed that:

* Work group AI1 on linguistics will be reminded / informed that recommendations for corpus annotation are an important requirement of the TEI.
* The linguistics WG has not replied to its statement of objectives; if they accept the objective relating to corpus annotation, they will be asked to specify when drafts will be ready for circulation (and urged to set an early date).
* Otherwise, the steering committee will assign the task to a separate work group on linguistic annotation of corpora.

AZ suggested that the meeting with the WG heads was productive and a similar one should be arranged soon with the others.

Terminology and Term-Bank Work

The issue of terminological work was discussed. The editors were assigned to contact Alan Melby for exploratory discussions.

    MSM Contact Melby for discussion    MSM    Due: asap

AI5, Dictionaries, and AI6, Lexica

AZ asked about the organization of the lexicon and dictionary groups. After discussion, it was agreed that Nicoletta Calzolari would be named a member of Ingria's group on computational lexica with responsibility for ensuring effective coordination with EuroTra VII.

    ? inform Ingria, Calzolari    ?    Due: asap

AZ objected to the separation of bilingual and monolingual efforts, citing the results of a major European project which had considered and rejected just such a separation. DW asked him to provide RA with the results of that group.

    AZ provide position paper on separation of monolingual and bilingual dictionary work to RA    AZ    Due: asap

AZ also suggested that etymological work needs to be attacked at the level of the etymological dictionary; DW observed that etymologies in general dictionaries do not have the same form as those in etymological dictionaries; AZ argued that etymology is too central to historical lexicography to be ignored. No action was taken.

LB suggested that communities other than computational linguistics and dictionary publishing must be taken into consideration in the work on dictionaries: those who use older dictionaries as evidence for linguistic or cultural history, field linguists producing dictionaries for unrecorded languages, and historical linguists working on older language forms, all needed more explicit consideration. No action was taken.

A discussion of lexicographic citation slips led to no action.

AI3, Literary Studies

NI reported that PF was rather confused about the goals of his WG. The editors were requested to confer individually with all WG heads to ensure that all had a clear sense of their mission, especially the distinction between planning-level groups and tag-level groups.

    Editors Confer with WG heads    Editors    Due: asap

TR8, Physical Description / TR9, Manuscript Problems

AZ reported on his conversation with Jacqueline Hamesse. She has expressed herself willing to head a work group on manuscript problems. This is not quite how work group TR8 had thus far been defined, but it was agreed:

* Jacqueline Hamesse should head a new work group (TR9) on manuscript problems, to evaluate the guidelines' usefulness for manuscript texts and propose new tags if needed. LB was assigned to contact JH for exploratory conversations to see exactly what she thought should be undertaken and ensure it fits within our needs.

    LB to contact J. Hamesse    LB    Due: asap

* Don McKenzie should be asked to head TR8, which will be renamed a work group on physical description of printed books (analytic bibliography).

    LB to contact D. McKenzie    LB    Due: asap

TR5, Newspapers

The newspaper work group was discussed. The editors were assigned to contact Rieger and the German group he has contacted to discuss the issues. The German society for computational linguistics, the Institut für deutsche Sprache, the Logothek in Sweden, N. Benton, and N.
Rabinsky will be asked (in that order of preference) to undertake the work of the newspaper work group. If all else fails, a public appeal should be posted to TEI-L for volunteers.

    MSM to contact the Germans    MSM    Due: asap

TR7, Non-Dictionary Reference Works

No ideas being forthcoming about reference works, it was agreed to post an appeal for volunteers on TEI-L.

    Eds to appeal on TEI-L for volunteers for reference works    Eds    Due: asap

Silent Work Groups

Langendoen, Ingria, and Kraft have not been heard from. AZ requested that an absolute deadline be set for them. No action was taken.

3 METALANGUAGE REQUEST

The editors reported a request from the metalanguage committee that Lynne A. Price be accredited as speaking for the TEI in deliberations of ANSI committee X3V1; this will enable LAP to act more effectively on our behalf in the revision of the SGML standard. The request was approved and DW will act accordingly in consultation with David Barnard.

4 BUDGET AND FINANCIAL MATTERS

The summary of first-cycle expenses circulated by MSM 3 December was accepted.

The second-cycle budget summary circulated by MSM 9 November was discussed. A breakdown of the EEC column was requested.

    MSM to provide EEC column detail    MSM    Due: asap

Quarterly reports were also requested showing, for each category, the amounts spent, committed through the end of the project, projected but not committed, and remaining.

    MSM to arrange for quarterly reports as specified    MSM    Due: asap

The editors' proposal to unify responsibility for funds disbursal and financial reporting in Chicago was accepted. NI and MSM were assigned to see to the opening of an account in Chicago under the name of ACH, with signatory powers for MSM and his assistant for financial records.

DW and SH were assigned to explore implications and pragmatic issues of incorporation of the TEI in the US or Europe.

5 PROJECT MANAGEMENT

The document circulated by DW before the meeting was discussed.

The editors were asked to investigate the automatic forwarding of an introductory packet to new subscribers to TEI-L.

    Eds to check Listserv distribution of TEI-L packets    Eds    Due: asap

They were also requested to solicit committee members to reply to some queries on TEI-L in lieu of replying themselves.

The advisory board should be asked for names of reviewers for WG papers.

    ? to dun AB for reviewer names    ?    Due: ?

A propos of software development, LB presented a draft letter to Richard Giordano of Manchester on that topic. It was approved as amended.

SH suggested that if Mellon funds could be allocated, she could see to a part-time person for dissemination of publicity and similar materials. NI suggested that funds for a graduate student to work on software issues be given to David Barnard or Frank Tompa. No action was taken.

6 DATE OF NEXT MEETING

The next meeting was set for 15-16 March in Tempe, Arizona.