Report on Technical Seminar in Lisbon
C. M. Sperberg-McQueen
24 July 1994
The editors of the TEI have just returned from Lisbon, where
they gave a two-and-a-half-day technical seminar or workshop on SGML and the TEI
for a research group at the New University, with interests focused on
terminological databases, dictionaries, and corpus-based lexicography and
lexicology. For the work group, the workshop represented a chance to learn more
about SGML and the Text Encoding Initiative. For the editors, it provided an
opportunity to experiment with the design of a multi-day workshop for specialist
users.
The first afternoon of the workshop, after we had clarified
organizational questions, was devoted to an introduction to SGML and to the
overall architecture of the TEI. We began 'in medias res' by examining together
two sample SGML documents (a very simple 'Hello, world!' document, which we
examined in an ASCII editor, and a more realistic, though still relatively
simple, document in TEI markup, containing the proposed syllabus for the
workshop, which we examined both in an ASCII editor and in an SGML editor that
provided nice on-screen formatting for the document). Following this, LB
presented some basic philosophical and practical observations on the nature and
necessity of markup, and the design and goals of the TEI document type
definitions. MSM then outlined the process of document analysis with which any
serious project in electronic text encoding needs to begin, and we ended the day
by applying that process to a sample text from the text-base being built in
Lisbon for terminological work.
On the second day, we began with a description of corpus
construction and TEI facilities for corpus planning and documentation, with a
side view on linguistic annotation as practiced by current corpora such as the
British National Corpus. The rest of the morning went to a survey of the TEI tag
set for terminological databases, which is the basis of current work in ISO
Technical Committee 37, and of fairly direct relevance to the work of the host
research group. After lunch, we continued by tagging, with TEI markup, a portion
of the document we had analysed the previous afternoon, and then the
participants in the seminar were turned loose on machines equipped with
Author/Editor and a selection of pre-compiled versions of the TEI DTD.
The final day of the workshop was devoted to dictionary
encoding, with examples from Portuguese and French dictionaries, to a
demonstration of SARA, the SGML-aware interactive concordance software being
developed for the British National Corpus, and to more hands-on work. We
finished with a plenary discussion of the workshop, in which the participants
gave the editors a number of very useful suggestions, which will, we hope,
benefit participants in future workshops.
We thank the research group and in particular Prof. Theresa
Lino for their invitation and kind hospitality --- and for their patience with
our non-existent Portuguese and imperfect French (on average, that is: LB's
French is impeccable, MSM's is, well, highly peccable). Thanks are also due to
SoftQuad, for a set of temporary licenses for Author/Editor, and to the British
National Corpus for authorizing the demonstration of SARA.
Research groups, professional societies, or others interested
in organizing workshops on the use of SGML and the TEI in their particular
fields should contact the editors. We are planning a workshop for
this fall, in which we will train a number of individuals to teach such
workshops, and we hope to be in a position to accommodate as many such requests
as humanly possible in the next couple of years.