Received: from UKACRL.BITNET by UICVM (Mailer R2.03B) with BSMTP id 6981; Sun, 15 Oct 89 12:36:58 CST
Received: from RL.IB by UKACRL.BITNET (Mailer X1.25) with BSMTP id 3677; Sun, 15 Oct 89 18:36:29 BST
Received: from RL.IB by UK.AC.RL.IB (Mailer X1.25) with BSMTP id 5357; Sun, 15 Oct 89 18:36:28 BS
Via: UK.AC.OX.VAX; 15 OCT 89 18:36:24 BST
Date: Sun, 15 Oct 89 18:36 BST
From: Lou Burnard
To: U35395@UICVM
Subject: Its That

How's this for a combination of your talk with mine? You won't like the tags in this one either. There's only three: for major title, for second level and for third level. There may also be a page throw (aargh) in between overheads.

---------------------------------------------------:

Aunt Sally says, ok now what about all the other grouses? (grice?) [can't say I like the implicatures of *that* plural]

time for tea

Lou

-----------------

Which `Text Processing Community'?
  • researchers
  • publishers and industry
  • software developers
  • data archivists
  • funding agencies

What is the TEI?
  • Sponsoring bodies
      Association for Computers and the Humanities (ACH)
      Association for Literary and Linguistic Computing (ALLC)
      Association for Computational Linguistics (ACL)

Funding & institutional support
      Commission of the European Communities
      Mellon Foundation (US)
      National Endowment for the Humanities
      BELLCORE
      University of Illinois at Chicago
      University of Oxford
      Università di Pisa
      Vassar College

History
      Nov 1987: Vassar Conference
      Feb 1989: Advisory Board met
      March 1989: Committees constituted
      June 1989: First public meeting
      June 1990: Draft Guidelines
      June 1991: Interim Draft
      June 1992: Published Guidelines

Structure
      Steering Committee
      Four working committees
      Two Editors
      Advisory Board
      Affiliated Projects

Advisory Board
      American Anthropological Association
      American Historical Association
      American Philological Association
      American Philosophical Association
      American Society for Information Science
      ACM SIG for Information Retrieval
      Association for Documentary Editing
      Association for History and Computing
      Association Internationale Bible et Informatique
      Canadian Linguistic Association
      Dictionary Society of North America
      Electronic Publishing SIG
      International Federation of Library Associations and Institutions
      Linguistic Society of America
      Modern Language Association

Working Committees
      Text Documentation
      Text Representation
      Text Analysis and Interpretation
      Syntax and Metalanguage

Affiliated Projects
  • Will be of major importance during second cycle
  • Currently affiliated
      Tudor Text Base (Otago, Toronto)
      Project Perseus (Harvard)
      Women Writers Project (Brown)
      ACL Data Collection Initiative
      ARTFL (Chicago, Nancy)
      Milton Database (Oxford, Ohio, Bangor)
      Bilingual Dictionary project (Vassar, Marseilles)

Text Documentation Committee
  • (chair M. Sperberg-McQueen, Univ Illinois at Chicago)
      Library and archive management expertise
      Cataloguing and documentation of encoded texts
      Build on existing schemes e.g. ISBD, SSD...

Text Representation
  • (chair S. Johansson, Univ Oslo)
      Expertise in analysis of continuous texts
      Tagsets for structural features of running texts for which typographic or written conventions already exist
      Electronic representations of existing printed & ms conventions
      Working Groups: language and character sets, genre specific tags, core structural tags

Text Analysis and Interpretation
  • (chair T. Langendoen, Univ Arizona)
      Discipline-specific analytic tag-sets
      Initial focus on linguistic issues
      Working Groups: dictionaries; phonology; morphology; syntax

Syntax and Metalanguage
  • (chair D. Barnard, Queen's Univ, Ontario)
      Expertise in formal language theory and SGML
      SGML conformance issues including subset recommendations
      DTD creation
      Description and transduction of other encoding schemes
      Extensive bibliography

The `Poughkeepsie Principles'
  • The TEI Guidelines should...
      specify a common interchange format for machine readable texts
      provide a set of recommendations for encoding new textual materials
      document the major existing encoding schemes, and investigate the feasibility of developing a metalanguage in which to describe them
      be a set of guidelines, not a set of rigid requirements
      be extensible
      be device- and software-independent
      be language-independent
      be application-independent

Compromises needed...
  • between needs of lone researcher and well-resourced complex projects
  • between formal rigour and ease of use
  • between standardising how and standardising what

General Markup Problems
  • Agreeing a common vocabulary
  • Giving parity to overlapping views
  • Are textual views application-independent?

Specific SGML Problems
  • need to describe both the rendering and the possibly many structural interpretations of it
  • multiple hierarchies are the rule not the exception
  • descriptive markup assumes you know what you are describing
  • need to validate DTD against document at least as often as vice versa

Sources of theoretical discontent
  • attributes vs. embedded elements
  • when is subordination semantically significant?

Sources of practical concern
  • Lack of software
  • Most existing applications are in one area only
  • Previous efforts (e.g. AAP) signally unsuccessful
  • Lack of public awareness

Why should industry bother?
  • Taking SGML into previously unswept corners
  • Foregrounds separation of text from paper
  • Problems are independent of application
      version control MLE variorum
      subject indexing MLE allusion
      different PDL models MLE polytheoretic models

Why should the language industries bother?
  • centrality of lexical resources
  • need to engage with realistically scaled projects
  • centrality of linguistic issues to markup

Why should researchers bother?
  • what is scholarly about DTP?
  • economics of co-operation
  • markup as a branch of hermeneutics

For more information...
  • TEI-L@UICVM.EARN