Back to the Frontiers and Edges: Closing Remarks at SGML '92: the quiet revolution
Graphic Communications Association (GCA)
C. M. Sperberg-McQueen
29 October 1992

Note: This is a lightly revised version of the notes from which the closing address of the SGML '92 conference was given. Some paragraphs omitted in the oral presentation are included here; some extemporaneous additions may be missing. For the sake of non-attendees who may see this, I have added some minimal bibliographic information about SGML '92 talks referred to. I have not added bibliographic references for HyTime, DSSSL, etc. If you are reading this, I assume you already know about them, or know where to find out. -MSM

Section I INTRODUCTION

What a great conference this has been! We began with a vision of the future from Charles Goldfarb,(1) and since then have had a detailed tour of a lot that is going on in the present. I want to turn your attention forward again, and outward, back to the fringes and the edges of our current knowledge.

We've been hearing about a lot of projects in which the gains of SGML are being consolidated, implemented, put into practice. I want to talk about some areas in which I think there may still be gains to be made. Not surprisingly, some of those gains are at the periphery of our current concerns with SGML, in fringe applications, pushing the edge of the envelope. Not surprisingly, Yuri asked an academic to talk about them, because academics are by nature fringe people, and our business is to meddle with things that are already pretty good, and try to make them better.

In identifying some areas as promising new results, and inviting more work, there is always the danger of shifting from "inviting more work" to "needing more work" and giving the impression of dissatisfaction with the work that has been accomplished. I want to avoid giving that impression, because it is not true, so I want to make very clear: the questions I am posing are not criticisms of SGML.
On the contrary, they are its children: without ISO 8879, these questions would be very much harder to pose: harder to conceive of, and almost impossible to formulate intelligibly. SGML, that is, has created the environment within which these problems can be posed for the first time, and I think part of its accomplishment is that by solving one set of problems, it has exposed a whole new set of problems. Notation is a tool of thought, and one of my main concerns is to find ways in which markup languages can improve our thought by making it easier to find formulations for thoughts we could not otherwise easily have.

I start with the simple question: what will happen to SGML and to electronic markup in the future? Charles Goldfarb told us Monday: the future of SGML is HyTime. And this is true. HyTime is certain to touch most of us and affect our use of SGML in the coming years. But HyTime is already an international standard: it's part of the present. What will happen next? What should happen next?

What I will offer is just my personal view; it has no official standing and should be taken for what it's worth. It's an attempt to provide a slightly fractured view, a view slightly distorted in order to provoke disagreement and, I hope, some useful thought.

If you want to know what is going to happen with SGML and markup languages in the next few years, all you have to do is think about what happened in programming languages after the introduction of Cobol or Algol, and what happened in database management systems after the development of the Codasyl data model.

Section II THE MODEL OF THE PAST

The introduction of Cobol made possible vast improvements in programmer productivity and made thousands of people familiar with the notion of abstraction from the concrete machine. It is no accident that SGML is often compared to Cobol: it is having a similarly revolutionary effect. More suggestive to me however is the parallel between SGML and Algol.
Apart from the skill with which its designers chose their basic concepts, one of Algol's most important contributions was its clean, simply designed syntax. By its definition of its syntax, Algol made possible the formal validation of the program text, and thus rendered whole classes of programmer error (mostly typos) mechanically detectable, thus effectively eliminating them from debugging problems. Similarly SGML renders whole classes of markup and data-entry error mechanically detectable and thus eliminates them as serious problems. The notion of formal validity is tremendously important.

What happened after the introduction of Algol? Over a period of time, the intense interest in parsing and parser construction gave way to interest in the meaning of programs, and work on proofs of their correctness -- which I interpret as essentially an attempt to extend formal validation beyond the syntax of the language and allow it to detect logic or semantic errors as well, and thus eliminate further classes of programmer error by making them mechanically visible. Formal reasoning about objects requires a clean formal specification of those objects and their characteristics, so time brought serious work on the formal specification of programming language semantics.

In particular, work on type systems occupied a great deal of attention, because Algol had demonstrated that type errors can be mechanically detected for simple types. So a lot of people worked on extending those simple types and creating stronger, subtler, more flexible, more useful type schemes; from this work our current trend of object-oriented programming takes some of its strength.

All of these same issues arose in connection with database management systems after Codasyl. (No matter what Tim Bray said yesterday, this did happen well before 1978.)
The work of Codasyl in defining formally the characteristics of databases led to a generation of improved database systems, and eventually the increasing complexity of those systems led to the introduction of the relational model, whose simple concepts had a clean formal model firmly grounded in mathematics, which simplified reasoning about databases and their correctness and which led to substantial progress in database work, as Tim Bray described yesterday.(2)

The database work confirmed, fueled, and strengthened the conviction that formal validity and a rational set of data types are a useful investment. Equally important for our purposes, database work showed the importance of placing as much as possible of the burden of validation in the database schema definition and not in the application software that works with the data. If you have a logical constraint in your data, for example that the sum of columns DIRECT COST and INDIRECT COST not exceed the column PRICE TO CUSTOMER, or that the only colors you offer are GREEN and BLUE, it is better to define that constraint into the database schema, so it will be consistently enforced by the database server. You may be tempted to leave it out of the schema on the grounds that your application programs can enforce this constraint just as well as the server. And you are right -- in theory. In practice, as surely as the day is long, before the end of the year you and the two other people who were there will be transferred to new duties, your replacements will overlook the note in the documentation, the first thing they will do is write a new application which does not enforce this rule, and before another year is gone your database will be full of dirty data.

In other words, to paraphrase an old Chicago election adage, constrain your data early and often.
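
The point about putting constraints into the schema can be made concrete with a small sketch. It uses Python's standard sqlite3 module; the table name, the column names, and the try_insert helper are assumptions made for illustration, modeled loosely on the cost and color example above, and are not taken from any system mentioned in the talk.

```python
import sqlite3

# Hypothetical table and column names, modeled on the example in the text.
SCHEMA = """
CREATE TABLE product (
    name           TEXT NOT NULL,
    color          TEXT NOT NULL CHECK (color IN ('GREEN', 'BLUE')),
    direct_cost    REAL NOT NULL,
    indirect_cost  REAL NOT NULL,
    price          REAL NOT NULL,
    CHECK (direct_cost + indirect_cost <= price)
)
"""

def try_insert(conn, row):
    """Return True if the database accepts the row, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO product VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

ok = try_insert(conn, ("widget", "GREEN", 2.0, 1.0, 5.0))        # satisfies both rules
bad_color = try_insert(conn, ("widget", "RED", 2.0, 1.0, 5.0))   # color not offered
bad_cost = try_insert(conn, ("widget", "BLUE", 4.0, 3.0, 5.0))   # costs exceed price
row_count = conn.execute("SELECT COUNT(*) FROM product").fetchone()[0]
```

Because the rules live in the schema, any later application that writes to this table is held to them, whether or not its authors ever read the documentation.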
As hardware costs declined and programmer costs increased, portability became an increasingly prominent issue, and the semantic specification of languages, abstracting away from the specifics of individual machines, proved to be an invaluable tool in helping achieve it where possible and limit the costs where device-specific code was necessary.

Since the progress on formal semantics, though promising, did not yield mechanistic logic checkers as reliable as mechanistic syntax checkers, the years after Algol and Codasyl also saw the development of the notion of programming style, and attempts to define what constitutes a good program. At least some of these discussions appealed as much to aesthetic judgments as to empirical measures, which is as it should be, since aesthetics is a fairly reliable measure of formal simplicity and power.

Section III SGML PROBLEMS AND CHALLENGES

All of these problems are also visible in SGML and electronic text markup today, and my prediction, for what it is worth, is that they will occupy us a fair amount in the coming years. What is more, as you will have noticed in the course of the conference, they are already occupying us now. That is, the future is closer than you might think.

What problems will occupy us in this uncomfortably near future? The same ones that we saw in programming languages and database management:

* style
* portability
* a large complex problem I'll call "semantics", which includes problems of validation and type checking

III.1 Style

We saw the other day in Tommie Usdin's session One DTD Five Ways(3) how close we already are to developing consensus on DTD style. As for external, presentational details, Tommie remarked that there is already an implicit consensus. For details of construction and approach, she remarked, rightly I think, that there is no one answer, no context-free notion of "a good DTD". Our work in coming years is to work on clarifying a context-sensitive notion of "a good DTD".
When is it better to tag a profile of Miles Davis as a and when is it better to tag it
or even
? The answer is not, I suggest to you, as some were proposing the other day: namely that it's always better to tag it , but you may not always be able to afford it and so you may have to settle for or
. For production of the New Yorker, or for a retrieval system built specifically around the New Yorker, I personally would cer- tainly use the more specific tag. For a production system to be used by all Conde Nast or Newhouse magazines, however, I think the elements , , and so on would be problematic. Let's face it, Psychology Today and Field and Stream just do not have those as reg- ular departments. In building a 100-million word corpus of modern American English, it would similarly be a needless complication of the needed retrieval to provide specialized tags for each magazine and news- paper included in the corpus. One of the points of this whole exercise (i.e. SGML) is to reduce irrelevant variation in our data -- and rele- vance is context-dependent. Judging by the talks we have heard, those in this community will be building and customizing an awful lot of distinct DTDs in the coming years. One of our major challenges is to learn, and then to teach each other, what constitutes good style in them: what makes a DTD maintaina- ble, clear, useful. III.2 Portability Our second major challenge is, I think, portability. I can hear you asking "What?! SGML is portable. That's why we are all here." And you are right. Certainly, if SGML did not offer better portability of data than any of the alternatives, I for one would not be here. But if data portability is good, application portability is better. If we are to make good on the promises we have made on behalf of SGML to our superiors, our users, and our colleagues, about how helpful SGML can be to them, we need application portability. And for application port- ability, alas, so far SGML and the world around it provide very little help. Application portability is achieved if you can move an application from one platform to another and have it process the data in "the same way". 
A crucial first step in this process is to define what that way is, so that the claim that Platform X and Platform Y do the same thing can be discussed and tested. But SGML provides no mechanism for defining processing semantics, so we have no vocabulary for doing so.

DSSSL (ISO 10179, the Document Semantics and Style Specification Language) does provide at least the beginnings of that vocabulary. So DSSSL will definitely be a major concern in our future. We have seen another bit of the future, and it is DSSSL.

Section IV THE BIG PROBLEM

But the biggest problem we face, I think, is that we need a clear formulation of a formal model for SGML. If we get such a formal model, we will be able to improve the strength of SGML in several ways.

IV.1 SGML's Strengths

SGML does provide a good, clean informal model of document structure. Like all good qualitative laws, it provides a framework within which to address and solve a whole host of otherwise insoluble problems. For the record, my personal list of the crucial SGML ideas is:

* explicitly marked or explicitly determinable boundaries of all text elements
* hierarchical arrangement / nesting of text elements
* type definitions constraining the legal contents of elements
* provision, through CONCUR and the ID/IDREF mechanism, for asynchronous spanning text features which do not nest properly -- and here I want to issue a plea to the software vendors: Make my life easier. Support CONCUR!
* use of entity references to ensure device independence of character sets

Obviously there are a number of other features important to making SGML a practical system, which I haven't listed here. What I've listed are what seem to me the crucial elements in the logical model provided by SGML.
It seems to me that a properly defined subset of SGML focusing on these ideas and ruthlessly eliminating everything else, could go far in helping spread the use of SGML in the technical community, which is fre- quently a bit put off by the complexity of the syntax specification. I don't think a subset would pose any serious threat to the standard itself: use of a subset in practice leads to realizations of why fea- tures were added to the standard in the first place, and with a subset, the growth path to full use of the standard language is clearly given. Spreading the use of SGML among the technical community would in turn help ensure that we get the help we will need in addressing some of the challenges we face. IV.2 Semantics We commonly think of SGML documents as data objects, to be processed by our programs. I ask you now to participate for a moment in a thought experiment: what would the world be like, if our SGML documents were not documents, but programs? Our current programs for processing SGML documents would be compilers or interpreters for executing SGML pro- grams. What else? Well, first of all, we discover a tremendous gap: we have lost everything we used to know about programming language semantics, and we have no serious way of talking about the meanings of these SGML pro- grams. And for that matter, we have no serious way of talking about what happens when we compile or execute them. In other words, we have made our programs reusable (we can run the same program / document with different compilers) and so we can use just one programming language instead of many, and this is good, but it would be nice to have a clue about the semantics of the interpretations our compilers make of the language we are using. The clearest analogy I can think of to our situation is that in SGML we are using a language like Prolog, in which each program (document) has both a declarative interpretation and an imperative or procedural interpretation. 
If you ignore the procedural aspects of Prolog pro- grams, you can reason about them as declarative structures; if you attend to the procedural aspects, you can see what is going to happen when you run the program. The difference between Prolog and SGML is that Prolog has very straightforward semantics for both the declarative and the procedural interpretations, for which formal specifications are possible. In SGML, we have a very clear informal idea of the declarative meaning of the document, but not a very formal one. And we have no vocabulary except natural languages for talking about processing them. Ironically, it is not easy to say exactly what ought to be meant by the term semantics. Different people use it in different ways, and if it does have a specific, consistently used meaning in formal language studies, then the practitioners have kept it a pretty well guarded secret. So I can't tell you what semantics means; I can only tell you what I mean by it today. Imagine I am about to send you an SGML document. Included in this document are two elements I suspect you may not have encountered before: and . When I say I'd like to have a good specification of their semantics, I mean I would like to be able to tell you, in a useful way, what and mean, and what formal constraints are implied by that meaning. But we don't seem to know how to do that. The prose documentation, if there is any and if I remember to send it, may say what a is, or it may not. It may tell you what means, but if it does it may say only "full of vugs; the attri- bute TRUE takes the values YES, NO, or UNDETERMINED". Unless you are a geologist you probably don't know what a vug is, and if you are a geolo- gist you may harbor some justifiable skepticism as to whether I know and am using the term correctly. 
Even if my prose documentation does explain that a vug is an airhole in volcanic rock, and you know how to decide how many vugs make a rock vuggy, I have probably not succeeded in specifying what follows logical- ly from that meaning in any useful way -- probably not, that is, in a way that a human reader will understand and almost certainly not in a way that a validating application can understand and act upon. For example, how many people here realize, given our definition of , that the tag is incompatible with the tag -- since the definition of a vug is that it's an airhole in volcanic, i.e. igneous, rock. If you noticed that, congratu- lations. Are you right? I don't know: if some vuggy igneous rock is metamorphosed and the airholes are still there, is it still vuggy? I don't know: I'm not a geologist, I'm just a programmer. Is there a geologist in the house?(4) It would be nice to be able to infer, from the formal definition of , whether or not is incompatible with , just as we can infer, from the DTD, that is not valid, since the attribute true can only take the values YES, NO, and UNDETERMINED. Prose is not a guaranteed way of being able to do that. So what can we manage to do by way of specifying the "semantics" of and ? We don't seem to know how to specify meaning in any completely satisfactory way. What do we know how to do? Well, we can fake it. Or to put it in a more positive light, we can attempt to get closer to a satisfactory specification of meaning in several ways: IV.2.1 Prose Specification First, we can attempt to specify the meaning in prose. Specifications in prose are of course what most of our manuals pro- vide in practice. It is handy to formalize this as far as possible, to ensure consistent documentation of all the characteristics of the markup that we are documenting. 
We've heard obliquely about a number of sys- tems people use to generate structured documentation of SGML tag sets: Yuri Rubinsky mentioned one used internally by SoftQuad; Debby Lapeyre mentioned one; the Text Encoding Initiative (TEI) uses one; I am sure others exist too. This is already a live issue. And it will continue to occupy our attention in the coming years. Natural-language prose is, at present, the only method I know of for specifying "what something means" in a way that is intuitive to human readers. Until our colleagues in artificial intelligence make more progress, however, prose specifications cannot be processed automatical- ly in useful ways. IV.2.2 Synonymy Second, we can define synonymic relationships, which specify that if one synonym is substituted for another, the meaning of the element, whatever that meaning is, remains unchanged. If we didn't know in advance what and meant, we probably still don't know after being told they are synonyms. But knowing we can sub- stitute one for the other while retaining the meaning unchanged is nev- ertheless comforting. IV.2.3 Class Relationships Third, we can define class relationships, with inheritance of class properties. It doesn't tell us everything we might need to know, but if we know that a is a kind of glossary list, or a kind of marginal note, we would have some useful information, which among other things would allow us to specify fall-back processing rules for applications which haven't heard of s but do know how to process marginal notes. The fact that HyTime found it useful to invent the notion of archi- tectural form, the fact that the TEI has found it useful to invent a simple class system for inheritance of attribute definitions and content-model properties, both suggest that a class-based inheritance mechanism is an important topic of further work. 
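
The fall-back idea described above can be sketched in a few lines of Python. Everything named here is hypothetical: the element type names, the PARENT table, and the handler functions are invented for illustration and do not correspond to HyTime architectural forms or to the TEI class system. The sketch only shows how an application that has never heard of a specialized element type could still process it by walking up to a class it does know.

```python
# Hypothetical class table: each element type names its parent class (or None).
PARENT = {
    "marginal-note": None,
    "gloss-note": "marginal-note",  # assumed: a specialized kind of marginal note
    "para": None,
}

# Processing rules known to this particular (imaginary) application.
HANDLERS = {
    "marginal-note": lambda text: f"[margin: {text}]",
    "para": lambda text: text,
}

def find_handler(element_type):
    """Walk up the class chain until a type the application knows is found."""
    current = element_type
    while current is not None:
        if current in HANDLERS:
            return HANDLERS[current]
        current = PARENT.get(current)
    raise LookupError(f"no processing rule for {element_type!r} or its classes")

def process(element_type, text):
    """Apply the most specific rule available for this element type."""
    return find_handler(element_type)(text)

# The application has no rule for "gloss-note", so it falls back to "marginal-note".
fallback_result = process("gloss-note", "see chapter 2")
```

The class table carries no account of what a gloss note means; it records only that whatever is true of marginal notes for processing purposes is also true of gloss notes.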
IV.2.4 License or Forbid Operations

Fourth, we can define rules that license or forbid particular relations upon particular objects or types of objects. We may not know what a is, but we can know that it stands in relation X to the element , and we can know that no can ever stand in relation Y to any element of type .

In addition to relations, we can specify what operations can be applied to something: knowing that INTEGER objects can be added while DATE objects cannot, especially if one of the DATE objects is "in the reign of Nero", is part of what we mean when we say we understand integers and dates. An ability to define legal operations for SGML objects is a key requirement for using SGML in data modeling.

The definition of a data type involves both the specification of the domain of values it can take on and the specification of the operations which can apply to it. Because SGML has no procedural vocabulary it is very difficult to imagine how to specify, in SGML, the operations applicable to a data type. It would be useful to explore some methods of formal specification for legal operations upon SGML objects.

But note that "what it can do" and "what can be done to it" are not, really, specifications of "what it means". Moreover, object-oriented specifications cannot be exhaustive. In an application program, if an operation P is not defined for objects of type Q, it counts as a claim that operation P is illegal for such objects. Even if it's not illegal, you aren't going to get anywhere by trying to call it, so it might as well be illegal. In SGML, with our commitment to application independence, that isn't the case. If no definition of addition for DATE objects is provided, that could mean that it is semantically invalid: dates can never be added. Or it could mean that we just haven't got around to it yet, or haven't thought about it yet.
So the absence of a method for performing an operation doesn't tell us whether the operation is or should be legal upon a particular type of object.

Obviously, instead of leaving operations undefined, we could specify explicitly that certain operations are illegal for objects of a certain class. But it is not feasible to make a list of all the things that cannot be done to DATES, or BLORTS, or GRANFALLOONS, because the list is likely to be infinite.

Nevertheless, as a way of approaching the formal description of applications, object-oriented work is very promising. It's fairly obvious that in the future we need to work together with those people developing the object-oriented programming paradigm.

IV.2.5 Axiomatic Semantics

Fifth, we can specify in some logical notation what claims about the universe of our document we can make, given that it is marked up in a certain way, and we can define what inferences can be made from those claims. The synonymic relations I was talking about a moment ago are just a special case of this.

Formal logic (i.e. first-order predicate calculus) certainly makes possible the kinds of inference I've been talking about, but even predicate calculus makes some concessions to the difficulty of the problem. I can infer that this value for this attribute and that value for the other one are consistent, inconsistent, etc. But since Frege and Russell and Whitehead, logic has treated itself as a purely formal game divorced from meaning; the only relation to the real world is by way of models which involve assigning meanings to the entities of the logical system and seeing which sentences of the logical system are true under these interpretations. The problem is that "assign a meaning in the real world to an entity or operation of the logical system" is taken as a primitive operation and thus effectively undefined. We all know how to do this, right? We can't define semantics, but we know it when we see it.
In work on declarative semantics, we can learn a lot from recent experience with logic constraint programming and declarative programming. The declarative approach to SGML semantics has a certain appeal, both because it fits so well with the perceived declarative nature of SGML as it is, and because declarative information is useful. As John McCarthy said in his Turing Award lecture, "The advantage of declarative information is one of generality. The fact that when two objects collide they make a noise can be used in a particular situation to make a noise, to avoid making a noise, to explain a noise, or to explain the absence of noise. (I guess those cars didn't collide, because while I heard the squeal of brakes, I didn't hear a crash.)"

One worry about declarative semantics is that it might prove difficult to define processing procedures in a declarative way. But in fact it is possible to specify procedures declaratively, as Prolog, logic constraint languages, and the specification language Z show us.

So I think a formal, axiomatic approach of some kind is very promising. But let's be real: it is very unlikely from a description of the tag set in first-order predicate calculus that you or I, let alone the authors we are working with, will understand what a is, or even what a is.

IV.2.6 Reduction Semantics

Finally, I should mention one further method of formal semantic specification: reduction semantics. Reduction works the way high-school algebra works. One expression (e.g. "(1 + 2) + 5") is semantically equivalent to that expression ("3 + 5"), that one to this other one ("8"), and so on. If you work consistently toward simpler expressions, you can solve for the value of X. There has been substantial work done on reduction semantics in programming languages, including LISP and more purely functional languages like ML.
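
The algebra example can be worked through mechanically. The following Python sketch is an illustration only: the representation of expressions as nested tuples and the function names step and reduce_fully are assumptions of the sketch, and only addition is modeled. It rewrites one innermost sum at a time, recording each intermediate expression.

```python
def step(expr):
    """Perform one reduction step on a nested ("+", left, right) expression."""
    if isinstance(expr, int):
        return expr                      # already fully reduced
    op, left, right = expr
    if isinstance(left, int) and isinstance(right, int):
        return left + right              # only "+" is modeled in this sketch
    if not isinstance(left, int):
        return (op, step(left), right)   # reduce inside the left operand first
    return (op, left, step(right))

def reduce_fully(expr):
    """Return the list of expressions visited on the way to a plain value."""
    trace = [expr]
    while not isinstance(trace[-1], int):
        trace.append(step(trace[-1]))
    return trace

# "(1 + 2) + 5" reduces to "3 + 5" and then to "8".
trace = reduce_fully(("+", ("+", 1, 2), 5))
```

Each entry in trace is equivalent to the one before it, which is all that a reduction semantics claims: it says which expressions may stand in for which others, not what any of them means in the world.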
Moreover, reduction semantics doesn't have to be defined in terms of string expressions: it is entirely possible to define reduction seman- tics in terms of trees and operations upon trees. Take a simple example: if we have an element whose content model is "B+", does the order of s matter? In SGML there is no way of say- ing yes or no. Reduction semantics allows you to say that this tree (gesture) Apples ... Oranges ... is the same as that tree (gesture) Oranges ... Apples ... so sequence is not important. Or that they are not the same, so sequence is significant. We have a good example of this type of work in the paper "Mind Your Grammar" and the grammar-based DB work at the university of Waterloo by Frank Tompa and Gaston Gonnet.(5) I think this is a very important field for further work. In summary: we have at least six areas to explore in trying to work on better semantic specification for SGML: structured documentation (the kind of thing SGML itself is good at), synonymy, classes, operation definitions, axiomatic semantics, and reduction semantics. I don't know whether these activities would constitute the specifica- tion of a semantics for SGML and for our applications, or only a substi- tute for such a specification, in the face of the fact that we don't really know how to say what things mean. Certainly no lexicographer, no historical linguist, would feel they constituted an adequate account of the meaning of anything. And yet I suspect that these activities all represent promising fields of activity. IV.3 Validation and Integrity Checking A formal model would make it possible to formulate cleanly many of the kinds of constraints not presently expressible in SGML. This is by no means an exhaustive or even a systematic list, but at least all the problems are real: * If an attribute SCREEN-TYPE has the value BLACK-AND-WHITE, the attribute COLOR-METHOD almost certainly should have the value DOES-NOT-APPLY. 
But this kind of constraint on sets of attribute values is impossible to specify for SGML attributes. It would cer- tainly be useful sometimes to be able to define co-occurrence con- straints between attribute values. * Similarly, there are cases where one would like to constrain element content in a way I don't know how to do with content models. We have heard repeatedly in this conference about revision and version control systems which allow multiple versions of a document to be encoded in a single SGML document. For example, one might have a element which contains a series of elements. The TEI defines just such a tag pair. At the moment our element can contain only character and phrase elements. It would be nice to allow it to operate as well upon the kind of SGML-element- based deltas that Diane Kennedy described the other day for revision info, in which the unit of a revision was always an SGML element. If a change is made within a paragraph, the entire paragraph is treated as having been changed, and versioning consists in choosing the right copy of the paragraph.(6) But one would like to be able to specify that if the first element contains a

element, the second had better contain one as well, and not a whole new subsection or just a phrase. Otherwise, the SGML document produced as output from a version-selection processor would not be parsable.
* It would be nice to be able to require that an element be a valid Gregorian date, or a valid ISO date, or a valid part number, etc., etc.
* It would be nice to be able to require character data to appear within a required element: i.e. to have a variant on PCDATA whose meaning is "character+" and not "character*" -- or even to require a minimum length, as for social security numbers, phone numbers, or zip codes.
* The SGML IDREF is frequently used as a generic pointer. Many people wish they could do in SGML what we can do in programming languages, and require a given pointer to point at a particular type of object. (The pointer in a had better be pointing at a figure, or the formatter is going to be very unhappy.)
* Similarly, it would be nice to have a type system that understood classes and subclasses. The only reason we face this nasty choice between using the tag and using the tag

for the New Yorker's "Goings on Around Town" section is that we have no way to make a processor understand that and and so on are just specialized versions of
or
. If we use the specialized tags, and want to specify an operation upon all sections of all magazines, we must make an exhaustive list of all the element types which are specializations of
. To be sure, our application systems can handle this. But we want to constrain early and often. And never constrain in your application what you could constrain in the DTD.

Section V CONCLUSION: WHY BOTHER?

I suppose you can sum up my entire talk today this way. We want to constrain our data early and often. To do this, we need better validation methods. To express the validation we need, we need a clean formal model and a vocabulary for expressing it. The query languages described yesterday are not the final word but they are a crucial first step.

Why do we want to do all these things? Why bother with formal specification? Because formal specification and formal validation are SGML's great strengths.

Why is it, as Charles Goldfarb said on Monday, that SGML allows us to define better solutions than the ad hoc solutions built around a specific technology? It is because SGML provides a logical view of problems, not an ad hoc view based on a specific technology. Naturally, it seems to suit the technology less well than the ad hoc approach. But when the underlying technology changes, ad hoc solutions begin to fit less well, and look less like ad hoc solutions, and more like odd hack solutions.(7)

But we can improve SGML's ability to specify the logical level of our data and our applications. And so we should. A logical view is better than a technology-specific view. And so we should welcome every effort to improve the tools available to use in defining our logical view. In this connection I could mention again the work by Gonnet and Tompa on large textual databases, and the work of Anne Brüggemann-Klein which is occasionally reported on the Netnews forum comp.text.sgml.

Success in improving our logical view of the data is what will enable the quiet revolution called SGML to succeed.

And now I hope you'll join me in thanking Yuri Rubinsky, for organizing this conference and for allowing all of us co-conspirators in the revolution to get together and plot.
-------------------------
(1) Charles Goldfarb, "I Have Seen the Future of SGML, and It Is ..." (Keynote, SGML '92), 26 October 1992.
(2) Tim Bray, "SGML as Foundation for a Post-Relational Database Model," talk at SGML '92, 28 October 1992.
(3) B. Tommie Usdin (et al.), "One Doc -- Five Ways: A Comparative DTD Session," (panel discussion of five sample DTDs for the New Yorker magazine), SGML '92, 27 October 1992.
(4) There was; metamorphic rock can be vuggy too, so the initial definition was too narrow. -MSM
(5) Gaston H. Gonnet and Frank Wm. Tompa, "Mind Your Grammar: a New Approach to Modelling Text," in Proceedings of the 13th Very Large Data Base Conference, Brighton, 1987.
(6) Diane Kennedy, "The Air Transport Association / Aerospace Industries Association, Rev 100", talk at SGML '92, 28 October 1992.
(7) I owe this pun to John Schulien of UIC.

November 2, 1992 (14:34:56)