On Lexical Ambiguity

Document Number: TEI-AI-W-21
March 24, 1990

D. Terence Langendoen
Department of Linguistics
University of Arizona
Tucson, AZ 85721

Suppose we wish to mark up the example:

     Wash sinks.

simply for lexical ambiguity, using a standard feature-structure interpretation of the Lund corpus markup tags. First, we can give it a textual markup along the lines of my proposal in TEI-AI-W-20, as follows (again, details concerning character markup are omitted).

<s id=s1>
<w id=w1> Wash </w> &er;rbl;
<w id=w2> sinks </w>
<c id=c11> . </c>
</s>

Pointers to the lexicon are not put in here, since they are to be considered part of the analysis.

Next, we provide an analysis for each word. I propose that multiple analyses be grouped under a single tag, for which I suggest the name "analysis-list". This tag should be allowed to have attributes which specify whether the alternatives are ranked and, if so, the basis for the ranking. Each analysis is tagged with "analysis", as in my earlier proposals. A "rank" attribute on this tag specifies the ranking; it takes numerical values, with rank=1 being the highest rank, rank=2 the next highest, etc., and rank=0 meaning that this interpretation should be ignored.

I assume that each analysis is a single feature-structure, tagged with "fstruct", which is permitted, besides its own ID attribute, at least one IDREF attribute that points to an entry in the lexicon. I will call this latter attribute "lexp", as in TEI-AI-W-20.

I also assume entity definitions which have the following effect. (I don't specify these in SGML because I don't quite know how.)

Entity         Feature specification
______         _____________________
&er;noun;      <feature><fname>cat</fname><fstruct>noun</fstruct></feature>
&er;common;    <feature><fname>subcat</fname><fstruct>common</fstruct></feature>
&er;proper;    <feature><fname>subcat</fname><fstruct>proper</fstruct></feature>
&er;unmarked;  <feature><fname>subsub</fname><fstruct>unmarked</fstruct></feature>
&er;plural;    <feature><fname>subsub</fname><fstruct>plural</fstruct></feature>
&er;verb;      <feature><fname>cat</fname><fstruct>verb</fstruct></feature>
&er;main;      <feature><fname>subcat</fname><fstruct>main</fstruct></feature>
&er;base;      <feature><fname>subsub</fname><fstruct>base</fstruct></feature>
&er;s-form;    <feature><fname>subsub</fname><fstruct>s-form</fstruct></feature>
&er;NC;        &er;noun;&er;common;&er;unmarked;
&er;NC2;       &er;noun;&er;common;&er;plural;
&er;NP;        &er;noun;&er;proper;&er;unmarked;
&er;VA0;       &er;verb;&er;main;&er;base;
&er;VA3;       &er;verb;&er;main;&er;s-form;

Here is a representation of the various analyses.

<analysis-list id=w1 ranking=yes>
  <analysis rank=1>
    <fstruct lexp=e2> &er;NC; </fstruct>
  </analysis>
  <analysis rank=0>
    <fstruct> &er;NP; </fstruct>
  </analysis>
  <analysis rank=2>
    <fstruct lexp=e2> &er;VA0; </fstruct>
  </analysis>
</analysis-list>

<analysis-list id=w2 ranking=yes>
  <analysis rank=1>
    <fstruct lexp=e1> &er;VA3; </fstruct>
  </analysis>
  <analysis rank=2>
    <fstruct lexp=e1> &er;NC2; </fstruct>
  </analysis>
</analysis-list>

Finally, the lexicon is as follows, where, as before, the "lexicon" tag simply indicates where the lexicon section starts.

<lexicon>
<entry id=e1> sink </entry>
<entry id=e2> wash </entry>

------------------------------------------------------------------------
Terry Langendoen              phone:    (+1 602) 621-6898
Department of Linguistics     bitnet:   langendt@arizvm1
University of Arizona         internet: langendt@arizvm1.ccit.arizona.edu
Tucson, AZ 85721 USA          fax:      (+1 602) 621-9424
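
As an illustration only, the scheme above can be modeled in a few lines of Python. The sketch below is not part of the proposal and is not SGML. It assumes that each atomic entity contributes one feature-value pair, that each composite entity expands to a sequence of atomic ones, and that rank=0 analyses are dropped before the rest are ordered by rank. All names in it (FEATURES, TAGS, Analysis, expand, ranked, best) are hypothetical.

```python
# Illustrative model of the entity table, analysis lists, and lexicon.
# Every name here is hypothetical; none comes from the proposal itself.
from dataclasses import dataclass
from typing import Optional

# Atomic entities: each one stands for a single feature-value pair.
FEATURES = {
    "noun": ("cat", "noun"),
    "verb": ("cat", "verb"),
    "common": ("subcat", "common"),
    "proper": ("subcat", "proper"),
    "main": ("subcat", "main"),
    "unmarked": ("subsub", "unmarked"),
    "plural": ("subsub", "plural"),
    "base": ("subsub", "base"),
    "s-form": ("subsub", "s-form"),
}

# Composite entities: each one expands to a sequence of atomic entities.
TAGS = {
    "NC": ["noun", "common", "unmarked"],
    "NC2": ["noun", "common", "plural"],
    "NP": ["noun", "proper", "unmarked"],
    "VA0": ["verb", "main", "base"],
    "VA3": ["verb", "main", "s-form"],
}


@dataclass
class Analysis:
    rank: int                   # 1 is highest; 0 means "ignore"
    tag: str                    # composite entity name, e.g. "NC"
    lexp: Optional[str] = None  # pointer to a lexicon entry, if any


# The two analysis lists and the lexicon from the example.
ANALYSES = {
    "w1": [Analysis(1, "NC", "e2"), Analysis(0, "NP"), Analysis(2, "VA0", "e2")],
    "w2": [Analysis(1, "VA3", "e1"), Analysis(2, "NC2", "e1")],
}
LEXICON = {"e1": "sink", "e2": "wash"}


def expand(tag):
    """Expand a composite entity into a feature structure (a dict)."""
    return dict(FEATURES[name] for name in TAGS[tag])


def ranked(analyses):
    """Drop rank=0 analyses and order the rest from highest rank down."""
    live = [a for a in analyses if a.rank != 0]
    return sorted(live, key=lambda a: a.rank)


def best(word_id):
    """Return (lexicon form, feature structure) of the top-ranked analysis."""
    top = ranked(ANALYSES[word_id])[0]
    return LEXICON.get(top.lexp), expand(top.tag)
```

Under these assumptions, best("w1") selects the common-noun reading of "wash" and best("w2") selects the s-form verb reading of "sink", matching the rank=1 analyses given above.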