Received: from ARIZVM1.CCIT.ARIZONA.EDU by UICVM (Mailer R2.07) with BSMTP id 6007; Fri, 18 Jan 91 13:11:52 CST
Received: from ARIZVM1 (SMTP) by ARIZVM1.CCIT.ARIZONA.EDU (Mailer R2.07) with BSMTP id 6905; Fri, 18 Jan 91 12:03:11 MST
Received: from ARIZVM1.CCIT.ARIZONA.EDU by ARIZVM1.CCIT.ARIZONA.EDU (IBM VM SMTP R1.2.2MX) with BSMTP id 0728; Fri, 18 Jan 91 12:03:06 MST
Date: Fri, 18 Jan 91 12:02:59 MST
From: "Terry Langendoen"
To: u35395@uicvm

To: Terry Langendoen, copies to other members of TEI grammatical classification working group
From: Geoff Sampson

You said that word-classifying schemes were a topic you wanted to take up at Baltimore. To help get the ball rolling, it might be useful to show you a copy of the table my research group has evolved for English, and I am going to ask my colleague Robin Haigh if he can transmit a copy -- I think this should be possible electronically. This is an extended version of the "Appendix B" tables in Garside/Leech/Sampson, _The Computational Analysis of English_, but it goes quite a lot further than that 1987 book because subsequent experience led us to make many further grammatical distinctions (for instance "formulae", which was originally a single catch-all category for character-sequences that weren't "words" in the ordinary linguistic sense, is now divided up into mathematical formulae, postcodes, etc.). The table we'll send is oriented towards written language, but we have since begun to evolve schemes for classifying the distinctive kinds of word characteristic of spoken language.
The table as you will receive it is a two-dimensional array in which the columns represent different classifying schemes that are or have been in internal use in our group, and some columns use a common symbol for groups of words that are distinguished in another column; but the rows represent a "maximally differentiated" set of word classes, and it is the headings of the rows which I am really offering as a contribution to the discussion. I don't remember how many categories there are, but it is in the hundreds. Intellectual property rights are controlled by the ESRC, so the work is relatively open and unencumbered.

I would add, incidentally, that lying behind the bare headings of our word-classifying schemes there are, in some cases, discussions that take many pages to reproduce. One problem that exercised us for weeks recently, for instance, is how to classify proper names -- for automatic language processing it would be a pity for the system not to "know" that "Johnson" is a surname, "Mary" a female Christian [American "first"] name, "France" a country name, etc., but our first attempts to draw up a commonsense set of categories of this kind and apply them to corpus data led to hopeless inconsistencies, and we had to go back to the beginning and change the basis of our categorization before we had a system that could be applied in a reasonably predictable fashion to almost every name that crops up.

Also: I assume you already know the "Manual" accompanying the Tagged LOB Corpus, which is a short-book-length analysis of problems arising in applying the LOB wordtags in a consistent, predictable fashion to that million-word corpus. We have in some respects gone beyond that classification system, but we were only able to go beyond it because we used it as a starting point.
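One possible way to read the two-dimensional layout described above is sketched here in Python. It is only a minimal illustration, assuming one table row per line with all ten cells present and separated by whitespace; the names `SCHEMES`, `parse_row` and `conflated` are hypothetical, and the four sample rows are copied from the table. It shows how a column that uses a common symbol groups together word classes that another column distinguishes.

```python
from collections import defaultdict

# Column abbreviations in the order of the table's header row.
SCHEMES = ["SUE", "CLS2", "GLS", "APR", "TB", "LDS", "OSLO", "CLS1", "BRN", "GOTH"]


def parse_row(line):
    """Split a row into ({scheme: tag}, description).

    Assumption: all ten cells are present. A row with a blank cell
    cannot be parsed reliably this way.
    """
    fields = line.split(None, len(SCHEMES))
    if len(fields) <= len(SCHEMES):
        raise ValueError("expected ten tag cells followed by a description")
    return dict(zip(SCHEMES, fields[:-1])), fields[-1]


def conflated(rows, scheme):
    """Group row descriptions by the tag that one scheme assigns to them."""
    groups = defaultdict(list)
    for tags, description in rows:
        tag = tags[scheme]
        if tag.endswith("??") or tag in ("<->", "<>", "<+>"):
            continue  # placeholder or guessed cell, not a settled tag
        groups[tag].append(description)
    return dict(groups)


# Four rows copied from the table.
SAMPLE = [
    'CS CS CS CS CS CS CS CS CS Z subordinating conjunction',
    'CSW CSW CSW CSW CSW CS CS CS CS Z "whether" as s.c.',
    'CSg CS CS CS CS CS CS CS CS Z "though" as s.c.',
    'CSi CS CS CSI CSI CS CS CS CS Z "if"',
]
rows = [parse_row(line) for line in SAMPLE]

for scheme in ("SUE", "APR", "OSLO"):
    print(scheme, conflated(rows, scheme))
```

On these four rows the SUE column keeps four distinct tags, APR keeps three, and OSLO uses the single tag CS for all of them.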
------------cut here---------------------
#
#
#
#   Col 1  - SUSANNE
#   Col 2  - CLAWS2, AP Corpus
#   Col 3  - "Lancaster" published set (GLS)
#   Col 4  - APRIL model
#   Col 5  - Treebank 4.1
#   Col 6  - Treebank 3.0 - (GLS "Leeds" set)
#   Col 7  - Oslo LOB (LOB itself, Orange Book)
#   Col 8  - CLAWS1, Lancaster Spoken English Corpus (*11)
#   Col 9  - Brown (latest version)
#   Col 10 - Gothenburg
#
#   GLS - Garside, Leech, Sampson (eds): The Computational Analysis of English,
#         appendix B
#
#
#
#   Authorities
#
#   Col 1  - Anything unresolved here is awaiting SUSANNE decision.
#   Col 2  - Lancaster practice - CLAWS2 wordlist and AP Corpus and whatever else.
#            Our data is incomplete here and some things will remain unsettled.
#   Col 3  - GLS is definitive. Anything not specified in GLS is undefined.
#            This column exists to summarise the published stuff under "Lanc" in GLS,
#            irrespective of Lancaster practice as evidenced by anything else,
#            so that changes between SUSANNE and the book can be noted.
#            However, I take it that Orange Book is relevant where GLS implies
#            no change from LOB practice.
#   Col 4  - that's mine. Fresh daily.
#   Col 5  - Treebank itself, current version. Should be identical to col 4,
#            but temporarily out of sync.
#   Col 6  - GLS fights it out with the tb3.0 tape. If they disagree violently,
#            we'll have to make two "Leeds" columns...
#   Col 7  - Orange Book - LOB itself to clarify if necessary
#   Col 8  - ditto
#   Col 9  - Tagged Brown manual, Tagged Brown itself. Much of this column not
#            filled in yet.
#   Col 10 - Ellegard's paper and Gothenburg Corpus itself. (These can't
#            disagree, the paper is too vague.)
#
#
#   Some terms have specific meanings for the purpose of this list.
#   'Qualifier' means adverb modifying adjective or adverb
#   'Degree adverb' means clause adverb (not qualifier) expressing degree
#   'Capitalized' means always used with a capital irrespective of
#       syntactic context.
#   'Asyndetic coordinator' (a.c.) (aka CGEL "conjunct") means adverb used
#       to introduce asyndetically coordinated clause
#   'Interrogative' includes exclamatory ("How easy it is") (Is this right ??)
#   'Subjunctive' means marked as such, i.e. uninflected form in 3rd person singular
#   'Debitory' refers to the construction "have to go" (*73)
#   Separate note somewhere on C,M,P,O,L,S sense classification for nouns.
#
#
#   ??     denotes not yet filled in
#   X??    denotes X guessed (may be wild guess)
#          denotes unascertainable on currently available (incomplete) information
#   <->    denotes "known to be undefined" (cases which are not covered by schemes
#          for which no fuller statement exists or will exist)
#   <>     denotes not applicable owing to variation in "word" boundaries
#   <+>    denotes "could be just about anything" - category dispersed
#
#
#   Preliminary version: filled in from published lists
#   and from examination of corpora, but not checked and confirmed.
#
#
#
# 1   2    3   4   5  6   7    8    9   10
# SUE CLS2 GLS APR TB LDS OSLO CLS1 BRN GOTH
#
#
GG $ $ GG GG <> <> <> <> <> Germanic genitive inflection, "'" or "'s"
EX EX EX EX EX EX EX EX EX X "there", existential
TO TO TO TO TO TO TO TO TO U "to" with infinitive
UH UH UH UH UH UH UH UH UH I interjection - "hello", "no" etc
XX XX XX XNOT XNOT XNOT XNOT XNOT * G "not"
AT AT AT ATI ATI ATI ATI ATI AT T "the" as determiner
AT AT CSH ATI ATI ATI ATI ATI ?? T "the" introducing comparisons
AT1 AT1 AT1 AT AT AT AT AT AT F "an"; "a" as article
AT1e AT1 AT1 ATE ATE AT AT AT AT Q "every"
ATn AT AT ATN ATN ATI ATI ATI AT Q "no" as determiner
BTO BTO BTO HH HH HH <> <> <> <> "in order" preceding infinitive
BTOz BTO BTO HH HHZ HH <> <> <> <> "so as" preceding infinitive
<> <> <> <> <> <> TO TO <> <> "in order to", "so as to"
<> <> BCS <> <> <> <> <> <> <> "in order" before "that"
<> <> BCS <> <> <> <> <> RB A "even" before "if"/"though"
LE LE LE ABX ABX ABX ABX ABX ABX Y "both" as pre-coordinator
LEe LE LE DTX DTX DTX DTX DTX DTX Y "either" as pre-coordinator
LEn LE LE DTX DTX DTX DTX DTX DTX Y "neither" as pre-coordinator
CC CC CC CC CC CC CC CC CC Y coordinating conjunction
CCr CC CC CC CC CC CC CC CC Y "or"
CCn CC CC CC CC CC CC CC CC Y "nor"
CCB CCB CCB CCB CCB CC CC CC CC Y "but" as c.c.
CFf CF <-> CSF CSF CS CS CS ?? Y "for" as conjunction
RRx RR <-> RB RB RB RB RB ?? A "only" as adverb
RRx CF <-> RB RB RB RB RB ?? Y "only" as a.c. ("I'd do it myself, only I haven't enough money") (*25)
RRx ?? RB RB RB RB RB ?? A "only" introducing main clause ("I'd do it myself. Only I haven't enough money")
RRs ?? RBS RBS RB RB RB ?? <-> "yet" introducing main clause. (*25)
RRs CF CF RBS RBS RB RB RB ?? Y "yet" as a.c. Also CC and CS in OSLO/LDS(/CLS1?) (*25)(*37)
RRs RR CF RBS RBS RB RB RB ?? Y "yet" as adverb.
RTn CF CF RBSW RBSW RN RN RN ?? Y "then" as a.c. (*25)
CS CS <-> CS CS CS CS CS ?? Z "so" introducing purpose clause (*6)
CS CS CS CS CS CS CS CS CS Z subordinating conjunction
CSA CSA CSA CSA CSA CSA CS CS ?? Z "as" used comparatively, as "conjunction" by Orange Book definition
CSA CSA CSA CSA CSA CSA CS CS CS Z "as" as s.c. introducing adverbial clause
CSA CSA CSA CSA CSA CSA CS CS CS Z "as" as s.c. introducing pseudo-relative clause
CSA CSA?? ?? INA INA IN IN IN ?? Z "as" used comparatively, as "preposition" by Orange Book definition
CSN CSN <-> INN INN IN IN IN ?? Z "than" as "preposition" by Orange Book definition
CSN CSN CSN CSN CSN CS CS CS ?? Z "than" as "conjunction" by Orange Book definition
CST CST CST WRB <-> CS?? CS?? CS?? ?? W "that" as relative adverb ("the day that you came") (*31)(*6)
CST CST CST CST CST CS CS CS CS Z "that" as s.c.
CST CST CST WPT WPT WP WPR WP WPO W "that" as relative pronoun - non-subject (*31)
CST CST CST WPT WPT WP WPR WP WPS W "that" as relative pronoun - subject (*31)
CSW CSW CSW CSW CSW CS CS CS CS Y "whether" as leading co-ordinator with "or"
CSW CSW CSW CSW CSW CS CS CS CS Z "whether" as s.c.
CSg CS CS CS CS CS CS CS CS Z "though" as s.c.
CSi CS CS CSI CSI CS CS CS CS Z "if"
CSk CS CS CSK CSK CS CS CS CS <> "as_if", "as_though"
CSn CS <-> WRBW WRBW WRB WRB WRB ?? W "when" as s.c. (fused relative adverb, = "at the time at which") (*1)(*89)
CSn CS <-> WRBW WRBW WRB WRB WRB ?? Z "when" as s.c. (fused relative adverb, = "at the time at which") (*1)(*89)
CSr ?? <-> WRBW WRBW WRB WRB WRB ?? W "where" as s.c. (= "at the place at which") (*1)(*90)
DA1 DA1 ?? APM APM ?? ?? ?? ?? Q "little" following determiner/pronoun
DA1 DA1 DA1 AQM AQM ?? ?? ?? AP Q "little", "much" as determiner/pronoun
DA2 DA2 DA2 APS APS AP AP AP AP J "few", "many" following determiner/pronoun
DA2 DA2 DA2 APS AQS AP AP AP AP Q "few", "many", as determiner/pronoun
DA2q DA2 DA2 AQS AQS AP AP AP AP Q "several" as determiner/pronoun
<> DA2 DA2 <> ?? ?? ?? ?? ?? <> "a good few", "a great many", etc as determiner/pronoun (*80)
DA2R DA2R DA2R AQSR AQSR ?? ?? ?? ?? JR "fewer"
DA2T DA2T DA2T JJT AQST ?? ?? ?? ?? JT "fewest"
DAR DAR DAR AQR AQR ?? ?? ?? ?? QR "more", "less" as determiner/pronoun
DAT DAT DAT AQT AQT ?? ?? ?? ?? QT "most", "least" as determiner/pronoun
DAg DA ?? AG AG AP AP AP ?? J "own" with genitive
DAr DA DA AP AP AP AP AP AP J "former", "latter" with determiner/pronoun (*29)
DAy DA DA APY APY AP ?? ?? ?? J "same", "selfsame"
DAy DA DA APY APY AP ?? ?? ?? Q "same", "selfsame"
DAz DA DA ABZ ABZ ABL ABL ABL ABL A "such"
DAz DA DA ABZ ABZ ABL ABL ABL ABL E "such"
DAz DA DA ABZ ABZ ABL ABL ABL ABL Q "such"
DB2 DB2 DB2 ABX ABX ABX ABX ABX ABX Q "both" as determiner/pronoun
DBh DB DB ABH ABH ABN ABN ABN ABN Q "half" as determiner/pronoun
DBl DB DB ABN ABN ABN ABN ABN ABN Q "all" as determiner/pronoun
DD DD <-> ?? PN PN PN PN ?? <-> "somesuch"; "yon", "yonder" as determiner/pronoun
DD1i DD1 DD1 DT DT DT DT DT DT E "this" as determiner/pronoun
DD1i DD1 <-> QL QL DT QL ?? ?? E "this" as qualifier ("this big")
DD1a DD1 DD1 DT DT DT DT DT DT E "that" as determiner/pronoun
DD1a RG?? <-> QL QL DT QL ?? ?? E "that" as qualifier ("that slowly") (*7)
DD1b <> ?? AQB AQB ?? <> <> ?? <> "a bit" as determiner/pronoun
DD1l ?? ?? AQL AQL ?? ?? ?? ?? <> "a little" as determiner/pronoun
DD2 ?? ?? AQS AQS ?? ?? ?? ?? <> "a few" as determiner/pronoun
DDo ?? ?? AQO AQO AP <> <> ?? <> "a lot" as determiner/pronoun
DD1q DD1 DD1 DTQ DTQ DT DT DT ?? Q "another"; "each" as determiner/pronoun
DD1e DD1 DD1 DTX DTX DTX DTX DTX DTX Q "either" as determiner/pronoun
DD1n DD1 DD1 DTX DTX DTX DTX DTX DTX Q "neither" as determiner/pronoun
DD2i DD2 DD2 DTS DTS DTS DTS DTS DTS ES "these"
DD2a DD2 DD2 DTS DTS DTS DTS DTS DTS ES "those"
DDQ DDQ DDQ WDT WDT WDT WDT WDT WDT W "what" as determiner
DDQ DDQ DDQ WDT WDT WDT WDT WDT WPO W "what" as pronoun non-subject
DDQ DDQ DDQ WDT WDT WDT WDT WDT WPS W "what" as pronoun subject
DDQGq DDQ$ DDQ$ WPG WPG WP$ WP$ WP$ WP$ WX "whose" interrogative
DDQGr DDQ$ DDQ$ WPG WPG WP$ WP$R WP$ WP$ WX "whose" relative
DDQGq DDQ$ DDQ$ WPGG WPG WP$ WP$ WP$ WP$ WX "whose" as interrogative pronoun
DDQGr DDQ$ DDQ$ WPGG WPG WP$ WP$R WP$ WP$ WX "whose" as relative pronoun
DDQV DDQV DDQV WVDT WVDT WDT WDT WDT WDT W "whichever", "whatever", "whatsoever" as determiner
DDQV DDQV DDQV WVDT WVDT WDT WDT WDT WPO W "whichever", "whatever", "whatsoever" as pronoun non-subject
DDQV DDQV DDQV WVDT WVDT WDT WDT WDT WPS W "whichever", "whatever", "whatsoever" as pronoun subject
DDQq DDQ DDQ WDT WDT WDT WDT WDT WDT W "which" interrogative as determiner
DDQq DDQ DDQ WDT WDT WDT WDT WDT WPO W "which" interrogative as pronoun non-subject
DDQq DDQ DDQ WDT WDT WDT WDT WDT WPS W "which" interrogative as pronoun subject
DDQr DDQ DDQ WDT WDT WDT WDTR WDT WDT W "which" relative as determiner
DDQr DDQ DDQ WDT WDT WDT WDTR WDT WPO W "which" relative as pronoun non-subject
DDQr DDQ DDQ WDT WDT WDT WDTR WDT WPS W "which" relative as pronoun subject
DDf DD ?? DTF DTF DTI DTI DTI ?? Q "enough", with noun (before or after) or as pronoun
DDi DD DD DTI DTI DTI DTI DTI DTI Q "some" as determiner/pronoun
DDy DD DD DTI DTI DTI DTI DTI DTI Q "any" as determiner/pronoun
FB ?? ?? NN FB ?? ?? ?? ?? <-> prefix used as separate word
FWg ?? ?? NN FW ?? ?? ?? ?? O biological Latin genus name - "homo"
FWs ?? ?? NN FW ?? ?? ?? ?? O biological Latin species name - "sapiens"
FOc ?? ?? FC FC ?? ?? ?? <-> O chemical or nucleonic formula - "H2SO4", C-14 etc.
FOp ?? ?? NP <> <> <> ?? L Postcode, zipcode ("LS2 9JT" etc)
FOl ?? ?? NP NP NP NP NP ?? <-> London postal district ("E.C.4" etc)
FOr ?? ?? NP <> <> <> ?? L Road ("M4", "M25", etc)
MCi ?? ?? LA <> <> <> <> ?? <> label, applied parenthetically, complex
MCl ?? ?? LU <> <> <> <> ?? <> label, used nominally, complex
FO ?? ?? NN FO &FO &FO &FO <-> L formula
FO ?? ?? NN FO &FO &FO &FO <-> O formula
FOs ?? ?? ?? FO ?? ?? ?? <-> ?? Registration/reference/serial number
FOt <> <> FQ FQ &FO &FO &FO ?? <> table, given as **[TABLE**] in LOB, or other undepicted display item
FOq <> <> FQ FQ &FO &FO &FO ?? <> Undepicted item embedded in running text, other than FOqx FOqc FWq
FOx ?? ?? FX FO &FO &FO &FO <-> O algebraic expression with nominal function - "a", "dy/dx" etc. (*28)
FOqx ?? ?? FQX FO &FO &FO &FO <-> O algebraic equation
FOqc ?? ?? FQC ?? ?? ?? ?? <-> <> chemical equation
FWq <> <> <> FQ &FW &FW &FW ?? <> extended foreign sentence, given as **[FOREIGN-QUOTATION**] in LOB (i.e. as one "word") (*9)
FWq <> <> <> FQ <> <> <> ?? O extended foreign sentence, given explicitly
ICSl ICS ?? CSL CSL CS CS CS CS <-> "like", as conjunction/relative
ICSl ICS ?? IN IN IN IN IN IN J "like", as prep (*88)
ICSl ICS ?? IN IN IN IN IN IN P "like", as prep (*88)
ICSl ICS ?? IN IN IN IN IN IN Z "like", as prep
ICSx ICS ?? CSX CS CS CS CS ?? Z "except", "save" as s.c. (= "except that" informally, or archaic = "unless")
ICSx ICS ?? INX INX CS CS CS ?? P "except", "save" as s.c. (Orange Book defn, e.g. with P as complement)
ICSx ICS ?? INX INX CS CS CS ?? Z "except", "save" introducing bare-verb clause ("nothing to do except wait")
ICSx ICS ?? INX INX IN IN IN ?? P "except", "save" as prep.
ICSx ?? ?? INX INX CS CS CS IN Z "but" as OB s.c.
ICSx II ?? INX INX IN IN IN IN P "but" as prep.
RR RR ?? RI RI RI RI RI ?? A "but" as adverb
ICSr ICS ICS CS CS CS CS CS CS Z "after", "before", "since" as true s.c. (*16)
ICSr ICS ICS CS CS CS CS CS CS Z?? "after", "before", "since" as OB s.c.
ICSr ICS ICS IN IN IN IN IN IN P "after", "before", "since" as prep, complemented
RRi RR RR RI RI RI RI RI RB A "after", "before", "since" as adverb (uncomplemented preposition)
ICSt ICS ICS CS CS CS CS CS CS Z "until" as true s.c. (*16)
ICSt ICS ICS CS CS CS CS CS CS Z?? "until" as OB s.c.
ICSt ICS ICS IN IN IN IN IN IN P "until" as prep.
IF IF IF INF INF INF INF INF IN P "for" as prep.
II II II IN IN IN IN IN IN P preposition, always complemented
IIr II II IN IN IN IN IN IN P preposition/adverb, used as preposition
IIr ICS ?? IN IN IN IN IN IN P "considering", "notwithstanding", as prep (*98)
RP RP ?? RP RP RP RP RP RP A preposition/adverb, used as adverb (Brown RP list, except "across")
RP RL ?? RP RP RP RP RP RP A "across" as adverb
RLi RL RL RI RI RI RI RI RB A preposition/adverb, used as adverb (LOB RI list) - place (*22)
RRi RR RR RI RI RI RI RI RB A preposition/adverb, used as adverb (LOB RI list) - not place (*22)
RLi RL ?? RP RP RP RP ?? RB A preposition/adverb, used as adverb (LOB RP list) (*23)
RLi RP ?? RP RP RP RP ?? RB A preposition/adverb, used as adverb (LOB RP list) (*23)
RLi RL RL RB RB RB RB RB RB A preposition/adverb, used as adverb (never RP or RI) - place
RRi RR RR RB RB RB RB RB RB A preposition/adverb, used as adverb (never RP or RI) - not place
RRRi RRR RRR RBR RBR RBR RBR RBR RBR A comparative preposition/adverb, used as adverb ("nearer") (*78)
RRTi RRT RRT RBT RBT RBT RBT RBT RBT A superlative preposition/adverb, used as adverb ("nearest") (*78)
IIa II II INA INA IN IN IN ?? P "as" used non-comparatively as preposition (*79)
IIb II II INB INB IN IN IN IN P "by" as prep.
IId <> <> IND IND <> <> <> ?? <> en-dash as preposition - "1951-55" etc (*15)
IIg II II IN IN IN IN IN ?? VN "aged" as pseudo-preposition
IIm ?? II INM INM IN IN IN ?? O plus, minus, times (in words)
IIx ?? ?? ?? ?? ?? ?? ?? ?? O mathematical infix operator - "+", ">", "=", etc
IIp ?? II INP INP IN IN IN ?? P "per"
IIt II II INT INT IN IN IN IN P "to" as prep.
IO IO IO INO INO INO INO INO IN P "of"
IW IW IW INW INW IN IN IN IN P "with"
IW IW IW INW INW ?? RI ?? ?? P "with" in "to begin with", "over and done with"
IWr IW IW INW INW IN IN IN IN P "without" as prep.
JA JA JA JJ JJ JJ JJ JJ JJ J adjective used only predicatively
JAj JJ ?? JNP JNP JNP JNP JNP JJ J abbr. adjective appended to organisation name ("Ltd", "Inc", "Pty")(*93)
JAn ?? ?? ?? ?? ?? ?? ?? ?? ?? party name etc appended to personal name ("Cons.", "D.")
JB JB JB APJ APJ AP AP AP ?? J "very" with determiner - "the very thing"
JB JB JB JJB JJB JJB JJB JJB JJ J adjective used only attributively
JBs JB JB ?? JJB JJB JJB JJB JJS J semantically superlative adjective - "main", "chief", "top", etc.
JBR JBR JBR ?? JJ JJB JJB JJB ?? JR "upper", "inner", "outer", "lesser", "nether".
JBT JBT JBT ?? JJB JJB JJB JJB JJS J "utmost", "uttermost" (*20)
JJRo JB ATA ABR ABR AP AP AP AP Q "other"
JBy JB ATA APJ APJ AP AP AP ?? J "only" with determiner- "the only thing I know"
JJs JJ ?? JJ JJ JJ JJ JJ JJS?? J adjective in "-most" - "topmost", "outermost", "uppermost", "hindmost", etc (*19)
JJ JJ JJ JJ JJ JJ JJ JJ JJ J adjective
JJ JJ JJ JNP JNP JNP JNP JNP JJ J capitalised adjective
JJ JJ JJ JNP JNP JNP JNP JNP JJ J prefixed capitalised adjective (e.g. "anti-Jewish")
JJ JJ JK JJ JJ JJ JJ JJ JJ J "able", "willing" (*38)
JJR JJR JJR JJR JJR JJR JJR JJR JJR JR comparative adjective (also includes "better" etc)
JJR JJR ?? JJ JJ JJ JJ JJ JJ ?? "further" as adjective (*33)
JJT JJT JJT JJT JJT JJT JJT JJT JJT JT superlative adjective (also includes "best" etc)
JJh ?? <-> JJ JJ JJ JJ JJ ?? J adjective formed as noun + -ed - "bellied" in pot-bellied, etc.
MC MC MC CD CD CD CD CD CD L cardinal numeral word - "three", "twenty-five" etc.
MC1 MC1 MC1 CD1 CD1 CD1 CD1 CD1 CD L "one" as numeral word
MCo MC MC CDN CDN CD CD CD CD L "0"
MC1n MC1 MC1 CDN1 CDN1 CD1 CD1 CD1 CD L "1"
MC2 MC2 MC2 CD1S NNS CD1S CD1S CD1S ?? <-> "ones" as plural of numeral word ("They arrived in ones and twos")
MC2 NN2 MC2 CDS CDS CDS CDS CDS ?? <-> plural of cardinal - "threes" etc
MC2n MC2 CD1NS CD1NS CD1S CD1S CD1S ?? <-> "1s"
MC2n MC2 CDNS CDNS CDS CDS CDS ?? <-> plural of cardinal in digits - "10s", "10's", etc
MCd MC MC CDND CDND CD CD CD CD L numeral with decimal point
MCn MC MC CDN CDN CD CD CD CD L cardinal numeral in digits
MCr MC MC CDR CDN CD CD CD CD L Roman numeral
MCr MC1 MC1 CDR1 CDN1 CD1 CD1 CD1 CD1 L Roman numeral I
MC2r MC MC CDRS CDNS CDS CDS CDS CDS L Roman numeral, pluralised (yes it happens)
MCs MC MC CDN CDN CD CD CD CD L integer numeral in digits with leading zero ("007" etc)
MCn MC MC CDN CDN CD CD CD CD K cardinal numeral in digits, acting as ordinal (e.g. "Feb 9")
MC1n MC MC CDN CDN CD CD CD CD K cardinal numeral 1 in digits, acting as ordinal (e.g. "Feb 9")
MCh MC MC CD CD CD CD CD CD L cardinal numeral 2 - 12
MChn MC MC CDN CDN CD CD CD CD L 2 - 12 in digits
MD MD MD OD OD OD OD OD OD K ordinal numeral or fraction - "third", "fourth" etc - used as ordinal
MD MD MD RB RB RB RB RB ?? A ordinal numeral or fraction - "third", "fourth" etc - used as adverb
MD MD ?? NN NN NN NN NN NN N ordinal numeral or fraction - "third", "fifth" etc - used as fraction
MDo MD MD OD OD OD OD OD OD K "first", "second", as ordinal
MDo MD ?? RB RB RB RB RB ?? A "first", "second", as adverb
MDn MD MD OD OD OD OD OD OD K ordinal numeral, digital - "1st", "2nd", "3rd", etc
NN2 NN2 NN2 NNS NNS NNS NNS NNS NNS NS plural of ordinal numeral
NN2 NN2 NN2 NNS NNS NNS NNS NNS NNS NS plural of fraction noun - "thirds", "fifths" etc
MDt MD ?? RB RB RB RB RB RB A "next", "last", as adverb
MDt MD ?? APO APO ?? ?? ?? AP J "next", "last", except as adverb (*39)
MF MF MF CDF CDS ?? ?? ?? ?? <> fraction with hyphenated numerator - "two-thirds" etc
MF MF MF CDF NNS ?? ?? ?? ?? <> fraction with hyphenated numerator - "two-thirds" etc
MF MF MF CDF CD CD CD CD ?? <> fraction with hyphenated numerator "one" - "one-third" etc (*91)
MF <> ?? CDF <> <> <> <> <> <> fraction with unhyphenated numerator - "two thirds", "one tenth" etc
MFn ?? ?? CDNF CDNF CD CD CD ?? <> fraction in digits - "2/3" etc
ND1 ND1 ND NRW NRW NR NR NR NR N "north", "north-east" etc
ND1 ND1 ND NP NP NP NP NP NP C?? "North", "West" etc used as part of proper name of place etc (*70)
NNb NN1 NN1 NN NN NN NN NN NN N attributive common noun - "scissor", "trouser", "pincer" etc
NN1 NN1 NN1 NN NN NN NN NN NN N unquantifiable common noun - "Fig.", etc
NN1m NN1 ?? NN NN NN NN NN ?? N unquantifiable common noun - "midstream", "midfield" etc
NN1c NN1 NN1 NN NN NN NN NN NN N C common noun (*82)
NN1c NNJ NNJ NN NN NN NN NN NN N group noun (*27)
NN1u NN1 NN1 NN NN NN NN NN NN N M common noun
NN1n NN1 NN1 NN NN NN NN NN NN N C+M common noun
NN2 NN2 NN2 NNS NNS NNS NNS NNS NNS NS P common noun
NNc NN NN NN NN NN NN NN NN N C+P common noun, C sense, e.g. "sheep", "deer", "species", "people"
NNc NN NN NNS NNS NNS NNS NNS NNS NS C+P common noun, P sense
NNu NN NN NN NN NN NN NN NN N M+P common noun, M sense, e.g. "bowls", "tiddlywinks", "measles", etc
NNu NN NN NNS NNS NNS NNS NNS NNS N M+P common noun, P sense
NNux NN NN NN NN NN NN NN NN N M+P common noun in -ics, M sense
NNux NN NN NNS NNS NNS NNS NNS NNS NS M+P common noun in -ics, P sense
NNn NN NN NN NN NN NN NN NN N C+M+P common noun, C or M sense - e.g. "fish", "cod", etc
NNn NN NN NNS NNS NNS NNS NNS NNS NS C+M+P common noun, P sense
NN1c NN1 NN1 NNP NNP NNP NNP NNP NN N C capitalized common noun - "Spaniard" etc
NN1c NN1 NN1 NPT NPT NPT NPT NPT NN N C capitalized common noun denoting office-holder, but not used as title - "Minister" etc
NN1u NN1 NN1 NNP NNP NNP NNP NNP NN N M capitalized common noun - "Sanskrit", "Buddhism" etc
NN1n NN1 NN1 NNP NNP NNP NNP NNP NN N C+M capitalized common noun - "German", "Breton", "Italian" etc (*81)
NN2 NN2 NN2 NNPS NNPS NNPS NNPS NNPS NNS NS P capitalized common noun - "Spaniards", "Buddhists" etc (*12)
NNc NN NN NNP NNP NNP NNP NNP NN N C+P capitalized common noun, C sense - "the Swiss is" etc
NNc NN NN NNPS NNPS NNPS NNPS NNPS NNS NS C+P capitalized common noun, P sense - "the Swiss are" etc (*12)
NNu NN NN NNP NNP NNP NNP NNP NN N M+P capitalized common noun, M sense - "French is" etc
NNu NN NN NNPS NNPS NNPS NNPS NNPS NNS NS M+P capitalized common noun, P sense - "the French are" etc (*12)
NNn NN NN NNP NNP NNP NNP NNP NN N C+M+P capitalized common noun, C or M sense - "(the) Chinese is" etc
NNn NN NN NNPS NNPS NNPS NNPS NNPS NNS NS C+M+P capitalized common noun, P sense - "the Chinese are" etc (*12)
MC1 NN1 NN1 CD1 CD1 CD1 CD1 CD1 ?? L "one" as placeholder with determiner (*14)
MC2 NN2 NN2 NNS CD1S CD1S CD1S CD1S ?? QS "ones" as placeholder with determiner
NNJ1 NNJ NNJ NN NN NN NN NN NN N O noun (*83) (*32)
NNJ1 NNJ NNJ NPL NPL NPL NPL NPL ?? N O noun (*83) (*32)
NNJ1 NNJ NNJ NNP NNP NNP NNP NNP ?? N O noun (*83) (*32)
NNJ1c NNJ NNJ NN NN NN NN NN NN N O+C noun (*32)
NNJ1c NNJ NNJ NPL NPL NPL NPL NPL ?? N O+C noun (*32)
NNJ1n NNJ NNJ NN NN NN NN NN NN N O+C+M noun (*32)
NNJ1n NNJ NNJ NPL NPL NPL NPL NPL ?? N O+C+M noun (*32)
NNJ2 NNJ2 NNJ2 NNS NNS NNS NNS NNS ?? NS O+P noun (*32)
NNJ2 NNJ2 NNJ2 NPLS NPLS NPLS NPLS NPLS ?? NS O+P noun (*32)
NNL1 NNL1 NNL1 NPL NPL NPL NPL NPL ?? N L noun, used in placename, singular - "Rd", "St." etc
NNL1b NNL1 NNL1 NPLB NPL NPL NPL NPL ?? N L noun, used in placename, singular, preceding - "R.", "L." etc
NNL NNL NNL NPL NPL NPL NPL NPL ?? N L noun, used in placename - "Is" etc, unmarked but used as singular (*84)
NNL NNL NNL NPLS NPLS NPLS NPLS NPLS ?? NS L noun, used in placename - "Is" etc, unmarked but used as plural
NNL2 NNL2 NNL2 NPLS NPLS NPLS NPLS NPLS ?? NS L noun, used in placename, plural - "Mts" etc
NNL1cb NNL1 NNL1 NPLB NPL NPL NPL NPL ?? N L+C noun that often precedes name - "Lake", "Fort", etc, used as head of place name
NNL1cb ?? ?? NN NN NN NN NN ?? N L+C noun that precedes name, not used as head of place name
NNL1c NNL1 NNL1 NPL NPL NPL NPL NPL ?? N L+C noun "island", "street", etc, used as head of place name
NNL1c ?? ?? NN NN NN NN NN ?? N L+C noun, not used as head of place name
NNL1n NNL1 NNL1 NPL NPL NPL NPL NPL ?? N L+C+M noun, "water", "drive" etc, used as head of place name
NNL1n ?? ?? NN NN NN NN NN ?? N L+C+M noun, not used as head of place name
NNLc NNL NNL NPL NPL NPL NPL NPL ?? N L+C+P noun - "Links" etc (?), used as head of placename
NNLc NN NN NN NN NN NN NN ?? N L+C+P noun - "links" etc (?), used singular, not as head of placename
NNLc NN NN NNS NNS NNS NNS NNS ?? NS L+C+P noun - "links" etc (?), used plural, not as head of placename
NNL2 NNL2 NNL2 NPLS NPLS NPLS NPLS NPLS ?? NS L+P noun "islands" etc, used as head of placename
NNL2 NN2 NN2 NNS NNS NNS NNS NNS ?? NS L+P noun "islands" etc, not used as head of placename
NNOn ?? <-> CDNU CDNU <> <> <> ?? <-> "m" for million(s), etc
NNOc NNO NNO CDU CDU CD CD CD CD L numeral noun - "hundred" etc
NNOc NNO NNO CDU CDU CD CD CD CD N numeral noun - "hundred" etc
NNOc NNO NNO ?? ?? ?? ?? ?? ?? ?? "dozen"
NN2 NN2 NNO2 CDUS CDUS CDS CDS CDS ?? NS plural numeral noun - "hundreds" etc.
NNS1 NNS1 NNS1 NPT NPT NPT NPT NPT ?? N "Sir", "Madame"
NNS1v NNS1 NNS1 NPT NPT NPT NPT NPT ?? N "Ma'am", "Madam"
NNS1c NNS1 NNS1 NPT NPT NPT NPT NPT ?? N S+C noun - "bishop", "president", etc, used as title
NNS1c NNS1 ?? NN NN NN NN NN ?? N S+C noun - "bishop", "president", etc, not used as title
NNS1n NNS1 NNS1 NPT NPT NPT NPT NPT ?? N S+C+M noun - "justice", etc (*77), used as title
NNS1n NNS1 ?? NN NN NN NN NN ?? N S+C+M noun - "justice", etc (*77), not used as title
NNS2 NNS2 NNS2 NPTS NPTS NPTS NPTS NPTS ?? NS S+P noun - "Presidents" etc, used as title
NNS2 NNS2 ?? NNS NNS NNS NNS NNS ?? NS S+P noun - "Presidents" etc, not used as title
NNSA1 NNSA1 NNSA1 ?? NPT NPT NPT NPT ?? N following abbrev. singular titular noun - "M.A.", "Jr", etc
NNSB NNSB NNSB ?? NPT NPT NPT NPT ?? N preceding abbrev. titular noun - "Rt. Hon.", etc
NNSB1 NNSB1 NNSB1 ?? NPT NPT NPT NPT ?? N preceding abbrev. singular titular noun - "Mr.", "Prof.", etc
NNSB2 NNSB2 NNSB2 ?? NPTS NPTS NPTS NPTS ?? NS preceding abbrev. plural titular noun - "Messrs" etc
NNT1r NNT1 NNT1 NN NN NN NN NN ?? N unit of time, used with "last" and "next"
NNT1c NNT1 NNT1 NN NN NN NN NN ?? N other unit of time
NNT1n NNT1 NNT1 NN NN NN NN NN ?? N "time"
NNT1m NNT1 NNT1 NN NN NN NN NN ?? N point in time (unquantifiable) - "noon", "mid-afternoon" etc
NNT1h NNT1 NNT1 NN NN NN NN NN ?? N Holiday or season
NNT1y NNT1 NNT1 NN NN NN NN NN ?? N time of day, compoundable with "yesterday" or "tomorrow".
NNT2 NNT2 NNT2 NNS NNS NNS NNS NNS ?? NS plural time noun (*74)
RTt RT RT NRT NRT NR NR NR ?? A "tomorrow", "today", "tonight", "yesterday"
RTt RT RT NRT NRT NR NR NR ?? N "tomorrow", "today", "tonight", "yesterday"
NN2 NN2 ?? NNS NNS NRS NRS NRS ?? NS "yesterdays", "tomorrows"
NNa ?? ?? NRH NRH ?? ?? ?? ?? ?? time of day - "10:30", "10.30" etc
NNp ?? ?? NRH NRH ?? ?? ?? ?? ?? time of day, 24 hr - "13:30" etc
MCy ?? ?? NRY NRY CD CD CD ?? L year - "1961" etc (*35)
MC2y ?? ?? NRYS NRYS CDS CDS CDS ?? L plural of year - "1960s", "nineteen-thirties" etc (*35)
NPD1 NPD1 NPD1 NRD NRD NR NR NR NR C "Sunday" etc
NPD1 NPD1 NPD1 NRD NRD NR NR NR NR N "Sunday" etc
NPD2 NPD2 NPD2 NRDS NRDS NRS NRS NRS NRS CS "Sundays" etc
NPD2 NPD2 NPD2 NRDS NRDS NRS NRS NRS NRS NS "Sundays" etc
NPM1 NPM1 NPM1 NRM NRM NP NP NP NP C "October" etc
NPM1 NPM1 NPM1 NRM NRM NP NP NP NP N "October" etc
NPM2 NPM2 NNS NRMS NPS NPS NPS NPS CS "Octobers" etc
NPM2 NPM2 NNS NRMS NPS NPS NPS NPS NS "Octobers" etc
NNU1c NNU1 NNU1 NN NN NN NN NN ?? N U+C noun unabbreviated (*26)
NNU NNU NNU NNU NNU NNU NNU NNU ?? N U noun - "in", "kg", etc
NNU1n NNU1 NNU1 NN NN NN NN NN ?? N U+C+M noun, unabbreviated ("metre"? "point"?)
NNU2 NNU2 NNU2 NNS NNS NNS NNS NNS ?? NS U+P, non-abbreviated
NNU2 NNU2 NNU2 NNUS NNUS NNUS NNUS NNUS ?? NS U, abbreviated, plural - "ins" etc
NNUp ?? ?? NNUP NNUP NNU NNU NNU ?? ?? "per cent"
NNUb NNU NNU NNUC NNUC NNU NNU NNU ?? <-> unit symbol which precedes numeral - "$" etc.
NN2 ?? ?? NNUS NNUS ?? ?? ?? ?? <-> pluralised currency symbol - "Save $$$s!!!"
NN2 NN2 ?? ANSR ANSR APS APS APS ?? QS "others"
NN1c NN1 NN1 NR NR NR NR NR ?? N "home" as noun
NN2 NN2 NN2 NNS NNS NRS NRS NRS ?? NS "homes"
ZZ1 ZZ1 ZZ1 ZZ ZZ ZZ ZZ ZZ ?? O singular letter of the alphabet
ZZ2 ZZ2 ZZ2 NNS NNS NNS NNS NNS ?? O letter of the alphabet with plural inflection
FOz <-> NN CD CD CD CD ?? L?? alphanumeric word with digit first - "1a" etc. (n.b. never CD1)
FOz <-> NN FO &FO &FO &FO ?? L?? alphanumeric word with letter first - in scientific context (Orange Book)
FOz <-> NN ZZ ZZ ZZ ZZ ?? L?? alphanumeric word with letter first - in other context (Orange Book)
NN2 <-> NNS NNS NNS NNS NNS ?? L?? alphanumeric word with letter first, pluralised - "R101s" etc.
NN2 NC2 NNS NNS NNS NNS NNS NNS <-> plural of cited word - "ifs", "buts" etc
NPs NP NP NP NP NP NP NP ?? C surname not taking plural inflection, used singular
NPs NP NP NPS NPS NPS NPS NPS ?? C surname not taking plural inflection, used plural
NP1i <-> NPI NP NP NP NP ?? C initial (of name) - "W." or "G." in "W. G. Grace", etc
NP1x NP1 NP1 NP NP NP NP NP ?? C singular proper noun, unclassified
NP1c NP1 NP1 NP NP NP NP NP ?? C country name
NP1f NP1 NP1 NP NP NP NP NP ?? C female Christian name
NP1g NP1 NP1 NP NP NP NP NP ?? C geographical proper name, e.g. "Adriatic"
NP1m NP1 NP1 NP NP NP NP NP ?? C male Christian name
NP1p NP1 NP1 NP NP NP NP NP ?? C UK county name, US state name, etc
NP1s NP1 NP1 NP NP NP NP NP ?? C surname
NP1t NP1 NP1 NP NP NP NP NP ?? C town name
NP2x NP2 NP2 NPS NPS NPS NPS NPS ?? CS plural proper noun, unclassified (*12)
NP2s NP2 NP2 NPS NPS NPS NPS NPS ?? CS plural of surname - "Robinsons" etc
NP2g NP2 NP2 NPS NPS NPS NPS NPS ?? CS plural proper noun, geographical - "Alps", "Trossachs" etc
NP1j NP1 NP1 NP NP NP NP NP NP C Organisation name
NP1h <> <> <> <> <> <> <> <> <> Racehorse name
PN PN PN AQN AQN PN PN PN ?? Q "none"
PN1 PN1 PN1 PN PN PN PN PN PN Q "anybody", "everyone", "nothing", "no-one" etc
PN1z PN1 PNZ PNZ PN PN PN ?? A "so" as "pronoun"
PN1o PN1 PN1 CD1 CD1 CD1 CD1 CD1 ?? R "one" as impersonal pronoun, the Royal "one"
PNX1 PNX1 PNX1 PPL PPL PPL PPL PPL PPL RL "oneself"
PNQOq PNQO PNQO WPO WPO WPO WPO WPO WPO W "whom" interrogative
PNQOr PNQO PNQO WPO WPO WPO WPOR WPO WPO W "whom" relative
PNQSq PNQS PNQS WP WP WP WP WP WPS W "who" interrogative
PNQSr PNQS PNQS WP WP WP WPR WP WPS W "who" relative
PNQVG ?? PNQV$ WVPG WPG WP$ WP$ WP$ WP$ WX "whosever"
PNQVO ?? PNQVO WVPO WVPO WPO WPO WPO WPO W "whomever"
PNQVS ?? PNQVS WVP WVPA WPA WPA WPA WPS W "whosoever"
PNQVS PNQVS PNQVS WVP WVP WP WP WP WPS W "whoever"
APPGf APP$ APP$ PPG PPG PP$ PP$ PP$ PP$ RX "her", except as pronoun
APPGh1 APP$ APP$ PPG PPG PP$ PP$ PP$ PP$ RX "its"
APPGh2 APP$ APP$ PPG PPG PP$ PP$ PP$ PP$ RSX "their"
APPGi1 APP$ APP$ PPG PPG PP$ PP$ PP$ PP$ RX "my" as possessive determiner/pronoun
APPGi2 APP$ APP$ PPG PPG PP$ PP$ PP$ PP$ RSX "our"
APPGm APP$ APP$ PPG PPG PP$ PP$ PP$ PP$ RX "his", except as pronoun
APPGy APP$ APP$ PPG PPG PP$ PP$ PP$ PP$ RX "thy"
APPGy APP$ APP$ PPG PPG PP$ PP$ PP$ PP$ RX "your"
PPGf PP$ PP$ PPGG PPGG PP$$ PP$$ PP$$ PP$$ RX "hers"
PPGh2 PP$ PP$ PPGG PPGG PP$$ PP$$ PP$$ PP$$ RSX "theirs"
PPGi1 PP$ PP$ PPGG PPGG PP$$ PP$$ PP$$ PP$$ RX "mine" as pronoun
PPGi2 PP$ PP$ PPGG PPGG PP$$ PP$$ PP$$ PP$$ RSX "ours"
PPGm PP$ PP$ PPGG PPGG PP$$ PP$$ PP$$ PP$$ RX "his" as pronoun
PPGy PP$ PP$ PPGG PPGG PP$$ PP$$ PP$$ PP$$ RX "thine" as pronoun
PPGy PP$ PP$ PPGG PPGG PP$$ PP$$ PP$$ PP$$ RX "yours"
PPH1 PPH1 PPH1 PP3 PP3 PP3 PP3 PP3 PPO R "it" other than as subject
PPH1 PPH1 PPH1 PP3 PP3 PP3 PP3 PP3 PPS R "it" as subject
PPHO1f PPHO1 PPHO1 PP3O PP3O PP3O PP3O PP3O PPO R "her" as pronoun
PPHO1m PPHO1 PPHO1 PP3O PP3O PP3O PP3O PP3O PPO R "him"
PPHO2 PPHO2 PPHO2 PP3OS PP3OS PP3OS PP3OS PP3OS PPO ES "them" as demonstrative ("them people")
PPHO2 PPHO2 PPHO2 PP3OS PP3OS PP3OS PP3OS PP3OS PPO RS "them"
PPHS1f PPHS1 PPHS1 PP3A PP3A PP3A PP3A PP3A PPS R "she"
PPHS1m PPHS1 PPHS1 PP3A PP3A PP3A PP3A PP3A PPS R "he"
PPHS2 PPHS2 PPHS2 PP3AS PP3AS PP3AS PP3AS PP3AS PPSS RS "they"
PPIO1 PPIO1 PPIO1 PP1O PP1O PP1O PP1O PP1O PPO R "me"
PPIO2 PPIO2 PPIO2 PP1OS PP1OS PP1OS PP1OS PP1OS PPO RS "us"
PPIO2 <> <-> PP1OS PP1OS PP1OS PP1OS PP1OS PPO RS "'s" in "let's"
PPIS1 PPIS1 PPIS1 PP1A PP1A PP1A PP1A PP1A PPSS R "I" as personal pronoun
PPIS2 PPIS2 PPIS2 PP1AS PP1AS PP1AS PP1AS PP1AS PPSS RS "we"
PPX1f PPX1 PPX1 PPL PPL PPL PPL PPL PPL RL "herself"
PPX1h PPX1 PPX1 PPL PPL PPL PPL PPL PPL RL "itself"
PPX1i PPX1 PPX1 PPL PPL PPL PPL PPL PPL RL "myself"
PPX1m PPX1 PPX1 PPL PPL PPL PPL PPL PPL RL "himself"
PPX1y PPX1 PPX1 PPL PPL PPL PPL PPL PPL RL "thyself"
PPX1y PPX1 PPX1 PPL PPL PPL PPL PPL PPL RL "yourself"
PPX2h PPX2 PPX2 PPLS PPLS PPLS PPLS PPLS PPLS RSL "themselves"
PPX2i PPX2 PPX2 PPLS PPLS PPLS PPLS PPLS PPLS RSL "ourselves"
PPX2y PPX2 PPX2 PPLS PPLS PPLS PPLS PPLS PPLS RSL "yourselves"
PPY PPY <-> PP2 PP2 PP2 PP2 PP2 PPO R "thee"
PPY PPY <-> PP2 PP2 PP2 PP2 PP2 PPSS R "thou"
PPY PPY PPY PP2 PP2 PP2 PP2 PP2 PPO R "you" other than as subject
PPY PPY PPY PP2 PP2 PP2 PP2 PP2 PPSS R "you" as subject
RAc RA ?? RB RB RB RB RB RB A coordination-closer - "respectively", "etc." etc
RAi RA RA RB ?? <-> "inst.", "ult.", etc
RA RA RA ?? RB RB RB RB RB P "following", "ff.", "f.", etc
RA RA RA RB RB RB RB RB RB Q adverb after nominal head - "galore", etc??
RAa RA ?? RB RB RB RB RB RB P "ago", "since" when = "ago"
RAe RA ?? RB RBE RB RB RB RB Q "else" following nominal head
RAp ?? ?? ?? ?? ?? ?? ?? ?? Q "per annum", "per diem", "p.a.", etc
RAh RA RA RBH RBH RB RB RB RB N "a.m.", "o'clock", etc
RAj RA RA JJ JJ JJ JJ JJ ?? J postnominal adjective - "designate", "elect", etc
RAn <> <-> RB <> <> <> <> <> <> "at_all" after negative nominal head
RAn DDQV <-> RB WVDT WDT WDT WDT ?? A "whatever", "whatsoever" after negative nominal head (*6)
RAq RA RA RB RB RB RB RB RB Q "apiece"; "each" used distributively
RAy RA RA RB RB RB RB RB RB <-> "B.C.", etc
RAb RA RA RB RB RB RB RB RB <-> "A.D.", etc
RAx ?? ?? ?? ?? ?? ?? ?? ?? ?? mathematical postfix operator
RAz <> <> RB <> <> <> <> <> <> "or so" in "fifty or so"
REX REX REX RB RB RB RB RB RB A apposition-introducer - "namely", "e.g." etc
RG RG RG QL QL QL QL QL QL A adverb whose only adverbial use is as qualifier - "very", "as", "jolly", etc
RGAf RGA RGA QLP QLP QLP QLP QLP QLP A post-adjectival/adverbial qualifier - "enough"
RGA RGA RGA QLP QLP QLP QLP QLP QLP A post-adjectival/adverbial qualifier - "indeed"
RGQV RGQV RGQV WVQL WVQL WRB WRB WRB WQL A "however" as qualifier
RGQq RGQ RGQ WQL WQL WRB WRB WRB WQL A "how" as qualifier
RGR RGR RGR QLR QLR QL QL QL ?? AR "more", "less" as qualifier
RGT RGT RGT QLT QLT QL QL QL ?? AT "most", "least" as qualifier
RGa RG RG QLA QLA QL QL QL QL A "as" as qualifier
RGf RG RG QL2 QL2 QL QL QL QL A "too" as qualifier
RGi RG RG RB RB RB RB RB ?? A "about", "around", "over", "under", "up_to", "some" used with quantity or number
RGl RG RG ABL ABL ABL ABL ABL ABL A "quite", "rather" before an article - e.g. "quite a good thing"
RGl RG RG RB RB RB RB RB RB A "quite", "rather", as qualifier - e.g. "quite good"
RR RR RR RB RB RB RB RB RB A "quite" as adverb - e.g. "Have you quite finished?"
RRR ?? ?? RBR ?? ?? ?? ?? ?? ?? "rather" as adverb - e.g. "I would rather go than stay"
RGx ?? ?? ?? ?? ?? ?? ?? ?? ?? mathematical prefix operator
RGz RG RG QLZ QLZ QL QL QL QL A "so" as qualifier
RL RL ?? NR NR NR NR NR ?? A "home" as adverb
RL RL RL RB RB RB RB RB RB A place/direction adverb - "alongside", "forward", etc
RLe RL RL RBWE RBWE RB RB RB RB A "elsewhere"
RLh RL RL RBW RBW RN RN RN RB A "here"; "there" as locative adverb
RLn RL RL RN RN RN RN RN ?? A "upstairs", "downstairs" (*97)
RLw RL RL RBWP RBWP RB RB RB RB A "everywhere", "anywhere", "somewhere", "nowhere".
RL RL ?? RP RP RP RP ?? RB A adverb (LOB RP list) ("apart", "aside", "away", "forth")
RP RP ?? RP RP RP RP ?? RB A "back" as adverb
RPK RP RPK RP RP RP RP RP ??
A "about" as prepositional adverb, particle-adverb or catenative RR RR RR RB RB RB RB RB ?? A "either" and "neither" as general adverb RRg RR RR RBX RBX RB RB RB ?? A "long" as adverb RRf RR RR RBX RBX RB RB RB ?? A "far" as adverb RRQV RRQV RRQV WVRB WVRB WRB WRB WRB ?? A wh-ever adverb - "however", "whyever", "whencesoever", "whithersoever" RRQV RRQV RRQV WVRBW WVRBW WRB WRB WRB ?? W "whenever", "wherever", RRQq RRQ RRQ WRB WRB WRB WRB WRB ?? W wh- adverb, interrogative - "why", "how", "whence" etc RRQq RRQ RRQ WRBW WRBW WRB WRB WRB ?? W "when", "where" interrogative, "whither" RRQr RRQ RRQ WRB WRB WRB WRB WRB ?? W wh- adverb, relative - "why", "whence" etc RRQr RRQ RRQ WRBW WRBW WRB WRB WRB ?? W "where" as relative adverb or relative pronoun RRQr RRQ RRQ WRBW WRBW WRB WRB WRB ?? W "when" as relative adverb or relative pronoun (*89) RRQr RRQ RRQ WRBW WRBW WRB WRB WRB ?? Z "when" as relative adverb or relative pronoun (*89) RRR RRR RRR RBR RBR RBR RBR RBR ?? AR comparative - "better", "longer" etc - adverbial in function (*21) RRR RRR ?? RB RB RB RB RB ?? A "further" as simple adverb (*33) RRm RR RR RQM RQM ?? ?? ?? ?? A "little", "much" as adverb RRR RRR RRR RQR RQR RBR RBR RBR ?? AR "more", "less" as adverb RRT RRT RRT RQT RQT RBT RBT RBT ?? AT "most", "least" as adverb RRT RRT RRT RBT RBT RBT RBT RBT ?? AT superlative - "best", "longest" etc - adverbial in function (*21) RRb <> RR RQB RQB RB <> <> ?? <> "a bit" as clause adverb or qualifier RRl RR RR RQL RQL RB RB RB ?? Q "a little" as clause adverb or qualifier RRo RR RR RQO RQO RB <> <> ?? <> "a lot" as clause adverb or qualifier RRe RR RR RQO RQO RB <> <> ?? <> "enough" as clause adverb RAe RR ?? RBE RB RB RB RB RB A "else" as clause adverb RRn RR RR QLN QLN RB RB?? RB ?? Q "no" as qualifier with comparative RRs RR <-> RBS RBS RB RB RB ?? A "otherwise" RRy RR RR QLY QLY RB RB?? RB ?? Q "any" as qualifier with comparative RRz ?? RBSZ RBSZ RB RB RB ?? 
A "so" introducing main clause ("So you came here, eh?") RRz CF RBSZ RBSZ CS CS CS ?? A "so" introducing result clause, following comma or less RRz CF RBSZ RBSZ RB RB RB ?? A "so" introducing result clause, following semicolon or more RRz RR RR RBSZ RBSZ RB RB RB ?? A "so" as manner or degree adverb RT RT RR RB RB RB RB RB ?? A "again", "overnight", "long-ago", "hereafter" (*18) RTo RT RT RBW RBW RN RN RN ?? A "now" RTn RT RT RBSW RBSW RN RN RN ?? A "then" except as a.c. VB0 VB0 VB0 BE BE BE BE BE BE B "be" - non-finite VB0 VB0 VB0 BE BE BE BE BE BE BB "be" - subjunctive VB0 VB0 VB0 BE BE BE BE BE BE BM "be" - imperative VBDR VBDR VBDR BED BED BED BED BED BED BB "were" - subjunctive VBDR VBDR VBDR BED BED BED BED BED BED BD "were" - indicative VBDZ VBDZ VBDZ BEDZ BEDZ BEDZ BEDZ BEDZ BEDZ BD "was" VBG VBG VBG BEG BEG BEG BEG BEG BEG BG "being" as pres. part VBM VBM VBM BEM BEM BEM BEM BEM BEM B "am" VBN VBN VBN BEN BEN BEN BEN BEN BEN BN "been" VBR VBR VBR BER BER BER BER BER BER B "are", "art" as verb VBZ VBZ VBZ BEZ BEZ BEZ BEZ BEZ BEZ B "is" VD0 VD0 VD0 DO DO DO DO DO DO D "do", "dost" - auxiliary - indicative VD0 VD0 VD0 DO DO DO DO DO DO DB "do" - auxiliary - subjunctive VD0 VD0 VD0 DO DO DO DO DO DO DM "do" - auxiliary - imperative VD0 VD0 VD0 DO DO DO DO DO DO V "do" - main - indicative VD0 VD0 VD0 DO DO DO DO DO DO VB "do", "dost" - main - subjunctive VD0 VD0 VD0 DO DO DO DO DO DO VM "do" - main - imperative VDD VDD VDD DOD DOD DOD DOD DOD DOD DD "did" - auxiliary VDD VDD VDD DOD DOD DOD DOD DOD DOD V "did" - main VDG VDG VDG VBG VBG VBG VBG VBG ?? VG "doing" VDN VDN VDN VBN VBN VBN VBN VBN ?? 
VN "done" VDZ VDZ VDZ DOZ DOZ DOZ DOZ DOZ DOZ D "does" - auxiliary VDZ VDZ VDZ DOZ DOZ DOZ DOZ DOZ DOZ V "does" - main VH0 VH0 VH0 HV HV HV HV HV HV H "have" - auxiliary, debitory - indicative VH0 VH0 VH0 HV HV HV HV HV HV H "have", "hast" - auxiliary, debitory - subjunctive VH0 VH0 VH0 HV HV HV HV HV HV H "have" - auxiliary, debitory - imperative VH0 VH0 VH0 HV HV HV HV HV HV V "have", "hast" - main - indicative VH0 VH0 VH0 HV HV HV HV HV HV VB "have" - main - subjunctive VH0 VH0 VH0 HV HV HV HV HV HV VM "have" - main - imperative VHD VHD VHD HVD HVD HVD HVD HVD HVD HD "had", finite - auxiliary, debitory VHD VHD VHD HVD HVD HVD HVD HVD HVD VD "had", finite - main VHG VHG VHG HVG HVG HVG HVG HVG HVG HG "having" - auxiliary, debitory VHG VHG VHG HVG HVG HVG HVG HVG HVG VG "having" - main VHN VHN VHN HVN HVN HVN HVN HVN HVN HN "had", p.p - debitory VHN VHN VHN HVN HVN HVN HVN HVN HVN VN "had", p.p - main VHZ VHZ VHZ HVZ HVZ HVZ HVZ HVZ HVZ H "has" - auxiliary, debitory VHZ VHZ VHZ HVZ HVZ HVZ HVZ HVZ HVZ V "has" - main VMK VMK VMK MDT MDT MDT MD MD ?? M "ought" as modal catenative VMK VVD VMK MDT MDT MDT MD MD ?? 
<-> "used" as modal catenative VMd VM VM MD MD MD MD MD MD MD modal auxiliary (preterite) VMo VM VM MD MD MD MD MD MD M modal auxiliary (present) VV0i VV0 VV0 VB VB VB VB VB VB V indicative or non-finite uninflected verb, intransitive VV0t VV0 VV0 VB VB VB VB VB VB V indicative or non-finite uninflected verb, transitive VV0v VV0 VV0 VB VB VB VB VB VB V indicative or non-finite uninflected verb, transitive and intransitive VV0i VV0 VV0 VB VB VB VB VB VB VB subjunctive uninflected verb, intransitive VV0t VV0 VV0 VB VB VB VB VB VB VB subjunctive uninflected verb, transitive VV0v VV0 VV0 VB VB VB VB VB VB VB subjunctive uninflected verb, transitive and intransitive VV0i VV0 VV0 VB VB VB VB VB VB VM imperative uninflected verb, intransitive VV0t VV0 VV0 VB VB VB VB VB VB VM imperative uninflected verb, transitive VV0v VV0 VV0 VB VB VB VB VB VB VM imperative uninflected verb, transitive and intransitive VVDi VVD VVD VBD VBD VBD VBD VBD VBD VD verb, preterite, intransitive VVDt VVD VVD VBD VBD VBD VBD VBD VBD VD verb, preterite, transitive VVDv VVD VVD VBD VBD VBD VBD VBD VBD VD verb, preterite, transitive and intransitive VVGi VVG VVG VBG VBG VBG VBG VBG VBG VG present participle, intransitive VVGt VVG VVG VBG VBG VBG VBG VBG VBG VG present participle, transitive VVGv VVG VVG VBG VBG VBG VBG VBG VBG VG present participle, transitive and intransitive NN1u ?? ?? NN NN NN NN NN ?? VG present participle, acting as noun (LOB rules) (*34) NN1n ?? ?? NN NN NN NN NN ?? VG present participle, acting as noun (LOB rules), homograph of count noun (*34) JJ ?? ?? JJ JJ JJ JJ JJ ?? VG present participle, acting as adjective (LOB rules) (*34) VVGi VVG ?? 
VBG VBG VBG VBG VBG VBG VG "going" as present participle VVGK VVG VVGK VBG VBG VBG VBG VBG VBG VG "going" as catenative (going to) VVNi VVN VVN VBN VBN VBN VBN VBN VBN VN past participle, intransitive VVNt VVN VVN VBN VBN VBN VBN VBN VBN VN past participle, transitive VVNv VVN VVN VBN VBN VBN VBN VBN VBN VN past participle, transitive and intransitive JJ VVN VVN JJ JJ JJ JJ JJ ?? VN past participle acting as adjective (LOB rules) (*34) VVNv VVN VVNK VBN VBN VBN VBN VBN VBN VN "bound" as past participle VVNK VVN VVNK VBN VBN VBN VBN VBN VBN VN "bound" as catenative (bound to) VVZi VVZ VVZ VBZ VBZ VBZ VBZ VBZ VBZ V 3rd-person singular present form of verb, intransitive VVZt VVZ VVZ VBZ VBZ VBZ VBZ VBZ VBZ V 3rd-person singular present form of verb, transitive VVZv VVZ VVZ VBZ VBZ VBZ VBZ VBZ VBZ V 3rd-person singular present form of verb, transitive and intransitive YC , , YC YC , , , , <> comma YD - - YD YD *- *- *- -- <> dash YH <> <> <> <> <> <> <> <> <> hyphen YE ... ... YE YE ... ... ... <-> <> ellipsis YF . . YF YF . . . . <> full stop YN : : YN YN : : : : <> colon YND <> <> YND <> <> <> <> <-> <> colon-dash YPL ( ( YPL YPL ( ( ( ( <> left bracket YPR ) ) YPR YPR ) ) ) ) <> right bracket YQ ? ? YQ YQ ? ? ? . <> question mark YQL " " YQL YQL *' *' *' <-> <> left quotation mark, single or double (*30) YQR " " YQR YQR **' **' **' <-> <> right quotation mark, single or double (*30) YS ; ; YS YS ; ; ; . <> semicolon YX ! ! YX YX ! ! ! . <> exclamation mark <> <> <> <> <> AP$ AP$ AP$ ?? <> "other's" <> <> <> <> <> APS$ APS$ APS$ ?? <> "others'" <> <> <> <> <> CD$ CD$ CD$ ?? LX "two's" etc <> <> <> <> <> CD1$ CD1$ CD1$ ?? LX "one's" (*92) <> <> <> <> <> DT$ DT$ DT$ ?? <> "another's" <> <> <> <> <> NN$ NN$ NN$ NN$ NX possessive singular noun <> <> <> <> <> NNP$ NNP$ NNP$ ?? NX genitive capitalized common noun (*5) <> <> <> <> <> NNPS$ NNPS$ NNPS$ ?? 
<> genitive plural capitalized common noun (*5) <> <> <> <> <> NNS$ NNS$ NNS$ NNS$ NSX possessive plural noun <> <> <> <> <> NNU$ NNU$ NNU$ ?? <> "cwt's", etc (*10) <> <> <> <> <> NNUS$ NNUS$ NNUS$ ?? <> "c.c.s'" (*10) <> <> <> <> <> NP$ NP$ NP$ NP$ CX genitive proper noun <> <> <> <> <> NPL$ NPL$ NPL$ ?? <> <> <> <> <> <> NPLS$ NPLS$ NPLS$ ?? <> "Islands'" etc <> <> <> <> <> NPS$ NPS$ NPS$ NPS$ CSX genitive plural proper noun <> <> <> <> <> NPT$ NPT$ NPT$ ?? <> "Presidents'" etc <> <> <> <> <> NPTS$ NPTS$ NPTS$ ?? <> "Presidents'" etc <> <> <> <> <> NR$ NR$ NR$ ?? <> "tomorrow's", "Wednesday's", "west's", "home's", etc <> <> <> <> <> NRS$ NRS$ NRS$ ?? <> "Sundays'" etc (*10) <> <> <> <> <> OD$ OD$ OD$ ?? <> "sixth's" etc <> <> <> <> <> PN$ PN$ PN$ ?? QX "everybody's" etc <> <> <> <> <> RB$ RB$ RB$ ?? <> "else's" <> <> <> <> <> WP$ WP$ WP$ ?? <> "whoever's" <> ?? <-> <> <> &FW &FW &FW ?? NSX plural possessive nouns not containing apostrophe - "victor ludorum" <+> ?? ?? <+> NP NP NP NP <+> <+> common noun or non-noun forming part of proper name (*13) <+> ?? ?? <+> NPS NPS NPS NPS <+> <+> plural common noun forming part of proper name (*13) <+> <+> <+> <> NC NC NC NC <+> <+> cited word (*24) <+> FW FW <+> FW &FW &FW &FW <+> O foreign word (*4)(*24) <> FW FW <> <> &FW &FW &FW ?? O word in extended foreign sentence, given explicitly NP1x ?? ?? <> NP NP NP NP ?? <-> "Celsius", "Fahrenheit" etc after "degree()s", incl abbreviations <> MC-MC MC-MC <> <> CD-CD CD-CD CD-CD ?? <> hyphenated number (e.g. "1955-59") <> NN1$ <-> <> <> &FW &FW &FW ?? NX singular possessive nouns not containing apostrophe - "domini" etc. <> VM <-> <> <> <> <> <> ?? <> "let's" # --------------------------- # Notes *1. The rarity markers in the Lancaster wordlist don't correspond to the policy evidenced in our AP Corpus sample. The entries here follow AP. *2. Don't know what LOB or Lancaster policy is for number tagging in plural proper names (The "Radio Times" is going up again). *4. 
APR: Long foreign sentences - words not individually wordtagged. Odd foreign words treated as English. SUE: Foreign words found in the dictionary treated as English. *5. GLS quotes "Institution's" and "Associates'" as examples of NNP$ and NNPS$. These are wrong: NNP* is strictly for words which are always capitalized regardless of context, e.g. "Christianity's", "Englishmen's". *6. Not found in Leeds LOB Treebank. *7. Lancaster wordlist actually has RG% for "that" but only DD1 for "this". *9. Despite LOB, there's actually no reason why a foreign sentence in English text should necessarily be a quotation. *10. Not found in LOB. *11. Well I'm not aware of any departures from CLAWS1 practice in SEC, but I wouldn't promise there aren't any. *12. LOB has many cases of NNPS words tagged as NPS, incorrectly by O.B. *13. Lancaster (and LOB) are much freer with NP tags than SUSANNE/APRIL. Brown has changed policy since first published. *14. Gothenburg Corpus also Q and R, inconsistently. (R is rare for tokens with this sense, but more common than tokens of "one" where R is correct) *15. May be iffy between preposition and conjunction: "the Sawyer-Bushby model", "the Arsenal-Spurs match". *16. Presumably "ere", "till" and Scots "afore" belong here too. *18. "was once RT", "years past RT" in idiom list. *19. "Foremost" is explicitly listed in CLAWS2 as JJT to override JJ in the suffixlist, but probably an aberration, rather than a deliberate distinction, all the same. *20. Note that "utmost" doesn't even take the characteristic syntax of superlatives, i.e. a defining relative clause with unfilled valency slot. "Uttermost" is almost extinct, and probably doesn't have superlative syntax either. *21. Except for "sooner", "soonest", "oftener", these are all identical with adjectives. *22. RI category dispersed between RR RL in CLS2. *23. OSLO extensions to list of RP words are all place, but are split randomly (?) between RP RL in CLS2. *24. 
Brown uses ordinary wordtags with suffix -NC (cited word) or (and/or?) -FW (foreign word). *25. CLS2 policy on asyndetic coordination isn't clear at all. *26. I take it here that any "unit" noun can be plural without inflection. *27. CLS2 has a half-attempt at listing nouns with distributive senses, i.e. used with plural verb to refer to members of the group. *28. Uncoded in the Brown text. *29. Brown includes "past" and "single" in the AP category. *30. Quotation marks are omitted from the Tagged Brown Corpus. *32. "College", "School" etc, now NNJ* were previously NPL. "Co." was previously NNP. (Check these facts) *33. "further" is sometimes marked comparative, sometimes not, in both APR...CLS1 and GOTH, but we don't know if there's a correspondence. *34. There is some discrepancy between LOB rules and parsing scheme rules, but that is ignored here since neither scheme is all that well-defined. *35. Formal not functional categories. *37. Also appears as CC and CS in OSLO and LDS. *38. GLS not clear as to whether JK is used for all adjectival tokens, or only those with a following infinitive *39. Gothenburg Corpus uses JT also. # ------------------------- ### Outstanding queries ### *73. Don't know yet whether Ellegard counts "have sth done", "have sb do sth", "has to do with" as main or auxiliary. ### *74. Whatever that may mean. ### *77. These composite tags should be split, as the capitalisation is different for the non-S-related sense. ### *78. Probably we should have a separate group for adjectives/adverbs which take noun complements, but aren't really prepositions ("due", "pending", "near", etc) ### *79. Is our line same as LOB's ?? ### *80. SUE idiom tag needed, or something - these are clearly abnormal ### *81. Some NNP words have no articles or plurals and are much more like proper nouns. ### *82. Are distributive nouns ever tagged NNS?. ### *83. Check LANC use of NNJ is like CLS2. ### *84. Is "Is." unique? ### *88. 
Not yet clear what line Ellegard draws between P and J for "like". ### *89. Don't know whether Ellegard's W/Z line is consistent or usable. ### *90. Don't know if this use exists in Gothenburg Corpus. ### *91. Not sure about LOB tagging. ### *92. Where is "ones'" ?. ### *93. SUE tag should be changed to JJj. ### *97. What else is RN ?? Brown includes "indoors" (check) ### *98. I don't understand when any of these are conjunctions (i.e. why ICS)
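
For anyone wanting to load the table into a program, here is a minimal sketch in Python of splitting one row into its tag columns and its gloss. It rests on an assumption that is not stated in this message: that a complete row carries exactly ten whitespace-separated tag columns (one per classifying scheme) followed by free text. The name split_row and the constant N_TAG_COLUMNS are hypothetical, chosen only for illustration.

```python
# Assumption (not confirmed by the table itself): a complete row has ten
# whitespace-separated tag columns, followed by a free-text gloss.
N_TAG_COLUMNS = 10


def split_row(line, n_tags=N_TAG_COLUMNS):
    """Split one table row into (list of tags, gloss string)."""
    # Split on runs of whitespace at most n_tags times, so that the gloss
    # (which itself contains spaces) stays together as the final piece.
    parts = line.split(None, n_tags)
    tags = parts[:n_tags]
    gloss = parts[n_tags].strip() if len(parts) > n_tags else ""
    return tags, gloss


row = 'PNX1 PNX1 PNX1 PPL PPL PPL PPL PPL PPL RL "oneself"'
tags, gloss = split_row(row)
print(tags[0], tags[-1], gloss)
```

Rows that arrive with fewer than ten tag columns (the "inst.", "ult." row is one) would be mis-split by this sketch, with the start of the gloss landing among the tags, so such rows would need checking by hand.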