-rw-r--r-- | tagset.txt | 1080 |
1 files changed, 1080 insertions, 0 deletions
diff --git a/tagset.txt b/tagset.txt new file mode 100644 index 0000000..f0b9367 --- a/dev/null +++ b/tagset.txt @@ -0,0 +1,1080 @@ + FIELD MARKS FOR WEBSTER 1913 and CIDE + ===================================== +Tagset.web: + Explanations of the tags used to mark the Webster 1913 dictionary +and the CIDE (Collaborative International Dictionary of English). +Note that the list of tags used to mark the public domain version +of this dictionary is shorter than the full set described here. + If any tag is not listed here, it is either (1) one of the +"point" (font size) or "type" (font style) tags, which should be self-explanatory; or + (2) Is a functional field with no effect on the typography. + +Last modified March 12, 1999. + For questions, contact: + Patrick Cassidy cassidy@micra.com + 735 Belvidere Ave. + Plainfield, NJ 07062 + (908) 561-3416 or (908) 668-5252 +------------------------------------------------------------- +A separate file, webfont.asc, contains the list of the individual +non-ASCII characters represented by either higher-order hexadecimal +character marks (e.g., \'94, for o-umlaut) or by entity tags +(e.g., <root/, for the square root symbol.) +-------------------------------------------------------------- + Use of tags: + In the MICRA electronic version of the 1913 Webster, each part of +the entry headed by an entry word ("headword") is labeled so that no +part of the entry except some punctuation marks should be found +outside of all fields, i.e. every character should be within some tagged +field. In the following description, the word "segment" usually refers to +a major part of an entry such as an etymology or a definition or a +collocation segment or a usage block, containing more than one field. +The term "field" may also be used similarly to "segment", but may also +denote single-word fields, such as an alternative spelling, labeled <asp>. + + Note: The tags on this list are similar in structure to SGML tags. 
Each +tag on this list marks a field; each field opens with a tagname between +angle brackets thus: <tagname>, and closes with a similar tag containing +the forward slash thus: </tagname>. No tags are used without closing +tags. Thus the HTML <BR> to indicate a line break is symbolized +here as an entity, <br/, and every <p> has a corresponding </p>. + The absence of an end-field tag, or the presence of an end-field tag +without a prior begin-field tag constitutes a typographical error, of which +there may be a significant number. Any errors detected should be brought +to the attention of PJC or the appropriate editor. + Most of the tagged fields are presented in the text in italic type, +with a number of exceptions. Where a word is contained within more than +one field, the innermost field determines the font to be used. Wherever +recognizable functional fields were found, an attempt was made to tag the +field with a functional mark, but in many cases, words were italicised only +to represent the word itself as a discourse entity, and in some such cases, +the "italic" mark <it> was used, implying nothing regarding functionality +of the word. The base font is considered "plain". Where an italic field +is indicated, parentheses or brackets within the field are not italicised. + Where no font is specified for a tag, the tag is merely a functional +division, and was printed in plain font unless otherwise tagged. This type +of segment is marked by an asterisk (*) where the font name would be. + The size of the "plain" font in the original text is about 1.6 mm for +the height of capitalized letters. +============================================================= +Explicit typographical tags: + These were used where the purpose of a different font was merely to +distinguish a word from the body of the text, and no explicit functional +tag seemed appropriate. +----------------------------------- +Tag Font +----------------------------------- +Explicit formatting tags: +. . 
. . . . . . . . . . . . . . . . +<plain> plain font (that used in the body of a definition) -- + normally not marked, except within fields of + a different font. +<it> italic (in master files) +<i> italic (for use in HTML presentation) +<bold> bold (in master files) +<b> bold (for use in HTML presentation) +<colf> bold, Collocation font. Same font as used in collocations. + smaller This is used only in the list of "un-" words not + by 1 point actually defined in the dictionary. Probably could be + replaced by a segment mark for the entire list! + The "un-" words should be indexed as headwords. + +<ct> bold Same as <colf>, a font similar to that used in + collocations. However, this tag is used in a table + and could be set to a different font. + +<h1> * HTML tag -- largest heading font. + +<h2> * HTML tag -- second largest heading font. + +<headrow> * Marks a Row title in a table. + +<hwf> Font the same as the headword <hw>, though the field is + not a headword. Used only once. + +<mitem> * Multiple items, a set of items in a table. +<point ...> A series of point size markers, many unique. +<point1.5> * One of the tags of the form <point**> where ** +<point6> represents the typographic point size of the + enclosed text. +<pre> An HTML tag indicating that the enclosed text is + of teletype form, preformatted in a uniform-spaced + font. +<sc> small caps (used mostly for "a. d.", "b. c.") + This is the same font as <er>, but has no functional + or semantic significance +<str> group of table data elements in a table +<sub> subscript, like <subs> +<subs> subscript +<sups> superscript +<supr> superscript +<sansserif> Sans-serif font +<stypec> Bold (collocation font) and also a subtype. +<tt> HTML tag -- teletype font +<universbold> A squared bold font without serifs approximating the + "universe bold" font on the HP Laserjet4, slightly + larger than the capitals in a definition body. Used + in expositions describing shapes, such as + "Y", "T", "U", "X", "V", "F". 
+<vertical> Vertically organized column. +<column1> Vertically organized column -- only part of a table + which needs to be completed. Used once. +<...type> A series of tags, many unique, designating certain + unusual fonts, such as "bourgeoistype" for + "bourgeois type", in the section on typography. + Most of these occur only once, in the section on fonts. +<antiquetype> +<blacklettertype> +<boldfacetype> +<bourgeoistype> +<boxtype> +<clarendontype> +<englishtype> +<extendedtype> +<frenchelzevirtype> +<germantype> +<gothictype> +<greatprimertype> +<longprimertype> +<miniontype> +<nonpareiltype> +<oldenglishtype> +<oldstyletype> +<pearltype> +<picatype> +<scripttype> +<smpicatype> +<typewritertype> + +============================================================= +Tags with semantic content: +. . . . . . . . . . . . . . . . . . . . . . . . . . . +<altsp> * Alternative spelling segment. Almost always + contained within square brackets after the main + definition segment. Expository words + such as "Spelled also" are in plain font; + the actual alternative spelling is marked by + <asp> ... </asp> tags within this segment. + +<ant> italic Antonym. + +<asp> italic Alternative spelling. The actual word which is an + alternative spelling to the headword. These + are functionally synonyms of the headword. In + most cases these also occur as headwords, with + reference to the word where the actual definition + is found, but not all such words are listed + separately, particularly if the spelling is + close enough to the headword to be found at the + same point in the dictionary. Whether listed + separately or not, these words should + be indexed at this location, also. + +<au> italic Authority or author. Used where an authority is + (may be right- given for a definition, and also used for the + justified. See author, where a quotation within double quotes + in the section is given in the same paragraph as the + on formatting). definition. 
The double quotes are indicated + by the open-quote (\'bd) and close-quote + (\'b8). In both cases, it is typically + right-justified, almost always fitting on + the same line with the last line of the + definition or quotation. + Within collocation segments, it is usually + used only after quotations, and is not right- + justified, except occasionally where it + would be close to the right margin, and then + apparently is right-justified. We have + not explicitly marked those which are + right-justified, but they can be + recognized because they are on a line by + themselves, preceded by two carriage returns. + +<bio> * Marks a biography. Should be longer than + a short mention of who a person was, which + is typically included as a definition. + +<biography> * Same as <bio> + +<booki> italic Marks the name of a book, pamphlet, or similar + document. + +<branchof> * A field of knowledge of which the headword + is a division. + +<caption> * Caption of a figure or table. + +<cas> * tags the CAS (Chemical Abstracts Service) registry + number for a chemical substance. + +<causes> italic tags the infectious disease caused by the headword. + Implied type of the agent is a microorganism, and + the tag must mark a disease. + +<causesp> * Same as <causes> without the italic type. +<causedbyp> * Same as <causedby> without the italic type. + +<causedby> italic inverse of causes: tags the causative agent of an + infectious disease, which is the headword. + The tag must mark a microorganism, virus, or + prion, and the implied type of the headword is + a disease. + +<centered> Used only for the single letter in the headers to each + letter of the alphabet. + +<city> * marks the proper name of a city. Used only + occasionally and not consistently at this stage. + +<cnvto> italic Converted to: used to tag substances which are + products prepared by conversion from the + headword. Usually chemicals or complex + products from natural materials. Rarely used + up to 1998. 
+ +<colheads> * List of heads for the columns of a table. + +<coltitle> * Title of a column in a table. + +<comm> * Comment -- differs from <note> in being in-line with + the definition paragraph. Provides a little + additional information. + +<company> * Name of a company (commercial firm). Compare <org> + +<compof> italic Composed of. Tags a substance of which the + headword is at least partly composed. The + substance may be particulate, such as + diatoms composing diatomaceous earth. + +<contains> * marks an object contained within the headword. + +<contr> italic Contrasting word. Not exactly an antonym, which + is marked <ant>, but a contrasting word which is + often introduced as "opposite to" or "contrasts + with". + +<country> * Name of a country (nation) of the world. + +<cref> italic Collocation reference. A reference to a collocation. + Each such collocation should have its own entry, + marked by <col> ... </col> tags, and these + references should function as hypertext buttons + to access that entry. + +<date> * A Date, of any type, e.g. <date>Dec. 25</date>. + +<datey> * Date-with-year tags a date containing a year. + +<def> * definition. The definition may have subfields, + particularly <as> (an illustrative phrase + starting with "as" or "thus" and containing + the headword (or a morphological derivative)). + The <mark>, \'bd...\'b8 quotations (left and + right double quotes) and <au> fields may be + found within a definition field, but should be, + and usually are, located outside the definition + proper. The marking macro was + inconsistent in this placement, and the + exclusion of the <mark>, <au> and quotations + needs to be completed by the proof-readers. + Certain definitions contain <pos> + fields within them, where the headword is + an irregular derivative of another headword. + In these cases, the <pos> field follows + immediately after the <def> tag, and these + entries do not have a separate <pos> field. 
+ In such cases, the <pos> field is italic, as + usual. + +<divof> * Division of the headword, usually an organization. + E. g. a faculty or department of a university, + or a United Nations agency. + +<edi> * Marks an education institution, a subtype of + organization. + +<emits> * tags a physical object or form of radiation + emitted by the headword + +<figure> Just a place-holder for illustrations, but seldom used. + +<film> italic Marks the name of a movie film. + +<fld> italic Field of specialization. Most often used for + Zoology and Botany, but many "fields of + specialization" are marked for technical + terms. The parentheses are usually within this + field, but are not themselves in italics. + +<geog> * Name of a geographical region of any size; + if applicable, the more specific <city>, + <state>, or <country> are preferred. + +<hypen> * Hypernym. Points to the hypernym from WordNet 1.5. + Initially, used only for entries extracted + from WordNet 1.5. Not present in the original + 1913 version. + +<illu> * Illustrative usage -- mostly from WordNet, and placed + outside the definition, in contrast to <as> usage. + These should be converted to <as>...</as> illustrative + usage format for consistency. + +<illust> * Illustration place-holder. Seldom used. +<img> * HTML usage -- points to an image file, usually + .gif or .jpg. These have no closing tag, and + will appear as errors in parsing. +<intensi> * Points to a word whose meaning is an intensified + form of the headword. Taken from WordNet + tags, used with some adjectives from WordNet +<item> * Designates one item in a row of a table. Used only when + intervening spaces do not serve properly as natural + field separators. +<itran> italic Translation into a foreign (non-English) language + of the previous word in the text -- italic font. + (<sig> is a translation into English) +<itrans> italic Same as <itran> +<jour> * Title of a journal (periodical). +<matrix> * Always a filled rectangular array. 
+<matrix2x5> * A 2x5 matrix (2 rows by 5 columns). +<mstypec> * Multiple synonymous subtypes -- used in + def. of "grass". +<mtable> * Multiple table, encloses <table> figures. +<musfig> * Music figure. Only in a note under the entry "Figure", + the two numbers of each such field + are bold, 20 point type, stacked as in a fraction with + a bar between them, but also having a horizontal stroke + midway through each numeral. Unique to this entry. +<p> * paragraph tag, used always in pairs. Line breaks may + be embedded inside the paragraphs. +<person> * marks the proper name of a person. Used only + occasionally, but should be used more frequently + for cases where first names are abbreviated, + to reduce ambiguity of the period for automatic + analysis. Where a title is given, prefixed + or postfixed, it is included in this tag. + +<persfn> * marks the name of a person, when only one name + (usually the last name) is given. Not used + consistently where it should be. + +<publ> * Marks the name of a publication other than book, + which is marked by <booki>. It is often a + magazine or journal. +<qpers> * Tags the name of a person who is speaking, + within a quotation. +<qperson> Same as <qpers> +<cp> * Collocation, plain text -- used to tag phrases that + should be parsed as a unit, but has no typographical + significance. +<qau> italic Always right-justified, as described for <au>. +<ref> * A reference to a word in the vocabulary. +<refs> * Marks the set of references used for a longer article + such as a biography. +<river> * Marks the name of a river -- a proper name +<rj> * Right justified +<row> * Designates a row in a table. +<state> * Name of a geopolitical state, the first subdivision of + a country. Includes, e.g. Canadian provinces. +<subtypes> * Lists subtypes of the headword. +<sup> * superscript +<supr> * Supra. The two parts of each such field + are stacked, one over the other, *without* a + horizontal bar between (as in a fraction). 
+ Used only in one entry, for a musical notation. +<table> * Always a filled rectangular array, having <row> and <item> + elements. +<td> * Table datum - one cell in a table +<th> * Table header +<tradename> * Tags a commercial Trade name +<ttitle> * Table title (Larger than normal font) +==================================================================== + +Functional Tags +-------------------------------------------------------------------- +Tag Font Meaning + (Comparatives are relative to the plain font.) +----------------------------------------------------------------------- +<-- --> * Comment, not a tag. These segments should be deleted + from the written or printed text. + Page numbers of the original text are indicated + within such comments; these may be left in, if + desired. + +<! !> * HTML-style comment. Used to indicate page numbers + in the public domain version. + +<abbr> italic Tag for abbreviations, when mentioned within + the definition text. + +<adjf> small caps Tags for the actual adjective or adverb + comparatives or superlatives. Should be + indexed. See also conjf (verbs) and + decf (nouns). + +<altname> italic Alternative name. Usually for plants or animals, + but also used for other cases where words + are introduced by "also called", "called also", + "formerly called". These are functionally + *synonyms* for that word-sense. + +<altnpluf> italic Same as <altname>, but the marked word is a + plural form, whereas the headword is singular. + +<amorph> * Adjective morphological segment, primarily + the comparative and superlative forms. + The occasional adverb morphology is + also tagged this way. + +<as> * A segment occurring within the definitional + sentence, providing an example of usage of + the headword. Not conceptually a part of the + actual definition. + +<cd> smaller spacing Collocation definition. Similar in structure + to headword definitions (the <def> field). May + contain an <as> field. 
Plain type, but with + closer spacing than main definitions. + +<col> bold, Collocation. A word combination containing the + smaller by headword (or a morphological derivative). + 1 point The collocations do not have an explicitly + marked part of speech. + See also <ecol>, tagging embedded collocations. + +<colp> Collocation, no typographic significance. + Used to mark a word combination defined in + the dictionary without effect on font. + +<conjf> small caps The conjugated (non-infinitive) forms of + verbs. imp. & p. p. is common, as well as + p. pr. & vb. n. Irregular variants of + these are less common. Words in this + field perhaps should be indexed. + +<cs> smaller Collocation segment. The font and size is + vertical normal in a cs, but the spacing between lines + spacing is smaller (0.9 mm between lower-case letters, + rather than 1.1 mm in the main body of the + definition). For an on-line dictionary, + reproducing this typography is probably + pointless. + +<decf> small caps Declension form. The actual morphological + variants of nouns or pronouns. Should + be indexed. + +<ecol> * Embedded Collocation. A word combination + containing the headword (or a morphological + derivative), embedded within a definition + without a separate definition of its own. + These collocations should be defined + implicitly by the text of the definition in + which they are embedded. + See also <col>, tagging explicitly defined + collocations. + +<ent> Bold Entry field. Gives the headword without accent or + syllabication marks, and with special-character + symbols converted to their nearest ASCII + equivalents. Can be used without conversion + as the string that serves as the index word + for that entry. + +<er> Small Caps Entry reference. References to headwords + within the "etymology" section are in small + caps. Such references also occur + in the body of definitions, and in "usage" + segments. 
+ Such entry references should function as hypertext + buttons to access that entry. + +<ety> * Etymology. Always contained within square + brackets. Normal type is used for explanatory + comments, and italics for the actual words + (marked <ets>) considered as etymological + sources. + +<ets> italic Etymological source. Words from which the + headword was derived, or to which it is related. + The Greek words within an etymology segment + are invariably etymology sources, and should + be marked as such, but are not so marked, + even in the rare cases where the Greek word + transliteration has been written in. + +<etsep> italic Etymological source, being the name of a person + or geographical location which is the eponym + for the concept. This is used to distinguish + eponymous etymologies from others, and can also + be found in the body of a definition or note, + not only in the etymology field. Very few + of the names that should be marked this way + have actually been so marked, as of version + 0.42. In cases where such eponymous names + have not yet been thus marked, they will + usually be marked by <xex>, the non-semantic + italic-font marker, or, in etymologies, by + <ets>. + +<ex> italic Example. An example of usage of the headword, + usually found within an <as> or <note> segment. + +<fr> * Frequency of use, ordinal rank. This is used for + WordNet entries, in which the synonyms + were ranked in order of frequency of use. + <fr>1</fr> indicates that the headword is the + first word on the list of synonyms. + +<fu> * First use. A date at or around which the first + use of this word in writing is recorded. + Not in the original 1913 Webster, and usu. + taken from a recent dictionary. Only a few + such fields have been entered as of version + 0.41 + +<grk> transliteration Greek. The Greek words have been transliterated + using the equivalents explained in the + file "webfonts.asc". 
In most cases, the + transliterations are typical for Greek + letters, except for theta (transl = q), + phi (transl. = f), eta (transl. = h), and + upsilon (transl. = y, whether pronounced + as y or u). This was to eliminate any + ambiguity. These words occur primarily + in etymologies, and to conform to the + usage of <ets> should also be marked + by <ets>, but as of version 0.41 they + are not usually thus marked. + +<hw> bold, headword. Each main entry begins with the <hw> + larger by mark, and ends at the next <hw> mark. The + 2 points main entries are not otherwise explicitly + marked as a distinctive field. + The same word may appear as a headword + several times, usually as different parts + of speech, but sometimes with different + entries as the same part of speech, presumably + to indicate a different etymology. + Within the hw field the heavy accent is + represented by double quote ("), the + light accent by open-single-quote (`), + and the short dash separating syllables by + an asterisk (*). A hyphen (-) is used to + represent the hyphen of hyphenated words. + +<mark> italic, Usage mark. Almost always within square + brackets, occasionally in parentheses or + without any bracketing. + but The most common usage marks, + explanatory "Obs." = obsolete "R." = rare, "Colloq." = + may be plain. colloquial, "Prov. Eng." = Provincial England, + etc. are in italics. Some usage notes are also + marked with <mark>, but are in plain. For + simplicity, all words in this field may be + italic, until additional explicit marks are + added. + +<markp> * A usage mark in plain type (not italic). Found + within a definition, when there are more than + one sense-number listed. "Fig." at the head + of an entry is the most common case. + +<mcol> * Multiple collocation. Similar to multiple + headword, when two or more collocations share + one definition; however, the two collocations + are in-line, rather than stacked or justified. 
+ There may be "or" or "and" words + (italicised), or an "etc." (plain type) + within this field. In many cases, the + <or/ and <and/ entities are used to + signify the change of font for these words. + +<mhw> * Multiple headword. This field is used where + more than one headword shares a single + definition. In the dictionary, the + (usually) two headwords are left-justified + one below the other in the column, and are + tied together on the right side of the + headwords by a long right curly brace. + This division is strictly functional, + for analytical purposes, and does not + affect the typography. + +<nmorph> * Noun morphology section. Rarely used, mostly + for irregular personal pronouns. + +<note> * Explanatory note. No explicit font is indicated. + These segments may be separate, as in the + separate paragraphs starting <note><hand/, + or they may just be further explanation within + (or more usually, following) the main + definition paragraph. Typographically, + the notes following the main definition may + not be distinguishable from additional + sentences appended to the first sentence + of a definition. + +<plu> * Plural. The "plural" segment starts with a + "pl." which is italicised, but in this + segment is not otherwise marked as + italicised. Other words occurring in this + segment are plain type. The "pl." can be + easily explicitly marked if necessary. + +<pos> italic Part of speech. Always an abbreviation: e.g., + n.; v. i.; v. t.; a.; adv.; pron.; prep. + Combinations may occur, as "a. & n.". + +<epos> * Part of speech, referring to words in + etymologies, normal type. Always an + abbreviation, as in <pos> above + Combinations may occur, as "a. or n.". + +<plw> small caps Plural word. The actual plural form of the word, + found within a <plu> segment. + +<pr> * pronunciation. The default font is normal, but + many non-ASCII characters are used. + The pronunciation field may have more than + one pronunciation, separated by an "<or/". 
+ (An "or" here is in italic, and usually is + represented by the entity <or/). + There may also be some commentary, such as + "Fr."(French pronunciation) or "archaic". + The commentaries are typically italic, and + should be marked as such. In certain + pronunciations there is a numbered reference + to a root form explained in an introductory + section on pronunciation. + Very few of the pronunciation fields have + been filled in. The pronunciation markings use + a more complicated method than more modern + dictionaries. It would be interesting to have + these fields filled in, if there are any + volunteers willing to do it. + +<q> smaller by Quotation. No bracketing quotation marks, + two points, though occasionally \'bd-\'b8 quotations occur + centered, within these quotations. These quotations + Separate tend to be more complete sentences, rather + paragraph than just phrases, such as are contained + within quotation marks within the definition + paragraph. + +<qau> italic, Quotation author. Used only for the quotations + right justified marked with <q> that are centered in their + own paragraphs. + +<qex> italic Quotation example. An example of usage of + the headword, within quotations marked + by <q>..</q> tags. + +<sd> italic Subdefinition, marked (a), (b), (c), etc. These are + finer distinctions of word senses, used + within numbered word-sense (for main entries), + and also used for subdefinitions within + collocation segments, which have no numbering of + senses. The letter is italic, the parentheses + are not. This tag is also used to indicate the + lettered subdefinition when it is referred to + at another point in the text. + +<ship> italic The name of a ship. Rarely used. + +<sing> * Singular. Analogous to the <plu> segment, but more + rarely used, mostly for Indian tribes, which + are listed in the plural form. + +<singw> small caps Singular word. The singular form of the + plural-form headword. + +<sn> bold, Sense number. 
A headword may have over 20 + larger by different sense numbers. Within each numbered + 2 points sense there may be lettered sub-senses. See + the <sd> (sub-definition) field. + +<source> italic Source. The author of the definition. Used only + for definitions not originally present in + Webster 1913, and not present in the original + version intended to mimic the 1913 printed + dictionary. This source is used for each + word sense, and may differ for different + senses of a word, especially where a Web1913 + definition was substantially modified, or a + new word sense was added to a previously + defined word. + +<syn> plain Synonyms. A list of synonyms, sometimes followed + by a <usage> segment. + +<usage> narrower Comparisons of word usage for words which are + spacing sometimes confused. As with collocation segments, + font is plain, but spacing is smaller than + normal definition spacing. This seems pointlessly + complicating for an on-line display. + +<ver> * Verified for current accuracy by a technical editor, + without changes. + +<vmorph> * Verb morphology (conjugation) segment, delimited + by square brackets. + +<wordforms> * Morphological derivatives not contained in the + bracketed segments, as above. For nouns + derived from adjectives, adverbs from + adjectives, etc. This segment is usually + found at the end of the main entry. The + adverbial and nominalized derivatives at the + end of a main entry are usually introduced + by an em dash [represented as two hyphens (--)]. + +<wf> bold, Same font as <hw>, with accents and syllable + larger by breaks marked as in the headword. + 2 points Marks the actual morphological forms within + a <wordforms> segment; typically, adverbial or + nominalized form of an adjective. + + +<def2> * Second definition (occasionally, a third definition is + present). This is used where a second or third + part of speech with the same orthography is + placed under one headword. 
Within this segment, + there will be a <pos> field, and sometimes + a <mark> and/or a quotation. + +<specif> * "Specifically:" Used to mark the words "specifically", + "Hence", "as" which are used to introduce a second + definition typically more specific than the first, + but in general derived by extension of the initial + definition. This functions as a warning of multiple + definitions where the sense-numbers are not explicitly + used. It is also useful in separate senses, to + tag polysemous definitions which may be + specializations or generalizations of the preceding + definition. + +<pluf> italic. Plural form. + Used exclusively to mark the "pl." abbreviation, + which introduces a definition for the headword, + *when used in the plural form*. Not related to + <plu>, which spells out the plural form, but does + define it. + +<uex> italic Usage example. Used only a few times, within + <usage> segments. + +<isa> italic supertype (hypernym) the inverse of <stype> and + identical to <hypen> but not derived from WordNet. + +<chform> plain, Chemical formula. The letters are plain font, + numbers but the numbers are subscript. This is mostly + subscript useful as a functional mark to pinpoint + chemicals. + +<chformi> plain, Chemical formula same as <chform>, but not + processed specially by the tag-converter program. + The letters are plain font, but the numbers are + subscript. + Used in place of <chform> when the formula has + a tag inside, which cannot now be processed by the + <chform> processing routine. + +<chname> * chemical name. Used to allow a IUPAC chemical + name to be processed as a unit in spite of + embedded dashes, parentheses, and commas. + +<see> * "see" reference to related words, outside of the + main <def>definition</def> field. + +<mathex> italic Mathematical expression. In this dictionary, + essentially all letters (used as variable labels) + in math expressions are in italic font. 
+ The "+" and "-" may also appear typographically + different from elsewhere in the dictionary. + +<ratio> italic Also a mathematical expression, but the colon and + double colon may have a different typography + than usual, as in <ratio>a:b</ratio> + +<singf> italic Singular form. Analogous to <pluf>, to define + the singular word where the headword is the + plural form. ** only modifies the word "sing." + +<mord> * Morphological derivation. Used to mark the + entry-reference portions of those + entries which are defined as morphological + derivatives (plural, p. p., imp.) of other + headwords. Used just as an attempt to + mark and regularize the entry format. + May be ignored typographically. + +<fract> a stack, Fraction. Used for non-numerical fractions + with which cannot be expressed as a <frac12/-style + numerator, entity. The forward slash "/" is to be + horizontal interpreted as a horizontal line separating + bar, and the numerator and denominator. + denominator + +<exp> superscript, Exponential. Used in mathematical expressions. + smaller + font. + +<xlati> italic Translation (e.g. of Greek), in the body of a + definition or etymology. Used only twice. + +<tran> italic Word translated: the word in italic is translated + by a subsequent word. Usually in etymologies, where + the word translated is not actually etymologically + related to the headword. The translated word + is not necessarily English. + +<tr> italic translation of the preceding word (or of the + headword) into English. + +<fexp> * Functional expression (math). The function names are + in plain type, the variables are italic. + +<iref> italic Illustration reference. Used only occasionally, not + yet (v. 0.41) consistently. + +<figref> italic Figure reference. + +<figcap> * Figure caption. + +<figtitle> * Figure title. + +<funct> * tags a mathematical function or expression. + +<chreact> * Chemical reaction. 
Similar to chemical formulas (which + are contained but not explicitly marked), with + some other symbols. + +<ptcl> italic Verb Particle. Only a few particles were actually + marked, but in a future version more may be. + +<tabtitle> ? Table Title. Used only once. + +<title> italic Title of a literary work, movie, opera, musical + composition, etc. Used rarely but should be + used in every case, except in <au> references. + +<root> * Square root -- differs from the entity <root/, + which is a square root sign that does not extend + beyond the number following it. The <root> + field has a bar (vinculum) over the expression + within the field, as well as the square root symbol + preceding the expression in the field. Used only + once. + +<vinc> * Vinculum. In a mathematical expression, a bar + extending over the expression within the field. + Used only once. This apparently serves the same + function as parentheses, of causing the + expression within the field to be evaluated + and the result used as the (mathematical) value + of the field. + +<nul> plain Nultype. An older version of <plain>. + +<cd2> * Second collocation definition. Somewhat similar to + <def2>. Purely a mark to reduce functional ambiguity, + with no effect on the typography. + +<hypen> * Hypernym. Mark introduced for the World Wide Webster, + when adding words from WordNet. In most cases, this + tag marks the WordNet hypernym (for nouns and verbs). + Where the <au> mark is PJC or includes a +PJC, the + hypernym may not be the same as in WordNet. The words + marked by this tag need to be bracketed in some way, + but this is deferred until the definitions included + with the hypernyms have been deleted, and other + disambiguating marks substituted. + +<stype> italic Subtype. A functional mark, to point out words which + are conceptually subtypes of the headword. + +<styp> * Subtype. 
A functional mark, to point out words which + are conceptually subtypes of the headword, but + with no *typographical* significance. + +<simto> * Similar-to. A semantic relational mark for + closely related words which are not quite + synonyms, nor hypernyms, nor hyponyms. Introduced + with WordNet data. + +<conseq> or <hascons> * Consequence. For adjectives, an attribute which + is a consequence of possessing the headword attribute. + Introduced with WordNet data. + +<consof> * Consequence of. For adjectives, an attribute which + implies the headword as a natural consequence. + +<part> italic Part. Marks a word designating something which is + conceptually a part of the headword. Rarely used. + +<parts> italic Part, plural form. Same as <part>, but marks the + name of the part in its plural form. + +<partof> * Marks a word designating something of which the headword + is conceptually a part. Inverse of <part>. + This is very broad, and may mean constituent or + separable part. + Rarely used. + +<contxt> * Context. Used only for introductions to definitions, + giving the context of usage, which are not part + of the definition proper, as: + <contxt>when used of a person:</contxt> + +<grp> * Marks the name of a group of people not formally + organized. + +<membof> italic marks a group of which the headword is a member. + This is rarely used, but should be indexed as + an entry word or phrase. + +<member> italic marks a member of a group defined by the headword. + This is rarely used, but should be indexed as + an entry word or phrase. + +<members> italic Same as <member>, but marks a plural word, + designating the name of the members in its plural form, + to avoid ambiguity. + +<method> * Designates a special type of definition which + describes a method for achieving the headword; + used only once, for the word "amend". The + subdefinitions begin with "by". + +<corpn> * Name of a business company, corporation, or partnership. + Started using November 1988. 
Rare. + +<corr> italic Correlative. A word intimately associated with the + headword in a manner such that one cannot + appear without the other. Not exactly an inverse. + +<qperson> italic marks the name of a person, quoted in a dialogue. + Used only in <q> blockquotes as of vers. 0.45. + +<org> * marks the name of an organization; sometimes used + for the names of groups of people not + formally organized (see also <grp>). + +<prod> italic produces. Designates a substance produced by + a living organism. Rarely used. + +<prodp> * produces (plainfont). Designates a substance + produced by a living organism. Same as <prod>, + but does not affect font. Rarely used. + +<prodby> * produced by. Designates a living organism which + produces the headword substance. Rarely used. + +<prodmac> italic produces. Designates an object or substance produced + by a machine or process. Rarely used. + +<stage> italic life stage of an organism. Used to indicate + variant forms of an organism defined by the + headword. Rarely used. + +<stageof> * an organism one of whose life stages is the headword. + Inverse (correlative) of <stage>. Rarely used. + +<inv> italic inversely related to headword -- e.g. depository + is the inverse of depositor; buyer is the inverse of + seller. Called "correlative" in the Webster 1913 and + the CIDE. Rarely used. + +<methodfor> italic is a method to accomplish the action defined by + the headword. Rarely used, and only in the + supplemental section. + +<examp> italic example or instance of the headword, where the + tagged and emphasized word is not a proper subtype. 
+-------------------------------------- +<p><hw>Pa*ron"y*mous</hw> <p><sn>2.</sn> <def>Having a similar sound, but different orthography and different meaning; -- said of certain words, as <examp>all</examp> and <examp>awl</examp>; <examp>hair</examp> and <examp>hare</examp>, etc.</def><br/ +[<source>1913 Webster</source>]</p> +------------------------------------- + +<sfield> * subfield of the headword, which must be a field + of study or of knowledge +<stage> italic a stage of life of the headword -- for living things, + such as insects, whose life stages may take different + names. + +<unit> italic a unit of measure, usually preceded by a number. + Also used to tag the unit of a measure which is the + headword. + +<uses> italic tags a tool or method used by the headword, + which is usually some process. + +<usedfor> * tags a method or process for which the headword + is a tool. + +<usedby> italic tags a tool or method which uses the headword, + which is usually a physical object. + +<perf> italic performs -- tags a word which is a process or + activity performed by the headword. + +<recipr> italic reciprocal -- used for cases where the tagged word + is a reciprocal participant in an action, such as + donor and recipient. The difference between this and + <inv> inverse has not yet been systematically settled. + Used seldom, and mostly in the supplemented version. + +<sig> italic significance, meaning -- used in definitions where the + actual meaning is prefixed with commentary explaining + usage or other attributes of the word, as with + prefixes or suffixes. + +<wns> italic WordNet sense. Where known, the correspondence of the + sense of an entry with that of WordNet 1.6 is + given after the definition, in a tag of the + form: <wns>[wns=3]</wns>, in which the number + is the numbered sense in WordNet. + +<w16ns> italic WordNet version 1.6 sense. See <wns> for + explanation. +<wnote> * A note related to usage in the corresponding + WordNet definition. 
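The <examp> and <wns> fields described above have a regular enough shape that they can be pulled out of an entry with simple pattern matching. The following is only a minimal sketch, not part of the MICRA tooling: the function names and regular expressions are assumptions made for illustration, and the sample string is hypothetical (it reuses the <examp> fields from the Paronymous entry above and appends a <wns> field in the form given in the <wns> description).

```python
import re

# Assumed patterns, based only on the tag forms shown in the descriptions
# above: <examp>word</examp> and <wns>[wns=3]</wns>.
EXAMP_RE = re.compile(r"<examp>(.*?)</examp>")
WNS_RE = re.compile(r"<wns>\[wns=(\d+)\]</wns>")


def examples(entry):
    """Return the text of every <examp> field, in document order."""
    return EXAMP_RE.findall(entry)


def wordnet_senses(entry):
    """Return the WordNet sense numbers found in <wns> fields, as ints."""
    return [int(n) for n in WNS_RE.findall(entry)]


# Hypothetical sample: the <wns> field is appended for illustration only.
sample = (
    "<def>Having a similar sound, but different orthography and different "
    "meaning; -- said of certain words, as <examp>all</examp> and "
    "<examp>awl</examp>; <examp>hair</examp> and <examp>hare</examp>, "
    "etc.</def> <wns>[wns=3]</wns>"
)

print(examples(sample))
print(wordnet_senses(sample))
```

Regular expressions are enough here because these fields do not nest within themselves; a field that can contain other tags (such as <def>) would need a real tag-aware parser instead.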
+ ============================================================= +Biological classifications: +--------------------------- +<spn> italic Species name. Used to mark the taxonomic names + of living things which are represented in + italic font in the original printed version. + Originally, not only species, but genera, orders and + families were also thus marked. The conversion from + <spn> to <fam>, <gen>, or <ord> is not completed, and + <spn> may still be found marking such groups. + However, orders and families are also frequently + mentioned in the original in normal font, and in such + cases are not marked with any tag. So, this mark + is not a reliable indicator of all mentions of + taxonomic names. +<kingdom> italic Taxonomic biological Kingdom name. +<phylum> italic Taxonomic phylum name. +<subphylum> italic Taxonomic subphylum name. +<class> italic Taxonomic class name. +<subclass> italic Taxonomic subclass name. +<ord> italic Taxonomic order name. + Also used for suborders, initially. +<subord> italic Taxonomic suborder name. +<suborder> italic Taxonomic suborder name. +<fam> italic Taxonomic family name. Also used to tag "tribes". +<subfam> italic Taxonomic subfamily name. +<gen> italic Taxonomic genus name. +<var> italic Variety. Used to mark subspecies or varieties below + the level of species in living organism systematic + names. + +<varn> italic Variety. Used to mark subspecies or varieties below + the level of species in living organism systematic + names. Duplicative variant of <var> + + |