1 files changed, 1080 insertions, 0 deletions
diff --git a/tagset.txt b/tagset.txt
new file mode 100644
index 0000000..f0b9367
--- a/dev/null
+++ b/tagset.txt
@@ -0,0 +1,1080 @@
+ =====================================
+ Explanations of the tags used to mark the Webster 1913 dictionary
+and the CIDE (Collaborative International Dictionary of English).
+Note that the list of tags used to mark the public domain version
+of this dictionary is shorter than the full set described here.
+ If any tag is not listed here, it is either (1) one of the
+"point" (font size) or "type" (font style) tags, which should be self-explanatory; or
+ (2) a functional field with no effect on the typography.
+Last modified March 12, 1999.
+ For questions, contact:
+ Patrick Cassidy
+ 735 Belvidere Ave.
+ Plainfield, NJ 07062
+ (908) 561-3416 or (908) 668-5252
+A separate file, webfont.asc, contains the list of the individual
+non-ASCII characters represented by either higher-order hexadecimal
+character marks (e.g., \'94, for o-umlaut) or by entity tags
+(e.g., <root/, for the square root symbol.)
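The two kinds of character marks described above could be decoded roughly as follows. This is only a sketch: the tables hold just the marks this document itself mentions (\'94, \'bd, \'b8 and <root/), and the names HEX_MARKS, ENTITIES and decode_marks are hypothetical. The complete list is in the separate character file and is not reproduced here.

```python
import re

# Hypothetical partial tables: only the marks described in this document.
HEX_MARKS = {
    "94": "\u00f6",  # o-umlaut
    "bd": "\u201c",  # open double quote
    "b8": "\u201d",  # close double quote
}
ENTITIES = {
    "root": "\u221a",  # square root symbol
}

# A hex mark is a backslash, an apostrophe, and two hex digits.
HEX_RE = re.compile(r"\\'([0-9a-fA-F]{2})")
# An entity tag is "<name/"; closing tags start with "</" and never match.
ENTITY_RE = re.compile(r"<([A-Za-z0-9]+)/")


def decode_marks(text):
    """Replace known marks with Unicode; leave unknown marks untouched."""
    def hex_sub(m):
        return HEX_MARKS.get(m.group(1).lower(), m.group(0))

    def ent_sub(m):
        return ENTITIES.get(m.group(1), m.group(0))

    return ENTITY_RE.sub(ent_sub, HEX_RE.sub(hex_sub, text))
```

Marks missing from the tables pass through unchanged, so a partial table does no harm to the rest of the text.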
+ Use of tags:
+ In the MICRA electronic version of the 1913 Webster, each part of
+the entry headed by an entry word ("headword") is labeled so that no
+part of the entry except some punctuation marks should be found
+outside of all fields, i.e. every character should be within some tagged
+field. In the following description, the word "segment" usually refers to
+a major part of an entry such as an etymology or a definition or a
+collocation segment or a usage block, containing more than one field.
+The term "field" may also be used similarly to "segment", but may also
+denote single-word fields, such as an alternative spelling, labeled <asp>.
+ Note: The tags on this list are similar in structure to SGML tags. Each
+tag on this list marks a field; each field opens with a tagname between
+angle brackets thus: <tagname>, and closes with a similar tag containing
+the forward slash thus: </tagname>. No tags are used without closing
+tags. Thus the HTML <BR> to indicate a line break is symbolized
+here as an entity, <br/, and every <p> has a corresponding </p>.
+ The absence of an end-field tag, or the presence of an end-field tag
+without a prior begin-field tag constitutes a typographical error, of which
+there may be a significant number. Any errors detected should be brought
+to the attention of PJC or the appropriate editor.
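A check for the two kinds of tagging error just described (a field that is opened but not closed, or closed but not opened) might look like the sketch below. The function name find_tag_errors and the error labels are hypothetical, and the check is stricter than the text requires: a closing tag that does not match the innermost open field is also reported, so paired but improperly nested tags will show up as errors.

```python
import re

# Matches <name> and </name>. Entity marks such as <br/ are skipped because
# the name is not followed directly by a closing angle bracket.
TAG_RE = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9.]*)>")


def find_tag_errors(text):
    """Return (kind, tagname, offset) tuples for unbalanced field tags."""
    errors = []
    stack = []  # currently open fields as (name, offset), outermost first
    for m in TAG_RE.finditer(text):
        closing = m.group(1) == "/"
        name = m.group(2)
        if not closing:
            stack.append((name, m.start()))
        elif stack and stack[-1][0] == name:
            stack.pop()
        else:
            errors.append(("unexpected close", name, m.start()))
    # Anything still open at the end of the text was never closed.
    errors.extend(("unclosed", name, pos) for name, pos in stack)
    return errors
```

An empty list means every begin-field tag found its end-field tag.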
+ Most of the tagged fields are presented in the text in italic type,
+with a number of exceptions. Where a word is contained within more than
+one field, the innermost field determines the font to be used. Wherever
+recognizable functional fields were found, an attempt was made to tag the
+field with a functional mark, but in many cases, words were italicised only
+to represent the word itself as a discourse entity, and in some such cases,
+the "italic" mark <it> was used, implying nothing regarding functionality
+of the word. The base font is considered "plain". Where an italic field
+is indicated, parentheses or brackets within the field are not italicised.
+ Where no font is specified for a tag, the tag is merely a functional
+division, and was printed in plain font unless otherwise tagged. This type
+of segment is marked by an asterisk (*) where the font name would be.
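The font rules above can be sketched as a small lookup. The table below is a hypothetical excerpt using fonts listed elsewhere in this document, with None standing for the asterisk. One assumption goes beyond the text: a tag with no font of its own is taken to leave the font of the enclosing field unchanged, falling back to plain when nothing else applies.

```python
# Hypothetical excerpt of a tag-to-font table; None stands for the
# asterisk (*), i.e. a purely functional tag with no font of its own.
TAG_FONT = {
    "it": "italic",
    "bold": "bold",
    "plain": "plain",
    "er": "small caps",
    "def": None,
    "note": None,
}


def effective_font(open_tags):
    """Font for text inside the given fields, listed outermost first.

    The innermost field that names a font wins. Assumption: fields
    without a font are skipped rather than resetting the font.
    """
    for tag in reversed(open_tags):
        font = TAG_FONT.get(tag)
        if font is not None:
            return font
    return "plain"  # the base font


def italic_mask(text):
    """Per-character italic flags for the text of an italic field.

    Parentheses and brackets within the field are not italicised.
    """
    return [ch not in "()[]" for ch in text]
```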
+ The size of the "plain" font in the original text is about 1.6 mm for
+the height of capitalized letters.
+Explicit typographical tags:
+ These were used where the purpose of a different font was merely to
+distinguish a word from the body of the text, and no explicit functional
+tag seemed appropriate.
+Tag Font
+Explicit formatting tags:
+. . . . . . . . . . . . . . . . . .
+<plain> plain font (that used in the body of a definition) --
+ normally not marked, except within fields of
+ a different font.
+<it> italic (in master files)
+<i> italic (for use in HTML presentation)
+<bold> bold (in master files)
+<b> bold (for use in HTML presentation)
+<colf> bold, Collocation font. Same font as used in collocations.
+ smaller This is used only in the list of "un-" words not
+ by 1 point actually defined in the dictionary. Probably could be
+ replaced by a segment mark for the entire list!
+ The "un-" words should be indexed as headwords.
+<ct> bold Same as <colf>, a font similar to that used in
+ collocations. However, this tag is used in a table
+ and could be set to a different font.
+<h1> * HTML tag -- largest heading font.
+<h2> * HTML tag -- second largest heading font.
+<headrow> * Marks a Row title in a table.
+<hwf> Font the same as the headword <hw>, though the field is
+ not a headword. Used only once.
+<mitem> * Multiple items, a set of items in a table.
+<point ...> A series of point size markers, many unique.
+<point1.5> * One of the tags of the form <point**> where **
+<point6> represents the typographic point size of the
+ enclosed text.
+<pre> An HTML tag indicating that the enclosed text is
+ of teletype form, preformatted in a uniform-spaced
+ font.
+<sc> small caps (used mostly for "a. d.", "b. c.")
+ This is the same font as <er>, but has no functional
+ or semantic significance.
+<str> group of table data elements in a table
+<sub> subscript, like <subs>
+<subs> subscript
+<sups> superscript
+<supr> superscript
+<sansserif> Sans-serif font
+<stypec> Bold (collocation font) and also a subtype.
+<tt> HTML tag -- teletype font
+<universbold> A squared bold font without serifs approximating the
+ "universe bold" font on the HP Laserjet4, slightly
+ larger than the capitals in a definition body. Used
+ in expositions describing shapes, such as
+ "Y", "T", "U", "X", "V", "F".
+<vertical> Vertically organized column.
+<column1> Vertically organized column -- only part of a table
+ which needs to be completed. Used once.
+<...type> A series of tags, many unique, designating certain
+ unusual fonts, such as "bourgeoistype" for
+ "bourgeois type", in the section on typography.
+ Most of these occur only once, in the section on fonts.
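The list above pairs two master-file tags with their HTML presentation forms: <it> with <i> and <bold> with <b>. A conversion between the two could be sketched as below. The function name to_html_tags is hypothetical, and only these two pairs are handled because they are the only ones the list states explicitly.

```python
import re

# Master-file tag -> HTML presentation tag, as paired in the list above.
MASTER_TO_HTML = {"it": "i", "bold": "b"}

# Matches only the exact opening or closing form of the tags in the table.
_TAG_RE = re.compile(r"<(/?)(" + "|".join(MASTER_TO_HTML) + r")>")


def to_html_tags(text):
    """Rewrite <it> and <bold> fields as <i> and <b>; leave all else alone."""
    def swap(m):
        return "<" + m.group(1) + MASTER_TO_HTML[m.group(2)] + ">"

    return _TAG_RE.sub(swap, text)
```

Because the whole tag name must match, longer names that merely begin with "it", such as <item> or <itran>, are not affected.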
+Tags with semantic content:
+. . . . . . . . . . . . . . . . . . . . . . . . . . .
+<altsp> * Alternative spelling segment. Almost always
+ contained within square brackets after the main
+ definition segment. Expository words
+ such as "Spelled also" are in plain font;
+ the actual alternative spelling is marked by
+ <asp> ... </asp> tags within this segment.
+<ant> italic Antonym.
+<asp> italic Alternative spelling. The actual word which is an
+ alternative spelling to the headword. These
+ are functionally synonyms of the headword. In
+ most cases these also occur as headwords, with
+ reference to the word where the actual definition
+ is found, but not all such words are listed
+ separately, particularly if the spelling is
+ close enough to the headword to be found at the
+ same point in the dictionary. Whether listed
+ separately or not, these words should
+ be indexed at this location, also.
+<au> italic Authority or author. Used where an authority is
+ (may be right- given for a definition, and also used for the
+ justified. See author, where a quotation within double quotes
+ in the section is given in the same paragraph as the
+ on formatting). definition. The double quotes are indicated
+ by the open-quote (\'bd) and close-quote
+ (\'b8). In both cases, it is typically
+ right-justified, almost always fitting on
+ the same line with the last line of the
+ definition or quotation.
+ Within collocation segments, it is usually
+ used only after quotations, and is not right-
+ justified, except occasionally where it
+ would be close to the right margin, and then
+ apparently is right-justified. We have
+ not explicitly marked those which are
+ right-justified, but they can be
+ recognized because they are on a line by
+ themselves, preceded by two carriage returns.
+<bio> * Marks a biography. Should be longer than
+ a short mention of who a person was, which
+ is typically included as a definition.
+<biography> * Same as <bio>
+<booki> italic Marks the name of a book, pamphlet, or similar
+ document.
+<branchof> * A field of knowledge of which the headword
+ is a division.
+<caption> * Caption of a figure or table.
+<cas> * tags the CAS (Chemical Abstracts Service) registry
+ number for a chemical substance.
+<causes> italic tags the infectious disease caused by the headword.
+ Implied type of the agent is a microorganism, and
+ the tag must mark a disease.
+<causesp> * Same as <causes> without the italic type.
+<causedbyp> * Same as <causedby> without the italic type.
+<causedby> italic inverse of causes: tags the causative agent of an
+ infectious disease, which is the headword.
+ The tag must mark a microorganism, virus, or
+ prion, and the implied type of the headword is
+ a disease.
+<centered> Used only for the single letter in the headers to each
+ letter of the alphabet.
+<city> * marks the proper name of a city. Used only
+ occasionally and not consistently at this stage.
+<cnvto> italic Converted to: used to tag substances which are
+ products prepared by conversion from the
+ headword. Usually chemicals or complex
+ products from natural materials. Rarely used
+ up to 1998.
+<colheads> * List of heads for the columns of a table.
+<coltitle> * Title of a column in a table.
+<comm> * Comment -- differs from <note> in being in-line with
+ the definition paragraph. Provides a little
+ additional information.
+<company> * Name of a company (commercial firm). Compare <org>
+<compof> italic Composed of. Tags a substance of which the
+ headword is at least partly composed. The
+ substance may be particulate, such as
+ diatoms composing diatomaceous earth.
+<contains> * marks an object contained within the headword.
+<contr> italic Contrasting word. Not exactly an antonym, which
+ is marked <ant>, but a contrasting word which is
+ often introduced as "opposite to" or "contrasts
+ with".
+<country> * Name of a country (nation) of the world.
+<cref> italic Collocation reference. A reference to a collocation.
+ Each such collocation should have its own entry,
+ marked by <col> ... </col> tags, and these
+ references should function as hypertext buttons
+ to access that entry.
+<date> * A Date, of any type, e.g. <date>Dec. 25</date>.
+<datey> * Date-with-year tags a date containing a year.
+<def> * definition. The definition may have subfields,
+ particularly <as> (an illustrative phrase
+ starting with "as" or "thus" and containing
+ the headword (or a morphological derivative)).
+ The <mark>, \'bd...\'b8 quotations (left and
+ right double quotes) and <au> fields may be
+ found within a definition field, but should be,
+ and usually are, located outside the definition
+ proper. The marking macro was
+ inconsistent in this placement, and the
+ exclusion of the <mark>, <au> and quotations
+ needs to be completed by the proof-readers.
+ Certain definitions contain <pos>
+ fields within them, where the headword is
+ an irregular derivative of another headword.
+ In these cases, the <pos> field follows
+ immediately after the <def> tag, and these
+ entries do not have a separate <pos> field.
+ In such cases, the <pos> field is italic, as
+ usual.
+<divof> * Division of the headword, usually an organization.
+ E. g. a faculty or department of a university,
+ or a United Nations agency.
+<edi> * Marks an education institution, a subtype of
+ organization.
+<emits> * tags a physical object or form of radiation
+ emitted by the headword
+<figure> Just a place-holder for illustrations, but seldom used.
+<film> italic Marks the name of a movie film.
+<fld> italic Field of specialization. Most often used for
+ Zoology and Botany, but many "fields of
+ specialization" are marked for technical
+ terms. The parentheses are usually within this
+ field, but are not themselves in italics.
+<geog> * Name of a geographical region of any size;
+ if applicable, the more specific <city>,
+ <state>, or <country> are preferred.
+<hypen> * Hypernym. Points to the hypernym from WordNet 1.5.
+ Initially, used only for entries extracted
+ from WordNet 1.5. Not present in the original
+ 1913 version.
+<illu> * Illustrative usage -- mostly from WordNet, and placed
+ outside the definition, in contrast to <as> usage.
+ These should be converted to <as>...</as> illustrative
+ usage format for consistency.
+<illust> * Illustration place-holder. Seldom used.
+<img> * HTML usage -- points to an image file, usually
+ .gif or .jpg. These have no closing tag, and
+ will appear as errors in parsing.
+<intensi> * Points to a word whose meaning is an intensified
+ form of the headword. Taken from WordNet
+ tags, used with some adjectives from WordNet
+<item> * Designates one item in a row of a table. Used only when
+ intervening spaces do not serve properly as natural
+ field separators.
+<itran> italic Translation into a foreign (non-English) language
+ of the previous word in the text -- italic font.
+ (<sig> is a translation into English)
+<itrans> italic Same as <itran>
+<jour> * Title of a journal (periodical).
+<matrix> * Always a filled rectangular array.
+<matrix2x5> * A 2x5 matrix (2 rows by 5 columns).
+<mstypec> * Multiple synonymous subtypes -- used in
+ def. of "grass".
+<mtable> * Multiple table, encloses <table> figures.
+<musfig> * Music figure. Only in a note under the entry "Figure",
+ the two numbers of each such field
+ are bold, 20 point type, stacked as in a fraction with
+ a bar between them, but also having a horizontal stroke
+ midway through each numeral. Unique to this entry.
+<p> * paragraph tag, used always in pairs. Line breaks may
+ be embedded inside the paragraphs.
+<person> * marks the proper name of a person. Used only
+ occasionally, but should be used more frequently
+ for cases where first names are abbreviated,
+ to reduce ambiguity of the period for automatic
+ analysis. Where a title is given, prefixed
+ or postfixed, it is included in this tag.
+<persfn> * marks the name of a person, when only one name
+ (usually the last name) is given. Not used
+ consistently where it should be.
+<publ> * Marks the name of a publication other than book,
+ which is marked by <booki>. It is often a
+ magazine or journal.
+<qpers> * Tags the name of a person who is speaking,
+ within a quotation.
+<qperson> Same as <qpers>
+<cp> * Collocation, plain text -- used to tag phrases that
+ should be parsed as a unit, but have no typographical
+ significance.
+<qau> italic Always right-justified, as described for <au>.
+<ref> * A reference to a word in the vocabulary.
+<refs> * Marks the set of references used for a longer article
+ such as a biography.
+<river> * Marks the name of a river -- a proper name
+<rj> * Right justified
+<row> * Designates a row in a table.
+<state> * Name of a geopolitical state, the first subdivision of
+ a country. Includes, e.g. Canadian provinces.
+<subtypes> * Lists subtypes of the headword.
+<sup> * superscript
+<supr> * Supra. The two parts of each such field
+ are stacked, one over the other, *without* a
+ horizontal bar between (as in a fraction).
+ Used only in one entry, for a musical notation.
+<table> * Always a filled rectangular array, having <row> and <item>
+ elements.
+<td> * Table datum - one cell in a table
+<th> * Table header
+<tradename> * Tags a commercial Trade name
+<ttitle> * Table title (Larger than normal font)
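The <altsp> and <asp> descriptions above say that alternative spellings should be indexed at the entry where they appear. Collecting them could be sketched as follows. The function name alternative_spellings is hypothetical, and the sketch assumes that <altsp> segments are not nested inside one another.

```python
import re

# Non-greedy matches, so each segment or field ends at its own closing tag.
ALTSP_RE = re.compile(r"<altsp>(.*?)</altsp>", re.DOTALL)
ASP_RE = re.compile(r"<asp>(.*?)</asp>", re.DOTALL)


def alternative_spellings(entry):
    """Return the <asp> words found inside the <altsp> segments of an entry."""
    found = []
    for segment in ALTSP_RE.findall(entry):
        # Expository words such as "Spelled also" sit outside the <asp>
        # fields and are therefore left out.
        found.extend(ASP_RE.findall(segment))
    return found
```

Only <asp> fields inside an <altsp> segment are collected; an <asp> field occurring elsewhere in the entry would need separate handling.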
+Functional Tags
+Tag Font Meaning
+ (Comparatives are relative to the plain font.)
+<-- --> * Comment, not a tag. These segments should be deleted
+ from the written or printed text.
+ Page numbers of the original text are indicated
+ within such comments; these may be left in, if
+ desired.
+<! !> * HTML-style comment. Used to indicate page numbers
+ in the public domain version.
+<abbr> italic Tag for abbreviations, when mentioned within
+ the definition text.
+<adjf> small caps Tags for the actual adjective or adverb
+ comparatives or superlatives. Should be
+ indexed. See also conjf (verbs) and
+ decf (nouns).
+<altname> italic Alternative name. Usually for plants or animals,
+ but also used for other cases where words
+ are introduced by "also called", "called also",
+ "formerly called". These are functionally
+ *synonyms* for that word-sense.
+<altnpluf> italic Same as <altname>, but the marked word is a
+ plural form, whereas the headword is singular.
+<amorph> * Adjective morphological segment, primarily
+ the comparative and superlative forms.
+ The occasional adverb morphology is
+ also tagged this way.
+<as> * A segment occurring within the definitional
+ sentence, providing an example of usage of
+ the headword. Not conceptually a part of the
+ actual definition.
+<cd> smaller spacing Collocation definition. Similar in structure
+ to headword definitions (the <def> field). May
+ contain an <as> field. Plain type, but with
+ closer spacing than main definitions.
+<col> bold, Collocation. A word combination containing the
+ smaller by headword (or a morphological derivative).
+ 1 point The collocations do not have an explicitly
+ marked part of speech.
+ See also <ecol>, tagging embedded collocations.
+<colp> Collocation, no typographic significance.
+ Used to mark a word combination defined in
+ the dictionary without effect on font.
+<conjf> small caps The conjugated (non-infinitive) forms of
+ verbs. imp. & p. p. is common, as well as
+ p. pr. & vb. n. Irregular variants of
+ these are less common. Words in this
+ field perhaps should be indexed.
+<cs> smaller Collocation segment. The font and size is
+ vertical normal in a cs, but the spacing between lines
+ spacing is smaller (0.9 mm between lower-case letters,
+ rather than 1.1 mm in the main body of the
+ definition). For an on-line dictionary,
+ reproducing this typography is probably
+ pointless.
+<decf> small caps Declension form. The actual morphological
+ variants of nouns or pronouns. Should
+ be indexed.
+<ecol> * Embedded Collocation. A word combination
+ containing the headword (or a morphological
+ derivative), embedded within a definition
+ without a separate definition of its own.
+ These collocations should be defined
+ implicitly by the text of the definition in
+ which they are embedded.
+ See also <col>, tagging explicitly defined
+ collocations.
+<ent> Bold Entry field. Gives the headword without accent or
+ syllabication marks, and with special-character
+ symbols converted to their nearest ASCII
+ equivalents. Can be used without conversion
+ as the string that serves as the index word
+ for that entry.
+<er> Small Caps Entry reference. References to headwords
+ within the "etymology" section are in small
+ caps. Such references also occur
+ in the body of definitions, and in "usage"
+ segments.
+ Such entry references should function as hypertext
+ buttons to access that entry.
+<ety> * Etymology. Always contained within square
+ brackets. Normal type is used for explanatory
+ comments, and italics for the actual words
+ (marked <ets>) considered as etymological
+ sources.
+<ets> italic Etymological source. Words from which the
+ headword was derived, or to which it is related.
+ The Greek words within an etymology segment
+ are invariably etymology sources, and should
+ be marked as such, but are not so marked,
+ even in the rare cases where the Greek word
+ transliteration has been written in.
+<etsep> italic Etymological source, being the name of a person
+ or geographical location which is the eponym
+ for the concept. This is used to distinguish
+ eponymous etymologies from others, and can also
+ be found in the body of a definition or note,
+ not only in the etymology field. Very few
+ of the names that should be marked this way
+ have actually been so marked, as of version
+ 0.42. In cases where such eponymous names
+ have not yet been thus marked, they will
+ usually be marked by <xex>, the non-semantic
+ italic-font marker, or, in etymologies, by
+ <ets>.
+<ex> italic Example. An example of usage of the headword,
+ usually found within an <as> or <note> segment.
+<fr> * Frequency of use, ordinal rank. This is used for
+ WordNet entries, in which the synonyms
+ were ranked in order of frequency of use.
+ <fr>1</fr> indicates that the headword is the
+ first word on the list of synonyms.
+<fu> * First use. A date at or around which the first
+ use of this word in writing is recorded.
+ Not in the original 1913 Webster, and usu.
+ taken from a recent dictionary. Only a few
+ such fields have been entered as of version
+ 0.41
+<grk> transliteration Greek. The Greek words have been transliterated
+ using the equivalents explained in the
+ file "webfonts.asc". In most cases, the
+ transliterations are typical for Greek
+ letters, except for theta (transl. = q),
+ phi (transl. = f), eta (transl. = h), and
+ upsilon (transl. = y, whether pronounced
+ as y or u). This was to eliminate any
+ ambiguity. These words occur primarily
+ in etymologies, and to conform to the
+ usage of <ets> should also be marked
+ by <ets>, but as of version 0.41 they
+ are not usually thus marked.
+<hw> bold, headword. Each main entry begins with the <hw>
+ larger by mark, and ends at the next <hw> mark. The
+ 2 points main entries are not otherwise explicitly
+ marked as a distinctive field.
+ The same word may appear as a headword
+ several times, usually as different parts
+ of speech, but sometimes with different
+ entries as the same part of speech, presumably
+ to indicate a different etymology.
+ Within the hw field the heavy accent is
+ represented by double quote ("), the
+ light accent by open-single-quote (`),
+ and the short dash separating syllables by
+ an asterisk (*). A hyphen (-) is used to
+ represent the hyphen of hyphenated words.
+<mark> italic, Usage mark. Almost always within square
+ brackets, occasionally in parentheses or
+ without any bracketing.
+ but The most common usage marks,
+ explanatory "Obs." = obsolete "R." = rare, "Colloq." =
+ may be plain. colloquial, "Prov. Eng." = Provincial England,
+ etc. are in italics. Some usage notes are also
+ marked with <mark>, but are in plain. For
+ simplicity, all words in this field may be
+ italic, until additional explicit marks are
+ added.
+<markp> * A usage mark in plain type (not italic). Found
+ within a definition, when there is more than
+ one sense-number listed. "Fig." at the head
+ of an entry is the most common case.
+<mcol> * Multiple collocation. Similar to multiple
+ headword, when two or more collocations share
+ one definition; however, the two collocations
+ are in-line, rather than stacked or justified.
+ There may be "or" or "and" words
+ (italicised), or an "etc." (plain type)
+ within this field. In many cases, the
+ <or/ and <and/ entities are used to
+ signify the change of font for these words.
+<mhw> * Multiple headword. This field is used where
+ more than one headword shares a single
+ definition. In the dictionary, the
+ (usually) two headwords are left-justified
+ one below the other in the column, and are
+ tied together on the right side of the
+ headwords by a long right curly brace.
+ This division is strictly functional,
+ for analytical purposes, and does not
+ affect the typography.
+<nmorph> * Noun morphology section. Rarely used, mostly
+ for irregular personal pronouns.
+<note> * Explanatory note. No explicit font is indicated.
+ These segments may be separate, as in the
+ separate paragraphs starting <note><hand/,
+ or they may just be further explanation within
+ (or more usually, following) the main
+ definition paragraph. Typographically,
+ the notes following the main definition may
+ not be distinguishable from additional
+ sentences appended to the first sentence
+ of a definition.
+<plu> * Plural. The "plural" segment starts with a
+ "pl." which is italicised, but in this
+ segment is not otherwise marked as
+ italicised. Other words occurring in this
+ segment are plain type. The "pl." can be
+ easily explicitly marked if necessary.
+<pos> italic Part of speech. Always an abbreviation: e.g.,
+ n.; v. i.; v. t.; a.; adv.; pron.; prep.
+ Combinations may occur, as "a. & n.".
+<epos> * Part of speech, referring to words in
+ etymologies, normal type. Always an
+ abbreviation, as in <pos> above
+ Combinations may occur, as "a. or n.".
+<plw> small caps Plural word. The actual plural form of the word,
+ found within a <plu> segment.
+<pr> * pronunciation. The default font is normal, but
+ many non-ASCII characters are used.
+ The pronunciation field may have more than
+ one pronunciation, separated by an "<or/".
+ (An "or" here is in italic, and usually is
+ represented by the entity <or/).
+ There may also be some commentary, such as
+ "Fr."(French pronunciation) or "archaic".
+ The commentaries are typically italic, and
+ should be marked as such. In certain
+ pronunciations there is a numbered reference
+ to a root form explained in an introductory
+ section on pronunciation.
+ Very few of the pronunciation fields have
+ been filled in. The pronunciation markings use
+ a more complicated method than more modern
+ dictionaries. It would be interesting to have
+ these fields filled in, if there are any
+ volunteers willing to do it.
+<q> smaller by Quotation. No bracketing quotation marks,
+ two points, though occasionally \'bd-\'b8 quotations occur
+ centered, within these quotations. These quotations
+ Separate tend to be more complete sentences, rather
+ paragraph than just phrases, such as are contained
+ within quotation marks within the definition
+ paragraph.
+<qau> italic, Quotation author. Used only for the quotations
+ right justified marked with <q> that are centered in their
+ own paragraphs.
+<qex> italic Quotation example. An example of usage of
+ the headword, within quotations marked
+ by <q>..</q> tags.
+<sd> italic Subdefinition, marked (a), (b), (c), etc. These are
+ finer distinctions of word senses, used
+ within numbered word senses (for main entries),
+ and also used for subdefinitions within
+ collocation segments, which have no numbering of
+ senses. The letter is italic, the parentheses
+ are not. This tag is also used to indicate the
+ lettered subdefinition when it is referred to
+ at another point in the text.
+<ship> italic The name of a ship. Rarely used.
+<sing> * Singular. Analogous to the <plu> segment, but more
+ rarely used, mostly for Indian tribes, which
+ are listed in the plural form.
+<singw> small caps Singular word. The singular form of the
+ plural-form headword.
+<sn> bold, Sense number. A headword may have over 20
+ larger by different sense numbers. Within each numbered
+ 2 points sense there may be lettered sub-senses. See
+ the <sd> (sub-definition) field.
+<source> italic Source. The author of the definition. Used only
+ for definitions not originally present in
+ Webster 1913, and not present in the original
+ version intended to mimic the 1913 printed
+ dictionary. This source is used for each
+ word sense, and may differ for different
+ senses of a word, especially where a Web1913
+ definition was substantially modified, or a
+ new word sense was added to a previously
+ defined word.
+<syn> plain Synonyms. A list of synonyms, sometimes followed
+ by a <usage> segment.
+<usage> narrower Comparisons of word usage for words which are
+ spacing sometimes confused. As with collocation segments,
+ font is plain, but spacing is smaller than
+ normal definition spacing. This seems pointlessly
+ complicating for an on-line display.
+<ver> * Verified for current accuracy by a technical editor,
+ without changes.
+<vmorph> * Verb morphology (conjugation) segment, delimited
+ by square brackets.
+<wordforms> * Morphological derivatives not contained in the
+ bracketed segments, as above. For nouns
+ derived from adjectives, adverbs from
+ adjectives, etc. This segment is usually
+ found at the end of the main entry. The
+ adverbial and nominalized derivatives at the
+ end of a main entry are usually introduced
+ by an em dash [represented as two hyphens (--)].
+<wf> bold, Same font as <hw>, with accents and syllable
+ larger by breaks marked as in the headword.
+ 2 points Marks the actual morphological forms within
+ a <wordforms> segment; typically, adverbial or
+ nominalized form of an adjective.
+<def2> * Second definition (occasionally, a third definition is
+ present). This is used where a second or third
+ part of speech with the same orthography is
+ placed under one headword. Within this segment,
+ there will be a <pos> field, and sometimes
+ a <mark> and/or a quotation.
+<specif> * "Specifically:" Used to mark the words "specifically",
+ "Hence", "as" which are used to introduce a second
+ definition typically more specific than the first,
+ but in general derived by extension of the initial
+ definition. This functions as a warning of multiple
+ definitions where the sense-numbers are not explicitly
+ used. It is also useful in separate senses, to
+ tag polysemous definitions which may be
+ specializations or generalizations of the preceding
+ definition.
+<pluf> italic. Plural form.
+ Used exclusively to mark the "pl." abbreviation,
+ which introduces a definition for the headword,
+ *when used in the plural form*. Not related to
+ <plu>, which spells out the plural form, but does not
+ define it.
+<uex> italic Usage example. Used only a few times, within
+ <usage> segments.
+<isa> italic supertype (hypernym) the inverse of <stype> and
+ identical to <hypen> but not derived from WordNet.
+<chform> plain, Chemical formula. The letters are plain font,
+ numbers but the numbers are subscript. This is mostly
+ subscript useful as a functional mark to pinpoint
+ chemicals.
+<chformi> plain, Chemical formula same as <chform>, but not
+ processed specially by the tag-converter program.
+ The letters are plain font, but the numbers are
+ subscript.
+ Used in place of <chform> when the formula has
+ a tag inside, which cannot now be processed by the
+ <chform> processing routine.
+<chname> * chemical name. Used to allow a IUPAC chemical
+ name to be processed as a unit in spite of
+ embedded dashes, parentheses, and commas.
+<see> * "see" reference to related words, outside of the
+ main <def>definition</def> field.
+<mathex> italic Mathematical expression. In this dictionary,
+ essentially all letters (used as variable labels)
+ in math expressions are in italic font.
+ The "+" and "-" may also appear typographically
+ different from elsewhere in the dictionary.
+<ratio> italic Also a mathematical expression, but the colon and
+ double colon may have a different typography
+ than usual, as in <ratio>a:b</ratio>.
+<singf> italic Singular form. Analogous to <pluf>, to define
+ the singular word where the headword is the
+ plural form. ** only modifies the word "sing."
+<mord> * Morphological derivation. Used to mark the
+ entry-reference portions of those
+ entries which are defined as morphological
+ derivatives (plural, p. p., imp.) of other
+ headwords. Used just as an attempt to
+ mark and regularize the entry format.
+ May be ignored typographically.
+<fract> a stack, Fraction. Used for non-numerical fractions
+ with which cannot be expressed as a <frac12/-style
+ numerator, entity. The forward slash "/" is to be
+ horizontal interpreted as a horizontal line separating
+ bar, and the numerator and denominator.
+ denominator
+<exp> superscript, Exponential. Used in mathematical expressions.
+ smaller
+ font.
+<xlati> italic Translation (e.g. of Greek), in the body of a
+ definition or etymology. Used only twice.
+<tran> italic Word translated: the word in italic is translated
+ by a subsequent word. Usually in etymologies, where
+ the word translated is not actually etymologically
+ related to the headword. The translated word
+ is not necessarily English.
+<tr> italic translation of the preceding word (or of the
+ headword) into English.
+<fexp> * Functional expression (math). The function names are
+ in plain type, the variables are italic.
+<iref> italic Illustration reference. Used only occasionally, not
+ yet (v. 0.41) consistently.
+<figref> italic Figure reference.
+<figcap> * Figure caption.
+<figtitle> * Figure title.
+<funct> * tags a mathematical function or expression.
+<chreact> * Chemical reaction. Similar to chemical formulas (which
+ are contained but not explicitly marked), with
+ some other symbols.
+<ptcl> italic Verb Particle. Only a few particles were actually
+ marked, but in a future version more may be.
+<tabtitle> ? Table Title. Used only once.
+<title> italic Title of a literary work, movie, opera, musical
+ composition, etc. Used rarely but should be
+ used in every case, except in <au> references.
+<root> * Square root -- differs from the entity <root/,
+ which is a square root sign that does not extend
+ beyond the number following it. The <root>
+ field has a bar (vinculum) over the expression
+ within the field, as well as the square root symbol
+ preceding the expression in the field. Used only
+ once.
+<vinc> * Vinculum. In a mathematical expression, a bar
+ extending over the expression within the field.
+ Used only once. This apparently serves the same
+ function as parentheses, causing the
+ expression within the field to be evaluated
+ and the result used as the (mathematical) value
+ of the field.
+<nul> plain Nultype. An older version of <plain>.
+<cd2> * Second collocation definition. Somewhat similar to
+ <def2>. Purely a mark to reduce functional ambiguity,
+ with no effect on the typography.
+<hypen> * Hypernym. Mark introduced for the World Wide Webster,
+ when adding words from WordNet. In most cases, this
+ tag marks the WordNet hypernym (for nouns and verbs).
+ Where the <au> mark is PJC or includes a +PJC, the
+ hypernym may not be the same as in WordNet. The words
+ marked by this tag need to be bracketed in some way,
+ but this is deferred until the definitions included
+ with the hypernyms have been deleted, and other
+ disambiguating marks substituted.
+<stype> italic Subtype. A functional mark, to point out words which
+ are conceptually subtypes of the headword.
+<styp> * Subtype. A functional mark, to point out words which
+ are conceptually subtypes of the headword, but
+ with no *typographical* significance.
+<simto> * Similar-to. A semantic relational mark for
+ closely related words which are not quite
+ synonyms, nor hypernyms, nor hyponyms. Introduced
+ with WordNet data.
+<conseq> or <hascons> * Consequence. For adjectives, an attribute which
+ is a consequence of possessing the headword attribute.
+ Introduced with WordNet data.
+<consof> * Consequence of. For adjectives, an attribute which
+ implies the headword as a natural consequence.
+<part> italic Part. Marks a word designating something which is
+ conceptually a part of the headword. Rarely used.
+<parts> italic Part, plural form. Same as <part>, but marks the
+ name of the part in its plural form.
+<partof> * Marks a word designating something of which the headword
+ is conceptually a part. Inverse of <part>.
+ This is very broad, and may mean constituent or
+ separable part.
+ Rarely used.
+<contxt> * Context. Used only for introductions to definitions,
+ giving the context of usage, which are not part
+ of the definition proper, as:
+ <contxt>when used of a person:</contxt>
+<grp> * Marks the name of a group of people not formally
+ organized.
+<membof> italic marks a group of which the headword is a member.
+ This is rarely used, but should be indexed as
+ an entry word or phrase.
+<member> italic marks a member of a group defined by the headword.
+ This is rarely used, but should be indexed as
+ an entry word or phrase.
+<members> italic Same as <member>, but marks a plural word,
+ designating the name of the members in its plural form,
+ for lack of ambiguity.
+<method> * Designates a special type of definition which
+ describes a method for achieving the headword,
+ used only once for the word "amend". The
+ subdefinitions begin with "by".
+<corpn> * Name of a business company, corporation, or partnership.
+ Started using November 1988. Rare.
+<corr> italic Correlative. A word intimately associated with the
+ headword in a manner such that one cannot
+ appear without the other. Not exactly an inverse.
+<qperson> italic marks the name of a person, quoted in a dialogue.
+ Used only in <q> blockquotes as of vers. 0.45.
+<org> * marks the name of an organization; sometimes used
+ for the names of groups of people not
+ formally organized (see also <grp>).
+<prod> italic produces. Designates a substance produced by
+ a living organism. Rarely used.
+<prodp> * produces (plainfont). Designates a substance
+ produced by a living organism. Same as <prod>,
+ but does not affect font. Rarely used.
+<prodby> * produced by. Designates a living organism which
+ produces the headword substance. Rarely used.
+<prodmac> italic produces. Designates an object or substance produced
+ by a machine or process. Rarely used.
+<stage> italic life stage of an organism. Used to indicate
+ variant forms of an organism defined by the
+ headword. Rarely used.
+<stageof> * an organism one of whose life stages is the headword.
+ Inverse (correlative) of <stage>. Rarely used.
+<inv> italic inversely related to headword -- e.g. depository
+ is the inverse of depositor; buyer is the inverse of
+ seller. Called "correlative" in the Webster 1913 and
+ the CIDE. Rarely used.
+<methodfor> italic is a method to accomplish the action defined by
+ the headword. Rarely used, and only in the
+ supplemental section.
+<examp> italic example or instance of the headword, where the
+ tagged and emphasized word is not a proper subtype.
+<p><hw>Pa*ron"y*mous</hw> <p><sn>2.</sn> <def>Having a similar sound, but different orthography and different meaning; -- said of certain words, as <examp>all</examp> and <examp>awl</examp>; <examp>hair</examp> and <examp>hare</examp>, etc.</def><br/
+[<source>1913 Webster</source>]</p>
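A tool reading the sample entry above might pull out the contents of a functional field such as <examp> with a simple pattern match. The following is only a minimal sketch in Python, not part of the dictionary's own tooling: the helper name field_contents is hypothetical, and it assumes that fields of the same tag are not nested inside one another.

```python
import re

def field_contents(tag, text):
    """Return the text inside every <tag>...</tag> field found in text.

    Assumption: fields with the same tag name are never nested.
    """
    name = re.escape(tag)
    pattern = r"<%s>(.*?)</%s>" % (name, name)
    return re.findall(pattern, text, flags=re.DOTALL)

# Excerpt of the Paronymous definition shown above.
entry = (
    "<def>Having a similar sound, but different orthography and "
    "different meaning; -- said of certain words, as "
    "<examp>all</examp> and <examp>awl</examp>; "
    "<examp>hair</examp> and <examp>hare</examp>, etc.</def>"
)

examples = field_contents("examp", entry)
print(examples)
```

The same helper can be pointed at other single-word fields such as <unit> or <part>, under the same no-nesting assumption.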
+<sfield> * subfield of the headword, which must be a field
+ of study or of knowledge
+<stage> italic a stage of life of the headword -- for living things,
+ such as insects, whose life stages may take different
+ names.
+<unit> italic a unit of measure, usually preceded by a number.
+ Also used to tag the unit of a measure which is the
+ headword.
+<uses> italic tags a tool or method used by the headword,
+ which is usually some process.
+<usedfor> * tags a method or process for which the headword
+ is a tool.
+<usedby> italic tags a tool or method which uses the headword,
+ which is usually a physical object.
+<perf> italic performs -- tags a word which is a process or
+ activity performed by the headword.
+<recipr> italic reciprocal -- used for cases where the tagged word
+ is a reciprocal participant in an action, such as
+ donor and recipient. The difference between this and
+ <inv> inverse has not yet been systematically settled.
+ Used seldom, and mostly in the supplemented version.
+<sig> italic significance, meaning -- used in definitions where the
+ actual meaning is prefixed with commentary explaining
+ usage or other attributes of the word, as with
+ prefixes or suffixes.
+<wns> italic WordNet sense. Where known, the correspondence of the
+ sense of an entry with that of WordNet 1.6 is
+ given after the definition, in a tag of the
+ form: <wns>[wns=3]</wns>, in which the number
+ is the numbered sense in WordNet.
+<w16ns> italic WordNet version 1.6 sense. See <wns> for
+ explanation.
+<wnote> * A note related to usage in the corresponding
+ WordNet definition.
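Because the <wns> field has the fixed form <wns>[wns=3]</wns>, the sense number can be recovered mechanically. The following Python sketch illustrates one way to do so; the function name wordnet_senses and the sample definition text are hypothetical, and the pattern assumes the field is written exactly as described above, with no extra spaces.

```python
import re

# Matches the literal form described for the <wns> field: <wns>[wns=N]</wns>
_WNS = re.compile(r"<wns>\[wns=(\d+)\]</wns>")

def wordnet_senses(text):
    """Return the WordNet sense numbers found in text, as integers."""
    return [int(number) for number in _WNS.findall(text)]

# Hypothetical entry fragment, used only for illustration.
sample = "<def>A small boat.</def> <wns>[wns=3]</wns>"
print(wordnet_senses(sample))
```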
+ =============================================================
+Biological classifications:
+<spn> italic Species name. Used to mark the taxonomic names
+ of living things which are represented in
+ italic font in the original printed version.
+ Originally, not only species, but genera, orders and
+ families were also thus marked. The conversion from
+ <spn> to <fam>, <gen>, or <ord> is not completed, and
+ <spn> may still be found marking such groups.
+ However, orders and families are also frequently
+ mentioned in the original in normal font, and in such
+ cases are not marked with any tag. So, this mark
+ is not a reliable indicator of all mentions of
+ taxonomic names.
+<kingdom> italic Taxonomic biological Kingdom name.
+<phylum> italic Taxonomic phylum name.
+<subphylum> italic Taxonomic subphylum name.
+<class> italic Taxonomic class name.
+<subclass> italic Taxonomic subclass name.
+<ord> italic Taxonomic order name.
+ Also used for suborders, initially.
+<subord> italic Taxonomic suborder name.
+<suborder> italic Taxonomic suborder name.
+<fam> italic Taxonomic family name. Also used to tag "tribes".
+<subfam> italic Taxonomic subfamily name.
+<gen> italic Taxonomic genus name.
+<var> italic Variety. Used to mark subspecies or varieties below
+ the level of species in living organism systematic
+ names.
+<varn> italic Variety. Used to mark subspecies or varieties below
+ the level of species in living organism systematic
+ names. Duplicative variant of <var>.
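Since <varn> duplicates <var>, and <suborder> carries the same description as <subord>, a program consuming the file may wish to fold each pair into a single tag name before further processing. The following Python sketch shows one possible approach. The choice of <var> and <subord> as the canonical names is an assumption made here for illustration, not something the tag list prescribes, and the function name normalize_tags is hypothetical.

```python
import re

# Assumed canonical names for the duplicated taxonomic tags.
CANONICAL = {"varn": "var", "suborder": "subord"}

# Opening or closing tags only; entities such as <root/ have no ">"
# directly after the name and are therefore left untouched.
_TAG = re.compile(r"<(/?)([A-Za-z0-9]+)>")

def normalize_tags(text):
    """Rewrite duplicated tag names in text to their canonical form."""
    def replace(match):
        slash, name = match.group(1), match.group(2)
        return "<%s%s>" % (slash, CANONICAL.get(name, name))
    return _TAG.sub(replace, text)

# Hypothetical fragment, used only for illustration.
fragment = "<varn>alba</varn> of the <suborder>Anisoptera</suborder>"
print(normalize_tags(fragment))
```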
