Diffstat (limited to 'tagset.txt')
-rw-r--r-- tagset.txt 1080
1 files changed, 1080 insertions, 0 deletions
diff --git a/tagset.txt b/tagset.txt
new file mode 100644
index 0000000..f0b9367
--- /dev/null
+++ b/tagset.txt
@@ -0,0 +1,1080 @@
+ FIELD MARKS FOR WEBSTER 1913 and CIDE
+ =====================================
+Tagset.web:
+ Explanations of the tags used to mark the Webster 1913 dictionary
+and the CIDE (Collaborative International Dictionary of English).
+Note that the list of tags used to mark the public domain version
+of this dictionary is shorter than the full set described here.
+ If any tag is not listed here, it is either (1) one of the
+"point" (font size) or "type" (font style) tags, which should be self-explanatory; or
+ (2) a functional field with no effect on the typography.
+
+Last modified March 12, 1999.
+ For questions, contact:
+ Patrick Cassidy cassidy@micra.com
+ 735 Belvidere Ave.
+ Plainfield, NJ 07062
+ (908) 561-3416 or (908) 668-5252
+-------------------------------------------------------------
+A separate file, webfont.asc, contains the list of the individual
+non-ASCII characters represented by either higher-order hexadecimal
+character marks (e.g., \'94, for o-umlaut) or by entity tags
+(e.g., <root/, for the square root symbol.)
+--------------------------------------------------------------
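As a rough illustration of the hexadecimal character marks described above, the sketch below replaces `\'xx` marks using a small lookup table. The table is deliberately partial and hypothetical: it holds only the three codes this file itself names (`\'94`, `\'bd`, `\'b8`); the full mapping is in the separate font file and is not reproduced here. The function name `decode_hex_marks` is an assumption made for this example.

```python
import re

# Hypothetical partial table: only the codes named in this tagset file.
# The complete list is in the separate font file and is not guessed at here.
HEX_MARKS = {
    0x94: "\u00f6",  # o-umlaut
    0xBD: "\u201c",  # open double quote
    0xB8: "\u201d",  # close double quote
}

# A backslash, an apostrophe, then exactly two hexadecimal digits.
_HEX_MARK = re.compile(r"\\'([0-9a-fA-F]{2})")


def decode_hex_marks(text, table=HEX_MARKS):
    """Replace known hexadecimal character marks; leave unknown ones untouched."""
    def repl(match):
        code = int(match.group(1), 16)
        return table.get(code, match.group(0))
    return _HEX_MARK.sub(repl, text)
```

Marks that are missing from the table pass through unchanged, so a partial table never silently corrupts the text.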
+ Use of tags:
+ In the MICRA electronic version of the 1913 Webster, each part of
+the entry headed by an entry word ("headword") is labeled so that no
+part of the entry except some punctuation marks should be found
+outside of all fields, i.e. every character should be within some tagged
+field. In the following description, the word "segment" usually refers to
+a major part of an entry such as an etymology or a definition or a
+collocation segment or a usage block, containing more than one field.
+The term "field" may be used similarly to "segment", but may also
+denote single-word fields, such as an alternative spelling, labeled <asp>.
+
+ Note: The tags on this list are similar in structure to SGML tags. Each
+tag on this list marks a field; each field opens with a tagname between
+angle brackets thus: <tagname>, and closes with a similar tag containing
+the forward slash thus: </tagname>. No tags are used without closing
+tags. Thus the HTML <BR> to indicate a line break is symbolized
+here as an entity, <br/, and every <p> has a corresponding </p>.
+ The absence of an end-field tag, or the presence of an end-field tag
+without a prior begin-field tag constitutes a typographical error, of which
+there may be a significant number. Any errors detected should be brought
+to the attention of PJC or the appropriate editor.
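The pairing rule described above (every `<tagname>` closed by `</tagname>`, with entities such as `<br/` standing alone) can be checked mechanically. The following is a minimal sketch, not the project's own tooling. It assumes that fields nest properly and that tag names consist of letters, digits, and periods; the function name `check_fields` is hypothetical.

```python
import re

# Matches <name>, </name>, or an entity such as <br/ (which has no closing bracket).
# Comments written as <-- ... --> do not match, because "-" cannot start a name.
_TOKEN = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9.]*)(>|/)")


def check_fields(text):
    """Return a list of (problem, tagname) pairs for unbalanced field tags."""
    problems = []
    stack = []
    for match in _TOKEN.finditer(text):
        closing, name, tail = match.groups()
        if tail == "/":
            continue  # entity: no closing tag is expected
        if not closing:
            stack.append(name)
        elif name in stack:
            # Any field opened after `name` and still open was never closed.
            while stack[-1] != name:
                problems.append(("unclosed", stack.pop()))
            stack.pop()
        else:
            problems.append(("unopened", name))
    # Fields still open at the end of the text, innermost first.
    problems.extend(("unclosed", name) for name in reversed(stack))
    return problems
```

An empty result means every begin-field tag in the text has a matching end-field tag. Tags that are documented as having no closing tag, such as `<img>`, would be reported as unclosed by this sketch.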
+ Most of the tagged fields are presented in the text in italic type,
+with a number of exceptions. Where a word is contained within more than
+one field, the innermost field determines the font to be used. Wherever
+recognizable functional fields were found, an attempt was made to tag the
+field with a functional mark, but in many cases, words were italicised only
+to represent the word itself as a discourse entity, and in some such cases,
+the "italic" mark <it> was used, implying nothing regarding functionality
+of the word. The base font is considered "plain". Where an italic field
+is indicated, parentheses or brackets within the field are not italicised.
+ Where no font is specified for a tag, the tag is merely a functional
+division, and was printed in plain font unless otherwise tagged. This type
+of segment is marked by an asterisk (*) where the font name would be.
+ The size of the "plain" font in the original text is about 1.6 mm for
+the height of capitalized letters.
+=============================================================
+Explicit typographical tags:
+ These were used where the purpose of a different font was merely to
+distinguish a word from the body of the text, and no explicit functional
+tag seemed appropriate.
+-----------------------------------
+Tag Font
+-----------------------------------
+Explicit formatting tags:
+. . . . . . . . . . . . . . . . . .
+<plain> plain font (that used in the body of a definition) --
+ normally not marked, except within fields of
+ a different font.
+<it> italic (in master files)
+<i> italic (for use in HTML presentation)
+<bold> bold (in master files)
+<b> bold (for use in HTML presentation)
+<colf> bold, smaller by 1 point
+ Collocation font. Same font as used in collocations.
+ This is used only in the list of "un-" words not
+ actually defined in the dictionary. Probably could be
+ replaced by a segment mark for the entire list!
+ The "un-" words should be indexed as headwords.
+
+<ct> bold Same as <colf>, a font similar to that used in
+ collocations. However, this tag is used in a table
+ and could be set to a different font.
+
+<h1> * HTML tag -- largest heading font.
+
+<h2> * HTML tag -- second largest heading font.
+
+<headrow> * Marks a Row title in a table.
+
+<hwf> Font the same as the headword <hw>, though the field is
+ not a headword. Used only once.
+
+<mitem> * Multiple items, a set of items in a table.
+<point ...> A series of point size markers, many unique.
+<point1.5>, <point6> * Examples of the tags of the form <point**>, where **
+ represents the typographic point size of the
+ enclosed text.
+<pre> An HTML tag indicating that the enclosed text is
+ of teletype form, preformatted in a uniform-spaced
+ font.
+<sc> small caps (used mostly for "a. d.", "b. c.")
+ This is the same font as <er>, but has no functional
+ or semantic significance.
+<str> group of table data elements in a table
+<sub> subscript, like <subs>
+<subs> subscript
+<sups> superscript
+<supr> superscript
+<sansserif> Sans-serif font
+<stypec> Bold (collocation font) and also a subtype.
+<tt> HTML tag -- teletype font
+<universbold> A squared bold font without serifs approximating the
+ "Univers Bold" font on the HP LaserJet 4, slightly
+ larger than the capitals in a definition body. Used
+ in expositions describing shapes, such as
+ "Y", "T", "U", "X", "V", "F".
+<vertical> Vertically organized column.
+<column1> Vertically organized column -- only part of a table
+ which needs to be completed. Used once.
+<...type> A series of tags, many unique, designating certain
+ unusual fonts, such as "bourgeoistype" for
+ "bourgeois type", in the section on typography.
+ Most of these occur only once, in the section on fonts.
+<antiquetype>
+<blacklettertype>
+<boldfacetype>
+<bourgeoistype>
+<boxtype>
+<clarendontype>
+<englishtype>
+<extendedtype>
+<frenchelzevirtype>
+<germantype>
+<gothictype>
+<greatprimertype>
+<longprimertype>
+<miniontype>
+<nonpareiltype>
+<oldenglishtype>
+<oldstyletype>
+<pearltype>
+<picatype>
+<scripttype>
+<smpicatype>
+<typewritertype>
+
+=============================================================
+Tags with semantic content:
+. . . . . . . . . . . . . . . . . . . . . . . . . . .
+<altsp> * Alternative spelling segment. Almost always
+ contained within square brackets after the main
+ definition segment. Expository words
+ such as "Spelled also" are in plain font;
+ the actual alternative spelling is marked by
+ <asp> ... </asp> tags within this segment.
+
+<ant> italic Antonym.
+
+<asp> italic Alternative spelling. The actual word which is an
+ alternative spelling to the headword. These
+ are functionally synonyms of the headword. In
+ most cases these also occur as headwords, with
+ reference to the word where the actual definition
+ is found, but not all such words are listed
+ separately, particularly if the spelling is
+ close enough to the headword to be found at the
+ same point in the dictionary. Whether listed
+ separately or not, these words should
+ be indexed at this location, also.
+
+<au> italic (may be right-justified; see the section on formatting)
+ Authority or author. Used where an authority is
+ given for a definition, and also used for the
+ author, where a quotation within double quotes
+ is given in the same paragraph as the
+ definition. The double quotes are indicated
+ by the open-quote (\'bd) and close-quote
+ (\'b8). In both cases, it is typically
+ right-justified, almost always fitting on
+ the same line with the last line of the
+ definition or quotation.
+ Within collocation segments, it is usually
+ used only after quotations, and is not right-
+ justified, except occasionally where it
+ would be close to the right margin, and then
+ apparently is right-justified. We have
+ not explicitly marked those which are
+ right-justified, but they can be
+ recognized because they are on a line by
+ themselves, preceded by two carriage returns.
+
+<bio> * Marks a biography. Should be longer than
+ a short mention of who a person was, which
+ is typically included as a definition.
+
+<biography> * Same as <bio>
+
+<booki> italic Marks the name of a book, pamphlet, or similar
+ document.
+
+<branchof> * A field of knowledge of which the headword
+ is a division.
+
+<caption> * Caption of a figure or table.
+
+<cas> * tags the CAS (Chemical Abstracts Service) registry
+ number for a chemical substance.
+
+<causes> italic tags the infectious disease caused by the headword.
+ Implied type of the agent is a microorganism, and
+ the tag must mark a disease.
+
+<causesp> * Same as <causes> without the italic type.
+<causedbyp> * Same as <causedby> without the italic type.
+
+<causedby> italic Inverse of <causes>: tags the causative agent of an
+ infectious disease, which is the headword.
+ The tag must mark a microorganism, virus, or
+ prion, and the implied type of the headword is
+ a disease.
+
+<centered> Used only for the single letter in the headers to each
+ letter of the alphabet.
+
+<city> * marks the proper name of a city. Used only
+ occasionally and not consistently at this stage.
+
+<cnvto> italic Converted to: used to tag substances which are
+ products prepared by conversion from the
+ headword. Usually chemicals or complex
+ products from natural materials. Rarely used
+ up to 1998.
+
+<colheads> * List of heads for the columns of a table.
+
+<coltitle> * Title of a column in a table.
+
+<comm> * Comment -- differs from <note> in being in-line with
+ the definition paragraph. Provides a little
+ additional information.
+
+<company> * Name of a company (commercial firm). Compare <org>
+
+<compof> italic Composed of. Tags a substance of which the
+ headword is at least partly composed. The
+ substance may be particulate, such as
+ diatoms composing diatomaceous earth.
+
+<contains> * marks an object contained within the headword.
+
+<contr> italic Contrasting word. Not exactly an antonym, which
+ is marked <ant>, but a contrasting word which is
+ often introduced as "opposite to" or "contrasts
+ with".
+
+<country> * Name of a country (nation) of the world.
+
+<cref> italic Collocation reference. A reference to a collocation.
+ Each such collocation should have its own entry,
+ marked by <col> ... </col> tags, and these
+ references should function as hypertext buttons
+ to access that entry.
+
+<date> * A Date, of any type, e.g. <date>Dec. 25</date>.
+
+<datey> * Date-with-year tags a date containing a year.
+
+<def> * definition. The definition may have subfields,
+ particularly <as> (an illustrative phrase
+ starting with "as" or "thus" and containing
+ the headword or a morphological derivative).
+ The <mark>, \'bd...\'b8 quotations (left and
+ right double quotes) and <au> fields may be
+ found within a definition field, but should be,
+ and usually are, located outside the definition
+ proper. The marking macro was
+ inconsistent in this placement, and the
+ exclusion of the <mark>, <au> and quotations
+ needs to be completed by the proof-readers.
+ Certain definitions contain <pos>
+ fields within them, where the headword is
+ an irregular derivative of another headword.
+ In these cases, the <pos> field follows
+ immediately after the <def> tag, and these
+ entries do not have a separate <pos> field.
+ In such cases, the <pos> field is italic, as
+ usual.
+
+<divof> * Division of the headword, usually an organization.
+ E.g., a faculty or department of a university,
+ or a United Nations agency.
+
+<edi> * Marks an educational institution, a subtype of
+ organization.
+
+<emits> * tags a physical object or form of radiation
+ emitted by the headword
+
+<figure> Just a place-holder for illustrations, but seldom used.
+
+<film> italic Marks the name of a movie film.
+
+<fld> italic Field of specialization. Most often used for
+ Zoology and Botany, but many "fields of
+ specialization" are marked for technical
+ terms. The parentheses are usually within this
+ field, but are not themselves in italics.
+
+<geog> * Name of a geographical region of any size;
+ if applicable, the more specific <city>,
+ <state>, or <country> are preferred.
+
+<hypen> * Hypernym. Points to the hypernym from WordNet 1.5.
+ Initially, used only for entries extracted
+ from WordNet 1.5. Not present in the original
+ 1913 version.
+
+<illu> * Illustrative usage -- mostly from WordNet, and placed
+ outside the definition, in contrast to <as> usage.
+ These should be converted to <as>...</as> illustrative
+ usage format for consistency.
+
+<illust> * Illustration place-holder. Seldom used.
+<img> * HTML usage -- points to an image file, usually
+ .gif or .jpg. These have no closing tag, and
+ will appear as errors in parsing.
+<intensi> * Points to a word whose meaning is an intensified
+ form of the headword. Taken from WordNet
+ tags, used with some adjectives from WordNet
+<item> * Designates one item in a row of a table. Used only when
+ intervening spaces do not serve properly as natural
+ field separators.
+<itran> italic Translation into a foreign (non-English) language
+ of the previous word in the text -- italic font.
+ (<sig> is a translation into English)
+<itrans> italic Same as <itran>
+<jour> * Title of a journal (periodical).
+<matrix> * Always a filled rectangular array.
+<matrix2x5> * A 2x5 matrix (2 rows by 5 columns).
+<mstypec> * Multiple synonymous subtypes -- used in
+ def. of "grass".
+<mtable> * Multiple table, encloses <table> figures.
+<musfig> * Music figure. Only in a note under the entry "Figure",
+ the two numbers of each such field
+ are bold, 20 point type, stacked as in a fraction with
+ a bar between them, but also having a horizontal stroke
+ midway through each numeral. Unique to this entry.
+<p> * paragraph tag, used always in pairs. Line breaks may
+ be embedded inside the paragraphs.
+<person> * marks the proper name of a person. Used only
+ occasionally, but should be used more frequently
+ for cases where first names are abbreviated,
+ to reduce ambiguity of the period for automatic
+ analysis. Where a title is given, prefixed
+ or postfixed, it is included in this tag.
+
+<persfn> * marks the name of a person, when only one name
+ (usually the last name) is given. Not used
+ consistently where it should be.
+
+<publ> * Marks the name of a publication other than book,
+ which is marked by <booki>. It is often a
+ magazine or journal.
+<qpers> * Tags the name of a person who is speaking,
+ within a quotation.
+<qperson> Same as <qpers>
+<cp> * Collocation, plain text -- used to tag phrases that
+ should be parsed as a unit, but have no typographical
+ significance.
+<qau> italic Always right-justified, as described for <au>.
+<ref> * A reference to a word in the vocabulary.
+<refs> * Marks the set of references used for a longer article
+ such as a biography.
+<river> * Marks the name of a river -- a proper name
+<rj> * Right justified
+<row> * Designates a row in a table.
+<state> * Name of a geopolitical state, the first subdivision of
+ a country. Includes, e.g. Canadian provinces.
+<subtypes> * Lists subtypes of the headword.
+<sup> * superscript
+<supr> * Supra. The two parts of each such field
+ are stacked, one over the other, *without* a
+ horizontal bar between (as in a fraction).
+ Used only in one entry, for a musical notation.
+<table> * Always a filled rectangular array, having <row> and <item>
+ elements.
+<td> * Table datum - one cell in a table
+<th> * Table header
+<tradename> * Tags a commercial Trade name
+<ttitle> * Table title (Larger than normal font)
+====================================================================
+
+Functional Tags
+--------------------------------------------------------------------
+Tag Font Meaning
+ (Comparatives are relative to the plain font.)
+-----------------------------------------------------------------------
+<-- --> * Comment, not a tag. These segments should be deleted
+ from the written or printed text.
+ Page numbers of the original text are indicated
+ within such comments; these may be left in, if
+ desired.
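A minimal sketch of the deletion step described for `<-- -->` comments, assuming each comment opens with `<--` and closes with the nearest following `-->`. The function name `strip_comments` is hypothetical, and the page-number comment in the usage below is an invented example, not text taken from the dictionary files.

```python
import re

# Non-greedy match so that each comment ends at the nearest "-->";
# DOTALL lets a comment span more than one line.
_COMMENT = re.compile(r"<--.*?-->", re.DOTALL)


def strip_comments(text):
    """Remove every <-- ... --> comment segment from the text."""
    return _COMMENT.sub("", text)
```

For example, `strip_comments("<p>abc<-- p. 12 --> def</p>")` returns `"<p>abc def</p>"`. If the page numbers are to be kept, the same pattern could be used with `findall` to collect the comments before removing them.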
+
+<! !> * HTML-style comment. Used to indicate page numbers
+ in the public domain version.
+
+<abbr> italic Tag for abbreviations, when mentioned within
+ the definition text.
+
+<adjf> small caps Tags for the actual adjective or adverb
+ comparatives or superlatives. Should be
+ indexed. See also conjf (verbs) and
+ decf (nouns).
+
+<altname> italic Alternative name. Usually for plants or animals,
+ but also used for other cases where words
+ are introduced by "also called", "called also",
+ "formerly called". These are functionally
+ *synonyms* for that word-sense.
+
+<altnpluf> italic Same as <altname>, but the marked word is a
+ plural form, whereas the headword is singular.
+
+<amorph> * Adjective morphological segment, primarily
+ the comparative and superlative forms.
+ The occasional adverb morphology is
+ also tagged this way.
+
+<as> * A segment occurring within the definitional
+ sentence, providing an example of usage of
+ the headword. Not conceptually a part of the
+ actual definition.
+
+<cd> smaller spacing Collocation definition. Similar in structure
+ to headword definitions (the <def> field). May
+ contain an <as> field. Plain type, but with
+ closer spacing than main definitions.
+
+<col> bold, smaller by 1 point
+ Collocation. A word combination containing the
+ headword (or a morphological derivative).
+ The collocations do not have an explicitly
+ marked part of speech.
+ See also <ecol>, tagging embedded collocations.
+
+<colp> Collocation, no typographic significance.
+ Used to mark a word combination defined in
+ the dictionary without effect on font.
+
+<conjf> small caps The conjugated (non-infinitive) forms of
+ verbs. imp. & p. p. is common, as well as
+ p. pr. & vb. n. Irregular variants of
+ these are less common. Words in this
+ field perhaps should be indexed.
+
+<cs> smaller vertical spacing
+ Collocation segment. The font and size are
+ normal in a cs, but the spacing between lines
+ is smaller (0.9 mm between lower-case letters,
+ rather than 1.1 mm in the main body of the
+ definition). For an on-line dictionary,
+ reproducing this typography is probably
+ pointless.
+
+<decf> small caps Declension form. The actual morphological
+ variants of nouns or pronouns. Should
+ be indexed.
+
+<ecol> * Embedded Collocation. A word combination
+ containing the headword (or a morphological
+ derivative), embedded within a definition
+ without a separate definition of its own.
+ These collocations should be defined
+ implicitly by the text of the definition in
+ which they are embedded.
+ See also <col>, tagging explicitly defined
+ collocations.
+
+<ent> Bold Entry field. Gives the headword without accent or
+ syllabication marks, and with special-character
+ symbols converted to their nearest ASCII
+ equivalents. Can be used without conversion
+ as the string that serves as the index word
+ for that entry.
+
+<er> Small Caps Entry reference. References to headwords
+ within the "etymology" section are in small
+ caps. Such references also occur
+ in the body of definitions, and in "usage"
+ segments.
+ Such entry references should function as hypertext
+ buttons to access that entry.
+
+<ety> * Etymology. Always contained within square
+ brackets. Normal type is used for explanatory
+ comments, and italics for the actual words
+ (marked <ets>) considered as etymological
+ sources.
+
+<ets> italic Etymological source. Words from which the
+ headword was derived, or to which it is related.
+ The Greek words within an etymology segment
+ are invariably etymology sources, and should
+ be marked as such, but are not so marked,
+ even in the rare cases where the Greek word
+ transliteration has been written in.
+
+<etsep> italic Etymological source, being the name of a person
+ or geographical location which is the eponym
+ for the concept. This is used to distinguish
+ eponymous etymologies from others, and can also
+ be found in the body of a definition or note,
+ not only in the etymology field. Very few
+ of the names that should be marked this way
+ have actually been so marked, as of version
+ 0.42. In cases where such eponymous names
+ have not yet been thus marked, they will
+ usually be marked by <xex>, the non-semantic
+ italic-font marker, or, in etymologies, by
+ <ets>.
+
+<ex> italic Example. An example of usage of the headword,
+ usually found within an <as> or <note> segment.
+
+<fr> * Frequency of use, ordinal rank. This is used for
+ WordNet entries, in which the synonyms
+ were ranked in order of frequency of use.
+ <fr>1</fr> indicates that the headword is the
+ first word on the list of synonyms.
+
+<fu> * First use. A date at or around which the first
+ use of this word in writing is recorded.
+ Not in the original 1913 Webster, and usu.
+ taken from a recent dictionary. Only a few
+ such fields have been entered as of version
+ 0.41
+
+<grk> transliteration Greek. The Greek words have been transliterated
+ using the equivalents explained in the
+ file "webfonts.asc". In most cases, the
+ transliterations are typical for Greek
+ letters, except for theta (transl. = q),
+ phi (transl. = f), eta (transl. = h), and
+ upsilon (transl. = y, whether pronounced
+ as y or u). This was to eliminate any
+ ambiguity. These words occur primarily
+ in etymologies, and to conform to the
+ usage of <ets> should also be marked
+ by <ets>, but as of version 0.41 they
+ are not usually thus marked.
+
+<hw> bold, larger by 2 points
+ Headword. Each main entry begins with the <hw>
+ mark, and ends at the next <hw> mark. The
+ main entries are not otherwise explicitly
+ marked as a distinctive field.
+ The same word may appear as a headword
+ several times, usually as different parts
+ of speech, but sometimes with different
+ entries as the same part of speech, presumably
+ to indicate a different etymology.
+ Within the hw field the heavy accent is
+ represented by double quote ("), the
+ light accent by open-single-quote (`),
+ and the short dash separating syllables by
+ an asterisk (*). A hyphen (-) is used to
+ represent the hyphen of hyphenated words.
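The accent and syllable marks described for the `<hw>` field can be stripped to recover the plain spelling. The sketch below assumes only the three marks listed above (double quote, open-single-quote, asterisk) and keeps hyphens. The function name `headword_to_index` and the sample strings in the usage note are illustrative assumptions; unlike the `<ent>` field, this sketch does not convert special-character symbols to ASCII.

```python
# Translation table that deletes the heavy accent ("), the light accent (`),
# and the syllable separator (*); every other character is kept as-is.
_HW_MARKS = str.maketrans("", "", "\"`*")


def headword_to_index(hw):
    """Strip accent and syllable marks from the content of an <hw> field."""
    return hw.translate(_HW_MARKS)
```

With the illustrative inputs `'Ab"a*cus'` and `'Well"-be`ing'`, the function returns `'Abacus'` and `'Well-being'`.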
+
+<mark> italic, but explanatory may be plain.
+ Usage mark. Almost always within square
+ brackets, occasionally in parentheses or
+ without any bracketing.
+ The most common usage marks,
+ "Obs." = obsolete, "R." = rare, "Colloq." =
+ colloquial, "Prov. Eng." = Provincial England,
+ etc. are in italics. Some usage notes are also
+ marked with <mark>, but are in plain. For
+ simplicity, all words in this field may be
+ italic, until additional explicit marks are
+ added.
+
+<markp> * A usage mark in plain type (not italic). Found
+ within a definition, when more than
+ one sense number is listed. "Fig." at the head
+ of an entry is the most common case.
+
+<mcol> * Multiple collocation. Similar to multiple
+ headword, when two or more collocations share
+ one definition; however, the two collocations
+ are in-line, rather than stacked or justified.
+ There may be "or" or "and" words
+ (italicised), or an "etc." (plain type)
+ within this field. In many cases, the
+ <or/ and <and/ entities are used to
+ signify the change of font for these words.
+
+<mhw> * Multiple headword. This field is used where
+ more than one headword shares a single
+ definition. In the dictionary, the
+ (usually) two headwords are left-justified
+ one below the other in the column, and are
+ tied together on the right side of the
+ headwords by a long right curly brace.
+ This division is strictly functional,
+ for analytical purposes, and does not
+ affect the typography.
+
+<nmorph> * Noun morphology section. Rarely used, mostly
+ for irregular personal pronouns.
+
+<note> * Explanatory note. No explicit font is indicated.
+ These segments may be separate, as in the
+ separate paragraphs starting <note><hand/,
+ or they may just be further explanation within
+ (or more usually, following) the main
+ definition paragraph. Typographically,
+ the notes following the main definition may
+ not be distinguishable from additional
+ sentences appended to the first sentence
+ of a definition.
+
+<plu> * Plural. The "plural" segment starts with a
+ "pl." which is italicised, but in this
+ segment is not otherwise marked as
+ italicised. Other words occurring in this
+ segment are plain type. The "pl." can be
+ easily explicitly marked if necessary.
+
+<pos> italic Part of speech. Always an abbreviation: e.g.,
+ n.; v. i.; v. t.; a.; adv.; pron.; prep.
+ Combinations may occur, as "a. & n.".
+
+<epos> * Part of speech, referring to words in
+ etymologies, normal type. Always an
+ abbreviation, as in <pos> above.
+ Combinations may occur, as "a. or n.".
+
+<plw> small caps Plural word. The actual plural form of the word,
+ found within a <plu> segment.
+
+<pr> * pronunciation. The default font is normal, but
+ many non-ASCII characters are used.
+ The pronunciation field may have more than
+ one pronunciation, separated by an "<or/".
+ (An "or" here is in italic, and usually is
+ represented by the entity <or/).
+ There may also be some commentary, such as
+ "Fr." (French pronunciation) or "archaic".
+ The commentaries are typically italic, and
+ should be marked as such. In certain
+ pronunciations there is a numbered reference
+ to a root form explained in an introductory
+ section on pronunciation.
+ Very few of the pronunciation fields have
+ been filled in. The pronunciation markings use
+ a more complicated method than more modern
+ dictionaries. It would be interesting to have
+ these fields filled in, if there are any
+ volunteers willing to do it.
+
+<q> smaller by two points, centered, separate paragraph
+ Quotation. No bracketing quotation marks,
+ though occasionally \'bd-\'b8 quotations occur
+ within these quotations. These quotations
+ tend to be more complete sentences, rather
+ than just phrases, such as are contained
+ within quotation marks within the definition
+ paragraph.
+
+<qau> italic, right justified
+ Quotation author. Used only for the quotations
+ marked with <q> that are centered in their
+ own paragraphs.
+
+<qex> italic Quotation example. An example of usage of
+ the headword, within quotations marked
+ by <q>..</q> tags.
+
+<sd> italic Subdefinition, marked (a), (b), (c), etc. These are
+ finer distinctions of word senses, used
+ within numbered word-senses (for main entries),
+ and also used for subdefinitions within
+ collocation segments, which have no numbering of
+ senses. The letter is italic, the parentheses
+ are not. This tag is also used to indicate the
+ lettered subdefinition when it is referred to
+ at another point in the text.
+
+<ship> italic The name of a ship. Rarely used.
+
+<sing> * Singular. Analogous to the <plu> segment, but more
+ rarely used, mostly for Indian tribes, which
+ are listed in the plural form.
+
+<singw> small caps Singular word. The singular form of the
+ plural-form headword.
+
+<sn> bold, larger by 2 points
+ Sense number. A headword may have over 20
+ different sense numbers. Within each numbered
+ sense there may be lettered sub-senses. See
+ the <sd> (sub-definition) field.
+
+<source> italic Source. The author of the definition. Used only
+ for definitions not originally present in
+ Webster 1913, and not present in the original
+ version intended to mimic the 1913 printed
+ dictionary. This source is used for each
+ word sense, and may differ for different
+ senses of a word, especially where a Web1913
+ definition was substantially modified, or a
+ new word sense was added to a previously
+ defined word.
+
+<syn> plain Synonyms. A list of synonyms, sometimes followed
+ by a <usage> segment.
+
+<usage> narrower spacing
+ Comparisons of word usage for words which are
+ sometimes confused. As with collocation segments,
+ font is plain, but spacing is smaller than
+ normal definition spacing. This seems pointlessly
+ complicating for an on-line display.
+
+<ver> * Verified for current accuracy by a technical editor,
+ without changes.
+
+<vmorph> * Verb morphology (conjugation) segment, delimited
+ by square brackets.
+
+<wordforms> * Morphological derivatives not contained in the
+ bracketed segments, as above. For nouns
+ derived from adjectives, adverbs from
+ adjectives, etc. This segment is usually
+ found at the end of the main entry. The
+ adverbial and nominalized derivatives at the
+ end of a main entry are usually introduced
+ by an em dash [represented as two hyphens (--)].
+
+<wf> bold, larger by 2 points
+ Same font as <hw>, with accents and syllable
+ breaks marked as in the headword.
+ Marks the actual morphological forms within
+ a <wordforms> segment; typically, adverbial or
+ nominalized form of an adjective.
+
+
+<def2> * Second definition (occasionally, a third definition is
+ present). This is used where a second or third
+ part of speech with the same orthography is
+ placed under one headword. Within this segment,
+ there will be a <pos> field, and sometimes
+ a <mark> and/or a quotation.
+
+<specif> * "Specifically:" Used to mark the words "specifically",
+ "Hence", "as" which are used to introduce a second
+ definition typically more specific than the first,
+ but in general derived by extension of the initial
+ definition. This functions as a warning of multiple
+ definitions where the sense-numbers are not explicitly
+ used. It is also useful in separate senses, to
+ tag polysemous definitions which may be
+ specializations or generalizations of the preceding
+ definition.
+
+<pluf> italic Plural form.
+ Used exclusively to mark the "pl." abbreviation,
+ which introduces a definition for the headword,
+ *when used in the plural form*. Not related to
+ <plu>, which spells out the plural form, but does
+ not define it.
+
+<uex> italic Usage example. Used only a few times, within
+ <usage> segments.
+
+<isa> italic Supertype (hypernym); the inverse of <stype> and
+ identical to <hypen>, but not derived from WordNet.
+
+<chform> plain, numbers subscript
+ Chemical formula. The letters are plain font,
+ but the numbers are subscript. This is mostly
+ useful as a functional mark to pinpoint