Diffstat (limited to 'TAGSET.WEB')
-rw-r--r-- TAGSET.WEB 2120
1 files changed, 1060 insertions, 1060 deletions
diff --git a/TAGSET.WEB b/TAGSET.WEB
index 5714751..1409569 100644
--- a/TAGSET.WEB
+++ b/TAGSET.WEB
@@ -1,1060 +1,1060 @@
- FIELD MARKS FOR WEBSTER 1913 and CIDE
- =====================================
-Tagset.web:
- Explanations of the tags used to mark the Webster 1913 dictionary
-and the CIDE (Collaborative International Dictionary of English).
-Note that the list of tags used to mark the public domain version
-of this dictionary is shorter than the full set described here.
- If any tag is not listed here, it is either (1) one of the
-"point" (font size) or "type" (font style) tags, which should be self-explanatory; or
- (2) Is a functional field with no effect on the typography.
-
-Last modified March 12, 1999.
- For questions, contact:
- Patrick Cassidy cassidy@micra.com
- 735 Belvidere Ave.
- Plainfield, NJ 07062
- (908) 561-3416 or (908) 668-5252
--------------------------------------------------------------
-A separate file, webfont.asc, contains the list of the individual
-non-ASCII characters represented by either higher-order hexadecimal
-character marks (e.g., \'94, for o-umlaut) or by entity tags
-(e.g., <root/, for the square root symbol.)
---------------------------------------------------------------
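The two kinds of character marks described above can be decoded with a small lookup, sketched here in Python. Only the marks actually named in this tag description are included; the full table is in webfont.asc and is not reproduced. The function name `decode_marks` and the choice of Unicode curly quotes for the open-quote and close-quote marks are assumptions made for illustration.

```python
import re

# Only marks named in this description; extend from webfont.asc as needed.
HEX_MARKS = {
    "94": "\u00f6",  # o-umlaut
    "bd": "\u201c",  # open double quote (Unicode choice is an assumption)
    "b8": "\u201d",  # close double quote (Unicode choice is an assumption)
}
ENTITIES = {
    "root": "\u221a",  # square root symbol
    "br": "\n",        # line break
}

HEX_RE = re.compile(r"\\'([0-9a-fA-F]{2})")      # e.g. \'94
ENTITY_RE = re.compile(r"<([A-Za-z][A-Za-z0-9]*)/")  # e.g. <root/

def decode_marks(text):
    """Replace known \\'hh marks and <name/ entities; leave unknown ones as-is."""
    def hex_sub(m):
        return HEX_MARKS.get(m.group(1).lower(), m.group(0))

    def ent_sub(m):
        return ENTITIES.get(m.group(1), m.group(0))

    return ENTITY_RE.sub(ent_sub, HEX_RE.sub(hex_sub, text))
```

Unknown marks pass through unchanged, so a partial table never silently drops characters.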
- Use of tags:
- In the MICRA electronic version of the 1913 Webster, each part of
-the entry headed by an entry word ("headword") is labeled so that no
-part of the entry except some punctuation marks should be found
-outside of all fields, i.e. every character should be within some tagged
-field. In the following description, the word "segment" usually refers to
-a major part of an entry such as an etymology or a definition or a
-collocation segment or a usage block, containing more than one field.
-The term "field" may also be used similarly to "segment", but may also
-denote single-word fields, such as an alternative spelling, labeled <asp>.
-
- Note: The tags on this list are similar in structure to SGML tags. Each
-tag on this list marks a field; each field opens with a tagname between
-angle brackets thus: <tagname>, and closes with a similar tag containing
-the forward slash thus: </tagname>. No tags are used without closing
-tags. Thus the HTML <BR> to indicate a line break is symbolized
-here as an entity, <br/, and every <p> has a corresponding </p>.
- The absence of an end-field tag, or the presence of an end-field tag
-without a prior begin-field tag constitutes a typographical error, of which
-there may be a significant number. Any errors detected should be brought
-to the attention of PJC or the appropriate editor.
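The pairing rule above (every begin-field tag needs an end-field tag, and entities such as <br/ stand alone) lends itself to a simple stack check. The following Python sketch is one possible way to locate such typographical errors; the function name `find_tag_errors` and the error messages are hypothetical, and tag names are assumed to consist of letters, digits, and periods.

```python
import re

# Matches <name> and </name>. Entities such as <br/ do not match, because
# the name is followed by a slash rather than by the closing '>'.
TAG_RE = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9.]*)>")

def find_tag_errors(text):
    """Return a list of (position, message) for unbalanced field tags."""
    errors = []
    stack = []  # (tagname, position) of fields that are currently open
    for m in TAG_RE.finditer(text):
        closing, name = m.group(1) == "/", m.group(2)
        if not closing:
            stack.append((name, m.start()))
        elif stack and stack[-1][0] == name:
            stack.pop()
        else:
            errors.append((m.start(), f"</{name}> without matching <{name}>"))
    # Anything still open at the end of the text was never closed.
    for name, pos in stack:
        errors.append((pos, f"<{name}> never closed"))
    return errors
```

Tags that are documented as having no closing tag will be reported by this check and would need to be whitelisted by the caller.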
- Most of the tagged fields are presented in the text in italic type,
-with a number of exceptions. Where a word is contained within more than
-one field, the innermost field determines the font to be used. Wherever
-recognizable functional fields were found, an attempt was made to tag the
-field with a functional mark, but in many cases, words were italicised only
-to represent the word itself as a discourse entity, and in some such cases,
-the "italic" mark <it> was used, implying nothing regarding functionality
-of the word. The base font is considered "plain". Where an italic field
-is indicated, parentheses or brackets within the field are not italicised.
- Where no font is specified for a tag, the tag is merely a functional
-division, and was printed in plain font unless otherwise tagged. This type
-of segment is marked by an asterisk (*) where the font name would be.
- The size of the "plain" font in the original text is about 1.6 mm for
-the height of capitalized letters.
-=============================================================
-Explicit typographical tags:
- These were used where the purpose of a different font was merely to
-distinguish a word from the body of the text, and no explicit functional
-tag seemed appropriate.
------------------------------------
-Tag Font
------------------------------------
-Explicit formatting tags:
-. . . . . . . . . . . . . . . . . .
-<plain> plain font (that used in the body of a definition) --
- normally not marked, except within fields of
- a different font.
-<it> italic (in master files)
-<i> italic (for use in HTML presentation)
-<bold> bold (in master files)
-<b> bold (for use in HTML presentation)
-<colf> bold, Collocation font. Same font as used in collocations.
- smaller This is used only in the list of "un-" words not
- by 1 point actually defined in the dictionary. Probably could be
- replaced by a segment mark for the entire list!
- The "un-" words should be indexed as headwords.
-
-<ct> bold Same as <colf>, a font similar to that used in
- collocations. However, this tag is used in a table
- and could be set to a different font.
-
-<h1> * HTML tag -- largest heading font.
-
-<h2> * HTML tag -- second largest heading font.
-
-<headrow> * Marks a Row title in a table.
-
-<hwf> Font the same as the headword <hw>, though the field is
- not a headword. Used only once.
-
-<mitem> * Multiple items, a set of items in a table.
-<point ...> A series of point size markers, many unique.
-<point1.5> * One of the tags of the form <point**> where **
-<point6> represents the typographic point size of the
- enclosed text.
-<pre> An HTML tag indicating that the enclosed text is
- of teletype form, preformatted in a uniform-spaced
- font.
-<sc> small caps (used mostly for "a. d.", "b. c.")
- This is the same font as <er>, but has no functional
- or semantic significance
-<str> group of table data elements in a table
-<sub> subscript, like <subs>
-<subs> subscript
-<sups> superscript
-<supr> superscript
-<sansserif> Sans-serif font
-<stypec> Bold (collocation font) and also a subtype.
-<tt> HTML tag -- teletype font
-<universbold> A squared bold font without serifs approximating the
- "universe bold" font on the HP Laserjet4, slightly
- larger than the capitals in a definition body. Used
- in expositions describing shapes, such as
- "Y", "T", "U", "X", "V", "F".
-<vertical> Vertically organized column.
-<column1> Vertically organized column -- only part of a table
- which needs to be completed. Used once.
-<...type> A series of tags, many unique, designating certain
- unusual fonts, such as "bourgeoistype" for
- "bourgeois type", in the section on typography.
- Most of these occur only once, in the section on fonts.
-<antiquetype>
-<blacklettertype>
-<boldfacetype>
-<bourgeoistype>
-<boxtype>
-<clarendontype>
-<englishtype>
-<extendedtype>
-<frenchelzevirtype>
-<germantype>
-<gothictype>
-<greatprimertype>
-<longprimertype>
-<miniontype>
-<nonpareiltype>
-<oldenglishtype>
-<oldstyletype>
-<pearltype>
-<picatype>
-<scripttype>
-<smpicatype>
-<typewritertype>
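Because the <point**> and <...type> tags follow fixed naming patterns, a program can recognize them without listing each one. The Python sketch below illustrates this; the function name `classify_font_tag` is hypothetical, and it assumes that the point size is a plain decimal number and that any tag name ending in "type" belongs to the font series.

```python
import re

POINT_RE = re.compile(r"^point(\d+(?:\.\d+)?)$")

def classify_font_tag(name):
    """Classify a tag name as ('point', size), ('type', face), or None."""
    m = POINT_RE.match(name)
    if m:
        return ("point", float(m.group(1)))
    # Assumption: every tag name ending in "type" is one of the font tags.
    if name.endswith("type") and len(name) > len("type"):
        return ("type", name[:-len("type")])
    return None
```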
-
-=============================================================
-Tags with semantic content:
-. . . . . . . . . . . . . . . . . . . . . . . . . . .
-<altsp> * Alternative spelling segment. Almost always
- contained within square brackets after the main
- definition segment. Expository words
- such as "Spelled also" are in plain font;
- the actual alternative spelling is marked by
- <asp> ... </asp> tags within this segment.
-
-<ant> italic Antonym.
-
-<asp> italic Alternative spelling. The actual word which is an
- alternative spelling to the headword. These
- are functionally synonyms of the headword. In
- most cases these also occur as headwords, with
- reference to the word where the actual definition
- is found, but not all such words are listed
- separately, particularly if the spelling is
- close enough to the headword to be found at the
- same point in the dictionary. Whether listed
- separately or not, these words should
- be indexed at this location, also.
-
-<au> italic Authority or author. Used where an authority is
- (may be right- given for a definition, and also used for the
- justified. See author, where a quotation within double quotes
- in the section is given in the same paragraph as the
- on formatting). definition. The double quotes are indicated
- by the open-quote (\'bd) and close-quote
- (\'b8). In both cases, it is typically
- right-justified, almost always fitting on
- the same line with the last line of the
- definition or quotation.
- Within collocation segments, it is usually
- used only after quotations, and is not right-
- justified, except occasionally where it
- would be close to the right margin, and then
- apparently it is right-justified. We have
- not explicitly marked those which are
- right-justified, but they can be
- recognized because they are on a line by
- themselves, preceded by two carriage returns.
-
-<bio> * Marks a biography. Should be longer than
- a short mention of who a person was, which
- is typically included as a definition.
-
-<biography> * Same as <bio>
-
-<booki> italic Marks the name of a book, pamphlet, or similar
- document.
-
-<branchof> * A field of knowledge of which the headword
- is a division.
-
-<caption> * Caption of a figure or table.
-
-<cas> * tags the CAS (Chemical Abstracts Service) registry
- number for a chemical substance.
-
-<causes> italic tags the infectious disease caused by the headword.
- Implied type of the agent is a microorganism, and
- the tag must mark a disease.
-
-<causesp> * Same as <causes> without the italic type.
-<causedbyp> * Same as <causedby> without the italic type.
-
-<causedby> italic inverse of causes: tags the causative agent of an
- infectious disease, which is the headword.
- The tag must mark a microorganism, virus, or
- prion, and the implied type of the headword is
- a disease.
-
-<centered> Used only for the single letter in the headers to each
- letter of the alphabet.
-
-<city> * marks the proper name of a city. Used only
- occasionally and not consistently at this stage.
-
-<cnvto> italic Converted to: used to tag substances which are
- products prepared by conversion from the
- headword. Usually chemicals or complex
- products from natural materials. Rarely used
- up to 1998.
-
-<colheads> * List of heads for the columns of a table.
-
-<coltitle> * Title of a column in a table.
-
-<comm> * Comment -- differs from <note> in being in-line with
- the definition paragraph. Provides a little
- additional information.
-
-<company> * Name of a company (commercial firm). Compare <org>
-
-<compof> italic Composed of. Tags a substance of which the
- headword is at least partly composed. The
- substance may be particulate, such as
- diatoms composing diatomaceous earth.
-
-<contains> * marks an object contained within the headword.
-
-<contr> italic Contrasting word. Not exactly an antonym, which
- is marked <ant>, but a contrasting word which is
- often introduced as "opposite to" or "contrasts
- with".
-
-<country> * Name of a country (nation) of the world.
-
-<cref> italic Collocation reference. A reference to a collocation.
- Each such collocation should have its own entry,
- marked by <col> ... </col> tags, and these
- references should function as hypertext buttons
- to access that entry.
-
-<date> * A Date, of any type, e.g. <date>Dec. 25</date>.
-
-<datey> * Date-with-year tags a date containing a year.
-
-<def> * definition. The definition may have subfields,
- particularly <as> (an illustrative phrase
- starting with "as" or "thus" and containing
- the headword (or a morphological derivative)).
- The <mark>, \'bd...\'b8 quotations (left and
- right double quotes) and <au> fields may be
- found within a definition field, but should be,
- and usually are, located outside the definition
- proper. The marking macro was
- inconsistent in this placement, and the
- exclusion of the <mark>, <au> and quotations
- needs to be completed by the proof-readers.
- Certain definitions contain <pos>
- fields within them, where the headword is
- an irregular derivative of another headword.
- In these cases, the <pos> field follows
- immediately after the <def> tag, and these
- entries do not have a separate <pos> field.
- In such cases, the <pos> field is italic, as
- usual.
-
-<divof> * Division of the headword, usually an organization.
- E. g. a faculty or department of a university,
- or a United Nations agency.
-
-<edi> * Marks an education institution, a subtype of
- organization.
-
-<emits> * tags a physical object or form of radiation
- emitted by the headword
-
-<figure> Just a place-holder for illustrations, but seldom used.
-
-<film> italic Marks the name of a movie film.
-
-<fld> italic Field of specialization. Most often used for
- Zoology and Botany, but many "fields of
- specialization" are marked for technical
- terms. The parentheses are usually within this
- field, but are not themselves in italics.
-
-<geog> * Name of a geographical region of any size;
- if applicable, the more specific <city>,
- <state>, or <country> are preferred.
-
-<hypen> * Hypernym. Points to the hypernym from WordNet 1.5.
- Initially, used only for entries extracted
- from WordNet 1.5. Not present in the original
- 1913 version.
-
-<illu> * Illustrative usage -- mostly from WordNet, and placed
- outside the definition, in contrast to <as> usage.
- These should be converted to <as>...</as> illustrative
- usage format for consistency.
-
-<illust> * Illustration place-holder. Seldom used.
-<img> * HTML usage -- points to an image file, usually
- .gif or .jpg. These have no closing tag, and
- will appear as errors in parsing.
-<intensi> * Points to a word whose meaning is an intensified
- form of the headword. Taken from WordNet
- tags, used with some adjectives from WordNet
-<item> * Designates one item in a row of a table. Used only when
- intervening spaces do not serve properly as natural
- field separators.
-<itran> italic Translation into a foreign (non-English) language
- of the previous word in the text -- italic font.
- (<sig> is a translation into English)
-<itrans> italic Same as <itran>
-<jour> * Title of a journal (periodical).
-<matrix> * Always a filled rectangular array.
-<matrix2x5> * A 2x5 matrix (2 rows by 5 columns).
-<mstypec> * Multiple synonymous subtypes -- used in
- def. of "grass".
-<mtable> * Multiple table, encloses <table> figures.
-<musfig> * Music figure. Only in a note under the entry "Figure",
- the two numbers of each such field
- are bold, 20 point type, stacked as in a fraction with
- a bar between them, but also having a horizontal stroke
- midway through each numeral. Unique to this entry.
-<p> * paragraph tag, used always in pairs. Line breaks may
- be embedded inside the paragraphs.
-<person> * marks the proper name of a person. Used only
- occasionally, but should be used more frequently
- for cases where first names are abbreviated,
- to reduce ambiguity of the period for automatic
- analysis. Where a title is given, prefixed
- or postfixed, it is included in this tag.
-
-<persfn> * marks the name of a person, when only one name
- (usually the last name) is given. Not used
- consistently where it should be.
-
-<publ> * Marks the name of a publication other than book,
- which is marked by <booki>. It is often a
- magazine or journal.
-<qpers> * Tags the name of a person who is speaking,
- within a quotation.
-<qperson> Same as <qpers>
-<cp> * Collocation, plain text -- used to tag phrases that
- should be parsed as a unit, but has no typographical
- significance.
-<qau> italic Always right-justified, as described for <au>.
-<ref> * A reference to a word in the vocabulary.
-<refs> * Marks the set of references used for a longer article
- such as a biography.
-<river> * Marks the name of a river -- a proper name
-<rj> * Right justified
-<row> * Designates a row in a table.
-<state> * Name of a geopolitical state, the first subdivision of
- a country. Includes, e.g. Canadian provinces.
-<subtypes> * Lists subtypes of the headword.
-<sup> * superscript
-<supr> * Supra. The two parts of each such field
- are stacked, one over the other, *without* a
- horizontal bar between (as in a fraction).
- Used only in one entry, for a musical notation.
-<table> * Always a filled rectangular array, having <row> and <item>
- elements.
-<td> * Table datum - one cell in a table
-<th> * Table header
-<tradename> * Tags a commercial Trade name
-<ttitle> * Table title (Larger than normal font)
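The table tags above describe a filled rectangular array built from <row> and <item> elements, with <item> used only where spaces do not separate the fields naturally. A minimal Python sketch of reading such a table follows; the function names `parse_table` and `is_rectangular` are hypothetical, and nested tables and rows that mix <item> fields with loose text are not handled.

```python
import re

ROW_RE = re.compile(r"<row>(.*?)</row>", re.DOTALL)
ITEM_RE = re.compile(r"<item>(.*?)</item>", re.DOTALL)

def parse_table(table_body):
    """Split the inside of a <table> field into rows of cell strings."""
    rows = []
    for row in ROW_RE.findall(table_body):
        items = ITEM_RE.findall(row)
        # <item> is only used when spaces are not natural separators,
        # so fall back to whitespace splitting when no <item> is present.
        rows.append([i.strip() for i in items] if items else row.split())
    return rows

def is_rectangular(rows):
    """True when every row has the same number of cells."""
    return len({len(r) for r in rows}) <= 1
```

A table that fails `is_rectangular` is a candidate for proofreading, since the description states that tables are always filled rectangular arrays.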
-====================================================================
-
-Functional Tags
---------------------------------------------------------------------
-Tag Font Meaning
- (Comparatives are relative to the plain font.)
------------------------------------------------------------------------
-<-- --> * Comment, not a tag. These segments should be deleted
- from the written or printed text.
- Page numbers of the original text are indicated
- within such comments; these may be left in, if
- desired.
-
-<! !> * HTML-style comment. Used to indicate page numbers
- in the public domain version.
-
-<adjf> small caps Tags for the actual adjective or adverb
- comparatives or superlatives. Should be
- indexed. See also conjf (verbs) and
- decf (nouns).
-
-<altname> italic Alternative name. Usually for plants or animals,
- but also used for other cases where words
- are introduced by "also called", "called also",
- "formerly called". These are functionally
- *synonyms* for that word-sense.
-
-<altnpluf> italic Same as <altname>, but the marked word is a
- plural form, whereas the headword is singular.
-
-<amorph> * Adjective morphological segment, primarily
- the comparative and superlative forms.
- The occasional adverb morphology is
- also tagged this way.
-
-<as> * A segment occurring within the definitional
- sentence, providing an example of usage of
- the headword. Not conceptually a part of the
- actual definition.
-
-<cd> smaller spacing Collocation definition. Similar in structure
- to headword definitions (the <def> field). May
- contain an <as> field. Plain type, but with
- closer spacing than main definitions.
-
-<col> bold, Collocation. A word combination containing the
- smaller by headword (or a morphological derivative).
- 1 point The collocations do not have an explicitly
- marked part of speech.
- See also <ecol>, tagging embedded collocations.
-
-<colp> Collocation, no typographic significance.
- Used to mark a word combination defined in
- the dictionary without effect on font.
-
-<conjf> small caps The conjugated (non-infinitive) forms of
- verbs. imp. & p. p. is common, as well as
- p. pr. & vb. n. Irregular variants of
- these are less common. Words in this
- field perhaps should be indexed.
-
-<cs> smaller Collocation segment. The font and size is
- vertical normal in a cs, but the spacing between lines
- spacing is smaller (0.9 mm between lower-case letters,
- rather than 1.1 mm in the main body of the
- definition). For an on-line dictionary,
- reproducing this typography is probably
- pointless.
-
-<decf> small caps The actual morphological variants of nouns or
- pronouns. Should be indexed.
-
-<ecol> * Embedded Collocation. A word combination
- containing the headword (or a morphological
- derivative), embedded within a definition
- without a separate definition of its own.
- These collocations should be defined
- implicitly by the text of the definition in
- which they are embedded.
- See also <col>, tagging explicitly defined
- collocations.
-<er> Small Caps Entry reference. References to headwords
- within the "etymology" section are in small
- caps. Such references also occur
- in the body of definitions, and in "usage"
- segments.
- Such entry references should function as hypertext
- buttons to access that entry.
-
-<ety> * Etymology. Always contained within square
- brackets. Normal type is used for explanatory
- comments, and italics for the actual words
- (marked <ets>) considered as etymological
- sources.
-
-<ets> italic Etymological source. Words from which the
- headword was derived, or to which it is related.
- The Greek words within an etymology segment
- are invariably etymology sources, and should
- be marked as such, but are not so marked,
- even in the rare cases where the Greek word
- transliteration has been written in.
-
-<etsep> italic Etymological source, being the name of a person
- or geographical location which is the eponym
- for the concept. This is used to distinguish
- eponymous etymologies from others, and can also
- be found in the body of a definition or note,
- not only in the etymology field. Very few
- of the names that should be marked this way
- have actually been so marked, as of version
- 0.42. In cases where such eponymous names
- have not yet been thus marked, they will
- usually be marked by <xex>, the non-semantic
- italic-font marker, or, in etymologies, by
- <ets>.
-
-<ex> italic Example. An example of usage of the headword,
- usually found within an <as> or <note> segment.
-
-<fr> * Frequency of use, ordinal rank. This is used for
- WordNet entries, in which the synonyms
- were ranked in order of frequency of use.
- <fr>1</fr> indicates that the headword is the
- first word on the list of synonyms.
-
-<fu> * First use. A date at or around which the first
- use of this word in writing is recorded.
- Not in the original 1913 Webster, and usu.
- taken from a recent dictionary. Only a few
- such fields have been entered as of version
- 0.41.
-
-<grk> transliteration Greek. The Greek words have been transliterated
- using the equivalents explained in the
- file "webfonts.asc". In most cases, the
- transliterations are typical for Greek
- letters, except for theta (transl = q),
- phi (transl. = f), eta (transl. = h), and
- upsilon (transl. = y, whether pronounced
- as y or u). This was to eliminate any
- ambiguity. These words occur primarily
- in etymologies, and to conform to the
- usage of <ets> should also be marked
- by <ets>, but as of version 0.41 they
- are not usually thus marked.
-
-<hw> bold, headword. Each main entry begins with the <hw>
- larger by mark, and ends at the next <hw> mark. The
- 2 points main entries are not otherwise explicitly
- marked as a distinctive field.
- The same word may appear as a headword
- several times, usually as different parts
- of speech, but sometimes with different
- entries as the same part of speech, presumably
- to indicate a different etymology.
- Within the hw field the heavy accent is
- represented by double quote ("), the
- light accent by open-single-quote (`),
- and the short dash separating syllables by
- an asterisk (*). A hyphen (-) is used to
- represent the hyphen of hyphenated words.
-
-<mark> italic, Usage mark. Almost always within square
- brackets, occasionally in parentheses or
- without any bracketing.
- but The most common usage marks,
- explanatory "Obs." = obsolete "R." = rare, "Colloq." =
- may be plain. colloquial, "Prov. Eng." = Provincial England,
- etc. are in italics. Some usage notes are also
- marked with <mark>, but are in plain. For
- simplicity, all words in this field may be
- italic, until additional explicit marks are
- added.
-
-<markp> * A usage mark in plain type (not italic). Found
- within a definition, when there is more than
- one sense-number listed. "Fig." at the head
- of an entry is the most common case.
-
-<mcol> * Multiple collocation. Similar to multiple
- headword, when two or more collocations share
- one definition; however, the two collocations
- are in-line, rather than stacked or justified.
- There may be "or" or "and" words
- (italicised), or an "etc." (plain type)
- within this field. In many cases, the
- <or/ and <and/ entities are used to
- signify the change of font for these words.
-
-<mhw> * Multiple headword. This field is used where
- more than one headword shares a single
- definition. In the dictionary, the
- (usually) two headwords are left-justified
- one below the other in the column, and are
- tied together on the right side of the
- headwords by a long right curly brace.
- This division is strictly functional,
- for analytical purposes, and does not
- affect the typography.
-
-<nmorph> * Noun morphology section. Rarely used, mostly
- for irregular personal pronouns.
-
-<note> * Explanatory note. No explicit font is indicated.
- These segments may be separate, as in the
- separate paragraphs starting <note><hand/,
- or they may just be further explanation within
- (or more usually, following) the main
- definition paragraph. Typographically,
- the notes following the main definition may
- not be distinguishable from additional
- sentences appended to the first sentence
- of a definition.
-
-<plu> * Plural. The "plural" segment starts with a
- "pl." which is italicised, but in this
- segment is not otherwise marked as
- italicised. Other words occurring in this
- segment are plain type. The "pl." can be
- easily explicitly marked if necessary.
-
-<pos> italic Part of speech. Always an abbreviation: e.g.,
- n.; v. i.; v. t.; a.; adv.; pron.; prep.
- Combinations may occur, as "a. & n.".
-
-<plw> small caps Plural word. The actual plural form of the word,
- found within a <plu> segment.
-
-<pr> * pronunciation. The default font is normal, but
- many non-ASCII characters are used.
- The pronunciation field may have more than
- one pronunciation, separated by an "<or/".
- (An "or" here is in italic, and usually is
- represented by the entity <or/).
- There may also be some commentary, such as
- "Fr."(French pronunciation) or "archaic".
- The commentaries are typically italic, and
- should be marked as such. In certain
- pronunciations there is a numbered reference
- to a root form explained in an introductory
- section on pronunciation.
- Very few of the pronunciation fields have
- been filled in. The pronunciation markings use
- a more complicated method than more modern
- dictionaries. It would be interesting to have
- these fields filled in, if there are any
- volunteers willing to do it.
-
-<q> smaller by Quotation. No bracketing quotation marks,
- two points, though occasionally \'bd-\'b8 quotations occur
- centered, within these quotations. These quotations
- Separate tend to be more complete sentences, rather
- paragraph than just phrases, such as are contained
- within quotation marks within the definition
- paragraph.
-
-<qau> italic, Quotation author. Used only for the quotations
- right justified marked with <q> that are centered in their
- own paragraphs.
-
-<qex> italic Quotation example. An example of usage of
- the headword, within quotations marked
- by <q>..</q> tags.
-
-<sd> italic Subdefinition, marked (a), (b), (c), etc. These are
- finer distinctions of word senses, used
- within numbered word-sense (for main entries),
- and also used for subdefinitions within
- collocation segments, which have no numbering of
- senses. The letter is italic, the parentheses
- are not. This tag is also used to indicate the
- lettered subdefinition when it is referred to
- at another point in the text.
-
-<ship> italic The name of a ship. Rarely used.
-
-<sing> * Singular. Analogous to the <plu> segment, but more
- rarely used, mostly for Indian tribes, which
- are listed in the plural form.
-
-<singw> small caps Singular word. The singular form of the
- plural-form headword.
-
-<sn> bold, Sense number. A headword may have over 20
- larger by different sense numbers. Within each numbered
- 2 points sense there may be lettered sub-senses. See
- the <sd> (sub-definition) field.
-
-<source> italic Source. The author of the definition. Used only
- for definitions not originally present in
- Webster 1913, and not present in the original
- version intended to mimic the 1913 printed
- dictionary. This source is used for each
- word sense, and may differ for different
- senses of a word, especially where a Web1913
- definition was substantially modified, or a
- new word sense was added to a previously
- defined word.
-
-<syn> plain Synonyms. A list of synonyms, sometimes followed
- by a <usage> segment.
-
-<usage> narrower Comparisons of word usage for words which are
- spacing sometimes confused. As with collocation segments,
- font is plain, but spacing is smaller than
- normal definition spacing. This seems pointlessly
- complicating for an on-line display.
-
-<vmorph> * Verb morphology (conjugation) segment, delimited
- by square brackets.
-
-<wordforms> * Morphological derivatives not contained in the
- bracketed segments, as above. For nouns
- derived from adjectives, adverbs from
- adjectives, etc. This segment is usually
- found at the end of the main entry. The
- adverbial and nominalized derivatives at the
- end of a main entry are usually introduced
- by an em dash [represented as two hyphens (--)].
-
-<wf> bold, Same font as <hw>, with accents and syllable
- larger by breaks marked as in the headword.
- 2 points Marks the actual morphological forms within
- a <wordforms> segment; typically, adverbial or
- nominalized form of an adjective.
-
-
-<def2> * Second definition (occasionally, a third definition is
- present). This is used where a second or third
- part of speech with the same orthography is
- placed under one headword. Within this segment,
- there will be a <pos> field, and sometimes
- a <mark> and/or a quotation.
-
-<specif> * "Specifically:" Used to mark the words "specifically",
- "Hence", "as" which are used to introduce a second
- definition typically more specific than the first,
- but in general derived by extension of the initial
- definition. This functions as a warning of multiple
- definitions where the sense-numbers are not explicitly
- used. It is also useful in separate senses, to
- tag polysemous definitions which may be
- specializations or generalizations of the preceding
- definition.
-
-<pluf> italic. Plural form.
- Used exclusively to mark the "pl." abbreviation,
- which introduces a definition for the headword,
- *when used in the plural form*. Not related to
- <plu>, which spells out the plural form, but does
- not define it.
-
-<uex> italic Usage example. Used only a few times, within
- <usage> segments.
-
-<isa> italic supertype (hypernym) the inverse of <stype> and
- identical to <hypen> but not derived from WordNet.
-
-<chform> plain, Chemical formula. The letters are plain font,
- numbers but the numbers are subscript. This is mostly
- subscript useful as a functional mark to pinpoint
- chemicals.
-
-<chformi> plain, Chemical formula same as <chform>, but not
- processed specially by the tag-converter program.
- The letters are plain font, but the numbers are
- subscript.
- Used in place of <chform> when the formula has
- a tag inside, which cannot n