author | Sergey Poznyakoff <gray@gnu.org.ua> | 2012-02-03 12:48:52 +0200 |
---|---|---|
committer | Sergey Poznyakoff <gray@gnu.org.ua> | 2012-02-03 12:48:52 +0200 |
commit | d18a469b7a5a4d4b5da21eab37f34ab1e99a8dce (patch) | |
tree | 7eb331e376e85287c25b6a9734dae58a4724da8a /tagset.txt | |
parent | 4a458db06b28492a7e48b1a0560b35778e476482 (diff) | |
Revise tagset.txt
* tagset.txt: Review.
* README: Reformat.
* webfont.txt: Reformat. Document <and/ and <or/.
Diffstat (limited to 'tagset.txt')
-rw-r--r-- | tagset.txt | 2057 |
1 files changed, 1056 insertions, 1001 deletions
@@ -1,13 +1,14 @@ - FIELD MARKS FOR WEBSTER 1913 and CIDE - ===================================== - Explanations of the tags used to mark the Webster 1913 dictionary -and the CIDE (Collaborative International Dictionary of English). -Note that the list of tags used to mark the public domain version -of this dictionary is shorter than the full set described here. - If any tag is not listed here, it is either (1) one of the -"point" (font size) or "type" (font style) tags, which should be -self-explanatory; or (2) is a functional field with no effect on the -typography. +FIELD MARKS FOR WEBSTER 1913 and CIDE +===================================== + +* Overview + +This file describes the tags used to mark the Webster 1913 dictionary and +the GCIDE (GNU Collaborative International Dictionary of English). + +If any tag is not listed here, it is either (1) one of the "point" (font +size) or "type" (font style) tags, which should be self-explanatory; or (2) +is a functional field with no effect on the typography. Last modified March 12, 1999. For questions, contact: @@ -15,114 +16,145 @@ Last modified March 12, 1999. 735 Belvidere Ave. Plainfield, NJ 07062 (908) 561-3416 or (908) 668-5252 -------------------------------------------------------------- -A separate file, webfont.txt, contains the list of the individual + +A separate file, webfont.txt, contains the list of the individual non-ASCII characters represented by either higher-order hexadecimal -character marks (e.g., \'94, for o-umlaut) or by entity tags -(e.g., <root/, for the square root symbol.) --------------------------------------------------------------- - Use of tags: - In the MICRA electronic version of the 1913 Webster, each part of -the entry headed by an entry word ("headword") is labeled so that no -part of the entry except some punctuation marks should be found -outside of all fields, i.e. every character should be within some tagged -field. 
In the following description, the word "segment" usually refers to -a major part of an entry such as an etymology or a definition or a -collocation segment or a usage block, containing more than one field. -The term "field" may also be used similarly to "segment", but may also -denote single-word fields, such as an alternative spelling, labeled <asp>. - - Note: The tags on this list are similar in structure to SGML tags. Each -tag on this list marks a field; each field opens with a tagname between -angle brackets thus: <tagname>, and closes with a similar tag containing -the forward slash thus: </tagname>. No tags are used without closing -tags. Thus the HTML <BR> to indicate a line break is symbolized -here as an entity, <br/, and every <p> has a corresponding </p>. - The absence of an end-field tag, or the presence of an end-field tag -without a prior begin-field tag constitutes a typographical error, of which -there may be a significant number. Any errors detected should be brought -to the attention of PJC or the appropriate editor. - Most of the tagged fields are presented in the text in italic type, -with a number of exceptions. Where a word is contained within more than -one field, the innermost field determines the font to be used. Wherever -recognizable functional fields were found, an attempt was made to tag the -field with a functional mark, but in many cases, words were italicised only -to represent the word itself as a discourse entity, and in some such cases, -the "italic" mark <it> was used, implying nothing regarding functionality -of the word. The base font is considered "plain". Where an italic field -is indicated, parentheses or brackets within the field are not italicised. - Where no font is specified for a tag, the tag is merely a functional +character marks (e.g., \'94, for o-umlaut) or by entity tags (e.g., +<root/, for the square root symbol.) 
+ +* Introduction + +In the MICRA electronic version of the 1913 Webster and in GCIDE, each part +of the entry headed by an entry word ("headword") is labeled so that no part +of the entry except some punctuation marks should be found outside of all +fields, i.e. every character should be within some tagged field. In the +following description, the word "segment" usually refers to a major part of +an entry such as an etymology or a definition or a collocation segment or a +usage block, containing more than one field. The term "field" may also be +used similarly to "segment", but may also denote single-word fields, such as +an alternative spelling, labeled <asp>. + +The tags on this list are similar in structure to SGML tags. Each tag on +this list marks a field; each field opens with a tagname between angle +brackets thus: <tagname>, and closes with a similar tag containing the +forward slash thus: </tagname>. No tags are used without closing tags. +Thus a line break (similar to HTML <br> tag) is symbolized here as an +entity, <br/, and every <p> has a corresponding </p>. + +The absence of an end-field tag, or the presence of an end-field tag without +a prior begin-field tag constitutes a typographical error, of which there +may be a significant number. Any errors detected should be brought to the +attention of PJC or the appropriate editor. + +Most of the tagged fields are presented in the text in italic type, with a +number of exceptions. Where a word is contained within more than one field, +the innermost field determines the font to be used. Wherever recognizable +functional fields were found, an attempt was made to tag the field with a +functional mark, but in many cases, words were italicised only to represent +the word itself as a discourse entity, and in some such cases, the "italic" +mark <it> was used, implying nothing regarding functionality of the word. +The base font is considered "plain". 
Where an italic field is indicated, +parentheses or brackets within the field are not italicised. + +Where no font is specified for a tag, the tag is merely a functional division, and was printed in plain font unless otherwise tagged. This type -of segment is marked by an asterisk (*) where the font name would be. - The size of the "plain" font in the original text is about 1.6 mm for -the height of capitalized letters. -============================================================= -Explicit typographical tags: - These were used where the purpose of a different font was merely to -distinguish a word from the body of the text, and no explicit functional -tag seemed apropriate. ------------------------------------ -Tag Font ------------------------------------ -Explicit formatting tags: -. . . . . . . . . . . . . . . . . . -<plain> plain font (that used in the body of a definition) -- - normally not marked, except within fields of - a different front. -<it> italic (in master files) -<i> italic (for use in HTML presentation) -<bold> bold (in master files) -<b> bold (for use in HTML presentation) -<colf> bold, Collocation font. Same font as used in collocations. - smaller This is used only in the list of "un-" words not - by 1 point actually defined in the dictionary. Probably could be - replaced by a segment mark for the entire list! - The "un-" words should be indexed as headwords. - -<ct> bold Same as <colf>, a font similar to that used in - collocations. However, this tag is used in a table - and could be set to a different font. - -<h1> * HTML tag -- largest heading font. - -<h2> * HTML tag -- second largest heading font. - -<headrow> * Marks a Row title in a table. - -<hwf> Font the same as the headword <hw>, though the field is - not a headword. Used only once. - -<mitem> * Multiple items, a set of items in a table. -<point ...> A series of point size markers, many unique. 
-<point1.5> * One of the tags of the form <point**> where ** -<point6> represents the typographic point size of the - enclosed text. -<pre> An HTML tag indicating that the enclosed text is - of teletype form, preformatted in a uniform-spaced - font. -<sc> small caps (used mostly for "a. d.", "b. c.") - This is the same font a <er>, but has no functional - or semantic significance -<str> group of table data elements in a table -<sub> subscript, like <subs> -<subs> subscript -<sups> superscript -<supr> superscript -<sansserif> Sans-serif font -<stypec> Bold (collocation font) and also a subtype. -<tt> HTML tage -- teletype font -<universbold> A squared bold font without serifs approximating the - "universe bold" font on the HP Laserjet4, slightly - larger than the capitals in a definition body. Used - in expositions describing shapes, such as - "Y", "T", "U", "X", "V", "F". -<vertical> Vertically organized column. -<column1> Vertically organized column -- only part of a table - which needs to be completed. Used once. -<...type> A series of tags, many unique, designating certain - unusual fonts, such as "bourgeoistype" for - "bourgeois type", in the section on typography. - Most of these occur only once, in the section on fonts. +of segment is marked by an asterisk (*) where the font name would be. The +size of the "plain" font in the original text is about 1.6 mm for the height +of capitalized letters. + +* Explicit typographical tags + +These were used where the purpose of a different font was merely to +distinguish a word from the body of the text, and no explicit functional tag +seemed apropriate. + +------------------------------------------------------------------------- +Tag Font Description +------------------------------------------------------------------------- +<plain> plain font that used in the body of a definition -- normally + not marked, except within fields of a different + front. 
+ +<it> italic in master files + +<i> italic for use in HTML presentation + +<bold> bold in master files + +<b> bold for use in HTML presentation + +<colf> bold, Collocation font. Same font as used in + collocations. + smaller This is used only in the list of "un-" + by 1 point words not actually defined in the + dictionary. + Probably could be replaced by a segment mark + for the entire list! The "un-" words should + be indexed as headwords. + +<ct> bold Same as <colf>, a font similar to that used + in collocations. However, this tag is used + in a table and could be set to a different + font. + +<h1> * HTML tag -- largest heading font. + +<h2> * HTML tag -- second largest heading font. + +<headrow> * Marks a Row title in a table. + +<hwf> Font the same as the headword <hw>, though + the field is not a headword. Used only + once. + +<mitem> * Multiple items, a set of items in a table. +<point ...> A series of point size markers, many + unique. + +<point1.5> * One of the tags of the form <point**> where ** +<point6> represents the typographic point size of the + enclosed text. + +<pre> An HTML tag indicating that the enclosed + text is of teletype form, preformatted in a + uniform-spaced font. + +<sc> small caps used mostly for "a. d.", "b. c." + This is the same font as in <er>, but has no + functional or semantic significance. + +<str> group of table data elements in a table. + +<sub> subscript + +<subs> subscript + +<sups> superscript + +<supr> superscript + +<sansserif> Sans-serif + +<stypec> Bold collocation font, and also a subtype. + +<tt> HTML tage -- teletype font + +<universbold> A squared bold font without serifs approximating + the "universe bold" font on the HP Laserjet4, + slightly larger than the capitals in a definition + body. Used in expositions describing shapes, + such as "Y", "T", "U", "X", "V", "F". + +<vertical> Vertically organized column. + +<column1> Vertically organized column -- only part of a table + which needs to be completed. 
Used once. + +<...type> A series of tags, many unique, designating + certain unusual fonts, such as "bourgeoistype" + for "bourgeois type", in the section on + typography. Most of these occur only once, in + the section on fonts. Some examples follow: <antiquetype> <blacklettertype> <boldfacetype> @@ -146,935 +178,958 @@ Explicit formatting tags: <smpicatype> <typewritertype> -============================================================= -Tags with semantic content: -. . . . . . . . . . . . . . . . . . . . . . . . . . . -<altsp> * Alternative spelling segment. Almost always - contained within square brackets after the main - definition segment. Expository words - such as "Spelled also" are in plain font; - the actual alternative spelling is marked by - <asp> ... </asp> tags within this segment. - -<ant> italic Antonym. - -<asp> italic Alternative spelling. The actual word which is an - alternative spelling to the headword. These - are functionally synonyms of the headword. In - most cases these also occur as headwords, with - reference to the word where the actual definition - is found, but not all such words are listed - separately, particularly if the spelling is - close enough to the headword to be found at the - same point in the dictionary. Whether listed - separately or not, these words should - be indexed at this location, also. - -<au> italic Authority or author. Used where an authority is - (may be right- given for a definition, and also used for the - justified. See author, where a quotation within double quotes - in the section is given in the same paragraph as the - on formatting). definition. The double quotes are indicated - by the open-quote (\'bd) and close-quote - (\'b8). In both cases, it is typically - right-justified, almost always fitting on - the same line with the last line of the - definition or quotation. 
- Within collocation segments, it is usually - used only after quotations, and is not right- - justified, except occasionally where it +* Tags with semantic content: + +------------------------------------------------------------------------- +Tag Font Meaning and Description +------------------------------------------------------------------------- +<altsp> * Alternative spelling segment. Almost always + contained within square brackets after the main + definition segment. Expository words such as + "Spelled also" are in plain font; the actual + alternative spelling is marked by <asp> ... + </asp> tags within this segment. + +<ant> italic Antonym. + +<asp> italic Alternative spelling. The actual word which is + an alternative spelling to the headword. These + are functionally synonyms of the headword. In + most cases these also occur as headwords, with + reference to the word where the actual definition + is found, but not all such words are listed + separately, particularly if the spelling is close + enough to the headword to be found at the same + point in the dictionary. Whether listed + separately or not, these words should be indexed + at this location, also. + +<au> italic Authority or author. Used where an authority is + given for a definition, and also used for the + author, where a quotation within double quotes is + given in the same paragraph as the definition. + The double quotes are indicated by the open-quote + (\'bd) and close-quote (\'b8). In both cases, it + is typically right-justified, almost always + fitting on the same line with the last line of + the definition or quotation. + + Within collocation segments, it is usually used + only after quotations, and is not + right-justified, except occasionally where it would be close to the right margin, and then - apparently is is right-justified. 
We have - not explicitly marked those which are - right-justified, but they can be - recognized because they are on a line by - themselves, preceded by two carriage returns. + apparently is is right-justified. We have not + explicitly marked those which are + right-justified, but they can be recognized + because they are on a line by themselves, + preceded by two carriage returns. -<bio> * Marks a biography. Should be longer than - a short mention of who a person was, which - is typically included as a definition. +<bio> * Marks a biography. Should be longer than a short + mention of who a person was, which is typically + included as a definition. -<biography> * Same as <bio> +<biography> * Same as <bio> -<booki> italic Marks the name of a book, pamphlet, or similar - document. +<booki> italic Marks the name of a book, pamphlet, or similar + document. -<branchof> * A field of knowledge which of which the headword +<branchof> * A field of knowledge which of which the headword is a division. -<caption> * Caption of a figure or table. - -<cas> * tags the CAS (Chemical Abstracts Service) registry - number for a chemical substance. - -<causes> italic tags the infectious disease caused by the headword. - Implied type of the agent is a microorganism, and - the tag must mark a disease. +<caption> * Caption of a figure or table. -<causesp> * Same as <causes> without the italic type. -<causedbyp> * Same as <causedby> without the italic type. +<cas> * tags the CAS (Chemical Abstracts Service) + registry number for a chemical substance. -<causedby> italic inverse of causes: tags the causative agent of an - infectious disease, which is the headword . - the tag must mark a microorganism, virus, or - prion, and the implied type of the headword is - a disease. +<causes> italic tags the infectious disease caused by the + headword. Implied type of the agent is a + microorganism, and the tag must mark a disease. 
-<centered> Used only for The single letter in the headers to each - letter of the alphabet. +<causesp> * Same as <causes> without the italic type. +<causedbyp> * Same as <causedby> without the italic type. -<city> * marks the proper name of a city. Used only - occasionally and not consistently at this stage. +<causedby> italic inverse of <causes>: tags the causative agent of + an infectious disease, which is the headword. + The tag must mark a microorganism, virus, or + prion, and the implied type of the headword is a + disease. -<cnvto> italic Converted to: used to tag substances which are - products prepared by conversion from the - headword. Usually chemicals or complex - products from mnatuarl materials. Rarely used - up to 1998. +<centered> Used only for the single letter in the headers to + each letter of the alphabet. -<colheads> * List of heads for the columns of a table. +<city> * marks the proper name of a city. Used only + occasionally and not consistently at this stage. -<coltitle> * Title of a column in a table. +<cnvto> italic Converted to: used to tag substances which are + products prepared by conversion from the + headword. Usually chemicals or complex products + from natuarl materials. Rarely used up to 1998. -<comm> * Comment -- differs from <note> in being in-line with - the definition paragraph. Provides a little - additional information. +<colheads> * List of heads for the columns of a table. -<company> * Name of a company (commercial firm). Compare <org> +<coltitle> * Title of a column in a table. -<compof> italic Composed of. Tags a substance of which the - headword is at least partly composed. The - substance may be particulate, such as - diatoms composing diatomaceous earth. +<comm> * Comment -- differs from <note> in being in-line + with the definition paragraph. Provides a little + additional information. -<contains> * marks an object contained within the headword. +<company> * Name of a company (commercial firm). Compare + <org>. 
-<contr> italic Contrasting word. Not exactly an antonym, which - is marked <ant>, but a contrasting word which is - often introduced as "opposite to" or "contrasts - with". +<compof> italic Composed of. Tags a substance of which the + headword is at least partly composed. The + substance may be particulate, such as diatoms + composing diatomaceous earth. -<country> * Name of a country (nation) of the world. +<contains> * marks an object contained within the headword. -<cref> italic Collocation reference. A reference to a collocation. - Each such collocation should have its own entry, - marked by <col> ... </col> tags, and these - references should function as hypertext buttons - to access that entry. +<contr> italic Contrasting word. Not exactly an antonym, which + is marked <ant>, but a contrasting word which is + often introduced as "opposite to" or "contrasts + with". -<date> * A Date, of any type, e.g. <date>Dec. 25</date>. +<country> * Name of a country (nation) of the world. -<datey> * Date-with-year tags a date containing a year. - -<def> * definition. The definition may have subfields, - particularly <as> (an illustrative phrase - starting with "as" or "thus" and containing - the headword (or a morphological derivative). - The <mark>, \'bd...\'b8 quotations (left and - right double quotes) and <au> fields may be - found within a definition field, but should - and usually are located outside the definition - proper. The marking macro was - inconsistent in this placement, and the - exclusion of the <mark>, <au> and quotations - needs to be completed by the proof-readers. - Certain definitions contain <pos> - fields within them, where the headword is - an irregular derivative of another headword. - In these cases, the <pos> field follows - immediately after the <def> tag, and these - entries do not have a separate <pos> field. - In such cases, the <pos> field is italic, as - usual. - -<divof> * Division of the headword, usually an organization. - E. g. 
a faculty or department of a university, - or a United Nations agency. +<cref> italic Collocation reference. A reference to a + collocation. Each such collocation should have + its own entry, marked by <col> ... </col> tags, + and these references should function as hypertext + buttons to access that entry. -<edi> * Marks an education institution, a subtype of +<date> * A Date, of any type, e.g. <date>Dec. 25</date>. + +<datey> * Date-with-year tags a date containing a year. + +<def> * A definition. The definition may have subfields, + particularly <as> (an illustrative phrase + starting with "as" or "thus" and containing the + headword (or a morphological derivative). The + <mark>, \'bd...\'b8 quotations (left and right + double quotes) and <au> fields may be found + within a definition field, but should and usually + are located outside the definition proper. The + marking macro was inconsistent in this placement, + and the exclusion of the <mark>, <au> and + quotations needs to be completed by the + proof-readers. + + Certain definitions contain <pos> fields within + them, where the headword is an irregular + derivative of another headword. In these cases, + the <pos> field follows immediately after the + <def> tag, and these entries do not have a + separate <pos> field. In such cases, the <pos> + field is italic, as usual. + +<divof> * Division of the headword, usually an + organization. E. g. a faculty or department of a + university, or a United Nations agency. + +<edi> * Marks an education institution, a subtype of organization. -<emits> * tags a physical object or form of radiation - emitted by the headword +<emits> * Tags a physical object or form of radiation + emitted by the headword. -<figure> Just a place-holder for illustrations, but seldom used. +<figure> Just a place-holder for illustrations, but seldom + used. -<film> italic Marks the name of a movie film. +<film> italic Marks the name of a movie film. -<fld> italic Field of specialization. 
Most often used for +<fld> italic Field of specialization. Most often used for Zoology and Botany, but many "fields of - specialization" are marked for technical - terms. The parentheses are usually within this - field, but are not themselves in italics. - -<geog> * Name of a geograpahical region of any size; - if applicable, the more specific <city>, - <state>, or <country> are preferred. - -<hypen> * Hyperym. Points to the hypernym from WordNet 1.5 - Initially, used only for entries extracted - from WordNet 1.5. Not present in the original - 1913 version. + specialization" are marked for technical terms. + The parentheses are usually within this field, + but are not themselves in italics. + +<geog> * Name of a geograpahical region of any size; if + applicable, the more specific <city>, <state>, or + <country> are preferred. + +<hypen> * Hyperym. Points to the hypernym from WordNet 1.5 + Initially, used only for entries extracted from + WordNet 1.5. Not present in the original 1913 + version. -<illu> * Illustrative usage -- mostly from WordNet, and placed - outside the definition, in contrast to <as> usage. - These should be converted to <as>...</as> illustrative - usage format for consistency. - -<illust> * Illustration place-holder. Seldom used. -<img> * HTML usage -- points to an image file, usually - .gif or .jpg. These have no closing tag, and - will appear as errors in parsing. -<intensi> * Points to a word whose meaning is an intensified - form of the headword. Taken from WordNet - tags, used with some adjectives from WordNet -<item> * Designates one item in a row of a table. Used only when - intervening spaces do not serve properly as natural - field separaters. -<itran> italic Translation into a foreign (non-English) language - of the previous word in the text -- italic font. - (<sig> is a translation into English) -<itrans> italic Same as <itran> -<jour> * Title of a journal (periodical). -<matrix> * Always a filled rectangular array. 
-<matrix2x5> * A 2x5 matrix (2 rows by 5 columns). -<mstypec> * Multiple synonymous subtypes -- used in - def. of "grass". -<mtable> * Multiple table, encloses <table> figures. -<musfig> * Music figure. Only in a note under the entry "Figure", - the two numbers of each such field - are bold, 20 point type, stacked as in a fraction with - a bar between them, but also having a horizontal stroke - midway through each numeral. Unique to this entry. -<p> * paragraph tag, used always in pairs. Line breaks may - be embedded inside the paragraphs. -<person> * marks the proper name of a person. Used only - occasionally, but should be used more frequently - for cases where first names are abbreviated, - to reduce ambiguity of the period for automatic - analysis. Where a title is given, prefixed - or postfixed, it is included in this tag. - -<persfn> * marks the name of a person, when only one name - (usually the last name) is given. Not used - consistently where it should be. - -<publ> * Marks the name of a publication other than book, - which is marked by <booki>. It is often a - magazine or journal. -<qpers> * Tags the name of a person who is speaking, - within a quotation. -<qperson> Same as <qpers> -<cp> * Collocation, plain text -- used to tag phrases that - should be parsed as a unit, but has no typographical - significance. -<qau> italic Always right-justified, as described for <au>. -<ref> * A reference to a word in the vocabulary. -<refs> * Marks the set of references used for a longer article - such as a biography. -<river> * Marks the name of a river -- a proper name -<rj> * Right justified -<row> * Designates a row in a table. -<state> * Name of a geopolitical state, the first subdivision of - a country. Includes, e.g. Canadian provinces. -<subtypes> * Lists subtypes of the headword. -<sup> * superscript -<supr> * Supra. The two parts of each such field - are stacked, one over the other, *without* a - horizontal bar between (as in a fraction). 
- Used only in one entry, for a musical notation. -<table> * Always a filled rectangular array, having <row> and <item> - elements. -<td> * Table datum - one cell in a table -<th> * Table header -<tradename> * Tags a commercial Trade name -<ttitle> * Table title (Larger than normal font) +<illu> * Illustrative usage -- mostly from WordNet, and + placed outside the definition, in contrast to + <as> usage. These should be converted to + <as>...</as> illustrative usage format for + consistency. + +<illust> * Illustration place-holder. Seldom used. + +<img> * HTML usage -- points to an image file, usually + .gif or .jpg. These have no closing tag, and + will appear as errors in parsing. + +<intensi> * Points to a word whose meaning is an intensified + form of the headword. Taken from WordNet tags, + used with some adjectives from WordNet. + +<item> * Designates one item in a row of a table. Used + only when intervening spaces do not serve + properly as natural field separaters. + +<itran> italic Translation into a foreign (non-English) language + of the previous word in the text -- italic font. + (<sig> is a translation into English) + +<itrans> italic Same as <itran> + +<jour> * Title of a journal (periodical). + +<matrix> * Always a filled rectangular array. + +<matrix2x5> * A 2x5 matrix (2 rows by 5 columns). + +<mstypec> * Multiple synonymous subtypes -- used in def. of + "grass". + +<mtable> * Multiple table, encloses <table> figures. + +<musfig> * Music figure. Only in a note under the entry + "Figure", the two numbers of each such field are + bold, 20 point type, stacked as in a fraction + with a bar between them, but also having a + horizontal stroke midway through each + numeral. Unique to this entry. + +<p> * Paragraph tag, used always in pairs. Line breaks + may be embedded inside the paragraphs. + +<person> * Marks the proper name of a person. 
Used only + occasionally, but should be used more frequently + for cases where first names are abbreviated, to + reduce ambiguity of the period for automatic + analysis. Where a title is given, prefixed or + postfixed, it is included in this tag. + +<persfn> * Marks the name of a person, when only one name + (usually the last name) is given. Not used + consistently where it should be. + +<publ> * Marks the name of a publication other than book, + which is marked by <booki>. It is often a + magazine or journal. + +<qpers> * Tags the name of a person who is speaking, within + a quotation. + +<qperson> Same as <qpers> + +<cp> * Collocation, plain text -- used to tag phrases + that should be parsed as a unit, but has no + typographical significance. + +<qau> italic Always right-justified, as described for <au>. + +<ref> * A reference to a word in the vocabulary. + +<refs> * Marks the set of references used for a longer + article such as a biography. + +<river> * Marks the name of a river -- a proper name. + +<rj> * Right justified. + +<row> * Designates a row in a table. + +<state> * Name of a geopolitical state, the first + subdivision of a country. Includes, e.g. Canadian + provinces. + +<subtypes> * Lists subtypes of the headword. + +<sup> * Superscript + +<supr> * Supra. The two parts of each such field are + stacked, one over the other, *without* a + horizontal bar between (as in a fraction). Used + only in one entry, for a musical notation. + |