Diffstat (limited to 'TAGSET.WEB')
-rw-r--r-- | TAGSET.WEB | 2120 |
1 files changed, 1060 insertions, 1060 deletions
@@ -1,1060 +1,1060 @@ | |||
1 | FIELD MARKS FOR WEBSTER 1913 and CIDE | 1 | FIELD MARKS FOR WEBSTER 1913 and CIDE |
2 | ===================================== | 2 | ===================================== |
3 | Tagset.web: | 3 | Tagset.web: |
4 | Explanations of the tags used to mark the Webster 1913 dictionary | 4 | Explanations of the tags used to mark the Webster 1913 dictionary |
5 | and the CIDE (Collaborative International Dictionary of English). | 5 | and the CIDE (Collaborative International Dictionary of English). |
6 | Note that the list of tags used to mark the public domain version | 6 | Note that the list of tags used to mark the public domain version |
7 | of this dictionary is shorter than the full set described here. | 7 | of this dictionary is shorter than the full set described here. |
8 | If any tag is not listed here, it is either (1) one of the | 8 | If any tag is not listed here, it is either (1) one of the |
9 | "point" (font size) or "type" (font style) tags, which should be self-explanatory; or | 9 | "point" (font size) or "type" (font style) tags, which should be self-explanatory; or |
10 | (2) is a functional field with no effect on the typography. | 10 | (2) is a functional field with no effect on the typography. |
11 | 11 | ||
12 | Last modified March 12, 1999. | 12 | Last modified March 12, 1999. |
13 | For questions, contact: | 13 | For questions, contact: |
14 | Patrick Cassidy cassidy@micra.com | 14 | Patrick Cassidy cassidy@micra.com |
15 | 735 Belvidere Ave. | 15 | 735 Belvidere Ave. |
16 | Plainfield, NJ 07062 | 16 | Plainfield, NJ 07062 |
17 | (908) 561-3416 or (908) 668-5252 | 17 | (908) 561-3416 or (908) 668-5252 |
18 | ------------------------------------------------------------- | 18 | ------------------------------------------------------------- |
19 | A separate file, webfont.asc, contains the list of the individual | 19 | A separate file, webfont.asc, contains the list of the individual |
20 | non-ASCII characters represented by either higher-order hexadecimal | 20 | non-ASCII characters represented by either higher-order hexadecimal |
21 | character marks (e.g., \'94, for o-umlaut) or by entity tags | 21 | character marks (e.g., \'94, for o-umlaut) or by entity tags |
22 | (e.g., <root/, for the square root symbol.) | 22 | (e.g., <root/, for the square root symbol.) |
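As an illustration of the two notations just described, here is a minimal Python sketch that substitutes Unicode characters for hexadecimal character marks and entity tags. It knows only the two examples given above (\'94 and <root/); the full table is in webfont.asc and is not reproduced here. The names HEX_MARKS, ENTITIES and decode_marks are hypothetical, chosen for this example only.

```python
import re

# Only the two examples named in the text; the complete list is in webfont.asc.
HEX_MARKS = {"94": "\u00f6"}   # \'94  -> o-umlaut
ENTITIES = {"root": "\u221a"}  # <root/ -> square root symbol


def decode_marks(text):
    """Replace known \\'xx marks and <name/ entities; leave unknown ones untouched."""

    def hex_sub(match):
        return HEX_MARKS.get(match.group(1).lower(), match.group(0))

    def ent_sub(match):
        return ENTITIES.get(match.group(1), match.group(0))

    # A character mark is a backslash, an apostrophe, and two hex digits.
    text = re.sub(r"\\'([0-9A-Fa-f]{2})", hex_sub, text)
    # An entity tag is "<", a name, and a slash, with no closing ">".
    return re.sub(r"<([A-Za-z0-9]+)/", ent_sub, text)


print(decode_marks("K\\'94ln <root/2"))
```

Marks and entities missing from the two small tables pass through unchanged, so the sketch can be run on real text without losing data.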
23 | -------------------------------------------------------------- | 23 | -------------------------------------------------------------- |
24 | Use of tags: | 24 | Use of tags: |
25 | In the MICRA electronic version of the 1913 Webster, each part of | 25 | In the MICRA electronic version of the 1913 Webster, each part of |
26 | the entry headed by an entry word ("headword") is labeled so that no | 26 | the entry headed by an entry word ("headword") is labeled so that no |
27 | part of the entry except some punctuation marks should be found | 27 | part of the entry except some punctuation marks should be found |
28 | outside of all fields, i.e. every character should be within some tagged | 28 | outside of all fields, i.e. every character should be within some tagged |
29 | field. In the following description, the word "segment" usually refers to | 29 | field. In the following description, the word "segment" usually refers to |
30 | a major part of an entry such as an etymology or a definition or a | 30 | a major part of an entry such as an etymology or a definition or a |
31 | collocation segment or a usage block, containing more than one field. | 31 | collocation segment or a usage block, containing more than one field. |
32 | The term "field" may also be used similarly to "segment", but may also | 32 | The term "field" may also be used similarly to "segment", but may also |
33 | denote single-word fields, such as an alternative spelling, labeled <asp>. | 33 | denote single-word fields, such as an alternative spelling, labeled <asp>. |
34 | 34 | ||
35 | Note: The tags on this list are similar in structure to SGML tags. Each | 35 | Note: The tags on this list are similar in structure to SGML tags. Each |
36 | tag on this list marks a field; each field opens with a tagname between | 36 | tag on this list marks a field; each field opens with a tagname between |
37 | angle brackets thus: <tagname>, and closes with a similar tag containing | 37 | angle brackets thus: <tagname>, and closes with a similar tag containing |
38 | the forward slash thus: </tagname>. No tags are used without closing | 38 | the forward slash thus: </tagname>. No tags are used without closing |
39 | tags. Thus the HTML <BR> to indicate a line break is symbolized | 39 | tags. Thus the HTML <BR> to indicate a line break is symbolized |
40 | here as an entity, <br/, and every <p> has a corresponding </p>. | 40 | here as an entity, <br/, and every <p> has a corresponding </p>. |
41 | The absence of an end-field tag, or the presence of an end-field tag | 41 | The absence of an end-field tag, or the presence of an end-field tag |
42 | without a prior begin-field tag constitutes a typographical error, of which | 42 | without a prior begin-field tag constitutes a typographical error, of which |
43 | there may be a significant number. Any errors detected should be brought | 43 | there may be a significant number. Any errors detected should be brought |
44 | to the attention of PJC or the appropriate editor. | 44 | to the attention of PJC or the appropriate editor. |
45 | Most of the tagged fields are presented in the text in italic type, | 45 | Most of the tagged fields are presented in the text in italic type, |
46 | with a number of exceptions. Where a word is contained within more than | 46 | with a number of exceptions. Where a word is contained within more than |
47 | one field, the innermost field determines the font to be used. Wherever | 47 | one field, the innermost field determines the font to be used. Wherever |
48 | recognizable functional fields were found, an attempt was made to tag the | 48 | recognizable functional fields were found, an attempt was made to tag the |
49 | field with a functional mark, but in many cases, words were italicised only | 49 | field with a functional mark, but in many cases, words were italicised only |
50 | to represent the word itself as a discourse entity, and in some such cases, | 50 | to represent the word itself as a discourse entity, and in some such cases, |
51 | the "italic" mark <it> was used, implying nothing regarding functionality | 51 | the "italic" mark <it> was used, implying nothing regarding functionality |
52 | of the word. The base font is considered "plain". Where an italic field | 52 | of the word. The base font is considered "plain". Where an italic field |
53 | is indicated, parentheses or brackets within the field are not italicised. | 53 | is indicated, parentheses or brackets within the field are not italicised. |
54 | Where no font is specified for a tag, the tag is merely a functional | 54 | Where no font is specified for a tag, the tag is merely a functional |
55 | division, and was printed in plain font unless otherwise tagged. This type | 55 | division, and was printed in plain font unless otherwise tagged. This type |
56 | of segment is marked by an asterisk (*) where the font name would be. | 56 | of segment is marked by an asterisk (*) where the font name would be. |
57 | The size of the "plain" font in the original text is about 1.6 mm for | 57 | The size of the "plain" font in the original text is about 1.6 mm for |
58 | the height of capitalized letters. | 58 | the height of capitalized letters. |
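The pairing rule stated above (every <tagname> must be closed by a matching </tagname>, while entities such as <br/ stand alone) can be checked mechanically. The following Python sketch is one possible way to do it, not the procedure used by the editors; the names TAG and check_tags are hypothetical, and the pattern assumes tag names consist of letters, digits and periods (as in <point1.5>).

```python
import re

# Matches <name> and </name>. Entities such as <br/ are skipped because
# the name is not followed directly by ">".
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9.]*)>")


def check_tags(text):
    """Return a list of (problem, tagname) pairs for unbalanced tags."""
    stack, problems = [], []
    for match in TAG.finditer(text):
        closing, name = match.group(1) == "/", match.group(2)
        if not closing:
            stack.append(name)
        elif name in stack:
            # Fields still open inside the one being closed lack an end tag.
            while stack[-1] != name:
                problems.append(("missing end tag", stack.pop()))
            stack.pop()
        else:
            problems.append(("end tag without begin tag", name))
    # Anything left open at the end of the text was never closed.
    problems.extend(("missing end tag", name) for name in reversed(stack))
    return problems


print(check_tags("<p><hw>Cat</hw> text<br/ more</p>"))
print(check_tags("<p><it>word</p>"))
```

An empty list means every field was opened and closed in order; each reported pair names one of the two error kinds described above.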
59 | ============================================================= | 59 | ============================================================= |
60 | Explicit typographical tags: | 60 | Explicit typographical tags: |
61 | These were used where the purpose of a different font was merely to | 61 | These were used where the purpose of a different font was merely to |
62 | distinguish a word from the body of the text, and no explicit functional | 62 | distinguish a word from the body of the text, and no explicit functional |
63 | tag seemed appropriate. | 63 | tag seemed appropriate. |
64 | ----------------------------------- | 64 | ----------------------------------- |
65 | Tag Font | 65 | Tag Font |
66 | ----------------------------------- | 66 | ----------------------------------- |
67 | Explicit formatting tags: | 67 | Explicit formatting tags: |
68 | . . . . . . . . . . . . . . . . . . | 68 | . . . . . . . . . . . . . . . . . . |
69 | <plain> plain font (that used in the body of a definition) -- | 69 | <plain> plain font (that used in the body of a definition) -- |
70 | normally not marked, except within fields of | 70 | normally not marked, except within fields of |
71 | a different font. | 71 | a different font. |
72 | <it> italic (in master files) | 72 | <it> italic (in master files) |
73 | <i> italic (for use in HTML presentation) | 73 | <i> italic (for use in HTML presentation) |
74 | <bold> bold (in master files) | 74 | <bold> bold (in master files) |
75 | <b> bold (for use in HTML presentation) | 75 | <b> bold (for use in HTML presentation) |
76 | <colf> bold, Collocation font. Same font as used in collocations. | 76 | <colf> bold, Collocation font. Same font as used in collocations. |
77 | smaller This is used only in the list of "un-" words not | 77 | smaller This is used only in the list of "un-" words not |
78 | by 1 point actually defined in the dictionary. Probably could be | 78 | by 1 point actually defined in the dictionary. Probably could be |
79 | replaced by a segment mark for the entire list! | 79 | replaced by a segment mark for the entire list! |
80 | The "un-" words should be indexed as headwords. | 80 | The "un-" words should be indexed as headwords. |
81 | 81 | ||
82 | <ct> bold Same as <colf>, a font similar to that used in | 82 | <ct> bold Same as <colf>, a font similar to that used in |
83 | collocations. However, this tag is used in a table | 83 | collocations. However, this tag is used in a table |
84 | and could be set to a different font. | 84 | and could be set to a different font. |
85 | 85 | ||
86 | <h1> * HTML tag -- largest heading font. | 86 | <h1> * HTML tag -- largest heading font. |
87 | 87 | ||
88 | <h2> * HTML tag -- second largest heading font. | 88 | <h2> * HTML tag -- second largest heading font. |
89 | 89 | ||
90 | <headrow> * Marks a Row title in a table. | 90 | <headrow> * Marks a Row title in a table. |
91 | 91 | ||
92 | <hwf> Font the same as the headword <hw>, though the field is | 92 | <hwf> Font the same as the headword <hw>, though the field is |
93 | not a headword. Used only once. | 93 | not a headword. Used only once. |
94 | 94 | ||
95 | <mitem> * Multiple items, a set of items in a table. | 95 | <mitem> * Multiple items, a set of items in a table. |
96 | <point ...> A series of point size markers, many unique. | 96 | <point ...> A series of point size markers, many unique. |
97 | <point1.5> * One of the tags of the form <point**> where ** | 97 | <point1.5> * One of the tags of the form <point**> where ** |
98 | <point6> represents the typographic point size of the | 98 | <point6> represents the typographic point size of the |
99 | enclosed text. | 99 | enclosed text. |
100 | <pre> An HTML tag indicating that the enclosed text is | 100 | <pre> An HTML tag indicating that the enclosed text is |
101 | of teletype form, preformatted in a uniform-spaced | 101 | of teletype form, preformatted in a uniform-spaced |
102 | font. | 102 | font. |
103 | <sc> small caps (used mostly for "a. d.", "b. c.") | 103 | <sc> small caps (used mostly for "a. d.", "b. c.") |
104 | This is the same font as <er>, but has no functional | 104 | This is the same font as <er>, but has no functional |
105 | or semantic significance. | 105 | or semantic significance. |
106 | <str> group of table data elements in a table | 106 | <str> group of table data elements in a table |
107 | <sub> subscript, like <subs> | 107 | <sub> subscript, like <subs> |
108 | <subs> subscript | 108 | <subs> subscript |
109 | <sups> superscript | 109 | <sups> superscript |
110 | <supr> superscript | 110 | <supr> superscript |
111 | <sansserif> Sans-serif font | 111 | <sansserif> Sans-serif font |
112 | <stypec> Bold (collocation font) and also a subtype. | 112 | <stypec> Bold (collocation font) and also a subtype. |
113 | <tt> HTML tag -- teletype font | 113 | <tt> HTML tag -- teletype font |
114 | <universbold> A squared bold font without serifs approximating the | 114 | <universbold> A squared bold font without serifs approximating the |
115 | "universe bold" font on the HP Laserjet4, slightly | 115 | "universe bold" font on the HP Laserjet4, slightly |
116 | larger than the capitals in a definition body. Used | 116 | larger than the capitals in a definition body. Used |
117 | in expositions describing shapes, such as | 117 | in expositions describing shapes, such as |
118 | "Y", "T", "U", "X", "V", "F". | 118 | "Y", "T", "U", "X", "V", "F". |
119 | <vertical> Vertically organized column. | 119 | <vertical> Vertically organized column. |
120 | <column1> Vertically organized column -- only part of a table | 120 | <column1> Vertically organized column -- only part of a table |
121 | which needs to be completed. Used once. | 121 | which needs to be completed. Used once. |
122 | <...type> A series of tags, many unique, designating certain | 122 | <...type> A series of tags, many unique, designating certain |
123 | unusual fonts, such as "bourgeoistype" for | 123 | unusual fonts, such as "bourgeoistype" for |
124 | "bourgeois type", in the section on typography. | 124 | "bourgeois type", in the section on typography. |
125 | Most of these occur only once, in the section on fonts. | 125 | Most of these occur only once, in the section on fonts. |
126 | <antiquetype> | 126 | <antiquetype> |
127 | <blacklettertype> | 127 | <blacklettertype> |
128 | <boldfacetype> | 128 | <boldfacetype> |
129 | <bourgeoistype> | 129 | <bourgeoistype> |
130 | <boxtype> | 130 | <boxtype> |
131 | <clarendontype> | 131 | <clarendontype> |
132 | <englishtype> | 132 | <englishtype> |
133 | <extendedtype> | 133 | <extendedtype> |
134 | <frenchelzevirtype> | 134 | <frenchelzevirtype> |
135 | <germantype> | 135 | <germantype> |
136 | <gothictype> | 136 | <gothictype> |
137 | <greatprimertype> | 137 | <greatprimertype> |
138 | <longprimertype> | 138 | <longprimertype> |
139 | <miniontype> | 139 | <miniontype> |
140 | <nonpareiltype> |