author | Sergey Poznyakoff <gray@gnu.org.ua> | 2012-02-03 12:48:52 +0200 |
---|---|---|
committer | Sergey Poznyakoff <gray@gnu.org.ua> | 2012-02-03 12:48:52 +0200 |
commit | d18a469b7a5a4d4b5da21eab37f34ab1e99a8dce (patch) | |
tree | 7eb331e376e85287c25b6a9734dae58a4724da8a | |
parent | 4a458db06b28492a7e48b1a0560b35778e476482 (diff) | |
Revise tagset.txt
* tagset.txt: Review.
* README: Reformat.
* webfont.txt: Reformat. Document <and/ and <or/.
-rw-r--r-- | README | 363 | ||||
-rw-r--r-- | tagset.txt | 2057 | ||||
-rw-r--r-- | webfont.txt | 302 |
3 files changed, 1385 insertions, 1337 deletions
@@ -10,25 +10,23 @@ The README file * OVERVIEW -========== -This document describes the GNU version of the Collaborative -International Dictionary of English. It is organized into a series of -chapters, introduced by headings beginning with a single asterisk. A -chapter may have sections, which are marked with two asterisks. For -those readers who use Emacs, this structure corresponds to its -"Outline mode", which will be enabled automatically upon loading this -file. - -The chapter "INTRODUCTION" describes the structure of this package. -The chapter "STRUCTURE OF THE DICTIONARY" describes the dictionary -structure in general. An overview of the markup tags is provided in -the chapter "TAGS". A detailed information about dictionary markup -can be obtained from a set of ancillary files included in this -package, which are described in the chapter "ANCILLARY FILES". - -The chapter "DICTIONARY LOOKUP" describes how to use GNU Dico for -reading this dictionary. Finally, other versions of the Webster -dictionary are listed in the chapter "OTHER VERSIONS OF THE -DICTIONARY". + +This document describes the GNU version of the Collaborative International +Dictionary of English. It is organized into a series of chapters, +introduced by headings beginning with a single asterisk. A chapter may have +sections, which are marked with two asterisks. For those readers who use +Emacs, this structure corresponds to its "Outline mode", which will be +enabled automatically upon loading this file. + +The chapter "INTRODUCTION" describes the structure of this package. The +chapter "STRUCTURE OF THE DICTIONARY" describes the dictionary structure in +general. An overview of the markup tags is provided in the chapter "TAGS". +A detailed information about dictionary markup can be obtained from a set of +ancillary files included in this package, which are described in the chapter +"ANCILLARY FILES". + +The chapter "DICTIONARY LOOKUP" describes how to use GNU Dico for reading +this dictionary. 
Finally, other versions of the Webster dictionary are +listed in the chapter "OTHER VERSIONS OF THE DICTIONARY". * INTRODUCTION -============== + The dictionary was derived from the @@ -50,14 +48,13 @@ and is being proof-read and supplemented by volunteers from around the world. This is an unfunded project, and future enhancement of this -dictionary will depend on the efforts of volunteers willing to help -build this free resource into a comprehensive body of general -information. New definitions for missing words or words senses and -longer explanatory notes, as well as images to accompany the articles -are needed. More modern illustrative quotations giving recent -examples of usage of the words in their various senses will be very -helpful, since most quotations in the original 1913 dictionary are now -well over 100 years old. - -This electronic version is being maintained by World Soul, a -non-profit organization in Plainfield, NJ. For additional information -or if you are willing to assist construction of this data source, contact: +dictionary will depend on the efforts of volunteers willing to help build +this free resource into a comprehensive body of general information. New +definitions for missing words or words senses and longer explanatory notes, +as well as images to accompany the articles are needed. More modern +illustrative quotations giving recent examples of usage of the words in +their various senses will be very helpful, since most quotations in the +original 1913 dictionary are now well over 100 years old. + +This electronic version is being maintained by World Soul, a non-profit +organization in Plainfield, NJ. 
For additional information or if you are +willing to assist construction of this data source, contact: @@ -71,36 +68,34 @@ or if you are willing to assist construction of this data source, contact: -GCIDE is free software; you can redistribute it and/or modify -it under the terms of the GNU General Public License as published by -the Free Software Foundation; either version 2, or (at your option) -any later version. +GCIDE is free software; you can redistribute it and/or modify it under the +terms of the GNU General Public License as published by the Free Software +Foundation; either version 2, or (at your option) any later version. -GCIDE is distributed in the hope that it will be useful, -but WITHOUT ANY WARRANTY; without even the implied warranty of -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -GNU General Public License for more details. +GCIDE is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS +FOR A PARTICULAR PURPOSE. See the GNU General Public License for more +details. -You should have received a copy of the GNU General Public License -along with this copy of GCIDE; see the file COPYING. If not, write -to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, -Boston, MA 02111-1307, USA. +You should have received a copy of the GNU General Public License along with +this copy of GCIDE; see the file COPYING. If not, write to the Free +Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA +02111-1307, USA. * STRUCTURE OF THE DICTIONARY -============================= -When the archive is unpacked, the main dictionary text of the GCIDE -will be found in 26 files named "CIDE.*", where the asterisk indicates -which letter of the alphabet begins the words in each file. For -example, file "CIDE.B" contains words beginning with the letter "B". 
-Additional information about the tagging conventions and special -character symbols are contained in ancillary files in this directory -(see below the section entitled "ANCILLARY FILES"). The main body of -the 1913 dictionary was essentially identical to the edition published -in 1890, and was republished in 1913 with an appendix containing "New -Words". The new words of that appendix have been integrated into the -main file in this version. However, it is important to keep in mind -that the definitions in this dictionary are in most cases over 100 + +When the archive is unpacked, the main dictionary text of the GCIDE will be +found in 26 files named "CIDE.*", where the asterisk indicates which letter +of the alphabet begins the words in each file. For example, file "CIDE.B" +contains words beginning with the letter "B". Additional information about +the tagging conventions and special character symbols are contained in +ancillary files in this directory (see below the section entitled "ANCILLARY +FILES"). The main body of the 1913 dictionary was essentially identical to +the edition published in 1890, and was republished in 1913 with an appendix +containing "New Words". The new words of that appendix have been integrated +into the main file in this version. However, it is important to keep in +mind that the definitions in this dictionary are in most cases over 100 years old. Use them with caution! -At the bottom of each paragraph in this dictionary, there is a -bracketed and tagged "source" indicated. This tells from where the -definition or other text in that paragraph came, as follows: +At the bottom of each paragraph in this dictionary, there is a bracketed and +tagged "source" indicated. This tells from where the definition or other +text in that paragraph came, as follows: @@ -119,6 +114,5 @@ definition or other text in that paragraph came, as follows: -The original definitions have been tagged and in some cases -reformatted or slightly rearranged. 
If substantive information is -added from a second source, usually the additional source is also -noted, as in: +The original definitions have been tagged and in some cases reformatted or +slightly rearranged. If substantive information is added from a second +source, usually the additional source is also noted, as in: @@ -126,35 +120,32 @@ noted, as in: -This version is tagged with SGML-like tags of the form <pos>...</pos> -so that the original typography (italics, bold, block quotes) can be -reproduced. A list of the most important tags for fields in the -dictionary is given below. The tags also serve the more important -function of allowing the information content to be conveniently -imported into computer programs or databases. The set of tags used is -described in the accompanying file "tagset.txt". ***NOTE*** the -paragraph tags <p>...</p> do *not* always nest properly with certain -other tags, such as <note> and <cs> ("collocation section"), which in -some cases span multiple paragraphs. If you are using a tag parser -which detects improper nesting, you should first either delete the -paragraph tags or convert them to non-tag symbols, or, if possible, -set the parser to ignore the <p>...</p> tags. - -The unusual characters (such as Greek or the European accented -characters, as well as special characters used in the pronunciations) -are described in the accompanying file "webfont.txt". Some -information on the pronunciation system used may be found by viewing -the file "pronunc.jpg", and additional explanations of pronunciation -are in the file "pronunc.txt". - -Each paragraph of the original text is enclosed within tags of the -form <p> . . . </p>. Within these paragraphs there are no line -breaks, and some of the paragraphs are over 12,000 characters long, -which may prove too long to be handled by some editors. At some -points, embedded line breaks within a "paragraph" are marked by a <br/ -"entity". 
The file can therefore be converted, if necessary, to a -form with shorter lines, and subsequently reconverted back to the form -having one line per paragraph. - -If additional line breaks are added, then in order to remove the line -breaks and reconstruct the original paragraphs, so that the page width -can be adjusted, perform the following manipulations: +This version is tagged with SGML-like tags of the form <pos>...</pos> so +that the original typography (italics, bold, block quotes) can be +reproduced. A list of the most important tags for fields in the dictionary +is given below. The tags also serve the more important function of allowing +the information content to be conveniently imported into computer programs +or databases. The set of tags used is described in the accompanying file +"tagset.txt". ***NOTE*** the paragraph tags <p>...</p> do *not* always nest +properly with certain other tags, such as <note> and <cs> ("collocation +section"), which in some cases span multiple paragraphs. If you are using a +tag parser which detects improper nesting, you should first either delete +the paragraph tags or convert them to non-tag symbols, or, if possible, set +the parser to ignore the <p>...</p> tags. + +The unusual characters (such as Greek or the European accented characters, +as well as special characters used in the pronunciations) are described in +the accompanying file "webfont.txt". Some information on the pronunciation +system used may be found by viewing the file "pronunc.jpg", and additional +explanations of pronunciation are in the file "pronunc.txt". + +Each paragraph of the original text is enclosed within tags of the form <p> +. . . </p>. Within these paragraphs there are no line breaks, and some of +the paragraphs are over 12,000 characters long, which may prove too long to +be handled by some editors. At some points, embedded line breaks within a +"paragraph" are marked by a <br/ "entity". 
The file can therefore be +converted, if necessary, to a form with shorter lines, and subsequently +reconverted back to the form having one line per paragraph. + +If additional line breaks are added, then in order to remove the line breaks +and reconstruct the original paragraphs, so that the page width can be +adjusted, perform the following manipulations: @@ -166,16 +157,15 @@ can be adjusted, perform the following manipulations: -A more sophisticated formatting of spaces within paragraphs may -require the use of the fully-tagged master files. If you have a need -for these files, contact Patrick Cassidy: cassidy@micra.com. - -The approximate beginning of each page is marked by an SGML comment of -the form <-- p. 345 -->. (The exact beginning was in some cases in -the middle of a paragraph, which we decided was not a good location -for these page-number comments, so the page number was usually moved -to the next paragraph break). Pages which have been proofread by -volunteers (e.g., with initials VOL) will have a note within that page -comment: <-- p. 345 pr=VOL -->. Pages which have not been proofread -yet (most of them) will have varying numbers of typographical errors -in them. We still (January 2012) need proofreaders to get the errors -out of these dictionary files. +A more sophisticated formatting of spaces within paragraphs may require the +use of the fully-tagged master files. If you have a need for these files, +contact Patrick Cassidy: cassidy@micra.com. + +The approximate beginning of each page is marked by an SGML comment of the +form <-- p. 345 -->. (The exact beginning was in some cases in the middle +of a paragraph, which we decided was not a good location for these +page-number comments, so the page number was usually moved to the next +paragraph break). Pages which have been proofread by volunteers (e.g., with +initials VOL) will have a note within that page comment: <-- p. 345 pr=VOL +-->. 
Pages which have not been proofread yet (most of them) will have +varying numbers of typographical errors in them. We still (January 2012) +need proofreaders to get the errors out of these dictionary files. @@ -183,25 +173,23 @@ out of these dictionary files. -This version is only a first typing, and has numerous typographic -errors, including errors in the field-marks. In addition, the user -must keep in mind that this text is very old and will contain numerous -obsolete, inaccurate, and perhaps offensive statements, which are -included solely because this work is intended to reproduce accurately -this historically interesting classic reference work. This text should -not be relied upon as an accurate source of information, as in many -cases it represents the state of knowledge around 1890. The text is -provided "as is", and the user must accept responsibility for all -consequences of its use. Please refer to the header of each file and -the GNU public license. If these conditions of use are unacceptable, -please do not use these texts. - -This electronic dictionary is also made available as a potential -starting point for development of a modern comprehensive encyclopedic -dictionary, to be accessible freely on the internet, and developed by -the efforts of all individuals willing to help build a large and -freely available knowledge base. A large number of collaborators are -needed to bring this dictionary to a more accurate, more modern, and -more useful state. Anyone willing to assist in any way in constructing -such a knowledge base should contact Patrick Cassidy (see above). All -reports of errors will be gratefully received, and should also be -transmitted to PC at: pc@worldsoul.org. +This version is only a first typing, and has numerous typographic errors, +including errors in the field-marks. 
In addition, the user must keep in +mind that this text is very old and will contain numerous obsolete, +inaccurate, and perhaps offensive statements, which are included solely +because this work is intended to reproduce accurately this historically +interesting classic reference work. This text should not be relied upon as +an accurate source of information, as in many cases it represents the state +of knowledge around 1890. The text is provided "as is", and the user must +accept responsibility for all consequences of its use. Please refer to the +header of each file and the GNU public license. If these conditions of use +are unacceptable, please do not use these texts. + +This electronic dictionary is also made available as a potential starting +point for development of a modern comprehensive encyclopedic dictionary, to +be accessible freely on the internet, and developed by the efforts of all +individuals willing to help build a large and freely available knowledge +base. A large number of collaborators are needed to bring this dictionary +to a more accurate, more modern, and more useful state. Anyone willing to +assist in any way in constructing such a knowledge base should contact +Patrick Cassidy (see above). All reports of errors will be gratefully +received, and should also be transmitted to PC at: pc@worldsoul.org. @@ -237,4 +225,4 @@ For other tags, see the file "tagset.txt" In addition to the main text of the dictionary, additional explanatory -material about this version of the dictionary is available in the -ancillary files: +material about this version of the dictionary is available in the ancillary +files: @@ -259,4 +247,4 @@ pronunciations. -A copy of the dictionary page describing the pronunciation symbols used -in the original work. +A copy of the dictionary page describing the pronunciation symbols used in +the original work. @@ -264,4 +252,4 @@ in the original work. 
-This file lists original pronunciation symbols with the corresponding -markup entities used in this version. +This file lists original pronunciation symbols with the corresponding markup +entities used in this version. @@ -277,22 +265,25 @@ A copy of the original title page. -Description of the special escape sequences used in this dictionary. -This file also explains the Greek transliteration syntax used in it. +Description of the special escape sequences used in this dictionary. This +file also explains the Greek transliteration syntax used in it. * DICTIONARY LOOKUP -=================== + The GNU Dico project contains a module for reading GCIDE files. This -distribution provides a configuration file "gcide.conf" which you can -use with the "dicod" server in order to look up words in the -dictionary. See http://www.gnu.org.ua/software/dico for a description -of GNU Dico, including links to download. +distribution provides a configuration file "gcide.conf" which you can use +with the "dicod" server in order to look up words in the dictionary. See +http://www.gnu.org.ua/software/dico for a description of GNU Dico, including +links to download. -The instructions below describe how to configure GNU Dico server -(dicod) to access a copy of the GCIDE dictionary. +The instructions below describe how to configure GNU Dico server (dicod) to +access a copy of the GCIDE dictionary. 1. Unpack the GCIDE dictionary; + 2. Copy the file "gcide.conf" to a directory where you keep your local configuration files (/etc or /usr/local/etc are usual choices). -3. Replace the word GCIDE_PATH in the "gcide.conf" statement with the -path to the gcide-0.51 dicrectory. You can omit this step and use the --D option instead: + +3. Replace the word GCIDE_PATH in the "gcide.conf" statement with the path +to the gcide-0.51 dicrectory. You can omit this step and use the -D option +instead: + 4. Check the configuration file. 
Run: @@ -305,23 +296,20 @@ If no errors are reported, then go to the step 5. -5. Start "dicod". Run the same command as described in step 4, but -without the "--lint" option. This will start the dictionary server -which will be avaialble on localhost (127.0.0.1) port 2628. The -server provides extensive searching facilities. It also parses the -GCIDE markup and automatically reformats the articles before returning -them. +5. Start "dicod". Run the same command as described in step 4, but without +the "--lint" option. This will start the dictionary server which will be +avaialble on localhost (127.0.0.1) port 2628. The server provides extensive +searching facilities. It also parses the GCIDE markup and automatically +reformats the articles before returning them. -Now you can access the dictionary using dico (a GNU dictionary command -line utility), or another dictionary client program (such as Kdict or -the like). +Now you can access the dictionary using dico (a GNU dictionary command line +utility), or another dictionary client program (such as Kdict or the like). * OTHER VERSIONS OF THE DICTIONARY -================================== + There are several other derivative versions of this dictionary on the -internet, in some cases reformatted or provided with an interface. -Those that I am aware of are: +internet, in some cases reformatted or provided with an interface. Those +that I am aware of are: ** Dicoweb ----------- -This version of GCIDE is available online at the GNU Dico web -site: + +This version of GCIDE is available online at the GNU Dico web site: @@ -332,23 +320,23 @@ The site provides extensive search facilities. ** Project Gutenberg ---------------------- + In the extext96 directory of Project Gutenberg -(http://www.gutenberg.org/dirs/etext96), there is a version of the -original 1913 dictionary, which is in the **public domain**. The main -files are labeled pgw050*.*. The tags for that version are a subset -of those used in this GNU version. 
+(http://www.gutenberg.org/dirs/etext96), there is a version of the original +1913 dictionary, which is in the **public domain**. The main files are +labeled pgw050*.*. The tags for that version are a subset of those used in +this GNU version. ** The DICT development group ------------------------------- -This group has created a program to index and search this dictionary. -The program can be downloaded and used locally, but at present is -available only in a Unix-compatible executable version. See their web -site at http://www.dict.org. + +This group has created a program to index and search this dictionary. The +program can be downloaded and used locally, but at present is available only +in a Unix-compatible executable version. See their web site at +http://www.dict.org. ** The University of Chicago ARTFL project ------------------------------------------- -Mark Olsen and Gavin LaRowe at the University of Chicago have -converted the original 1913 dictionary to HTML and have provided an -interface allowing search of the headwords. When the supplemented -version has developed sufficiently to warrant the effort, a similar -searchable version may be posted there as well. The search page is at: + +Mark Olsen and Gavin LaRowe at the University of Chicago have converted the +original 1913 dictionary to HTML and have provided an interface allowing +search of the headwords. When the supplemented version has developed +sufficiently to warrant the effort, a similar searchable version may be +posted there as well. The search page is at: @@ -356,5 +344,5 @@ searchable version may be posted there as well. The search page is at: -That page will provide links to other ARTFL projects and contact -information for the ARTFL group, who alone can provide information -about the HTML version or interface. +That page will provide links to other ARTFL projects and contact information +for the ARTFL group, who alone can provide information about the HTML +version or interface. 
@@ -366,2 +354,3 @@ paragraph-separate: "[ ]*$" version-control: never +fill-column: 76 End: @@ -1,11 +1,12 @@ - FIELD MARKS FOR WEBSTER 1913 and CIDE - ===================================== - Explanations of the tags used to mark the Webster 1913 dictionary -and the CIDE (Collaborative International Dictionary of English). -Note that the list of tags used to mark the public domain version -of this dictionary is shorter than the full set described here. - If any tag is not listed here, it is either (1) one of the -"point" (font size) or "type" (font style) tags, which should be -self-explanatory; or (2) is a functional field with no effect on the -typography. +FIELD MARKS FOR WEBSTER 1913 and CIDE +===================================== + +* Overview + +This file describes the tags used to mark the Webster 1913 dictionary and +the GCIDE (GNU Collaborative International Dictionary of English). + +If any tag is not listed here, it is either (1) one of the "point" (font +size) or "type" (font style) tags, which should be self-explanatory; or (2) +is a functional field with no effect on the typography. @@ -17,110 +18,141 @@ Last modified March 12, 1999. (908) 561-3416 or (908) 668-5252 -------------------------------------------------------------- -A separate file, webfont.txt, contains the list of the individual + +A separate file, webfont.txt, contains the list of the individual non-ASCII characters represented by either higher-order hexadecimal -character marks (e.g., \'94, for o-umlaut) or by entity tags -(e.g., <root/, for the square root symbol.) --------------------------------------------------------------- - Use of tags: - In the MICRA electronic version of the 1913 Webster, each part of -the entry headed by an entry word ("headword") is labeled so that no -part of the entry except some punctuation marks should be found -outside of all fields, i.e. every character should be within some tagged -field. 
In the following description, the word "segment" usually refers to -a major part of an entry such as an etymology or a definition or a -collocation segment or a usage block, containing more than one field. -The term "field" may also be used similarly to "segment", but may also -denote single-word fields, such as an alternative spelling, labeled <asp>. - - Note: The tags on this list are similar in structure to SGML tags. Each -tag on this list marks a field; each field opens with a tagname between -angle brackets thus: <tagname>, and closes with a similar tag containing -the forward slash thus: </tagname>. No tags are used without closing -tags. Thus the HTML <BR> to indicate a line break is symbolized -here as an entity, <br/, and every <p> has a corresponding </p>. - The absence of an end-field tag, or the presence of an end-field tag -without a prior begin-field tag constitutes a typographical error, of which -there may be a significant number. Any errors detected should be brought -to the attention of PJC or the appropriate editor. - Most of the tagged fields are presented in the text in italic type, -with a number of exceptions. Where a word is contained within more than -one field, the innermost field determines the font to be used. Wherever -recognizable functional fields were found, an attempt was made to tag the -field with a functional mark, but in many cases, words were italicised only -to represent the word itself as a discourse entity, and in some such cases, -the "italic" mark <it> was used, implying nothing regarding functionality -of the word. The base font is considered "plain". Where an italic field -is indicated, parentheses or brackets within the field are not italicised. - Where no font is specified for a tag, the tag is merely a functional +character marks (e.g., \'94, for o-umlaut) or by entity tags (e.g., +<root/, for the square root symbol.) 
+ +* Introduction + +In the MICRA electronic version of the 1913 Webster and in GCIDE, each part +of the entry headed by an entry word ("headword") is labeled so that no part +of the entry except some punctuation marks should be found outside of all +fields, i.e. every character should be within some tagged field. In the +following description, the word "segment" usually refers to a major part of +an entry such as an etymology or a definition or a collocation segment or a +usage block, containing more than one field. The term "field" may also be +used similarly to "segment", but may also denote single-word fields, such as +an alternative spelling, labeled <asp>. + +The tags on this list are similar in structure to SGML tags. Each tag on +this list marks a field; each field opens with a tagname between angle +brackets thus: <tagname>, and closes with a similar tag containing the +forward slash thus: </tagname>. No tags are used without closing tags. +Thus a line break (similar to HTML <br> tag) is symbolized here as an +entity, <br/, and every <p> has a corresponding </p>. + +The absence of an end-field tag, or the presence of an end-field tag without +a prior begin-field tag constitutes a typographical error, of which there +may be a significant number. Any errors detected should be brought to the +attention of PJC or the appropriate editor. + +Most of the tagged fields are presented in the text in italic type, with a +number of exceptions. Where a word is contained within more than one field, +the innermost field determines the font to be used. Wherever recognizable +functional fields were found, an attempt was made to tag the field with a +functional mark, but in many cases, words were italicised only to represent +the word itself as a discourse entity, and in some such cases, the "italic" +mark <it> was used, implying nothing regarding functionality of the word. +The base font is considered "plain". 
Where an italic field is indicated, +parentheses or brackets within the field are not italicised. + +Where no font is specified for a tag, the tag is merely a functional division, and was printed in plain font unless otherwise tagged. This type -of segment is marked by an asterisk (*) where the font name would be. - The size of the "plain" font in the original text is about 1.6 mm for -the height of capitalized letters. -============================================================= -Explicit typographical tags: - These were used where the purpose of a different font was merely to -distinguish a word from the body of the text, and no explicit functional -tag seemed apropriate. ------------------------------------ -Tag Font ------------------------------------ -Explicit formatting tags: -. . . . . . . . . . . . . . . . . . -<plain> plain font (that used in the body of a definition) -- - normally not marked, except within fields of - a different front. -<it> italic (in master files) -<i> italic (for use in HTML presentation) -<bold> bold (in master files) -<b> bold (for use in HTML presentation) -<colf> bold, Collocation font. Same font as used in collocations. - smaller This is used only in the list of "un-" words not - by 1 point actually defined in the dictionary. Probably could be - replaced by a segment mark for the entire list! - The "un-" words should be indexed as headwords. - -<ct> bold Same as <colf>, a font similar to that used in - collocations. However, this tag is used in a table - and could be set to a different font. - -<h1> * HTML tag -- largest heading font. - -<h2> * HTML tag -- second largest heading font. - -<headrow> * Marks a Row title in a table. - -<hwf> Font the same as the headword <hw>, though the field is - not a headword. Used only once. - -<mitem> * Multiple items, a set of items in a table. -<point ...> A series of point size markers, many unique. 
-<point1.5> * One of the tags of the form <point**> where ** -<point6> represents the typographic point size of the - enclosed text. -<pre> An HTML tag indicating that the enclosed text is - of teletype form, preformatted in a uniform-spaced - font. -<sc> small caps (used mostly for "a. d.", "b. c.") - This is the same font a <er>, but has no functional - or semantic significance -<str> group of table data elements in a table -<sub> subscript, like <subs> -<subs> subscript -<sups> superscript -<supr> superscript -<sansserif> Sans-serif font -<stypec> Bold (collocation font) and also a subtype. -<tt> HTML tage -- teletype font -<universbold> A squared bold font without serifs approximating the - "universe bold" font on the HP Laserjet4, slightly - larger than the capitals in a definition body. Used - in expositions describing shapes, such as - "Y", "T", "U", "X", "V", "F". -<vertical> Vertically organized column. -<column1> Vertically organized column -- only part of a table - which needs to be completed. Used once. -<...type> A series of tags, many unique, designating certain - unusual fonts, such as "bourgeoistype" for - "bourgeois type", in the section on typography. - Most of these occur only once, in the section on fonts. +of segment is marked by an asterisk (*) where the font name would be. The +size of the "plain" font in the original text is about 1.6 mm for the height +of capitalized letters. + +* Explicit typographical tags + +These were used where the purpose of a different font was merely to +distinguish a word from the body of the text, and no explicit functional tag +seemed apropriate. + +------------------------------------------------------------------------- +Tag Font Description +------------------------------------------------------------------------- +<plain> plain font that used in the body of a definition -- normally + not marked, except within fields of a different + front. 
+
+<it>            italic       in master files
+
+<i>             italic       for use in HTML presentation
+
+<bold>          bold         in master files
+
+<b>             bold         for use in HTML presentation
+
+<colf>          bold,        Collocation font. Same font as used in
+                             collocations.
+                smaller      This is used only in the list of "un-"
+                by 1 point   words not actually defined in the
+                             dictionary.
+                             Probably could be replaced by a segment mark
+                             for the entire list! The "un-" words should
+                             be indexed as headwords.
+
+<ct>            bold         Same as <colf>, a font similar to that used
+                             in collocations. However, this tag is used
+                             in a table and could be set to a different
+                             font.
+
+<h1>            *            HTML tag -- largest heading font.
+
+<h2>            *            HTML tag -- second largest heading font.
+
+<headrow>       *            Marks a Row title in a table.
+
+<hwf>                        Font the same as the headword <hw>, though
+                             the field is not a headword. Used only
+                             once.
+
+<mitem>         *            Multiple items, a set of items in a table.
+<point ...>                  A series of point size markers, many
+                             unique.
+
+<point1.5>      *            One of the tags of the form <point**> where **
+<point6>                     represents the typographic point size of the
+                             enclosed text.
+
+<pre>                        An HTML tag indicating that the enclosed
+                             text is of teletype form, preformatted in a
+                             uniform-spaced font.
+
+<sc>            small caps   used mostly for "a. d.", "b. c."
+                             This is the same font as in <er>, but has no
+                             functional or semantic significance.
+
+<str>                        group of table data elements in a table.
+
+<sub>           subscript
+
+<subs>          subscript
+
+<sups>          superscript
+
+<supr>          superscript
+
+<sansserif>     Sans-serif
+
+<stypec>        Bold         collocation font, and also a subtype.
+
+<tt>                         HTML tage -- teletype font
+
+<universbold>                A squared bold font without serifs approximating
+                             the "universe bold" font on the HP Laserjet4,
+                             slightly larger than the capitals in a definition
+                             body. Used in expositions describing shapes,
+                             such as "Y", "T", "U", "X", "V", "F".
+
+<vertical>                   Vertically organized column.
+
+<column1>                    Vertically organized column -- only part of a table
+                             which needs to be completed. Used once.
+
+<...type>                    A series of tags, many unique, designating
+                             certain unusual fonts, such as "bourgeoistype"
+                             for "bourgeois type", in the section on
+                             typography. Most of these occur only once, in
+                             the section on fonts.
 Some examples follow:
 <antiquetype>
@@ -148,347 +180,382 @@ Explicit formatting tags:
-=============================================================
-Tags with semantic content:
-. . . . . . . . . . . . . . . . . . . . . . . . . . .
-<altsp>     *                Alternative spelling segment. Almost always
-                             contained within square brackets after the main
-                             definition segment. Expository words
-                             such as "Spelled also" are in plain font;
-                             the actual alternative spelling is marked by
-                             <asp> ... </asp> tags within this segment.
-
-<ant>       italic           Antonym.
-
-<asp>       italic           Alternative spelling. The actual word which is an
-                             alternative spelling to the headword. These
-                             are functionally synonyms of the headword. In
-                             most cases these also occur as headwords, with
-                             reference to the word where the actual definition
-                             is found, but not all such words are listed
-                             separately, particularly if the spelling is
-                             close enough to the headword to be found at the
-                             same point in the dictionary. Whether listed
-                             separately or not, these words should
-                             be indexed at this location, also.
-
-<au>        italic           Authority or author. Used where an authority is
-            (may be right-   given for a definition, and also used for the
-            justified. See   author, where a quotation within double quotes
-            in the section   is given in the same paragraph as the
-            on formatting).  definition. The double quotes are indicated
-                             by the open-quote (\'bd) and close-quote
-                             (\'b8). In both cases, it is typically
-                             right-justified, almost always fitting on
-                             the same line with the last line of the
-                             definition or quotation.
-                             Within collocation segments, it is usually
-                             used only after quotations, and is not right-
-                             justified, except occasionally where it
+* Tags with semantic content:
+
+-------------------------------------------------------------------------
+Tag         Font             Meaning and Description
+-------------------------------------------------------------------------
+<altsp>     *                Alternative spelling segment. Almost always
+                             contained within square brackets after the main
+                             definition segment. Expository words such as
+                             "Spelled also" are in plain font; the actual
+                             alternative spelling is marked by <asp> ...
+                             </asp> tags within this segment.
+
+<ant>       italic           Antonym.
+
+<asp>       italic           Alternative spelling. The actual word which is
+                             an alternative spelling to the headword. These
+                             are functionally synonyms of the headword. In
+                             most cases these also occur as headwords, with
+                             reference to the word where the actual definition
+                             is found, but not all such words are listed
+                             se |