diff options
Diffstat (limited to 'README.DIC')
-rw-r--r-- | README.DIC | 268 |
1 files changed, 0 insertions, 268 deletions
diff --git a/README.DIC b/README.DIC deleted file mode 100644 index 6a7ea1c..0000000 --- a/README.DIC +++ /dev/null @@ -1,268 +0,0 @@ -File README.DIC - To accompany the GNU version of the set of files (cide.*) containing - the electronic version of the - Collaborative International Dictionary of English. - (called also GCIDE) - These files contain Version 0.51 (January 2012) - * * * * * * * * * * * * * * * * * * * * * * * * * * * * - -The dictionary was derived from the - Webster's Revised Unabridged Dictionary - Version published 1913 - by the C. & G. Merriam Co. - Springfield, Mass. - Under the direction of - Noah Porter, D.D., LL.D. - -and has been supplemented with some of the definitions from - WordNet, a semantic network created by - the Cognitive Science Department - of Princeton University - under the direction of - Prof. George Miller - -and is being proof-read and supplemented by volunteers from -around the world. This is an unfunded project, and future -enhancement of this dictionary will depend on the efforts of -volunteers willing to help build this free resource into a -comprehensive body of general information. New definitions -for missing words or words senses and longer explanatory notes, -as well as images to accompany the articles are needed. More -modern illustrative quotations giving recent examples of -usage of the words in their various senses will be very -helpful, since most quotations in the original 1913 dictionary -are now well over 100 years old. - - This electronic version is being maintained by World Soul, -a non-profit organization in Plainfield, NJ. For additional -information or if you are willing to assist construction of this -data source, contact: - -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= - Patrick J. Cassidy | TEL: (908) 561-3416 - World Soul | if no answer, (908) 668-5252 - 735 Belvidere Ave. | FAX: (908) 668-5904 - Plainfield, NJ 07062-2054 - pc@worldsoul.org or cassidy@micra.com -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= - - * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * - -GCIDE is free software; you can redistribute it and/or modify -it under the terms of the GNU General Public License as published by -the Free Software Foundation; either version 2, or (at your option) -any later version. - -GCIDE is distributed in the hope that it will be useful, -but WITHOUT ANY WARRANTY; without even the implied warranty of -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -GNU General Public License for more details. - -You should have received a copy of the GNU General Public License -along with this copy of GCIDE; see the file COPYING. If not, write -to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, -Boston, MA 02111-1307, USA. - * * * * * * * * * * * * * * * * * * * * * - -STRUCTURE OF THE DICTIONARY ---------------------------- - When the archives are unpacked, the main dictionary text of -the GCIDE will be found in 26 files named "CIDE.*", where the -asterisk indicates which letter of the alphabet begins the -words in each file. For example, file "CIDE.B" contains words -beginning with the letter "B". Additional information about the -tagging conventions and special character symbols are contained in -ancillary files in this directory more information below). The main -body of the 1913 dictionary was essentially identical to the edition -published in 1890, and was republished in 1913 with an appendix -containing "New Words". The new words of that appendix have been -integrated into the main file in this version. However, it is important -to keep in mind that the definitions in this dictionary are in most -cases over 100 years old. Use them with caution! - At the bottom of each paragraph in this dictionary, there is a -bracketed and tagged "source" indicated. This tells from where the -definition or other text in that paragraph came, as follows: - -[<source>1913 Webster</source>] - = From the original 1890 dictionary. -[<source>Webster 1913 Suppl.</source>] - = From the 1913 "New Words" supplement to the Webster. -[<source>WordNet 1.5</source>] - = From the WordNet on-line semantic network. -[<source>Century Dict. 1906.</source>] - = From the Century Dictionary published in 1906, especially from - the "proper Names" supplement (volume IX). - published -[<source>XXX</source>] - = Added by one of the volunteers. - - The original definitions have been tagged and in some cases -reformatted or slightly rearranged. If substantive information -is added from a second source, usually the additional source is -also noted, as in: -[<source>Webster 1913 Suppl.</source> + <source>WordNet 1.5</source>] - - A list of the ancillary files related to the GCIDE is appended at -the bottom of this "README.DIC" file. - This version is tagged with SGML-like tags of the form <pos>...</pos> -so that the original typography (italics, bold, block quotes) can be -reproduced. A list of the most important tags for fields in the -dictionary is given below. The tags also serve the more important -function of allowing the information content to be conveniently imported -into computer programs or databases. The set of tags used is described -in the accompanying file "tagset.web". ***NOTE*** the paragraph tags -<p>...</p> do *not* always nest properly with certain other tags, such -as <note> and <cs> ("collocation section"), which in some cases span -multiple paragraphs. If you are using a tag parser which detects -improper nesting, you should first either delete the paragraph -tags or convert them to non-tag symbols, or, if possible, set the -parser to ignore the <p>...</p> tags. - The unusual characters (such as Greek or the European accented -characters, as well as special characters used in the pronunciations) -are described in the accompanying file "WEBFONT.ASC". Some information -on the pronunciation system used may be found by viewing the files -"WXXVII.JPG" and "PRONUNC.JPG" with a GIF viewer (or any web browser), -and additional explanations of pronunciation are in the file -"PRONUNC.WEB". - Each paragraph of the original text is enclosed within tags of -the form <p> . . . </p>. Within these paragraphs are no line -breaks, and some of the paragraphs are over 12,000 characters long. -These lines are too long to be handled by the vi editor, and probably -by some other text editors. At some points, embedded line breaks within -a "paragraph" are marked by a <br/ "entity". The file can therefore -be converted, if necessary, to a form with shorter lines, and subsequently -reconverted back to the form having one line per paragraph. - - If additional line breaks are added, then in order remove the -line breaks and reconstruct the original paragraphs, so that the -page width can be adjusted, perform the following manipulations: - (1) convert each line break (cr-lf combination) to a space. - (2) convert the string "</p> " (</p> followed by two spaces) - to </p> followed by two line breaks (cr-lf combinations) - (3) convert the string "<br/ " (<br/ followed by one space) - to <br/ followed by one line break (cr-lf). -There will be some "lines" (paragraphs) with over 12,000 characters, -which may give trouble to some simple text editors. - A more sophisticated formatting of spaces within paragraphs may -require the use of the fully-tagged master files. If you have -a need for these files, contact Patrick Cassidy: cassidy@micra.com. - The approximate beginning of each page is marked by an SGML -comment of the form <-- p. 345 -->. (The exact beginning was in some -cases in the middle of a paragraph, which we decided was not a -good location for these page-number comments, so the page number -was usually moved to the next paragraph break). Pages which have -been proofread by volunteers (e.g., with initials VOL) will have a -note within that page comment: <-- p. 345 pr=VOL -->. Pages which have -not been proofread yet (most of them) will have varying numbers of -typographical errors in them. We still (January 2012) need -proofreaders to get the errors out of these dictionary files. - -*********************************************************************** -** WARNING!!! ** -*********************************************************************** - - This version is only a first typing, and has numerous typographic -errors, including errors in the field-marks. In addition, the user must -keep in mind that this text is very old and will contain numerous -obsolete, inaccurate, and perhaps offensive statements, which are -included solely because this work is intended to reproduce accurately -this historically interesting classic reference work. This text should -not be relied upon as an accurate source of information, as in many -cases it represents the state of knowledge around 1890. The text is -provided "as is", and the user must accept responsibility for all -consequences of its use. Please refer to the header of each file and -the GNU public license. If these conditions of use are unacceptable, -please do not use these texts. -************************************************************************ -************************************************************************ - This electronic dictionary is also made available as a potential -starting point for development of a modern comprehensive encyclopedic -dictionary, to be accessible freely on the internet, and developed by the -efforts of all individuals willing to help build a large and freely -available knowledge base. A large number of collaborators are needed to -bring this dictionary to a more accurate, more modern, and more useful -state. Anyone willing to assist in any way in constructing such a -knowledge base should contact Patrick Cassidy (see above). All reports -of errors will be gratefully received, and should also be transmitted to -PC at: pc@worldsoul.org. - -In addition to the main text of the dictionary, additional -explanatory material about this version of the dictionary is available -in the ancillary files: - -===================================================================== --rw-r--r-- 1 18021 2012-01-30 00:24 COPYING --rw-r--r-- 1 2569796 2000-06-18 15:11 PRONUNC.JPG --rw-r--r-- 1 13994 2012-01-30 00:24 PRONUNC.WEB --rw-r--r-- 1 13507 2012-01-30 00:27 README.DIC --rw-r--r-- 1 144716 2000-06-18 15:13 SYMBOLS.JPG --rw-r--r-- 1 54783 2012-01-30 00:24 TAGSET.WEB --rw-r--r-- 1 34631 2012-01-30 00:24 WEBFONT.ASC --rw-r--r-- 1 1188380 2000-06-18 15:19 WXXVII.JPG -===================================================================== - - -Most important tags used in the GCIDE: -<hw> tags the headword -<pr> pronunciation -<pos> part of speech -<ety> etymology -<ets> "source" word within an <ety> field, usually foreign words -<fld> field of knowledge (e.g. Med. = medicine) -<def> definition -<cs> collocation section (containing word combinations) -<col> collocation entry (word combination) -<cd> collocation definition -<as> illustrations of usage (within a <def>. . . </def> field) -<au> authority for a definition, or author of a quotation -<q> illustrative quotation -- in block quote format -<au> author of an illustrative <q> quotation -<altname> alternative name for the headword -- essentially a synonym -<asp> alternative spelling of the headword -<syn> list of synonyms for the headword -<p> paragraph -<b> bold type -<it> italic type - -For other tags, see the file "tagset.web" - - -============================================================ - OTHER VERSIONS OF THE DICTIONARY -============================================================= - - There are several other derivative versions of this dictionary -on the internet, in some cases reformatted or provided with an -interface. Those that I am aware of are: - -(1) Project Gutenberg ---------------------- - In the extext96 directory of Project Gutenberg (www.prairienet.org) -there is a version of the original 1913 dictionary, which is in -the **public domain**. The main files are in the directory etext96, -and sre labeled pgw050**.***. The tags for that version are a subset -of those used in this GNU version. - -(2) The DICT development group ------------------------------- -This group has created a program to index and search this dictionary. -The program can be downloaded and used locally, but at present -is available only in a Unix-compatible executable version. -See their web site at http://www.dict.org. - -(3) The University of Chicago ARTFL project ---------------------------------------------- -Mark Olsen and Gavin LaRowe at the University of Chicago have -converted the original 1913 dictionary to HTML and have provided an -interface allowing search of the headwords. When the supplemented -version has developed sufficiently to warrant the effort, a -similar searchable version may be posted there as well. The -search page is at: - http://humanities.uchicago.edu/forms_unrest/webster.form.html - -That page will provide links to other ARTFL projects and contact -information for the ARTFL group, who alone can provide information -about the HTML version or interface. - - - -- PJC |