author | Sergey Poznyakoff <gray@gnu.org.ua> | 2012-02-02 14:42:06 +0200 |
---|---|---|
committer | Sergey Poznyakoff <gray@gnu.org.ua> | 2012-02-02 14:42:06 +0200 |
commit | 3d4fbac289846464491104b01bebe554da6758da (patch) | |
tree | ef314e6d3f0c12d1879e43c4c0bb5753cc9e5f78 | |
parent | b61268b9deea32b7d965808f47d1227e3197a83c (diff) | |
download | gcide-3d4fbac289846464491104b01bebe554da6758da.tar.gz gcide-3d4fbac289846464491104b01bebe554da6758da.tar.bz2 |
Reorganize the directory structure.
* .gitignore: New file.
* Makefile: Fix the list of distributed files.
* README.DIC: Rename to README and edit.
* WXXVII.JPG: Remove.
* abbrevn.lst: New file.
* authors.lst: New file.
* gcide.conf: New file.
* PRONUNC.JPG: Rename to pronunc.jpg.
* PRONUNC.WEB: Rename to pronunc.txt.
* SYMBOLS.JPG: Rename to symbols.jpg
* TAGSET.WEB: Rename to tagset.txt
* WEBFONT.ASC: Rename to webfont.txt.
* titlepage.png: New file.
-rw-r--r-- | .gitignore | 5 | ||||
-rw-r--r-- | Makefile | 25 | ||||
-rw-r--r-- | README | 368 | ||||
-rw-r--r-- | README.DIC | 268 | ||||
-rw-r--r-- | WXXVII.JPG | bin | 1188380 -> 0 bytes | |||
-rw-r--r-- | abbrevn.lst | 457 | ||||
-rw-r--r-- | authors.lst | 9669 | ||||
-rw-r--r-- | gcide.conf | 42 | ||||
-rw-r--r-- | pronunc.jpg (renamed from PRONUNC.JPG) | bin | 2569796 -> 2569796 bytes | |||
-rw-r--r-- | pronunc.txt (renamed from PRONUNC.WEB) | 44 | ||||
-rw-r--r-- | symbols.jpg (renamed from SYMBOLS.JPG) | bin | 144716 -> 144716 bytes | |||
-rw-r--r-- | tagset.txt (renamed from TAGSET.WEB) | 26 | ||||
-rw-r--r-- | titlepage.png | bin | 0 -> 24666 bytes | |||
-rw-r--r-- | webfont.txt (renamed from WEBFONT.ASC) | 31 |
14 files changed, 10627 insertions, 308 deletions
diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..ec988f4
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,5 @@ | |||
1 | .emacs* | ||
2 | *~ | ||
3 | *.tar.gz | ||
4 | *.tar.xz | ||
5 | *.zip | ||
@@ -6,9 +6,20 @@ DISTFILES=\ | |||
6 | COPYING\ | 6 | COPYING\ |
7 | PRONUNC.JPG\ | 7 | README\ |
8 | PRONUNC.WEB\ | 8 | pronunc.jpg\ |
9 | README.DIC\ | 9 | symbols.jpg\ |
10 | SYMBOLS.JPG\ | 10 | pronunc.txt\ |
11 | TAGSET.WEB\ | 11 | tagset.txt\ |
12 | WEBFONT.ASC\ | 12 | webfont.txt\ |
13 | WXXVII.JPG | 13 | abbrevn.lst\ |
14 | authors.lst\ | ||
15 | titlepage.png | ||
16 | |||
17 | anclist: | ||
18 | @ls -o $(DISTFILES) | grep -v 'CIDE.[A-Z]' | ||
19 | |||
20 | clean: | ||
21 | rm -f *~ | ||
22 | |||
23 | distclean: clean | ||
24 | rm -f $(DISTBASE).tar.gz $(DISTBASE).tar.xz $(DISTBASE).zip | ||
14 | 25 | ||
@@ -0,0 +1,368 @@ | |||
1 | The README file | ||
2 | |||
3 | To accompany the GNU version of the set of files (CIDE.*) containing | ||
4 | the electronic version of the | ||
5 | Collaborative International Dictionary of English. | ||
6 | (called also GCIDE) | ||
7 | These files contain Version 0.51 (January 2012) | ||
8 | * * * * * * * * * * * * * * * * * * * * * * * * * * * * | ||
9 | |||
10 | * OVERVIEW | ||
11 | ========== | ||
12 | This document describes the GNU version of the Collaborative | ||
13 | International Dictionary of English. It is organized into a series of | ||
14 | chapters, introduced by headings beginning with a single asterisk. A | ||
15 | chapter may have sections, which are marked with two asterisks. For | ||
16 | those readers who use Emacs, this structure corresponds to its | ||
17 | "Outline mode", which will be enabled automatically upon loading this | ||
18 | file. | ||
19 | |||
20 | The chapter "INTRODUCTION" describes the structure of this package. | ||
21 | The chapter "STRUCTURE OF THE DICTIONARY" describes the dictionary | ||
22 | structure in general. An overview of the markup tags is provided in | ||
23 | the chapter "TAGS". Detailed information about dictionary markup | ||
24 | can be obtained from a set of ancillary files included in this | ||
25 | package, which are described in the chapter "ANCILLARY FILES". | ||
26 | |||
27 | The chapter "DICTIONARY LOOKUP" describes how to use GNU Dico for | ||
28 | reading this dictionary. Finally, other versions of the Webster | ||
29 | dictionary are listed in the chapter "OTHER VERSIONS OF THE | ||
30 | DICTIONARY". | ||
31 | |||
32 | * INTRODUCTION | ||
33 | ============== | ||
34 | The dictionary was derived from the | ||
35 | Webster's Revised Unabridged Dictionary | ||
36 | Version published 1913 | ||
37 | by the C. & G. Merriam Co. | ||
38 | Springfield, Mass. | ||
39 | Under the direction of | ||
40 | Noah Porter, D.D., LL.D. | ||
41 | |||
42 | and has been supplemented with some of the definitions from | ||
43 | WordNet, a semantic network created by | ||
44 | the Cognitive Science Department | ||
45 | of Princeton University | ||
46 | under the direction of | ||
47 | Prof. George Miller | ||
48 | |||
49 | and is being proof-read and supplemented by volunteers from around the | ||
50 | world. This is an unfunded project, and future enhancement of this | ||
51 | dictionary will depend on the efforts of volunteers willing to help | ||
52 | build this free resource into a comprehensive body of general | ||
53 | information. New definitions for missing words or word senses and | ||
54 | longer explanatory notes, as well as images to accompany the articles | ||
55 | are needed. More modern illustrative quotations giving recent | ||
56 | examples of usage of the words in their various senses will be very | ||
57 | helpful, since most quotations in the original 1913 dictionary are now | ||
58 | well over 100 years old. | ||
59 | |||
60 | This electronic version is being maintained by World Soul, a | ||
61 | non-profit organization in Plainfield, NJ. For additional information | ||
62 | or if you are willing to assist in the construction of this data source, contact: | ||
63 | |||
64 | =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= | ||
65 | Patrick J. Cassidy | TEL: (908) 561-3416 | ||
66 | World Soul | if no answer, (908) 668-5252 | ||
67 | 735 Belvidere Ave. | FAX: (908) 668-5904 | ||
68 | Plainfield, NJ 07062-2054 | ||
69 | pc@worldsoul.org or cassidy@micra.com | ||
70 | =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= | ||
71 | |||
72 | GCIDE is free software; you can redistribute it and/or modify | ||
73 | it under the terms of the GNU General Public License as published by | ||
74 | the Free Software Foundation; either version 2, or (at your option) | ||
75 | any later version. | ||
76 | |||
77 | GCIDE is distributed in the hope that it will be useful, | ||
78 | but WITHOUT ANY WARRANTY; without even the implied warranty of | ||
79 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | ||
80 | GNU General Public License for more details. | ||
81 | |||
82 | You should have received a copy of the GNU General Public License | ||
83 | along with this copy of GCIDE; see the file COPYING. If not, write | ||
84 | to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, | ||
85 | Boston, MA 02111-1307, USA. | ||
86 | |||
87 | * STRUCTURE OF THE DICTIONARY | ||
88 | ============================= | ||
89 | When the archive is unpacked, the main dictionary text of the GCIDE | ||
90 | will be found in 26 files named "CIDE.*", where the asterisk indicates | ||
91 | which letter of the alphabet begins the words in each file. For | ||
92 | example, file "CIDE.B" contains words beginning with the letter "B". | ||
93 | Additional information about the tagging conventions and special | ||
94 | character symbols is contained in ancillary files in this directory | ||
95 | (see the chapter entitled "ANCILLARY FILES" below). The main body of | ||
96 | the 1913 dictionary was essentially identical to the edition published | ||
97 | in 1890, and was republished in 1913 with an appendix containing "New | ||
98 | Words". The new words of that appendix have been integrated into the | ||
99 | main file in this version. However, it is important to keep in mind | ||
100 | that the definitions in this dictionary are in most cases over 100 | ||
101 | years old. Use them with caution! | ||
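
As a minimal sketch of the file naming scheme described above, the following Python helper maps a headword to the "CIDE.*" file that should contain it. The function name `cide_file_for` and the list `ALL_CIDE_FILES` are hypothetical and not part of the package; the sketch assumes that only the 26 unaccented letters A-Z select a file, and it rejects anything else rather than guessing.

```python
import string

# The 26 expected file names, CIDE.A through CIDE.Z (assumed from the README text).
ALL_CIDE_FILES = ["CIDE." + letter for letter in string.ascii_uppercase]


def cide_file_for(word: str) -> str:
    """Return the name of the CIDE.* file expected to hold `word`.

    Hypothetical helper: raises ValueError when the word does not start
    with one of the letters A-Z, since the README does not say where
    such entries are stored.
    """
    first = word[:1].upper()
    if len(first) != 1 or first not in string.ascii_uppercase:
        raise ValueError(f"cannot choose a CIDE file for {word!r}")
    return "CIDE." + first
```

For example, `cide_file_for("bread")` returns `"CIDE.B"`, matching the example in the paragraph above.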
102 | |||
103 | At the bottom of each paragraph in this dictionary, a bracketed and | ||
104 | tagged "source" is given. It tells where the definition or other | ||
105 | text in that paragraph came from, as follows: | ||
106 | |||
107 | [<source>1913 Webster</source>] | ||
108 | = From the original 1890 dictionary. | ||
109 | [<source>Webster 1913 Suppl.</source>] | ||
110 | = From the 1913 "New Words" supplement to the Webster. | ||
111 | [<source>WordNet 1.5</source>] | ||
112 | = From the WordNet on-line semantic network. | ||
113 | [<source>Century Dict. 1906.</source>] | ||
114 | = From the Century Dictionary published in 1906, especially from | ||
115 | the "Proper Names" supplement (volume IX). | ||
117 | [<source>XXX</source>] | ||
118 | = Added by one of the volunteers. | ||
119 | |||
120 | The original definitions have been tagged and in some cases | ||
121 | reformatted or slightly rearranged. If substantive information is | ||
122 | added from a second source, usually the additional source is also | ||
123 | noted, as in: | ||
124 | |||
125 | [<source>Webster 1913 Suppl.</source> + <source>WordNet 1.5</source>] | ||
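
The source markers described above can be collected with a simple regular expression. The sketch below is illustrative only: the name `sources_of` is hypothetical, and it assumes that `<source>` elements are never nested and that each one sits on a single line, as in the examples shown.

```python
import re

# Non-greedy match so that two <source> elements in one bracket stay separate.
SOURCE_RE = re.compile(r"<source>(.*?)</source>")


def sources_of(paragraph: str) -> list[str]:
    """Return the text of every <source>...</source> element in a paragraph."""
    return SOURCE_RE.findall(paragraph)
```

Applied to a paragraph ending in the combined marker shown above, the function returns both source names, in the order in which they appear.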
126 | |||
127 | This version is tagged with SGML-like tags of the form <pos>...</pos> | ||
128 | so that the original typography (italics, bold, block quotes) can be | ||
129 | reproduced. A list of the most important tags for fields in the | ||
130 | dictionary is given below. The tags also serve the more important | ||
131 | function of allowing the information content to be conveniently | ||
132 | imported into computer programs or databases. The set of tags used is | ||
133 | described in the accompanying file "tagset.txt". ***NOTE*** the | ||
134 | paragraph tags <p>...</p> do *not* always nest properly with certain | ||
135 | other tags, such as <note> and <cs> ("collocation section"), which in | ||
136 | some cases span multiple paragraphs. If you are using a tag parser | ||
137 | which detects improper nesting, you should first either delete the | ||
138 | paragraph tags or convert them to non-tag symbols, or, if possible, | ||
139 | set the parser to ignore the <p>...</p> tags. | ||
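
One way to follow the advice above is to remove or replace the paragraph tags before handing the text to a strict parser. This is a minimal sketch under the assumption that paragraph tags always appear exactly as `<p>` and `</p>`, with no attributes; the name `strip_paragraph_tags` is hypothetical.

```python
import re

# Matches only <p> and </p>. Tags such as <pos> or <pr> are left alone,
# because the pattern requires ">" immediately after the "p".
P_TAG_RE = re.compile(r"</?p>")


def strip_paragraph_tags(text: str, replacement: str = "") -> str:
    """Delete paragraph tags, or convert them to a non-tag symbol.

    Pass a non-empty `replacement` to keep a visible marker where each
    paragraph tag used to be.
    """
    return P_TAG_RE.sub(replacement, text)
```

With the default arguments, the improperly nested input `<p><note>a</p><p>b</note></p>` becomes `<note>ab</note>`, which nests properly.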
140 | |||
141 | The unusual characters (such as Greek or the European accented | ||
142 | characters, as well as special characters used in the pronunciations) | ||
143 | are described in the accompanying file "webfont.txt". Some | ||
144 | information on the pronunciation system used may be found by viewing | ||
145 | the file "pronunc.jpg", and additional explanations of pronunciation | ||
146 | are in the file "pronunc.txt". | ||
147 | |||
148 | Each paragraph of the original text is enclosed within tags of the | ||
149 | form <p> . . . </p>. Within these paragraphs there are no line | ||
150 | breaks, and some of the paragraphs are over 12,000 characters long, | ||
151 | which may prove too long to be handled by some editors. At some | ||
152 | points, embedded line breaks within a "paragraph" are marked by a <br/ | ||
153 | "entity". The file can therefore be converted, if necessary, to a | ||
154 | form with shorter lines, and subsequently converted back to the form | ||
155 | having one line per paragraph. | ||
156 | |||
157 | If additional line breaks are added, then in order to remove the line | ||
158 | breaks and reconstruct the original paragraphs, so that the page width | ||
159 | can be adjusted, perform the following manipulations: | ||
160 | |||
161 | (1) convert each line break to a space. | ||
162 | (2) convert the string "</p>  " (</p> followed by two spaces) | ||
163 | to </p> followed by two line breaks. | ||
164 | (3) convert the string "<br/ " (<br/ followed by one space) | ||
165 | to <br/ followed by one line break. | ||
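
The three steps above translate directly into three string replacements. The following Python sketch applies them in order; the function name `rejoin_paragraphs` is hypothetical, and the code assumes Unix line endings ("\n") and that paragraphs in the wrapped file are separated by exactly one blank line.

```python
def rejoin_paragraphs(text: str) -> str:
    """Rebuild one-line-per-paragraph text from a line-wrapped copy."""
    # (1) convert each line break to a space
    text = text.replace("\n", " ")
    # (2) "</p>" followed by two spaces marks a paragraph boundary
    text = text.replace("</p>  ", "</p>\n\n")
    # (3) "<br/" followed by one space marks an embedded line break
    text = text.replace("<br/ ", "<br/\n")
    return text
```

The order matters: step (1) is what turns the blank line after `</p>` into the two spaces that step (2) looks for.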
166 | |||
167 | A more sophisticated formatting of spaces within paragraphs may | ||
168 | require the use of the fully-tagged master files. If you have a need | ||
169 | for these files, contact Patrick Cassidy: cassidy@micra.com. | ||
170 | |||
171 | The approximate beginning of each page is marked by an SGML comment of | ||
172 | the form <-- p. 345 -->. (The exact beginning was in some cases in | ||
173 | the middle of a paragraph, which we decided was not a good location | ||
174 | for these page-number comments, so the page number was usually moved | ||
175 | to the next paragraph break). Pages which have been proofread by | ||
176 | volunteers (e.g., with initials VOL) will have a note within that page | ||
177 | comment: <-- p. 345 pr=VOL -->. Pages which have not been proofread | ||
178 | yet (most of them) will have varying numbers of typographical errors | ||
179 | in them. We still (January 2012) need proofreaders to get the errors | ||
180 | out of these dictionary files. | ||
181 | |||
182 | ** Warning | ||
183 | |||
184 | This version is only a first typing, and has numerous typographic | ||
< |