Diffstat (limited to 'pronunc.txt')
-rw-r--r-- | pronunc.txt | 332 |
1 files changed, 332 insertions, 0 deletions
diff --git a/pronunc.txt b/pronunc.txt
new file mode 100644
index 0000000..5db6a9f
--- /dev/null
+++ b/pronunc.txt
@@ -0,0 +1,332 @@
1 | file PRONUNC.WEB | ||
2 | ================ | ||
3 | This file gives a number of examples of pronunciation, | ||
4 | using the entity symbols representing the pronunciations as | ||
5 | found in the 1913 Webster unabridged dictionary. Not all | ||
6 | vowel sounds are given here, but the examples should allow one | ||
7 | to recognize the characters and recall the symbols used to | ||
8 | represent them. The set of symbols used for pronunciation | ||
9 | is different from that used in most modern dictionaries, | ||
10 | but a more worrisome problem is that the pronunciations themselves | ||
11 | seem in many cases to differ from modern usage. The places of | ||
12 | the strong and weak accent are, however, in every case | ||
13 | examined, the same as in modern dictionaries. Anyone who is | ||
14 | willing to work at revising the pronunciations to reflect modern | ||
15 | usage or modern symbols should contact PJC. | ||
16 | |||
17 | |||
18 | Pronunciations in the 1913 Webster ASCII version | ||
19 | ================================================= | ||
20 | |||
21 | Syllables: | ||
22 | ---------------- | ||
23 | in pronunciations, the short hyphen used in the printed version as a | ||
24 | syllable-break is represented in the ASCII version by an asterisk (*). | ||
25 | the main (heavy) accent is represented by a double-quote ("). | ||
26 | the secondary (light) accent is represented by a left-single-quote | ||
27 | (grave accent) (`) | ||
28 | the hyphen in hyphenated words is represented by the ASCII hyphen (-). | ||
29 | where an accent occurs, no other syllable break is used. | ||
30 | sometimes a hyphen occurs after an accent. | ||
31 | ------------------------------------------------ | ||
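The syllable and accent markers above can be read mechanically. The following is a minimal Python sketch, not part of the dictionary files: the function name `split_syllables` is hypothetical, and it assumes (from examples such as deflate = d<esl/*fl<amac/t") that an accent mark follows the syllable it stresses.

```python
import re

# Break markers described above: '*' is a plain syllable break,
# '"' the main accent, '`' the secondary accent.
# An accent also ends its syllable, so all three act as separators.
_BREAK = re.compile(r'([*"`])')

_STRESS = {'"': "main", "`": "secondary", "*": None}


def split_syllables(pron):
    """Return (syllable, stress) pairs; stress is 'main', 'secondary' or None."""
    # With a capturing group, re.split yields: text, marker, text, marker, ..., text
    parts = _BREAK.split(pron)
    result = []
    for i in range(0, len(parts), 2):
        # The hyphen of a hyphenated word may follow an accent; drop it here.
        text = parts[i].lstrip("-")
        # A final syllable with no marker after it is treated as unstressed.
        marker = parts[i + 1] if i + 1 < len(parts) else "*"
        if text:
            result.append((text, _STRESS[marker]))
    return result


print(split_syllables('d<esl/*fl<amac/t"'))
# [('d<esl/', None), ('fl<amac/t', 'main')]
```

Entities such as <esl/ pass through untouched, since none of the three markers occurs inside an entity name.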
32 | |||
33 | Consonants: | ||
34 | Most consonants have their normal value in the pronunciations, | ||
35 | but there are a few special characters, such as the n-submacron and the
36 | "th" ligature. See the end of the "special characters" section. | ||
37 | |||
38 | Special characters: | ||
39 | -------------------- | ||
40 | The special characters are represented by two different sets of | ||
41 | symbols: (1) the RTF-format hexadecimal codes such as \'94 for | ||
42 | o-umlaut, meaning that the byte code is hexadecimal 94. These | ||
43 | are used only for those symbols which have been designed into a | ||
44 | special font set for this dictionary. The font set can only be used | ||
45 | in a DOS system; or | ||
46 | (2) an "entity" symbol using "<" and "/" as opening and closing | ||
47 | delimiters, with a mnemonic string between. In the case of o-umlaut | ||
48 | the symbol is <oum/. For the vowels, the system is consistent, | ||
49 | thus <aum/ is a-umlaut, and <ium/ is i-umlaut, etc. | ||
50 | These delimiters are used in preference to the HTML-style | ||
51 | (e.g. &auml;) delimiters because of the heavy use of ampersands in
52 | the dictionary, to minimize file length. For the same reason, | ||
53 | the codes within the delimiters are generally shorter than the | ||
54 | corresponding ISO 8879 codes ( <aum/ rather than &auml; ).
55 | For this discussion, I will use the "entity" coding. The | ||
56 | equivalent hexadecimal codes, where they exist, will be found in | ||
57 | the tables in the file "webfont.asc". | ||
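As an illustration of the "entity" coding, here is a small Python sketch that finds entities of the form "<" + mnemonic + "/" and replaces the ones it knows with Unicode characters. The names `ENTITY`, `UMLAUTS` and `expand_entities` are hypothetical, the pattern assumes mnemonics made of lowercase letters only, and the table holds only the three umlaut entities named above; a full table would have to be built from "webfont.asc", which is not reproduced here.

```python
import re

# Entity form described above: '<' + mnemonic + '/', e.g. <oum/ for o-umlaut.
# Assumption: mnemonics consist of lowercase ASCII letters only.
ENTITY = re.compile(r"<([a-z]+)/")

# Only the three umlaut entities named in the text.
UMLAUTS = {
    "aum": "\u00e4",  # a-umlaut
    "oum": "\u00f6",  # o-umlaut
    "ium": "\u00ef",  # i-umlaut
}


def expand_entities(text, table=UMLAUTS):
    """Replace entities found in `table`; leave unknown entities unchanged."""
    return ENTITY.sub(lambda m: table.get(m.group(1), m.group(0)), text)


print(expand_entities("k<aum/rp"))   # known entity is replaced
print(expand_entities("l<amac/t"))   # <amac/ is not in the table, so it is kept
```

Leaving unknown entities in place, rather than dropping them, keeps the text reversible until a complete table is available.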
58 | |||
59 | The pronunciation system of the 1913 Webster has three peculiarities | ||
60 | relative to systems used in recent dictionaries. | ||
61 | (1) a more complex set of symbols is used. This is evident, for
62 | example, where the long vowels have different symbols whether | ||
63 | they are used in stressed or unstressed syllables. Thus | ||
64 | long a in "acre" or "chaos" is represented as a-macron (<amac/ in | ||
65 | our notation). But in "chaotic" or "connate" or "comate" it is | ||
66 | represented as a symbol looking like a-macron, but with a short | ||
67 | ascender in the middle of the macron above the a. This is denoted | ||
68 | <asl/ ("a semilong") in our notation. | ||
69 | |||
70 | Also, some sounds have more than one symbol. Thus, there are several
71 | symbols using "y" with a diacritical mark above, representing
72 | the same sounds as symbols using "i" or "e", but used in those cases
73 | where the written word has a "y" in it. So words ending in "y" with
74 | pronunciations like the unaccented long "e" usually have
75 | a y-breve (<ycr/) in the pronunciation. Why?
76 | |||
77 | (2) The indicated pronunciations themselves are in some cases | ||
78 | different from what one would find in a modern dictionary. | ||
79 | In part this is due to differences among orthoepists with | ||
80 | different notions of how a word should sound, and possibly | ||
81 | it is due to differences in the pronunciation between 1890, | ||
82 | when British pronunciations may have had more influence, and | ||
83 | the present. Thus we see that words ending in "-ties"
84 | are given the pronunciation "-t<icr/z", which sounds
85 | like "tizz", whereas I have always heard such words pronounced | ||
86 | with a long "e", as in "teez". In Webster's 10th collegiate, they | ||
87 | mention that unstressed long e may be pronounced as i in | ||
88 | southern British or southern US dialects, and perhaps it | ||
89 | was more common in the US in 1890. The <icr/ is an unreliable | ||
90 | indicator of modern standard American pronunciation. | ||
91 | |||
92 | (3) The indefinite value, represented by an upside-down e (called | ||
93 | the "schwa"), is not used, the same sound being represented by
94 | symbols like short u <ucr/, or sometimes other vowels. | ||
95 | |||
96 | So be warned, the pronunciations may not be quite what one would | ||
97 | expect. But for this effort, we are trying to reproduce exactly | ||
98 | the pronunciations in the original work.
99 | |||
100 | Notice that in pronunciations, vowels that are obscured are often | ||
101 | represented by the italicised vowel without any diacritical marks; | ||
102 | these italicised vowels are represented as either <ait/, <eit/, etc. | ||
103 | or with an <it> tag, as in m<it>e</it>nt | ||
104 | Thus "Christian" is represented as kr<icr/s"ch<it>a</it>n | ||
105 | communicant is represented as k<ocr/m*m<umac/"n<icr/*k<ait/nt | ||
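Since the italic vowels appear in two spellings, a program comparing pronunciations may want to reduce them to one. The Python sketch below rewrites the <it> tag form into the entity form. The function name `normalize_italic_vowels` is hypothetical, and covering i, o and u in addition to a and e is an assumption based on the "etc." above.

```python
import re

# <it>X</it> around a single lowercase vowel becomes <Xit/.
# Only a and e are shown in the text; i, o and u are assumed to follow suit.
_IT_VOWEL = re.compile(r"<it>([aeiou])</it>")


def normalize_italic_vowels(pron):
    """Rewrite tagged italic vowels, e.g. <it>a</it>, as entities, e.g. <ait/."""
    return _IT_VOWEL.sub(r"<\1it/", pron)


print(normalize_italic_vowels('kr<icr/s"ch<it>a</it>n'))
# kr<icr/s"ch<ait/n
```

Longer italic runs inside <it> tags are left alone, because only a single vowel between the tags matches the pattern.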
106 | |||
107 | |||
108 | Some examples of pronunciations follow: | ||
109 | for further explanations of the entities, see the file "webfont.asc" | ||
110 | ============================================================== | ||
111 | |||
112 | <amac/ long a (stressed) (a with a macron above it) | ||
113 | late = l<amac/t | ||
114 | later = l<amac/t"<etil/r | ||
115 | comb-shaped = k<omac/m"-sh<amac/pt` | ||
116 | commemorate = k<ocr/m*m<ecr/m"<osl/*r<amac/t | ||
117 | deign = d<amac/n | ||
118 | deflate = d<esl/*fl<amac/t" | ||
119 | defray = d<esl/*fr<amac/" | ||
120 | defrayal = d<esl/*fr<amac/"<ait/l | ||
121 | |||
122 | |||
123 | <asl/ long a (unstressed) | ||
124 | commodate = k<ocr/m"m<osl/*d<asl/t | ||
125 | cometary = k<ocr/m"<ecr/t*<asl/*r<ycr/ | ||
126 | |||
127 | <ait/ italic a | ||
128 | communicant = k<ocr/m*m<umac/"n<icr/*k<ait/nt | ||
129 | defeasance = d<esl/*f<emac/"z<ait/ns | ||
130 | commercial = k<ocr/m*m<etil/r"sh<ait/l | ||
131 | compass = k<ucr/m"p<ait/s | ||
132 | |||
133 | <acr/ short a (a with a crescent [breve] above it) | ||
134 | adipose = <acr/d"<icr/*p<omac/s | ||
135 | absolve = <acr/b*s<ocr/lv" | ||
136 | land = l<acr/nd | ||
137 | lamp = l<acr/mp | ||
138 | |||
139 | <adot/ short a (a with a dot above it) | ||
140 | again = <adot/*g<ecr/n" | ||
141 | carouse = k<adot/*rouz" | ||
142 | coma = k<omac/"m<adot/ | ||
143 | comma = k<ocr/m"m<adot/ | *These sound different | ||
144 | command = k<ocr/m*m<adot/nd" | to me | ||
145 | mass = m<adot/s | ||
146 | mash = m<adot/sh | ||
147 | mat = m<adot/t | ||
148 | |||
149 | <acir/ a-circumflex ("only in syllables closed by r") | ||
150 | care = k<acir/r | ||
151 | chair = ch<acir/r | ||
152 | share = sh<acir/r | ||
153 | compare = k<ocr/m*p<acir/r" | ||
154 | |||
155 | <aum/ a-umlaut (in pronunciations not the same as in words) | ||
156 | arsenic = <aum/r"s<esl/*n<icr/k | ||
157 | arson = <aum/r"s'n | ||
158 | arm = <aum/rm | ||
159 | carp = k<aum/rp | ||
160 | far = f<aum/r | ||
161 | mar = m<aum/r | ||
162 | compart = k<ocr/m*p<aum/rt" | ||
163 | compartment = k<ocr/m*p<aum/rt"m<eit/nt | ||
164 | |||
165 | <add/ a double dot (a with a double dot *below*)
166 | all = <add/l | ||
167 | talk = t<add/k | ||
168 | swarm = sw<add/rm [not aum??] | ||
169 | water = w<add/"t<etil/r | ||
170 | default = d<esl/*f<add/lt" | ||
171 | defraud = d<esl/*fr<add/d" | ||
172 | deerstalker = d<emac/r"st<add/k`<etil/r | ||
173 | |||
174 | |||
175 | <eacute/ e-acute (e with an acute accent over it -- | ||
176 | not used in pronunciations, but in the | ||
177 | spelling of words derived from European | ||
178 | languages, especially French.) | ||
179 | prot<eacute/g<eacute/ = pr<osl/`t<asl/`zh<asl/" | ||
180 | |||
181 | <ecr/ short e (e with a crescent [breve] above it) | ||
182 | degenerate = d<esl/*j<ecr/n"<etil/r*<amac/t | ||
183 | delve = d<ecr/lv | ||
184 | end = <ecr/nd | ||
185 | pet = p<ecr/t | ||
186 | ten = t<ecr/n | ||
187 | |||
188 | <esl/ long e (unstressed) | ||
189 | committee = k<ocr/m*m<icr/t"t<esl/ | ||
190 | defame = d<esl/*f<amac/m" | ||
191 | define = d<esl/*f<imac/n" | ||
192 | comedy = k<ocr/m"<esl/*d<ycr/ | ||
193 | |||
194 | <eit/ e italic | ||
195 | compartment = k<ocr/m*p<aum/rt"m<eit/nt | ||
196 | -ment = -"m<eit/nt (for most -ment endings) | ||
197 | |||
198 | <emac/ e macron (long e, stressed) | ||
199 | compeer = k<ocr/m*p<emac/r" | ||
200 | deer = d<emac/r" | ||
201 | |||
202 | <etil/ e-tilde | ||
203 | (representing the e before r in many words) | ||
204 | (for the same sound in -ur words, <ucir/ is used!) | ||
205 | fern = f<etil/rn | ||
206 | commercial = k<ocr/m*m<etil/r"sh<ait/l | ||
207 | commerce = k<ocr/m"m<etil/rs | ||
208 | |||
209 | <eum/ e-umlaut (e-diaeresis) | ||
210 | (not used in pronunciations. | ||
211 | represents e after another e, used in the 1913 | ||
212 | Webster to indicate that the two e's are | ||
213 | pronounced as two vowels, as in "reentry". | ||
214 | In the supplemented version, the second e, | ||
215 | which is thus marked in the 1913 version, | ||
216 | usually has no umlaut over it, conforming | ||
217 | to modern orthographic practise.) | ||
218 | |||
219 | re<eum/nforce = r<emac/`<ecr/n*f<omac/rs" | ||
220 | re<eum/ntry = r<emac/`<ecr/n"tr<ycr/
221 | |||
222 | <icr/ short i (i with a crescent [breve] above it) | ||
223 | Note: In most cases, this is used where the | ||
224 | short i sound of "lip" is intended, but it is | ||
225 | also used in the middle of words where Americans | ||
226 | use an unstressed long "e" sound, (as the | ||
227 | "i" in "serial" and "serious")!? | ||
228 | and also in words ending in "ies", | ||
229 | coded as "<icr/z" (as in liberties) | ||
230 | lip = l<icr/p | ||
231 | pin = p<icr/n | ||
232 | commission = k<ocr/m*m<icr/sh"<ucr/n | ||
233 |